Azure Data Factory:
*******************
As data comes in from a number of different products, we need a powerful service to store and analyze all of it. Azure Data Factory helps us here by:
1--> Storing the data with the help of Azure Data Lake Storage
2--> Analyzing the data
3--> Transforming the data with the help of pipelines (logical groupings of activities that together perform a task)
4--> Publishing the organized data
5--> Visualizing the data with third-party applications like Apache Spark/Hadoop
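
A minimal sketch of driving ADF programmatically, assuming the azure-mgmt-datafactory Python SDK; the subscription ID, resource group, and factory name are placeholders:

# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

# Placeholder names -- substitute your own.
SUB_ID = "<subscription-id>"
RG = "my-resource-group"
DF = "my-data-factory"

# One management client handles factories, linked services, datasets,
# pipelines, and integration runtimes.
adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUB_ID)

# Create (or update) the data factory itself.
factory = adf_client.factories.create_or_update(RG, DF, Factory(location="eastus"))
print(factory.provisioning_state)
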
Flow process of Data Factory:
Input Dataset ==> Pipeline ==> Output Dataset ==> Linked Services ==> Azure Data Lake & Blob Storage & SQL ==> Gateway ==> Cloud.
(An end-to-end sketch of this flow follows the definitions below.)
Input Dataset: the data as it exists within our data store. We need it processed, so we pass it through a pipeline.
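
For example, an input dataset pointing at a CSV file in Blob storage could be declared as below (a sketch: the folder and file names are assumptions, the linked-service name matches the sketch under Linked Services further down, and an output dataset is declared the same way with a different path):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    DatasetResource, AzureBlobDataset, LinkedServiceReference,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "my-resource-group", "my-data-factory"

# The dataset holds no data itself; it points at a file reachable
# through an existing linked service.
input_ds = DatasetResource(
    properties=AzureBlobDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="BlobStorageLinkedService"
        ),
        folder_path="input",
        file_name="data.csv",
    )
)
adf_client.datasets.create_or_update(RG, DF, "InputDataset", input_ds)
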
Pipeline: performs an operation that transforms the data; the transformation could be anything.
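
As an illustration, a one-activity pipeline that copies the input dataset to the output dataset (a sketch; the dataset names are carried over from the example above):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineResource, CopyActivity, DatasetReference, BlobSource, BlobSink,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "my-resource-group", "my-data-factory"

# A Copy activity is the simplest operation: read from the input
# dataset, write to the output dataset.
copy = CopyActivity(
    name="CopyInputToOutput",
    inputs=[DatasetReference(type="DatasetReference", reference_name="InputDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="OutputDataset")],
    source=BlobSource(),
    sink=BlobSink(),
)

# The pipeline is the logical grouping of such activities.
adf_client.pipelines.create_or_update(RG, DF, "CopyPipeline", PipelineResource(activities=[copy]))
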
Output Dataset: contains the data in its transformed, structured form.
Linked Services: store the connection information (e.g., connection strings and credentials) that is essential for connecting to an external source.
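
For instance, a linked service wrapping a Blob-storage connection string (a sketch; the account name and key are placeholders):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    LinkedServiceResource, AzureStorageLinkedService, SecureString,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The linked service stores only connection information, never the data.
conn = SecureString(
    value="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
)
ls = LinkedServiceResource(properties=AzureStorageLinkedService(connection_string=conn))
adf_client.linked_services.create_or_update(
    "my-resource-group", "my-data-factory", "BlobStorageLinkedService", ls
)
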
Gateway: connects our on-premises data to the cloud. A client is installed on the on-premises system so that it can connect to the Azure cloud.
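
In current ADF this gateway role is played by the self-hosted integration runtime. Registering one could look like the sketch below (the runtime name is an assumption; the client software must still be installed on the on-premises machine and joined with the key the registration produces):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource, SelfHostedIntegrationRuntime,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Register the runtime on the ADF side; the on-premises client then
# connects our local data stores to the cloud.
ir = IntegrationRuntimeResource(
    properties=SelfHostedIntegrationRuntime(description="On-premises gateway")
)
adf_client.integration_runtimes.create_or_update(
    "my-resource-group", "my-data-factory", "OnPremIR", ir
)
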
Cloud: the final destination, where our data is stored and can be analyzed and visualized.
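
Putting the flow together: once the linked service, datasets, and pipeline sketched above exist, one run pushes data through the whole chain. A sketch of triggering and monitoring it (names carried over from the earlier sketches):

import time
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "my-resource-group", "my-data-factory"

# Trigger the pipeline: input dataset -> activities -> output dataset.
run = adf_client.pipelines.create_run(RG, DF, "CopyPipeline", parameters={})

# Poll until ADF reports a terminal status.
while True:
    status = adf_client.pipeline_runs.get(RG, DF, run.run_id).status
    if status in ("Succeeded", "Failed", "Cancelled"):
        break
    time.sleep(15)
print("Pipeline run finished:", status)
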
ADF Components:
1. Activity ---> an operation performed on the data, e.g. Copy, Delete
2. Dataset ---> points to the file/data the activity works on
3. Linked Service ---> the connection to the data store
4. Integration Runtime ---> the compute environment; three types (sketched below):
                            1. Auto-Resolve
                            2. Self-Hosted
                            3. SSIS
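
The three runtime flavors map to model types in the Python SDK; a sketch (the runtime names are placeholders, and a real Azure-SSIS runtime additionally needs compute and SSIS catalog properties):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
    ManagedIntegrationRuntime,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "my-resource-group", "my-data-factory"

# 1. Auto-Resolve: built in as "AutoResolveIntegrationRuntime"; nothing to create.

# 2. Self-Hosted: for on-premises/private-network data (see the gateway sketch).
adf_client.integration_runtimes.create_or_update(
    RG, DF, "OnPremIR",
    IntegrationRuntimeResource(properties=SelfHostedIntegrationRuntime()),
)

# 3. SSIS: an Azure-managed runtime that can execute SSIS packages
#    (compute/SSIS catalog properties omitted here for brevity).
adf_client.integration_runtimes.create_or_update(
    RG, DF, "SsisIR",
    IntegrationRuntimeResource(properties=ManagedIntegrationRuntime()),
)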