
Azure Data Factory

Notes Part-1

Abhishek Agrawal
Azure Data Engineer
Why Use ADF?
Azure Data Factory (ADF) enables data-driven workflows that orchestrate and automate data movement and transformation at scale, allowing seamless integration across many sources for efficient data management.

What is ADF?
Azure Data Factory is Microsoft's cloud-based ETL (Extract, Transform, Load) service for data integration. It supports data movement and transformation between on-premises and cloud systems, and provides a user-friendly interface for designing workflows and monitoring data pipelines.

How to Use ADF?

For example, to generate reports from data spread across 100 sources, ADF lets you create an automated pipeline that pulls, transforms, and presents the data in structured reports, streamlining the process and keeping the information up to date with minimal manual intervention.
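As a concrete sketch (not part of the original notes), the snippet below triggers and monitors a run of an existing pipeline using the azure-mgmt-datafactory Python SDK; the subscription, resource group, factory, and pipeline names are all placeholders.

# Hypothetical names: replace the subscription, resource group, factory,
# and pipeline identifiers with your own.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
factory_name = "<data-factory-name>"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Start a run of a pipeline that already exists in the factory.
run = adf_client.pipelines.create_run(resource_group, factory_name, "DailyReportPipeline")

# Check the run's status (Queued, InProgress, Succeeded, Failed, ...).
pipeline_run = adf_client.pipeline_runs.get(resource_group, factory_name, run.run_id)
print(pipeline_run.status)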



Integration Runtime

Diagram: the three Integration Runtime types (Azure Default Integration Runtime, Self-Hosted Integration Runtime, and SSIS Integration Runtime).

How to Connect Azure Data Factory (ADF) with Different Source Systems

To connect Azure Data Factory with various data sources, the key steps are configuring an Integration Runtime (IR) and a Linked Service.

1. Configure Integration Runtime (IR)

Azure Default Integration Runtime:
Managed by Microsoft for connecting to cloud-based data sources.
Accesses publicly available data stores and services.

Self-Hosted Integration Runtime:
Connects to on-premises data sources or private networks.
Enables secure data movement between private and cloud environments.

SSIS Integration Runtime:
Executes SQL Server Integration Services (SSIS) packages within ADF.
Leverages existing SSIS packages without rewriting ETL logic.
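For illustration, a minimal sketch of registering a Self-Hosted Integration Runtime through the same Python SDK is shown below; it reuses the adf_client, resource_group, and factory_name placeholders from the earlier snippet, and the IR name is made up. After the resource is created, the self-hosted IR software still has to be installed and registered on the on-premises machine.

from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
)

# Register a Self-Hosted IR resource in the factory ("OnPremIR" is a placeholder name).
ir = adf_client.integration_runtimes.create_or_update(
    resource_group,
    factory_name,
    "OnPremIR",
    IntegrationRuntimeResource(
        properties=SelfHostedIntegrationRuntime(description="IR for on-premises sources")
    ),
)
print(ir.name)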



Diagram: Source → Integration Runtime → Linked Service → ADF

2. Configure Linked Service

After setting up the Integration Runtime (IR), the next step is to configure a Linked Service, which defines the connection details for data sources or destinations.

Function: Acts as a connection string, storing authentication credentials (username, password, access keys) and connection information.

Configuration: You set up one Linked Service for each data source, such as Azure SQL Database, Blob Storage, or an on-premises SQL Server.
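A rough sketch of defining a Linked Service for Azure Blob Storage with the same SDK follows; the connection string is a placeholder, and in practice secrets are usually kept in Azure Key Vault rather than embedded in code.

from azure.mgmt.datafactory.models import (
    AzureStorageLinkedService,
    LinkedServiceResource,
    SecureString,
)

# One Linked Service per data source; here, an Azure Storage account.
storage_ls = LinkedServiceResource(
    properties=AzureStorageLinkedService(
        connection_string=SecureString(
            value="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
        )
    )
)
adf_client.linked_services.create_or_update(
    resource_group, factory_name, "BlobStorageLinkedService", storage_ls
)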

Summary and Differences

Integration Runtime: Provides the compute for data integration and determines how ADF connects to data sources.

Linked Service: Defines specific connection and authentication details, enabling secure interaction with each source system.



Diagram: a Dataset (e.g. a table or file) is produced and consumed by an Activity (e.g. Hive, stored procedure, copy); a Pipeline is a logical grouping of activities that can be scheduled, monitored, and managed.

KEY COMPONENTS OF AZURE DATA FACTORY:

Pipeline:
A logical grouping of activities that perform a specific task, such
as ingesting, transforming, and loading data in a single workflow.
Helps organize and orchestrate data processes.

Activity:
An individual step within a pipeline defining a specific task, such
as copying data, transforming data, or executing scripts.
Multiple activities can be combined to create complex workflows.

Dataset:
Represents a reference to the data being processed, pointing to
its location (file, database table, or blob) and structure.
Acts as an abstraction layer to work with data without directly
handling the raw storage format.
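To show how the three components fit together, here is a hedged sketch (reusing the placeholder client and linked service from the earlier snippets) that defines two Blob datasets and a pipeline containing a single Copy Activity; the dataset, pipeline, and path names are invented for illustration.

from azure.mgmt.datafactory.models import (
    AzureBlobDataset,
    BlobSink,
    BlobSource,
    CopyActivity,
    DatasetReference,
    DatasetResource,
    LinkedServiceReference,
    PipelineResource,
)

ls_ref = LinkedServiceReference(
    type="LinkedServiceReference", reference_name="BlobStorageLinkedService"
)

# Datasets: references to the data (location and shape), not the data itself.
adf_client.datasets.create_or_update(
    resource_group, factory_name, "InputBlobDataset",
    DatasetResource(properties=AzureBlobDataset(
        linked_service_name=ls_ref, folder_path="input", file_name="data.csv")),
)
adf_client.datasets.create_or_update(
    resource_group, factory_name, "OutputBlobDataset",
    DatasetResource(properties=AzureBlobDataset(
        linked_service_name=ls_ref, folder_path="output")),
)

# Activity: a single step (here, a copy) inside the pipeline.
copy_step = CopyActivity(
    name="CopyInputToOutput",
    inputs=[DatasetReference(type="DatasetReference", reference_name="InputBlobDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="OutputBlobDataset")],
    source=BlobSource(),
    sink=BlobSink(),
)

# Pipeline: a logical grouping of activities, deployed as one unit.
adf_client.pipelines.create_or_update(
    resource_group, factory_name, "CopyBlobPipeline",
    PipelineResource(activities=[copy_step]),
)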



Types of Activities in Azure Data Factory

Azure Data Factory encompasses three primary types of activities, each serving a distinct purpose in data processing.

Data Movement Activity
Purpose: Transfers data between different storage systems or services, including on-premises and cloud environments.
Example: Copy Activity efficiently moves data between various data stores.

Data Transformation Activity
Purpose: Transforms or manipulates ingested data by cleaning, aggregating, reshaping, or enriching it before loading into a target destination.
Example: Data Flow, Mapping Data Flow, and custom transformations using Azure Databricks or SQL stored procedures.

Control Flow Activity
Purpose: Manages pipeline execution by defining workflow logic and structure, including sequencing, branching, looping, and error handling.
Example: If Condition Activity, ForEach Activity, and Wait Activity.
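As a sketch of control flow, the snippet below (again using the placeholder client and names) builds a pipeline whose ForEach Activity loops over a hypothetical tableNames parameter and runs a Wait Activity for each item; in a real pipeline the inner activity would more likely be a Copy Activity or a Data Flow.

from azure.mgmt.datafactory.models import (
    Expression,
    ForEachActivity,
    ParameterSpecification,
    PipelineResource,
    WaitActivity,
)

# Loop over each table name; the Wait Activity is a stand-in for real work.
loop = ForEachActivity(
    name="ForEachTable",
    items=Expression(value="@pipeline().parameters.tableNames"),
    activities=[WaitActivity(name="WaitPerTable", wait_time_in_seconds=10)],
)

adf_client.pipelines.create_or_update(
    resource_group, factory_name, "ControlFlowDemoPipeline",
    PipelineResource(
        activities=[loop],
        parameters={"tableNames": ParameterSpecification(type="array")},
    ),
)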



Follow for more
content like this

Abhishek Agrawal
Azure Data Engineer
