DP700 3

The document outlines a case study for Contoso, Ltd. and Litware, Inc., detailing their analytics platforms, existing environments, user problems, and requirements for data processing and security. Contoso aims to modernize its analytics by implementing lakehouses and improving data ingestion, while Litware seeks to enhance its sales data processing and manage SEO data. Each case study includes specific questions regarding the implementation of solutions based on the provided scenarios.

QUESTION: 1

Case Study:
This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview:


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform
by moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure:
The company’s IT department has a team of data analysts and a team of data
engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They
prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are
qualified to write queries in Power Query and T-SQL.

Existing Environment:
Fabric
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use
Pro license mode.

Existing Environment:
Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL
Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The
host virtual machine is on a private virtual network that has public access blocked.
POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1.
MAR1 has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1
by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet
files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that
range in size from 300 MB to 900 MB and relate to email interactions.

Existing Environment:
Product Data
POS1 contains a product list and related data. The data comes from the following three
tables:
Products
ProductCategories
ProductSubcategories
In the data, products are related to product subcategories, and subcategories are
related to product categories.

Existing Environment:
Azure
Contoso has a Microsoft Entra tenant that has the following mail-enabled security
groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment:
User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types
of email content. It typically takes a week to manually compile and analyze the data.
Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which cause the data exports to fail.

Requirements:
Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements:
Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to
populate the MAR1 data in the silver layer, including deduplication, the handling of
missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following
requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import
the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be
removed.

Requirements:
Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the
gold layer must include only active products from the product list. Active products are
identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They
are NOT analytically relevant and must be omitted from the product dimension in the
gold layer.

Requirements:
Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including
the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that the data analysts can access the gold layer lakehouse.
What should you do?

A. Add the DataAnalysts group to the Viewer role for WorkspaceA.

B. Share the lakehouse with the DataAnalysts group and grant the Build reports on
the default semantic model permission.

C. Share the lakehouse with the DataAnalysts group and grant the Read all SQL
Endpoint data permission.
D. Share the lakehouse with the DataAnalysts group and grant the Read all Apache
Spark permission.
Answer(s): C

QUESTION: 2

Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.
Overview
Litware, Inc. is a publishing company that has an online bookstore and several retail
bookstores worldwide. Litware also manages an online advertising business for the
authors it represents.

Existing Environment:
Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for
Workspace1.
The company has a data engineering team that uses Python for data processing.

Existing Environment:
Data Processing
The retail bookstores send sales data at the end of each business day, while the online
bookstore constantly provides logs and sales data to a central enterprise resource
planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze,
silver, and gold. The sales data is ingested from the ERP system as Parquet files that
land in the Files folder in a lakehouse. Notebooks are used to transform the files into
Delta tables for the bronze and silver layers. The gold layer is in a warehouse that has V-
Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into
the Files folder.

Existing Environment:
Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is
older than one month never changes.
In the source system, the sales data refreshes every six hours starting at midnight each
day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new
and historical data is captured. The dataflow captures the following fields of the source:
Sales Date
Author
Price
Units
SKU
A table named AuthorSales stores the sales data that relates to each author. The table
contains a column named AuthorEmail. Authors authenticate to a guest Fabric tenant by
using their email address.

Existing Environment:
Security Groups
Litware has the following security groups:
Sales
Fabric Admins
Streaming Admins

Existing Environment:
Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users
indicate that reports against the warehouse sometimes run for two hours and then fail to
load. Upon further investigation, the data engineering team receives the
following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more
than one failure.
When the authors have new book releases, there is often an increase in sales activity.
This increase slows the data ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT
been up-to-date when they arrive at work in the morning.
Requirements:
Planned Changes
Litware recently signed a contract to receive book reviews. The provider of the reviews
exposes the data in Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO
data will be streamed from a REST API.
Requirements. Version Control
Litware plans to implement a version control solution in Fabric that will use GitHub
integration and follow the principle of least privilege.
Requirements. Governance Requirements
To control data platform costs, the data platform must use only Fabric services and
items. Additional Azure resources must NOT be provisioned.
Requirements. Data Requirements
Litware identifies the following data requirements:
Process the SEO data in near-real-time (NRT).
Make the book reviews available in the lakehouse without making a copy of the data.
When a new book cover image arrives in the Files folder, process the image as soon as
possible.
You need to implement the solution for the book reviews.
What should you do?

A. Create a Dataflow Gen2 dataflow.

B. Create a shortcut.

C. Enable external data sharing.

D. Create a data pipeline.


Answer(s): B

QUESTION: 3

HOTSPOT (Drag and Drop is not supported)


Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.
Overview
Litware, Inc. is a publishing company that has an online bookstore and several retail
bookstores worldwide. Litware also manages an online advertising business for the
authors it represents.

Existing Environment:
Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for
Workspace1.
The company has a data engineering team that uses Python for data processing.

Existing Environment:
Data Processing
The retail bookstores send sales data at the end of each business day, while the online
bookstore constantly provides logs and sales data to a central enterprise resource
planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze,
silver, and gold. The sales data is ingested from the ERP system as Parquet files that
land in the Files folder in a lakehouse. Notebooks are used to transform the files into
Delta tables for the bronze and silver layers. The gold layer is in a warehouse that has V-
Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into
the Files folder.

Existing Environment:
Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is
older than one month never changes.
In the source system, the sales data refreshes every six hours starting at midnight each
day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new
and historical data is captured. The dataflow captures the following fields of the source:
Sales Date
Author
Price
Units
SKU
A table named AuthorSales stores the sales data that relates to each author. The table
contains a column named AuthorEmail. Authors authenticate to a guest Fabric tenant by
using their email address.

Existing Environment:
Security Groups
Litware has the following security groups:
Sales
Fabric Admins
Streaming Admins

Existing Environment:
Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users
indicate that reports against the warehouse sometimes run for two hours and then fail to
load. Upon further investigation, the data engineering team receives the
following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more
than one failure.
When the authors have new book releases, there is often an increase in sales activity.
This increase slows the data ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT
been up-to-date when they arrive at work in the morning.

Requirements:
Planned Changes
Litware recently signed a contract to receive book reviews. The provider of the reviews
exposes the data in Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO
data will be streamed from a REST API.
Requirements. Version Control
Litware plans to implement a version control solution in Fabric that will use GitHub
integration and follow the principle of least privilege.
Requirements. Governance Requirements
To control data platform costs, the data platform must use only Fabric services and
items. Additional Azure resources must NOT be provisioned.
Requirements. Data Requirements
Litware identifies the following data requirements:
Process the SEO data in near-real-time (NRT).
Make the book reviews available in the lakehouse without making a copy of the data.
When a new book cover image arrives in the Files folder, process the image as soon as
possible.
You need to troubleshoot the ad-hoc query issue.
How should you complete the statement? To answer, select the appropriate options in
the answer area.
NOTE: Each correct selection is worth one point.

A. See Explanation section for answer.


Answer(s): A

Explanation:

QUESTION: 4

Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.
Overview
Litware, Inc. is a publishing company that has an online bookstore and several retail
bookstores worldwide. Litware also manages an online advertising business for the
authors it represents.

Existing Environment:
Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for
Workspace1.
The company has a data engineering team that uses Python for data processing.

Existing Environment:
Data Processing
The retail bookstores send sales data at the end of each business day, while the online
bookstore constantly provides logs and sales data to a central enterprise resource
planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze,
silver, and gold. The sales data is ingested from the ERP system as Parquet files that
land in the Files folder in a lakehouse. Notebooks are used to transform the files into
Delta tables for the bronze and silver layers. The gold layer is in a warehouse that has V-
Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into
the Files folder.

Existing Environment:
Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is
older than one month never changes.
In the source system, the sales data refreshes every six hours starting at midnight each
day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new
and historical data is captured. The dataflow captures the following fields of the source:
Sales Date
Author
Price
Units
SKU
A table named AuthorSales stores the sales data that relates to each author. The table
contains a column named AuthorEmail. Authors authenticate to a guest Fabric tenant by
using their email address.

Existing Environment:
Security Groups
Litware has the following security groups:
Sales
Fabric Admins
Streaming Admins

Existing Environment:
Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users
indicate that reports against the warehouse sometimes run for two hours and then fail to
load. Upon further investigation, the data engineering team receives the
following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more
than one failure.
When the authors have new book releases, there is often an increase in sales activity.
This increase slows the data ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT
been up-to-date when they arrive at work in the morning.

Requirements:
Planned Changes
Litware recently signed a contract to receive book reviews. The provider of the reviews
exposes the data in Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO
data will be streamed from a REST API.
Requirements. Version Control
Litware plans to implement a version control solution in Fabric that will use GitHub
integration and follow the principle of least privilege.
Requirements. Governance Requirements
To control data platform costs, the data platform must use only Fabric services and
items. Additional Azure resources must NOT be provisioned.
Requirements. Data Requirements
Litware identifies the following data requirements:
Process the SEO data in near-real-time (NRT).
Make the book reviews available in the lakehouse without making a copy of the data.
When a new book cover image arrives in the Files folder, process the image as soon as
possible.
You need to resolve the sales data issue. The solution must minimize the amount of
data transferred.
What should you do?

A. Split the dataflow into two dataflows.

B. Configure scheduled refresh for the dataflow.

C. Configure incremental refresh for the dataflow. Set Store rows from the past to 1
Month.

D. Configure incremental refresh for the dataflow. Set Refresh rows from the past to
1 Year.

E. Configure incremental refresh for the dataflow. Set Refresh rows from the past to
1 Month.
Answer(s): E

QUESTION: 5

Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.
To start the case study
To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview:


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform
by moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure:
The company’s IT department has a team of data analysts and a team of data
engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They
prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are
qualified to write queries in Power Query and T-SQL.

Existing Environment:
Fabric
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use
Pro license mode.

Existing Environment:
Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL
Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The
host virtual machine is on a private virtual network that has public access blocked.
POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1.
MAR1 has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1
by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet
files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that
range in size from 300 MB to 900 MB and relate to email interactions.

Existing Environment:
Product Data
POS1 contains a product list and related data. The data comes from the following three
tables:
Products
ProductCategories
ProductSubcategories
In the data, products are related to product subcategories, and subcategories are
related to product categories.

Existing Environment:
Azure
Contoso has a Microsoft Entra tenant that has the following mail-enabled security
groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment:
User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types
of email content. It typically takes a week to manually compile and analyze the data.
Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which cause the data exports to fail.

Requirements:
Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements:
Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to
populate the MAR1 data in the silver layer, including deduplication, the handling of
missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following
requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import
the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be
removed.

Requirements:
Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the
gold layer must include only active products from the product list. Active products are
identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They
are NOT analytically relevant and must be omitted from the product dimension in the
gold layer.

Requirements:
Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including
the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to schedule the population of the medallion layers to meet the technical
requirements.
What should you do?

A. Schedule a data pipeline that calls other data pipelines.

B. Schedule a notebook.

C. Schedule an Apache Spark job.

D. Schedule multiple data pipelines.


Answer(s): A

QUESTION: 6

You have a Fabric workspace.


You have semi-structured data.
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only
be written by using Spark.
What should you use to store the data?

A. a lakehouse

B. an eventhouse

C. a datamart

D. a warehouse
Answer(s): A

QUESTION: 7

HOTSPOT (Drag and Drop is not supported)


Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview:


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform
by moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure:
The company’s IT department has a team of data analysts and a team of data
engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They
prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are
qualified to write queries in Power Query and T-SQL.

Existing Environment:
Fabric
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use
Pro license mode.

Existing Environment:
Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL
Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The
host virtual machine is on a private virtual network that has public access blocked.
POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1.
MAR1 has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1
by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet
files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that
range in size from 300 MB to 900 MB and relate to email interactions.

Existing Environment:
Product Data
POS1 contains a product list and related data. The data comes from the following three
tables:
Products
ProductCategories
ProductSubcategories
In the data, products are related to product subcategories, and subcategories are
related to product categories.

Existing Environment:
Azure
Contoso has a Microsoft Entra tenant that has the following mail-enabled security
groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment:
User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types
of email content. It typically takes a week to manually compile and analyze the data.
Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which cause the data exports to fail.
Requirements:
Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements:
Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to
populate the MAR1 data in the silver layer, including deduplication, the handling of
missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following
requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import
the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be
removed.

Requirements:
Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the
gold layer must include only active products from the product list. Active products are
identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They
are NOT analytically relevant and must be omitted from the product dimension in the
gold layer.

Requirements:
Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including
the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to recommend a method to populate the POS1 data to the lakehouse
medallion layers.
What should you recommend for each layer? To answer, select the appropriate options
in the answer area.
NOTE: Each correct selection is worth one point.

A. See Explanation section for answer.


Answer(s): A

Explanation:
QUESTION: 8

DRAG DROP (Drag and Drop is not supported)


You have a Fabric eventhouse that contains a KQL database. The database contains a
table named TaxiData. The following is a sample of the data in TaxiData.
You need to build two KQL queries. The solution must meet the following requirements:
One of the queries must partition RunningTotalAmount by VendorID.
The other query must create a column named FirstPickupDateTime that shows the first
value of each hour from tpep_pickup_datetime partitioned by payment_type.
How should you complete each query? To answer, drag the appropriate values to the
correct targets. Each value may be used once, more than once, or not at all. You may
need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
A. See Explanation section for answer.
Answer(s): A

Explanation:
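The exact KQL completion is not reproduced in this dump. As a conceptual sketch only, the two windowing requirements can be illustrated in PySpark, assuming column names from the public NYC taxi schema (VendorID, total_amount, tpep_pickup_datetime, payment_type); the sample TaxiData rows shown in the exam exhibit are not available here.

# Conceptual PySpark illustration only; this is NOT the KQL answer.
# Column names are assumed from the public NYC taxi schema.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()
taxi = spark.table("TaxiData")  # hypothetical table reference

# Requirement 1: running total of the amount, partitioned by VendorID.
running_window = (
    Window.partitionBy("VendorID")
    .orderBy("tpep_pickup_datetime")
    .rowsBetween(Window.unboundedPreceding, Window.currentRow)
)
taxi_running = taxi.withColumn(
    "RunningTotalAmount", F.sum("total_amount").over(running_window)
)

# Requirement 2: first pickup time of each hour, partitioned by payment_type.
hourly_window = Window.partitionBy(
    "payment_type", F.date_trunc("hour", F.col("tpep_pickup_datetime"))
).orderBy("tpep_pickup_datetime")
taxi_first_pickup = taxi.withColumn(
    "FirstPickupDateTime", F.first("tpep_pickup_datetime").over(hourly_window)
)
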

QUESTION: 9

You have a Fabric workspace that contains a warehouse named Warehouse1.


You have an on-premises Microsoft SQL Server database named Database1 that is
accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?

A. a Dataflow Gen1 dataflow


B. a data pipeline

C. a KQL queryset

D. a notebook
Answer(s): B

QUESTION: 10

Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview:


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform
by moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure:
The company’s IT department has a team of data analysts and a team of data
engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They
prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are
qualified to write queries in Power Query and T-SQL.

Existing Environment:
Fabric
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use
Pro license mode.

Existing Environment:
Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL
Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The
host virtual machine is on a private virtual network that has public access blocked.
POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1.
MAR1 has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1
by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet
files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that
range in size from 300 MB to 900 MB and relate to email interactions.

Existing Environment:
Product Data
POS1 contains a product list and related data. The data comes from the following three
tables:
Products
ProductCategories
ProductSubcategories
In the data, products are related to product subcategories, and subcategories are
related to product categories.

Existing Environment:
Azure
Contoso has a Microsoft Entra tenant that has the following mail-enabled security
groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment:
User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types
of email content. It typically takes a week to manually compile and analyze the data.
Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which cause the data exports to fail.

Requirements:
Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements:
Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to
populate the MAR1 data in the silver layer, including deduplication, the handling of
missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following
requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import
the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be
removed.

Requirements:
Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the
gold layer must include only active products from the product list. Active products are
identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They
are NOT analytically relevant and must be omitted from the product dimension in the
gold layer.

Requirements:
Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including
the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that usage of the data in the Amazon S3 bucket meets the technical
requirements.
What should you do?

A. Create a workspace identity and enable high concurrency for the notebooks.

B. Create a shortcut and ensure that caching is disabled for the workspace.

C. Create a workspace identity and use the identity in a data pipeline.


D. Create a shortcut and ensure that caching is enabled for the workspace.
Answer(s): B

QUESTION: 11
HOTSPOT (Drag and Drop is not supported)
You are processing streaming data from an external data provider.
You have the following code segment.
For each of the following statements, select Yes if the statement is true. Otherwise,
select No.
NOTE: Each correct selection is worth one point.

A. See Explanation section for answer.


Answer(s): A

Explanation:
QUESTION: 12

You have a Fabric workspace that contains a warehouse named Warehouse1.


You have an on-premises Microsoft SQL Server database named Database1 that is
accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?

A. an Apache Spark job definition

B. a data pipeline

C. a Dataflow Gen1 dataflow

D. an eventstream
Answer(s): B

QUESTION: 13

HOTSPOT (Drag and Drop is not supported)


Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview:


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform
by moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure:
The company’s IT department has a team of data analysts and a team of data
engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They
prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are
qualified to write queries in Power Query and T-SQL.

Existing Environment:
Fabric
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use
Pro license mode.

Existing Environment:
Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL
Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The
host virtual machine is on a private virtual network that has public access blocked.
POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1.
MAR1 has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1
by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet
files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that
range in size from 300 MB to 900 MB and relate to email interactions.

Existing Environment:
Product Data
POS1 contains a product list and related data. The data comes from the following three
tables:
Products
ProductCategories
ProductSubcategories
In the data, products are related to product subcategories, and subcategories are
related to product categories.

Existing Environment:
Azure
Contoso has a Microsoft Entra tenant that has the following mail-enabled security
groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment:
User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types
of email content. It typically takes a week to manually compile and analyze the data.
Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which cause the data exports to fail.

Requirements:
Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements:
Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to
populate the MAR1 data in the silver layer, including deduplication, the handling of
missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following
requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import
the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be
removed.

Requirements:
Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the
gold layer must include only active products from the product list. Active products are
identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They
are NOT analytically relevant and must be omitted from the product dimension in the
gold layer.
Requirements:
Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including
the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to create the product dimension.
How should you complete the Apache Spark SQL code? To answer, select the
appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

A. See Explanation section for answer.


Answer(s): A

Explanation:
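The exhibit with the partially completed code is not reproduced in this document, so the following Spark SQL is only a sketch of the intended result; the source table names come from the case study, while the column names other than ProductID and IsActive are assumptions. Inner joins drop the categories and subcategories that are not assigned to any product, and the WHERE clause keeps only active products.

-- Sketch: gold-layer product dimension (column names partly assumed).
CREATE TABLE DimProduct AS
SELECT
    p.ProductID,
    p.ProductName,
    s.ProductSubcategoryName,
    c.ProductCategoryName
FROM Products AS p
INNER JOIN ProductSubcategories AS s
    ON p.ProductSubcategoryID = s.ProductSubcategoryID
INNER JOIN ProductCategories AS c
    ON s.ProductCategoryID = c.ProductCategoryID
WHERE p.IsActive = 1;

Using INNER JOIN rather than outer joins is what omits the unassigned categories and subcategories from the dimension.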

QUESTION: 14

You have a Fabric workspace that contains a lakehouse named Lakehouse1.


Lakehouse1 contains a Delta table named Table1.
You analyze Table1 and discover that Table1 contains 2,000 Parquet files of 1 MB each.
You need to minimize how long it takes to query Table1.
What should you do?

A. Disable V-Order and run the OPTIMIZE command.

B. Disable V-Order and run the VACUUM command.

C. Run the OPTIMIZE and VACUUM commands.


Answer(s): C
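As a rough illustration of the answer, OPTIMIZE compacts the 2,000 one-megabyte Parquet files into fewer, larger files, and VACUUM then removes the old files that the compaction leaves unreferenced. A minimal Spark SQL sketch:

-- Compact the small files behind Table1 into fewer, larger files.
OPTIMIZE Table1;

-- Remove the files that are no longer referenced by the Delta log
-- (the default retention threshold applies).
VACUUM Table1;

The small-file count, not V-Order, is what slows the queries, so V-Order is left enabled.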
QUESTION: 15

You have a Fabric F32 capacity that contains a workspace. The workspace contains a
warehouse named DW1 that is modeled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million
rows during the past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show
year-over-year values.
Users report that the performance of some of the reports has degraded over time and
some visuals show errors.
You need to resolve the performance issues. The solution must meet the following
requirements:
Provide the best query performance.
Minimize operational costs.
What should you do?

A. Change the MD5 hash to SHA256.

B. Increase the capacity.

C. Enable V-Order.

D. Modify the surrogate keys to use a different data type.

E. Create views.
Answer(s): C

QUESTION: 16

Case Study:

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are
able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other
resources that provide more information about the scenario that is described in the case
study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to
review your answers and to make changes before you move to the next section of the
exam. After you begin a new section, you cannot return to this section.

To start the case study


To display the first question in this case study, click the Next button. Use the buttons in
the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the
subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview:


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform
by moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure:
The company’s IT department has a team of data analysts and a team of data
engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They
prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are
qualified to write queries in Power Query and T-SQL.

Existing Environment:
Fabric
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use
Pro license mode.

Existing Environment:
Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL
Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The
host virtual machine is on a private virtual network that has public access blocked.
POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1.
MAR1 has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1
by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet
files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that
range in size from 300 MB to 900 MB and relate to email interactions.

Existing Environment:
Product Data
POS1 contains a product list and related data. The data comes from the following three
tables:
Products
ProductCategories
ProductSubcategories
In the data, products are related to product subcategories, and subcategories are
related to product categories.

Existing Environment:
Azure
Contoso has a Microsoft Entra tenant that has the following mail-enabled security
groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment:
User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types
of email content. It typically takes a week to manually compile and analyze the data.
Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. However, the team
experiences transient connectivity errors, which cause the data exports to fail.

Requirements:
Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements:
Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to
populate the MAR1 data in the silver layer, including deduplication, the handling of
missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following
requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import
the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be
removed.

Requirements:
Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the
gold layer must include only active products from the product list. Active products are
identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They
are NOT analytically relevant and must be omitted from the product dimension in the
gold layer.
Requirements:
Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including
the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to populate the MAR1 data in the bronze layer.
Which two types of activities should you include in the pipeline? Each correct answer
presents part of the solution.
NOTE: Each correct selection is worth one point.

A. ForEach

B. Copy data

C. WebHook

D. Stored procedure
Answer(s): A,B

QUESTION: 17

You have a Fabric workspace that contains a warehouse named Warehouse1. Data is
loaded daily into Warehouse1 by using data pipelines and stored procedures.
You discover that the daily data load takes longer than expected.
You need to monitor Warehouse1 to identify the names of the users who are actively
running queries.
Which view should you use?
A. sys.dm_exec_connections

B. sys.dm_exec_requests

C. queryinsights.long_running_queries

D. queryinsights.frequently_run_queries

E. sys.dm_exec_sessions
Answer(s): E
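As a rough illustration, sys.dm_exec_sessions exposes the login name behind each session connected to the warehouse. A minimal sketch, assuming the SQL Server-style column names of the view:

-- List current sessions and the user behind each one.
SELECT session_id, login_name, status, login_time
FROM sys.dm_exec_sessions
ORDER BY login_time DESC;

If needed, joining to sys.dm_exec_requests on session_id narrows the list to sessions that currently have a running request.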
QUESTION: 18

HOTSPOT (Drag and Drop is not supported)


You have a Fabric workspace that contains a warehouse named DW1. DW1 contains
the following tables and columns.
You need to create an output that presents the summarized values of all the order
quantities by year and product. The results must include a summary of the order
quantities at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the
answer area.
NOTE: Each correct selection is worth one point.
A. See Explanation section for answer.
Answer(s): A

Explanation:
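The exhibit listing the tables and columns is not reproduced in this document, so the following T-SQL is only a sketch with assumed table and column names. GROUP BY ROLLUP returns the year-and-product detail rows plus the year-level subtotal rows (where the product column is NULL), which satisfies the requirement for a summary at the year level across all products.

-- Sketch: summarized order quantities by year and product, with
-- year-level subtotals (table and column names are assumptions).
SELECT
    OrderYear,
    ProductName,
    SUM(OrderQuantity) AS TotalOrderQuantity
FROM FactOrders
GROUP BY ROLLUP (OrderYear, ProductName);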
QUESTION: 19

HOTSPOT (Drag and Drop is not supported)


You have a Fabric workspace that contains a warehouse named Warehouse1.
Warehouse1 contains the following tables and columns.
You need to denormalize the tables and include the ContractType and StartDate
columns in the Employee table. The solution must meet the following requirements:
Ensure that the StartDate column is of the date data type.
Ensure that all the rows from the Employee table are preserved and include any
matching rows from the Contract table.
Ensure that the result set displays the total number of employees per contract type for
all the contract types that have more than two employees.
How should you complete the statement? To answer, select the appropriate options in
the answer area.
NOTE: Each correct selection is worth one point.
A. See Explanation section for answer.
Answer(s): A

Explanation:
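The exhibit is not reproduced in this document, so the table, column, and key names below are assumptions. The sketch shows the three ideas the hotspot tests: a LEFT JOIN that preserves every Employee row, a CAST that forces StartDate to the date data type, and a HAVING clause that keeps only the contract types with more than two employees.

-- Denormalize: keep every Employee row and pull in the matching
-- Contract columns; cast StartDate to the date data type.
SELECT
    e.EmployeeID,
    c.ContractType,
    CAST(c.StartDate AS date) AS StartDate
FROM dbo.Employee AS e
LEFT JOIN dbo.Contract AS c
    ON e.ContractID = c.ContractID;

-- Summarize: contract types that have more than two employees.
SELECT
    c.ContractType,
    COUNT(*) AS EmployeeCount
FROM dbo.Employee AS e
LEFT JOIN dbo.Contract AS c
    ON e.ContractID = c.ContractID
GROUP BY c.ContractType
HAVING COUNT(*) > 2;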
QUESTION: 20
You have a Fabric workspace that contains an eventstream named EventStream1.
EventStream1 outputs events to a table in a lakehouse.
You need to remove files that are older than seven days and are no longer in use.
Which command should you run?

A. VACUUM

B. COMPUTE

C. OPTIMIZE

D. CLONE
Answer(s): A
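As a rough illustration (the destination table name is an assumption), VACUUM with an explicit retention period removes files that the Delta transaction log no longer references and that are older than seven days (168 hours):

-- Remove unreferenced files older than seven days.
VACUUM EventTable RETAIN 168 HOURS;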

QUESTION: 21

You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is
ingested into Lakehouse1 as one flat table. The table contains the following columns.
You plan to load the data into a dimensional model and implement a star schema. From
the original flat table, you create two tables named FactSales and DimProduct. You will
track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer
presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Date

B. ProductName

C. ProductColor

D. TransactionID
E. SalesAmount

F. ProductID
Answer(s): B,C,F
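A minimal Spark SQL sketch of the split (the flat table name is an assumption): DimProduct carries the ProductID business key plus the descriptive attributes whose changes will be tracked, while Date, TransactionID, and SalesAmount remain in FactSales.

-- Sketch: product dimension carved out of the flat table.
CREATE TABLE DimProduct AS
SELECT DISTINCT
    ProductID,
    ProductName,
    ProductColor
FROM FlatSalesTable;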

QUESTION: 22

HOTSPOT (Drag and Drop is not supported)


You have an Azure Event Hubs data source that contains weather data.
You ingest the data from the data source by using an eventstream named
Eventstream1. Eventstream1 uses a lakehouse as the destination.
You need to batch ingest only rows from the data source where the City attribute has a
value of Kansas. The filter must be added before the destination. The solution must
minimize development effort.
What should you use for the data processor and filtering? To answer, select the
appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

A. See Explanation section for answer.


Answer(s): A

Explanation:
QUESTION: 23

You have a Fabric warehouse named DW1 that loads data by using a data pipeline
named Pipeline1. Pipeline1 uses a Copy data activity with a dynamic SQL source.
Pipeline1 is scheduled to run every 15 minutes.
You discover that Pipeline1 keeps failing.
You need to identify which SQL query was executed when the pipeline failed.
What should you do?

A. From Monitoring hub, select the latest failed run of Pipeline1, and then view the
output JSON.

B. From Monitoring hub, select the latest failed run of Pipeline1, and then view the
input JSON.

C. From Real-time hub, select Fabric events, and then review the details of
Microsoft.Fabric.ItemReadFailed.

D. From Real-time hub, select Fabric events, and then review the details of
Microsoft.Fabric.ItemUpdateFailed.
Answer(s): B

QUESTION: 24
You have a Fabric workspace named Workspace1 that contains a notebook named
Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session
as Notebook1.
What should you do?

A. Enable high concurrency for notebooks.

B. Enable dynamic allocation for the Spark pool.

C. Change the runtime version.

D. Increase the number of executors.


Answer(s): A

QUESTION: 25

You have a Fabric workspace that contains an eventstream named Eventstream1.


Eventstream1 processes data from a thermal sensor by using event stream processing,
and then stores the data in a lakehouse.
You need to modify Eventstream1 to include the standard deviation of the temperature.
Which transform operator should you include in the Eventstream1 logic?

A. Expand

B. Group by

C. Union
D. Aggregate
Answer(s): D

QUESTION: 26

You have a Fabric notebook named Notebook1 that has been executing successfully for
the last week.
During the last run, Notebook1 executed nine jobs.
You need to view the jobs in a timeline chart.
What should you use?

A. Real-Time hub
B. Monitoring hub

C. the job history from the application run

D. Spark History Server

E. the run series from the details of the application run


Answer(s): E

QUESTION: 27

You have a Fabric workspace named Workspace1 that contains a lakehouse named
Lakehouse1. Lakehouse1 contains the following tables:
Orders
Customer
Employee
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table;
however, the data engineer does NOT have the elevated permissions required to view the
contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without
reading data from the Employee table.
Which three actions should you perform? Each correct answer presents part of the
solution.
NOTE: Each correct selection is worth one point.
A. Share Lakehouse1 with the data engineer.

B. Assign the data engineer the Contributor role for Workspace2.

C. Assign the data engineer the Viewer role for Workspace2.

D. Assign the data engineer the Contributor role for Workspace1.

E. Migrate the Employee table from Lakehouse1 to Lakehouse2.


F. Create a new workspace named Workspace2 that contains a new lakehouse
named Lakehouse2.

G. Assign the data engineer the Viewer role for Workspace1.


Answer(s): D,E,F

QUESTION: 28

You have an Azure event hub. Each event contains the following fields:
BikepointID
Street
Neighbourhood
Latitude
Longitude
No_Bikes
No_Empty_Docks
You need to ingest the events. The solution must only retain events that have a
Neighbourhood value of Chelsea, and then store the retained events in a Fabric
lakehouse.
What should you use?

A. a KQL queryset

B. an eventstream

C. a streaming dataset

D. Apache Spark Structured Streaming


Answer(s): B
