DP-700 (6th April)

The document outlines a series of questions and answers related to configuring and managing a Fabric workspace for data integration and analytics. It covers topics such as Git integration, data pipelines, row-level security, and data transformation requirements for a company named Contoso, which aims to modernize its analytics platform. The case study emphasizes the need for efficient data processing, governance, and security measures while utilizing various Azure services.

1. You have a Fabric workspace named Workspace1. You plan to configure Git integration for Workspace1.
An Azure DevOps admin creates the required artifacts to support the integration of Workspace1. Which
details do you require to perform the integration?

A. the organization, project, Git repository, and branch

B. the Git repository URL and the Git folder

C. the personal access token (PAT) for Git authentication and the Git repository URL

D. the project, Git repository, branch, and Git folder

2. You have a Fabric workspace that contains a lakehouse and a semantic model named Model1. You use
a notebook named Notebook1 to ingest and transform data from an external data source. You need to
execute Notebook1 as part of a data pipeline named Pipeline1. The process must meet the following
requirements: Run daily at 07:00 AM UTC. Attempt to retry Notebook1 twice if the notebook fails. After
Notebook1 executes successfully, refresh Model1. Select three actions.

A. From the Schedule settings of Pipeline1, set the time zone to UTC.

B. From the Schedule settings of Notebook1, set the time zone to UTC.

C. Place the Semantic model refresh activity after the Notebook activity and link the activities by using an
On completion condition.

D. Set the Retry setting of the Notebook activity to 2.

E. Set the Retry setting of the Semantic model refresh activity to 2.

F. Place the Semantic model refresh activity after the Notebook activity and link the activities by using
the On success condition.

3. You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by
multiple sales representatives. You plan to implement row-level security (RLS). You need to ensure that
the sales representatives can see only their respective data. Which warehouse object do you require to
implement RLS?

A. SECURITY POLICY

B. DATABASE ROLE

C. SCHEMA

D. TABLE

4. You have a Fabric workspace named Workspace1 that contains the following items: a Microsoft Power BI
report named Report1, a Power BI dashboard named Dashboard1, a semantic model named Model1, and a
lakehouse named Lakehouse1. Your company requires that specific governance processes be implemented
for the items. Which items can you endorse in Fabric?

A. Report1 and Dashboard1 only

B. Model1, Report1, and Dashboard1 only


C. Lakehouse1, Model1, and Dashboard1 only

D. Lakehouse1, Model1, Report1, and Dashboard1

E. Lakehouse1, Model1, and Report1 only

5. You have a Fabric workspace that contains a lakehouse named Lakehouse1. You plan to create a data
pipeline named Pipeline1 to ingest data into Lakehouse1. You will use a parameter named param1 to
pass an external value into Pipeline1. You need to ensure that the pipeline expression returns param1 as
an int value. How should you specify the parameter value?

A. (pipeline().parameters.[param1]}

B. @pipeline().parameters.param1

C. (pipeline().parameters.param1)

D. @pipeline().parameters.param1)

6. You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1. You
plan to deploy Warehouse1 to a new workspace named Workspace2. As part of the deployment process,
you need to verify whether Warehouse1 contains invalid references. The solution must minimize
development effort. What should you use?

A. a deployment pipeline

B. a Python script

C. a database project

D. a T-SQL script

7. You have a Fabric workspace that contains an eventstream named EventStream1. EventStream1
outputs events to a table named Table1 in a lakehouse. The streaming data is sourced from sensors that
report the speed of cars. You need to add a transformation to EventStream1 to average the car speeds.
The speeds must be grouped by non-overlapping and contiguous time intervals of one minute. Each
event must belong to only one window. Which windowing function should you use?

A. sliding

B. session

C. hopping

D. tumbling

8. You have a Fabric workspace that contains a warehouse named Warehouse1. You have an on-premises
Microsoft SQL Server database named Database1 that is accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1. Which item should you use?

A. a data pipeline

B. a Dataflow Gen1 dataflow


C. a KQL queryset

D. a notebook

9. You have a Fabric workspace that contains an eventstream named Eventstream1. Eventstream1
processes data from a thermal sensor by using event stream processing and then stores the data in a
lakehouse. You need to modify Eventstream1 to include the standard deviation of the temperature.
Which transform operator should you include in the Eventstream1 logic?

A. Union

B. Aggregate

C. Group by

D. Expand

10. You have a Fabric workspace. You have semi-structured data. You need to read the data by using
T-SQL, KQL, and Apache Spark. The data will only be written by using Spark. What should you use to store
the data?

A. a datamart

B. an eventhouse

C. a warehouse

D. a lakehouse

11. You have a Fabric workspace that contains a write-intensive warehouse named DW1. DW1 stores
staging tables that are used to load data into a dimensional model. The tables are often read once,
dropped, and then recreated to process new data. You need to minimize the load time of DW1. What should you do?

A. Disable V-Order.

B. Drop statistics.

C. Enable V-Order.

D. Create statistics.

12. You have a Fabric workspace that contains a semantic model named Model1. You need to monitor the
refresh history of Model1 and visualize the refresh history in a chart. What should you use?

A. a data pipeline

B. the refresh history from the settings of Model1

C. a notebook

D. a Dataflow Gen2 dataflow

Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would
like to complete each case. However, there may be additional case studies and sections on this exam.
You must manage your time to ensure that you are able to complete all questions included on this exam
in the time provided.

To answer the questions included in a case study, you will need to reference information that is provided
in the case study. Case studies might contain exhibits and other resources that provide more information
about the scenario that is described in the case study. Each question is independent of the other
questions in this case study.

At the end of this case study, a review screen will appear. This screen allows you to review your answers
and to make changes before you move to the next section of the exam. After you begin a new section,
you cannot return to this section.

To start the case study -

To display the first question in this case study, click the Next button. Use the buttons in the left pane to
explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing environment, and
problem statements. If the case study has an All Information tab, note that the information displayed is
identical to the information displayed on the subsequent tabs. When you are ready to answer a
question, click the Question button to return to the question.

Overview. Company Overview -

Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to
Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -

The company’s IT department has a team of data analysts and a team of data engineers that use
analytics systems.

The data engineers perform the ingestion, transformation, and loading of data. They prefer to use
Python or SQL to transform the data.

The data analysts query data and create semantic models and reports. They are qualified to write
queries in Power Query and T-SQL.

Existing Environment. Fabric -

Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.

Existing Environment. Source Systems

Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure
Virtual Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private
virtual network that has public access blocked. POS1 contains all the sales transactions that were
processed on the company’s website.

The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven
entities. The entities contain data that relates to email open rates and interaction rates, as well as
website interactions. The data can be exported from MAR1 by calling REST APIs. Each entity has a
different endpoint.

Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon
Simple Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB
and relate to email interactions.

Existing Environment. Product Data

POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product
categories.

Existing Environment. Azure -

Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers

Contoso has an Azure subscription.

The company has an existing Azure DevOps organization and creates a new project for repositories that
relate to Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content.
It typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to
less than one day by using Fabric.

The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -

Contoso plans to create the following two lakehouses:


Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements. Technical Requirements

The new lakehouses must follow a medallion architecture by using the following three layers: bronze,
silver, and gold. There will be extensive data cleansing required to populate the MAR1 data in the silver
layer, including deduplication, the handling of missing values, and the standardizing of capitalization.

Each layer must be fully populated before moving on to the next layer. If any step in populating the
lakehouses fails, an email must be sent to the data engineers.

Data imports must run simultaneously, when possible.


The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.

Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.

No changes other than changes to the file formats must be implemented before the data lands in the
bronze layer.

Development effort must be minimized and a built-in connection must be used to import the source
data.

In the event of a connectivity error, the ingestion processes must attempt the connection again.

Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports,
and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation

In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must
include only active products from the product list. Active products are identified by an IsActive value of 1.

Some product categories and subcategories are NOT assigned to any product. They are NOT analytically
relevant and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -

Security in Fabric must meet the following requirements:


The data engineers must have read and write access to all the lakehouses, including the underlying files.

The data analysts must only have read access to the Delta tables in the gold layer.

The data analysts must NOT have access to the data in the bronze and silver layers.

The data engineers must be able to commit changes to source control in WorkspaceA.

You need to ensure that the data analysts can access the gold layer lakehouse.

What should you do?

 A. Add the DataAnalyst group to the Viewer role for WorkspaceA.

 B. Share the lakehouse with the DataAnalysts group and grant the Build reports on the default
semantic model permission.

 C. Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data
permission.

 D. Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark
permission.

Answer : C

You have a Fabric workspace.


You have semi-structured data.
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using
Spark.
What should you use to store the data?

 A. a lakehouse
 B. an eventhouse

 C. a datamart

 D. a warehouse

Answer : A

You have a Fabric workspace that contains a warehouse named Warehouse1.


You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an
on-premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?

 A. a Dataflow Gen1 dataflow

 B. a data pipeline

 C. a KQL queryset

 D. a notebook

Answer : B

You have a Fabric workspace that contains a warehouse named Warehouse1.


You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an
on-premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?

 A. an Apache Spark job definition

 B. a data pipeline

 C. a Dataflow Gen1 dataflow

 D. an eventstream

Answer : B

You have a Fabric F32 capacity that contains a workspace. The workspace contains a warehouse named
DW1 that is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows during the
past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-over-year
values.
Users report that the performance of some of the reports has degraded over time and some visuals
show errors.
You need to resolve the performance issues. The solution must meet the following requirements:
Provide the best query performance.
Minimize operational costs.
Which should you do?

 A. Change the MD5 hash to SHA256.

 B. Increase the capacity.

 C. Enable V-Order.

 D. Modify the surrogate keys to use a different data type.

 E. Create views.

Answer : D

You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is ingested into
Lakehouse1 as one flat table. The table contains the following columns.

You plan to load the data into a dimensional model and implement a star schema. From the original flat
table, you create two tables named FactSales and DimProduct. You will track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer presents part of
the solution.
NOTE: Each correct selection is worth one point.

 A. Date

 B. ProductName

 C. ProductColor

 D. TransactionID
 E. SalesAmount

 F. ProductID

Answer : BCF

HOTSPOT -
You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the following tables
and columns.

You need to create an output that presents the summarized values of all the order quantities by year and
product. The results must include a summary of the order quantities at the year level for all the
products.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as Notebook1.
What should you do?

 A. Enable high concurrency for notebooks.

 B. Enable dynamic alloca on for the Spark pool.

 C. Change the runtime version.

 D. Increase the number of executors.

Answer : A

You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1.
Lakehouse1 contains the following tables:

Orders -

Customer -

Employee -
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table, however, the
user does NOT have the elevated permissions required to view the contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without reading data
from the Employee table.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

 A. Share Lakehouse1 with the data engineer.

 B. Assign the data engineer the Contributor role for Workspace2.

 C. Assign the data engineer the Viewer role for Workspace2.

 D. Assign the data engineer the Contributor role for Workspace1.

 E. Migrate the Employee table from Lakehouse1 to Lakehouse2.

 F. Create a new workspace named Workspace2 that contains a new lakehouse named
Lakehouse2.

 G. Assign the data engineer the Viewer role for Workspace1.


Answer : DEF

You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by
multiple sales representatives.
You plan to implement row-level security (RLS).
You need to ensure that the sales representatives can see only their respective data.
Which warehouse object do you require to implement RLS?

 A. STORED PROCEDURE

 B. CONSTRAINT

 C. SCHEMA

 D. FUNCTION

Answer : D
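
As a hedged illustration of how the function-based approach fits together in a warehouse (the table, column, and function names below are hypothetical), the filter predicate is an inline table-valued function that a security policy then binds to the sales table:

-- Minimal T-SQL sketch, assuming a dbo.Sales table with a SalesRepEmail column.
-- The inline table-valued function is the filter predicate.
CREATE FUNCTION dbo.fn_SalesRepFilter (@SalesRepEmail AS VARCHAR(256))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS fn_result
       WHERE @SalesRepEmail = USER_NAME();
GO

-- The security policy applies the predicate to the table, so each representative
-- sees only the rows that carry their own email address.
CREATE SECURITY POLICY dbo.SalesFilterPolicy
ADD FILTER PREDICATE dbo.fn_SalesRepFilter(SalesRepEmail)
ON dbo.Sales
WITH (STATE = ON);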

HOTSPOT -
You have a Fabric workspace named Workspace1_DEV that contains the following items:
10 reports

Four notebooks -

Three lakehouses -

Two data pipelines -

Two Dataflow Gen1 dataflows -

Three Dataflow Gen2 dataflows -


Five semantic models that each has a scheduled refresh policy
You create a deployment pipeline named Pipeline1 to move items from Workspace1_DEV to a new
workspace named Workspace1_TEST.
You deploy all the items from Workspace1_DEV to Workspace1_TEST.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.


You have a Fabric deployment pipeline that uses three workspaces named Dev, Test, and Prod.
You need to deploy an eventhouse as part of the deployment process.
What should you use to add the eventhouse to the deployment process?

 A. GitHub Actions

 B. a deployment pipeline

 C. an Azure DevOps pipeline

Answer : B

You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1.
You plan to deploy Warehouse1 to a new workspace named Workspace2.
As part of the deployment process, you need to verify whether Warehouse1 contains invalid references.
The solution must minimize development effort.
What should you use?

 A. a database project
 B. a deployment pipeline

 C. a Python script

 D. a T-SQL script

Answer : B

You have a Fabric workspace named Workspace1.


You plan to integrate Workspace1 with Azure DevOps.
You will use a Fabric deployment pipeline named deployPipeline1 to deploy items from Workspace1 to
higher environment workspaces as part of a medallion architecture. You will run deployPipeline1 by
using an API call from an Azure DevOps pipeline.
You need to configure API authentication between Azure DevOps and Fabric.
Which type of authentication should you use?

 A. service principal

 B. Microsoft Entra username and password

 C. managed private endpoint

 D. workspace identity

Answer : A

You have a Google Cloud Storage (GCS) container named storage1 that contains the files shown in the
following table.

You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1
contains a lakehouse named Lakehouse1. Lakehouse1 has the shortcuts shown in the following table.

You need to read data from all the shortcuts.


Which shortcuts will retrieve data from the cache?

 A. Stores only
 B. Products only

 C. Stores and Products only

 D. Products, Stores, and Trips

 E. Trips only

 F. Products and Trips only

Answer : C

You have a Fabric workspace named Workspace1 that contains an Apache Spark job definition named
Job1.
You have an Azure SQL database named Source1 that has public internet access disabled.
You need to ensure that Job1 can access the data in Source1.
What should you create?

 A. an on-premises data gateway

 B. a managed private endpoint

 C. an integration runtime

 D. a data management gateway

Answer : B

You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3 bucket named
storage2.
You have the Delta Parquet files shown in the following table.

You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1
contains a lakehouse named Lakehouse1. Lakehouse1 has the following shortcuts:
A shortcut to ProductFile aliased as Products
A shortcut to StoreFile aliased as Stores
A shortcut to TripsFile aliased as Trips
The data from which shortcuts will be retrieved from the cache?

 A. Trips and Stores only


 B. Products and Stores only

 C. Stores only

 D. Products only

 E. Products, Stores, and Trips

Answer : B

HOTSPOT -
You have a Fabric workspace named Workspace1 that contains the items shown in the following table.

For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Your company has a sales department that uses two Fabric workspaces named Workspace1 and
Workspace2.
The company decides to implement a domain strategy to organize the workspaces.
You need to ensure that a user can perform the following tasks:
Create a new domain for the sales department.
Create two subdomains: one for the east region and one for the west region.
Assign Workspace1 to the east region subdomain.
Assign Workspace2 to the west region subdomain.
The solution must follow the principle of least privilege.
Which role should you assign to the user?

 A. workspace Admin

 B. domain admin

 C. domain contributor

 D. Fabric admin

Answer : B

You have a Fabric workspace named Workspace1 that contains a warehouse named DW1 and a data
pipeline named Pipeline1.
You plan to add a user named User3 to Workspace1.
You need to ensure that User3 can perform the following actions:
View all the items in Workspace1.
Update the tables in DW1.
The solution must follow the principle of least privilege.
You already assigned the appropriate object-level permissions to DW1.
Which workspace role should you assign to User3?

 A. Admin

 B. Member

 C. Viewer

 D. Contributor

Answer : D

You have a Fabric capacity that contains a workspace named Workspace1. Workspace1 contains a
lakehouse named Lakehouse1, a data pipeline, a notebook, and several Microsoft Power BI reports.
A user named User1 wants to use SQL to analyze the data in Lakehouse1.
You need to configure access for User1. The solution must meet the following requirements:
Provide User1 with read access to the table data in Lakehouse1.
Prevent User1 from using Apache Spark to query the underlying files in Lakehouse1.
Prevent User1 from accessing other items in Workspace1.
What should you do?

 A. Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.

 B. Assign User1 the Viewer role for Workspace1. Share Lakehouse1 with User1 and select Read
all SQL endpoint data.

 C. Share Lakehouse1 with User1 directly and select Build reports on the default semantic model.

 D. Assign User1 the Member role for Workspace1. Share Lakehouse1 with User1 and select Read
all SQL endpoint data.

Answer : B

DRAG DROP -
You are implementing the following data entities in a Fabric environment:
Entity1: Available in a lakehouse and contains data that will be used as a core organization entity
Entity2: Available in a semantic model and contains data that meets organizational standards
Entity3: Available in a Microsoft Power BI report and contains data that is ready for sharing and reuse
Entity4: Available in a Power BI dashboard and contains approved data for executive-level decision
making
Your company requires that specific governance processes be implemented for the data.
You need to apply endorsement badges to the entities based on each entity’s use case.
Which badge should you apply to each entity? To answer, drag the appropriate badges to the correct
entities. Each badge may be used once, more than once, or not at all. You may need to drag the split bar
between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
HOTSPOT -
You have three users named User1, User2, and User3.
You have the Fabric workspaces shown in the following table.

You have a security group named Group1 that contains User1 and User3.
The Fabric admin creates the domains shown in the following table.

User1 creates a new workspace named Workspace3.


You add Group1 to the default domain of Domain1.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.

You have two Fabric workspaces named Workspace1 and Workspace2.


You have a Fabric deployment pipeline named deployPipeline1 that deploys items from Workspace1 to
Workspace2. DeployPipeline1 contains all the items in Workspace1.
You recently modified the items in Workspace1.
The workspaces currently contain the items shown in the following table.

Items in Workspace1 that have the same name as items in Workspace2 are currently paired.
You need to ensure that the items in Workspace1 overwrite the corresponding items in Workspace2. The
solution must minimize effort.
What should you do?

 A. Delete all the items in Workspace2, and then run deployPipeline1.

 B. Rename each item in Workspace2 to have the same name as the items in Workspace1.

 C. Back up the items in Workspace2, and then run deployPipeline1.

 D. Run deployPipeline1 without modifying the items in Workspace2.

Answer : D

You have a Fabric workspace named Workspace1 that contains a data pipeline named Pipeline1 and a
lakehouse named Lakehouse1.
You have a deployment pipeline named deployPipeline1 that deploys Workspace1 to Workspace2.
You restructure Workspace1 by adding a folder named Folder1 and moving Pipeline1 to Folder1.
You use deployPipeline1 to deploy Workspace1 to Workspace2.
What occurs to Workspace2?

 A. Folder1 is created, Pipeline1 moves to Folder1, and Lakehouse1 is deployed.

 B. Only Pipeline1 and Lakehouse1 are deployed.

 C. Folder1 is created, and Pipeline1 and Lakehouse1 move to Folder1.

 D. Only Folder1 is created and Pipeline1 moves to Folder1.

Answer : A

DRAG DROP -
Your company has a team of developers. The team creates Python libraries of reusable code that is used
to transform data.
You create a Fabric workspace named Workspace1 that will be used to develop extract, transform, and
load (ETL) solutions by using notebooks.
You need to ensure that the libraries are available by default to new notebooks in Workspace1.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the
list of actions to the answer area and arrange them in the correct order.

You have a Fabric workspace that contains a lakehouse and a notebook named Notebook1. Notebook1
reads data into a DataFrame from a table named Table1 and applies transformation logic. The data from
the DataFrame is then written to a new Delta table named Table2 by using a merge operation.
You need to consolidate the underlying Parquet files in Table1.
Which command should you run?

 A. VACUUM

 B. BROADCAST

 C. OPTIMIZE

 D. CACHE
Answer : C
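
For reference, a minimal sketch of the consolidation step as it could be run from a notebook cell with Spark SQL (Table1 comes from the question; nothing else is assumed):

-- OPTIMIZE compacts the many small Parquet files produced by repeated merge
-- operations into fewer, larger files, which is what "consolidate" asks for.
OPTIMIZE Table1;

-- VACUUM (option A), by contrast, only deletes files that the Delta log no longer
-- references; it does not consolidate the remaining files.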

You have five Fabric workspaces.


You are monitoring the execution of items by using Monitoring hub.
You need to identify in which workspace a specific item runs.
Which column should you view in Monitoring hub?

 A. Start time

 B. Capacity

 C. Activity name

 D. Submitter

 E. Item type

 F. Job type

 G. Location

Answer : G

You have a Fabric workspace that contains a warehouse named DW1. DW1 is loaded by using a notebook
named Notebook1.
You need to identify which version of Delta was used when Notebook1 was executed.
What should you use?

 A. Real-Time hub

 B. OneLake data hub

 C. the Admin monitoring workspace

 D. Fabric Monitor

 E. the Microsoft Fabric Capacity Metrics app

Answer : C

You have an Azure Event Hubs data source that contains weather data.

You ingest the data from the data source by using an eventstream named Eventstream1.
Eventstream1 uses a lakehouse as the des na on.

You need to batch ingest only rows from the data source where the City attribute has a value of
Kansas. The filter must be added before the destination. The solution must minimize development
effort.

What should you use for the data processor and filtering? To answer, select the appropriate options in
the answer area. NOTE: Each correct selection is worth one point.

 wrong

You have a Fabric workspace that contains a warehouse named Warehouse1.

In Warehouse1, you create a table named DimCustomer by running the following statement.
You need to set the Customerkey column as a primary key of the DimCustomer table.

Which three code segments should you run in sequence? To answer, move the appropriate code
segments from the list of code segments to the answer area and arrange them in the correct order.

 wrong
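
The code segments themselves are not reproduced above, but as a hedged sketch of the assembled statement (assuming DimCustomer sits in the dbo schema): a Fabric warehouse only supports informational constraints, so the primary key must be declared NONCLUSTERED and NOT ENFORCED.

-- T-SQL sketch: add a nonclustered, not-enforced primary key on CustomerKey.
ALTER TABLE dbo.DimCustomer
ADD CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey)
NOT ENFORCED;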

You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database.

The table contains the following columns:

- BikepointID

- Street
- Neighbourhood

- No_Bikes

- No_Empty_Docks

- Timestamp

You need to apply transformation and filter logic to prepare the data for consumption. The solution
must return data for a neighbourhood named Sands End when No_Bikes is at least 15. The results
must be ordered by No_Bikes in ascending order.

Solution: You use the following code segment:

Does this meet the goal?

 Yes

 No (correct)

You are developing a data pipeline named Pipeline1.

You need to add a Copy data activity that will copy data from a Snowflake data source to a Fabric
warehouse.

What should you configure?

 Degree of copy parallelism

 Fault tolerance

 Enable staging (correct)

 Enable logging

You have a Fabric warehouse named DW1 that loads data by using a data pipeline named Pipeline1.
Pipeline1 uses a Copy data activity with a dynamic SQL source. Pipeline1 is scheduled to run every 15
minutes.

You discover that Pipeline1 keeps failing.


You need to identify which SQL query was executed when the pipeline failed.

What should you do?

 From Monitoring hub, select the latest failed run of Pipeline1, and then view the output JSON.

 From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.

 From Real-time hub, select Fabric events, and then review the details of
Microsoft.Fabric.ItemReadFailed.

 From Real-time hub, select Fabric events, and then review the details of
Microsoft.Fabric.ItemUpdateFailed.

Question was not answered

You need to ensure that usage of the data in the Amazon S3 bucket meets the technical requirements.

What should you do?

 Create a workspace identity and enable high concurrency for the notebooks.

 Create a shortcut and ensure that caching is disabled for the workspace. (correct)

 Create a workspace identity and use the identity in a data pipeline.

 Create a shortcut and ensure that caching is enabled for the workspace.

You need to schedule the population of the medallion layers to meet the technical requirements.

What should you do?

 Schedule a data pipeline that calls other data pipelines. (correct)

 Schedule a notebook.

 Schedule an Apache Spark job.

 Schedule multiple data pipelines.


1. Topic 1, Contoso, Ltd

Case Study

Overview

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are able
to complete all questions included on this exam in the time provided.

To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other resources
that provide more information about the scenario that is described in the case study. Each
question is independent of the other questions in this case study.

At the end of this case study, a review screen will appear. This screen allows you to review
your answers and to make changes before you move to the next section of the exam. After
you begin a new section, you cannot return to this section.

To start the case study

To display the first question in this case study, click the Next button. Use the buttons in the
left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the subsequent
tabs. When you are ready to answer a question, click the Question button to return to the
question.

Overview. Company Overview

Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by
moving to Fabric. The company plans to begin using Fabric for marketing analytics.

Overview. IT Structure

The company’s IT department has a team of data analysts and a team of data engineers
that use analytics systems.

The data engineers perform the ingestion, transformation, and loading of data. They prefer
to use Python or SQL to transform the data.

The data analysts query data and create semantic models and reports. They are qualified
to write queries in Power Query and T-SQL.

Existing Environment. Fabric

Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro
license mode.

Existing Environment. Source Systems

Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server
on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The host virtual
machine is on a private virtual network that has public access blocked. POS1 contains all
the sales transactions that were processed on the company’s website.

The company has a software as a service (SaaS) online marketing app named MAR1. MAR1
has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1 by
calling REST APIs. Each entity has a different endpoint.

Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files
in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that range in
size from 300 MB to 900 MB and relate to email interactions.

Existing Environment. Product Data

POS1 contains a product list and related data.

The data comes from the following three tables:

- Products

- ProductCategories

- ProductSubcategories

In the data, products are related to product subcategories, and subcategories are related
to product categories.

Existing Environment. Azure

Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:

- DataAnalysts: Contains the data analysts

- DataEngineers: Contains the data engineers

Contoso has an Azure subscription.

The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.

Existing Environment. User Problems

The VP of marketing at Contoso requires analysis on the effectiveness of different types of


email content. It typically takes a week to manually compile and analyze the data. Contoso
wants to reduce the time to less than one day by using Fabric.

The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes

Contoso plans to create the following two lakehouses:

- Lakehouse1: Will store both raw and cleansed data from the sources

- Lakehouse2: Will serve data in a dimensional model to users for analytical queries

Additional items will be added to facilitate data ingestion and transformation.

Contoso plans to use Azure Repos for source control in Fabric.

Requirements. Technical Requirements

The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to populate
the MAR1 data in the silver layer, including deduplication, the handling of missing values,
and the standardizing of capitalization.

Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.

Data imports must run simultaneously, when possible.

The use of email data from the Amazon S3 bucket must meet the following requirements:

- Minimize egress costs associated with cross-cloud data access.

- Prevent saving a copy of the raw data in the lakehouses.

Items that relate to data ingestion must meet the following requirements:

- The items must be source controlled alongside other workspace items.

- Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.

- No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.

- Development effort must be minimized and a built-in connection must be used to import
the source data.

- In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.

Once a week, old files that are no longer referenced by a Delta table log must be removed.

Requirements. Data Transformation

In the POS1 product data, ProductID values are unique. The product dimension in the gold
layer must include only active products from product list. Active products are identified by
an IsActive value of 1.

Some product categories and subcategories are NOT assigned to any product. They are
NOT analytically relevant and must be omitted from the product dimension in the gold
layer.

Requirements. Data Security

Security in Fabric must meet the following requirements:

- The data engineers must have read and write access to all the lakehouses, including the
underlying files.

- The data analysts must only have read access to the Delta tables in the gold layer.

- The data analysts must NOT have access to the data in the bronze and silver layers.

- The data engineers must be able to commit changes to source control in WorkspaceA.

You need to ensure that the data analysts can access the gold layer lakehouse.

What should you do?

 Add the DataAnalyst group to the Viewer role for WorkspaceA.

 Share the lakehouse with the DataAnalysts group and grant the Build reports on the
default semantic model permission.

 Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint
data permission. (correct)

 Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark
permission.

Question was not answered

Explanation:

Per the data analysts' access requirements, the analysts must only have read access to the Delta tables
in the gold layer and must not have access to the bronze and silver layers.

The gold layer data is typically queried via SQL endpoints. Granting the Read all SQL Endpoint data
permission allows data analysts to query the data using familiar SQL-based tools while
restricting access to the underlying files.
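
As a small illustration of what that permission enables (the gold-layer table name below is purely hypothetical), an analyst connected to the lakehouse SQL analytics endpoint can run ordinary read-only T-SQL:

-- Read-only query against the SQL analytics endpoint of the gold-layer lakehouse.
-- dbo.DimProduct is a placeholder for one of the gold-layer Delta tables.
SELECT TOP (100) *
FROM dbo.DimProduct;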

2. HOTSPOT

You need to recommend a method to populate the POS1 data to the lakehouse medallion
layers.

What should you recommend for each layer? To answer, select the appropriate options in
the answer area. NOTE: Each correct selection is worth one point.

 wrong

Explanation:

Bronze Layer: A pipeline Copy activity

The bronze layer is used to store raw, unprocessed data. The requirements specify that no
transformations should be applied before landing the data in this layer. Using a pipeline Copy
activity ensures minimal development effort, built-in connectors, and the ability to ingest the
data directly into the Delta format in the bronze layer.

Silver Layer: A notebook

The silver layer involves extensive data cleansing (deduplication, handling missing values, and
standardizing capitalization). A notebook provides the flexibility to implement complex
transformations and is well-suited for this task.

3. You need to ensure that usage of the data in the Amazon S3 bucket meets the technical
requirements.

What should you do?

 Create a workspace identity and enable high concurrency for the notebooks.

 Create a shortcut and ensure that caching is disabled for the workspace.correct
 Create a workspace identity and use the identity in a data pipeline.

 Create a shortcut and ensure that caching is enabled for the workspace.

Question was not answered

Explanation:

To ensure that the usage of the data in the Amazon S3 bucket meets the technical requirements,
we must address two key points:

- Minimize egress costs associated with cross-cloud data access: Using a shortcut ensures that
Fabric does not replicate the data from the S3 bucket into the lakehouse but rather provides
direct access to the data in its original location. This minimizes cross-cloud data transfer and
avoids additional egress costs.

- Prevent saving a copy of the raw data in the lakehouses: Disabling caching ensures that the
raw data is not copied or persisted in the Fabric workspace. The data is accessed on-demand
directly from the Amazon S3 bucket.

4. HOTSPOT

You need to create the product dimension.

How should you complete the Apache Spark SQL code? To answer, select the appropriate
options in the answer area. NOTE: Each correct selection is worth one point.

 wrong

Explanation:

Join between Products and ProductSubCategories:


- Use an INNER JOIN.

- The goal is to include only products that are assigned to a subcategory. An INNER JOIN ensures
that only matching records (i.e., products with a valid subcategory) are included.

Join between ProductSubCategories and ProductCategories:

- Use an INNER JOIN.

- Similar to the above logic, we want to include only subcategories assigned to a valid product
category. An INNER JOIN ensures this condition is met.

WHERE Clause

Condition: IsActive = 1

Only active products (where IsActive equals 1) should be included in the gold layer. This filters
out inactive products.
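
Putting those choices together, a hedged Spark SQL sketch of the gold-layer product dimension (the join keys and descriptive column names beyond ProductID and IsActive are assumptions):

-- Only active products, and only products that resolve to a subcategory and a
-- category; the INNER JOINs drop unassigned categories and subcategories.
SELECT p.ProductID,
       p.ProductName,
       sc.ProductSubcategoryName,
       c.ProductCategoryName
FROM Products AS p
INNER JOIN ProductSubcategories AS sc
    ON p.ProductSubcategoryID = sc.ProductSubcategoryID
INNER JOIN ProductCategories AS c
    ON sc.ProductCategoryID = c.ProductCategoryID
WHERE p.IsActive = 1;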

5. You need to populate the MAR1 data in the bronze layer.

Which two types of activities should you include in the pipeline? Each correct answer
presents part of the solution. NOTE: Each correct selection is worth one point.

 ForEach (correct)

 Copy data (correct)

 WebHook

 Stored procedure

Question was not answered

Explanation:

MAR1 has seven entities, each accessible via a different API endpoint. A ForEach activity is
required to iterate over these endpoints to fetch data from each one. It enables dynamic
execution of API calls for each entity.

The Copy data activity is the primary mechanism to extract data from REST APIs and load it into
the bronze layer in Delta format. It supports native connectors for REST APIs and Delta,
minimizing development effort.

6. Topic 2, Litware, Inc

Case Study

Overview

This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are able
to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other resources
that provide more information about the scenario that is described in the case study. Each
question is independent of the other questions in this case study.

At the end of this case study, a review screen will appear. This screen allows you to review
your answers and to make changes before you move to the next section of the exam. After
you begin a new section, you cannot return to this section.

To start the case study

To display the first question in this case study, click the Next button. Use the buttons in the
left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the subsequent
tabs. When you are ready to answer a question, click the Question button to return to the
question.

Overview

Litware, Inc. is a publishing company that has an online bookstore and several retail
bookstores worldwide. Litware also manages an online advertising business for the
authors it represents.

Existing Environment. Fabric Environment

Litware has a Fabric workspace named Workspace1. High concurrency is enabled for
Workspace1.

The company has a data engineering team that uses Python for data processing.

Existing Environment. Data Processing

The retail bookstores send sales data at the end of each business day, while the online
bookstore constantly provides logs and sales data to a central enterprise resource
planning (ERP) system.

Litware implements a medallion architecture by using the following three layers: bronze,
silver, and gold. The sales data is ingested from the ERP system as Parquet files that land in
the Files folder in a lakehouse. Notebooks are used to transform the files in a Delta table
for the bronze and silver layers. The gold layer is in a warehouse that has V-Order disabled.

Litware has image files of book covers in Azure Blob Storage. The files are loaded into the
Files folder.

Existing Environment. Sales Data

Month-end sales data is processed on the first calendar day of each month. Data that is
older than one month never changes.
In the source system, the sales data refreshes every six hours starting at midnight each
day.

The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and
historical data is captured.

The dataflow captures the following fields of the source:

- Sales Date

- Author

- Price

- Units

- SKU

A table named AuthorSales stores the sales data that relates to each author. The table
contains a column named AuthorEmail. Authors authenticate to a guest Fabric tenant by
using their email address.

Existing Environment. Security Groups

Litware has the following security groups:

- Sales

- Fabric Admins

- Streaming Admins

Existing Environment. Performance Issues

Business users perform ad-hoc queries against the warehouse. The business users
indicate that reports against the warehouse sometimes run for two hours and fail to load
as expected. Upon further investigation, the data engineering team receives the following
error message when the reports fail to load: “The SQL query failed while running.”

The data engineering team wants to debug the issue and find queries that cause more than
one failure.

When the authors have new book releases, there is often an increase in sales activity. This
increase slows the data ingestion process.

The company’s sales team reports that during the last month, the sales data has NOT been
up-to-date when they arrive at work in the morning.
Requirements. Planned Changes

Litware recently signed a contract to receive book reviews. The provider of the reviews
exposes the data in Amazon Simple Storage Service (Amazon S3) buckets.

Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data
will be streamed from a REST API.

Requirements. Version Control

Litware plans to implement a version control solution in Fabric that will use GitHub
integration and follow the principle of least privilege.

Requirements. Governance Requirements

To control data platform costs, the data platform must use only Fabric services and items.
Additional Azure resources must NOT be provisioned.

Requirements. Data Requirements

Litware identifies the following data requirements:

- Process the SEO data in near-real-time (NRT).

- Make the book reviews available in the lakehouse without making a copy of the data.

- When a new book cover image arrives in the Files folder, process the image as soon as
possible.

You need to implement the solution for the book reviews.

Which should you do?

 Create a Dataflow Gen2 dataflow.

 Create a shortcut. (correct)

 Enable external data sharing.

 Create a data pipeline.

Question was not answered

Explanation:

The requirement specifies that Litware plans to make the book reviews available in the
lakehouse without making a copy of the data. In this case, creating a shortcut in Fabric is the
most appropriate solution. A shortcut is a reference to the external data, and it allows Litware to
access the book reviews stored in Amazon S3 without duplicating the data into the lakehouse.

7. You need to resolve the sales data issue. The solution must minimize the amount of data
transferred.
What should you do?

 Split the dataflow into two dataflows.

 Configure scheduled refresh for the dataflow.

 Configure incremental refresh for the dataflow. Set Store rows from the past to 1 Month.

 Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Year.

 Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1
Month. (correct)

Question was not answered

Explanation:

The sales data issue can be resolved by configuring incremental refresh for the dataflow.
Incremental refresh allows for only the new or changed data to be processed, minimizing the
amount of data transferred and improving performance.

The solution specifies that data older than one month never changes, so setting the refresh
period to 1 Month is appropriate. This ensures that only the most recent month of data will be
refreshed, reducing unnecessary data transfers.

8. HOTSPOT

You need to troubleshoot the ad-hoc query issue.

How should you complete the statement? To answer, select the appropriate options in the
answer area. NOTE: Each correct selection is worth one point.
 wrong

Explanation:

SELECT last_run_start_time, last_run_command: These fields will help identify the execution
details of the long-running queries.

FROM queryinsights.long_running_queries: The correct solution is to check the long-running
queries using the queryinsights.long_running_queries view, which provides insights into queries
that take longer than expected to execute.

WHERE last_run_total_elapsed_time_ms > 7200000: This condition filters queries that took
more than 2 hours to complete (7200000 milliseconds), which is relevant to the issue
described.

AND number_of_failed_runs > 1: This condition is key for identifying queries that have failed
more than once, helping to isolate the problematic queries that cause failures and need
attention.
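
Assembled from the selections described above, the statement would read roughly as follows (a sketch only; the view and column names are the ones the explanation itself references):

-- Find warehouse queries that ran for more than two hours and failed more than once.
SELECT last_run_start_time,
       last_run_command
FROM queryinsights.long_running_queries
WHERE last_run_total_elapsed_time_ms > 7200000
  AND number_of_failed_runs > 1;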

Topic 3, Misc. Questions Set

9. You have a Fabric workspace.

You have semi-structured data.

You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be
written by using Spark.

What should you use to store the data?

 a lakehouse (correct)

 an eventhouse

 a datamart

 a warehouse

Question was not answered

Explanation:

A lakehouse is the best option for storing semi-structured data when you need to read it using T-
SQL, KQL, and Apache Spark. A lakehouse combines the flexibility of a data lake (which can
handle semi-structured and unstructured data) with the performance features of a data
warehouse. It allows data to be written using Apache Spark and can be queried using different
technologies such as T-SQL (for SQL-based querying), KQL (Kusto Query Language for
querying), and Apache Spark (for distributed processing). This solution is ideal when dealing
with semi-structured data and requiring a versatile querying approach.
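
As a brief, hedged sketch of the write path (table and column names are illustrative): data written from Spark lands as a Delta table in the lakehouse, which the SQL analytics endpoint then exposes to T-SQL, and which KQL can reach, for example, through a OneLake shortcut in an eventhouse.

-- Spark SQL in a Fabric notebook: write semi-structured telemetry as a Delta table.
CREATE TABLE IF NOT EXISTS telemetry_events (
    device_id  STRING,
    event_time TIMESTAMP,
    payload    STRING   -- semi-structured JSON kept as a string column
) USING DELTA;

INSERT INTO telemetry_events
VALUES ('sensor-01', current_timestamp(), '{"temp": 21.5}');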

10. You have a Fabric workspace that contains a warehouse named Warehouse1.

You have an on-premises Microsoft SQL Server database named Database1 that is
accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1.

Which item should you use?

 a Dataflow Gen1 dataflow

 a data pipeline (correct)

 a KQL queryset

 a notebook

Question was not answered

Explanation:

To copy data from an on-premises Microsoft SQL Server database (Database1) to a warehouse
(Warehouse1) in Microsoft Fabric, the best option is to use a data pipeline. A data pipeline in
Fabric allows for the orchestration of data movement, from source to destination, using
connectors, transformations, and scheduled workflows. Since the data is being transferred
from an on-premises database and requires the use of a data gateway, a data pipeline provides
the appropriate framework to facilitate this data movement efficiently and reliably.

11. You have a Fabric workspace that contains a warehouse named Warehouse1.

You have an on-premises Microsoft SQL Server database named Database1 that is
accessed by using an on-premises data gateway.

You need to copy data from Database1 to Warehouse1.

Which item should you use?

 an Apache Spark job definition

 a data pipeline (correct)

 a Dataflow Gen1 dataflow

 an eventstream

Question was not answered

Explanation:

To copy data from an on-premises Microsoft SQL Server database (Database1) to a warehouse
(Warehouse1) in Fabric, a data pipeline is the most appropriate tool. A data pipeline in Fabric is
designed to move data between various data sources and destinations, including on-premises
databases like SQL Server, and cloud-based storage like Fabric warehouses. The data pipeline
can handle the connection through an on-premises data gateway, which is required to access
on-premises data. This solution facilitates the orchestration of data movement and
transformations if needed.

12. You have a Fabric F32 capacity that contains a workspace. The workspace contains a
warehouse named DW1 that is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows
during the past year.

You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-
over-year values.

Users report that the performance of some of the reports has degraded over time and
some visuals show errors.

You need to resolve the performance issues.

The solution must meet the following requirements:

Provide the best query performance.

Minimize operational costs.

Which should you do?

Change the MD5 hash to SHA256.

Increase the capacity.

Enable V-Order.

Modify the surrogate keys to use a different data type. (correct)

Create views.


Explanation:

In this case, the key issue causing performance degradation likely stems from the use of MD5
hash surrogate keys. MD5 hashes are 128-bit values, which can be inefficient for large datasets
like the 500 million rows in your fact table. Using a more efficient data type for surrogate keys
(such as integer or bigint) would reduce the storage and processing overhead, leading to better
query performance. This approach will improve performance while minimizing operational costs
because it reduces the complexity of querying and indexing, as smaller data types are generally
faster and more efficient to process.
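
As a rough illustration of the size difference, compare a key stored as the 32-character hex string of an MD5 hash with an integer key. The table and column names below are hypothetical and only illustrate the storage contrast.

-- Hypothetical sketch: a hash key stored as text takes 32 bytes per value,
-- while a BIGINT key takes 8 bytes and joins and compresses far more efficiently.
CREATE TABLE dbo.DimProduct_Hashed
(
    ProductKey  CHAR(32)     NOT NULL,  -- MD5 hash rendered as hex
    ProductName VARCHAR(100) NOT NULL
);

CREATE TABLE dbo.DimProduct_Int
(
    ProductKey  BIGINT       NOT NULL,  -- compact integer surrogate key
    ProductName VARCHAR(100) NOT NULL
);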

13. HOTSPOT

You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the
following tables and columns.
You need to create an output that presents the summarized values of all the order
quantities by year and product. The results must include a summary of the order quantities
at the year level for all the products.

How should you complete the code? To answer, select the appropriate options in the
answer area. NOTE: Each correct selection is worth one point.

14. You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is
ingested into Lakehouse1 as one flat table.

The table contains the following columns.


You plan to load the data into a dimensional model and implement a star schema. From
the original flat table, you create two tables named FactSales and DimProduct. You will
track changes in DimProduct.

You need to prepare the data.

Which three columns should you include in the DimProduct table? Each correct answer
presents part of the solution. NOTE: Each correct selection is worth one point.

Date

ProductName (correct)

ProductColor (correct)

TransactionID

SalesAmount

ProductID (correct)


Explanation:

In a star schema, the DimProduct table serves as a dimension table that contains descriptive
attributes about products. It will provide context for the FactSales table, which contains
transactional data. The following columns should be included in the DimProduct table:

ProductName: The ProductName is an important descriptive attribute of the product, which is needed for analysis and reporting in a dimensional model.

ProductColor: ProductColor is another descriptive attribute of the product. In a star schema, it makes sense to include attributes like color in the dimension table to help categorize products in the analysis.

ProductID: ProductID is the primary key for the DimProduct table, which will be used to join the
FactSales table to the product dimension. It's essential for uniquely identifying each product in
the model.

15. You have a Fabric workspace named Workspace1 that contains a notebook named
Notebook1.
In Workspace1, you create a new notebook named Notebook2.

You need to ensure that you can attach Notebook2 to the same Apache Spark session as
Notebook1.

What should you do?

Enable high concurrency for notebooks. (correct)

Enable dynamic allocation for the Spark pool.

Change the runtime version.

Increase the number of executors.


Explanation:

To ensure that Notebook2 can attach to the same Apache Spark session as Notebook1, you
need to enable high concurrency for notebooks. High concurrency allows multiple notebooks to
share a Spark session, enabling them to run within the same Spark context and thus share
resources like cached data, session state, and compute capabilities. This is particularly useful
when you need notebooks to run in sequence or together while leveraging shared resources.

16. You have a Fabric workspace named Workspace1 that contains a lakehouse named
Lakehouse1.

Lakehouse1 contains the following tables:

- Orders

- Customer

- Employee

The Employee table contains Personally Identifiable Information (PII).

A data engineer is building a workflow that requires writing data to the Customer table,
however, the user does NOT have the elevated permissions required to view the contents
of the Employee table. You need to ensure that the data engineer can write data to the
Customer table without reading data from the Employee table.

Which three actions should you perform? Each correct answer presents part of the
solution. NOTE: Each correct selection is worth one point.

Share Lakehouse1 with the data engineer. (correct)

Assign the data engineer the Contributor role for Workspace2.

Assign the data engineer the Viewer role for Workspace2.

Assign the data engineer the Contributor role for Workspace1. (correct)

Migrate the Employee table from Lakehouse1 to Lakehouse2. (correct)

Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2.

Assign the data engineer the Viewer role for Workspace1.


Explanation:

To meet the requirements of ensuring that the data engineer can write data to the Customer table without reading data from the Employee table (which contains Personally Identifiable Information, or PII), you can implement the following steps:

Share Lakehouse1 with the data engineer.

By sharing Lakehouse1 with the data engineer, you provide the necessary access to the data
within the lakehouse. However, this access should be controlled through roles and permissions,
which will allow writing to the Customer table but prevent reading from the Employee table.

Assign the data engineer the Contributor role for Workspace1.

Assigning the Contributor role for Workspace1 grants the data engineer the ability to perform
actions such as writing to tables (e.g., the Customer table) within the workspace. This role
typically allows users to modify and manage data without necessarily granting them access to
view all data (e.g., PII data in the Employee table).

Migrate the Employee table from Lakehouse1 to Lakehouse2.

To prevent the data engineer from accessing the Employee table (which contains PII), you can
migrate the Employee table to a separate lakehouse (Lakehouse2) or workspace (Workspace2).
This separation of sensitive data ensures that the data engineer's access is restricted to the
Customer table in Lakehouse1, while the Employee table can be managed separately and
protected under different access controls.

17. You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data
and is used by multiple sales representatives.

You plan to implement row-level security (RLS).

You need to ensure that the sales representatives can see only their respective data.

Which warehouse object do you require to implement RLS?

STORED PROCEDURE

CONSTRAINT

SCHEMA

FUNCTION (correct)

Explanation:

To implement Row-Level Security (RLS) in a Fabric warehouse, you need to use a function that
defines the security logic for filtering the rows of data based on the user's identity or role. This
function can be used in conjunction with a security policy to control access to specific rows in a
table.

In the case of sales representatives, the function would define the filtering criteria (e.g., based
on a column such as SalesRepID or SalesRepName), ensuring that each representative can
only see their respective data.
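
A minimal sketch of that pattern follows the standard T-SQL predicate-function plus security-policy approach. The schema name, the dbo.Sales table, and the SalesRepEmail column are assumptions for illustration.

-- Assumed schema, table, and column names; the pattern itself is the standard RLS pattern.
CREATE SCHEMA Security;
GO
CREATE FUNCTION Security.fn_SalesRepFilter (@SalesRepEmail AS VARCHAR(256))
RETURNS TABLE
WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS fn_result
           WHERE @SalesRepEmail = USER_NAME();  -- each row is visible only to its owner
GO
CREATE SECURITY POLICY SalesRepPolicy
    ADD FILTER PREDICATE Security.fn_SalesRepFilter(SalesRepEmail)
    ON dbo.Sales
    WITH (STATE = ON);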

18. HOTSPOT

You have a Fabric workspace named Workspace1_DEV that contains the following items:

- 10 reports

- Four notebooks

- Three lakehouses

- Two data pipelines

- Two Dataflow Gen1 dataflows

- Three Dataflow Gen2 dataflows

- Five semantic models that each has a scheduled refresh policy

You create a deployment pipeline named Pipeline1 to move items from Workspace1_DEV
to a new workspace named Workspace1_TEST.

You deploy all the items from Workspace1_DEV to Workspace1_TEST.

For each of the following statements, select Yes if the statement is true. Otherwise, select
No. NOTE: Each correct selection is worth one point.

19. You have a Fabric deployment pipeline that uses three workspaces named Dev, Test,
and Prod.

You need to deploy an eventhouse as part of the deployment process.

What should you use to add the eventhouse to the deployment process?

GitHub Actions

a deployment pipeline (correct)

an Azure DevOps pipeline


Explanation:

A deployment pipeline in Fabric is designed to automate the process of deploying assets (such
as reports, datasets, eventhouses, and other objects) between environments like Dev, Test, and
Prod. Since you need to deploy an eventhouse as part of the deployment process, a deployment
pipeline is the appropriate tool to move this asset through the different stages of your
environment.

20. You have a Fabric workspace named Workspace1 that contains a warehouse named
Warehouse1.

You plan to deploy Warehouse1 to a new workspace named Workspace2.

As part of the deployment process, you need to verify whether Warehouse1 contains
invalid references. The solution must minimize development effort.

What should you use?

a database project

a deployment pipeline (correct)

a Python script

a T-SQL script


Explanation:

A deployment pipeline in Fabric allows you to deploy assets like warehouses, datasets, and
reports between different workspaces (such as from Workspace1 to Workspace2). One of the
key features of a deployment pipeline is the ability to check for invalid references before
deployment. This can help identify issues with assets, such as broken links or dependencies,
ensuring the deployment is successful without introducing errors. This is the most efficient way
to verify references and manage the deployment with minimal development effort.

21. You have a Fabric workspace that contains a Real-Time Intelligence solution and an
eventhouse.

Users report that from OneLake file explorer, they cannot see the data from the
eventhouse.

You enable OneLake availability for the eventhouse.

What will be copied to OneLake?

only data added to new databases that are added to the eventhouse

only the existing data in the eventhouse

no data

both new data and existing data in the eventhouse (correct)

only new data added to the eventhouse


Explanation:

When you enable OneLake availability for an eventhouse, both new and existing data in the
eventhouse will be copied to OneLake. This feature ensures that data, whether newly ingested
or already present, becomes available for access through OneLake, making it easier for users to
interact with and explore the data directly from OneLake file explorer.

22. You have a Fabric workspace named Workspace1.

You plan to integrate Workspace1 with Azure DevOps.

You will use a Fabric deployment pipeline named deployPipeline1 to deploy items from
Workspace1 to higher environment workspaces as part of a medallion architecture. You
will run deployPipeline1 by using an API call from an Azure DevOps pipeline.

You need to configure API authentication between Azure DevOps and Fabric.

Which type of authentication should you use?

service principal (correct)

Microsoft Entra username and password

managed private endpoint

workspace identity


Explanation:

When integrating Azure DevOps with Fabric (Workspace1), using a service principal is the
recommended authentication method. A service principal provides a way for applications (such
as an Azure DevOps pipeline) to authenticate and interact with resources securely. It allows
Azure DevOps to authenticate API calls to Fabric without requiring direct user credentials. This
method is ideal for automating tasks such as deploying items through a Fabric deployment
pipeline.

23. You have a Google Cloud Storage (GCS) container named storage1 that contains the
files shown in the following table.

You have a Fabric workspace named Workspace1 that has the cache for shortcuts
enabled. Workspace1 contains a lakehouse named Lakehouse1.

Lakehouse1 has the shortcuts shown in the following table.

You need to read data from all the shortcuts.

Which shortcuts will retrieve data from the cache?

Stores only

Products only

Stores and Products only (correct)

Products, Stores, and Trips

Trips only

Products and Trips only


Explanation:

When reading data from shortcuts in Fabric (in this case, from a lakehouse like Lakehouse1),
the cache for shortcuts helps by storing the data locally for quick access. The last accessed
timestamp and the cache expiration rules determine whether data is fetched from the cache or
from the source (Google Cloud Storage, in this case).

Products: The ProductFile.parquet was last accessed 12 hours ago. Since the cache has data
available for up to 12 hours, it is likely that this data will be retrieved from the cache, as it hasn't
been too long since it was last accessed.

Stores: The StoreFile.json was last accessed 4 hours ago, which is within the cache retention
period.

Therefore, this data will also be retrieved from the cache.

Trips: The TripsFile.csv was last accessed 48 hours ago. Given that it's outside the typical
caching window (assuming the cache has a maximum retention period of around 24 hours), it
would not be retrieved from the cache. Instead, it will likely require a fresh read from the source.

24. You have a Fabric workspace named Workspace1 that contains an Apache Spark job
definition named Job1.

You have an Azure SQL database named Source1 that has public internet access disabled.

You need to ensure that Job1 can access the data in Source1.

What should you create?

an on-premises data gateway

a managed private endpoint (correct)

an integration runtime

a data management gateway


Explanation:

To allow Job1 in Workspace1 to access an Azure SQL database (Source1) with public internet
access disabled, you need to create a managed private endpoint. A managed private endpoint
is a secure, private connection that enables services like Fabric (or other Azure services) to
access resources such as databases, storage accounts, or other services within a virtual
network (VNet) without requiring public internet access. This approach maintains the security
and integrity of your data while enabling access to the Azure SQL database.

25. You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3
bucket named storage2.

You have the Delta Parquet files shown in the following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts
enabled.

Workspace1 contains a lakehouse named Lakehouse1.

Lakehouse1 has the following shortcuts:

- A shortcut to ProductFile aliased as Products

- A shortcut to StoreFile aliased as Stores

- A shortcut to TripsFile aliased as Trips

The data from which shortcuts will be retrieved from the cache?

Trips and Stores only

Products and Stores only (correct)

Stores only

Products only

Products, Stores, and Trips


Explanation:

When the cache for shortcuts is enabled in Fabric, the data retrieval is governed by the caching
behavior, which generally retains data for a specific period after it was last accessed. The data
from the shortcuts will be retrieved from the cache if the data is stored in locations that support
caching.

Here's a breakdown based on the data's location:

Products: The ProductFile is stored in Azure Data Lake Storage Gen2 (storage1). Since Azure
Data Lake is a supported storage system in Fabric and the file is relatively small (50 MB), this
data is most likely cached and can be retrieved from the cache.

Stores: The StoreFile is stored in Amazon S3 (storage2), and even though it is stored in a
different cloud provider, Fabric can cache data from Amazon S3 if caching is enabled. This data
(25 MB) is likely cached and retrievable.

Trips: The TripsFile is stored in Amazon S3 (storage2) and is significantly larger (2 GB) compared
to the other files. While Fabric can cache data from Amazon S3, the larger size of the file (2 GB)
may exceed typical cache sizes or retention windows, causing this file to likely be retrieved
directly from the source instead of the cache.
Microsoft DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric (beta)
Total: 66 Questions
Link: https://certyiq.com/papers/microsoft/dp-700
Question: 1 CertyIQ
Case Study -

This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.

To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.

At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -

To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing environment, and problem
statements. If the case study has an All Information tab, note that the information displayed is identical to the
information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.

Overview. Company Overview -

Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -

The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.

The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.

The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.

Existing Environment. Fabric -

Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.

Existing Environment. Source Systems

Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.

The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.

Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.

Existing Environment. Product Data


POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.

Existing Environment. Azure -


Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers

Contoso has an Azure subscription.

The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.

The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -

Contoso plans to create the following two lakehouses:


Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.

Requirements. Technical Requirements

The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.

Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.

Data imports must run simultaneously, when possible.


The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.

Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.

No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.

Development effort must be minimized and a built-in connection must be used to import the source data.

In the event of a connectivity error, the ingestion processes must attempt the connection again.

Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.

Requirements. Data Transformation


In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from the product list. Active products are identified by an IsActive value of 1.

Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -

Security in Fabric must meet the following requirements:


The data engineers must have read and write access to all the lakehouses, including the underlying files.

The data analysts must only have read access to the Delta tables in the gold layer.

The data analysts must NOT have access to the data in the bronze and silver layers.

The data engineers must be able to commit changes to source control in WorkspaceA.

You need to ensure that the data analysts can access the gold layer lakehouse.

What should you do?

A.Add the DataAnalyst group to the Viewer role for WorkspaceA.


B.Share the lakehouse with the DataAnalysts group and grant the Build reports on the default semantic model
permission.
C.Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data permission.
D.Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark permission.

Answer: C

Explanation:

C: Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data permission. This
approach ensures that data analysts have the necessary read access to the Delta tables in the gold layer, aligning
with the requirement that they should not have access to data in the bronze and silver layers.

By granting Read all SQL Endpoint data permission, the analysts get the necessary and sufficient access to
query the gold layer data while adhering to the principle of least privilege.

Question: 2 CertyIQ
You have a Fabric workspace.
You have semi-structured data.
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using Spark.
What should you use to store the data?

A.a lakehouse
B.an eventhouse
C.a datamart
D.a warehouse

Answer: A

Explanation:

A lakehouse in Microsoft Fabric is designed to handle semi-structured and unstructured data, combining the
flexibility of a data lake with the structure of a data warehouse. It supports data writing via Apache Spark and
allows querying through T-SQL and KQL, making it suitable for the specified requirements.

A lakehouse combines the features of data lakes and data warehouses. It is designed to handle both
structured and semi-structured data, making it ideal for storing diverse data formats.

Question: 3 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?

A.a Dataflow Gen1 dataflow


B.a data pipeline
C.a KQL queryset
D.a notebook

Answer: B

Explanation:

B: a data pipeline.

A data pipeline is the most suitable tool for moving data between different sources and destinations. In this
case, you need to copy data from your on-premises Microsoft SQL Server database (Database1) to your Fabric
warehouse (Warehouse1). A data pipeline can efficiently handle this task by allowing you to define and
manage the data transfer process.

Question: 4 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?

A.an Apache Spark job definition


B.a data pipeline
C.a Dataflow Gen1 dataflow
D.an eventstream

Answer: B

Explanation:

B: a data pipeline.

A data pipeline is specifically designed for orchestrating and automating data movement tasks between
different sources and destinations. Here’s why a data pipeline is the best choice for copying data from your
on-premises Microsoft SQL Server database (Database1) to your Fabric warehouse (Warehouse1)

Data pipelines in Microsoft Fabric are designed to facilitate the movement and transformation of data
between various sources and destinations. In this scenario, a data pipeline can be configured to copy data
from the on-premises SQL Server database to the Fabric warehouse, utilizing the on-premises data gateway
for secure connectivity.

Question: 5 CertyIQ
You have a Fabric F32 capacity that contains a workspace. The workspace contains a warehouse named DW1 that
is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows during the past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-over-year values.
Users report that the performance of some of the reports has degraded over time and some visuals show errors.
You need to resolve the performance issues. The solution must meet the following requirements:
Provide the best query performance.
Minimize operational costs.
Which should you do?

A.Change the MD5 hash to SHA256.


B.Increase the capacity.
C.Enable V-Order.
D.Modify the surrogate keys to use a different data type.
E.Create views.

Answer: D

Explanation:

The best solution to resolve the performance issues while meeting the requirements of best query
performance and minimizing operational costs is:D. Modify the surrogate keys to use a different data type.

While MD5 hashes are deterministic and ensure uniqueness, they can be less efficient for join operations
compared to integer-based keys. This inefficiency arises because joining on lengthy string keys demands
more computational resources than joining on shorter, integer-based keys.Recommendation: Modify the
surrogate keys to use a different data type, specifically integers.

Question: 6 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the following tables and
columns.

You need to create an output that presents the summarized values of all the order quantities by year and product.
The results must include a summary of the order quantities at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:

Box1 -> SELECT YEAR || Box2 -> ROLLUP(YEAR(SO.ModifiedDATE), P.Name)

Explanation:
Key Details:

The use of ROLLUP ensures compliance with the requirement for summarized values at different grouping
levels.
SUM(SO.OrderQty) calculates the total order quantities.
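
Assembled into a full statement, the query might look like the following sketch. The table names and the join column are assumptions; the GROUP BY ROLLUP clause and the SUM aggregate match the answer above.

SELECT
    YEAR(SO.ModifiedDate) AS OrderYear,
    P.Name                AS ProductName,
    SUM(SO.OrderQty)      AS TotalOrderQty
FROM dbo.SalesOrderDetail AS SO          -- assumed table names
INNER JOIN dbo.Product AS P
    ON P.ProductID = SO.ProductID        -- assumed join column
GROUP BY ROLLUP (YEAR(SO.ModifiedDate), P.Name);

ROLLUP returns the per-year, per-product totals plus year-level subtotal rows (where the product name is NULL) and a grand total, which satisfies the requirement for a summary at the year level across all products.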

Question: 7 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is ingested into Lakehouse1 as
one flat table. The table contains the following columns.

You plan to load the data into a dimensional model and implement a star schema. From the original flat table, you
create two tables named FactSales and DimProduct. You will track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer presents part of the
solution.
NOTE: Each correct selection is worth one point.

A.Date
B.ProductName
C.ProductColor
D.TransactionID
E.SalesAmount
F.ProductID

Answer: BCF

Explanation:

B. ProductName: This attribute describes the product and is crucial for understanding and analyzing the data
related to each product.

C. ProductColor: This attribute provides additional information about the product, which can be useful for
analysis, reporting, and segmentation.

F. ProductID: This is the unique identifier for each product and serves as the primary key for the DimProduct
table. It's essential for establishing the relationship between the FactSales table and the DimProduct table.
Question: 8 CertyIQ
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as Notebook1.
What should you do?

A.Enable high concurrency for notebooks.


B.Enable dynamic allocation for the Spark pool.
C.Change the runtime version.
D.Increase the number of executors.

Answer: A

Explanation:

A.Enable high concurrency for notebooks: High concurrency allows multiple notebooks to share the same
Apache Spark session. This setting ensures that different notebooks can run simultaneously within the same
session, facilitating collaboration and efficient resource usage.

Question: 9 CertyIQ
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1. Lakehouse1
contains the following tables:

Orders -

Customer -

Employee -
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table, however, the user does
NOT have the elevated permissions required to view the contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without reading data from the
Employee table.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A.Share Lakehouse1 with the data engineer.


B.Assign the data engineer the Contributor role for Workspace2.
C.Assign the data engineer the Viewer role for Workspace2.
D.Assign the data engineer the Contributor role for Workspace1.
E.Migrate the Employee table from Lakehouse1 to Lakehouse2.
F.Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2.
G.Assign the data engineer the Viewer role for Workspace1.

Answer: DEF

Explanation:

D. Assign the data engineer the Contributor role for Workspace1:

Assigning the Contributor role to the data engineer for Workspace1 grants them the necessary permissions to
write data to the Customer table in Lakehouse1. However, since the data engineer does not have elevated
permissions to view the Employee table, they won't be able to access its content.
E. Migrate the Employee table from Lakehouse1 to Lakehouse2:

Moving the Employee table, which contains Personally Identifiable Information (PII), to a separate Lakehouse2
helps ensure that the data engineer cannot accidentally or intentionally access it. This action keeps sensitive
data segregated from the data engineer's operational environment.

F. Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2:

By creating a new workspace and lakehouse for the Employee table, you further isolate the sensitive data.
The data engineer can still perform their tasks in Workspace1 without accessing Workspace2, ensuring secure
data handling and compliance with privacy requirements.

Question: 10 CertyIQ
You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by multiple sales
representatives.
You plan to implement row-level security (RLS).
You need to ensure that the sales representatives can see only their respective data.
Which warehouse object do you require to implement RLS?

A.STORED PROCEDURE
B.CONSTRAINT
C.SCHEMA
D.FUNCTION

Answer: D

Explanation:

To implement Row-Level Security (RLS) in a Fabric warehouse like DW1, need to use a FUNCTION to define
the filtering logic. Specifically, a user-defined function (UDF) is created and associated with the RLS policy to
determine which rows each user can access.

Reference:

https://learn.microsoft.com/en-us/fabric/data-warehouse/tutorial-row-level-security#2-define-security-policies

Question: 11 CertyIQ
HOTSPOT -
You have a Fabric workspace named Workspace1_DEV that contains the following items:
10 reports

Four notebooks -

Three lakehouses -

Two data pipelines -

Two Dataflow Gen1 dataflows -

Three Dataflow Gen2 dataflows -


Five semantic models that each has a scheduled refresh policy
You create a deployment pipeline named Pipeline1 to move items from Workspace1_DEV to a new workspace
named Workspace1_TEST.
You deploy all the items from Workspace1_DEV to Workspace1_TEST.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

Answer: No/Yes/No

Explanation:

1. Data from the semantic models will be deployed to the target stage.

Answer: No
Semantic models are only deployed to the target stage in the form of metadata. The deployment process
does not copy actual data; instead, only the structural and configuration metadata (e.g., model schema and
measures) is deployed. The target stage will require a refresh to fetch the data into the semantic models.
Reference: Microsoft Learn - Item Properties Copied During Deployment

2.The Dataflow Gen1 dataflows will be deployed to the target stage.

Answer: Yes
Dataflow Gen1 objects are included in the deployment pipeline and are fully deployed to the target stage,
including their configurations. This ensures that Dataflow Gen1 pipelines can run in the target environment.
The deployment process supports this functionality without requiring a manual configuration.

3.The scheduled refresh policies will be deployed to the target stage.

Answer: No
The deployment process does not copy or deploy refresh schedules for datasets, semantic models, or other
items. Although metadata for the items is deployed, refresh schedules must be manually recreated or
configured in the target stage. This limitation is highlighted in Microsoft's documentation.
Reference: Microsoft Learn - Item Properties Copied During Deployment

Question: 12 CertyIQ
You have a Fabric deployment pipeline that uses three workspaces named Dev, Test, and Prod.
You need to deploy an eventhouse as part of the deployment process.
What should you use to add the eventhouse to the deployment process?

A.GitHub Actions
B.a deployment pipeline
C.an Azure DevOps pipeline

Answer: B

Explanation:

B. a deployment pipeline.

Deployment Pipeline: In Microsoft Fabric, a deployment pipeline is specifically designed for managing and
deploying resources across different environments (Dev, Test, and Prod). It allows you to automate the
deployment process, ensuring consistency and efficiency. By using a deployment pipeline, you can easily
include the eventhouse in your deployment process and manage its promotion through the different stages
(Dev, Test, Prod).

Reference:

https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/get-started-with-deployment-pipelines?tabs=from-fabric%2Cnew%2Cstage-settings-new

https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/understand-the-deployment-process?tabs=new

Question: 13 CertyIQ
You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1.
You plan to deploy Warehouse1 to a new workspace named Workspace2.
As part of the deployment process, you need to verify whether Warehouse1 contains invalid references. The
solution must minimize development effort.
What should you use?

A.a database project


B.a deployment pipeline
C.a Python script
D.a T-SQL script

Answer: B
Explanation:

Microsoft Fabric's deployment pipelines provide a built-in mechanism to manage and validate the deployment
of artifacts like warehouses. When you use a deployment pipeline to move Warehouse1 from one workspace
(Workspace1) to another (Workspace2), the pipeline automatically checks for issues such as invalid
references or missing dependencies during the deployment process.

Question: 14 CertyIQ
You have a Fabric workspace that contains a Real-Time Intelligence solution and an eventhouse.
Users report that from OneLake file explorer, they cannot see the data from the eventhouse.
You enable OneLake availability for the eventhouse.
What will be copied to OneLake?

A.only data added to new databases that are added to the eventhouse
B.only the existing data in the eventhouse
C.no data
D.both new data and existing data in the eventhouse
E.only new data added to the eventhouse

Answer: E

Explanation:

E. only new data added to the eventhouse.

When you enable OneLake availability for an eventhouse, only the new data that is added to the eventhouse
after enabling this setting will be copied to OneLake. The existing data present in the eventhouse prior to
enabling OneLake availability will not be copied automatically. This ensures that users can access the most
recent data through the OneLake file explorer while maintaining the efficiency of data synchronization.

Question: 15 CertyIQ
You have a Fabric workspace named Workspace1.
You plan to integrate Workspace1 with Azure DevOps.
You will use a Fabric deployment pipeline named deployPipeline1 to deploy items from Workspace1 to higher
environment workspaces as part of a medallion architecture. You will run deployPipeline1 by using an API call from
an Azure DevOps pipeline.
You need to configure API authentication between Azure DevOps and Fabric.
Which type of authentication should you use?

A.service principal
B.Microsoft Entra username and password
C.managed private endpoint
D.workspace identity

Answer: A

Explanation:

A. service principal.

Service Principal: A service principal is a security identity used by applications, services, and automation tools
to access specific Azure resources. It provides a secure way to authenticate and authorize API calls between
Azure DevOps and Fabric. By using a service principal, you can grant the necessary permissions to
deployPipeline1 to interact with the Fabric workspace (Workspace1) and deploy items to higher environments.
This approach ensures secure and managed access without relying on individual user credentials.

Question: 16 CertyIQ
You have a Google Cloud Storage (GCS) container named storage1 that contains the files shown in the following
table.

You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the shortcuts shown in the following table.

You need to read data from all the shortcuts.


Which shortcuts will retrieve data from the cache?

A.Stores only
B.Products only
C.Stores and Products only
D.Products, Stores, and Trips
E.Trips only
F.Products and Trips only

Answer: C

Explanation:

C. Stores and Products only.

When the cache for shortcuts is enabled in a Fabric workspace, it allows for faster access to the data by
caching the files locally. However, the effectiveness of this caching depends on whether the cache was
enabled before the files were added to the storage or if the shortcuts were already pointing to those files.

Question: 17 CertyIQ
You have a Fabric workspace named Workspace1 that contains an Apache Spark job definition named Job1.
You have an Azure SQL database named Source1 that has public internet access disabled.
You need to ensure that Job1 can access the data in Source1.
What should you create?

A.an on-premises data gateway


B.a managed private endpoint
C.an integration runtime
D.a data management gateway

Answer: B

Explanation:

B. a managed private endpoint.

Managed Private Endpoint: This allows secure and private communication between Azure services without
exposing data to the public internet. By creating a managed private endpoint, you can establish a direct
connection between the Apache Spark job in Workspace1 and the Azure SQL database (Source1) while
keeping public internet access disabled. This approach ensures that data transfer happens securely within the
Azure network.

To ensure that Job1 can access the data in Source1, you need to create a managed private endpoint. This will
allow the Spark job to securely connect to the Azure SQL database without requiring public internet access.

Question: 18 CertyIQ
You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3 bucket named storage2.
You have the Delta Parquet files shown in the following table.

You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the following shortcuts:
A shortcut to ProductFile aliased as Products
A shortcut to StoreFile aliased as Stores
A shortcut to TripsFile aliased as Trips
The data from which shortcuts will be retrieved from the cache?

A.Trips and Stores only


B.Products and Stores only
C.Stores only
D.Products only
E.Products, Stores, and Trips

Answer: B

Explanation:

B. Products and Stores only.

When the cache for shortcuts is enabled in a Fabric workspace, it allows for faster access to the data by
caching the files locally. This means that data accessed through the cached shortcuts is retrieved from the
local cache instead of the original storage locations, which improves performance.

Reference:

https://learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts
Question: 19 CertyIQ
HOTSPOT -
You have a Fabric workspace named Workspace1 that contains the items shown in the following table.

For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:

Box1 ->2 || Box2 -> 3 || Box3 -> 2 || Box4 -> 1


Explanation:
Question: 20 CertyIQ
Your company has a sales department that uses two Fabric workspaces named Workspace1 and Workspace2.
The company decides to implement a domain strategy to organize the workspaces.
You need to ensure that a user can perform the following tasks:
Create a new domain for the sales department.
Create two subdomains: one for the east region and one for the west region.
Assign Workspace1 to the east region subdomain.
Assign Workspace2 to the west region subdomain.
The solution must follow the principle of least privilege.
Which role should you assign to the user?

A.workspace Admin
B.domain admin
C.domain contributor
D.Fabric admin

Answer: D

Explanation:

Fabric Admin: Possesses the highest level of permissions within the Fabric environment, enabling the creation
of domains and subdomains, as well as the assignment of resources to those subdomains.

Question: 21 CertyIQ
You have a Fabric workspace named Workspace1 that contains a warehouse named DW1 and a data pipeline named
Pipeline1.
You plan to add a user named User3 to Workspace1.
You need to ensure that User3 can perform the following actions:
View all the items in Workspace1.
Update the tables in DW1.
The solution must follow the principle of least privilege.
You already assigned the appropriate object-level permissions to DW1.
Which workspace role should you assign to User3?

A.Admin
B.Member
C.Viewer
D.Contributor

Answer: B

Explanation:

Member: This role allows users to view and interact with all the items in the workspace. When combined with
the already assigned object-level permissions to DW1, it ensures that User3 can update the tables in DW1.

Question: 22 CertyIQ
You have a Fabric capacity that contains a workspace named Workspace1. Workspace1 contains a lakehouse
named Lakehouse1, a data pipeline, a notebook, and several Microsoft Power BI reports.
A user named User1 wants to use SQL to analyze the data in Lakehouse1.
You need to configure access for User1. The solution must meet the following requirements:
Provide User1 with read access to the table data in Lakehouse1.
Prevent User1 from using Apache Spark to query the underlying files in Lakehouse1.
Prevent User1 from accessing other items in Workspace1.
What should you do?

A.Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.
B.Assign User1 the Viewer role for Workspace1. Share Lakehouse1 with User1 and select Read all SQL endpoint
data.
C.Share Lakehouse1 with User1 directly and select Build reports on the default semantic model.
D.Assign User1 the Member role for Workspace1. Share Lakehouse1 with User1 and select Read all SQL
endpoint data.

Answer: A

Explanation:

A. Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.

Share Lakehouse1 with User1 directly and select Read all SQL endpoint data: This approach grants User1
read access specifically to the table data in Lakehouse1 through the SQL endpoint, without giving them
broader permissions in Workspace1 or access to other items. By directly sharing Lakehouse1 and selecting the
"Read all SQL endpoint data" option, you ensure User1 can use SQL to analyze the data while preventing them
from using Apache Spark to query the underlying files.

Question: 23 CertyIQ
DRAG DROP -
You are implementing the following data entities in a Fabric environment:
Entity1: Available in a lakehouse and contains data that will be used as a core organization entity
Entity2: Available in a semantic model and contains data that meets organizational standards
Entity3: Available in a Microsoft Power BI report and contains data that is ready for sharing and reuse
Entity4: Available in a Power BI dashboard and contains approved data for executive-level decision making
Your company requires that specific governance processes be implemented for the data.
You need to apply endorsement badges to the entities based on each entity’s use case.
Which badge should you apply to each entity? To answer, drag the appropriate badges the correct entities. Each
badge may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll
to view content.
NOTE: Each correct selection is worth one point.
Answer:

Explanation:

1.Master Data.

Refers to authoritative data that is central to business operations, often stored in a master data management
system.

This is typically well-maintained and used across multiple departments.

Assigned to Entity1, as it represents centralized and validated business data.

2.Certified.

Indicates that an entity (such as a dataset or report) is officially validated by an authority in the organization.

Typically used for trusted and critical business data.


Assigned to Entity2 because this entity meets the highest quality standards.

3.Promoted.

Indicates that an entity is recommended for use but is not fully certified.

This badge is usually given when an item is considered useful but has not gone through a formal approval
process.

Assigned to Entity3, which signifies that it is endorsed for use but not yet fully certified.

4. Cannot be Endorsed.

Indicates that an entity does not qualify for endorsement (either promoted or certified).

This could be due to low-quality data, lack of validation, or experimental datasets.

Assigned to Entity4, meaning it has not met the standards for endorsement.

Question: 24 CertyIQ
HOTSPOT -
You have three users named User1, User2, and User3.
You have the Fabric workspaces shown in the following table.

You have a security group named Group1 that contains User1 and User3.
The Fabric admin creates the domains shown in the following table.

User1 creates a new workspace named Workspace3.


You add Group1 to the default domain of Domain1.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Answer:

Explanation:

User3 has Viewer role access to Workspace3.

The "Yes" option is selected, meaning User3 does have Viewer access to Workspace3.

The Viewer role allows read-only access to the workspace but does not permit modifications.

User3 has Domain Contributor access to Domain1.

The "Yes" option is selected, meaning User3 has Domain Contributor permissions in Domain1.

The Domain Contributor role typically allows managing content within a domain but does not grant full admin
rights.

User2 has Contributor role access to Workspace3.

The "No" option is selected, meaning User2 does NOT have Contributor access to Workspace3.

The Contributor role would allow editing content in the workspace, but since "No" is selected, User2 lacks
these permissions.

Question: 25 CertyIQ
You have two Fabric workspaces named Workspace1 and Workspace2.
You have a Fabric deployment pipeline named deployPipeline1 that deploys items from Workspace1 to
Workspace2. DeployPipeline1 contains all the items in Workspace1.
You recently modified the items in Workspaces1.
The workspaces currently contain the items shown in the following table.
Items in Workspace1 that have the same name as items in Workspace2 are currently paired.
You need to ensure that the items in Workspace1 overwrite the corresponding items in Workspace2. The solution
must minimize effort.
What should you do?

A.Delete all the items in Workspace2, and then run deployPipeline1.


B.Rename each item in Workspace2 to have the same name as the items in Workspace1.
C.Back up the items in Workspace2, and then run deployPipeline1.
D.Run deployPipeline1 without modifying the items in Workspace2.

Answer: D

Explanation:

D. Run deployPipeline1 without modifying the items in Workspace2.

When items in Workspace1 and Workspace2 are paired and you run the deployment pipeline (deployPipeline1),
the pipeline will automatically update the paired items in Workspace2 with the changes made in Workspace1.
This means that the modifications in Workspace1 will overwrite the corresponding items in Workspace2
without requiring any additional steps.

Question: 26 CertyIQ
You have a Fabric workspace named Workspace1 that contains a data pipeline named Pipeline1 and a lakehouse
named Lakehouse1.
You have a deployment pipeline named deployPipeline1 that deploys Workspace1 to Workspace2.
You restructure Workspace1 by adding a folder named Folder1 and moving Pipeline1 to Folder1.
You use deployPipeline1 to deploy Workspace1 to Workspace2.
What occurs to Workspace2?

A.Folder1 is created, Pipeline1 moves to Folder1, and Lakehouse1 is deployed.


B.Only Pipeline1 and Lakehouse1 are deployed.
C.Folder1 is created, and Pipeline1 and Lakehouse1 move to Folder1.
D.Only Folder1 is created and Pipeline1 moves to Folder1.

Answer: A
Question: 27 CertyIQ
DRAG DROP -
Your company has a team of developers. The team creates Python libraries of reusable code that is used to
transform data.
You create a Fabric workspace name Workspace1 that will be used to develop extract, transform, and load (ETL)
solutions by using notebooks.
You need to ensure that the libraries are available by default to new notebooks in Workspace1.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of
actions to the answer area and arrange them in the correct order.

Answer:

Explanation:

Create an environment: Matches with "Create an environment."

This action involves defining an environment where libraries, dependencies, and configurations are managed.

Install the libraries :Matches with "Install the libraries."

Installing libraries involves setting up necessary packages required for development and execution.

Set the default environment :Matches with "Set the default environment."

This action defines a specific environment as the default for execution.


Question: 28 CertyIQ
You have a Fabric workspace that contains a lakehouse and a notebook named Notebook1. Notebook1 reads data
into a DataFrame from a table named Table1 and applies transformation logic. The data from the DataFrame is then
written to a new Delta table named Table2 by using a merge operation.
You need to consolidate the underlying Parquet files in Table1.
Which command should you run?

A.VACUUM
B.BROADCAST
C.OPTIMIZE
D.CACHE

Answer: C

Explanation:

OPTIMIZE: This command is used to compact small files into larger ones and optimize the layout of data in a
Delta table. By running the OPTIMIZE command on Table1, you can consolidate the Parquet files and improve
the performance of read and write operations on the table. To consolidate the underlying Parquet files in
Table1, you should run the OPTIMIZE command.
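
In a Fabric notebook this is a single Spark SQL statement; the optional ZORDER BY column shown in the comment is an assumption for illustration.

-- Compact the small Parquet files that make up the Delta table.
OPTIMIZE Table1;

-- Optionally co-locate rows on a frequently filtered column while compacting:
-- OPTIMIZE Table1 ZORDER BY (ProductID);   -- column name assumed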

Question: 29 CertyIQ
You have five Fabric workspaces.
You are monitoring the execution of items by using Monitoring hub.
You need to identify in which workspace a specific item runs.
Which column should you view in Monitoring hub?

A.Start time
B.Capacity
C.Activity name
D.Submitter
E.Item type
F.Job type
G.Location

Answer: G

Explanation:

The Location column displays the workspace in which each item runs, so it identifies which of the five workspaces executed a specific item.

Reference:

https://learn.microsoft.com/en-us/training/modules/monitor-fabric-items/3-use-monitor-hub
Question: 30 CertyIQ
You have a Fabric workspace that contains a warehouse named DW1. DW1 is loaded by using a notebook named
Notebook1.
You need to identify which version of Delta was used when Notebook1 was executed.
What should you use?

A.Real-Time hub
B.OneLake data hub
C.the Admin monitoring workspace
D.Fabric Monitor
E.the Microsoft Fabric Capacity Metrics app

Answer: D

Explanation:

D. Fabric Monitor.

Fabric Monitor: This tool provides detailed monitoring and logging capabilities for various components within
a Fabric workspace, including notebooks and data processing tasks. By using Fabric Monitor, you can track
and analyze the execution details of Notebook1, including the version of Delta used during its execution. This
information is crucial for debugging, auditing, and ensuring compatibility across different versions of Delta.

Question: 31 CertyIQ
DRAG DROP -
You have a Fabric workspace that contains a warehouse named Warehouse1.
In Warehouse1, you create a table named DimCustomer by running the following statement.

You need to set the CustomerKey column as a primary key of the DimCustomer table.
Which three code segments should you run in sequence? To answer, move the appropriate code segments from
the list of code segments to the answer area and arrange them in the correct order.

Answer:
Explanation:

1. ALTER TABLE dbo.DimCustomer

Adding a primary key constraint changes the structure of an existing table, so the statement must begin with ALTER TABLE.

2. ADD CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey)

This defines the primary key on the CustomerKey column. NONCLUSTERED means no clustered index is created, so the physical ordering of the data is unchanged.

3. NOT ENFORCED

A Fabric warehouse supports PRIMARY KEY constraints only when they are declared as both NONCLUSTERED and NOT ENFORCED. The constraint is metadata only and is not validated against the data, which favors query performance and fast data ingestion.

Combined, the statement is: ALTER TABLE dbo.DimCustomer ADD CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey) NOT ENFORCED;

Question: 32 CertyIQ
You have a Fabric workspace that contains a semantic model named Model1.
You need to dynamically execute and monitor the refresh progress of Model1.
What should you use?

A.dynamic management views in Microsoft SQL Server Management Studio (SSMS)


B.Monitoring hub
C.dynamic management views in Azure Data Studio
D.a semantic link in a notebook

Answer: D

Explanation:

D. a semantic link in a notebook.

Semantic link in a notebook: This approach allows you to dynamically execute operations and monitor the
refresh progress of the semantic model (Model1) within the interactive and flexible environment of a
notebook. By using a semantic link, you can write custom scripts to trigger the refresh process and track its
progress in real-time. This method provides a high degree of control and visibility over the operations on your
semantic model.

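A minimal sketch of the semantic link approach is shown below, assuming the SemPy library that ships with Fabric notebooks; treat the exact function names (refresh_dataset, list_refresh_requests) as assumptions to verify against the installed SemPy version.

import sempy.fabric as fabric

# Trigger a refresh of Model1 (returns a refresh request ID).
request_id = fabric.refresh_dataset(dataset="Model1")

# Monitor the refresh progress by listing the refresh requests for Model1.
refresh_status = fabric.list_refresh_requests(dataset="Model1")
print(refresh_status)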
Question: 33 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:

BikepointID -

Street -

Neighbourhood -

No_Bikes -

No_Empty_Docks -

Timestamp -

You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.

Solution: You use the following code segment:

Does this meet the goal?

A.Yes
B.No

Answer: B

Explanation:

The answer is B (No) because sort by orders values in descending order by default (https://learn.microsoft.com/en-us/kusto/query/sort-operator?view=microsoft-fabric). The query must specify asc to order No_Bikes in ascending order as required. The duplicated project at the end does not affect the final result.

Question: 34 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:

BikepointID -

Street -

Neighbourhood -

No_Bikes -

No_Empty_Docks -

Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.

Solution: You use the following code segment:

Does this meet the goal?

A.Yes
B.No

Answer: B

Explanation:

The default sorting order in KQL is descending (desc), not ascending (asc).
The solution does not explicitly specify asc in the order by clause, so the results will be sorted in descending
order by default.
The requirement is to sort the data by No_Bikes in ascending order, which is not achieved without explicitly
specifying asc.

Why other answers are not correct:

A. Yes: This would be incorrect because the solution fails to meet the requirement of sorting in ascending
order due to the default descending behavior in KQL.

Important Tip:

Always explicitly specify the sorting order (asc or desc) in KQL to avoid confusion, especially since its default
behavior differs from SQL.
Question: 35 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:

BikepointID -

Street -

Neighbourhood -

No_Bikes -

No_Empty_Docks -

Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.

Solution: You use the following code segment:

Does this meet the goal?

A.Yes
B.No

Answer: A

Explanation:

The sort and order operators are equivalent.

The provided code segment correctly filters the data for the neighborhood "Sands End" where the number of
bikes (No_Bikes) is at least 15. It then explicitly sorts the results by No_Bikes in ascending order using sort by
No_Bikes asc and projects the required columns (BikepointID, Street, Neighbourhood, No_Bikes,
No_Empty_Docks, Timestamp). This meets all the stated goals of the problem.

Why other answers are not correct:

B. No: This would be incorrect because the solution explicitly specifies asc in the sort by clause, ensuring the
data is ordered by No_Bikes in ascending order as required.

Important Tip:

Always ensure that the sorting order is explicitly specified in KQL to match the requirements, as the default
behavior might differ from other query languages like SQL.
Reference:

https://learn.microsoft.com/en-us/kusto/query/sort-operator?view=microsoft-fabric

Question: 36 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:

BikepointID -

Street -

Neighbourhood -

No_Bikes -

No_Empty_Docks -

Timestamp -

You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.

Solution: You use the following code segment:

Does this meet the goal?

A.Yes
B.No

Answer: B

Explanation:

The provided solution uses SQL syntax (SELECT, FROM, WHERE, ORDER BY), but the scenario specifies that
the data is in a KQL (Kusto Query Language) database. KQL and SQL have different syntax and functions. The
correct KQL syntax should be used to filter and sort the data in a KQL database.

Why other answers are not correct:

A. Yes: This would be incorrect because the solution uses SQL syntax instead of KQL, which is not applicable
in this context.

Important Tip:

Always use the query language that matches the data store. Here, KQL should be used instead of SQL to query the KQL database; the correct KQL query uses where (or its alias filter) to filter the rows, sort by No_Bikes asc to order them, and project to select the required columns.

Question: 37 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:

Sales Date -

Author -

Price -

Units -

SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:

Sales -

Fabric Admins -

Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.

Requirements. Planned Changes -


Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.

Requirements. Version Control -


Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
Requirements. Governance Requirements
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.

Requirements. Data Requirements -


Litware identifies the following data requirements:
Process the SEO data in near-real-time (NRT).
Make the book reviews available in the lakehouse without making a copy of the data.
When a new book cover image arrives in the Files folder, process the image as soon as possible.
You need to implement the solution for the book reviews.
Which should you do?

A.Create a Dataflow Gen2 dataflow.


B.Create a shortcut.
C.Enable external data sharing.
D.Create a data pipeline.

Answer: B

Explanation:

B. Create a shortcut.

Create a Shortcut: Creating a shortcut in the lakehouse allows you to link to external data sources without
making a copy of the data. This means you can make the book reviews available in the lakehouse by creating a
shortcut to the location where the book reviews are stored. The data remains in its original location but is
accessible from the lakehouse, meeting the requirement of not duplicating the data.

Question: 38 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:

Sales Date -

Author -

Price -

Units -

SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:

Sales -

Fabric Admins -

Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Requirements. Planned Changes -
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.

Requirements. Version Control -


Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
Requirements. Governance Requirements
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.

Requirements. Data Requirements -


Litware identifies the following data requirements:
Process the SEO data in near-real-time (NRT).
Make the book reviews available in the lakehouse without making a copy of the data.
When a new book cover image arrives in the Files folder, process the image as soon as possible.
You need to resolve the sales data issue. The solution must minimize the amount of data transferred.
What should you do?

A.Split the dataflow into two dataflows.


B.Configure scheduled refresh for the dataflow.
C.Configure incremental refresh for the dataflow. Set Store rows from the past to 1 Month.
D.Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Year.
E.Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Month.

Answer: E

Explanation:

E. Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Month.

This approach ensures minimal data transfer while keeping the refresh scope limited to the most recent and
relevant data (1 month), which is aligned with the requirement to minimize data transfer.

Question: 39 CertyIQ
HOTSPOT -

Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview. Company Overview -


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.

Existing Environment. Fabric -


Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Existing Environment. Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Existing Environment. Product Data
POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.

Existing Environment. Azure -


Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -


Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.
Requirements. Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -


Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to recommend a method to populate the POS1 data to the lakehouse medallion layers.
What should you recommend for each layer? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Answer:
Explanation:

1. Bronze layer: a pipeline Copy activity.

The bronze layer is the raw-ingestion layer of the medallion architecture. A Copy activity in a Fabric data pipeline uses a built-in connector to land the POS1 data in the bronze layer of Lakehouse1 in Delta format, which satisfies the requirements to minimize development effort and use a built-in connection.

2. Silver layer: a notebook.

The silver layer is where data is cleansed, validated, and enriched. A Fabric notebook (Spark) suits this transformation work and matches the data engineers' preference for Python or SQL before the data moves on to the gold layer.

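To illustrate the kind of silver-layer notebook work the case study describes for the MAR1 data (deduplication, handling missing values, standardizing capitalization), a minimal PySpark sketch is shown below; all table and column names are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # already defined in a Fabric notebook session

# Read the raw data from the bronze layer (table name is hypothetical).
bronze_df = spark.read.table("bronze_mar1")

silver_df = (
    bronze_df
    .dropDuplicates()                                     # deduplication
    .fillna({"email_address": "unknown"})                 # handle missing values (column name hypothetical)
    .withColumn("subject", F.initcap(F.col("subject")))   # standardize capitalization (column name hypothetical)
)

# Write the cleansed data to the silver layer as a Delta table (table name hypothetical).
silver_df.write.mode("overwrite").format("delta").saveAsTable("silver_mar1")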
Question: 40 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.
Overview. Company Overview -
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.

Existing Environment. Fabric -


Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Existing Environment. Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Existing Environment. Product Data
POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.

Existing Environment. Azure -


Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -


Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.
Requirements. Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -


Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that usage of the data in the Amazon S3 bucket meets the technical requirements.
What should you do?

A.Create a workspace identity and enable high concurrency for the notebooks.
B.Create a shortcut and ensure that caching is disabled for the workspace.
C.Create a workspace identity and use the identity in a data pipeline.
D.Create a shortcut and ensure that caching is enabled for the workspace.

Answer: D

Explanation:

Enabling caching for the workspace will help minimize egress costs by reducing the amount of data that
needs to be transferred across clouds. Creating a shortcut ensures that the raw data is not duplicated in the
lakehouse.

Question: 41 CertyIQ
HOTSPOT -

Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview. Company Overview -


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.

Existing Environment. Fabric -


Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Existing Environment. Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Existing Environment. Product Data
POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.

Existing Environment. Azure -


Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -


Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.
Requirements. Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -


Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to create the product dimension.
How should you complete the Apache Spark SQL code? To answer, select the appropriate options in the answer
area.
NOTE: Each correct selection is worth one point.

Answer:
Explanation:

Final answer:

First join: LEFT OUTER JOIN
Second join: INNER JOIN
WHERE clause: IsActive = 1

The first join is a LEFT OUTER JOIN so that all products from the product list are retained, even those that are not assigned to a subcategory. The second join is an INNER JOIN so that categories and subcategories that are not linked to any product are excluded, because they are not analytically relevant. The WHERE clause keeps only the active products (IsActive = 1), and because ProductID values are unique in the POS1 data, no further deduplication is needed for the product dimension in the gold layer.

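A minimal Spark SQL sketch (run from a PySpark notebook) that mirrors these selections is shown below. The join key and descriptive column names are assumptions, since the case study does not list the columns of the three tables.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already defined in a Fabric notebook session

# Build the gold-layer product dimension: keep every active product, drop categories and
# subcategories that are not assigned to any product. Column names are assumptions.
dim_product = spark.sql("""
    SELECT p.ProductID,
           p.ProductName,
           s.ProductSubcategoryName,
           c.ProductCategoryName
    FROM Products AS p
    LEFT OUTER JOIN ProductSubcategories AS s
        ON p.ProductSubcategoryID = s.ProductSubcategoryID
    INNER JOIN ProductCategories AS c
        ON s.ProductCategoryID = c.ProductCategoryID
    WHERE p.IsActive = 1
""")

dim_product.write.mode("overwrite").format("delta").saveAsTable("dim_product")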
Question: 42 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview. Company Overview -


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.

Existing Environment. Fabric -


Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Existing Environment. Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Existing Environment. Product Data
POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.

Existing Environment. Azure -


Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -


Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.
Requirements. Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -


Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to populate the MAR1 data in the bronze layer.
Which two types of activities should you include in the pipeline? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A.ForEach
B.Copy data
C.WebHook
D.Stored procedure

Answer: AB

Explanation:

ForEach: MAR1 exposes seven entities, each through a different REST API endpoint. A ForEach activity iterates over the list of entity endpoints and can run its inner activities in parallel, which satisfies the requirement that data imports run simultaneously when possible.

Copy data: Inside the ForEach loop, a Copy data activity uses the built-in REST connection to call each endpoint and land the data in the bronze layer of Lakehouse1 in Delta format. Setting the activity's retry option covers the transient connectivity errors that the team has experienced.

Question: 43 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains a warehouse named Warehouse1. Warehouse1 contains the following
tables and columns.

You need to denormalize the tables and include the ContractType and StartDate columns in the Employee table.
The solution must meet the following requirements:
Ensure that the StartDate column is of the date data type.
Ensure that all the rows from the Employee table are preserved and include any matching rows from the Contract
table.
Ensure that the result set displays the total number of employees per contract type for all the contract types that
have more than two employees.
How should you complete the statement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.


Answer:
Explanation:

1. CONVERT(date, c.StartDate) as StartDate

The CONVERT function is used to explicitly convert data types in SQL Server.

In this case, it converts c.StartDate to date format, which is appropriate.

2. LEFT OUTER JOIN between Employee and Contract tables

A LEFT OUTER JOIN ensures all employees are included, even if they do not have a corresponding contract.

If some employees do not have contracts, this join type ensures they are still listed with NULL contract values.

3. HAVING COUNT(DISTINCT EmployeeID) > 2

HAVING is used because COUNT(DISTINCT EmployeeID) is an aggregate function, and aggregate functions
cannot be used in WHERE.

HAVING filters groups after aggregation.

Question: 44 CertyIQ
HOTSPOT -
You have an Azure Event Hubs data source that contains weather data.
You ingest the data from the data source by using an eventstream named Eventstream1. Eventstream1 uses a
lakehouse as the destination.
You need to batch ingest only rows from the data source where the City attribute has a value of Kansas. The filter
must be added before the destination. The solution must minimize development effort.
What should you use for the data processor and filtering? To answer, select the appropriate options in the answer
area.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:

1. Data Processor: An eventstream with an external data source.

Eventstream refers to real-time streaming data processing.

Selecting "An eventstream with an external data source" means data is coming from an external system such
as IoT devices, logs, or real-time telemetry.

This is appropriate when dealing with real-time ingestion from sources like Azure Event Hubs, IoT Hub, or
Kafka.
2. Filtering: An eventstream processor.

Filtering in streaming systems typically happens during real-time data ingestion to remove irrelevant or
unnecessary events before further processing.

An eventstream processor can be used to apply transformations, filtering, and aggregations dynamically.

This ensures that only relevant data moves forward in the pipeline.

Question: 45 CertyIQ
You have a Fabric workspace that contains an eventstream named Eventstream1. Eventstream1 processes data
from a thermal sensor by using event stream processing, and then stores the data in a lakehouse.
You need to modify Eventstream1 to include the standard deviation of the temperature.
Which transform operator should you include in the Eventstream1 logic?

A.Expand
B.Group by
C.Union
D.Aggregate

Answer: B

Explanation:

The Group by transform operator includes the Standard deviation aggregation. The Aggregate transform operator only offers the Average, Max, Min, and Sum aggregations.

Reference:

https://learn.microsoft.com/en-us/fabric/real-time-intelligence/event-streams/process-events-using-event-processor-editor?pivots=standard-capabilities#group-by

Question: 46 CertyIQ
You have an Azure event hub. Each event contains the following fields:

BikepointID -

Street -

Neighbourhood -

Latitude -

Longitude -

No_Bikes -

No_Empty_Docks -
You need to ingest the events. The solution must only retain events that have a Neighbourhood value of Chelsea,
and then store the retained events in a Fabric lakehouse.
What should you use?

A.a KQL queryset


B.an eventstream
C.a streaming dataset
D.Apache Spark Structured Streaming

Answer: B

Explanation:

B. an eventstream.

Eventstream: An eventstream is specifically designed for processing and managing events in real-time. It
allows you to filter, transform, and route events efficiently. In this scenario, you can configure the
eventstream to retain only the events where the Neighbourhood value is "Chelsea" and then store the filtered
events in a Fabric lakehouse. This approach ensures that only the relevant events are ingested, adhering to
the requirement to retain only specific events based on the Neighbourhood value.

Question: 47 CertyIQ
HOTSPOT -
You are building a data loading pattern for Fabric notebook workloads.
You have the following code segment:

For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Answer:

Explanation:

"The target table will always be overwritten."

Selected Answer: No

In many data loading strategies, especially when using incremental loads or merge operations, the target
table is not always overwritten. Instead, new data is appended, updated, or merged based on keys.
Overwriting usually happens in full refresh scenarios, which is not always the case.

"The merge operation will always run."

Selected Answer: No

The merge operation (such as SQL MERGE or Delta Lake MERGE INTO) only runs if certain conditions are met,
such as the presence of new or changed data. If there is no data to update or merge, it may not execute. Thus,
it's correct to say that it does not always run.

"The loading pattern supports both full and incremental loading requirements."

Selected Answer: Yes

A well-designed data pipeline often supports both full and incremental loads. Full loads replace the entire
dataset, while incremental loads append or update only changed records. Since this is a common practice,
selecting "Yes" is correct.
Question: 48 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains two lakehouses named Lakehouse1 and Lakehouse2. Lakehouse1
contains staging data in a Delta table named Orderlines. Lakehouse2 contains a Type 2 slowly changing dimension
(SCD) dimension table named Dim_Customer.
You need to build a query that will combine data from Orderlines and Dim_Customer to create a new fact table
named Fact_Orders. The new table must meet the following requirements:
Enable the analysis of customer orders based on historical attributes.
Enable the analysis of customer orders based on the current attributes.
How should you complete the statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:
Explanation:

1. o.OrderDate >= c.valid_from_datetime

This ensures that the OrderDate falls on or after the start of the valid period.

This is essential to capture all orders that are valid based on the entity's timeline.

2. o.OrderDate < c.valid_to_datetime

This ensures that the OrderDate is strictly before the valid end date.

This prevents fetching orders that occur after the entity has expired.
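
Put together, the join sketched below illustrates the historical (as-of-order-date) lookup. The CustomerID business key and the selected columns are assumptions, since the full statement is shown only as an exhibit:

SELECT
    o.*,
    c.*                                        -- customer attributes as they were on the order date
FROM Orderlines AS o
INNER JOIN Dim_Customer AS c
    ON o.CustomerID = c.CustomerID             -- assumed business key
   AND o.OrderDate >= c.valid_from_datetime
   AND o.OrderDate <  c.valid_to_datetime;

Analysis on current attributes can then use the Dim_Customer row whose validity window is still open, which is why a Type 2 dimension supports both views of the data.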

Question: 49 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
In an external data source, you have data files that are 500 GB each. A new file is added every day.
You need to ingest the data into Lakehouse1 without applying any transformations. The solution must meet the
following requirements
Trigger the process when a new file is added.
Provide the highest throughput.
Which type of item should you use to ingest the data?

A.Eventstream
B.Dataflow Gen2
C.Streaming dataset
D.Data pipeline

Answer: D

Explanation:

D. Data pipeline.

Data pipeline: A data pipeline is designed to handle large-scale data ingestion and movement efficiently. It
can be configured to automatically trigger the ingestion process when a new file is added to the external data
source, ensuring that the data is ingested into Lakehouse1 as soon as it becomes available. Data pipelines are
optimized for high throughput, making them suitable for handling large files (like the 500 GB files mentioned)
and ensuring the process is both fast and efficient.
Question: 50 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
In an external data source, you have data files that are 500 GB each. A new file is added every day.
You need to ingest the data into Lakehouse1 without applying any transformations. The solution must meet the
following requirements
Trigger the process when a new file is added.
Provide the highest throughput.
Which type of item should you use to ingest the data?

A.Data pipeline
B.Environment
C.KQL queryset
D.Dataflow Gen2

Answer: A

Question: 51 CertyIQ
You have a Fabric workspace that contains an eventhouse and a KQL database named Database1. Database1 has
the following:

A table named Table1 -

A table named Table2 -

An update policy named Policy1 -


Policy1 sends data from Table1 to Table2.
The following is a sample of the data in Table2.
Recently, the following actions were performed on Table1:
An additional element named temperature was added to the StreamData column.
The data type of the Timestamp column was changed to date.
The data type of the DeviceId column was changed to string.
You plan to load additional records to Table2.
Which two records will load from Table1 to Table2? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

A.
B.

C.

D.

Answer: BD

Explanation:

Record B loads because it conforms to the updated schema (string DeviceId, StreamData with temperature).

Record D loads because it conforms to the original schema (guid DeviceId, no temperature in StreamData).

Question: 52 CertyIQ
HOTSPOT -
You have a Fabric workspace.
You are debugging a statement and discover the following issues:
Sometimes, the statement fails to return all the expected rows.
The PurchaseDate output column is NOT in the expected format of mmm dd, yy.
You need to resolve the issues. The solution must ensure that the data types of the results are retained. The results
can contain blank cells.
How should you complete the statement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.


Answer:
Explanation:

1. try_cast(item_name as varchar(20))

Function: TRY_CAST() is a safer alternative to CAST(), introduced in SQL Server 2012.

Purpose: Attempts to convert item_name into a VARCHAR(20). If conversion fails, it returns NULL instead of
an error.

2. convert(varchar, purchase_date, 7)

Function: CONVERT() is used to transform purchase_date into a string (VARCHAR).

Purpose: Converts purchase_date to a specific date format.

Format code 7 (Mon dd, yy):

Style 7 produces the Mon dd, yy format (e.g., Feb 05, 25 for February 5, 2025), which matches the required mmm dd, yy output.

Useful when presenting dates in a human-readable format.
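
A minimal sketch of the two expressions side by side (the source table name is an assumption, since the exhibit is not reproduced here):

SELECT
    TRY_CAST(item_name AS varchar(20)) AS item_name,      -- returns NULL (a blank cell) instead of raising an error
    CONVERT(varchar, purchase_date, 7) AS PurchaseDate     -- style 7 formats the date as Mon dd, yy, e.g. Feb 05, 25
FROM dbo.Purchases;                                        -- assumed table name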

Question: 53 CertyIQ
You are developing a data pipeline named Pipeline1.
You need to add a Copy data activity that will copy data from a Snowflake data source to a Fabric warehouse.
What should you configure?

A.Degree of copy parallelism


B.Fault tolerance
C.Enable staging
D.Enable logging

Answer: C

Explanation:

Enable Staging: A Copy data activity cannot stream data directly from Snowflake into a Fabric warehouse. The data must first be unloaded from Snowflake into an interim staging store and then bulk loaded into the warehouse, and enabling staging is what configures this two-step copy. Staging also helps when handling large datasets, because both the unload from Snowflake and the load into the warehouse use bulk, high-throughput paths, making the transfer more reliable and efficient.

Question: 54 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.

Reference contains reference data in the following format.

Both tables contain millions of rows.


You have the following KQL queryset.

You need to reduce how long it takes to run the KQL queryset.
Solution: You change the join type to kind=outer.
Does this meet the goal?

A.Yes
B.No

Answer: B

Explanation:

No. An outer join can be more computationally intensive than an inner join because it needs to process all rows
from both tables and include rows that don't have matching entries.

Question: 55 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.

Reference contains reference data in the following format.

Both tables contain millions of rows.


You have the following KQL queryset.

You need to reduce how long it takes to run the KQL queryset.
Solution: You change project to extend.

Does this meet the goal?

A.Yes
B.No

Answer: B
Explanation:

No. The `project` operator is used to select specific columns, whereas `extend` is used to add new calculated
columns to the result set. They serve different purposes.

Question: 56 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.

Reference contains reference data in the following format.

Both tables contain millions of rows.


You have the following KQL queryset.

You need to reduce how long it takes to run the KQL queryset.
Solution: You move the filter to line 02.

Does this meet the goal?

A.Yes
B.No

Answer: A

Explanation:

Yes. By applying the `where` clause early in the query, you reduce the number of rows processed in
subsequent operations, which improves performance.
Question: 57 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.

Reference contains reference data in the following format.

Both tables contain millions of rows.


You have the following KQL queryset.

You need to reduce how long it takes to run the KQL queryset.
Solution: You add the make_list() function to the output columns.

Does this meet the goal?

A.Yes
B.No

Answer: B

Explanation:

No. The `make_list()` function aggregates values into a list, which can be useful for certain types of analysis
but does not inherently improve query performance.

Question: 58 CertyIQ
HOTSPOT -

Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:

Sales Date -

Author -

Price -

Units -

SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:

Sales -

Fabric Admins -

Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Requirements. Planned Changes -
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.

Requirements. Version Control -


Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
Requirements. Governance Requirements
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.

Requirements. Data Requirements -


Litware identifies the following data requirements:
Process the SEO data in near-real-time (NRT).
Make the book reviews available in the lakehouse without making a copy of the data.
When a new book cover image arrives in the Files folder, process the image as soon as possible.
You need to troubleshoot the ad-hoc query issue.
How should you complete the statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:
Explanation:

queryinsights.frequently_run_queries

number_of_failed_runs > 1

Only the queryinsights.frequently_run_queries view exposes the columns referenced in the SELECT and WHERE clauses, and it directly supports the stated goal: the data engineering team wants to debug the issue and find queries that cause more than one failure.

Reference:

https://learn.microsoft.com/en-us/sql/relational-databases/system-views/queryinsights-frequently-run-queries-transact-sql?view=fabric&preserve-view=true
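
A minimal form of the troubleshooting query (returning all columns rather than guessing at the exact select list in the exhibit):

SELECT *
FROM queryinsights.frequently_run_queries
WHERE number_of_failed_runs > 1
ORDER BY number_of_failed_runs DESC;
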
Question: 59 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.

To start the case study -


To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions. Clicking these buttons displays information such
as business requirements, existing environment, and problem statements. If the case study has an All Information
tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you
are ready to answer a question, click the Question button to return to the question.

Overview. Company Overview -


Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.

Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.

Existing Environment. Fabric -


Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Existing Environment. Source Systems
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Existing Environment. Product Data
POS1 contains a product list and related data. The data comes from the following three tables:

Products -

ProductCategories -

ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.

Existing Environment. Azure -


Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
DataAnalysts: Contains the data analysts
DataEngineers: Contains the data engineers
Contoso has an Azure subscription.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.

Requirements. Planned Changes -


Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.
Requirements. Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.

Requirements. Data Security -


Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including the underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to schedule the population of the medallion layers to meet the technical requirements.
What should you do?

A.Schedule a data pipeline that calls other data pipelines.


B.Schedule a notebook.
C.Schedule an Apache Spark job.
D.Schedule multiple data pipelines.

Answer: A

Explanation:

A. Schedule a data pipeline that calls other data pipelines.

Schedule a data pipeline that calls other data pipelines: This approach allows you to orchestrate and
manage the population of medallion layers efficiently. By scheduling a main data pipeline that calls other data
pipelines, you can ensure that each step in the data processing workflow is executed in the correct sequence.
This method provides better modularity and manageability, as each sub-pipeline can focus on a specific layer
or task within the medallion architecture.

Question: 60 CertyIQ
HOTSPOT -
You are processing streaming data from an external data provider.
You have the following code segment.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:
Question: 61 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Lakehouse1 contains a Delta table
named Table1.
You analyze Table1 and discover that Table1 contains 2,000 Parquet files of 1 MB each.
You need to minimize how long it takes to query Table1.
What should you do?

A.Disable V-Order and run the OPTIMIZE command.


B.Disable V-Order and run the VACUUM command.
C.Run the OPTIMIZE and VACUUM commands.

Answer: C

Explanation:

C. Run the OPTIMIZE and VACUUM commands.

OPTIMIZE Command: Running the OPTIMIZE command on a Delta table helps to combine smaller files into
larger ones, which can significantly improve query performance. This process, known as compaction, reduces
the number of Parquet files that need to be read during a query, thereby decreasing query latency. In your
case, with 2,000 Parquet files of 1 MB each, running OPTIMIZE will consolidate these files into fewer, larger
files, making queries faster and more efficient.

VACUUM Command: The VACUUM command cleans up old versions of data files that are no longer needed,
which helps to free up storage space and maintain the performance of the Delta table. After running
OPTIMIZE, it's a good practice to run VACUUM to remove any obsolete files and further streamline the data
storage.
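
In a notebook attached to Lakehouse1, both commands can be run as Spark SQL against Table1, for example:

OPTIMIZE Table1;   -- compacts the 2,000 one-MB Parquet files into fewer, larger files
VACUUM Table1;     -- removes unreferenced files once they fall outside the retention period (7 days by default)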

Question: 62 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1. Data is loaded daily into Warehouse1
by using data pipelines and stored procedures.
You discover that the daily data load takes longer than expected.
You need to monitor Warehouse1 to identify the names of users that are actively running queries.
Which view should you use?

A.sys.dm_exec_connections
B.sys.dm_exec_requests
C.queryinsights.long_running_queries
D.queryinsights.frequently_run_queries
E.sys.dm_exec_sessions

Answer: E

Explanation:

sys.dm_exec_sessions: This view returns one row for every authenticated session on the warehouse, including the session ID, login name, login time, and status. By filtering on the sessions that are actively running requests, you can identify the names of the users that are currently executing queries against Warehouse1.
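
For example, a query along these lines lists the sessions that are currently executing work (which columns you project is up to you):

SELECT session_id, login_name, status, login_time
FROM sys.dm_exec_sessions
WHERE status = 'running';   -- sessions with an active request
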
Question: 63 CertyIQ
You have a Fabric workspace that contains an eventstream named EventStream1. EventStream1 outputs events to
a table in a lakehouse.
You need to remove files that are older than seven days and are no longer in use.
Which command should you run?

A.VACUUM
B.COMPUTE
C.OPTIMIZE
D.CLONE

Answer: A

Explanation:

The VACUUM command is used to clean up old files that are no longer in use, which fits the requirement of
removing files that are older than seven days. This command is typically used in data lake environments to
delete files that are no longer needed by the system, ensuring that storage is efficiently managed.

The default retention period for the VACUUM command is 7 days, therefore it will remove files older than 7
days.
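
The retention period can also be stated explicitly. A sketch against a placeholder table name (the destination table of EventStream1 is not named in the question):

VACUUM lakehouse_table RETAIN 168 HOURS;   -- 168 hours = 7 days; removes unreferenced files older than that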

Question: 64 CertyIQ
You have a Fabric warehouse named DW1 that loads data by using a data pipeline named Pipeline1. Pipeline1 uses a
Copy data activity with a dynamic SQL source. Pipeline1 is scheduled to run every 15 minutes.
You discover that Pipeline1 keeps failing.
You need to identify which SQL query was executed when the pipeline failed.
What should you do?

A.From Monitoring hub, select the latest failed run of Pipeline1, and then view the output JSON.
B.From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.
C.From Real-time hub, select Fabric events, and then review the details of Microsoft.Fabric.ItemReadFailed.
D.From Real-time hub, select Fabric events, and then review the details of Microsoft. Fabric.ItemUpdateFailed.

Answer: B

Explanation:

B. From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.

Monitoring hub: The Monitoring hub provides detailed logs and information about the execution of your data
pipelines. By selecting the latest failed run of Pipeline1, you can access the execution details and diagnose
the issue.

View the input JSON: The input JSON contains the parameters, configurations, and the dynamic SQL query
used for the Copy data activity. By examining the input JSON, you can identify the specific SQL query that was
executed at the time the pipeline failed. This information is crucial for troubleshooting the issue and
understanding why the pipeline keeps failing.

Question: 65 CertyIQ
You have a Fabric notebook named Notebook1 that has been executing successfully for the last week.
During the last run, Notebook1executed nine jobs.
You need to view the jobs in a timeline chart.
What should you use?

A.Real-Time hub
B.Monitoring hub
C.the job history from the application run
D.Spark History Server
E.the run series from the details of the application run

Answer: E

Explanation:

E. the run series from the details of the application run.

The run series from the details of the application run: This option allows you to view a detailed timeline of the
jobs that were executed during the last run of Notebook1. The run series provides a chronological view of all
the jobs, including their start and end times, which enables you to visualize the execution timeline effectively.

Question: 66 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains an eventstream named EventStream1.
You discover that an EventStream1 transformation fails.
You need to find the following error information:
The error details, including the occurrence time

The total number of errors -

What should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:

Explanation:

To find the error details:

The selected answer is Data insights.

"Data insights" typically provide an in-depth analysis of errors, offering detailed information about what went
wrong, where, and possibly why the error occurred. It helps in diagnosing and understanding the root cause of
an issue.

To find the total number of errors:

The selected answer is Runtime logs.

"Runtime logs" generally contain records of system execution, including the number of errors encountered.
These logs provide a summary of system performance and error occurrences, making them a reliable source
for determining the total number of errors.
Thank you
Thank you for being so interested in the premium exam material.
I'm glad to hear that you found it informative and helpful.

If you have any feedback or thoughts on the material, I would love to hear them.
Your insights can help us improve the writing and better understand our readers.

Best of Luck
You have worked hard to get to this point, and you are well-prepared for the exam.
Keep your head up, stay positive, and go show that exam what you're made of!


Total: 66 Questions
Link: https://certyiq.com/papers/microsoft/dp-700
ExamDiscuss

https://www.examdiscuss.com
Certification Exams Discussions and Preparation
Microsoft DP-700 Exam Information and Actual Questions
IT Certification Guaranteed, The Easy Way!

Exam : DP-700

Title : Implementing Data Engineering Solutions Using Microsoft Fabric

Vendor : Microsoft

Version : DEMO


QUESTION NO: 1
You need to schedule the population of the medallion layers to meet the technical
requirements.
What should you do?
A. Schedule a data pipeline that calls other data pipelines.
B. Schedule a notebook.
C. Schedule an Apache Spark job.
D. Schedule multiple data pipelines.
Answer: A
Explanation:
The technical requirements specify that:
Medallion layers must be fully populated sequentially (bronze → silver → gold). Each layer
must be populated before the next.
If any step fails, the process must notify the data engineers.
Data imports should run simultaneously when possible.
Why Use a Data Pipeline That Calls Other Data Pipelines?
A data pipeline provides a modular and reusable approach to orchestrating the sequential
population of medallion layers.
By calling other pipelines, each pipeline can focus on populating a specific layer (bronze,
silver, or gold), simplifying development and maintenance.
A parent pipeline can handle:
- Sequential execution of child pipelines.
- Error handling to send email notifications upon failures.
- Parallel execution of tasks where possible (e.g., simultaneous imports into the bronze layer).
Topic 1, Contoso, Ltd

QUESTION NO: 2


You need to populate the MAR1 data in the bronze layer.


Which two types of activities should you include in the pipeline? Each correct answer
presents part of the solution.
NOTE: Each correct selection is worth one point.
A. ForEach
B. Copy data
C. WebHook
D. Stored procedure
Answer: A,B
Explanation:
MAR1 has seven entities, each accessible via a different API endpoint. A ForEach activity is
required to iterate over these endpoints to fetch data from each one. It enables dynamic
execution of API calls for each entity.
The Copy data activity is the primary mechanism to extract data from REST APIs and load it
into the bronze layer in Delta format. It supports native connectors for REST APIs and Delta,
minimizing development effort.

QUESTION NO: 3
You need to ensure that usage of the data in the Amazon S3 bucket meets the technical
requirements.
What should you do?
A. Create a workspace identity and enable high concurrency for the notebooks.
B. Create a shortcut and ensure that caching is disabled for the workspace.
C. Create a workspace identity and use the identity in a data pipeline.
D. Create a shortcut and ensure that caching is enabled for the workspace.
Answer: B
Explanation:
To ensure that the usage of the data in the Amazon S3 bucket meets the technical
requirements, we must address two key points:
Minimize egress costs associated with cross-cloud data access: Using a shortcut ensures
that Fabric does not replicate the data from the S3 bucket into the lakehouse but rather
provides direct access to the data in its original location. This minimizes cross-cloud data
transfer and avoids additional egress costs.
Prevent saving a copy of the raw data in the lakehouses: Disabling caching ensures that the
raw data is not copied or persisted in the Fabric workspace. The data is accessed on-
demand directly from the Amazon S3 bucket.

QUESTION NO: 4
You need to recommend a method to populate the POS1 data to the lakehouse medallion
layers.
What should you recommend for each layer? To answer, select the appropriate options in the
answer area.
NOTE: Each correct selection is worth one point.


Answer:


QUESTION NO: 5
You need to ensure that the data analysts can access the gold layer lakehouse.
What should you do?
A. Add the DataAnalyst group to the Viewer role for WorkspaceA.
B. Share the lakehouse with the DataAnalysts group and grant the Build reports on the
default semantic model permission.
C. Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint
data permission.
D. Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark
permission.
Answer: C
Explanation:
Data Analysts' Access Requirements must only have read access to the Delta tables in the
gold layer and not have access to the bronze and silver layers.
The gold layer data is typically queried via SQL Endpoints. Granting the Read all SQL
Endpoint data permission allows data analysts to query the data using familiar SQL-based
tools while restricting access to the underlying files.

QUESTION NO: 6
You need to create the product dimension.
How should you complete the Apache Spark SQL code? To answer, select the appropriate
options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:
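
The answer exhibit is not reproduced here. As a sketch that is consistent with the stated requirements, only active products are kept, and unassigned categories and subcategories never appear because the query is driven from the product list. The name and key columns other than ProductID and IsActive are assumptions:

CREATE TABLE gold_dim_product AS
SELECT
    p.ProductID,
    p.ProductName,                                       -- assumed column
    s.ProductSubcategoryName,                            -- assumed column
    c.ProductCategoryName                                -- assumed column
FROM Products AS p
INNER JOIN ProductSubcategories AS s
    ON p.ProductSubcategoryID = s.ProductSubcategoryID   -- assumed key
INNER JOIN ProductCategories AS c
    ON s.ProductCategoryID = c.ProductCategoryID         -- assumed key
WHERE p.IsActive = 1;                                    -- active products only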


Topic 2, Litware, Inc



QUESTION NO: 7
You are building a data orchestration pattern by using a Fabric data pipeline named Dynamic
Data Copy as shown in the exhibit. (Click the Exhibit tab.)


Dynamic Data Copy does NOT use parametrization.


You need to configure the ForEach activity to receive the list of tables to be copied.
How should you complete the pipeline expression? To answer, select the appropriate options
in the answer area.
NOTE: Each correct selection is worth one point.

Answer:


QUESTION NO: 8
HOTSPOT
You need to troubleshoot the ad-hoc query issue.
How should you complete the statement? To answer, select the appropriate options in the
answer area.
NOTE: Each correct selection is worth one point.

Answer:


QUESTION NO: 9
You have a Fabric workspace named Workspace1 that contains the items shown in the
following table.

For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the
answer area.


NOTE: Each correct selection is worth one point.

Answer:


QUESTION NO: 10
You need to develop an orchestration solution in fabric that will load each item one after the
other. The solution must be scheduled to run every 15 minutes. Which type of item should
you use?
A. warehouse
B. data pipeline
C. Dataflow Gen2 dataflow
D. notebook
Answer: B

