DP-700 (6th April)
1. You have a Fabric workspace named Workspace1. You plan to configure Git integration for Workspace1. An Azure DevOps admin creates the required artifacts to support the integration of Workspace1. Which details do you require to perform the integration?
C. the personal access token (PAT) for Git authentication and the Git repository URL
2. You have a Fabric workspace that contains a lakehouse and a semantic model named Model1. You use a notebook named Notebook1 to ingest and transform data from an external data source. You need to execute Notebook1 as part of a data pipeline named Pipeline1. The process must meet the following requirements: Run daily at 07:00 AM UTC. Attempt to retry Notebook1 twice if the notebook fails. After Notebook1 executes successfully, refresh Model1. Select three actions.
C. Place the Semantic model refresh activity after the Notebook activity and link the activities by using an On completion condition.
F. Place the Semantic model refresh activity after the Notebook activity and link the activities by using the On success condition.
3. You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by multiple sales representatives. You plan to implement row-level security (RLS). You need to ensure that the sales representatives can see only their respective data. Which warehouse object do you require to implement RLS?
A. SECURITY POLICY
B. DATABASE ROLE
C. SCHEMA
D. TABLE
4. You have a Fabric workspace named Workspace1 that contains the following items: a Microsoft Power BI report named Report1, a Power BI dashboard named Dashboard1, a semantic model named Model1, and a lakehouse named Lakehouse1. Your company requires that specific governance processes be implemented for the items. Which items can you endorse in Fabric?
5. You have a Fabric workspace that contains a lakehouse named Lakehouse1. You plan to create a data pipeline named Pipeline1 to ingest data into Lakehouse1. You will use a parameter named param1 to pass an external value into Pipeline1. You need to ensure that the pipeline expression returns param1 as an int value. How should you specify the parameter value?
A. (pipeline().parameters.[param1]}
B. @pipeline().parameters.param1
C. (pipeline().parameters.param1)
D. @pipeline().parameters.param1)
6. You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1. You plan to deploy Warehouse1 to a new workspace named Workspace2. As part of the deployment process, you need to verify whether Warehouse1 contains invalid references. The solution must minimize development effort. What should you use?
A. a deployment pipeline
B. a Python script
C. a database project
D. a T-SQL script
7. You have a Fabric workspace that contains an eventstream named EventStream1. EventStream1 outputs events to a table named Table1 in a lakehouse. The streaming data is sourced from cars. You need to add a transformation to EventStream1 to average the car speeds. The speeds must be grouped by non-overlapping and contiguous time intervals of one minute. Each event must belong to only one window. Which windowing function should you use?
A. sliding
B. session
C. hopping
D. tumbling
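For context, a tumbling window divides the stream into fixed-size, non-overlapping, contiguous intervals, and each event falls into exactly one window. The Eventstream transformation itself is configured in the eventstream editor rather than written as code, but the following Spark SQL sketch illustrates the same aggregation; the car_events table and the event_time and speed columns are assumptions used only for illustration:
SELECT
    window(event_time, '1 minute') AS speed_window,  -- fixed one-minute, non-overlapping buckets
    AVG(speed) AS avg_speed
FROM car_events
GROUP BY window(event_time, '1 minute');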
8. You have a Fabric workspace that contains a warehouse named Warehouse1. You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-premises data gateway. You need to copy data from Database1 to Warehouse1. Which item should you use?
A. a data pipeline
D. a notebook
9. You have a Fabric workspace that contains an eventstream named Eventstream1. Eventstream1 processes data from a thermal sensor by using event stream processing and then stores the data in a lakehouse. You need to modify Eventstream1 to include the standard deviation of the temperature. Which transform operator should you include in the Eventstream1 logic?
A. Union
B. Aggregate
C. Group by
D. Expand
10. You have a Fabric workspace. You have semi-structured data. You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using Spark. What should you use to store the data?
A. a datamart
B. an eventhouse
C. a warehouse
D. a lakehouse
11. You have a Fabric workspace that contains a write-intensive warehouse named DW1. DW1 stores staging tables that are used to load a dimensional model. The tables are often read once, dropped, and then recreated to process new data. You need to minimize the load time of DW1. What should you do?
A. Disable V-Order.
C. Enable V-Order.
D. Create statistics.
12. You have a Fabric workspace that contains a semantic model named Model1. You need to monitor the refresh history of Model1 and visualize the refresh history in a chart. What should you use?
A. a data pipeline
C. a notebook
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power Query and T-SQL.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that has public access blocked. POS1 contains all the sales transactions that were processed on the company’s website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The entities contain data that relates to email open rates and interaction rates, as well as website interactions. The data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to email interactions.
POS1 contains a product list and related data. The data comes from the following three tables:
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient connectivity errors, which causes the data exports to fail.
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses fails, an email must be sent to the data engineers.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only active products from the product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant and must be omitted from the product dimension in the gold layer.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that the data analysts can access the gold layer lakehouse.
B. Share the lakehouse with the DataAnalysts group and grant the Build reports on the default semantic model permission.
C. Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data
permission.
D. Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark
permission.
Answer : C
A. a lakehouse
B. an eventhouse
C. a datamart
D. a warehouse
Answer : A
B. a data pipeline
C. a KQL queryset
D. a notebook
Answer : B
B. a data pipeline
D. an eventstream
Answer : B
You have a Fabric F32 capacity that contains a workspace. The workspace contains a warehouse named
DW1 that is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows during the
past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-over-year values.
Users report that the performance of some of the reports has degraded over time and some visuals show errors.
You need to resolve the performance issues. The solution must meet the following requirements:
Provide the best query performance.
Minimize operational costs.
What should you do?
C. Enable V-Order.
E. Create views.
Answer : D
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is ingested into
Lakehouse1 as one flat table. The table contains the following columns.
You plan to load the data into a dimensional model and implement a star schema. From the original flat
table, you create two tables named FactSales and DimProduct. You will track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer presents part of
the solution.
NOTE: Each correct selection is worth one point.
A. Date
B. ProductName
C. ProductColor
D. TransactionID
E. SalesAmount
F. ProductID
Answer : BCF
HOTSPOT -
You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the following tables
and columns.
You need to create an output that presents the summarized values of all the order quantities by year and product. The results must include a summary of the order quantities at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
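The exhibit and answer choices for this hotspot are not reproduced here. As a hedged illustration of the pattern the question describes, the following T-SQL sketch uses GROUP BY ROLLUP to produce per-product totals plus a year-level subtotal; the table and column names are assumptions, not the actual DW1 objects:
SELECT
    d.CalendarYear,
    p.ProductName,
    SUM(f.OrderQuantity) AS TotalOrderQuantity
FROM dbo.FactOrders AS f
JOIN dbo.DimDate AS d ON f.DateKey = d.DateKey
JOIN dbo.DimProduct AS p ON f.ProductKey = p.ProductKey
GROUP BY ROLLUP (d.CalendarYear, p.ProductName);  -- ROLLUP adds a subtotal row per year and a grand total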
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as Notebook1.
What should you do?
Answer : A
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1.
Lakehouse1 contains the following tables:
Orders -
Customer -
Employee -
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table, however, the
user does NOT have the elevated permissions required to view the contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without reading data
from the Employee table.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
F. Create a new workspace named Workspace2 that contains a new lakehouse named
Lakehouse2.
You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by
multiple sales representatives.
You plan to implement row-level security (RLS).
You need to ensure that the sales representatives can see only their respective data.
Which warehouse object do you require to implement RLS?
A. STORED PROCEDURE
B. CONSTRAINT
C. SCHEMA
D. FUNCTION
Answer : D
HOTSPOT -
You have a Fabric workspace named Workspace1_DEV that contains the following items:
10 reports
Four notebooks -
Three lakehouses -
A. GitHub Actions
B. a deployment pipeline
Answer : B
You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1.
You plan to deploy Warehouse1 to a new workspace named Workspace2.
As part of the deployment process, you need to verify whether Warehouse1 contains invalid references.
The solution must minimize development effort.
What should you use?
A. a database project
B. a deployment pipeline
C. a Python script
D. a T-SQL script
Answer : B
A. service principal
D. workspace identity
Answer : A
You have a Google Cloud Storage (GCS) container named storage1 that contains the files shown in the
following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1
contains a lakehouse named Lakehouse1. Lakehouse1 has the shortcuts shown in the following table.
A. Stores only
B. Products only
E. Trips only
Answer : C
You have a Fabric workspace named Workspace1 that contains an Apache Spark job definition named
Job1.
You have an Azure SQL database named Source1 that has public internet access disabled.
You need to ensure that Job1 can access the data in Source1.
What should you create?
C. an integration runtime
Answer : B
You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3 bucket named
storage2.
You have the Delta Parquet files shown in the following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1
contains a lakehouse named Lakehouse1. Lakehouse1 has the following shortcuts:
A shortcut to ProductFile aliased as Products
A shortcut to StoreFile aliased as Stores
A shortcut to TripsFile aliased as Trips
The data from which shortcuts will be retrieved from the cache?
C. Stores only
D. Products only
Answer : B
HOTSPOT -
You have a Fabric workspace named Workspace1 that contains the items shown in the following table.
For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Your company has a sales department that uses two Fabric workspaces named Workspace1 and
Workspace2.
The company decides to implement a domain strategy to organize the workspaces.
You need to ensure that a user can perform the following tasks:
Create a new domain for the sales department.
Create two subdomains: one for the east region and one for the west region.
Assign Workspace1 to the east region subdomain.
Assign Workspace2 to the west region subdomain.
The solution must follow the principle of least privilege.
Which role should you assign to the user?
A. workspace Admin
B. domain admin
C. domain contributor
D. Fabric admin
Answer : B
You have a Fabric workspace named Workspace1 that contains a warehouse named DW1 and a data
pipeline named Pipeline1.
You plan to add a user named User3 to Workspace1.
You need to ensure that User3 can perform the following actions:
View all the items in Workspace1.
Update the tables in DW1.
The solution must follow the principle of least privilege.
You already assigned the appropriate object-level permissions to DW1.
Which workspace role should you assign to User3?
A. Admin
B. Member
C. Viewer
D. Contributor
Answer : D
You have a Fabric capacity that contains a workspace named Workspace1. Workspace1 contains a
lakehouse named Lakehouse1, a data pipeline, a notebook, and several Microsoft Power BI reports.
A user named User1 wants to use SQL to analyze the data in Lakehouse1.
You need to configure access for User1. The solution must meet the following requirements:
Provide User1 with read access to the table data in Lakehouse1.
Prevent User1 from using Apache Spark to query the underlying files in Lakehouse1.
Prevent User1 from accessing other items in Workspace1.
What should you do?
A. Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.
B. Assign User1 the Viewer role for Workspace1. Share Lakehouse1 with User1 and select Read
all SQL endpoint data.
C. Share Lakehouse1 with User1 directly and select Build reports on the default semantic model.
D. Assign User1 the Member role for Workspace1. Share Lakehouse1 with User1 and select Read
all SQL endpoint data.
Answer : B
DRAG DROP -
You are implementing the following data entities in a Fabric environment:
Entity1: Available in a lakehouse and contains data that will be used as a core organization entity
Entity2: Available in a semantic model and contains data that meets organizational standards
Entity3: Available in a Microsoft Power BI report and contains data that is ready for sharing and reuse
Entity4: Available in a Power BI dashboard and contains approved data for executive-level decision making
Your company requires that specific governance processes be implemented for the data.
You need to apply endorsement badges to the entities based on each entity’s use case.
Which badge should you apply to each entity? To answer, drag the appropriate badges to the correct entities. Each badge may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
HOTSPOT -
You have three users named User1, User2, and User3.
You have the Fabric workspaces shown in the following table.
You have a security group named Group1 that contains User1 and User3.
The Fabric admin creates the domains shown in the following table.
Items in Workspace1 that have the same name as items in Workspace2 are currently paired.
You need to ensure that the items in Workspace1 overwrite the corresponding items in Workspace2. The
solution must minimize effort.
What should you do?
B. Rename each item in Workspace2 to have the same name as the items in Workspace1.
Answer : D
You have a Fabric workspace named Workspace1 that contains a data pipeline named Pipeline1 and a
lakehouse named Lakehouse1.
You have a deployment pipeline named deployPipeline1 that deploys Workspace1 to Workspace2.
You restructure Workspace1 by adding a folder named Folder1 and moving Pipeline1 to Folder1.
You use deployPipeline1 to deploy Workspace1 to Workspace2.
What occurs to Workspace2?
Answer : A
DRAG DROP -
Your company has a team of developers. The team creates Python libraries of reusable code that is used
to transform data.
You create a Fabric workspace named Workspace1 that will be used to develop extract, transform, and load (ETL) solutions by using notebooks.
You need to ensure that the libraries are available by default to new notebooks in Workspace1.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
You have a Fabric workspace that contains a lakehouse and a notebook named Notebook1. Notebook1
reads data into a DataFrame from a table named Table1 and applies transformation logic. The data from the DataFrame is then written to a new Delta table named Table2 by using a merge operation.
You need to consolidate the underlying Parquet files in Table1.
Which command should you run?
A. VACUUM
B. BROADCAST
C. OPTIMIZE
D. CACHE
Answer : C
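For reference, a minimal Spark SQL sketch of the compaction command, run against the table named in the question:
OPTIMIZE Table1;  -- compacts the small underlying Parquet files of the Delta table into fewer, larger files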
A. Start time
B. Capacity
C. Activity name
D. Submitter
E. Item type
F. Job type
G. Location
Answer : G
You have a Fabric workspace that contains a warehouse named DW1. DW1 is loaded by using a notebook
named Notebook1.
You need to identify which version of Delta was used when Notebook1 was executed.
What should you use?
A. Real-Time hub
D. Fabric Monitor
Answer : C
You have an Azure Event Hubs data source that contains weather data.
You ingest the data from the data source by using an eventstream named Eventstream1.
Eventstream1 uses a lakehouse as the destination.
You need to batch ingest only rows from the data source where the City attribute has a value of Kansas. The filter must be added before the destination. The solution must minimize development effort.
What should you use for the data processor and filtering? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
In Warehouse1, you create a table named DimCustomer by running the following statement.
You need to set the Customerkey column as a primary key of the DimCustomer table.
Which three code segments should you run in sequence? To answer, move the appropriate code
segments from the list of code segments to the answer area and arrange them in the correct order.
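The individual code segments are not reproduced here. As a hedged sketch of the expected end result, a Fabric warehouse primary key must be declared as NONCLUSTERED and NOT ENFORCED; the constraint name below is an assumption, and the answer area splits the equivalent statement into ordered segments:
ALTER TABLE dbo.DimCustomer
ADD CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (Customerkey) NOT ENFORCED;  -- Fabric warehouses support only non-enforced, nonclustered primary keys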
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database.
- BikepointID
- Street
- Neighbourhood
- No_Bikes
- No_Empty_Docks
- Timestamp
You need to apply transformation and filter logic to prepare the data for consumption. The solution
must return data for a neighbourhood named Sands End when No_Bikes is at least 15. The results
must be ordered by No_Bikes in ascending order.
Yes
No (correct)
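The proposed solution for this question is not reproduced here. A minimal KQL sketch of a query that would satisfy the stated conditions (Sands End neighbourhood, at least 15 bikes, ordered by No_Bikes ascending) follows, using the columns listed above:
Bike_Location
| where Neighbourhood == "Sands End" and No_Bikes >= 15
| project BikepointID, Street, Neighbourhood, No_Bikes, No_Empty_Docks, Timestamp
| sort by No_Bikes asc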
You need to add a Copy data activity that will copy data from a Snowflake data source to a Fabric warehouse.
Fault tolerance
Enable staging (correct)
Enable logging
Explanation:
Copying data from a Snowflake source into a Fabric warehouse goes through an intermediate staging store, so the Enable staging option must be selected.
You have a Fabric warehouse named DW1 that loads data by using a data pipeline named Pipeline1. Pipeline1 uses a Copy data activity with a dynamic SQL source. Pipeline1 is scheduled to run every 15 minutes.
From Monitoring hub, select the latest failed run of Pipeline1, and then view the output JSON.
From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.
From Real-Time hub, select Fabric events, and then review the details of Microsoft.Fabric.ItemReadFailed.
From Real-Time hub, select Fabric events, and then review the details of Microsoft.Fabric.ItemUpdateFailed.
You need to ensure that usage of the data in the Amazon S3 bucket meets the technical requirements.
Create a workspace identity and enable high concurrency for the notebooks.
Create a shortcut and ensure that caching is disabled for the workspace. (correct)
Create a shortcut and ensure that caching is enabled for the workspace.
You need to schedule the population of the medallion layers to meet the technical requirements.
Schedule a notebook.
Case Study
Overview
This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are able
to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other resources
that provide more information about the scenario that is described in the case study. Each
question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review
your answers and to make changes before you move to the next section of the exam. After
you begin a new section, you cannot return to this section.
To display the first question in this case study, click the Next button. Use the buttons in the
left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the subsequent
tabs. When you are ready to answer a question, click the Question button to return to the
question.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by
moving to Fabric. The company plans to begin using Fabric for marketing analytics.
Overview. IT Structure
The company’s IT department has a team of data analysts and a team of data engineers
that use analytics systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer
to use Python or SQL to transform the data.
The data analysts query data and create semantic models and reports. They are qualified
to write queries in Power Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro
license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server
on Azure Virtual Machines in the same Microsoft Entra tenant as Fabric. The host virtual
machine is on a private virtual network that has public access blocked. POS1 contains all
the sales transactions that were processed on the company’s website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1
has seven entities. The entities contain data that relates to email open rates and
interaction rates, as well as website interactions. The data can be exported from MAR1 by
calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files
in an Amazon Simple Storage Service (Amazon S3) bucket. There are 12 files that range in
size from 300 MB to 900 MB and relate to email interactions.
- Products
- ProductCategories
- ProductSubcategories
In the data, products are related to product subcategories, and subcategories are related
to product categories.
Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
The company has an existing Azure DevOps organization and creates a new project for
repositories that relate to Fabric.
The data engineering team has successfully exported data from MAR1. The team
experiences transient connectivity errors, which causes the data exports to fail.
- Lakehouse1: Will store both raw and cleansed data from the sources
- Lakehouse2: Will serve data in a dimensional model to users for analytical queries
The new lakehouses must follow a medallion architecture by using the following three
layers: bronze, silver, and gold. There will be extensive data cleansing required to populate
the MAR1 data in the silver layer, including deduplication, the handling of missing values,
and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Items that relate to data ingestion must meet the following requirements:
- Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
- No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
- Development effort must be minimized and a built-in connection must be used to import
the source data.
- In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
In the POS1 product data, ProductID values are unique. The product dimension in the gold
layer must include only active products from product list. Active products are identified by
an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are
NOT analytically relevant and must be omitted from the product dimension in the gold
layer.
- The data engineers must have read and write access to all the lakehouses, including the
underlying files.
- The data analysts must only have read access to the Delta tables in the gold layer.
- The data analysts must NOT have access to the data in the bronze and silver layers.
- The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that the data analysts can access the gold layer lakehouse.
Share the lakehouse with the DataAnalysts group and grant the Build reports on the default semantic model permission. (correct)
Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint
data permission.
Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark
permission.
Explanation:
The data analysts must only have read access to the Delta tables in the gold layer and must not have access to the bronze and silver layers.
The gold layer data is typically queried through the SQL analytics endpoint. Granting the Read all SQL Endpoint data permission allows the data analysts to query the data by using familiar SQL-based tools while restricting access to the underlying files.
2. HOTSPOT
You need to recommend a method to populate the POS1 data to the lakehouse medallion
layers.
What should you recommend for each layer? To answer, select the appropriate options in
the answer area. NOTE: Each correct selection is worth one point.
Explanation:
The bronze layer is used to store raw, unprocessed data. The requirements specify that no
transformations should be applied before landing the data in this layer. Using a pipeline Copy
activity ensures minimal development effort, built-in connectors, and the ability to ingest the
data directly into the Delta format in the bronze layer.
The silver layer involves extensive data cleansing (deduplication, handling missing values, and
standardizing capitalization). A notebook provides the flexibility to implement complex
transformations and is well-suited for this task.
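As a hedged illustration of the silver-layer cleansing described above, the following Spark SQL sketch deduplicates rows, standardizes capitalization, and handles missing values; the bronze and silver schema names and the column names are assumptions, not objects from the case study:
CREATE OR REPLACE TABLE silver.mar1_email AS
SELECT DISTINCT                                   -- deduplication
    LOWER(TRIM(EmailAddress)) AS EmailAddress,    -- standardize capitalization
    COALESCE(OpenRate, 0) AS OpenRate             -- handle missing values
FROM bronze.mar1_email;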
3. You need to ensure that usage of the data in the Amazon S3 bucket meets the technical
requirements.
Create a workspace identity and enable high concurrency for the notebooks.
Create a shortcut and ensure that caching is disabled for the workspace. (correct)
Create a workspace identity and use the identity in a data pipeline.
Create a shortcut and ensure that caching is enabled for the workspace.
Explanation:
To ensure that the usage of the data in the Amazon S3 bucket meets the technical requirements,
we must address two key points:
- Minimize egress costs associated with cross-cloud data access: Using a shortcut ensures that
Fabric does not replicate the data from the S3 bucket into the lakehouse but rather provides
direct access to the data in its original location. This minimizes cross-cloud data transfer and
avoids additional egress costs.
- Prevent saving a copy of the raw data in the lakehouses: Disabling caching ensures that the
raw data is not copied or persisted in the Fabric workspace. The data is accessed on-demand
directly from the Amazon S3 bucket.
4. HOTSPOT
How should you complete the Apache Spark SQL code? To answer, select the appropriate
options in the answer area. NOTE: Each correct selection is worth one point.
Explanation:
- The goal is to include only products that are assigned to a subcategory. An INNER JOIN ensures
that only matching records (i.e., products with a valid subcategory) are included.
- Similar to the above logic, we want to include only subcategories assigned to a valid product
category. An INNER JOIN ensures this condition is met.
WHERE Clause
Condition: IsActive = 1
Only active products (where IsActive equals 1) should be included in the gold layer. This filters
out inactive products.
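A Spark SQL sketch consistent with that explanation is shown below; the join column names are assumptions based on the case study tables, not the exact code from the exhibit:
SELECT p.ProductID, p.ProductName, sc.SubcategoryName, c.CategoryName
FROM Products AS p
INNER JOIN ProductSubcategories AS sc ON p.SubcategoryID = sc.SubcategoryID  -- keep only products assigned to a subcategory
INNER JOIN ProductCategories AS c ON sc.CategoryID = c.CategoryID            -- keep only subcategories assigned to a category
WHERE p.IsActive = 1;                                                        -- active products only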
Which two types of activities should you include in the pipeline? Each correct answer
presents part of the solution. NOTE: Each correct selection is worth one point.
ForEach (correct)
Copy data (correct)
WebHook
Stored procedure
Explanation:
MAR1 has seven entities, each accessible via a different API endpoint. A ForEach activity is
required to iterate over these endpoints to fetch data from each one. It enables dynamic
execution of API calls for each entity.
The Copy data activity is the primary mechanism to extract data from REST APIs and load it into
the bronze layer in Delta format. It supports native connectors for REST APIs and Delta,
minimizing development effort.
Case Study
Overview
This is a case study. Case studies are not timed separately. You can use as much exam
time as you would like to complete each case. However, there may be additional case
studies and sections on this exam. You must manage your time to ensure that you are able
to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information
that is provided in the case study. Case studies might contain exhibits and other resources
that provide more information about the scenario that is described in the case study. Each
question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review
your answers and to make changes before you move to the next section of the exam. After
you begin a new section, you cannot return to this section.
To display the first question in this case study, click the Next button. Use the buttons in the
left pane to explore the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing
environment, and problem statements. If the case study has an All Information tab, note
that the information displayed is identical to the information displayed on the subsequent
tabs. When you are ready to answer a question, click the Question button to return to the
question.
Overview
Litware, Inc. is a publishing company that has an online bookstore and several retail
bookstores worldwide. Litware also manages an online advertising business for the
authors it represents.
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for
Workspace1.
The company has a data engineering team that uses Python for data processing.
The retail bookstores send sales data at the end of each business day, while the online
bookstore constantly provides logs and sales data to a central enterprise resource
planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze,
silver, and gold. The sales data is ingested from the ERP system as Parquet files that land in
the Files folder in a lakehouse. Notebooks are used to transform the files in a Delta table
for the bronze and silver layers. The gold layer is in a warehouse that has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the
Files folder.
Month-end sales data is processed on the first calendar day of each month. Data that is
older than one month never changes.
In the source system, the sales data refreshes every six hours starting at midnight each
day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and
historical data is captured.
- Sales Date
- Author
- Price
- Units
- SKU
A table named AuthorSales stores the sales data that relates to each author. The table
contains a column named AuthorEmail. Authors authenticate to a guest Fabric tenant by
using their email address.
- Sales
- Fabric Admins
- Streaming Admins
Business users perform ad-hoc queries against the warehouse. The business users
indicate that reports against the warehouse sometimes run for two hours and fail to load
as expected. Upon further investigation, the data engineering team receives the following
error message when the reports fail to load: “The SQL query failed while running.”
The data engineering team wants to debug the issue and find queries that cause more than
one failure.
When the authors have new book releases, there is often an increase in sales activity. This
increase slows the data ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been
up-to-date when they arrive at work in the morning.
Requirements. Planned Changes
Litware recently signed a contract to receive book reviews. The provider of the reviews
exposes the data in Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data
will be streamed from a REST API.
Litware plans to implement a version control solution in Fabric that will use GitHub
integration and follow the principle of least privilege.
To control data platform costs, the data platform must use only Fabric services and items.
Additional Azure resources must NOT be provisioned.
- Make the book reviews available in the lakehouse without making a copy of the data.
- When a new book cover image arrives in the Files folder, process the image as soon as
possible.
Create a shortcut. (correct)
Explanation:
The requirement specifies that Litware plans to make the book reviews available in the
lakehouse without making a copy of the data. In this case, creating a shortcut in Fabric is the
most appropriate solution. A shortcut is a reference to the external data, and it allows Litware to
access the book reviews stored in Amazon S3 without duplicating the data into the lakehouse.
7. You need to resolve the sales data issue. The solution must minimize the amount of data
transferred.
What should you do?
Configure incremental refresh for the dataflow. Set Store rows from the past to 1 Month.
Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Year.
Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Month. (correct)
Explanation:
The sales data issue can be resolved by configuring incremental refresh for the dataflow.
Incremental refresh allows for only the new or changed data to be processed, minimizing the
amount of data transferred and improving performance.
The solution specifies that data older than one month never changes, so setting the refresh
period to 1 Month is appropriate. This ensures that only the most recent month of data will be
refreshed, reducing unnecessary data transfers.
8. HOTSPOT
How should you complete the statement? To answer, select the appropriate options in the
answer area. NOTE: Each correct selection is worth one point.
Explanation:
SELECT last_run_start_time, last_run_command: These fields will help identify the execution
details of the long-running queries.
WHERE last_run_total_elapsed_time_ms > 7200000: This condition filters queries that took
more than 2 hours to complete (7200000 milliseconds), which is relevant to the issue
described.
AND number_of_failed_runs > 1: This condition is key for identifying queries that have failed
more than once, helping to isolate the problematic queries that cause failures and need
attention.
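A T-SQL sketch consistent with that explanation follows; it assumes the Fabric queryinsights.long_running_queries view, since the exact object used in the hotspot is not shown here:
SELECT last_run_start_time, last_run_command
FROM queryinsights.long_running_queries
WHERE last_run_total_elapsed_time_ms > 7200000  -- ran for more than two hours
  AND number_of_failed_runs > 1;                -- failed more than once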
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be
written by using Spark.
a lakehouse (correct)
an eventhouse
a datamart
a warehouse
Explanation:
A lakehouse is the best option for storing semi-structured data when you need to read it using T-
SQL, KQL, and Apache Spark. A lakehouse combines the flexibility of a data lake (which can
handle semi-structured and unstructured data) with the performance features of a data
warehouse. It allows data to be written using Apache Spark and can be queried using different
technologies such as T-SQL (for SQL-based querying), KQL (Kusto Query Language for
querying), and Apache Spark (for distributed processing). This solution is ideal when dealing
with semi-structured data and requiring a versatile querying approach.
10. You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is
accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1.
a data pipeline (correct)
a KQL queryset
a notebook
Explanation:
To copy data from an on-premises Microsoft SQL Server database (Database1) to a warehouse
(Warehouse1) in Microsoft Fabric, the best option is to use a data pipeline. A data pipeline in
Fabric allows for the orchestration of data movement, from source to destination, using
connectors, transformations, and scheduled workflows. Since the data is being transferred
from an on-premises database and requires the use of a data gateway, a data pipeline provides
the appropriate framework to facilitate this data movement efficiently and reliably.
11. You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is
accessed by using an on-premises data gateway.
a data pipeline (correct)
an eventstream
Explanation:
To copy data from an on-premises Microsoft SQL Server database (Database1) to a warehouse
(Warehouse1) in Fabric, a data pipeline is the most appropriate tool. A data pipeline in Fabric is
designed to move data between various data sources and destinations, including on-premises
databases like SQL Server, and cloud-based storage like Fabric warehouses. The data pipeline
can handle the connection through an on-premises data gateway, which is required to access
on-premises data. This solution facilitates the orchestration of data movement and
transformations if needed.
12. You have a Fabric F32 capacity that contains a workspace. The workspace contains a
warehouse named DW1 that is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows
during the past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-
over-year values.
Users report that the performance of some of the reports has degraded over time and
some visuals show errors.
Enable V-Order
Create views.
Explanation:
In this case, the key issue causing performance degradation likely stems from the use of MD5
hash surrogate keys. MD5 hashes are 128-bit values, which can be inefficient for large datasets like the 500 million rows in your fact table. Using a more efficient data type for surrogate keys (such as integer or bigint) would reduce the storage and processing overhead, leading to better query performance. This approach will improve performance while minimizing operational costs because it reduces the complexity of querying and indexing, as smaller data types are generally faster and more efficient to process.
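As a hedged sketch of that recommendation, the following T-SQL rebuilds a dimension with a compact integer surrogate key in place of an MD5 hash; the table and column names are assumptions for illustration, not objects from DW1:
CREATE TABLE dbo.DimAccount_Int AS
SELECT
    ROW_NUMBER() OVER (ORDER BY AccountBusinessKey) AS AccountKey,  -- BIGINT surrogate key replaces the 128-bit hash
    AccountBusinessKey,
    AccountName
FROM dbo.DimAccount_Hash;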
13. HOTSPOT
You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the
following tables and columns.
You need to create an output that presents the summarized values of all the order
quantities by year and product. The results must include a summary of the order quantities
at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the
answer area. NOTE: Each correct selection is worth one point.
14. You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is
ingested into Lakehouse1 as one flat table.
Which three columns should you include in the DimProduct table? Each correct answer
presents part of the solution. NOTE: Each correct selection is worth one point.
Date (correct)
ProductName (correct)
ProductColor (correct)
TransactionID
SalesAmount
ProductID (correct)
Explanation:
In a star schema, the DimProduct table serves as a dimension table that contains descriptive
attributes about products. It will provide context for the FactSales table, which contains
transactional data. The following columns should be included in the DimProduct table:
ProductID: ProductID is the primary key for the DimProduct table, which will be used to join the
FactSales table to the product dimension. It's essential for uniquely identifying each product in
the model.
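A Spark SQL sketch of the split follows; the flat table name SalesRaw is an assumption, and the columns come from the answer choices:
-- Dimension: one row per product, carrying the descriptive attributes to be tracked for changes
CREATE OR REPLACE TABLE DimProduct AS
SELECT DISTINCT ProductID, ProductName, ProductColor
FROM SalesRaw;
-- Fact: transactional grain, related to the dimension through ProductID
CREATE OR REPLACE TABLE FactSales AS
SELECT TransactionID, `Date`, ProductID, SalesAmount
FROM SalesRaw;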
15. You have a Fabric workspace named Workspace1 that contains a notebook named
Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as
Notebook1.
Explanation:
To ensure that Notebook2 can attach to the same Apache Spark session as Notebook1, you
need to enable high concurrency for notebooks. High concurrency allows multiple notebooks to
share a Spark session, enabling them to run within the same Spark context and thus share
resources like cached data, session state, and compute capabilities. This is particularly useful
when you need notebooks to run in sequence or together while leveraging shared resources.
16. You have a Fabric workspace named Workspace1 that contains a lakehouse named
Lakehouse1.
- Orders
- Customer
- Employee
A data engineer is building a workflow that requires writing data to the Customer table,
however, the user does NOT have the elevated permissions required to view the contents
of the Employee table. You need to ensure that the data engineer can write data to the
Customer table without reading data from the Employee table.
Which three actions should you perform? Each correct answer presents part of the
solution. NOTE: Each correct selection is worth one point.
Create a new workspace named Workspace2 that contains a new lakehouse named
Lakehouse2.
Explanation:
To meet the requirements of ensuring that the data engineer can write data to the Customer
table
without reading data from the Employee table (which contains Personally Identifiable
Information, or
By sharing Lakehouse1 with the data engineer, you provide the necessary access to the data
within the lakehouse. However, this access should be controlled through roles and permissions,
which will allow writing to the Customer table but prevent reading from the Employee table.
Assigning the Contributor role for Workspace1 grants the data engineer the ability to perform
actions such as writing to tables (e.g., the Customer table) within the workspace. This role
typically allows users to modify and manage data without necessarily granting them access to
view all data (e.g., PII data in the Employee table).
To prevent the data engineer from accessing the Employee table (which contains PII), you can
migrate the Employee table to a separate lakehouse (Lakehouse2) or workspace (Workspace2).
This separation of sensitive data ensures that the data engineer's access is restricted to the
Customer table in Lakehouse1, while the Employee table can be managed separately and
protected under different access controls.
17. You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data
and is used by multiple sales representatives.
You need to ensure that the sales representatives can see only their respective data.
STORED PROCEDURE
CONSTRAINT
SCHEMA
FUNCTION (correct)
Explanation:
To implement Row-Level Security (RLS) in a Fabric warehouse, you need to use a function that
defines the security logic for filtering the rows of data based on the user's identity or role. This
function can be used in conjunction with a security policy to control access to specific rows in a
table.
In the case of sales representatives, the function would define the filtering criteria (e.g., based
on a column such as SalesRepID or SalesRepName), ensuring that each representative can
only see their respective data.
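A T-SQL sketch of the pattern follows; the table, column, and object names are assumptions, and in a Fabric warehouse the predicate function is applied through a SECURITY POLICY:
-- Inline table-valued function that returns a row only for the current sales representative
CREATE FUNCTION dbo.fn_SalesRepFilter (@SalesRepEmail AS VARCHAR(256))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_result
    WHERE @SalesRepEmail = USER_NAME();
-- Security policy that applies the predicate to the sales table
CREATE SECURITY POLICY dbo.SalesRepFilterPolicy
ADD FILTER PREDICATE dbo.fn_SalesRepFilter(SalesRepEmail)
ON dbo.Sales
WITH (STATE = ON);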
18. HOTSPOT
You have a Fabric workspace named Workspace1_DEV that contains the following items:
- 10 reports
- Four notebooks
- Three lakehouses
You create a deployment pipeline named Pipeline1 to move items from Workspace1_DEV
to a new workspace named Workspace1_TEST.
For each of the following statements, select Yes if the statement is true. Otherwise, select
No. NOTE: Each correct selection is worth one point.
19. You have a Fabric deployment pipeline that uses three workspaces named Dev, Test,
and Prod.
What should you use to add the eventhouse to the deployment process?
GitHub Actions
a deployment pipeline (correct)
Explanation:
A deployment pipeline in Fabric is designed to automate the process of deploying assets (such
as reports, datasets, eventhouses, and other objects) between environments like Dev, Test, and
Prod. Since you need to deploy an eventhouse as part of the deployment process, a deployment
pipeline is the appropriate tool to move this asset through the different stages of your
environment.
20. You have a Fabric workspace named Workspace1 that contains a warehouse named
Warehouse1.
As part of the deployment process, you need to verify whether Warehouse1 contains
invalid references. The solution must minimize development effort.
a database project
a deployment pipeline
a Python script (correct)
a T-SQL script
Explanation:
A deployment pipeline in Fabric allows you to deploy assets like warehouses, datasets, and
reports between di erent workspaces (such as from Workspace1 to Workspace2). One of the
key features of a deployment pipeline is the ability to check for invalid references before
deployment. This can help identify issues with assets, such as broken links or dependencies,
ensuring the deployment is successful without introducing errors. This is the most efficient way to verify references and manage the deployment with minimal development effort.
21. You have a Fabric workspace that contains a Real-Time Intelligence solution and an
eventhouse.
Users report that from OneLake file explorer, they cannot see the data from the
eventhouse.
only data added to new databases that are added to the eventhouse
no data
Explanation:
When you enable OneLake availability for an eventhouse, both new and existing data in the
eventhouse will be copied to OneLake. This feature ensures that data, whether newly ingested
or already present, becomes available for access through OneLake, making it easier for users to
interact with and explore the data directly from OneLake file explorer.
You will use a Fabric deployment pipeline named deployPipeline1 to deploy items from
Workspace1 to higher environment workspaces as part of a medallion architecture. You
will run deployPipeline1 by using an API call from an Azure DevOps pipeline.
You need to configure API authentication between Azure DevOps and Fabric.
workspace identity
Explanation:
When integrating Azure DevOps with Fabric (Workspace1), using a service principal is the
recommended authentication method. A service principal provides a way for applications (such
as an Azure DevOps pipeline) to authenticate and interact with resources securely. It allows
Azure DevOps to authenticate API calls to Fabric without requiring direct user credentials. This
method is ideal for automating tasks such as deploying items through a Fabric deployment
pipeline.
23. You have a Google Cloud Storage (GCS) container named storage1 that contains the
files shown in the following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts
enabled. Workspace1 contains a lakehouse named Lakehouse1.
Stores only
Products only
Trips only
When reading data from shortcuts in Fabric (in this case, from a lakehouse like Lakehouse1),
the cache for shortcuts helps by storing the data locally for quick access. The last accessed
timestamp and the cache expiration rules determine whether data is fetched from the cache or
from the source (Google Cloud Storage, in this case).
Products: The ProductFile.parquet was last accessed 12 hours ago. Since the cache has data
available for up to 12 hours, it is likely that this data will be retrieved from the cache, as it hasn't
been too long since it was last accessed.
Stores: The StoreFile.json was last accessed 4 hours ago, which is within the cache retention
period.
Trips: The TripsFile.csv was last accessed 48 hours ago. Given that it's outside the typical
caching window (assuming the cache has a maximum retention period of around 24 hours), it
would not be retrieved from the cache. Instead, it will likely require a fresh read from the source.
24. You have a Fabric workspace named Workspace1 that contains an Apache Spark job
definition named Job1.
You have an Azure SQL database named Source1 that has public internet access disabled.
You need to ensure that Job1 can access the data in Source1.
an integration runtime
Explanation:
To allow Job1 in Workspace1 to access an Azure SQL database (Source1) with public internet
access disabled, you need to create a managed private endpoint. A managed private endpoint
is a secure, private connection that enables services like Fabric (or other Azure services) to
access resources such as databases, storage accounts, or other services within a virtual
network (VNet) without requiring public internet access. This approach maintains the security
and integrity of your data while enabling access to the Azure SQL database.
25. You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3
bucket named storage2.
You have the Delta Parquet files shown in the following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts
enabled.
The data from which shortcuts will be retrieved from the cache?
Stores only
Products only
Explanation:
When the cache for shortcuts is enabled in Fabric, the data retrieval is governed by the caching
behavior, which generally retains data for a specific period after it was last accessed. The data
from the shortcuts will be retrieved from the cache if the data is stored in locations that support
caching.
Products: The ProductFile is stored in Azure Data Lake Storage Gen2 (storage1). Since Azure
Data Lake is a supported storage system in Fabric and the file is relatively small (50 MB), this
data is most likely cached and can be retrieved from the cache.
Stores: The StoreFile is stored in Amazon S3 (storage2), and even though it is stored in a
different cloud provider, Fabric can cache data from Amazon S3 if caching is enabled. This data
(25 MB) is likely cached and retrievable.
Trips: The TripsFile is stored in Amazon S3 (storage2) and is significantly larger (2 GB) compared
to the other files. While Fabric can cache data from Amazon S3, the larger size of the file (2 GB)
may exceed typical cache sizes or retention windows, causing this file to likely be retrieved
directly from the source instead of the cache.
CertyIQ
Microsoft
(DP-700)
Total: 66 Questions
Link: https://certyiq.com/papers/microsoft/dp-700
Question: 1 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing environment, and problem
statements. If the case study has an All Information tab, note that the information displayed is identical to the
information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that the data analysts can access the gold layer lakehouse.
Answer: C
Explanation:
C: Share the lakehouse with the DataAnalysts group and grant the Read all SQL endpoint data permission. This
approach gives the data analysts the read access they need to the Delta tables in the gold layer, aligning with
the requirement that they must not have access to data in the bronze and silver layers.
By granting the Read all SQL endpoint data permission, the analysts get the necessary and sufficient access to
query the gold layer data while adhering to the principle of least privilege.
Question: 2 CertyIQ
You have a Fabric workspace.
You have semi-structured data.
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using Spark.
What should you use to store the data?
A.a lakehouse
B.an eventhouse
C.a datamart
D.a warehouse
Answer: A
Explanation:
A lakehouse in Microsoft Fabric is designed to handle semi-structured and unstructured data, combining the
flexibility of a data lake with the structure of a data warehouse. It supports data writing via Apache Spark and
allows querying through T-SQL and KQL, making it suitable for the specified requirements.
A lakehouse combines the features of data lakes and data warehouses. It is designed to handle both
structured and semi-structured data, making it ideal for storing diverse data formats.
Question: 3 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?
Answer: B
Explanation:
B: a data pipeline.
A data pipeline is the most suitable tool for moving data between different sources and destinations. In this
case, you need to copy data from your on-premises Microsoft SQL Server database (Database1) to your Fabric
warehouse (Warehouse1). A data pipeline can efficiently handle this task by allowing you to define and
manage the data transfer process.
Question: 4 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?
Answer: B
Explanation:
B: a data pipeline.
A data pipeline is specifically designed for orchestrating and automating data movement tasks between
different sources and destinations. Here’s why a data pipeline is the best choice for copying data from your
on-premises Microsoft SQL Server database (Database1) to your Fabric warehouse (Warehouse1)
Data pipelines in Microsoft Fabric are designed to facilitate the movement and transformation of data
between various sources and destinations. In this scenario, a data pipeline can be configured to copy data
from the on-premises SQL Server database to the Fabric warehouse, utilizing the on-premises data gateway
for secure connectivity.
Question: 5 CertyIQ
You have a Fabric F32 capacity that contains a workspace. The workspace contains a warehouse named DW1 that
is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows during the past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-over-year values.
Users report that the performance of some of the reports has degraded over time and some visuals show errors.
You need to resolve the performance issues. The solution must meet the following requirements:
Provide the best query performance.
Minimize operational costs.
What should you do?
Answer: D
Explanation:
The best solution to resolve the performance issues while providing the best query performance and
minimizing operational costs is D. Modify the surrogate keys to use a different data type.
While MD5 hashes are deterministic and ensure uniqueness, they are less efficient for join operations than
integer-based keys, because joining on lengthy string keys demands more computational resources than
joining on shorter integer keys. Recommendation: modify the surrogate keys to use a different data type,
specifically integers.
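As a hedged sketch only, the following T-SQL shows one way to replace an MD5 hash key with an integer surrogate key by using CREATE TABLE AS SELECT. The table and column names (DimCustomer, CustomerBusinessKey, CustomerName) are assumptions for illustration, and ROW_NUMBER() is used because IDENTITY columns may not be available in a Fabric warehouse.
-- Rebuild the dimension with a BIGINT surrogate key instead of an MD5 hash key.
CREATE TABLE dbo.DimCustomer_Int AS
SELECT
    CAST(ROW_NUMBER() OVER (ORDER BY CustomerBusinessKey) AS BIGINT) AS CustomerKey,  -- integer surrogate key
    CustomerBusinessKey,
    CustomerName
FROM dbo.DimCustomer;   -- existing table keyed by an MD5 hash
Fact table foreign keys would then be remapped to the new integer keys, which keeps joins narrow and fast.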
Question: 6 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the following tables and
columns.
You need to create an output that presents the summarized values of all the order quantities by year and product.
The results must include a summary of the order quantities at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Key Details:
The use of ROLLUP ensures compliance with the requirement for summarized values at different grouping
levels.
SUM(SO.OrderQty) calculates the total order quantities.
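As an illustration only, the completed statement follows the general shape below. Only SUM(SO.OrderQty) and ROLLUP come from the explanation above; the table and column names SalesOrder, Product, OrderDate, ProductID, and ProductName are assumptions because the exhibit is not shown here.
SELECT
    YEAR(SO.OrderDate) AS OrderYear,
    P.ProductName,
    SUM(SO.OrderQty)   AS TotalOrderQty
FROM dbo.SalesOrder AS SO
INNER JOIN dbo.Product AS P
    ON P.ProductID = SO.ProductID
GROUP BY ROLLUP (YEAR(SO.OrderDate), P.ProductName);
-- ROLLUP returns (year, product) rows, a per-year subtotal across all products, and a grand total.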
Question: 7 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is ingested into Lakehouse1 as
one flat table. The table contains the following columns.
You plan to load the data into a dimensional model and implement a star schema. From the original flat table, you
create two tables named FactSales and DimProduct. You will track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer presents part of the
solution.
NOTE: Each correct selection is worth one point.
A.Date
B.ProductName
C.ProductColor
D.TransactionID
E.SalesAmount
F.ProductID
Answer: BCF
Explanation:
B. ProductName: This attribute describes the product and is crucial for understanding and analyzing the data
related to each product.
C. ProductColor: This attribute provides additional information about the product, which can be useful for
analysis, reporting, and segmentation.
F. ProductID: This is the unique identifier for each product and serves as the primary key for the DimProduct
table. It's essential for establishing the relationship between the FactSales table and the DimProduct table.
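For illustration, a minimal sketch of deriving the dimension from the flat table is shown below; the flat table name SalesFlat is an assumption, not part of the question.
SELECT DISTINCT
    ProductID,      -- business key that relates DimProduct to FactSales
    ProductName,
    ProductColor
FROM dbo.SalesFlat;
-- Because changes in DimProduct will be tracked, a surrogate key and effective-date columns would
-- typically be added on top of these three source columns (slowly changing dimension pattern).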
Question: 8 CertyIQ
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as Notebook1.
What should you do?
Answer: A
Explanation:
A.Enable high concurrency for notebooks: High concurrency allows multiple notebooks to share the same
Apache Spark session. This setting ensures that different notebooks can run simultaneously within the same
session, facilitating collaboration and efficient resource usage.
Question: 9 CertyIQ
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1. Lakehouse1
contains the following tables:
Orders -
Customer -
Employee -
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table, however, the user does
NOT have the elevated permissions required to view the contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without reading data from the
Employee table.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
Answer: DEF
Explanation:
D. Assigning the Contributor role to the data engineer for Workspace1 grants them the necessary permissions to
write data to the Customer table in Lakehouse1. However, since the data engineer does not have elevated
permissions to view the Employee table, they won't be able to access its content.
E. Migrate the Employee table from Lakehouse1 to Lakehouse2:
Moving the Employee table, which contains Personally Identifiable Information (PII), to a separate Lakehouse2
helps ensure that the data engineer cannot accidentally or intentionally access it. This action keeps sensitive
data segregated from the data engineer's operational environment.
F. Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2:
By creating a new workspace and lakehouse for the Employee table, you further isolate the sensitive data.
The data engineer can still perform their tasks in Workspace1 without accessing Workspace2, ensuring secure
data handling and compliance with privacy requirements.
Question: 10 CertyIQ
You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by multiple sales
representatives.
You plan to implement row-level security (RLS).
You need to ensure that the sales representatives can see only their respective data.
Which warehouse object do you require to implement RLS?
A.STORED PROCEDURE
B.CONSTRAINT
C.SCHEMA
D.FUNCTION
Answer: D
Explanation:
To implement row-level security (RLS) in a Fabric warehouse such as DW1, you need a FUNCTION to define
the filtering logic. Specifically, an inline table-valued function is created and then referenced by a security
policy to determine which rows each user can access.
Reference:
https://learn.microsoft.com/en-us/fabric/data-warehouse/tutorial-row-level-security#2-define-security-policies
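A minimal sketch of the pattern from the referenced tutorial is shown below. The schema, function, table, and column names (Security, fn_SalesRepFilter, dbo.Sales, SalesRepEmail) are placeholders, not objects taken from DW1.
CREATE SCHEMA Security;
GO
-- Inline table-valued function that defines the filter predicate.
CREATE FUNCTION Security.fn_SalesRepFilter (@SalesRepEmail AS VARCHAR(256))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_result
    WHERE @SalesRepEmail = USER_NAME();   -- each representative sees only their own rows
GO
-- The security policy binds the predicate function to the sales table.
CREATE SECURITY POLICY SalesFilter
ADD FILTER PREDICATE Security.fn_SalesRepFilter(SalesRepEmail)
ON dbo.Sales
WITH (STATE = ON);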
Question: 11 CertyIQ
HOTSPOT -
You have a Fabric workspace named Workspace1_DEV that contains the following items:
10 reports
Four notebooks -
Three lakehouses -
Answer: No/Yes/No
Explanation:
1. Data from the semantic models will be deployed to the target stage.
Answer: No
Semantic models are only deployed to the target stage in the form of metadata. The deployment process
does not copy actual data; instead, only the structural and configuration metadata (e.g., model schema and
measures) is deployed. The target stage will require a refresh to fetch the data into the semantic models.
Reference: Microsoft Learn - Item Properties Copied During Deployment
2. Answer: Yes
Dataflow Gen1 objects are included in the deployment pipeline and are fully deployed to the target stage,
including their configurations. This ensures that Dataflow Gen1 pipelines can run in the target environment.
The deployment process supports this functionality without requiring a manual configuration.
3. Answer: No
The deployment process does not copy or deploy refresh schedules for datasets, semantic models, or other
items. Although metadata for the items is deployed, refresh schedules must be manually recreated or
configured in the target stage. This limitation is highlighted in Microsoft's documentation.
Reference: Microsoft Learn - Item Properties Copied During Deployment
Question: 12 CertyIQ
You have a Fabric deployment pipeline that uses three workspaces named Dev, Test, and Prod.
You need to deploy an eventhouse as part of the deployment process.
What should you use to add the eventhouse to the deployment process?
A.GitHub Actions
B.a deployment pipeline
C.an Azure DevOps pipeline
Answer: B
Explanation:
B. a deployment pipeline.
Deployment Pipeline: In Microsoft Fabric, a deployment pipeline is specifically designed for managing and
deploying resources across different environments (Dev, Test, and Prod). It allows you to automate the
deployment process, ensuring consistency and efficiency. By using a deployment pipeline, you can easily
include the eventhouse in your deployment process and manage its promotion through the different stages
(Dev, Test, Prod).
Reference:
https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/get-started-with-deployment-pipelines?tabs=from-fabric%2Cnew%2Cstage-settings-new
https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/understand-the-deployment-process?tabs=new
Question: 13 CertyIQ
You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1.
You plan to deploy Warehouse1 to a new workspace named Workspace2.
As part of the deployment process, you need to verify whether Warehouse1 contains invalid references. The
solution must minimize development effort.
What should you use?
Answer: B
Explanation:
Microsoft Fabric's deployment pipelines provide a built-in mechanism to manage and validate the deployment
of artifacts like warehouses. When you use a deployment pipeline to move Warehouse1 from one workspace
(Workspace1) to another (Workspace2), the pipeline automatically checks for issues such as invalid
references or missing dependencies during the deployment process.
Question: 14 CertyIQ
You have a Fabric workspace that contains a Real-Time Intelligence solution and an eventhouse.
Users report that from OneLake file explorer, they cannot see the data from the eventhouse.
You enable OneLake availability for the eventhouse.
What will be copied to OneLake?
A.only data added to new databases that are added to the eventhouse
B.only the existing data in the eventhouse
C.no data
D.both new data and existing data in the eventhouse
E.only new data added to the eventhouse
Answer: E
Explanation:
When you enable OneLake availability for an eventhouse, only the new data that is added to the eventhouse
after enabling this setting will be copied to OneLake. The existing data present in the eventhouse prior to
enabling OneLake availability will not be copied automatically. This ensures that users can access the most
recent data through the OneLake file explorer while maintaining the efficiency of data synchronization.
Question: 15 CertyIQ
You have a Fabric workspace named Workspace1.
You plan to integrate Workspace1 with Azure DevOps.
You will use a Fabric deployment pipeline named deployPipeline1 to deploy items from Workspace1 to higher
environment workspaces as part of a medallion architecture. You will run deployPipeline1 by using an API call from
an Azure DevOps pipeline.
You need to configure API authentication between Azure DevOps and Fabric.
Which type of authentication should you use?
A.service principal
B.Microsoft Entra username and password
C.managed private endpoint
D.workspace identity
Answer: A
Explanation:
A. service principal.
Service Principal: A service principal is a security identity used by applications, services, and automation tools
to access specific Azure resources. It provides a secure way to authenticate and authorize API calls between
Azure DevOps and Fabric. By using a service principal, you can grant the necessary permissions to
deployPipeline1 to interact with the Fabric workspace (Workspace1) and deploy items to higher environments.
This approach ensures secure and managed access without relying on individual user credentials.
Question: 16 CertyIQ
You have a Google Cloud Storage (GCS) container named storage1 that contains the files shown in the following
table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the shortcuts shown in the following table.
A.Stores only
B.Products only
C.Stores and Products only
D.Products, Stores, and Trips
E.Trips only
F.Products and Trips only
Answer: C
Explanation:
When the cache for shortcuts is enabled in a Fabric workspace, files read through the shortcuts are cached
locally for faster access. A file is served from the cache when it was last accessed within the retention period
(24 hours by default). ProductFile (last accessed 12 hours ago) and StoreFile (last accessed 4 hours ago) fall
within that window, so Products and Stores are retrieved from the cache, whereas TripsFile (last accessed 48
hours ago) falls outside it and is read from the source.
Question: 17 CertyIQ
You have a Fabric workspace named Workspace1 that contains an Apache Spark job definition named Job1.
You have an Azure SQL database named Source1 that has public internet access disabled.
You need to ensure that Job1 can access the data in Source1.
What should you create?
Answer: B
Explanation:
Managed Private Endpoint: This allows secure and private communication between Azure services without
exposing data to the public internet. By creating a managed private endpoint, you can establish a direct
connection between the Apache Spark job in Workspace1 and the Azure SQL database (Source1) while
keeping public internet access disabled. This approach ensures that data transfer happens securely within the
Azure network.
To ensure that Job1 can access the data in Source1, you need to create a managed private endpoint. This will
allow the Spark job to securely connect to the Azure SQL database without requiring public internet access.
Question: 18 CertyIQ
You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3 bucket named storage2.
You have the Delta Parquet files shown in the following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the following shortcuts:
A shortcut to ProductFile aliased as Products
A shortcut to StoreFile aliased as Stores
A shortcut to TripsFile aliased as Trips
The data from which shortcuts will be retrieved from the cache?
Answer: B
Explanation:
When the cache for shortcuts is enabled in a Fabric workspace, files read through the shortcuts are cached
locally for faster access. Whether a given shortcut is served from the cache depends on the shortcut type, the
size of the individual files, and how recently the files were accessed relative to the cache retention period;
data that qualifies is retrieved from the local cache instead of the original storage location, which improves
performance.
Reference:
https://learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts
Question: 19 CertyIQ
HOTSPOT -
You have a Fabric workspace named Workspace1 that contains the items shown in the following table.
For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Question: 20 CertyIQ
A.workspace Admin
B.domain admin
C.domain contributor
D.Fabric admin
Answer: D
Explanation:
Fabric Admin: Possesses the highest level of permissions within the Fabric environment, enabling the creation
of domains and subdomains, as well as the assignment of resources to those subdomains.
Question: 21 CertyIQ
You have a Fabric workspace named Workspace1 that contains a warehouse named DW1 and a data pipeline named
Pipeline1.
You plan to add a user named User3 to Workspace1.
You need to ensure that User3 can perform the following actions:
View all the items in Workspace1.
Update the tables in DW1.
The solution must follow the principle of least privilege.
You already assigned the appropriate object-level permissions to DW1.
Which workspace role should you assign to User3?
A.Admin
B.Member
C.Viewer
D.Contributor
Answer: B
Explanation:
Member: This role allows users to view and interact with all the items in the workspace. When combined with
the already assigned object-level permissions to DW1, it ensures that User3 can update the tables in DW1.
Question: 22 CertyIQ
You have a Fabric capacity that contains a workspace named Workspace1. Workspace1 contains a lakehouse
named Lakehouse1, a data pipeline, a notebook, and several Microsoft Power BI reports.
A user named User1 wants to use SQL to analyze the data in Lakehouse1.
You need to configure access for User1. The solution must meet the following requirements:
Provide User1 with read access to the table data in Lakehouse1.
Prevent User1 from using Apache Spark to query the underlying files in Lakehouse1.
Prevent User1 from accessing other items in Workspace1.
What should you do?
A.Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.
B.Assign User1 the Viewer role for Workspace1. Share Lakehouse1 with User1 and select Read all SQL endpoint
data.
C.Share Lakehouse1 with User1 directly and select Build reports on the default semantic model.
D.Assign User1 the Member role for Workspace1. Share Lakehouse1 with User1 and select Read all SQL
endpoint data.
Answer: A
Explanation:
A. Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.
Share Lakehouse1 with User1 directly and select Read all SQL endpoint data: This approach grants User1
read access specifically to the table data in Lakehouse1 through the SQL endpoint, without giving them
broader permissions in Workspace1 or access to other items. By directly sharing Lakehouse1 and selecting the
"Read all SQL endpoint data" option, you ensure User1 can use SQL to analyze the data while preventing them
from using Apache Spark to query the underlying files.
Question: 23 CertyIQ
DRAG DROP -
You are implementing the following data entities in a Fabric environment:
Entity1: Available in a lakehouse and contains data that will be used as a core organization entity
Entity2: Available in a semantic model and contains data that meets organizational standards
Entity3: Available in a Microsoft Power BI report and contains data that is ready for sharing and reuse
Entity4: Available in a Power BI dashboard and contains approved data for executive-level decision making
Your company requires that specific governance processes be implemented for the data.
You need to apply endorsement badges to the entities based on each entity’s use case.
Which badge should you apply to each entity? To answer, drag the appropriate badges to the correct entities. Each
badge may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll
to view content.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
1. Master data.
Refers to authoritative data that is central to business operations.
Assigned to Entity1, the core organization entity in the lakehouse.
2. Certified.
Indicates that an item is officially validated by an authority in the organization.
Assigned to Entity2, whose data meets organizational standards.
3. Promoted.
Indicates that an item is recommended for use but has not gone through the formal certification process.
Assigned to Entity3, which is ready for sharing and reuse but not yet fully certified.
4. Cannot be endorsed.
Power BI dashboards do not support endorsement badges.
Assigned to Entity4, which therefore cannot be endorsed even though its data is approved for executive-level
decision making.
Question: 24 CertyIQ
HOTSPOT -
You have three users named User1, User2, and User3.
You have the Fabric workspaces shown in the following table.
You have a security group named Group1 that contains User1 and User3.
The Fabric admin creates the domains shown in the following table.
Explanation:
The "Yes" option is selected, meaning User3 does have Viewer access to Workspace3.
The Viewer role allows read-only access to the workspace but does not permit modifications.
The "Yes" option is selected, meaning User3 has Domain Contributor permissions in Domain1.
The Domain Contributor role typically allows managing content within a domain but does not grant full admin
rights.
The "No" option is selected, meaning User2 does NOT have Contributor access to Workspace3.
The Contributor role would allow editing content in the workspace, but since "No" is selected, User2 lacks
these permissions.
Question: 25 CertyIQ
You have two Fabric workspaces named Workspace1 and Workspace2.
You have a Fabric deployment pipeline named deployPipeline1 that deploys items from Workspace1 to
Workspace2. DeployPipeline1 contains all the items in Workspace1.
You recently modified the items in Workspace1.
The workspaces currently contain the items shown in the following table.
Items in Workspace1 that have the same name as items in Workspace2 are currently paired.
You need to ensure that the items in Workspace1 overwrite the corresponding items in Workspace2. The solution
must minimize effort.
What should you do?
Answer: D
Explanation:
When items in Workspace1 and Workspace2 are paired and you run the deployment pipeline (deployPipeline1),
the pipeline will automatically update the paired items in Workspace2 with the changes made in Workspace1.
This means that the modifications in Workspace1 will overwrite the corresponding items in Workspace2
without requiring any additional steps.
Question: 26 CertyIQ
You have a Fabric workspace named Workspace1 that contains a data pipeline named Pipeline1 and a lakehouse
named Lakehouse1.
You have a deployment pipeline named deployPipeline1 that deploys Workspace1 to Workspace2.
You restructure Workspace1 by adding a folder named Folder1 and moving Pipeline1 to Folder1.
You use deployPipeline1 to deploy Workspace1 to Workspace2.
What occurs to Workspace2?
Answer: A
Question: 27 CertyIQ
DRAG DROP -
Your company has a team of developers. The team creates Python libraries of reusable code that is used to
transform data.
You create a Fabric workspace named Workspace1 that will be used to develop extract, transform, and load (ETL)
solutions by using notebooks.
You need to ensure that the libraries are available by default to new notebooks in Workspace1.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of
actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
This action involves defining an environment where libraries, dependencies, and configurations are managed.
Installing libraries involves setting up necessary packages required for development and execution.
Set the default environment :Matches with "Set the default environment."
Question: 28 CertyIQ
A.VACUUM
B.BROADCAST
C.OPTIMIZE
D.CACHE
Answer: C
Explanation:
OPTIMIZE: This command compacts small files into larger ones and optimizes the layout of data in a Delta
table. Running OPTIMIZE on Table1 consolidates the underlying Parquet files and improves the performance
of subsequent read operations on the table.
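In a Fabric notebook, the command can be run as Spark SQL against the lakehouse table. Only the table name Table1 comes from the question; the Z-order column in the comment is a made-up example.
OPTIMIZE Table1;
-- Optionally, data can also be co-located by a frequently filtered column, for example:
-- OPTIMIZE Table1 ZORDER BY (SomeColumn);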
Question: 29 CertyIQ
You have five Fabric workspaces.
You are monitoring the execution of items by using Monitoring hub.
You need to identify in which workspace a specific item runs.
Which column should you view in Monitoring hub?
A.Start time
B.Capacity
C.Activity name
D.Submitter
E.Item type
F.Job type
G.Location
Answer: G
Explanation:
Location: This column displays the workspace where the item is being executed, helping you pinpoint the
exact workspace of the item.
Reference:
https://learn.microsoft.com/en-us/training/modules/monitor-fabric-items/3-use-monitor-hub
Question: 30 CertyIQ
You have a Fabric workspace that contains a warehouse named DW1. DW1 is loaded by using a notebook named
Notebook1.
You need to identify which version of Delta was used when Notebook1 was executed.
What should you use?
A.Real-Time hub
B.OneLake data hub
C.the Admin monitoring workspace
D.Fabric Monitor
E.the Microsoft Fabric Capacity Metrics app
Answer: D
Explanation:
D. Fabric Monitor.
Fabric Monitor: This tool provides detailed monitoring and logging capabilities for various components within
a Fabric workspace, including notebooks and data processing tasks. By using Fabric Monitor, you can track
and analyze the execution details of Notebook1, including the version of Delta used during its execution. This
information is crucial for debugging, auditing, and ensuring compatibility across different versions of Delta.
Question: 31 CertyIQ
DRAG DROP -
You have a Fabric workspace that contains a warehouse named Warehouse1.
In Warehouse1, you create a table named DimCustomer by running the following statement.
You need to set the Customerkey column as a primary key of the DimCustomer table.
Which three code segments should you run in sequence? To answer, move the appropriate code segments from
the list of code segments to the answer area and arrange them in the correct order.
Answer:
Explanation:
Adding a primary key constraint requires modifying the table, so the statement begins with ALTER TABLE.
The constraint is declared as a NONCLUSTERED primary key, meaning the physical ordering of the data is not
changed.
The constraint must also be declared as NOT ENFORCED. A Fabric warehouse does not enforce primary key
constraints; the constraint is metadata only, which allows better query performance and faster data ingestion.
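Assembled into a single statement, the three code segments look like the sketch below; the constraint name PK_DimCustomer is an assumption, while the table and column names come from the question.
ALTER TABLE dbo.DimCustomer
ADD CONSTRAINT PK_DimCustomer
    PRIMARY KEY NONCLUSTERED (CustomerKey)
    NOT ENFORCED;   -- Fabric warehouses accept primary keys only as NONCLUSTERED and NOT ENFORCED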
Question: 32 CertyIQ
You have a Fabric workspace that contains a semantic model named Model1.
You need to dynamically execute and monitor the refresh progress of Model1.
What should you use?
Answer: D
Explanation:
Semantic link in a notebook: This approach allows you to dynamically execute operations and monitor the
refresh progress of the semantic model (Model1) within the interactive and flexible environment of a
notebook. By using a semantic link, you can write custom scripts to trigger the refresh process and track its
progress in real-time. This method provides a high degree of control and visibility over the operations on your
semantic model.
Question: 33 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: B
Explanation:
The answer is B (No) because the sort by operator sorts values in descending order by default (see
https://learn.microsoft.com/en-us/kusto/query/sort-operator?view=microsoft-fabric). The query must add asc to
sort the values in ascending order as required. The duplicated project at the end does not affect the final result.
Question: 34 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: B
Explanation:
The default sorting order in KQL is descending (desc), not ascending (asc).
The solution does not explicitly specify asc in the order by clause, so the results will be sorted in descending
order by default.
The requirement is to sort the data by No_Bikes in ascending order, which is not achieved without explicitly
specifying asc.
A. Yes: This would be incorrect because the solution fails to meet the requirement of sorting in ascending
order due to the default descending behavior in KQL.
Important Tip:
Always explicitly specify the sorting order (asc or desc) in KQL to avoid confusion, especially since its default
behavior differs from SQL.
Question: 35 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: A
Explanation:
The provided code segment correctly filters the data for the neighborhood "Sands End" where the number of
bikes (No_Bikes) is at least 15. It then explicitly sorts the results by No_Bikes in ascending order using sort by
No_Bikes asc and projects the required columns (BikepointID, Street, Neighbourhood, No_Bikes,
No_Empty_Docks, Timestamp). This meets all the stated goals of the problem.
B. No: This would be incorrect because the solution explicitly specifies asc in the sort by clause, ensuring the
data is ordered by No_Bikes in ascending order as required.
Important Tip:
Always ensure that the sorting order is explicitly specified in KQL to match the requirements, as the default
behavior might differ from other query languages like SQL.
Reference:
https://learn.microsoft.com/en-us/kusto/query/sort-operator?view=microsoft-fabric
Question: 36 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: B
Explanation:
The provided solution uses SQL syntax (SELECT, FROM, WHERE, ORDER BY), but the scenario specifies that
the data is in a KQL (Kusto Query Language) database. KQL and SQL have different syntax and functions. The
correct KQL syntax should be used to filter and sort the data in a KQL database.
A. Yes: This would be incorrect because the solution uses SQL syntax instead of KQL, which is not applicable
in this context.
Important Tip:
Always use the appropriate query language for the database you are working with. In this case, KQL should be
used instead of SQL to interact with the KQL database. The correct KQL query would use filter, sort by, and
project as shown in previous examples.
Question: 37 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
Sales Date -
Author -
Price -
Units -
SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:
Sales -
Fabric Admins -
Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Answer: B
Explanation:
B. Create a shortcut.
Create a Shortcut: Creating a shortcut in the lakehouse allows you to link to external data sources without
making a copy of the data. This means you can make the book reviews available in the lakehouse by creating a
shortcut to the location where the book reviews are stored. The data remains in its original location but is
accessible from the lakehouse, meeting the requirement of not duplicating the data.
Question: 38 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
Sales Date -
Author -
Price -
Units -
SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:
Sales -
Fabric Admins -
Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Requirements. Planned Changes -
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Answer: E
Explanation:
E. Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Month.
This approach ensures minimal data transfer while keeping the refresh scope limited to the most recent and
relevant data (1 month), which is aligned with the requirement to minimize data transfer.
Question: 39 CertyIQ
HOTSPOT -
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
Answer:
Explanation:
The bronze layer is the raw data ingestion layer in a medallion architecture.
A Copy activity in a Fabric data pipeline is the built-in, low-effort way to ingest the source data and land it in
the bronze layer of the lakehouse in Delta format. This choice ensures efficient and scalable ingestion from
various sources with no changes other than the file format.
A notebook is then used to apply transformations, perform data validation, and cleanse and enrich the raw
ingested data. This choice aligns with the goal of refining, structuring, and preparing the data before it is
promoted to the gold layer.
Question: 40 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
A.Create a workspace identity and enable high concurrency for the notebooks.
B.Create a shortcut and ensure that caching is disabled for the workspace.
C.Create a workspace identity and use the identity in a data pipeline.
D.Create a shortcut and ensure that caching is enabled for the workspace.
Answer: D
Explanation:
Enabling caching for the workspace will help minimize egress costs by reducing the amount of data that
needs to be transferred across clouds. Creating a shortcut ensures that the raw data is not duplicated in the
lakehouse.
Question: 41 CertyIQ
HOTSPOT -
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
Answer:
Explanation:
Final Answer:
Joins:
WHERE Clause:
IsActive = 1
These selections ensure that:
All products are retained, even if they are not assigned to a subcategory.
The first join should be a LEFT OUTER JOIN to ensure that all products are retained, even if they are not
assigned to a subcategory. The second join should be an INNER JOIN to exclude categories and subcategories
that are not linked to any product, as they are not analytically relevant.
Only active products, identified by an IsActive value of 1, should be included in the product dimension in the
gold layer. Additionally, in the POS1 product data, ProductID values are unique. Categories and subcategories
without assigned products must be omitted to maintain analytical relevance.
Question: 42 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
A.ForEach
B.Copy data
C.WebHook
D.Stored procedure
Answer: AB
Explanation:
ForEach: This activity allows you to iterate over a collection of items and execute activities for each item. In
this context, it can be used to process multiple datasets or files within the bronze layer, ensuring that each
file is appropriately handled and transformed.
Copy Data: This activity is fundamental in pipelines for data movement. It enables you to copy data from a
source to a destination, such as moving data from a staging area to the bronze layer. The Copy Data activity
can read the MAR1 data from its source and write it to the bronze layer, ensuring the data is properly ingested.
Question: 43 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains a warehouse named Warehouse1. Warehouse1 contains the following
tables and columns.
You need to denormalize the tables and include the ContractType and StartDate columns in the Employee table.
The solution must meet the following requirements:
Ensure that the StartDate column is of the date data type.
Ensure that all the rows from the Employee table are preserved and include any matching rows from the Contract
table.
Ensure that the result set displays the total number of employees per contract type for all the contract types that
have more than two employees.
How should you complete the statement? To answer, select the appropriate options in the answer area.
The CONVERT function is used to explicitly convert data types in SQL Server.
A LEFT OUTER JOIN ensures all employees are included, even if they do not have a corresponding contract.
If some employees do not have contracts, this join type ensures they are still listed with NULL contract values.
HAVING is used because COUNT(DISTINCT EmployeeID) is an aggregate function, and aggregate functions
cannot be used in WHERE.
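A hedged T-SQL sketch combining these elements. The column EmployeeName and the join on EmployeeID are assumptions, since the exhibit with the exact schema is not shown.

-- Denormalized view of employees with their contract attributes; all employees are kept.
SELECT
    e.EmployeeID,
    e.EmployeeName,                            -- assumed column
    c.ContractType,
    CONVERT(date, c.StartDate) AS StartDate    -- enforce the date data type
FROM Employee AS e
LEFT OUTER JOIN Contract AS c
    ON e.EmployeeID = c.EmployeeID;

-- Contract types that have more than two employees.
SELECT
    c.ContractType,
    COUNT(DISTINCT e.EmployeeID) AS EmployeeCount
FROM Employee AS e
LEFT OUTER JOIN Contract AS c
    ON e.EmployeeID = c.EmployeeID
GROUP BY c.ContractType
HAVING COUNT(DISTINCT e.EmployeeID) > 2;       -- aggregate filters belong in HAVING, not WHERE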
Question: 44 CertyIQ
HOTSPOT -
You have an Azure Event Hubs data source that contains weather data.
You ingest the data from the data source by using an eventstream named Eventstream1. Eventstream1 uses a
lakehouse as the destination.
You need to batch ingest only rows from the data source where the City attribute has a value of Kansas. The filter
must be added before the destination. The solution must minimize development effort.
What should you use for the data processor and filtering? To answer, select the appropriate options in the answer
area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Selecting "An eventstream with an external data source" means data is coming from an external system such
as IoT devices, logs, or real-time telemetry.
This is appropriate when dealing with real-time ingestion from sources like Azure Event Hubs, IoT Hub, or
Kafka.
2.Filtering: An eventstream processor.
Filtering in streaming systems typically happens during real-time data ingestion to remove irrelevant or
unnecessary events before further processing.
An eventstream processor can be used to apply transformations, filtering, and aggregations dynamically.
This ensures that only relevant data moves forward in the pipeline.
Question: 45 CertyIQ
You have a Fabric workspace that contains an eventstream named Eventstream1. Eventstream1 processes data
from a thermal sensor by using event stream processing, and then stores the data in a lakehouse.
You need to modify Eventstream1 to include the standard deviation of the temperature.
Which transform operator should you include in the Eventstream1 logic?
A.Expand
B.Group by
C.Union
D.Aggregate
Answer: B
Explanation:
The Group by transform operator contains the Standard deviation aggregation. The Aggregate transform
operator only contains Average, Max, Min and Sum aggregation.
Reference:
https://learn.microsoft.com/en-us/fabric/real-time-intelligence/event-streams/process-events-using-event-processor-editor?pivots=standard-capabilities#group-by
Question: 46 CertyIQ
You have an Azure event hub. Each event contains the following fields:
BikepointID -
Street -
Neighbourhood -
Latitude -
Longitude -
No_Bikes -
No_Empty_Docks -
You need to ingest the events. The solution must only retain events that have a Neighbourhood value of Chelsea,
and then store the retained events in a Fabric lakehouse.
What should you use?
Answer: B
Explanation:
B. an eventstream.
Eventstream: An eventstream is specifically designed for processing and managing events in real-time. It
allows you to filter, transform, and route events efficiently. In this scenario, you can configure the
eventstream to retain only the events where the Neighbourhood value is "Chelsea" and then store the filtered
events in a Fabric lakehouse. This approach ensures that only the relevant events are ingested, adhering to
the requirement to retain only specific events based on the Neighbourhood value.
Question: 47 CertyIQ
HOTSPOT -
You are building a data loading pattern for Fabric notebook workloads.
You have the following code segment:
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Statement 1: No. In many data loading strategies, especially when using incremental loads or merge operations, the target table is not always overwritten. Instead, new data is appended, updated, or merged based on keys; overwriting usually happens only in full refresh scenarios.
Statement 2: No. The merge operation (such as SQL MERGE or Delta Lake MERGE INTO) only runs if certain conditions are met, such as the presence of new or changed data. If there is no data to update or merge, it may not execute, so it does not always run.
Statement 3: Yes. "The loading pattern supports both full and incremental loading requirements." A well-designed data pipeline often supports both full and incremental loads. Full loads replace the entire dataset, while incremental loads append or update only changed records.
Question: 48 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains two lakehouses named Lakehouse1 and Lakehouse2. Lakehouse1
contains staging data in a Delta table named Orderlines. Lakehouse2 contains a Type 2 slowly changing dimension
(SCD) dimension table named Dim_Customer.
You need to build a query that will combine data from Orderlines and Dim_Customer to create a new fact table
named Fact_Orders. The new table must meet the following requirements:
Enable the analysis of customer orders based on historical attributes.
Enable the analysis of customer orders based on the current attributes.
How should you complete the statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
The first condition ensures that OrderDate falls on or after the start of the dimension row's valid period, which is essential to capture all orders that fall within that row's timeline. The second condition ensures that OrderDate is strictly before the valid end date, which prevents matching orders to a dimension row that has already expired.
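A sketch of the join, assuming CustomerID as the business key, CustomerKey as the surrogate key, and ValidFrom/ValidTo as the SCD validity columns; the real column names come from the exhibit, which is not shown.

-- Each order row is matched to the dimension row that was valid on the order date.
SELECT
    o.OrderID,
    o.OrderDate,
    c.CustomerKey,                    -- surrogate key of the row valid at order time
    c.CustomerName                    -- assumed attribute column
FROM Orderlines AS o
INNER JOIN Dim_Customer AS c
    ON  o.CustomerID = c.CustomerID
    AND o.OrderDate >= c.ValidFrom    -- on or after the start of the valid period
    AND o.OrderDate <  c.ValidTo      -- strictly before the valid end date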
Question: 49 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
In an external data source, you have data files that are 500 GB each. A new file is added every day.
You need to ingest the data into Lakehouse1 without applying any transformations. The solution must meet the following requirements:
Trigger the process when a new file is added.
Provide the highest throughput.
Which type of item should you use to ingest the data?
A.Eventstream
B.Dataflow Gen2
C.Streaming dataset
D.Data pipeline
Answer: D
Explanation:
D. Data pipeline.
Data pipeline: A data pipeline is designed to handle large-scale data ingestion and movement efficiently. It
can be configured to automatically trigger the ingestion process when a new file is added to the external data
source, ensuring that the data is ingested into Lakehouse1 as soon as it becomes available. Data pipelines are
optimized for high throughput, making them suitable for handling large files (like the 500 GB files mentioned)
and ensuring the process is both fast and efficient.
Question: 50 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
In an external data source, you have data files that are 500 GB each. A new file is added every day.
You need to ingest the data into Lakehouse1 without applying any transformations. The solution must meet the following requirements:
Trigger the process when a new file is added.
Provide the highest throughput.
Which type of item should you use to ingest the data?
A.Data pipeline
B.Environment
C.KQL queryset
D.Dataflow Gen2
Answer: A
Question: 51 CertyIQ
You have a Fabric workspace that contains an eventhouse and a KQL database named Database1. Database1 has
the following:
A.
B.
C.
D.
Answer: BD
Explanation:
Record B loads because it conforms to the updated schema (string DeviceId, StreamData with temperature).
Record D loads because it conforms to the original schema (guid DeviceId, no temperature in StreamData).
Question: 52 CertyIQ
HOTSPOT -
You have a Fabric workspace.
You are debugging a statement and discover the following issues:
Sometimes, the statement fails to return all the expected rows.
The PurchaseDate output column is NOT in the expected format of mmm dd, yy.
You need to resolve the issues. The solution must ensure that the data types of the results are retained. The results
can contain blank cells.
How should you complete the statement? To answer, select the appropriate options in the answer area.
1. try_cast(item_name as varchar(20))
Purpose: Attempts to convert item_name into a VARCHAR(20). If the conversion fails for a row, TRY_CAST returns NULL instead of raising an error, so the statement no longer drops rows; failed conversions simply appear as blank cells.
2. convert(varchar, purchase_date, 7)
Purpose: Style 7 of the CONVERT function formats a date value as mon dd, yy, which matches the required mmm dd, yy output format.
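A sketch of the corrected expressions in context; the table name Sales and the column aliases are assumptions, since the full statement is not reproduced here.

SELECT
    TRY_CAST(item_name AS varchar(20)) AS ItemName,        -- returns NULL (a blank cell) instead of failing the row
    CONVERT(varchar, purchase_date, 7) AS PurchaseDate     -- style 7 formats the date as mon dd, yy
FROM Sales;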
Question: 53 CertyIQ
You are developing a data pipeline named Pipeline1.
You need to add a Copy data activity that will copy data from a Snowflake data source to a Fabric warehouse.
What should you configure?
Answer: C
Explanation:
Enable Staging: When copying data from a Snowflake data source to a Fabric warehouse, enabling staging
can significantly improve the efficiency and reliability of the data transfer process. Staging involves
temporarily storing the data in an intermediate location before loading it into the final destination. This
approach helps in handling large datasets and complex transformations, ensuring that the data is transferred
smoothly without interruptions. It also allows for more manageable and optimized data movement,
particularly when dealing with different data storage systems like Snowflake and Fabric.
Question: 54 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You change the join type to kind=outer.
Does this meet the goal?
A.Yes
B.No
Answer: B
Explanation:
No. An outer join can be more computationally intensive than an inner join because it needs to process all rows
from both tables and include rows that don't have matching entries.
Question: 55 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You change project to extend.
Does this meet the goal?
A.Yes
B.No
Answer: B
Explanation:
No. The `project` operator is used to select specific columns, whereas `extend` is used to add new calculated
columns to the result set. They serve different purposes.
Question: 56 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You move the filter to line 02.
Does this meet the goal?
A.Yes
B.No
Answer: A
Explanation:
Yes. By applying the `where` clause early in the query, you reduce the number of rows processed in
subsequent operations, which improves performance.
Question: 57 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You add the make_list() function to the output columns.
Does this meet the goal?
A.Yes
B.No
Answer: B
Explanation:
No. The `make_list()` function aggregates values into a list, which can be useful for certain types of analysis
but does not inherently improve query performance.
Question: 58 CertyIQ
HOTSPOT -
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
Sales Date -
Author -
Price -
Units -
SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:
Sales -
Fabric Admins -
Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Requirements. Planned Changes -
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Answer:
Explanation:
queryinsights.frequently_run_queries
number_of_failed_runs > 1
Only this view has the fields specified in the SELECT and WHERE clauses of the statement.
The data engineering team wants to debug the issue and find queries that cause more than one failure.
https://learn.microsoft.com/en-us/sql/relational-databases/system-views/queryinsights-frequently-run-queries-transact-sql?view=fabric&preserve-view=true
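A minimal sketch of the completed statement, using the view and predicate named above.

-- Queries that have caused more than one failure.
SELECT *
FROM queryinsights.frequently_run_queries
WHERE number_of_failed_runs > 1;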
Question: 59 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
Answer: A
Explanation:
Schedule a data pipeline that calls other data pipelines: This approach allows you to orchestrate and
manage the population of medallion layers efficiently. By scheduling a main data pipeline that calls other data
pipelines, you can ensure that each step in the data processing workflow is executed in the correct sequence.
This method provides better modularity and manageability, as each sub-pipeline can focus on a specific layer
or task within the medallion architecture.
Question: 60 CertyIQ
HOTSPOT -
You are processing streaming data from an external data provider.
You have the following code segment.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Question: 61 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Lakehouse1 contains a Delta table
named Table1.
You analyze Table1 and discover that Table1 contains 2,000 Parquet files of 1 MB each.
You need to minimize how long it takes to query Table1.
What should you do?
Answer: C
Explanation:
OPTIMIZE Command: Running the OPTIMIZE command on a Delta table helps to combine smaller files into
larger ones, which can significantly improve query performance. This process, known as compaction, reduces
the number of Parquet files that need to be read during a query, thereby decreasing query latency. In your
case, with 2,000 Parquet files of 1 MB each, running OPTIMIZE will consolidate these files into fewer, larger
files, making queries faster and more efficient.
VACUUM Command: The VACUUM command cleans up old versions of data files that are no longer needed,
which helps to free up storage space and maintain the performance of the Delta table. After running
OPTIMIZE, it's a good practice to run VACUUM to remove any obsolete files and further streamline the data
storage.
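A minimal Spark SQL sketch, run from a notebook against the lakehouse.

-- Compacts the 2,000 small Parquet files behind Table1 into fewer, larger files.
OPTIMIZE Table1;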
Question: 62 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1. Data is loaded daily into Warehouse1
by using data pipelines and stored procedures.
You discover that the daily data load takes longer than expected.
You need to monitor Warehouse1 to identify the names of users that are actively running queries.
Which view should you use?
A.sys.dm_exec_connections
B.sys.dm_exec_requests
C.queryinsights.long_running_queries
D.queryinsights.frequently_run_queries
E.sys.dm_exec_sessions
Answer: E
Explanation:
sys.dm_exec_sessions: This view provides detailed information about every active user session, including the login name, session ID, status, and login time. By querying this view, you can identify which users are currently connected and actively running queries.
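A minimal sketch of a query against the view; filtering on status is one way to narrow the output to sessions that are currently executing work.

-- Active sessions and the users who own them.
SELECT session_id,
       login_name,
       status,
       login_time
FROM sys.dm_exec_sessions
WHERE status = 'running';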
Question: 63 CertyIQ
You have a Fabric workspace that contains an eventstream named EventStream1. EventStream1 outputs events to
a table in a lakehouse.
You need to remove files that are older than seven days and are no longer in use.
Which command should you run?
A.VACUUM
B.COMPUTE
C.OPTIMIZE
D.CLONE
Answer: A
Explanation:
The VACUUM command is used to clean up old files that are no longer in use, which fits the requirement of
removing files that are older than seven days. This command is typically used in data lake environments to
delete files that are no longer needed by the system, ensuring that storage is efficiently managed.
The default retention period for the VACUUM command is 7 days, therefore it will remove files older than 7
days.
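A minimal Spark SQL sketch; specifying 168 hours simply makes the default seven-day retention explicit.

-- Removes files that are no longer referenced by the Delta log and are older than the retention period.
VACUUM Table1 RETAIN 168 HOURS;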
Question: 64 CertyIQ
You have a Fabric warehouse named DW1 that loads data by using a data pipeline named Pipeline1. Pipeline1 uses a
Copy data activity with a dynamic SQL source. Pipeline1 is scheduled to run every 15 minutes.
You discover that Pipeline1 keeps failing.
You need to identify which SQL query was executed when the pipeline failed.
What should you do?
A.From Monitoring hub, select the latest failed run of Pipeline1, and then view the output JSON.
B.From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.
C.From Real-time hub, select Fabric events, and then review the details of Microsoft.Fabric.ItemReadFailed.
D.From Real-time hub, select Fabric events, and then review the details of Microsoft. Fabric.ItemUpdateFailed.
Answer: B
Explanation:
B. From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.
Monitoring hub: The Monitoring hub provides detailed logs and information about the execution of your data
pipelines. By selecting the latest failed run of Pipeline1, you can access the execution details and diagnose
the issue.
View the input JSON: The input JSON contains the parameters, configurations, and the dynamic SQL query
used for the Copy data activity. By examining the input JSON, you can identify the specific SQL query that was
executed at the time the pipeline failed. This information is crucial for troubleshooting the issue and
understanding why the pipeline keeps failing.
Question: 65 CertyIQ
You have a Fabric notebook named Notebook1 that has been executing successfully for the last week.
During the last run, Notebook1executed nine jobs.
You need to view the jobs in a timeline chart.
What should you use?
A.Real-Time hub
B.Monitoring hub
C.the job history from the application run
D.Spark History Server
E.the run series from the details of the application run
Answer: E
Explanation:
The run series from the details of the application run: This option allows you to view a detailed timeline of the
jobs that were executed during the last run of Notebook1. The run series provides a chronological view of all
the jobs, including their start and end times, which enables you to visualize the execution timeline effectively.
Question: 66 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains an eventstream named EventStream1.
You discover that an EventStream1 transformation fails.
You need to find the following error information:
The error details, including the occurrence time
The total number of errors
What should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
"Data insights" typically provide an in-depth analysis of errors, offering detailed information about what went
wrong, where, and possibly why the error occurred. It helps in diagnosing and understanding the root cause of
an issue.
"Runtime logs" generally contain records of system execution, including the number of errors encountered.
These logs provide a summary of system performance and error occurrences, making them a reliable source
for determining the total number of errors.
Thank you
Thank you for being so interested in the premium exam material.
I'm glad to hear that you found it informative and helpful.
If you have any feedback or thoughts on the dumps, I would love to hear them.
Your insights can help me improve our writing and better understand our readers.
Best of Luck
You have worked hard to get to this point, and you are well-prepared for the exam.
Keep your head up, stay positive, and go show that exam what you're made of!
Total: 66 Questions
Link: https://certyiq.com/papers/microsoft/dp-700
ExamDiscuss (https://www.examdiscuss.com)
Microsoft DP-700 Exam Questions (Demo)
Exam: DP-700
Vendor: Microsoft
Version: DEMO
QUESTION NO: 1
You need to schedule the population of the medallion layers to meet the technical
requirements.
What should you do?
A. Schedule a data pipeline that calls other data pipelines.
B. Schedule a notebook.
C. Schedule an Apache Spark job.
D. Schedule multiple data pipelines.
Answer: A
Explanation:
The technical requirements specify that:
Medallion layers must be fully populated sequentially (bronze → silver → gold). Each layer
must be populated before the next.
If any step fails, the process must notify the data engineers.
Data imports should run simultaneously when possible.
Why Use a Data Pipeline That Calls Other Data Pipelines?
A data pipeline provides a modular and reusable approach to orchestrating the sequential
population of medallion layers.
By calling other pipelines, each pipeline can focus on populating a specific layer (bronze,
silver, or gold), simplifying development and maintenance.
A parent pipeline can handle:
- Sequential execution of child pipelines.
- Error handling to send email notifications upon failures.
- Parallel execution of tasks where possible (e.g., simultaneous imports into the bronze layer).
Topic 1, Contoso, Ltd
Overview
This is a case study. Case studies are not timed separately. You can use as much exam time
as you would like to complete each case. However, there may be additional case studies and
sections on this exam. You must manage your time to ensure that you are able to complete
all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that
is provided in the case study. Case studies might contain exhibits and other resources that
provide more information about the scenario that is described in the case study. Each
question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review
your answers and to make changes before you move to the next section of the exam. After
you begin a new section, you cannot return to this section.
To start the case study
To display the first question in this case study, click the Next button. Use the buttons in the
left pane to explore the content of the case study before you answer the questions. Clicking
these buttons displays information such as business requirements, existing environment, and
problem statements. If the case study has an All Information tab, note that the information
displayed is identical to the information displayed on the subsequent tabs. When you are
ready to answer a question, click the Question button to return to the question.
The data engineering team has successfully exported data from MAR1. The team experiences transient connectivity errors, which cause the data exports to fail.
Requirements. Planned Changes
Contoso plans to create the following two lakehouses:
Lakehouse1: Will store both raw and cleansed data from the sources
Lakehouse2: Will serve data in a dimensional model to users for analytical queries.
Additional items will be added to facilitate data ingestion and transformation.
Contoso plans to use Azure Repos for source control in Fabric.
Requirements. Technical Requirements
The new lakehouses must follow a medallion architecture by using the following three layers:
bronze, silver, and gold. There will be extensive data cleansing required to populate the
MAR1 data in the silver layer, including deduplication, the handling of missing values, and the
standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in
populating the lakehouses fails, an email must be sent to the data engineers.
Data imports must run simultaneously, when possible.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Minimize egress costs associated with cross-cloud data access.
Prevent saving a copy of the raw data in the lakehouses.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data
lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the
source data.
In the event of a connectivity error, the ingestion processes must attempt the connection
again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic
models, reports, and dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold
layer must include only active products from product list. Active products are identified by an
IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are
NOT analytically relevant and must be omitted from the product dimension in the gold layer.
Requirements. Data Security
Security in Fabric must meet the following requirements:
The data engineers must have read and write access to all the lakehouses, including the
underlying files.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
QUESTION NO: 2
QUESTION NO: 3
You need to ensure that usage of the data in the Amazon S3 bucket meets the technical
requirements.
What should you do?
A. Create a workspace identity and enable high concurrency for the notebooks.
B. Create a shortcut and ensure that caching is disabled for the workspace.
C. Create a workspace identity and use the identity in a data pipeline.
D. Create a shortcut and ensure that caching is enabled for the workspace.
Answer: B
Explanation:
To ensure that the usage of the data in the Amazon S3 bucket meets the technical
requirements, we must address two key points:
Minimize egress costs associated with cross-cloud data access: Using a shortcut ensures
that Fabric does not replicate the data from the S3 bucket into the lakehouse but rather
provides direct access to the data in its original location. This minimizes cross-cloud data
transfer and avoids additional egress costs.
Prevent saving a copy of the raw data in the lakehouses: Disabling caching ensures that the
raw data is not copied or persisted in the Fabric workspace. The data is accessed on-
demand directly from the Amazon S3 bucket.
QUESTION NO: 4
You need to recommend a method to populate the POS1 data to the lakehouse medallion
layers.
What should you recommend for each layer? To answer, select the appropriate options in the
answer area.
NOTE: Each correct selection is worth one point.
Answer:
QUESTION NO: 5
You need to ensure that the data analysts can access the gold layer lakehouse.
What should you do?
A. Add the DataAnalyst group to the Viewer role for WorkspaceA.
B. Share the lakehouse with the DataAnalysts group and grant the Build reports on the
default semantic model permission.
C. Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint
data permission.
D. Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark
permission.
Answer: C
Explanation:
The data analysts' access requirements state that they must only have read access to the Delta tables in the gold layer and must not have access to the bronze and silver layers.
The gold layer data is typically queried via SQL Endpoints. Granting the Read all SQL
Endpoint data permission allows data analysts to query the data using familiar SQL-based
tools while restricting access to the underlying files.
QUESTION NO: 6
You need to create the product dimension.
How should you complete the Apache Spark SQL code? To answer, select the appropriate
options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
QUESTION NO: 7
You are building a data orchestration pattern by using a Fabric data pipeline named Dynamic
Data Copy as shown in the exhibit. (Click the Exhibit tab.)
Answer:
QUESTION NO: 8
HOTSPOT
You need to troubleshoot the ad-hoc query issue.
How should you complete the statement? To answer, select the appropriate options in the
answer area.
NOTE: Each correct selection is worth one point.
Answer:
QUESTION NO: 9
You have a Fabric workspace named Workspace1 that contains the items shown in the
following table.
For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the
answer area.
Answer:
QUESTION NO: 10
You need to develop an orchestration solution in Fabric that will load each item one after the
other. The solution must be scheduled to run every 15 minutes. Which type of item should
you use?
A. warehouse
B. data pipeline
C. Dataflow Gen2 dataflow
D. notebook
Answer: B