DP 700
Total: 67 Questions
Link: https://examheist.com/papers/microsoft/dp-700
Question: 1 Exam Heist
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing environment, and problem
statements. If the case study has an All Information tab, note that the information displayed is identical to the
information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which cause the data exports to fail.
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that the data analysts can access the gold layer lakehouse.
Answer: C
Explanation:
C: Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data permission.
This gives the data analysts read access to the Delta tables in the gold layer without giving them access to
the data in the bronze and silver layers.
The permission is both necessary and sufficient for querying the gold layer data, so it adheres to the
principle of least privilege.
A.a lakehouse
B.an eventhouse
C.a datamart
D.a warehouse
Answer: A
Explanation:
A lakehouse in Microsoft Fabric combines the flexibility of a data lake with the structure of a data
warehouse, and it can store structured, semi-structured, and unstructured data. Data is written by using
Apache Spark and can be queried by using T-SQL through the SQL analytics endpoint. KQL access is also
possible, for example through a OneLake shortcut in a KQL database, which makes a lakehouse suitable for
the specified requirements.
Answer: B
Explanation:
B: a data pipeline.
A data pipeline is the most suitable tool for moving data between different sources and destinations. In this
case, you need to copy data from your on-premises Microsoft SQL Server database (Database1) to your Fabric
warehouse (Warehouse1). A data pipeline can efficiently handle this task by allowing you to define and
manage the data transfer process.
Answer: B
Explanation:
B: a data pipeline.
A data pipeline is specifically designed for orchestrating and automating data movement between
different sources and destinations. In this scenario, a data pipeline can be configured to copy data from
the on-premises Microsoft SQL Server database (Database1) to the Fabric warehouse (Warehouse1), and it
uses the on-premises data gateway for secure connectivity.
Answer: C
Explanation:
V-Order is a write-time optimization for the Parquet file format, not a separate storage format. It applies
sorting, row group distribution, dictionary encoding, and compression to Parquet files, which minimizes the
amount of data read during queries. This makes it a suitable choice for large fact tables and for scenarios
where you need to improve read performance without increasing operational costs, because fewer
resources are needed to read the data.
You need to create an output that presents the summarized values of all the order quantities by year and product.
The results must include a summary of the order quantities at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Key Details:
The use of ROLLUP ensures compliance with the requirement for summarized values at different grouping
levels.
SUM(SO.OrderQty) calculates the total order quantities.
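The effect of grouping with ROLLUP can be illustrated outside of T-SQL. The following is a minimal Python sketch, not the exam's answer code: it mimics the shape of the result of a GROUP BY ROLLUP over year and product with a SUM of the quantities. The sample rows and the function name rollup_year_product are hypothetical, and None stands in for the NULL that T-SQL returns in a rolled-up column.

```python
from collections import defaultdict

# Hypothetical sample rows: (order_year, product_name, order_qty).
orders = [
    (2023, "Bike", 5),
    (2023, "Helmet", 2),
    (2023, "Bike", 3),
    (2024, "Bike", 4),
    (2024, "Helmet", 6),
]


def rollup_year_product(rows):
    """Mimic GROUP BY ROLLUP(year, product) with SUM(qty).

    Returns (year, product, total) tuples. None marks a rolled-up column,
    the way NULL does in the T-SQL result set.
    """
    detail = defaultdict(int)    # (year, product) level
    per_year = defaultdict(int)  # (year, NULL) subtotal level
    grand_total = 0              # (NULL, NULL) level
    for year, product, qty in rows:
        detail[(year, product)] += qty
        per_year[year] += qty
        grand_total += qty

    result = []
    for year in sorted(per_year):
        # Detail rows for this year, followed by the year-level subtotal.
        for (detail_year, product), total in sorted(detail.items()):
            if detail_year == year:
                result.append((year, product, total))
        result.append((year, None, per_year[year]))
    result.append((None, None, grand_total))
    return result


summary = rollup_year_product(orders)
for row in summary:
    print(row)
```

The rows with None in the product position are the year-level summaries for all the products, which is the part of the requirement that a plain GROUP BY on year and product would not return.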
You plan to load the data into a dimensional model and implement a star schema. From the original flat table, you
create two tables named FactSales and DimProduct. You will track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer presents part of the
solution.
NOTE: Each correct selection is worth one point.
A.Date
B.ProductName
C.ProductColor
D.TransactionID
E.SalesAmount
F.ProductID
Answer: BCF
Explanation:
B. ProductName: This attribute describes the product and is crucial for understanding and analyzing the data
related to each product.
C. ProductColor: This attribute provides additional information about the product, which can be useful for
analysis, reporting, and segmentation.
F. ProductID: This is the unique identifier for each product and serves as the primary key for the DimProduct
table. It's essential for establishing the relationship between the FactSales table and the DimProduct table.
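The split of the flat table can be sketched as follows. This is a minimal Python illustration under stated assumptions: the sample rows and the function name split_star_schema are hypothetical, the column names are taken from the answer options, and the sketch does not implement the change tracking that the question mentions (that would typically need extra columns such as a surrogate key and validity dates).

```python
# Hypothetical flat rows; the column names follow the answer options.
flat_rows = [
    {"TransactionID": 1, "Date": "2024-01-05", "ProductID": 10,
     "ProductName": "Bike", "ProductColor": "Red", "SalesAmount": 500.0},
    {"TransactionID": 2, "Date": "2024-01-06", "ProductID": 11,
     "ProductName": "Helmet", "ProductColor": "Blue", "SalesAmount": 45.0},
    {"TransactionID": 3, "Date": "2024-01-07", "ProductID": 10,
     "ProductName": "Bike", "ProductColor": "Red", "SalesAmount": 520.0},
]


def split_star_schema(rows):
    """Split flat sales rows into a product dimension and a sales fact."""
    dim_by_id = {}
    fact_sales = []
    for row in rows:
        # Descriptive attributes go to the dimension, one row per ProductID.
        dim_by_id[row["ProductID"]] = {
            "ProductID": row["ProductID"],
            "ProductName": row["ProductName"],
            "ProductColor": row["ProductColor"],
        }
        # Measures and the transaction grain stay in the fact table,
        # which keeps only the ProductID key to reach the dimension.
        fact_sales.append({
            "TransactionID": row["TransactionID"],
            "Date": row["Date"],
            "ProductID": row["ProductID"],
            "SalesAmount": row["SalesAmount"],
        })
    return list(dim_by_id.values()), fact_sales


dim_product, fact_sales = split_star_schema(flat_rows)
print(dim_product)
print(len(fact_sales))
```

Date, TransactionID, and SalesAmount describe an individual sale rather than a product, which is why they remain in FactSales.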
Question: 8 Exam Heist
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as Notebook1.
What should you do?
Answer: A
Explanation:
A. Enable high concurrency for notebooks: High concurrency mode allows multiple notebooks to share the
same Apache Spark session, so Notebook2 can attach to the session that Notebook1 already uses. Sharing a
session also avoids starting a new session for each notebook, which makes resource usage more efficient.
Orders -
Customer -
Employee -
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table, however, the user does
NOT have the elevated permissions required to view the contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without reading data from the
Employee table.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
Answer: DEF
Explanation:
D. Assign the Contributor role to the data engineer for Workspace1:
The Contributor role grants the data engineer the permissions required to write data to the Customer table
in Lakehouse1. Because the role applies to every item in Workspace1, the Employee table must be moved out
of Lakehouse1 so that the data engineer cannot read its content.
E. Migrate the Employee table from Lakehouse1 to Lakehouse2:
Moving the Employee table, which contains Personally Identifiable Information (PII), to a separate Lakehouse2
helps ensure that the data engineer cannot accidentally or intentionally access it. This action keeps sensitive
data segregated from the data engineer's operational environment.
F. Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2:
By creating a new workspace and lakehouse for the Employee table, you further isolate the sensitive data.
The data engineer can still perform their tasks in Workspace1 without accessing Workspace2, ensuring secure
data handling and compliance with privacy requirements.
A.STORED PROCEDURE
B.CONSTRAINT
C.SCHEMA
D.FUNCTION
Answer: D
Explanation:
To implement row-level security (RLS) in a Fabric warehouse such as DW1, you need to create a FUNCTION
that defines the filtering logic. Specifically, an inline table-valued function is created and then bound to a
security policy as the filter predicate, which determines the rows that each user can access.
Reference:
https://learn.microsoft.com/en-us/fabric/data-warehouse/tutorial-row-level-security#2-define-security-policies
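The relationship between the function and the security policy can be modeled in a few lines. The following Python sketch is only an analogy for the T-SQL objects, not warehouse code: the rows, the SalesRep column, the user names, and both function names are hypothetical. It shows the division of labor, where the function decides whether one row is visible and the policy applies that decision to every row that is read.

```python
# Hypothetical rows in a warehouse table; SalesRep is an assumed column.
orders = [
    {"OrderID": 1, "SalesRep": "ana@contoso.com", "Amount": 120},
    {"OrderID": 2, "SalesRep": "raj@contoso.com", "Amount": 75},
    {"OrderID": 3, "SalesRep": "ana@contoso.com", "Amount": 40},
]


def security_predicate(row_sales_rep, current_user):
    """Stand-in for the predicate function: True means the row is visible."""
    return row_sales_rep == current_user


def query_with_policy(rows, current_user):
    """Stand-in for the security policy: filter every row that is read."""
    return [
        row for row in rows
        if security_predicate(row["SalesRep"], current_user)
    ]


visible = query_with_policy(orders, "ana@contoso.com")
print([row["OrderID"] for row in visible])
```

A user who matches no rows simply receives an empty result, which mirrors how a filter predicate silently removes rows instead of raising an error.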
Four notebooks -
Three lakehouses -
Answer: No/Yes/No
Explanation:
1. Data from the semantic models will be deployed to the target stage.
Answer: No
Semantic models are only deployed to the target stage in the form of metadata. The deployment process
does not copy actual data; instead, only the structural and configuration metadata (e.g., model schema and
measures) is deployed. The target stage will require a refresh to fetch the data into the semantic models.
Reference: Microsoft Learn - Item Properties Copied During Deployment
Answer: Yes
Dataflow Gen1 items are supported by deployment pipelines and are deployed to the target stage together
with their configuration, so the dataflows can run in the target environment without manual
reconfiguration.
Answer: No
The deployment process does not copy or deploy refresh schedules for datasets, semantic models, or other
items. Although metadata for the items is deployed, refresh schedules must be manually recreated or
configured in the target stage. This limitation is highlighted in Microsoft's documentation.
Reference: Microsoft Learn - Item Properties Copied During Deployment
A.GitHub Actions
B.a deployment pipeline
C.an Azure DevOps pipeline
Answer: B
Explanation:
B. a deployment pipeline.
Deployment Pipeline: In Microsoft Fabric, a deployment pipeline is specifically designed for promoting
items across environments (Dev, Test, and Prod). It automates the deployment process, which keeps the
stages consistent. By using a deployment pipeline, you can include the eventhouse in the deployment
process and manage its promotion through each stage.
Reference:
https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/get-started-with-deployment-pipelines?tabs=from-fabric%2Cnew%2Cstage-settings-new
https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/understand-the-deployment-process?tabs=new
Answer: B
Explanation:
Microsoft Fabric's deployment pipelines provide a built-in mechanism to manage and validate the deployment
of artifacts like warehouses. When you use a deployment pipeline to move Warehouse1 from one workspace
(Workspace1) to another (Workspace2), the pipeline automatically checks for issues such as invalid
references or missing dependencies during the deployment process.
A.only data added to new databases that are added to the eventhouse
B.only the existing data in the eventhouse
C.no data
D.both new data and existing data in the eventhouse
E.only new data added to the eventhouse
Answer: E
Explanation:
When you enable OneLake availability for an eventhouse, only the new data that is added to the eventhouse
after enabling this setting will be copied to OneLake. The existing data present in the eventhouse prior to
enabling OneLake availability will not be copied automatically. This ensures that users can access the most
recent data through the OneLake file explorer while maintaining the efficiency of data synchronization.
A.service principal
B.Microsoft Entra username and password
C.managed private endpoint
D.workspace identity
Answer: A
Explanation:
A. service principal.
Service Principal: A service principal is a security identity used by applications, services, and automation tools
to access specific Azure resources. It provides a secure way to authenticate and authorize API calls between
Azure DevOps and Fabric. By using a service principal, you can grant the necessary permissions to
deployPipeline1 to interact with the Fabric workspace (Workspace1) and deploy items to higher environments.
This approach ensures secure and managed access without relying on individual user credentials.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the shortcuts shown in the following table.
A.Stores only
B.Products only
C.Stores and Products only
D.Products, Stores, and Trips
E.Trips only
F.Products and Trips only
Answer: C
Explanation:
When the cache for shortcuts is enabled in a Fabric workspace, files read through supported external
shortcuts are stored in a cache so that later reads are faster. Whether data is served from the cache depends
on the shortcut and the file: caching applies only to supported external sources such as Amazon S3 and
Google Cloud Storage, individual files larger than 1 GB are not cached, and by default a cached file is
retained for 24 hours after it was last accessed.
Answer: B
Explanation:
Managed Private Endpoint: This allows secure and private communication between Azure services without
exposing data to the public internet. By creating a managed private endpoint, you can establish a direct
connection between the Apache Spark job in Workspace1 and the Azure SQL database (Source1) while
keeping public internet access disabled. This approach ensures that data transfer happens securely within the
Azure network.
To ensure that Job1 can access the data in Source1, you need to create a managed private endpoint. This will
allow the Spark job to securely connect to the Azure SQL database without requiring public internet access.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the following shortcuts:
A shortcut to ProductFile aliased as Products
A shortcut to StoreFile aliased as Stores
A shortcut to TripsFile aliased as Trips
The data from which shortcuts will be retrieved from the cache?
Answer: B
Explanation:
When the cache for shortcuts is enabled in a Fabric workspace, it allows for faster access to the data by
caching the files locally. This means that data accessed through the cached shortcuts is retrieved from the
local cache instead of the original storage locations, which improves performance.
Reference:
https://learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts
Question: 19 Exam Heist
HOTSPOT -
You have a Fabric workspace named Workspace1 that contains the items shown in the following table.
For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
A.workspace Admin
B.domain admin
C.domain contributor
D.Fabric admin
Answer: D
Explanation:
Fabric Admin: Possesses the highest level of permissions within the Fabric environment, enabling the creation
of domains and subdomains, as well as the assignment of resources to those subdomains.
A.Admin
B.Member
C.Viewer
D.Contributor
Answer: B
Explanation:
Member: This role allows users to view and interact with all the items in the workspace. When combined with
the already assigned object-level permissions to DW1, it ensures that User3 can update the tables in DW1.