Pad21302t Eml Unit 3
Most successful entrepreneurs are always keen to gain an edge by improving the odds when it
comes to making important business decisions. This is where the true value of predictive analytics
and Azure Machine Learning can really shine. In the business world, and in life in general, any
time you can consistently improve your chances of determining an outcome—over just pure luck—
you have a distinct advantage. One simple example of this concept in action is the use of predictive
analytics to provide feedback on the effectiveness of sales and marketing campaigns. By
correlating factors such as customer responses to offers, segmented by customer demographics,
the effects of pricing and discounts, the effects of seasonality, and the effects of social media,
patterns soon start to emerge. These patterns provide clues to the causes and effects that will
ultimately help make better and more informed marketing decisions. This is the basic premise
behind most of today’s targeted marketing campaigns.
Let’s face it—humans, your target customer base, are creatures of habit. When dealing with human
behavior, past behavior is always a strong indicator of future behavior. Predictive analytics and
machine learning can help you capitalize on these key principles by helping to make that past
behavior clearer—and more easily tracked—so that future marketing efforts are more likely to
achieve higher success rates. To make the most of your Azure Machine Learning experience, there
are a few underlying data science principles, algorithms, and theories that are necessary to achieve
a good background and understanding of how it all works. With today’s never-ending explosion
of digital data, along with the rapid advances in “big data” analytics, it’s no wonder that the data
science profession is extremely hot right now. Core to this new, burgeoning industry are
individuals with the right mix of math, statistical, and analytical skills to make sense of all that
data. To that end, in this book we cover only the basics of what you need to know to be effective
with Azure Machine Learning. There are many advanced books and courses on the various
machine learning theories. We leave it to you, the data scientist, to fully explore the depths of
theories behind this exciting new discipline.
Azure Machine Learning (Azure ML) is Microsoft's product to support your end-to-end machine learning lifecycle in Azure.
Azure Databricks is a fast, easy, and collaborative Apache® Spark™-based analytics platform optimized for Azure. Spark unifies batch processing, streaming, and machine learning workloads on a single engine.
Machine Learning Development
Azure Machine Learning
Azure ML enables developers to use the right compute for each task, with best-in-class CPU and GPU support, and it offers many compute options for both training and deployment.
For training, the compute targets are: Azure ML compute cluster, remote VMs, Azure Batch, Azure HDInsight, Azure Databricks, Apache Spark pools, Azure Kubernetes Service (AKS), and Azure Arc-enabled Kubernetes.
The compute options for deployment are: local web service, Azure Kubernetes Service (AKS), Azure Container Instances (ACI), Azure ML compute cluster, Azure Cognitive Search, and Azure IoT Edge devices.
Azure Databricks
In Azure Databricks, by contrast, only the Databricks Spark cluster can be used for both training and deployment. Databricks provides a single, optimized ML runtime that includes popular libraries and frameworks for use at any scale.
The ML Runtime provides one-click access to a reliable and performant distribution of the most popular ML frameworks, plus custom ML environments via pre-built containers or Conda. The ML Runtime includes TensorFlow, scikit-learn, XGBoost, PyTorch, Keras, Horovod, and more.
It is possible to export models from Databricks using MLeap and deploy them elsewhere.
Azure provides built-in auto-scaling options for most of the compute options. Optimized and
autoscaling single and multi-node clusters (including GPU) are available to efficiently run ML
jobs.
Azure Databricks
Optimized autoscaling allows clusters to scale up and down more aggressively in response to load and automatically improves cluster resource utilisation without the need for any complex setup from users. Databricks clusters spin up and scale for processing massive amounts of data when needed and spin down when not in use.
Databricks Pools
Pools enable clusters to start and scale faster by creating a managed cache of virtual machine instances that can be acquired for use when needed. Databricks pools reduce cluster start and auto-scaling times by maintaining a set of idle, ready-to-use instances. Databricks does not charge DBUs while instances are idle in the pool.
Databricks also offers deep integration with Delta Lake, including built-in data versioning, time travel, and the option to use the Databricks Feature Store built on top of Delta Lake. Delta Lake is the default storage format for all operations on Azure Databricks; unless otherwise specified, all tables on Azure Databricks are Delta tables.
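As a small illustration of that default, here is a minimal sketch, assuming it runs in an Azure Databricks notebook (where spark is predefined) and using a hypothetical table name:
Python
# Minimal sketch (Databricks notebook): saveAsTable creates a Delta table
# by default; no explicit format is required. The table name is hypothetical.
df = spark.range(100)
df.write.saveAsTable("demo_table")

# Delta's built-in data versioning enables time travel: query the table
# as it existed at an earlier version (version 0 here).
old = spark.sql("SELECT * FROM demo_table VERSION AS OF 0")
old.show()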
Azure Databricks, with its RDDs, is designed to handle data distributed across multiple nodes, which is advantageous when your data size is huge.
Azure ML provides a better user interface that makes it easy for a novice user to get started. It offers productivity for all skill levels, including code-first (notebooks/IDEs), low-code (AutoML), and no-code (Designer) experiences.
Azure ML's AutoML allows for the automatic training of regression, classification, forecasting, and computer vision (preview) models, and provides more sophisticated automated model creation than Databricks.
Azure Machine Learning is a fully-managed cloud service that provides a range of tools
and resources for building, training, and deploying machine learning models.
With Azure Machine Learning, developers can use Python or R to build and train models
using a variety of algorithms, including linear regression, logistic regression, and decision
trees.
Once a model is trained, it can be deployed as a web service or integrated into an
application using Azure's REST APIs.
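As a rough sketch of that last step, the snippet below calls a deployed scoring endpoint over REST; the URL, API key, and input payload are placeholders, since the real values depend on your deployment:
Python
import json
import urllib.request

# Placeholders: substitute the scoring URI and key from your own deployment.
scoring_uri = "https://<your-service>.azurecontainer.io/score"
api_key = "<your-api-key>"

# The input schema is hypothetical; match it to what your model expects.
body = json.dumps({"data": [[34, 5, 1, 0]]}).encode("utf-8")
request = urllib.request.Request(
    scoring_uri,
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))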
Azure Databricks is a fully-managed cloud service for data engineering, data science, and
analytics. It is built on the popular open-source
Apache Spark framework and offers a range of tools and resources for processing and
analyzing large datasets.
With Azure Databricks, developers can use a variety of programming languages, including
Python, R, and Scala, to build and deploy machine learning models.
Azure Machine Learning Pipelines is a cloud service that provides a range of tools and
resources for automating the process of building, training, and deploying machine learning
models.
With Azure Machine Learning Pipelines, developers can create repeatable workflows for
training and deploying models, as well as manage the entire lifecycle of a machine learning
project.
In addition to these core machine learning services, Azure also provides a range of artificial
intelligence (AI) services that can be used to build intelligent applications and automate business
processes. These services include Azure Cognitive Services, which provides a range of APIs for
tasks such as image and text analysis, and Azure Bot Service, which allows developers to build
and deploy chatbots and other conversational AI applications.
Overall, Azure’s machine learning and AI services provide a range of tools and resources for
building and deploying predictive models and intelligent applications quickly and easily, without
the need for specialized expertise in data science or machine learning.
Whether you are a data scientist, a developer, or a business user, Azure’s machine learning and AI
services can help you turn data into insights and action.
Improving Customer Service: Machine learning can be used to improve customer service
by analyzing customer data and identifying patterns that can help businesses understand
their customers’ needs and preferences.
For example, a retail company might use Azure’s machine learning capabilities to analyze
customer data, including purchase history and customer feedback, to identify trends and
patterns that can help them improve their products and services.
Predicting Maintenance Needs: Machine learning can be used to predict when equipment
is likely to fail or require maintenance, helping businesses to prevent disruptions and
reduce downtime.
For example, a manufacturer might use Azure’s machine learning capabilities to analyze
data from equipment sensors to predict when maintenance is required, enabling the
company to schedule maintenance in advance and reduce downtime.
Optimizing Supply Chain Operations: Machine learning can be used to optimize supply
chain operations by analyzing data from various sources, such as sales data, inventory
levels, and logistics data, to identify patterns and trends that can help businesses improve
efficiency and reduce costs.
For example, a logistics company might use Azure’s machine learning capabilities to
analyze data from its operations to identify bottlenecks and inefficiencies in its supply
chain, enabling the company to make improvements that can reduce costs and improve
customer satisfaction.
Overview of Azure Artificial Intelligence Services
Azure offers a range of artificial intelligence (AI) services that can be used to build intelligent
applications and automate business processes. These services include:
Azure Cognitive Services: Azure Cognitive Services is a collection of APIs that provide
access to a range of AI capabilities, including natural language processing, image and video
analysis, and speech recognition.
These APIs can be used to build intelligent applications that can understand and interact
with humans in a natural way.
For example, a customer service chatbot built using Azure Cognitive Services could
understand and respond to customer inquiries in natural language, helping to improve
customer satisfaction.
Azure Bot Service: Azure Bot Service is a cloud service that allows developers to build
and deploy chatbots and other conversational AI applications.
The service provides a range of tools and resources for building chatbots, including
templates and pre-built connectors to popular messaging platforms such as Skype, Slack,
and Facebook Messenger.
With Azure Bot Service, developers can build chatbots that can understand and respond to
customer inquiries in natural language, helping to improve customer service and reduce the
workload of customer service teams.
Data Science and Analytics Tools in Azure:
Azure provides a range of data science and analytics tools that can be used to process and analyze
large datasets. These tools include:
Azure Synapse Analytics: Azure Synapse Analytics is a cloud service that combines big
data and data warehousing with a range of data integration and data processing capabilities.
With Azure Synapse Analytics, developers can use a variety of programming languages,
including SQL, Python, and .NET, to build and deploy data pipelines that can ingest,
process, and analyze large datasets.
The service also provides a range of tools and resources for data visualization and reporting, enabling users to explore and analyze data in real time.
Azure Data Factory: Azure Data Factory is a cloud service that provides a range of tools
and resources for building and deploying data pipelines.
With Azure Data Factory, developers can create repeatable workflows for extracting,
transforming, and loading data from a variety of sources, including on-premises systems,
cloud storage, and databases.
The service also provides integration with Azure’s machine learning and artificial
intelligence services, enabling developers to build and deploy predictive models and
intelligent applications based on data from their pipelines.
Azure Stream Analytics: Azure Stream Analytics is a cloud service that enables
developers to build and deploy real-time analytics and event-processing applications.
With Azure Stream Analytics, developers can analyze data streams in real-time and take
action based on the results.
For example, a retail company might use Azure Stream Analytics to analyze data from
customer transactions in real-time, triggering alerts when certain thresholds are reached or
identifying trends that can help the company improve its products and services.
Tools for Training and Deploying Machine Learning Models
Azure provides a range of tools and resources for training and deploying machine learning models.
These tools include:
Azure Machine Learning Workspaces: Azure Machine Learning Workspaces are fully-
managed cloud environments that provide a range of tools and resources for building,
training, and deploying machine learning models.
With Azure Machine Learning Workspaces, developers can use a variety of programming
languages, including Python and R, to build and train machine learning models using a
range of algorithms and frameworks.
The workspaces also provide integration with Azure’s data science and analytics tools,
enabling developers to process and analyze large datasets as part of their machine learning
projects.
Azure Machine Learning Compute: Azure Machine Learning Compute is a cloud service
that provides a range of tools and resources for scaling machine learning workloads.
With Azure Machine Learning Compute, developers can easily scale up their machine
learning projects to take advantage of the power of the cloud, without the need to manage
infrastructure. The service also provides integration with Azure’s data science and
analytics tools, enabling developers to process and analyze large datasets as part of their
machine learning projects.
Overall, Azure’s tools and resources for training and deploying machine learning models provide
a range of tools and resources for building and deploying predictive models and intelligent
applications quickly and easily, without the need for specialized expertise in data science or
machine learning.
AI and ML Projects in Azure
Azure provides a range of services and safeguards for projects in artificial intelligence and machine learning. Let's look into the following services provided by Azure:
Security Measures
Azure provides a range of security measures for machine learning and artificial intelligence (AI)
projects to help protect data and ensure compliance with industry regulations. These measures
include:
Data Protection: Azure provides a range of measures to help protect data in machine
learning and AI projects, including encryption, access controls, and data backup and
recovery.
Data stored in Azure is encrypted by default, and developers can use Azure’s Key Vault
service to manage and rotate encryption keys. In addition, Azure provides a range of
access controls, including identity and access management (IAM) and role-based
access controls (RBAC), to help ensure that only authorized users can access data.
Finally, Azure provides a range of backup and recovery options, including snapshot
and point-in-time recovery, to help protect against data loss.
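For instance, here is a minimal sketch of reading a secret with the Azure Key Vault SDK; the vault URL and secret name are hypothetical:
Python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Vault URL and secret name are placeholders for your own resources.
client = SecretClient(
    vault_url="https://<your-vault>.vault.azure.net",
    credential=DefaultAzureCredential(),
)
secret = client.get_secret("training-data-storage-key")
print(secret.name)  # avoid printing secret.value in real code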
Access Controls: Azure provides a range of access controls to help ensure that only
authorized users can access data in machine learning and AI projects.
These controls include identity and access management (IAM) and role-based access
controls (RBAC), which enable developers to specify who has access to data and what
actions they are allowed to perform.
In addition, Azure provides a range of network security controls, such as network security
groups and virtual private networks (VPNs), to help secure data in transit.
Azure provides a range of machine learning and artificial intelligence (AI) offerings for specific
industries, enabling organizations to build and deploy machine learning and AI solutions that are
tailored to their specific needs. Here are a few examples of Azure’s machine learning and AI
offerings for specific industries:
Healthcare: Azure provides a range of machine learning and AI offerings for the
healthcare industry, including Azure Cognitive Services, Azure Synapse Analytics, and
Azure Bot Service.
These offerings can be used to build and deploy solutions that can help improve patient
care, reduce costs, and drive innovation in the healthcare industry.
For example, a hospital might use Azure’s machine learning and AI offerings to build and
deploy a chatbot that can help patients access medical information and schedule
appointments, or to analyze data from electronic medical records to identify trends and
patterns that can help improve patient care.
Financial Services: Azure provides a range of machine learning and AI offerings for the
financial services industry, including Azure Machine Learning, Azure Databricks, and
Azure Synapse Analytics.
These offerings can be used to build and deploy solutions that can help improve risk
management, reduce costs, and drive innovation in the financial services industry.
For example, a bank might use Azure’s machine learning and AI offerings to build and
deploy a machine learning model that can help predict which customers are likely to default
on loans or analyze data from customer transactions to identify trends and patterns that can
help improve the bank’s products and services.
Retail: Azure provides a range of machine learning and AI offerings for the retail industry,
including Azure Cognitive Services, Azure Synapse Analytics, and Azure Stream
Analytics.
These offerings can be used to build and deploy solutions that can help improve customer
service, reduce costs, and drive innovation in the retail industry.
For example, a retail company might use Azure’s machine learning and AI offerings to
build and deploy a chatbot that can help customers access product information and place
orders or analyze data from customer transactions to identify trends and patterns that can
help improve the company’s products and services.
Collaboration and Integration Tools
Azure provides a range of collaboration and integration tools for machine learning and artificial
intelligence (AI) projects to help teams work together effectively and automate their workflows.
These tools include:
Azure DevOps: Azure DevOps is a cloud-based platform that provides a range of tools
and resources for collaborating on software development projects.
With Azure DevOps, teams can use tools such as version control, work tracking, and
continuous integration and deployment to manage their projects and automate their
workflows.
Azure DevOps also provides integration with Azure’s machine learning and AI services,
enabling teams to build and deploy machine learning and AI solutions as part of their
software development projects.
GitHub Actions: GitHub Actions is a cloud-based platform that provides a range of tools and resources for automating software development workflows.
With GitHub Actions, teams can use pre-built actions and integrations to automate tasks such as building, testing, and deploying software.
GitHub Actions also provides integration with Azure's machine learning and AI services, enabling teams to build and deploy machine learning and AI solutions as part of their software development workflows.
Resources and Support for Developers
Azure provides a range of resources and support for machine learning and artificial intelligence
(AI) developers to help them build and deploy machine learning and AI solutions quickly and
easily. These resources include:
Machine learning is a subset of artificial intelligence that involves building and training algorithms to make predictions or decisions based on data. There are several different types of machine learning tasks that organizations can use Azure to help with, including supervised learning, in which models are trained on labeled data.
For example, a supervised learning algorithm might be trained to classify emails as spam or not spam based on a set of labeled emails.
Once the model is trained, it can then be used to predict the output for new, unseen data.
Azure provides a range of tools and resources for building and training supervised
learning models, including Azure Machine Learning, Azure Databricks, and Azure
Machine Learning Pipelines.
Decision Factors
Data Distribution
Azure Machine Learning is a good fit when you are training with limited data on a single machine. Although Azure ML provides training clusters, the distribution of data among the nodes has to be handled in your code.
Azure Databricks, with its RDDs (Resilient Distributed Datasets), is designed to handle data distributed across multiple nodes.
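A minimal sketch of that idea, assuming a Databricks notebook where spark is predefined:
Python
# An RDD automatically partitions records across the cluster's worker nodes.
rdd = spark.sparkContext.parallelize(range(1_000_000), numSlices=8)
print(rdd.getNumPartitions())            # 8 partitions spread over the nodes
print(rdd.map(lambda x: x * 2).sum())    # the map/sum run in parallel per node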
Deep Learning
Both support deep learning workloads. If you use Keras, TensorFlow, Chainer, or PyTorch, there are easy-to-use estimators in the Azure ML SDK.
Azure Databricks is comparatively expensive for deep learning workloads, since you pay for DBUs (Databricks Units) on top of the underlying compute. The benefit of deep learning in Databricks is the use of Horovod, a distributed training framework, but this can also be used in Azure ML via the SDK.
Azure Machine Learning Service provides both SDKs (code-based) and a visual interface to quickly prepare data, train, and deploy machine learning models.
We get the same drag-and-drop experience as Machine Learning Studio; however, unlike the proprietary compute platform of Studio, the visual interface uses your own compute resources.
It integrates with Python environments, frameworks, and tools, and does not restrict what you need to use.
Azure Machine Learning Service gives you full control over training scalability with custom compute targets.
Azure Machine Learning Studio is a web-based drag-and-drop interface for building and deploying
machine learning models. It provides a variety of tools and services for data scientists and
developers to collaborate on ML projects, including data preparation, automated machine learning,
and model deployment.
Pros:
1. Easy to use: Azure Machine Learning Studio has a simple and intuitive interface that makes
it easy to build and deploy machine learning models.
2. Drag-and-drop interface: Azure Machine Learning Studio has a drag-and-drop interface that allows users to easily create workflows and models.
3. Pre-built modules: Azure Machine Learning Studio provides pre-built modules for
common machine learning tasks, such as data cleaning, feature engineering, and model
selection.
4. Good for beginners: Azure Machine Learning Studio is a good choice for beginners who
are new to machine learning and want to get started quickly.
Cons:
1. Limited functionality: Azure Machine Learning Studio has limited functionality compared
to Azure Machine Learning Service, making it less suitable for advanced ML projects.
2. Limited customization: Azure Machine Learning Studio does not allow for much
customization, making it less flexible for complex ML projects.
3. Limited integration: Azure Machine Learning Studio has limited integration with other
Azure services, making it less suitable for projects that require integration with other Azure
services.
Azure Machine Learning (ML) Studio is the primary tool that you will use to develop predictive
analytic solutions in the Microsoft Azure cloud. The Azure Machine Learning environment is
completely cloud-based and self-contained. It features a complete development, testing, and
production environment for quickly creating predictive analytic solutions. Azure ML Studio gives
you an interactive, visual workspace to easily build, test, and iterate on a predictive analysis model.
You drag and drop datasets and analysis modules onto an interactive canvas, connecting them
together to form an experiment, which you can then run inside Azure ML Studio. You can then
iterate on your predictive analytics model by editing the experiment, saving a copy if desired, and
then running it over and over again. Then, when you're ready, you can publish your experiment as
a web service, so that your predictive analytics model can be accessed by others over the Web.
Another key benefit of the Azure Machine Learning cloud-based environment is that getting
started requires virtually no startup costs in terms of either time or infrastructure. Here’s the best
part of all: Almost all the tasks related to Azure Machine Learning can be accomplished within a
modern web browser.
Hybrid training/scoring gives the flexibility to train locally and deploy the trained model
on the cloud or vice versa
Ability to deploy the trained model as close to the scoring data event as possible
Freedom to use any framework, compute power, tools available
AutoML and automatic hyperparameter tuning options.
Data It’s all about the data. Here’s where you will acquire, compile, and analyze testing
and training data sets for use in creating Azure Machine Learning predictive models.
Create the model Use various machine learning algorithms to create new models that
are capable of making predictions based on inferences about the data sets.
Evaluate the model Examine the accuracy of new predictive models based on the ability to predict the correct outcome when both the input and output values are known in advance. Accuracy is measured in terms of a confidence factor approaching the whole number one.
Refine and evaluate the model Compare, contrast, and combine alternate predictive
models to find the right combination(s) that can consistently produce the most accurate
results.
Deploy the model Expose the new predictive model as a scalable cloud web service,
one that is easily accessible over the Internet by any web browser or mobile client.
Test and use the model Implement the new predictive model web service in a test or production application scenario. Add manual or automatic feedback loops for continuous improvement of the model by capturing the appropriate details when accurate or inaccurate predictions are made. By allowing the model to constantly learn from inaccurate predictions and mistakes, unlike humans, it is never destined to repeat them.
The next stop on our Azure Machine Learning journey is to explore the various learning
theories and algorithms behind the technology to maximize our effectiveness with this
new tooling. Machine learning algorithms typically fall into two general categories:
supervised learning and unsupervised learning.
As we dig deeper into the data sciences behind Azure Machine Learning, it is important to note
there are several different categories of machine learning algorithms that are provided in the Azure
Machine Learning toolkit.
Classification algorithms These are used to classify data into different categories that can then be used to
predict one or more discrete variables, based on the other attributes in the dataset.
Regression algorithms These are used to predict one or more continuous variables, such as profit or
loss, based on other attributes in the dataset.
Clustering algorithms These determine natural groupings and patterns in datasets and are used to predict
grouping classifications for a given variable.
Now it’s time to start learning about some of the underlying theories, principles,
and algorithms of data science that will be invaluable in learning how best to use Azure Machine
Learning. One of the first major distinctions in understanding Azure Machine Learning is around
the concepts of supervised and unsupervised learning. With supervised learning, the prediction
model is “trained” by providing known inputs and outputs. This method of training creates a
function that can then predict future outputs when provided only with new inputs. Unsupervised
learning, on the other hand, relies on the system to self-analyze the data and infer common patterns
and structures to create a predictive model. We cover these two concepts in detail in the next
sections.
Supervised learning
Supervised learning is a type of machine learning algorithm that uses known datasets to create a
model that can then make predictions. The known datasets are called training datasets and include input data elements along with known response values. From these training datasets, supervised learning algorithms attempt to build a new model that can make predictions for new input values.
Supervised learning can be separated into two general categories of algorithms:
Classification These algorithms are used for predicting responses that can have just a few known
values—such as married, single, or divorced—based on the other columns in the dataset.
Regression These algorithms can predict one or more continuous variables, such as profit or loss,
based on other columns in the dataset.
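To make the two categories concrete, here is a small generic sketch using scikit-learn (mentioned earlier in this unit) on synthetic data; it is illustrative only, not an Azure-specific API:
Python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: predict a response with just a few known values.
Xc, yc = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(Xc, yc)
print("predicted class:", clf.predict(Xc[:1]))

# Regression: predict a continuous variable, such as profit or loss.
Xr, yr = make_regression(n_samples=200, n_features=4, random_state=0)
reg = LinearRegression().fit(Xr, yr)
print("predicted value:", reg.predict(Xr[:1]))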
The formula for producing a supervised learning model is expressed in Figure 2-2.
Figure 2-2 illustrates the general flow of creating new prediction models based on the use of
supervised learning along with known input data elements and known outcomes to create an entirely
new prediction model. A supervised learning algorithm analyzes the known inputs and known
outcomes from training data. It then produces a prediction model based on applying algorithms that
are capable of making inferences about the data.
The concept of supervised learning should also now become clearer. As in this example, we are
deliberately controlling or supervising the input data and the known outcomes to “train” our model.
One of the key concepts to understand about using the supervised learning approach—to train a new prediction model with predictive algorithms—is that the known input data and known outcome data elements have all been "labeled." For each row of input data, the data elements are designated as to their usage in making a prediction.
Basically, each row of training data contains data input elements along with a known outcome for
those data inputs. Typically, most of the input columns are labeled as features or vector variables. This
labeling denotes that the columns should be considered by the predictive algorithms as eligible input
elements, which could have an impact on making a more accurate prediction. Most important, for
each row of training data inputs, there is also a column that denotes the known outcomes based on the
combination of data input features or vectors. The remaining data input columns would be considered
not used. These not-used columns could be safely left in the data stream for potential use later, if it
was deemed by the data scientist that they would potentially have a significant impact on the outcome
elements or prediction process. To summarize, using the supervised learning approach for creating
new predictive models requires training datasets. The training datasets require that each input column
can have only one of the three following designations:
Features or vectors Known data that is used as an input element for making a prediction.
Labels or supervisory signal Represents the known outcomes for the corresponding features
for the input record.
Not used Known data that remains in the data stream but is not currently designated as a feature or label.
Building ML pipelines with AZURE ML Service
In this section, you'll build an Azure Machine Learning pipeline using Python SDK v2 to complete an image classification task containing three steps: prepare data, train an image classification model, and score the model. Machine learning pipelines optimize your workflow with speed, portability, and reuse, so you can focus on machine learning instead of infrastructure and automation.
The example trains a small Keras convolutional neural network to classify images in the Fashion-MNIST dataset. The pipeline looks like the following:
Prepare input data for the pipeline job
Create three components to prepare the data, train and score
Compose a pipeline from the components (a sketch of this step and of submitting the job follows this list)
Get access to workspace with compute
Submit the pipeline job
Review the output of the components and the trained neural network
(Optional) Register the component for further reuse and sharing within workspace
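Here is a minimal sketch of the compose-and-submit steps. It assumes the three components and the fashion_ds input are defined as shown later in this section, that ml_client is an MLClient connected to your workspace, and that a compute cluster named cpu-cluster exists:
Python
from azure.ai.ml.dsl import pipeline

@pipeline(default_compute="cpu-cluster")
def image_classification_keras_pipeline(pipeline_input_data):
    # Chain the three components: prepare data, train, then score.
    prepare_data_node = prepare_data_component(input_data=pipeline_input_data)
    train_node = keras_train_component(
        input_data=prepare_data_node.outputs.training_data
    )
    keras_score_component(
        input_data=prepare_data_node.outputs.test_data,
        input_model=train_node.outputs.output_model,
    )

pipeline_job = image_classification_keras_pipeline(
    pipeline_input_data=fashion_ds
)

# Submit the pipeline job to the workspace.
pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="pipeline_samples"
)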
This article uses the Python SDK for Azure Machine Learning to create and control an Azure
Machine Learning pipeline. The article assumes that you'll be running the code snippets
interactively in either a Python REPL environment or a Jupyter notebook.
Import all the Azure Machine Learning required libraries that you'll need for this article:
Python
# import required libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.dsl import pipeline
You need to prepare the input data for this image classification pipeline.
Fashion-MNIST is a dataset of fashion images divided into 10 classes. Each image is a 28x28
grayscale image and there are 60,000 training and 10,000 test images. As an image classification
problem, Fashion-MNIST is harder than the classic MNIST handwritten digit database. It's
distributed in the same compressed binary form as the original handwritten digit database.
By defining an Input, you create a reference to the data source location. The data remains in its
existing location, so no extra storage cost is incurred.
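A minimal sketch of such an Input definition; the path is a placeholder for wherever the compressed Fashion-MNIST files are stored:
Python
from azure.ai.ml import Input

fashion_ds = Input(
    type="uri_folder",
    path="wasbs://<container>@<account>.blob.core.windows.net/mnist-fashion/",
)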
The image classification task can be split into three steps: prepare data, train model and score
model.
An Azure Machine Learning component is a self-contained piece of code that does one step in a machine learning pipeline. In this article, you'll create three components for the image classification task:
The next section shows how to create the components in two different ways: the first two components using a Python function and the third component using a YAML definition.
The first component in this pipeline converts the compressed data files of fashion_ds into two CSV files, one for training and the other for scoring. You'll use a Python function to define this component.
If you're following along with the example in the Azure Machine Learning examples repo, the source files are already available in the prep/ folder. This folder contains two files to construct the component: prep_component.py, which defines the component, and conda.yaml, which defines the run-time environment of the component.
By using the command_component() function as a decorator, you can easily define the component's interface, metadata, and code to execute from a Python function. Each decorated Python function is transformed into a single static specification (YAML) that the pipeline service can process.
name is the unique identifier of the component.
version is the current version of the component. A component can have multiple versions.
display_name is a friendly display name of the component in UI, which isn't unique.
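Putting those pieces together, here is a sketch of what prep_component.py can look like; the environment image and the function body are assumptions, and the real file in the examples repo differs in detail:
Python
from pathlib import Path
from mldesigner import command_component, Input, Output

@command_component(
    name="prep_data",
    version="1",
    display_name="Prep Data",
    description="Convert the compressed files into training and scoring CSVs.",
    environment=dict(
        conda_file=Path(__file__).parent / "conda.yaml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    ),
)
def prepare_data_component(
    input_data: Input(type="uri_folder"),
    training_data: Output(type="uri_folder"),
    test_data: Output(type="uri_folder"),
):
    # Convert fashion_ds files into two CSVs: one for training, one for scoring.
    ...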
In this section, you'll create a component for training the image classification model as a Python function, like the Prep Data component.
The difference is that since the training logic is more complicated, you can put the original training
code in a separate Python file.
The source files of this component are under train/ folder in the Azure Machine Learning
examples repo. This folder contains three files to construct the component:
After defining the training function successfully, you can use @command_component in Azure
Machine Learning SDK v2 to wrap your function as a component, which can be used in Azure
Machine Learning pipelines.
The code above defines a component with display name Train Image Classification Keras using @command_component:
The keras_train_component function defines one input, input_data, where the training data comes from; one input, epochs, specifying the number of epochs for training; and one output, output_model, to which the model file is written. The default value of epochs is 10. The execution logic of this component comes from the train() function in train.py above.
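A sketch of that wrapping, assuming a train() function in train.py with the signature described above:
Python
from mldesigner import command_component, Input, Output

@command_component(
    name="train_image_classification_keras",
    display_name="Train Image Classification Keras",
)
def keras_train_component(
    input_data: Input(type="uri_folder"),
    output_model: Output(type="uri_folder"),
    epochs=10,
):
    # The execution logic lives in train.py's train() function.
    from train import train
    train(input_data, output_model, epochs)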
In this section, unlike the previous components, you'll create a component to score the trained model via a YAML specification and script.
If you're following along with the example in the Azure Machine Learning examples repo, the
source files are already available in score/ folder. This folder contains three files to construct the
component:
In this section, you'll learn to create a component specification in the valid YAML component
specification format. This file specifies the following information:
For prep-data component and train-model component defined by Python function, you can import
the components just like normal Python functions.
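A sketch of both import styles, assuming the examples-repo folder layout described above:
Python
from azure.ai.ml import load_component

# Python-function components are imported like normal Python functions.
from prep.prep_component import prepare_data_component
from train.train_component import keras_train_component

# The YAML-defined scoring component is loaded from its specification file.
keras_score_component = load_component(source="./score/score.yaml")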
Azure Machine Learning Studio
Azure Machine Learning Studio is one of the core components of Azure Machine Learning (ML).
It is a graphical interface integrated development environment (IDE) designed for developing and
operationalizing Machine Learning workflows on Azure.
It streamlines the process from data preparation to model deployment, offering a no-code or low-
code experience that makes machine learning accessible to a broader range of users, from
beginners to seasoned data scientists.
The core of Azure ML Studio's appeal lies in its simplicity and power. It provides a user-friendly,
drag-and-drop interface that simplifies the creation, training, and deployment of machine learning
models without requiring deep programming knowledge.
Yet, it remains robust enough for complex workflows, offering functionalities like Automated ML
(AutoML) and the ML Designer for more controlled, custom pipeline constructions.
ML Studio also integrates seamlessly with the Azure ecosystem, providing tools for monitoring
applications and services, securely storing secrets, and managing compute resources. It supports
collaboration through shared notebooks and experiments, enhancing the ability of teams to work
together effectively on machine learning projects.
For data scientists who prefer coding, ML Studio offers the Azure SDK, which allows for Python
code to interact with ML Studio resources and experiments, offering a bridge between the no-
code/low-code and code-centric approaches to machine learning.
This flexibility ensures that Azure ML Studio can meet a wide range of needs and preferences,
from those who favor visual programming and drag-and-drop simplicity to those who prefer the
control and customizability offered by coding.
Automated ML (AutoML) is one of the core components of Azure Machine Learning. It is known
for its ability to automate the selection of algorithms and hyperparameters, streamlining the model
training process.
Users simply specify the dataset, the machine learning task (e.g., classification, regression), and
some optional parameters, and Azure ML Studio handles the rest, delivering the best-performing
model based on the criteria provided.
This not only accelerates the development cycle but also democratizes access to machine learning, enabling users with varying levels of expertise to participate in ML projects.
While Azure ML Studio's no-code, drag-and-drop interface is a major draw for many, it's not
always sufficient for every scenario. For cases requiring more customization and control, Azure
ML supports development through its SDK, primarily using Python.
This enables data scientists and developers to programmatically construct and manage their
machine learning workflows, offering the flexibility to integrate with existing codebases and use
advanced machine learning techniques.
Through the SDK, users can automate tasks like data preparation, model training, hyperparameter
tuning, and deployment, aligning with more complex project requirements.
When utilizing Azure ML SDK, users can develop and assess machine learning models using
standard ML code directly in their local development environment, such as VS Code.
This setup allows for leveraging Azure's computational resources for executing training jobs. The process begins with creating an MLClient as a connection to the Azure workspace, facilitating the management of resources and orchestration of jobs within that environment.
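A minimal sketch of that connection step; the subscription, resource group, and workspace names are placeholders:
Python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)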
Home Screen
Launching the tool
Datasets
Data prep and column selection
Running your experiment
Deploying your model
Home Screen
At first glance the only notable difference is the appearance of the +New button, and the lack of Notebooks through the interface. You can create Azure notebooks separately, but the preview does not have integrated notebooks.
Azure Machine Learning Service Visual Interface (preview)
To get started with the ML Visual Interface you need an Azure subscription and you need to use the Azure portal. From the portal, you create a Machine Learning service workspace. If you want to try it yourself, you can find instructions in the Quickstart: Prepare and visualize data without writing code in Azure Machine Learning.
Create an Azure Machine Learning service workspace
Once you have created the workspace you launch the visual interface from the menu blade
Launch Visual Interface
Datasets
Both tools come with a variety of pre-loaded datasets. Both tools allow you to upload your own
datasets. I was mildly disappointed that my favorite flight dataset is not pre-loaded in ML Visual
interface. I found training a model to predict which flights were late or on time was a great
example, since you don’t need to be a data scientist to understand what data might affect whether
or not a flight is likely to be late.
I have uploaded the titanic dataset to try and predict which passengers would survive.
As soon as I dragged the dataset to the work area I discovered the first 'quirk' of the preview. When I try to visualize my dataset, that option is grayed out. I am going to assume this is just one of the joys of working with a product in preview and will be fixed in a later release. But it is quite irritating, because of course I always want to look at my dataset and make sure it uploaded correctly.
Adding data set to work area
Unfortunately, and I will also chalk this up to quirks of previews that will be fixed in a later release, whatever module you connect first (e.g. Select Columns in Dataset) will not know the names of the columns in your dataset. So you will want to make sure YOU know the column names, because you have to type them in and the interface will not warn you if you enter an invalid column name.
One of the great things about the ML Visual Interface is you control the compute
power used to run your experiment! The only drawback is you have to create the compute
instance before you can run your experiment. Well worth it for the benefits of being able to use
more compute power when needed!
The first time you run the experiment you will need to create new compute. I am just using the
preconfigured defaults for now, but I could go to the Azure portal and create a new compute target
(you will need to learn to do that at some point anyway, because when you are ready to deploy it
will require you to create a compute manually)
Creating compute can take 5-15 minutes, so go ahead and grab a coffee while you wait.
Finished Experiment
When you re-run your experiment, you do not need to Create new compute every time. You can
select existing and re-use the compute you created for the first run of your experiment. The first
time I used the tool I had problems with my previous compute not showing up in the list of existing
compute targets (even when I selected Refresh) but when I closed the browser window and re-
opened it, it would show up. Still faster than creating new Compute every time I want to run the
experiment. Today, I did not need to refresh and my compute target was listed as you would
expect, so maybe they fixed it (if so well done that was quick!), or maybe it depends on your
browser or maybe it’s an intermittent quirk of the preview.
re-use existing compute
One of the other BIG differences in ML Visual Interface is the ability to deploy the model
somewhere other than an Azure web service. You have control. If you know your way around
managing and deploying models, the world is your oyster. I have not yet had a chance to try it
myself (if you try it before I do, please share your experience). Check out Deploy models with the
Azure Machine Learning Service for more instructions and details.
If you just want to deploy it to a web service the way we did in ML Studio, that still works. Possible kudos to the product team for quick work, because when I tried this last week it required me to create a compute target with 12 nodes to deploy. Today I was able to re-use the existing compute I had used to run my experiment, which had only 6 nodes.
Requiring 12 nodes was a hassle because my Pay-As-You-Go Azure subscription had a quota limit of 10 nodes. I had to submit a request to increase my CPU quota. To the credit of the Azure support team (no pun intended), the request was processed almost immediately. I was able to confirm my new quota with an Azure CLI command.
Just for extra fun, the VM family I requested a quota limit for was not one of the valid VMs supported by the tool, so I had to look up which VM family supported the VMs listed as supported in the error message to request the correct quota increase.
I did not have to do ANY of that stuff with quotas today; I just clicked Deploy web service and re-used the existing compute I created to run my experiment. I am leaving those links and information in the blog post, just in case anyone else runs into it, and also so I can remember how to do it if I ever run into that issue again with another service. Blogs make a handy place to keep notes for yourself.
Testing the trained model
This is not a secure way to deploy your model, but it’s great for proof of concept, and testing.
You will see a warning with links to instructions on secure deployment when you open the web
service.
Secure deployment
Fans of Azure Machine Learning Studio are likely to become bigger fans of Azure Machine
Learning Service Visual Interface. Two of the biggest complaints about ML Studio were the
inability to scale compute and the inability to deploy models outside of Azure web services. Both
of these concerns are addressed with Azure Machine Learning Service Visual Interface.
Azure ML Service has been around for a while; however, at the last Microsoft Build, the visual interface of Azure ML Service was introduced, which bears much similarity to Azure ML Studio. There are some differences between them, though: the experiment created in the visual interface is shown in the Assets section as an Experiment, the virtual machine the model runs on is shown in the Compute section, and the same goes for the deployment and other parts. So, unlike Azure ML Studio, the model and service you create can be accessed alongside other models created using Automated ML, Jupyter Notebook Python, and so forth.
To access the Visual Interface, you need to log in to the Azure portal; then, in the Azure ML workspace, choose the Visual Interface.
Click on Launch Visual Interface, click on Datasets, then My Datasets, and choose New dataset.
Upload the data from a local file, then load the data into a dataset.
Then click on Experiments, and then click New experiment.
Now, in the experiment, just as in Azure ML Studio, you are able to drag and drop the imported dataset into the experiment area. Then select the Edit Metadata module to change the Customer Rating column to a categorical variable.
To see the result, you need to run the experiment. To run it, we need to specify a compute (virtual machine) to run the model on. If you already have one, just choose it; otherwise, you need to create a new compute.
Then from the dataset list, I just select Tenure, Ticket Count, Role in Org, Company Size, and
Rating.
Then, just as in Azure ML Studio, select the columns in the dataset and split the data.
In this model, I am going to predict the customer rating. I use the two-class decision forest model and then run the experiment; the experiment runs on the virtual machine.
The experiment has been created; you can see it in the Experiment list alongside other experiments created using other Azure ML services such as Automated ML. The compute will also be shown in the list.
Working With Azure Open Datasets
Azure Open Datasets are curated public datasets that you can use to add scenario-specific features to machine learning solutions, for more accurate models. Open Datasets are available in the cloud,
on Microsoft Azure. They're integrated into Azure Machine Learning and readily available to
Azure Databricks and Machine Learning Studio (classic). You can also access the datasets through
APIs and you can use them in other products, such as Power BI and Azure Data Factory.
Datasets include public-domain data for weather, census, holidays, public safety, and location that
help you train machine learning models and enrich predictive solutions. You can also share your
public datasets through Azure Open Datasets.
Curated open public datasets in Azure Open Datasets are optimized for consumption in machine
learning workflows.
For more information about the available datasets, visit the Azure Open Datasets Catalog resource.
Data scientists often spend the majority of their time cleaning and preparing data for advanced
analytics. To save you time, open Datasets are copied to the Azure cloud, and then preprocessed.
At regular intervals, data is pulled from the sources - for example, by an FTP connection to the
National Oceanic and Atmospheric Administration (NOAA). Next, the data is parsed into a
structured format, and then enriched as needed, with features such as ZIP Code or the locations of
the nearest weather stations.
Datasets are cohosted with cloud compute in Azure, to make access and manipulation easier.
Weather data
Dataset: NOAA Integrated Surface Data (ISD)
Notebooks: Azure, Azure Databricks
Description: Worldwide hourly weather data from NOAA with the best spatial coverage in North America, Europe, Australia, and parts of Asia. Updated daily.
Dataset: NOAA Global Forecast System (GFS)
Notebooks: Azure, Azure Databricks
Description: 15-day U.S. hourly weather forecast data from NOAA. Updated daily.
Calendar data
Dataset: Public Holidays
Notebooks: Azure, Azure Databricks
Description: Worldwide public holiday data, covering 41 nations or regions from 1970 to 2099. Includes country/region and whether most people have paid time off.
Access to datasets
With an Azure account, you can access open datasets through code or through the Azure service
interface. The data is colocated with Azure cloud compute resources for use in your machine
learning solutions.
Open Datasets are available through the Azure Machine Learning UI and SDK. Open Datasets
also provide Azure Notebooks and Azure Databricks notebooks that can connect data to Azure
Machine Learning and Azure Databricks. Datasets can also be accessed through a Python SDK.
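For example, here is a small sketch using the azureml-opendatasets Python package to pull the Public Holidays dataset (listed above) into a pandas DataFrame:
Python
from datetime import datetime
from dateutil.relativedelta import relativedelta
from azureml.opendatasets import PublicHolidays

# Pull the last 12 months of worldwide public holidays.
end_date = datetime.today()
start_date = end_date - relativedelta(months=12)
holidays = PublicHolidays(start_date=start_date, end_date=end_date)
holidays_df = holidays.to_pandas_dataframe()
print(holidays_df.head())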
Azure MLOps
The Azure Machine Learning Notebooks repository contains samples and tutorials using the
Azure Machine Learning SDK with Python or R. These samples are a great way to explore and
learn the SDK capabilities, which enables data scientists to use cloud-hosted services to speed up
development.
MLOps is a practice that streamlines the development and deployment of ML models and AI workflows.
Continuous Integration and Continuous Deployment (CI/CD): Azure MLOps enables continuous
integration and continuous deployment (CI/CD) pipelines for machine learning, allowing for
automated testing, validation, and deployment of models.
With Azure Databricks notebooks, you can develop code using Python, SQL, Scala, and R; customize your environment with the libraries of your choice; and create regularly scheduled jobs to automatically run tasks, including multi-notebook workflows.
MLOps Azure is a Machine Learning engineering culture and practice that aims at unifying ML
system development (Dev) and ML system operation (Ops). It applies the DevOps principles and
practices like continuous integration, delivery, and deployment to the machine learning process,
with an aim for faster experimentation, development, and deployment of Azure machine learning
models into production and quality assurance.
Azure Machine Learning provides a range of MLOps capabilities to support this practice.
The Architecture of MLOps on Azure for Python Models Using Azure ML Service
This architecture describes how we can implement continuous integration (CI), continuous delivery (CD), and a retraining pipeline for an AI application using Azure DevOps and Azure Machine Learning.
1.) Azure Pipelines- The build and test system is based on Azure DevOps and used for the build
and release pipelines. Azure Pipelines break these pipelines into logical steps called tasks.
2.) Azure Machine Learning- This architecture uses the Azure Machine Learning SDK for
Python to create a workspace (space for an experiment), compute resources, and more.
3.) Azure Machine Learning Compute- It is a cluster of virtual machines where a training job is
executed.
4.) Azure Machine Learning Pipelines- It provides reusable machine learning workflows. It is
published or updated at the end of the build phase and is triggered on new data arrival.
5.) Azure Blob Storage- The Blob containers are used for storing the logs from the scoring service.
In the above case, both the input data and model predictions are collected.
6.) Azure Container Registry- The scoring Python script is packaged as a Docker image and
versioned in this registry.
7.) Azure Container Instances- As part of the release pipeline, the QA and staging environments are mimicked by deploying the scoring web-service image to Container Instances, which provides an easy and serverless way to run a container.
8.) Azure Kubernetes Service- It eases the process of deploying a managed Kubernetes cluster
in Azure.
9.) Azure Application Insights- It is a monitoring service used for detecting performance
anomalies.
MLOps Pipeline
1.) Build Pipeline- The CI pipeline gets triggered every time the code is checked in and publishes
an updated Azure Machine Learning pipeline after building the code and running tests. It performs
the following tasks:
Code Quality
Unit Test
Data Test
2.) Retraining Pipeline- It retrains the model on schedule or when there is new data available. It
covers the following steps:
Train Model
Evaluate Model
Register Model
3.) Release Pipeline- It operationalizes the scoring image and promotes it safely across either the
QA environment or the Production environment.
Getting Started With MLOpsPython
Let's look at the steps for getting MLOpsPython working with a sample ML project, diabetes_regression. In this project, we will create a linear regression model to predict diabetes, with CI/CD DevOps practices enabled for model training and serving once these steps are completed.
Step 1: Sign In to the Microsoft Azure and DevOps portal. (You can use the same ID to log in
for both the portals)
Note: If you do not have an Azure account then please create one before moving forward with the
steps. You can Check out our blog to know more about how to create an Azure free trial account.
Step 2: In the DevOps portal, if you are a first-time user then create a New Organization and then click on New Project. Give the project a name and keep Visibility as Private.
Step 3: To add the code to our project, click on Project Settings, select GitHub Connections, and then connect to your GitHub account (it might ask you to sign in to your GitHub account first).
Step 4: After connecting to the GitHub account, select the repository where you have the code.
Step 5: Create a Variable Group for the pipeline. Select the Pipelines Option and click on
Library to create a variable group.
Variable Groups are used to store values that are to be made available across multiple pipelines
and can be controlled by the developer.
Note: You can give the same name for the variable group as mentioned in the document also.
Step 6: Now we have to add some variables in the variable group created in the previous step. So
click on the Add button and add these variables.
Step 7: Now we have to create a service connection for Azure Resource Manager. So click on Project Settings and select Service Connections.
Choose a Service Connection: Azure Resource Manager
Authentication Method: Service Principal (automatic)
Scope Level: Subscription (Your subscription will be populated in the box)
Service connection name: Give the name which you specified in the
AZURE_RM_SVC_CONNECTION variable and click on save.
The Connection will be created and can be viewed in the Azure Portal under Azure Active
Directory Tab.
Step 8: Create Infrastructure as Code (IaC) Pipeline by clicking on Pipelines and selecting New
Pipeline.
You can Review the created pipeline and also rename it as IaC Pipeline for better understanding
in the future. Once the pipeline is created click on Run.
Step 9: After the pipeline starts running, you can see in the Azure portal that a new resource group has been created with a Machine Learning service workspace.
Step 10: Now the next step will be to connect Azure DevOps Service to Azure ML Workspace.
For this, we will create a new service connection with the scope of the Machine Learning
Workspace.
Step 11: After the connection is established, we will create a Model Train and Register CI pipeline and run it (it will take approximately 20 minutes to run this pipeline).
Step 12: After the pipeline run is completed, you can view the details in the Azure portal under the Models and Pipelines sections of your workspace.
Step 13: If you want to deploy the model, create another pipeline and run it. There are three deployment environments: Deploy to ACI, Deploy to AKS, or Deploy to Web App.
So this is how we can get MLOpsPython working with some sample projects.
Azure ML Notebooks
Pipelines with Azure Data Lake and Azure ML
The Azure Data Lake Store provides a single repository where you can easily capture data of any
size, type and speed without forcing changes to your application as data scales. Azure Data Lake
Analytics is a new service built on Apache YARN and includes U-SQL, a language that unifies
the benefits of SQL with the expressive power of user code. The service dynamically scales and
allows you to do analytics on any kind of data with enterprise-grade security through Azure Active
Directory so you can focus on your business goals.
Easily move data to Azure Data Lake Store
As of today, Azure Data Factory supports moving data from the following sources to Azure Data
Lake Store:
Azure Blob
Azure SQL Database
Azure Table
On-premises SQL Server Database
Azure DocumentDB
Azure SQL DW
On-premises File System
On-premises Oracle Database
On-premises MySQL Database
On-premises DB2 Database
On-premises Teradata Database
On-premises Sybase Database
On-premises PostgreSQL Database
On-premises HDFS
Generic OData (Coming soon!)
Generic ODBC (Coming soon!)
You can also move data from Azure Data Lake Store to a number of sinks such as Azure Blob,
Azure SQL Database, on-premises file system, etc.
A Data Factory or Synapse Workspace can have one or more pipelines. A pipeline is a logical
grouping of activities that together perform a task. For example, a pipeline could contain a set of
activities that ingest and clean log data, and then kick off a mapping data flow to analyze the log
data. The pipeline allows you to manage the activities as a set instead of each one individually.
You deploy and schedule the pipeline instead of the activities independently.
The activities in a pipeline define actions to perform on your data. For example, you can use a
copy activity to copy data from SQL Server to an Azure Blob Storage. Then, use a data flow
activity or a Databricks Notebook activity to process and transform data from the blob storage to
an Azure Synapse Analytics pool on top of which business intelligence reporting solutions are
built.
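As an illustration of this structure, the sketch below uses the azure-mgmt-datafactory Python SDK to define a pipeline containing a single copy activity. It is a minimal sketch under stated assumptions: the subscription ID, resource group, factory name, and the two datasets (RawLogsDataset, CleanLogsDataset) are hypothetical and assumed to already exist in the factory.

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobSink, BlobSource, CopyActivity, DatasetReference, PipelineResource)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# One copy activity: read from the input dataset, write to the output dataset.
copy_logs = CopyActivity(
    name="CopyLogData",
    inputs=[DatasetReference(reference_name="RawLogsDataset", type="DatasetReference")],
    outputs=[DatasetReference(reference_name="CleanLogsDataset", type="DatasetReference")],
    source=BlobSource(),
    sink=BlobSink())

# The pipeline is the unit you deploy and schedule; its activities run as a set.
client.pipelines.create_or_update(
    "my-resource-group", "my-data-factory", "IngestAndCleanLogs",
    PipelineResource(activities=[copy_logs]))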
Azure Data Factory and Azure Synapse Analytics have three groupings of activities: data
movement activities, data transformation activities, and control activities. An activity can take zero
or more input datasets and produce one or more output datasets. The relationship between pipeline, activity, and dataset works as follows:
An input dataset represents the input for an activity in the pipeline, and an output dataset represents
the output for the activity. Datasets identify data within different data stores, such as tables, files,
folders, and documents. After you create a dataset, you can use it with activities in a pipeline. For
example, a dataset can be an input/output dataset of a Copy Activity or an HDInsight Hive Activity.
To create a new pipeline, navigate to the Author tab in Data Factory Studio (represented by the
pencil icon), then click the plus sign and choose Pipeline from the menu, and Pipeline again from
the submenu.
Data Factory will display the pipeline editor.
CI/CD for Machine Learning with Azure Pipelines
This article explains Azure continuous integration and continuous delivery (CI/CD) data pipelines and their importance for data science.
A typical data pipeline does the following:
Ingest data from various data sources.
Process and transform the data.
Save the processed data to a staging location for others to consume.
Enterprise data pipelines can evolve into more complicated scenarios with multiple source systems
and various supported downstream applications.
Data pipelines let data professionals focus on their core job functions, getting insights from the
data and helping businesses make better decisions.
Building machine learning models is similar to traditional software development in that data scientists write code to train and score machine learning models. But unlike traditional software based on code, data science machine learning models are based on both code, such as algorithms and hyperparameters, and the data used to train the models. Most data scientists say they spend 80% of their time doing data preparation, cleaning, and feature engineering.
To ensure the quality of machine learning models, techniques such as A/B testing are also used to
compare and maintain model performance. A/B testing usually uses one control model and one or
more treatment models.
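True A/B testing routes live traffic between the control and treatment models; as a simplified offline stand-in, the sketch below compares a control and a treatment model on the same holdout set and promotes the treatment only if it wins. It uses plain scikit-learn and the diabetes dataset; the model choices are hypothetical.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

control = LinearRegression().fit(X_train, y_train)    # current production model
treatment = Ridge(alpha=1.0).fit(X_train, y_train)    # candidate model

control_mse = mean_squared_error(y_test, control.predict(X_test))
treatment_mse = mean_squared_error(y_test, treatment.predict(X_test))

# Promote the treatment model only if it improves on the control.
best = treatment if treatment_mse < control_mse else control
print(f"control MSE={control_mse:.2f}, treatment MSE={treatment_mse:.2f}")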
Multiple machine learning models might be used concurrently, adding another layer of complexity
for the CI/CD of machine learning models. A CI/CD data pipeline is crucial for the data science
team to deliver quality machine learning models to the business in a timely manner.
Machine learning is a data science technique that allows computers to use existing data to forecast future behaviors, outcomes, and trends. By using machine learning, computers learn without being explicitly programmed.
Azure Machine Learning service provides a cloud-based environment you can use to develop,
train, test, deploy, manage, and track machine learning models.
1. How to build the Continuous Integration and Continuous Delivery pipelines for a Machine
Learning project with Azure Pipelines.
This template contains code and pipeline definitions for a machine learning project demonstrating how to automate an end-to-end ML/AI project. The build pipelines include DevOps tasks for data sanity tests, model training on different compute targets, model version management, model evaluation/model selection, model deployment as a real-time web service, staged deployment to QA/prod, and integration and functional testing. A sketch of a typical data sanity test follows.
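This is a minimal, hypothetical sketch of the kind of data sanity test such a build pipeline can run before training (pytest-style, against the diabetes dataset used by the sample project); the exact checks in the template may differ.

import numpy as np
from sklearn.datasets import load_diabetes

def test_no_missing_values():
    # Training should never start on data with missing values.
    X, y = load_diabetes(return_X_y=True)
    assert not np.isnan(X).any(), "features contain missing values"
    assert not np.isnan(y).any(), "labels contain missing values"

def test_expected_schema():
    # The diabetes dataset has exactly ten feature columns.
    X, _ = load_diabetes(return_X_y=True)
    assert X.shape[1] == 10, "unexpected number of feature columns"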
Before you begin
1. Refer to the Getting Started page before you begin following the exercises.
2. Use the Azure DevOps Demo Generator to provision the project in your Azure DevOps organization. This URL will automatically select the Azure Machine Learning template in the demo generator.
In this exercise, you will configure the CI pipeline for your ML/AI project. This build pipeline includes DevOps tasks for data sanity tests, model training on different compute targets, model version management, model evaluation/model selection, etc.
3. Select the Create or Get Workspace task. Select the Azure subscription from the drop-down list and click Authorize to configure the Azure service connection. This task is used here to create the workspace for the Azure Machine Learning service.
4. Click all other tasks in the pipeline and select the same subscription. Once the tasks are
updated with a subscription, Save the changes.
5. Select Triggers and make sure that CI is enabled.
6. The steps performed in the CI pipeline are:
o Prepare the Python environment
o Get or create the workspace for the AML service
o Submit the training job on the remote DSVM / local Python environment
o Compare the performance of different models and select the best
o Register the model to the workspace
o Create a Docker image for the scoring web service
o Copy and publish the artifacts to the release pipeline
A condensed Python sketch of the first of these core steps appears below.
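This is a minimal sketch assuming the Azure ML SDK v1; the workspace, experiment, and model names are hypothetical, and the template's actual scripts differ in detail.

from azureml.core import Workspace, Experiment, ScriptRunConfig

# Get or create the workspace for the AML service.
ws = Workspace.create(name="mlops-ws",
                      subscription_id="<subscription-id>",
                      resource_group="mlops-rg",
                      location="eastus",
                      exist_ok=True)   # reuse the workspace if it already exists

# Submit the training job (locally here; a remote DSVM or cluster
# would be passed as compute_target on the run configuration).
run = Experiment(ws, "ci-training").submit(
    ScriptRunConfig(source_directory=".", script="train.py"))
run.wait_for_completion(show_output=True)

# Register the trained model to the workspace so the release
# pipeline can build a scoring image from it.
model = run.register_model(model_name="diabetes-model",
                           model_path="outputs/model.pkl")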
In this exercise, we will configure the release pipeline, which will deploy the image created by the build pipeline to Azure Container Instances and Azure Kubernetes Service.
1. Navigate to Pipelines » Releases. Select Deploy Web service and click Edit pipeline.
2. You will see a release pipeline with two stages, QA and Prod. Select the QA stage and select View stage tasks to view the tasks in the QA stage.
3. In the QA stage, the Conda Environment and Install Requirements tasks are used to set up and prepare the Python environment for the subsequent tasks.
4. Select the Deploy Web service on ACI task. Select the Azure subscription details. This task creates an ACI (Azure Container Instance) and deploys the web service image created in the build pipeline to ACI.
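Under the hood, this task does roughly what the following Azure ML SDK v1 sketch does; the entry script, the requirements file, and the model and service names are hypothetical.

from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = Model(ws, name="diabetes-model")  # registered by the build pipeline

# Scoring environment and entry script (hypothetical file names).
env = Environment.from_pip_requirements("scoring-env", "requirements.txt")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# Small, serverless container instance suitable for the QA stage.
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(ws, "diabetes-aci-qa", [model],
                       inference_config, aci_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)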
5. Select the Test ACI Web service task. Update the Azure subscription details. This task tests the scoring image that was deployed.
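A smoke test of this kind can be as simple as the sketch below: post a sample payload to the scoring URI and check the response. The service name and the payload shape (which depends on the entry script) are hypothetical.

import json
import requests
from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
service = Webservice(ws, name="diabetes-aci-qa")

# One sample row with the ten diabetes features (hypothetical payload shape).
sample = {"data": [[0.04, 0.05, 0.06, 0.02, -0.04, -0.03, -0.04, -0.002, 0.02, -0.02]]}

response = requests.post(service.scoring_uri,
                         data=json.dumps(sample),
                         headers={"Content-Type": "application/json"})
assert response.status_code == 200, response.text
print("prediction:", response.json())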
6. Click on Tasks at the top to switch to the Prod stage, update the subscription details for the two tasks in the Prod stage, and click Save to save the changes.
Exercise 3: Update config file in the source code to trigger CI and CD
2. Update your Azure subscription ID in place of <>. If required, change the resource group name, the AML workspace name, and the location where you want to deploy your Azure ML service workspace. Click Commit to commit the changes.
You can find the details of the code and scripts here
4. Now navigate to your Azure portal to view the resources provisioned and deployed by the CI/CD pipelines.
Using Microsoft DevLabs Extensions
The Estimate extension contains four kinds of estimation cards to support the estimation strategy that your team uses today. Generally, agile teams avoid estimating tasks based on time; there are better ways to evaluate and “right size” work, such as T-shirt sizes or Fibonacci numbers. The idea is to define the smallest piece of work and assign it the lowest value, such as XS (extra small) or 1. This product backlog item then serves as a reference point during planning. Estimate is an online, real-time tool that allows each team member to vote simultaneously on the same product backlog item without influencing one another. After voting on a backlog item, the team can then discuss and approve the proposed estimate or modify it as required. The estimate will automatically be added to the backlog item.
Code Search
Many times, you might need to check something within the code you're working on, but you
might not want to download the full solution to your local environment. Thankfully, we have the
option to use Code Search. This simple but powerful tool offers many benefits and improves
productivity. Code Search scans the entire organization's repositories, creating indexes of the code to search. The extension not only searches by name, but also checks all references and implementations of the criteria you are looking for.
Inside the search results, you will obviously see the source code. More importantly, you will also
see the history of changes in your selected file and a comparison between versions. It’s
extremely useful to find what has changed between versions or which line of code is related to a
particular bug.
We’ve discussed some wonderful functionalities that can be added to boards and search. While we are talking about code, however, we would be remiss if we did not tell you about the Pull Request Merge Conflict extension. This tool has been the safety net that has saved us time after
time. As any developer will tell you, sometimes during the pull request, there are conflicts.
Developers need to resolve the conflicts between their local copy and the copy of code that’s on
the server. Thanks to this plugin, every conflict can be resolved online on the Pull Request
site.
Moreover, if our code contains an automated build during the pull request, you can easily verify
that you didn’t break the code! How’s that for useful?
SonarQube
Finally, an extension or, to be fair, a product that we’d love to share with you, is SonarQube.
More precisely, SonarCloud. This product allows you to conduct static code analysis. Thus,
your development team can ensure that they haven't used any vulnerable components and
that the code doesn’t have any security or code quality issues. Furthermore, the SonarQube
analysis is always run during the Pull Request so that every code review will have additional protection built in.
SonarQube is a product that you can install on your own server, or you can connect to the cloud-hosted version (SonarCloud). During the Pull Request or Release, SonarQube fetches the source code and analyzes everything regarding security and quality. When the scanning process completes, the status and report are sent to Azure DevOps for review. Static code analysis should be a part of every pull request to ensure that your solution will be more protected and stable.
As your codebase expands and is divided across multiple projects and repositories, finding what
you need becomes increasingly difficult. To maximize cross-team collaboration and code
sharing, you need a solution that can quickly and efficiently locate relevant information across
all of your projects.
From discovering examples of an API's implementation, browsing its definition, to searching for
error text, Code Search delivers a one-stop solution for all your code exploration and
troubleshooting needs.
Code Search is available for Azure DevOps Services (formerly known as Visual Studio Team Services) and Azure DevOps Server (formerly known as Team Foundation Server) starting with TFS "15". You can configure Code Search as part of the Azure DevOps Server configuration.
For more details see Administer Search.
Pricing: Free
Search across one or more projects
Start searching for code using the search box in the top right corner.
Or you can use the context menu from the code explorer.
Code Search enables you to search across all repositories, so you can focus on the results that
matter most to you.
Semantic ranking
Ranking ensures you get what you are looking for in the first few results. Code Search uses code semantics as one of the many signals for ranking, ensuring that the matches are laid out in a relevant manner. For example, files where the given term appears as a definition are ranked higher.
Rich filtering
Get that extra power from Code Search: it lets you filter your results by project, repo, branch, file path, and extension. You can also filter by code type, such as definition, comment, reference, and much more. And by incorporating logical operators such as AND, OR, and NOT, you can refine your query to get the results you want; for example, def:Main AND ext:cs finds C# files in which Main is defined.
The Code Search interface integrates with familiar controls in Azure Repos, giving you the ability to look up History, compare what’s changed since the last commit or changeset, Blame, and much more.