Machine Learning: An Overview
Machine learning (ML) is a type of artificial intelligence (AI) that
allows computer systems to learn from data without being
explicitly programmed. ML algorithms are trained on data to
identify patterns and relationships, enabling them to make
predictions or decisions based on new, unseen data.
Forms of Machine Learning
1 Supervised Learning
In supervised learning, the algorithm learns from labeled data, where each input is associated
with a known output. The algorithm uses this labeled data to infer the relationship between
inputs and outputs, enabling it to predict outputs for new, unseen inputs.
2 Unsupervised Learning
In unsupervised learning, the algorithm learns from unlabeled data, where no output labels
are provided. The algorithm identifies patterns and relationships within the data, grouping
similar data points together or discovering hidden structures.
3 Reinforcement Learning
In reinforcement learning, the algorithm learns through trial and error, interacting with an
environment and receiving rewards or penalties for its actions. The algorithm learns to
maximize its rewards by adjusting its behavior based on past experiences.
Supervised Learning: Classification
Definition
Classification is a type of supervised learning where
the goal is to categorize input data into one of several
predefined classes. The algorithm learns from labeled
data, where each input is associated with a specific
class.
Examples
Examples of classification problems include:
• Face recognition
• Object detection in images
• Spam filtering
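To make the idea of learning classes from labeled data concrete, here is a minimal sketch of a k-nearest-neighbors classifier applied to a toy spam-filtering problem. The choice of k-nearest neighbors, the two features, and the data values are assumptions made for illustration; they do not come from the slides.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority class among the k labeled points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled data: (number of links, number of exclamation marks) -> class
labeled_emails = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"),
    ((7, 5), "spam"), ((8, 6), "spam"), ((6, 7), "spam"),
]

# Classify a new, unseen email described by the same two features.
prediction = knn_predict(labeled_emails, (7, 6))
print(prediction)
```

The new email sits closest to the three examples labeled "spam", so the majority vote assigns it to that class.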
Supervised Learning: Regression
Definition
Regression is a type of supervised learning where the
goal is to predict a numeric output based on input
data. The algorithm learns from labeled data, where
each input is associated with a known numeric output.
Examples
Examples of regression problems include:
• Predicting the age of a person based on their habits
• Predicting the future prices of stocks
• Estimating the value of a house based on its features
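As a small illustration of predicting a numeric output, the sketch below fits a straight line by ordinary least squares and uses it to estimate a house price from floor area. The single feature, the data values, and the helper name `fit_line` are hypothetical, and the data is deliberately exactly linear to keep the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: floor area in square meters -> price in thousands
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(areas, prices)
estimate = slope * 100 + intercept   # predicted price for an unseen 100 m^2 house
print(slope, intercept, estimate)
```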
Unsupervised Learning: Clustering
Definition
Clustering is a type of unsupervised learning where
the goal is to group similar data points together based
on their common characteristics or attributes. The
algorithm learns from unlabeled data, identifying
patterns and relationships within the data.
Examples
Examples of clustering problems include:
• Categorizing different types of customers for marketing purposes
• Grouping similar documents based on their content
• Identifying clusters of stars in a galaxy
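One common way to group unlabeled points is k-means, sketched below for a toy customer dataset. k-means is chosen here only as an example algorithm; the customer features, the data, and the fixed starting centroids are assumptions made so the sketch is deterministic.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]   # an empty cluster keeps its centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled customers: (monthly spend, visits per month)
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 2.0),
             (8.0, 8.0), (9.0, 9.0), (8.0, 9.0)]

centroids, clusters = k_means(customers, [(1.0, 1.0), (9.0, 9.0)])
print(centroids)
print(clusters)
```

No labels are supplied at any point; the two groups emerge purely from the distances between the data points.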
Unsupervised Learning: Association
Definition
Association is a type of unsupervised learning where
the goal is to identify interesting relationships or
dependencies among data attributes. The algorithm
learns from unlabeled data, discovering patterns and
co-occurrences.
Examples
Examples of association problems include:
• Developing a product recommendation system based on customer shopping behavior
• Identifying common patterns in website traffic
• Discovering relationships between medical symptoms and diagnoses
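The following sketch shows one simple way to surface co-occurrences in shopping baskets: count how often pairs of items appear together and report two standard association measures, support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). The baskets, the `min_support` threshold, and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            rules.append((a, b, support, count / item_counts[a]))
            rules.append((b, a, support, count / item_counts[b]))
    return rules

# Hypothetical unlabeled shopping baskets (each basket is a set of items)
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
]

for antecedent, consequent, support, confidence in pair_rules(baskets):
    print(f"{antecedent} -> {consequent}: support={support}, confidence={confidence}")
```

A rule such as "butter -> bread" with high confidence is the kind of relationship a recommendation system could build on.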
Reinforcement Learning
Agent
The intelligent agent interacts with the environment, making
decisions and receiving feedback in the form of rewards or
penalties.
Environment
The environment provides the context for the agent's
actions, responding to its decisions and providing feedback.
Reward Function
The reward function defines the goals of the agent,
specifying which actions are desirable and which are
undesirable.
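The interplay of agent, environment, and reward function can be sketched with tabular Q-learning, one widely used reinforcement learning algorithm, chosen here purely for illustration. The corridor environment, the reward values, and all parameter settings below are assumptions, not part of the slides.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    if next_state == N_STATES - 1:
        return next_state, 10.0, True    # reward for reaching the goal
    return next_state, -1.0, False       # penalty for every other move

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn action values by trial and error."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):             # cap the length of an episode
            if rng.random() < epsilon:   # occasionally explore at random
                action = rng.choice(ACTIONS)
            else:                        # otherwise exploit current estimates
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Here `step` plays the role of the environment and reward function, while `train` is the agent adjusting its behavior based on the feedback it receives.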
Machine Learning Workflow
1 Understand the Objectives
Define the purpose of the ML model and agree
on acceptance criteria with stakeholders to
ensure alignment with business priorities.
2 Select a Framework
Choose a suitable AI development framework
based on objectives, acceptance criteria, and
business priorities.
3 Select & Build the Algorithm
Select an ML algorithm based on factors such
as objectives, acceptance criteria, and
available data. The algorithm may be manually
coded or retrieved from a library of pre-written
code.
4 Prepare & Test Data
Prepare data for training and testing the
model, including data acquisition,
pre-processing, and feature engineering. Perform
exploratory data analysis, and test both the data
and any automated data preparation steps.
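Two typical data preparation steps are sketched below: splitting a dataset into training, validation, and test portions, and scaling a feature to the range seen in the training data. The split fractions, the seed, and the helper names are hypothetical choices, and min-max scaling is just one possible pre-processing step.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle the rows and split them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def fit_min_max(training_values):
    """Return a scaling function whose range is fitted on training data only."""
    lo, hi = min(training_values), max(training_values)
    span = (hi - lo) or 1.0              # avoid dividing by zero for constant data
    return lambda value: (value - lo) / span

train_rows, val_rows, test_rows = split_dataset(range(10))
scale = fit_min_max([10, 20, 30])
print(len(train_rows), len(val_rows), len(test_rows))
print(scale(20), scale(40))
```

Fitting the scaler on training data alone keeps information from the validation and test sets out of the training process, which is also why a later value such as 40 can fall outside the 0 to 1 range.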
5 Train the Model
Train the selected ML algorithm using training
data to create the model. Adjust model
hyperparameters and algorithm
hyperparameters to control the training
process.
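As a rough sketch of how hyperparameters control training, the example below trains a one-feature linear model by gradient descent. The `learning_rate` and `epochs` arguments are hyperparameters set before training, while `w` and `b` are the parameters learned from the data. The model, the data, and the default values are assumptions for illustration.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical training data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = train_linear(xs, ys)
print(w, b)
```

A learning rate that is too large makes the updates diverge, and too few epochs stops training before the parameters settle, which is why these values are adjusted as part of training.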
6 Evaluate the Model
Evaluate the model against agreed ML
functional performance metrics using a
validation dataset. Use the results to improve
(tune) the model.
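The sketch below computes three commonly used functional performance metrics for a binary classifier on a validation set and compares them with an acceptance threshold. The labels, predictions, and the 0.7 threshold are hypothetical; the metrics actually agreed for a project may differ.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical validation labels and model predictions
validation_labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_predictions = [1, 1, 0, 0, 0, 1, 0, 1]

metrics = classification_metrics(validation_labels, model_predictions)
meets_criteria = all(value >= 0.7 for value in metrics.values())
print(metrics, meets_criteria)
```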
7 Tune the Model
Adjust model settings to fit the data and
improve performance based on evaluation
results. Tune hyperparameters or update
model attributes.
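Tuning can be as simple as trying several candidate values for a setting and keeping the one that scores best on the validation data. The sketch below does this for a decision threshold applied to model scores; the scores, labels, and candidate values are hypothetical, and a threshold is only one example of a model setting that might be tuned.

```python
def accuracy(threshold, scores, labels):
    """Accuracy when scores at or above the threshold are predicted as class 1."""
    predictions = [1 if score >= threshold else 0 for score in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def grid_search(candidates, scores, labels):
    """Return the candidate threshold with the highest validation accuracy."""
    return max(candidates, key=lambda t: accuracy(t, scores, labels))

# Hypothetical model scores and true labels for the validation set
val_scores = [0.1, 0.35, 0.4, 0.55, 0.6, 0.8, 0.9, 0.45]
val_labels = [0, 0, 0, 1, 1, 1, 1, 1]

best_threshold = grid_search([0.3, 0.5, 0.7], val_scores, val_labels)
print(best_threshold, accuracy(best_threshold, val_scores, val_labels))
```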
8 Test the Model
Test the model against an independent test
dataset to ensure that agreed ML functional
performance criteria are met. Compare
performance with evaluation results and
consider selecting a different model if
performance is significantly lower.
9 Deploy the Model
Re-engineer the tuned model for deployment
along with its related resources, including the
data pipeline. Deploy the model to target
environments, such as embedded systems or
the cloud.
10 Use the Model
Use the deployed model operationally,
performing scheduled batch predictions or
running on request in real time.
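For the testing step (step 8), the comparison between evaluation and test results can be expressed as a small check like the one below. The acceptance level of 0.80 and the permitted drop of 0.05 are hypothetical values; what counts as "significantly lower" has to be agreed for each project.

```python
def check_test_performance(val_score, test_score, acceptance=0.80, max_drop=0.05):
    """Compare the test score with the acceptance level and the validation score."""
    if test_score < acceptance:
        return "criteria not met"
    if val_score - test_score > max_drop:
        return "reconsider model"        # much worse on data the model never saw
    return "accept"

print(check_test_performance(0.90, 0.88))
print(check_test_performance(0.92, 0.81))
print(check_test_performance(0.82, 0.78))
```

A large gap between validation and test performance can suggest that the model was fitted too closely to the data used during tuning.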
11 Monitor and Tune the Model
Regularly evaluate the operational model
against its acceptance criteria to identify and
manage drift. Update model settings or re-train
with new data to address drift or improve
accuracy.
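One simple way to monitor an operational model is to track its accuracy over successive time windows and raise a flag once it stays below the acceptance level. The sketch below assumes weekly accuracy figures are available; the figures, the 0.85 acceptance level, and the `patience` setting are all hypothetical.

```python
def detect_drift(window_accuracies, acceptance=0.85, patience=2):
    """Return the index of the window at which accuracy has been below the
    acceptance level for `patience` consecutive windows, or None."""
    below = 0
    for index, accuracy in enumerate(window_accuracies):
        below = below + 1 if accuracy < acceptance else 0
        if below >= patience:
            return index
    return None

# Hypothetical weekly accuracy of the deployed model
weekly_accuracy = [0.91, 0.90, 0.84, 0.88, 0.83, 0.81, 0.80]

drift_week = detect_drift(weekly_accuracy)
print(drift_week)
```

Requiring several consecutive low windows avoids reacting to a single noisy week; once drift is flagged, the model settings can be updated or the model re-trained with new data.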
Selecting a Form of Machine Learning
Supervised Learning
Consider supervised learning if there is labeled data available, where each input is
associated with a known output. Choose classification if the output is discrete and
categorical, and regression if the output is numeric and continuous.
Unsupervised Learning
Consider unsupervised learning if no output labels are provided in the dataset. Choose
clustering if the problem involves grouping similar data points, and association if the
problem involves finding co-occurring data items.
Reinforcement Learning
Consider reinforcement learning if the problem involves interaction with an environment and
the notion of multiple states, where decisions are made at each state.
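The selection guidance above can be summarized as a small decision helper. This is only a sketch of the stated rules: the function name, argument names, and string values are hypothetical, and checking for environment interaction first is an assumption about how the rules might be ordered.

```python
def suggest_ml_form(has_labels, output_kind=None, goal=None,
                    interacts_with_environment=False):
    """Map basic properties of a problem to a form of machine learning."""
    if interacts_with_environment:
        return "reinforcement learning"
    if has_labels:
        if output_kind == "categorical":
            return "supervised learning: classification"
        if output_kind == "numeric":
            return "supervised learning: regression"
        raise ValueError("output_kind must be 'categorical' or 'numeric'")
    if goal == "grouping":
        return "unsupervised learning: clustering"
    if goal == "co-occurrence":
        return "unsupervised learning: association"
    raise ValueError("goal must be 'grouping' or 'co-occurrence'")

print(suggest_ml_form(has_labels=True, output_kind="numeric"))
print(suggest_ml_form(has_labels=False, goal="grouping"))
print(suggest_ml_form(has_labels=False, interacts_with_environment=True))
```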
Factors Involved in ML Algorithm Selection
Functionality: Classification, prediction, etc.
Quality Characteristics: Accuracy, speed, memory usage, interpretability, etc.
Data Type: Image, text, numerical, etc.
Data Quantity: Amount of data available for training and testing
Number of Features: Number of input variables used by the model
Number of Classes: Number of categories for clustering
Previous Experience: Prior knowledge and expertise with specific algorithms
Trial and Error: Experimenting with different algorithms and settings