Unit 1

Machine Learning

■ Machine learning is a core sub-area of Artificial Intelligence (AI).
■ ML applications learn from experience (or, to be accurate, data) like humans do, without direct programming.
■ In other words, machine learning involves computers finding insightful information without being told where to look.
■ Instead, they do this by leveraging algorithms that learn from data in an iterative process.
Machine Learning
■ Machine Learning is about extracting knowledge from data.
■ It is a research field at the intersection of statistics, artificial intelligence, and computer science.
■ It is also known as predictive analytics or statistical learning.
Why Machine Learning?

Fig.: The basic ML pipeline. Training: training images + training labels → image features → learned model. Testing: test image → image features → learned model → prediction. (Slide credit: D. Hoiem and L. Lazebnik)
Types of Training
Supervised Machine Learning
■ It uses a series of labelled examples with direct feedback.
■ The user provides the algorithm with pairs of inputs and desired outputs, and the algorithm finds a way to produce the desired output given an input.
■ The algorithm is able to create an output for an input it has never seen before, without any help from a human.
An example: dataset (loan application)

Iris dataset

■ The Iris dataset contains the measurements of 150 iris flowers from 3 different species: Iris-Setosa, Iris-Versicolor, and Iris-Virginica.
■ Features in the Iris dataset:
– sepal length in cm
– sepal width in cm
– petal length in cm
– petal width in cm
■ Target classes to predict:
– Iris-Setosa
– Iris-Versicolor
– Iris-Virginica
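
A minimal sketch (not part of the original slides), assuming scikit-learn is installed, of loading this dataset from scikit-learn's bundled copy:

# Load the bundled Iris dataset and inspect the features and target classes.
from sklearn.datasets import load_iris

iris = load_iris()
print(iris.feature_names)   # sepal/petal length and width, in cm
print(iris.target_names)    # ['setosa' 'versicolor' 'virginica']
print(iris.data.shape)      # (150, 4): 150 flowers, 4 features each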
Supervised Machine Learning
Algorithms
■ Support-vector machines
■ Linear regression
■ Logistic regression
■ Naive Bayes
■ Linear discriminant analysis
■ Decision trees
■ K-nearest neighbor algorithm
■ Neural networks (Multilayer perceptron)
■ Similarity learning
Regression
● It finds the correlation between dependent and independent variables.
● It maps the input variable x to a continuous output variable y.
● Applications:
○ house price patterns
○ market trends
○ weather patterns
○ oil and gas prices, etc.
Classification
● It divides the dataset into classes based on various parameters.
● When using a classification algorithm, a computer program is trained on the training dataset and categorizes the data into various categories depending on what it learned.
● In other words, it predicts the probability of an event occurring by fitting the data to a logit function.
● Examples:
○ email and spam classification
○ predicting the willingness of bank customers to pay their loans
○ identifying cancer tumor cells
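
To make the logit idea concrete, here is a hedged classification sketch with logistic regression; the feature values and labels below are made up purely for illustration, loosely echoing the loan-willingness example:

import numpy as np
from sklearn.linear_model import LogisticRegression

# One illustrative feature (a hypothetical credit score); label 1 = repays the loan.
X = np.array([[300], [420], [500], [580], [650], [720], [780], [810]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)   # fits the data to a logit function
print(clf.predict([[600]]))            # predicted class for a new applicant
print(clf.predict_proba([[600]]))      # event-occurrence probability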
Applications of Supervised Learning

■ Determining whether a tumor is benign based on medical images
■ Detecting fraudulent activity in credit card transactions
■ Spam detection
■ Stock price prediction
■ Signature recognition
Unsupervised Machine Learning

■ In unsupervised learning, only the input data is known, and no known output data is given to the algorithm.
■ Its algorithms identify patterns in data sets containing data points that are neither classified nor labeled.
Unsupervised Learning Algorithms
• K-means clustering
• Hierarchical clustering
• Anomaly detection
• Neural networks
• Principal Component Analysis
• Independent Component Analysis
• Apriori algorithm
• EM algorithm
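
As a small illustration of the first algorithm on the list, a k-means sketch on synthetic, unlabelled points (all values are made up):

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two blobs of unlabelled points; the algorithm receives no labels at all.
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)   # the two group centres it discovered
print(kmeans.labels_[:5])        # cluster assignment of the first 5 points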
Applications of Unsupervised Learning

■ Identifying topics in a set of blog posts
■ Segmenting customers into groups with similar preferences
■ Detecting abnormal access patterns to a website

Supervised vs. Unsupervised Learning

Parameter          | Supervised Learning | Unsupervised Learning
Dataset            | Labelled            | Unlabelled
Method of learning | Guided learning     | The algorithm learns by itself using the dataset
Complexity         | Simpler method      | Computationally complex
Accuracy           | More accurate       | Less accurate

Reinforcement Learning
Reinforcement learning is a machine learning training method based on rewarding desired behaviors and/or punishing undesired ones.

In general, a reinforcement learning agent is able to perceive and interpret its environment, take actions, and learn through trial and error.
Different Varieties of Machine Learning
■ Concept Learning
■ Clustering Algorithms
■ Connectionist Algorithms
■ Genetic Algorithms
■ Explanation-based Learning
■ Transformation-based Learning
■ Reinforcement Learning
■ Case-based Learning
■ Macro Learning
■ Evaluation Functions
■ Cognitive Learning Architectures
■ Constructive Induction
■ Discovery Systems
■ Knowledge capture
■ Transfer learning
Types of Machine Learning Systems
■ There are so many different types of Machine Learning systems that it is useful to classify them in broad categories based on:
– Whether or not they are trained with human supervision (supervised, unsupervised, semi-supervised, and reinforcement learning)
– Whether or not they can learn incrementally on the fly (online versus batch learning)
– Whether they work by
■ simply comparing new data points to known data points, or
■ detecting patterns in the training data and building a predictive model, much like scientists do (instance-based versus model-based learning)
Batch and Online Learning

■ Batch learning
– In batch learning, the system is incapable of learning incrementally: it must be
trained using all the available data. This will generally take a lot of time and
computing resources, so it is typically done offline.
– First the system is trained, and then it is launched into production and runs
without learning anymore; it just applies what it has learned. This is called offline
learning.
– If you want a batch learning system to know about new data (such as a new type of
spam), you need to train a new version of the system from scratch on the full
dataset (not just the new data, but also the old data), then stop the old system and
replace it with the new one.
– Training on the full set of data requires a lot of computing resources (CPU,
memory space, disk space, disk I/O, network I/O, etc.). If you have a lot of data
and you automate your system to train from scratch every day, it will end up
costing you a lot of money. If the amount of data is huge, it may even be impossible
to use a batch learning algorithm.
Batch Learning
The ML algorithms that typically come under this are:
■ Linear Regression
■ Decision Trees
■ Support Vector Machines, etc.
Batch and Online Learning
■ Online learning
■ In online learning, you train the system incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches. Each learning step is fast and cheap, so the system can learn about new data on the fly, as it arrives.

Fig.: Online learning


■ Online learning is great for systems that receive data as a continuous flow (e.g., stock
prices) and need to adapt to change rapidly or autonomously.
■ It is also a good option if you have limited computing resources: once an online learning
system has learned about new data instances, it does not need them anymore, so you can
discard them (unless you want to be able to roll back to a previous state and “replay” the
data). This can save a huge amount of space.
■ Online learning algorithms can also be used to train systems on huge datasets that
cannot fit in one machine’s main memory.
■ The algorithm loads part of the data, runs a training step on that data, and repeats the
process until it has run on all of the data (see Figure 1-14).

■ Note: This whole process is usually done offline (i.e., not on the live system), so online
learning can be a confusing name. Think of it as incremental learning.
■ One important parameter of online learning systems is how fast they should adapt to
changing data: this is called the learning rate.
■ If you set a high learning rate, then your system will rapidly adapt to new data, but it
will also tend to quickly forget the old data (you don’t want a spam filter to flag only
the latest kinds of spam it was shown).
■ Conversely, if you set a low learning rate, the system will have more inertia; that is, it
will learn more slowly, but it will also be less sensitive to noise in the new data or to
sequences of nonrepresentative data points.
■ A big challenge with online learning is that if bad data is fed to the system, the
system’s performance will gradually decline
■ For example, bad data could come from a malfunctioning sensor on a robot, or
from someone spamming a search engine to try to rank high in search results. To
reduce this risk, you need to monitor your system closely and promptly switch
learning off
Online Learning
The ML algorithms that typically come under this are:
■ Stochastic Gradient Descent (SGD)
■ Online Passive-Aggressive (PA) algorithms
■ Perceptron, etc.
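
A sketch of incremental training with scikit-learn's SGDClassifier; the mini-batch "stream" is simulated with synthetic data, and the loss name "log_loss" assumes a recent scikit-learn version:

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
# A constant learning rate eta0 controls how fast the model adapts to new data.
clf = SGDClassifier(loss="log_loss", learning_rate="constant", eta0=0.01)

classes = np.array([0, 1])
for _ in range(100):                        # pretend each iteration is newly arrived data
    X_batch = rng.normal(size=(32, 5))      # a mini-batch of 32 instances
    y_batch = (X_batch[:, 0] > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)  # one fast, cheap learning step
    # the batch can now be discarded, saving space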
Instance-Based Versus Model-Based Learning
Instance-based learning is a type of machine learning where the model learns from examples and stores them in memory. The model can then make predictions by comparing new input data to the stored examples. One example of an instance-based learning algorithm is k-Nearest Neighbors (k-NN), which works by identifying the k closest examples in the stored data to the input data and making a prediction based on the majority of those closest examples. For example, in a medical diagnosis application, k-NN could be used to predict a patient's condition based on the stored cases most similar to it.
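
A minimal k-NN sketch on the Iris data (illustrative only; note how "training" mostly just stores the examples):

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)  # fit() essentially memorizes X and y
# Prediction compares the new flower to the stored examples: majority vote of 5 neighbours.
print(knn.predict([[5.1, 3.5, 1.4, 0.2]]))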
Model-based Learning
Model-based learning is a type of machine learning where the model learns the underlying relationships and patterns in the data by creating a mathematical representation, or model. This model can then be used to make predictions on new input data. One example of a model-based learning algorithm is linear regression, which is used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the data. For example, in stock market prediction, linear regression could be used to model the relationship between past stock prices and various economic indicators to predict future stock prices.
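
A hedged linear-regression sketch on made-up numbers, showing that only the learned coefficients (the model) are needed at prediction time:

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # an independent variable
y = np.array([2.1, 3.9, 6.2, 7.8])           # a dependent variable, roughly y = 2x

model = LinearRegression().fit(X, y)         # fits a linear equation to the data
print(model.coef_, model.intercept_)         # the learned mathematical representation
print(model.predict([[5.0]]))                # predicts from the model, not stored examples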
Summary

Model-based Learning / Eager Learning
● Example ML algorithms: Linear Regression, Decision Trees, Support Vector Machines, Neural Networks.

Instance-based Learning / Lazy Learning / Rote Learning
● Example ML algorithms: K-Nearest Neighbors.
● Heavy dependency on the training data.
● They take less time to train because very little learning happens up front; most of the work is deferred to prediction time.
Main Challenges of Machine Learning

● Insufficient Quantity of Training Data


● Nonrepresentative Training Data
● Poor-Quality Data
● Irrelevant Features
● Overfitting the Training Data
● Underfitting the Training Data
● Stepping Back
Insufficient Quantity of Training Data
■ For a toddler to learn what an apple is, all it takes is for you to point to an apple and say "apple" (possibly repeating this procedure a few times). Now the child is able to recognize apples in all sorts of colors and shapes. Genius.
■ Machine Learning is not quite there yet; it takes a lot of data for most Machine Learning algorithms to work properly.
■ Even for very simple problems you typically need thousands of examples, and for complex problems such as image or speech recognition you may need millions of examples (unless you can reuse parts of an existing model).
Nonrepresentative Training Data
■ A machine learning model is said to be ideal if it predicts well for generalized cases and provides accurate decisions. If there is too little training data, there will be sampling noise in the model; this is called a non-representative training set.
● In order to generalize well, it is crucial that your training data be representative of the new cases you want to generalize to. This is true whether you use instance-based learning or model-based learning.
● A representative dataset means that all possible outcomes that may occur are represented in the dataset. If the dataset is not representative, it is biased. Analyses created on a non-representative dataset cannot be used for decision-making.
Poor-Quality Data
•If your training data is full of errors, outliers, and noise (e.g., due to poor
quality measurements), it will make it harder for the system to detect the
underlying patterns, so your system is less likely to perform well.
•It is often well worth the effort to spend time cleaning up your training
data. The truth is, most data scientists spend a significant part of their time
doing just that.
•For example:
– If some instances are clearly outliers, it may help to simply discard
them or try to fix the errors manually.
– If some instances are missing a few features (e.g., 5% of your
customers did not specify their age), you must decide whether you
want to ignore this attribute altogether, ignore these instances, fill in
the missing values (e.g., with the median age), or train one model with
the feature and one model without it, and so on.
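
A short pandas sketch of the two options above, on a made-up table (the column names and values are purely illustrative):

import pandas as pd

df = pd.DataFrame({"age": [25, 31, None, 45, 230, 38],
                   "income": [40, 52, 48, 61, 55, 50]})

# Discard a clear outlier (age 230) while keeping rows where age is merely missing.
df = df[df["age"].isna() | (df["age"] < 120)].copy()
# Fill the missing ages with the median age, one of the options listed above.
df["age"] = df["age"].fillna(df["age"].median())
print(df)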
Irrelevant Features
•As the saying goes: garbage in, garbage out.
•Your system will only be capable of learning if the training data contains enough relevant features and not too many irrelevant ones.
•A critical part of the success of a Machine Learning project is coming up with a good set of features to train on. This process, called feature engineering, involves:
■ Feature selection: selecting the most useful features to train on among existing features.
■ Feature extraction: combining existing features to produce a more useful one (as we saw earlier, dimensionality reduction algorithms can help).
■ Creating new features by gathering new data.
Overfitting the Training Data
A model is said to be overfitted when it does not make accurate predictions on testing data. When a model is trained on so much data, it starts learning from the noise and inaccurate entries in the data set; testing with test data then yields less accurate results.

Underfitting in Machine Learning

A model or machine learning algorithm is said to underfit when it is too simple to capture the complexities of the data. It represents the inability of the model to learn the training data effectively, resulting in poor performance on both the training and testing data.
Overfitting the Training Data
•In Machine Learning this is called overfitting: it means that the model performs well on the training data, but it does not generalize well.
•The figure shows an example of a high-degree polynomial life-satisfaction model that strongly overfits the training data.
Contd..
•The figure shows three models: the dotted line represents the original model that was trained with a few countries missing, the dashed line is our second model trained with all countries, and the solid line is a linear model trained with the same data as the first model but with a regularization constraint.
•The amount of regularization to apply during learning can be controlled by a hyperparameter. A hyperparameter is a parameter of the learning algorithm itself (not of the model); it is set before training and stays constant during it.
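
For instance, in scikit-learn's Ridge regression the hyperparameter alpha sets the amount of regularization; a small sketch on synthetic data (all values illustrative):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X[:, 0] + 0.1 * rng.normal(size=20)       # only the first feature truly matters

weak = Ridge(alpha=0.01).fit(X, y)            # weak constraint: free to fit (and overfit)
strong = Ridge(alpha=100.0).fit(X, y)         # strong constraint: coefficients shrink toward 0
print(weak.coef_)
print(strong.coef_)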
Underfitting the Training Data
•As you might guess, underfitting is the opposite of overfitting:
• It occurs when your model is too simple to learn the underlying structure of
the data.
•For example, a linear model of life satisfaction is prone to underfit;
reality is just more complex than the model, so its predictions are bound to be
inaccurate, even on the training examples.
•The main options to fix this problem are:
•Selecting a more powerful model, with more parameters
•Feeding better features to the learning algorithm (feature engineering)
•Reducing the constraints on the model (e.g., reducing the regularization
hyperparameter)
Stepping Back
■ Machine Learning is about making machines get better at some task by learning
from data, instead of having to explicitly code rules.
■ There are many different types of ML systems: supervised or not, batch or online,
instance-based or model-based, and so on.
■ In a ML project you gather data in a training set, and you feed the training set to a
learning algorithm.
■ If the algorithm is model-based it tunes some parameters to fit the model to the
training set (i.e., to make good predictions on the training set itself), and then
hopefully it will be able to make good predictions on new cases as well.
■ If the algorithm is instance-based, it just learns the examples by heart and
generalizes to new instances by comparing them to the learned instances using a
similarity measure.
■ The system will not perform well if your training set is too small, or if the data is
not representative, noisy, or polluted with irrelevant features (garbage in, garbage
out).
■ Lastly, your model needs to be neither too simple (in which case it will underfit)
nor too complex (in which case it will overfit).
Testing and Validating
•A better option is to split your data into two sets: the training set and the test set.
•You train your model using the training set, and you test it using the test set.
•The error rate on new cases is called the generalization error (or out-of sample
error), and by evaluating your model on the test set, you get an estimate of this
error.
•This value tells you how well your model will perform on instances it has never
seen before.
•If the training error is low (i.e., your model makes few mistakes on the training
set) but the generalization error is high, it means that your model is overfitting the
training data.
•Generalization error is a measure of how accurately an algorithm is able to
predict outcome values for previously unseen data.
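
A minimal sketch of this split and of comparing training versus test performance (an unpruned decision tree is used here only because it overfits easily):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)
print("training accuracy:", model.score(X_train, y_train))  # error on data it has seen
print("test accuracy:", model.score(X_test, y_test))        # estimates generalization error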
Testing and Validating

•Hyperparameter Tuning and Model Selection


•Data Mismatch
Hyperparameter Tuning and Model Selection
What is a hyperparameter?

Ref: A Comprehensive Guide on Hyperparameter Tuning and its Techniques (analyticsvidhya.com)


Hyperparameter Tuning and Model Selection
•Evaluating a model: use a test set.
•Choosing between two models (say, a linear model and a polynomial model): one option is to train both and compare how well they generalize using the test set.
•Now suppose that the linear model generalizes better, but you want to apply some regularization to avoid overfitting. The question is: how do you choose the value of the regularization hyperparameter?
•One option is to train 100 different models using 100 different values for this hyperparameter.
Contd..
■ The problem is that you measured the generalization error multiple times on
the test set, and you adapted the model and hyperparameters to produce the
best model for that particular set.
■ This means that the model is unlikely to perform as well on new data.
■ A common solution to this problem is called holdout validation: you simply hold
out part of the training set to evaluate several candidate models and select the
best one.
■ The new held out set is called the validation set
■ More specifically, you train multiple models with various hyperparameters on the
reduced training set (i.e., the full training set minus the validation set), and you
select the model that performs best on the validation set.
■ After this holdout validation process, you train the best model on the full
training set (including the validation set), and this gives you the final model.
■ Lastly, you evaluate this final model on the test set to get an estimate of the
generalization error.
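
A hedged sketch of this holdout-validation loop (the dataset, the candidate values of the regularization hyperparameter C, and the split sizes are all illustrative choices):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
# Carve out a test set first, then a validation set from what remains.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

best_C, best_score = None, -1.0
for C in [0.01, 0.1, 1.0, 10.0]:             # several candidate hyperparameter values
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    score = model.score(X_val, y_val)        # evaluate on the validation set only
    if score > best_score:
        best_C, best_score = C, score

# Retrain the best model on the full training set (training + validation)...
final = LogisticRegression(C=best_C, max_iter=1000).fit(X_rest, y_rest)
# ...and touch the test set exactly once, to estimate the generalization error.
print("test accuracy:", final.score(X_test, y_test))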
Data Mismatch
•In some cases, it is easy to get a large amount of
data for training, but it is not perfectly representative
of the data that will be used in production.
Dataset Repositories
• UCI Machine Learning Repository. 404 datasets.
• OpenML datasets
• Kaggle datasets
• Academic Torrents
• TU Berlin/ MLdata.org
• AWS Public Datasets
• BigQuery Public Datasets
Knowing Your Task and Knowing Your Data
■ What questions am I trying to answer? Do I think the data collected can answer those questions?
■ What is the best way to phrase my questions as a machine learning problem?
■ Have I collected enough data to represent the problem I want to solve?
■ What features of the data did I extract, and will these enable the right predictions?
■ How will I measure success in my application?

Frameworks (a fast-evolving ecosystem!)

■ Programming languages
– Python
– R
– C++
– ...
■ Many libraries
– scikit-learn (classic machine learning)
– PyTorch, TensorFlow, Keras (deep learning frameworks)
– …