Machine
Learning
Group members
Danish Abhyuday Nitish Arif
What is machine
learning?
• Machine learning is a subfield of artificial intelligence that focuses on developing algorithms and models that enable computers to learn and make predictions or decisions without being explicitly programmed.
How does machine
learning work?
• Machine learning works on the data or information provided to it and makes predictions and analyses according to the user's input.
• It also adjusts its behaviour in accordance with its own experience.
Benefits of machine
learning
• Machine learning offers numerous benefits, including
• Automation: Machine learning can automate tasks that
would be time-consuming or impractical for humans to
perform, leading to increased efficiency.
• Data-Driven Insights: It helps extract valuable insights
and patterns from large datasets, enabling data-driven
decision-making.
• Predictive Capabilities: Machine learning models can make predictions, which is valuable in various domains, such as forecasting sales, stock prices, or equipment failures.
• Changes with experience: machine learning models improve with experience and interaction.
Types of data
• Labeled or training data: the type of data which contains both input and output. In simple words, it is well-labeled data.
• Unlabeled data: this type of data contains only input, without any specified output.
Types of learning / methods of learning
• Supervised learning
• Unsupervised learning
Supervised learning
• Supervised learning is a type of machine learning where the algorithm is trained on a labeled dataset. In supervised learning, the algorithm learns to make predictions or decisions based on input data (features) by mapping them to corresponding output labels.
• The main steps in this type of learning are:
• 1. Training data
• 2. Model training
• 3. Predictions
• e.g. whether a mail is spam or not.
• e.g. the future price of a house based on its features like location, BHK, area, etc.
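The three steps above can be sketched in Python using the slide's spam example. Everything in this snippet is invented for illustration: the single feature (a count of suspicious words per mail), the toy labels, and the midpoint-threshold "model" itself, which stands in for a real learning algorithm.

```python
# A minimal sketch of the three supervised-learning steps
# (feature, data, and model are all invented for illustration).

# 1. Training data: (suspicious-word count, label) pairs
train = [(0, "ham"), (1, "ham"), (2, "ham"),
         (5, "spam"), (7, "spam"), (9, "spam")]

# 2. Model training: learn a threshold midway between the class means
ham = [x for x, y in train if y == "ham"]
spam = [x for x, y in train if y == "spam"]
threshold = (sum(ham) / len(ham) + sum(spam) / len(spam)) / 2

# 3. Prediction on unseen mails
def predict(count):
    return "spam" if count > threshold else "ham"

print(predict(6), predict(1))
```

A real classifier would use many features and a proper learning rule, but the train-then-predict shape is the same.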
Unsupervised
learning
• Unsupervised learning is a category
of machine learning where the
algorithm learns from unlabeled
data, extracting patterns,
structures, or relationships without
specific guidance.
• e.g. segmenting customers based on their preferences.
Classification and Regression in ML
• Classification is the process of finding or discovering a model or function which helps in separating the data into multiple categorical classes, i.e. discrete values. It involves categorizing data points into distinct classes based on their features. e.g. whether a mail is spam or not.
• Regression is the task of predicting a continuous outcome variable. Instead of predicting a category, regression predicts a quantity. e.g. the price of stocks/bitcoin tomorrow.
Models in ML
Classification:
1. K-Nearest Neighbour
2. Support vector machine
3. Naive Bayes classification
4. Logistic regression
Regression:
1. Linear regression
2. Neural networks
Unsupervised:
1. K-means clustering
K - Nearest Neighbour
For Classification in Machine Learning
What is KNN?
• K-Nearest Neighbour is one of the simplest machine learning algorithms, based on the supervised learning technique.
• The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can be easily classified into a well-suited category using the K-NN algorithm.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead it stores the dataset and, at the time of classification, performs an action on the dataset.
How does KNN work?
1. Select the number K of neighbours that will be used for making the prediction.
2. Calculate the Euclidean distance from the new point to the data points.
3. Take the K nearest neighbours as per the calculated Euclidean distance.
4. Among these K neighbours, count the number of data points in each category.
5. The category which gets the majority vote is assigned to the new data point/query.
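The five steps can be sketched in pure Python. The 2-D points and the "red"/"green" labels below are invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.
    `train` is a list of ((x, y), label) pairs."""
    # Steps 2-3: sort by Euclidean distance to the query, keep the k closest
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    # Steps 4-5: count labels among the k neighbours; the majority wins
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Invented toy data: two well-separated clusters
train = [((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
         ((6, 6), "green"), ((6, 7), "green"), ((7, 6), "green")]
print(knn_predict(train, (2, 2), k=3))   # query sits in the red cluster
```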
Graphical representation of KNN algorithm.
Thoughts on picking the value of K.
• There is no physical or biological way to determine the best value for 'k', so we have to try out a few values before settling on one.
• Low values of 'k' can be noisy and subject to the effect of outliers.
• Large values of 'k' smooth things over, but we don't want 'k' so large that a category with only a few samples in it will always be outvoted by the other categories.
How outliers affect KNN.
[Scatter plot: feature 1 vs. feature 2, with an outlier near the new data point.]
Output: the new data point will be assigned the orange category due to the majority vote.
How a large value of k affects KNN.
[Scatter plot: feature 1 vs. feature 2, with k = 10, which is greater than the number of green-category dataset elements.]
Output: the new data point will be assigned the red category.
Support Vector Machine
(SVM)
For Classification/Regression in Machine Learning
What is SVM?
• One of the most popular supervised learning algorithms.
• Primarily, it is used for classification problems in machine learning.
• The goal of SVM is to create the best line or decision boundary that can segregate n-dimensional space into classes.
• This best decision boundary is called a hyperplane.
• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors.
SVM Terminologies.
Which hyperplane to select?
We always create the hyperplane that has the maximum margin, i.e. the maximum distance between the hyperplane and the nearest data points of each class.
Maximum Margin Hyperplane
Types of SVM.
LInear SVM Non-LInear
SVM
Kernel Function.
A kernel transforms a low-dimensional space into a higher-dimensional space, making the data separable, e.g. y = x².
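A minimal sketch of this idea on invented 1-D points: the two classes interleave on the line, so no single cut separates them there, but after the y = x² mapping one threshold does.

```python
# Invented 1-D data: one class sits inside the other on the number line.
inner = [-1, 0, 1]            # class A, surrounded by class B
outer = [-3, -2.5, 2.5, 3]    # class B

# Lift every point with the mapping y = x**2
lifted_inner = [x ** 2 for x in inner]   # [1, 0, 1]
lifted_outer = [x ** 2 for x in outer]   # [9, 6.25, 6.25, 9]

# In the lifted space a single cut (the "hyperplane") separates the classes;
# the threshold value 4 is chosen by eye for this toy data.
threshold = 4
print(all(y < threshold for y in lifted_inner))
print(all(y > threshold for y in lifted_outer))
```

A real SVM never computes the lifted coordinates explicitly; the kernel trick evaluates dot products in the lifted space directly, but the geometric effect is the one shown here.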
Application of SVM.
• Face detection, image
classification, text categorization,
etc.
• Bioinformatics: SVMs are used
for protein structure prediction,
gene classification, and
identifying biomarkers in
genomics.
Difference b/w KNN and SVM.
KNN:
1. It classifies a new data point based on how its neighbours are classified.
2. It doesn't have a training phase and makes predictions at runtime.
3. It is computationally expensive, especially with large datasets.
SVM:
1. It finds a hyperplane in an N-dimensional space.
2. It learns a decision boundary during the training phase (data is typically split, e.g. in an 80:20 ratio, for training and testing).
3. It is generally faster in the prediction phase once the hyperplane is learned during training.
Naive Bayes classifier
For Classification in Machine Learning
What is the Naive Bayes classifier?
• The Naïve Bayes classifier is one of the simplest and most effective classification algorithms; it helps in building fast machine learning models that can make quick predictions.
• It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.
• Some popular examples of the Naïve Bayes algorithm are spam filtration, sentiment analysis, and classifying articles.
• Naïve: It is called Naïve because it assumes that the occurrence of a
certain feature is independent of the occurrence of other features.
• Bayes: It is called Bayes because it depends on the principle of Bayes'
Theorem.
What is Bayes theorem.
which is used to determine the probability of a hypothesis with prior
knowledge. It depends on the conditional probability.
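The formula on the original slide was an image; Bayes' theorem in its standard form is:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

where P(A|B) is the posterior probability of hypothesis A given evidence B, P(B|A) is the likelihood, P(A) is the prior probability of the hypothesis, and P(B) is the probability of the evidence.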
Applications of the Naive Bayes classifier.
• Spam filtration, sentiment analysis, and classifying articles.
Regressions in ML
WHAT IS LINEAR REGRESSION?
Linear regression is a machine learning algorithm based on supervised learning.
It is a statistical method that is used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables such as cost, age, sales, temperature, product price, etc.
LINEAR REGRESSION REAL
WORLD APPLICATION
TYPES OF LINEAR REGRESSION
There are two versions of linear regression, depending on the number of characteristics used as input:
• Linear regression with a single variable
• Linear regression with multiple variables
SIMPLE LINEAR REGRESSION
Simple linear regression is a type of regression algorithm that models the relationship between a dependent variable and a single independent variable. The relationship shown by a simple linear regression model is linear, a sloped straight line, hence it is called simple linear regression.
EXAMPLE OF SIMPLE LINEAR REGRESSION
Term insurance: age vs. premium

AGE   PREMIUM
25    18000
30    32000
35    42000
40    47000
45    55000

The survey was conducted on 4,566 people across 15 metropolitan and tier-one cities in India.
[Scatter plot: age (25-45) on the x-axis vs. premium (20000-60000) on the y-axis.]
WHAT IS THE BEST FIT LINE?
When working with linear regression, our main goal is to find the best fit line, which means the error between the predicted values and the actual values should be minimized.
[Scatter plot: age vs. premium with the fitted line.]
DEPENDENT AND INDEPENDENT VARIABLE
Linear equation: y = mx + c
where:
y - dependent variable (here, premium)
x - independent variable (here, age)
m - slope/gradient/coefficient
c - intercept
So the model is: premium = m*age + c
[Scatter plot: age (x) vs. premium (y).]
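Using the age and premium figures from the slide's table, the slope m and intercept c can be computed directly with the least-squares formulas (a pure-Python sketch):

```python
# Least-squares fit of premium = m*age + c on the slide's sample data.
ages     = [25,    30,    35,    40,    45]
premiums = [18000, 32000, 42000, 47000, 55000]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(premiums) / n

# Slope: covariance of (age, premium) over variance of age
num = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, premiums))
den = sum((x - mean_x) ** 2 for x in ages)
m = num / den
c = mean_y - m * mean_x

print(round(m), round(c))        # slope and intercept of the best fit line
print(round(m * 50 + c))         # extrapolated premium at age 50
```

For this data the fit comes out to m = 1780 and c = -23500, so the model predicts a premium of 65500 at age 50.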
MULTIPLE LINEAR REGRESSION
Multiple linear regression is an extension of simple linear regression, as it takes more than one predictor variable to predict the response variable.
We can define it as: "Multiple linear regression is one of the important regression algorithms which models the linear relationship between a single dependent continuous variable and more than one independent variable."
EXAMPLE OF MULTIPLE LINEAR REGRESSION
Term insurance: age, height, and weight vs. premium

AGE   HEIGHT   WEIGHT   PREMIUM
25    162.56   70       18000
30    172.72   95       32000
35    167.64   110      42000
40    -        78       47000
45    157.48   85       55000

The survey was conducted on 4,566 people across 15 metropolitan and tier-one cities in India.
[Scatter plot: age vs. premium.]
LOGISTIC REGRESSION
Logistic regression is a machine learning algorithm based on supervised learning.
It is a statistical method that is used for predicting the probability of a target variable. Logistic regression produces probabilities for classification problems that are discrete in nature.
Example: English or Hindi, True or False, 1 or 0, Right or Wrong.
LOGISTIC REGRESSION
Logistic regression predicts the output of a categorical dependent variable. Therefore the outcome must be a categorical or discrete value.
It can be Yes or No, 0 or 1, True or False, etc., but instead of giving the exact value 0 or 1, it gives probabilistic values which lie between 0 and 1.
In logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic function, which predicts two maximum values (0 or 1).
Logistic regression can be used to classify observations using different types of data and can easily determine the most effective variables for the classification.
[Plot: the S-shaped logistic function.]
LOGISTIC FUNCTION (SIGMOID FUNCTION):
The sigmoid function is a mathematical function used to map the predicted values to probabilities.
It maps any real value into another value within a range of 0 and 1. The value of the logistic regression must be between 0 and 1, which cannot go beyond this limit, so it forms a curve like the "S" form. The S-form curve is called the sigmoid function or the logistic function.
In logistic regression, we use the concept of a threshold value, which defines the probability of either 0 or 1: values above the threshold tend to 1, and values below the threshold tend to 0.
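A minimal sketch of the sigmoid and the threshold rule (the 0.5 threshold below is the common default, an assumption here, not something fixed by the slide):

```python
import math

def sigmoid(z):
    """Map any real value into the open interval (0, 1)."""
    return 1 / (1 + math.exp(-z))

def classify(z, threshold=0.5):
    """Values whose sigmoid is above the threshold go to class 1, else 0."""
    return 1 if sigmoid(z) >= threshold else 0

print(round(sigmoid(0), 2))      # the midpoint of the S-curve
print(classify(3), classify(-3))
```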
LOGISTIC REGRESSION EQUATION:
We know the equation of the straight line can be written as y = b0 + b1x1 + ... + bnxn.
In logistic regression y can be between 0 and 1 only, so for this let's divide the above equation by (1 - y).
But we need a range between -infinity and +infinity, so taking the logarithm of the equation, it becomes the log-odds (logit) form.
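The equations on the original slide were images; the standard derivation they describe can be reconstructed as:

```latex
\text{straight line:}\quad y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n

\text{dividing by } (1-y):\quad \frac{y}{1-y}, \quad
  0 \text{ for } y = 0,\ \infty \text{ for } y = 1

\text{taking the logarithm:}\quad
  \log\!\left(\frac{y}{1-y}\right) = b_0 + b_1 x_1 + \dots + b_n x_n
```

The left-hand side of the last equation is the log-odds (logit), which ranges over the whole real line, matching the range of the linear right-hand side.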
LOGISTIC REGRESSION
REAL LIFE EXAMPLE
TYPES OF LOGISTIC REGRESSION
On the basis of the categories, logistic regression can be classified into three types:
• Binomial: there can be only two possible types of the dependent variable.
• Multinomial: there can be 3 or more possible unordered types of the dependent variable.
• Ordinal: there can be 3 or more possible ordered types of the dependent variable.
EXAMPLE OF LOGISTIC REGRESSION
Term insurance: age vs. bought insurance

AGE   BOUGHT INSURANCE
21    NO
48    YES
32    YES
41    YES
20    NO
35    YES
20    NO
23    NO

The survey was conducted on 4,566 people across 15 metropolitan and tier-one cities in India.
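A logistic model can be fitted to the age vs. bought-insurance sample above with plain gradient descent. This is only a sketch: the feature scaling, learning rate, and iteration count are arbitrary choices, not part of the slide.

```python
import math

# The slide's sample: age and whether the person bought insurance (1 = yes).
ages   = [21, 48, 32, 41, 20, 35, 20, 23]
bought = [0,  1,  1,  1,  0,  1,  0,  0]
xs = [(a - 30) / 10 for a in ages]   # crude feature scaling (assumption)

# Fit p = sigmoid(b0 + b1*x) by gradient descent on the log-loss.
b0 = b1 = 0.0
lr = 0.5
for _ in range(2000):
    g0 = g1 = 0.0
    for x, y in zip(xs, bought):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        g0 += p - y            # gradient w.r.t. the intercept
        g1 += (p - y) * x      # gradient w.r.t. the slope
    b0 -= lr * g0 / len(xs)
    b1 -= lr * g1 / len(xs)

def prob(age):
    return 1 / (1 + math.exp(-(b0 + b1 * (age - 30) / 10)))

print(prob(45) > 0.5, prob(21) < 0.5)   # older -> likely yes, younger -> no
```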
Unsupervised machine
learning
Abhyuday
Unsupervised learning
contents
• Definition
• Types
Unsupervised learning
Definition & more
o Unsupervised learning is a type of machine learning that looks for previously undetected
patterns in a data set with no pre-existing labels and with a minimum of human supervision.
o In contrast to supervised learning, which usually makes use of human-labeled data, unsupervised learning, also known as self-organization, allows for modeling of probability densities over inputs.
o It forms one of the three main categories of machine learning, along with supervised and reinforcement learning. Semi-supervised learning, a related variant, makes use of supervised and unsupervised techniques.
o Two of the main methods used in unsupervised learning are principal component analysis and cluster analysis. Cluster analysis is used in unsupervised learning to group, or segment, datasets with shared attributes in order to extrapolate algorithmic relationships.
o Cluster analysis is a branch of machine learning that groups the data that has not been
labelled, classified or categorized.
Unsupervised learning
Definition
o Instead of responding to feedback, cluster analysis identifies commonalities in the data and
reacts based on the presence or absence of such commonalities in each new piece of
data. This approach helps detect anomalous data points that do not fit into either group.
o A central application of unsupervised learning is in the field of density estimation in statistics,
though unsupervised learning encompasses many other domains involving summarizing
and explaining data features.
Unsupervised learning
Types
• Unsupervised learning is mainly of three types
• 1) Clustering
• 2) Association
• 3) Dimensionality reduction
Unsupervised learning
CLUSTERING
• Clustering or cluster analysis is a machine learning technique which groups an unlabeled dataset. It can be defined as "a way of grouping the data points into different clusters, consisting of similar data points. The objects with the possible similarities remain in a group that has less or no similarities with another group."
• It does this by finding similar patterns in the unlabeled dataset, such as shape, size, color, behavior, etc., and divides the data as per the presence or absence of those patterns. The unlabeled data also means less work for the user.
• It is an unsupervised learning method so no supervision is required while it is processing the
data
• After applying this clustering technique, each cluster or group is provided with a cluster-ID.
ML system can use this id to simplify the processing of large and complex datasets.
Unsupervised learning
CLUSTERING
• Types of clustering
• 1) Hierarchical clustering: the type of clustering in which each data point is treated as a single cluster at the start, then merged with another data point to form another cluster; this cluster formation continues until no more clusters can be formed or a stopping criterion is reached.
• The clusters formed are represented in a dendrogram, from which the word hierarchical comes.
• * Agglomerative clustering (merging)
• * Divisive clustering (splitting)
• 2) Partitional clustering: partitional (or partitioning) clustering methods are used to classify observations within a data set into multiple groups based on their similarity or dissimilarity.
• * K-means clustering
Unsupervised learning
Hierarchical
• > AGGLOMERATIVE CLUSTERING <
• The word agglomerate means "to collect or form into a mass or group".
o We begin by considering every data point as an individual cluster, then start merging each cluster with its closest cluster in every step; merging is done in pairs over continuing iterations.
o Algorithm
• 1) Consider every data point as an individual cluster.
• 2) Calculate the similarity of each cluster with all the other clusters (calculate the proximity matrix).
• 3) Merge the clusters which are highly similar or close to each other.
• 4) Recalculate the proximity matrix for each cluster.
• 5) Repeat steps 3 and 4 until only a single cluster remains.
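The agglomerative procedure can be sketched in Python on invented 2-D points. This version uses single linkage (closest pair of points between clusters) as the merge criterion and stops at k clusters instead of one, both of which are choices made here for illustration:

```python
import math

def agglomerate(points, k):
    """Single-linkage agglomerative clustering, stopping at k clusters."""
    clusters = [[p] for p in points]   # every point starts as its own cluster
    while len(clusters) > k:
        # Find the pair of clusters with the smallest inter-point distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # merge the closest pair
    return clusters

# Invented toy data: two tight groups of three points each
pts = [(0, 0), (0, 1), (1, 0), (8, 8), (8, 9), (9, 8)]
result = agglomerate(pts, 2)
print(sorted(len(c) for c in result))
```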
Unsupervised learning
Hierarchical
• > AGGLOMERATIVE CLUSTERING <
• Here, in each step or iteration, the clusters closest to each other get merged. The condition for agglomeration here is Euclidean distance.
Unsupervised learning
Hierarchical
• > DIVISIVE CLUSTERING <
• This type of clustering is exactly the opposite of agglomerative clustering.
o In this we consider every data point in the set as one single cluster.
o Then we start splitting the clusters one data point at a time, on the basis of which data point is the most dissimilar.
Unsupervised learning
partitional
• This clustering method classifies the data points into multiple groups based on the characteristics, similarity, dissimilarity, or relationships of the data points.
• It is up to the data analyst to specify the number of clusters to be generated by the clustering method.
• * K-Means (a centroid-based technique): the K-means algorithm takes the input parameter K from the user and partitions the dataset containing N objects into K clusters, so that the similarity among data objects inside a group (intracluster) is high, while the similarity with data objects outside the cluster (intercluster) is low.
Unsupervised learning
partitional
o The working of the K-Means algorithm is explained in the steps below:
o Step 1: Select the number K to decide the number of clusters.
o Step 2: Select K random points or centroids. (They can be from the input dataset.)
o Step 3: Assign each data point to its closest centroid, which will form the predefined K clusters.
o Step 4: Calculate the variance and place a new centroid in each cluster.
o Step 5: Repeat step 3, which means reassign each data point to the new closest centroid of its cluster.
o Step 6: If any reassignment occurs, then go to step 4, else go to FINISH.
o Step 7: The model is ready.
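The steps above can be sketched on invented 1-D data. For simplicity this version runs a fixed number of iterations rather than testing for reassignments, and the random seed is fixed so the run is reproducible:

```python
import random

def k_means(points, k, iters=20):
    """Plain K-means on 1-D points; returns the sorted final centroids."""
    random.seed(0)
    centroids = random.sample(points, k)       # step 2: random initial centroids
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:                        # step 3: assign to nearest centroid
            i = min(range(k), key=lambda c: abs(p - centroids[c]))
            groups[i].append(p)
        # step 4: move each centroid to the mean of its group
        centroids = [sum(g) / len(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return sorted(centroids)

# Invented toy data: two obvious groups around 1.0 and 9.0
data = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
centers = k_means(data, 2)
print(centers)
```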
Unsupervised learning
Association
• Association rule learning is a type of unsupervised learning technique that checks for the dependency of one data item on another data item and maps them accordingly so that it can be more profitable.
o Association rule learning is one of the very important concepts of machine learning, and it is employed in Market Basket analysis.
o Association rule learning works on the concept of If and Then statements, such as if A then B. Here the If element is called the antecedent, and the Then element is called the consequent. These types of relationships, where we can find some association or relation between two items, are known as single cardinality.
o However, as the number of items increases, cardinality also increases. So, to measure the associations between thousands of data items, there are three metrics. These metrics are given below:
❖ Support
❖ Confidence
❖ Lift
Unsupervised learning
Association
1) Support: support is the fraction of the number of transactions containing a data item X over the total number of transactions.
2) Confidence: for two items X and Y, confidence is the fraction of the number of transactions containing both X and Y over the number of transactions containing X.
3) Lift: the ratio of the observed support to the expected support if X and Y were independent of each other. It has three possible ranges:
o Lift = 1: the occurrence of the antecedent and consequent is independent of each other.
o Lift > 1: it determines the degree to which the two itemsets are dependent on each other.
o Lift < 1: it tells us that one item is a substitute for the other, meaning one item has a negative effect on the other.
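The three metrics can be computed on a small invented set of transactions, here for the rule bread → milk:

```python
# Invented toy transactions for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
]
n = len(transactions)

def support(*items):
    """Fraction of transactions containing all the given items."""
    return sum(1 for t in transactions if set(items) <= t) / n

# Rule: bread -> milk
sup_bread = support("bread")            # 4/5
sup_milk = support("milk")              # 4/5
sup_both = support("bread", "milk")     # 3/5
confidence = sup_both / sup_bread
lift = sup_both / (sup_bread * sup_milk)

print(confidence, round(lift, 4))
```

Here the lift comes out below 1, which by the rule above suggests bread and milk appear together slightly less often than independence would predict in this toy data.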
Unsupervised learning
Association
• Types of Association Rule Learning
• Association rule learning can be divided into two algorithms:
• 1) Apriori Algorithm
• This algorithm uses frequent itemsets to generate association rules. It is designed to work on databases that contain transactions. It uses a breadth-first search and a hash tree to calculate the itemsets efficiently.
• It is mainly used for market basket analysis and helps to understand the products that can be bought together. It can also be used in the healthcare field to find drug reactions in patients.
• 2) F-P Growth Algorithm
• The F-P growth algorithm stands for Frequent Pattern growth, and it is an improved version of the Apriori algorithm. It represents the database in the form of a tree structure known as a frequent pattern tree. The purpose of this tree is to extract the most frequent patterns.
Thank you