Linear & Logistic Regression Guide

fam chapter 6 ppt

Uploaded by

Kartik Pagariya.

UNIT VI

Classification & Regression

Mark: 10
•Linear Regression in Machine Learning:

▪ Linear regression is one of the easiest and most popular Machine Learning
algorithms.
▪ It is a statistical method that is used for predictive analysis. Linear regression
makes predictions for continuous/real or numeric variables such as sales,
salary, age, product price, etc.
▪ The linear regression algorithm models a linear relationship between a dependent
variable (y) and one or more independent variables (x), hence the name linear
regression. Because the relationship is linear, the model describes how the value of
the dependent variable changes with the value of the independent variable.
• Mathematically, we can represent a linear regression as:
• y = a0 + a1x + ε

Here,
• y = Dependent variable (target variable)
• x = Independent variable (predictor variable)
• a0 = Intercept of the line (gives an additional degree of freedom)
• a1 = Linear regression coefficient (scale factor applied to each input value)
• ε = Random error
• The values of the x and y variables form the training dataset for the linear
regression model.
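As a minimal sketch of how a0 and a1 can be estimated from training data, the example below uses ordinary least squares, which the slides do not name explicitly and is assumed here. The function name and the small noise-free dataset are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (a0, a1) minimising the sum of squared errors for y = a0 + a1*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    a1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    a0 = y_mean - a1 * x_mean
    return a0, a1


# Hypothetical, noise-free data generated from y = 1 + 2x (so the error term is zero).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a0, a1 = fit_simple_linear_regression(xs, ys)
print("intercept a0:", a0, "coefficient a1:", a1)
```

With real data the points would not lie exactly on a line, and the fitted a0 and a1 would only approximate the underlying relationship.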
•Types of Linear Regression:

•Simple Linear Regression:

If a single independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression
algorithm is called Simple Linear Regression.

•Multiple Linear regression:

If more than one independent variable is used to predict the value of
a numerical dependent variable, then such a Linear Regression
algorithm is called Multiple Linear Regression.
•Mean Absolute Error (MAE)
• MAE is a very simple metric that calculates the average of the absolute differences
between actual and predicted values.
•Mean Squared Error (MSE)
• MSE is one of the most widely used metrics and differs only slightly from mean
absolute error: it is the average of the squared differences between actual and
predicted values.
•Root Mean Squared Error (RMSE)
• As the name suggests, RMSE is simply the square root of the mean squared error.
•R Squared (R2)
• The R2 score tells you how well your model performed, rather than the loss in
an absolute sense.
• MAE and MSE depend on the context (the scale and units of the target variable),
whereas the R2 score is independent of context.
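The four metrics can be sketched in a few lines. This is an illustrative implementation, not a library API. It assumes the usual definition of R2 as one minus the ratio of the residual sum of squares to the total sum of squares; the sample values are made up.

```python
import math


def mae(actual, predicted):
    """Mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mse(actual, predicted))


def r2_score(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up actual and predicted values for illustration.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 11]
print(mae(actual, predicted), mse(actual, predicted),
      rmse(actual, predicted), r2_score(actual, predicted))
```

MAE, MSE and RMSE are expressed in the units of the target (or its square), while R2 is a unitless ratio, which is why it can be compared across problems.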
•Overfitting in Machine Learning:
• A statistical model is said to be overfitted when it fits the training data closely
but does not make accurate predictions on testing data.
• When a model is trained too closely on its training data, it starts learning from the
noise and inaccurate data entries in the data set.
• Testing on test data then shows high variance, and the model does not categorize
the data correctly because it has picked up too many details and too much noise.
• Overfitting is more likely with non-parametric and non-linear methods,
because these types of machine learning algorithms have more freedom in
building the model from the dataset and can therefore build
unrealistic models.
• One way to avoid overfitting is to use a linear algorithm if the data is linear,
or to constrain parameters such as the maximal depth if we are using decision trees.
•Reasons for Overfitting:
• High variance and low bias.
• The model is too complex.
• The training dataset is too small.
•Techniques to Reduce Overfitting
• Increase training data.
• Reduce model complexity.
• Early stopping during the training phase (keep an eye on the validation loss over
the training period, and stop training as soon as it begins to increase).
• Ridge Regularization and Lasso Regularization.
• Use dropout for neural networks to tackle overfitting.
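To make ridge regularization concrete, here is a small sketch for the one-feature case. It assumes the ridge penalty is applied to the slope only (not the intercept), in which case the slope has the closed form S_xy / (S_xx + lambda). The function name, data and lambda values are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Slope of y = a0 + a1*x minimising squared error + lam * a1**2."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    # lam = 0 gives the ordinary least-squares slope; larger lam shrinks it.
    return s_xy / (s_xx + lam)


# Made-up, slightly noisy data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]
slopes = [ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)]
print(slopes)
```

The slope moves toward zero as lambda grows, which is how the penalty reduces model complexity. Too large a lambda pushes the model toward underfitting.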
•Underfitting in Machine Learning:
•A statistical model or a machine learning algorithm is said to underfit
when it is too simple to capture the complexities of the data.
• It represents the inability of the model to learn the training data effectively,
resulting in poor performance on both the training and testing data.
• In simple terms, an underfit model's predictions are inaccurate, especially when applied to
new, unseen examples.
• It mainly happens when we use a very simple model with overly simplified
assumptions.
• To address underfitting, we need to use more complex
models, with enhanced feature representation and less regularization.
• Reasons for Underfitting
• The model is too simple, so it may not be capable of representing the complexities in the data.
• The input features used to train the model are not adequate representations of the
underlying factors influencing the target variable.
• The training dataset is not large enough.
• Excessive regularization is used to prevent overfitting, which constrains the model from
capturing the data well.
• Features are not scaled.
• Techniques to Reduce Underfitting
• Increase model complexity.
• Increase the number of features, performing feature engineering.
• Remove noise from the data.
• Increase the number of epochs or increase the duration of training to get better results.
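The sketch below illustrates underfitting and the effect of feature engineering on made-up data where y = x². A straight line in x cannot follow the curve and has a large training error, while the same one-variable least-squares fit on the engineered feature z = x² fits exactly. Least squares and all names are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one input feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return y_mean - slope * x_mean, slope


def training_mse(xs, ys, intercept, slope):
    """Mean squared error of the fitted line on the data it was trained on."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up curved data: y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

# Too-simple model: a straight line in x (underfits).
b0, b1 = fit_line(xs, ys)
mse_line = training_mse(xs, ys, b0, b1)

# Feature engineering: use z = x^2 as the input instead.
zs = [x ** 2 for x in xs]
c0, c1 = fit_line(zs, ys)
mse_squared = training_mse(zs, ys, c0, c1)

print("line in x:", mse_line, "line in x^2:", mse_squared)
```

The high error of the first model on its own training data is the signature of underfitting. An overfit model would instead show low training error and high test error.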
• Multiple Linear Regression:
• Multiple Linear Regression is an extension of Simple Linear regression as it
takes more than one predictor variable to predict the response variable.
• Multiple Linear Regression is one of the important regression algorithms
which models the linear relationship between a single dependent continuous
variable and more than one independent variable.

• Some key points about MLR:

• For MLR, the dependent or target variable (Y) must be continuous/real, but the predictor
or independent variables may be continuous or categorical.
• Each feature variable must have a linear relationship with the dependent variable.
• MLR tries to fit a regression line through a multidimensional space of data points.
•MLR equation:
•In Multiple Linear Regression, the target variable (Y) is a linear
combination of multiple predictor variables x1, x2, x3, ..., xn. Since it is
an extension of Simple Linear Regression, the same form applies, and
the equation becomes:

•Y = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn
•Where,
•Y = Output/response variable
•b0, b1, b2, b3, ..., bn = Coefficients of the model
•x1, x2, x3, ..., xn = Independent/feature variables
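Evaluating the MLR equation for one observation is an intercept plus a weighted sum of the feature values. The coefficients and inputs below are arbitrary illustrative numbers, not values from any fitted model.

```python
def mlr_predict(b0, coefficients, features):
    """Return Y = b0 + b1*x1 + b2*x2 + ... + bn*xn for one observation."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return b0 + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical model with two features: Y = 1 + 2*x1 + 3*x2.
y_hat = mlr_predict(1.0, [2.0, 3.0], [4.0, 5.0])
print(y_hat)
```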
• Implementation of Multiple Linear Regression model:
• Problem Description:
• We have a dataset of 50 start-up companies.
• This dataset contains five main fields: R&D Spend, Administration
Spend, Marketing Spend, State, and Profit for a financial year.
• Our goal is to create a model that can easily determine which company has the
maximum profit, and which factor affects the profit of a company the
most.
• Since we need to find the Profit, it is the dependent variable, and the other
four variables are independent variables. Below are the main steps of
building the MLR model:
• Data pre-processing steps
• Fitting the MLR model to the training set
• Predicting the result of the test set
• Step-1: Data pre-processing:
• Importing libraries
• Importing the dataset
• Extracting dependent and independent variables
• Encoding dummy variables

• Step-2: Fitting our MLR model to the training set

• Step-3: Prediction of test set results
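The original slides showed code for these steps as images that did not survive extraction, so what follows is only a sketch of the three steps, not the author's implementation. It uses pure Python, a ten-row made-up stand-in for the 50-company dataset, a hypothetical noise-free profit rule, and least squares via the normal equations. In practice a library such as scikit-learn would normally be used.

```python
# Step 1: data pre-processing on a made-up stand-in for the start-up dataset.
STATES = ["California", "New York", "Florida"]  # first entry is the baseline
ROWS = [  # (R&D spend, administration spend, marketing spend, state)
    (10, 5, 20, "California"), (20, 7, 15, "New York"),
    (30, 6, 40, "Florida"), (40, 9, 10, "California"),
    (50, 4, 30, "New York"), (60, 8, 25, "Florida"),
    (70, 3, 50, "California"), (80, 10, 35, "New York"),
    (25, 6, 22, "Florida"), (55, 5, 45, "California"),
]


def made_up_profit(rd, admin, marketing, state):
    """Hypothetical noise-free rule, used only to generate example targets."""
    state_effect = {"California": 0.0, "New York": 3.0, "Florida": -2.0}[state]
    return 50.0 + 0.8 * rd + 0.2 * admin + 0.5 * marketing + state_effect


def encode(row):
    """Add an intercept column and dummy variables (baseline state dropped)."""
    rd, admin, marketing, state = row
    dummies = [1.0 if state == s else 0.0 for s in STATES[1:]]
    return [1.0, float(rd), float(admin), float(marketing)] + dummies


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit(X, y):
    """Least-squares coefficients from the normal equations (X^T X) b = X^T y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coefs, X):
    """Apply Y = b0 + b1*x1 + ... + bn*xn to every encoded row."""
    return [sum(c * v for c, v in zip(coefs, row)) for row in X]


X = [encode(r) for r in ROWS]
y = [made_up_profit(*r) for r in ROWS]
X_train, X_test, y_train, y_test = X[:8], X[8:], y[:8], y[8:]

# Step 2: fit the MLR model to the training set.
coefs = fit(X_train, y_train)

# Step 3: predict the test set results and compare with the actual profits.
y_pred = predict(coefs, X_test)
for actual, predicted in zip(y_test, y_pred):
    print("actual:", round(actual, 2), "predicted:", round(predicted, 2))
```

One dummy column is dropped because the three state dummies would otherwise sum to the intercept column and make the normal equations singular. Because the targets here are generated without noise, the predictions match the test values; real data would leave a residual error.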

•Applications of Multiple Linear Regression:
•There are mainly two applications of Multiple Linear Regression:
• Effectiveness of the independent variables on prediction
• Predicting the impact of changes

• Logistic Regression in Machine Learning:
• Logistic regression is a supervised machine learning algorithm mainly used for
classification tasks where the goal is to predict the probability that an instance
belongs to a given class.
• It is a kind of statistical algorithm that analyzes the relationship between a set
of independent variables and a binary dependent variable.
• It is a powerful tool for decision-making, for example deciding whether an email is spam or not.
• Logistic Regression is a significant machine learning algorithm because it has
the ability to provide probabilities and classify new data using continuous and
discrete datasets.
• Logistic Regression can be used to classify observations using different
types of data and can easily determine the most effective variables used for
the classification.
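The slides do not give the formula, but logistic regression is conventionally built on the sigmoid function, which maps a linear score w*x + b to a probability between 0 and 1. The sketch below trains a one-feature model with plain gradient descent on the log-loss. The learning rate, step count and toy data are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, steps=1000):
    """Fit weight w and bias b by gradient descent on the average log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Made-up data: a centred feature value and a binary label (1 = positive class).
xs = [-3, -2, -1, 1, 2, 3]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)

prob_low = sigmoid(w * -2 + b)   # probability of class 1 for x = -2
prob_high = sigmoid(w * 2 + b)   # probability of class 1 for x = 2
print("P(class 1 | x=-2):", prob_low, " P(class 1 | x=2):", prob_high)
```

A class label is obtained by thresholding the probability, commonly at 0.5, although the threshold can be moved to trade precision against recall.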
• Classification:
• In machine learning, classification, as the name suggests, sorts data into
different parts/classes/groups. It is used to predict which class the
input data belongs to.
• For example, if we are taking a dataset of scores of a cricketer in the past few
matches, along with average, strike rate, not outs etc, we can classify him as
“in form” or “out of form”.

• Types of Classification

• Binary classification
• Multi-class classification
•Binary Classification
• It is a classification task in which the given data is classified
into two classes. It is basically a prediction about which of two groups
an item belongs to.
• Let us suppose, two emails are sent to you, one is sent by an insurance
company that keeps sending their ads, and the other is from your bank
regarding your credit card bill. The email service provider will classify the two
emails, the first one will be sent to the spam folder and the second one will be
kept in the primary one.
• Algorithms commonly used for binary classification:
• Logistic Regression
• k-Nearest Neighbors
• Decision Trees
• Support Vector Machine
• Naive Bayes
• Terms related to binary classification:

• PRECISION:
• Precision in binary classification (Yes/No) is the proportion of the model's
positive predictions that are actually positive.
• RECALL:
• Recall is also known as sensitivity. In binary classification (Yes/No),
recall measures how "sensitive" the classifier is at detecting
positive cases.
•F1 SCORE
•The F1 score is the harmonic mean of precision
and recall, with the best value being 1 and the worst being 0.
Precision and recall make an equal contribution to the F1
score.
•Multiclass Classification
• Multi-class classification is the task of classifying elements into more than
two classes. Unlike binary classification, it is not restricted to two classes.
Examples of multi-class classification are
• classification of news into different categories,
• classifying books according to subject,
• classifying students according to their streams, etc.

• In these, there are several classes into which the response variable can be classified,
hence the name multi-class classification.
Parameters | Binary classification | Multi-class classification
No. of classes | It is a classification into two groups, i.e. it classifies objects into at most two classes. | There can be any number of classes in it, i.e. it classifies objects into more than two classes.
Algorithms used | The most popular algorithms used for binary classification are: Logistic Regression, k-Nearest Neighbors, Decision Trees, Support Vector Machine, Naive Bayes. | Popular algorithms that can be used for multi-class classification include: k-Nearest Neighbors, Decision Trees, Naive Bayes, Random Forest, Gradient Boosting.
Examples | Examples of binary classification include: email spam detection (spam or not), churn prediction (churn or not), conversion prediction (buy or not). | Examples of multi-class classification include: face classification, plant species classification, optical character recognition.
•Classification Performance:
• To evaluate the performance of a classification model, different metrics are
used, and some of them are as follows:
• Accuracy
• Confusion Matrix
• Precision
• Recall
• F-Score
• AUC(Area Under the Curve)-ROC
•Accuracy:
• The accuracy metric is one of the simplest classification metrics to implement.
It measures how often the classifier predicts correctly, and is defined as the
ratio of the number of correct predictions to the total number of predictions.

• When a model gives an accuracy of 99%, you might think that the model is
performing very well, but this is not always true and can be misleading in
some situations, for example when one class is far more common than the other.
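One situation where a high accuracy is misleading is class imbalance. The made-up example below has 10 positive cases among 1000 observations and a trivial "model" that always predicts the negative class.

```python
def accuracy(actual, predicted):
    """Correct predictions divided by total predictions."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Made-up imbalanced data: 10 positives (1) and 990 negatives (0).
actual = [1] * 10 + [0] * 990
# A useless classifier that predicts "negative" for every observation.
predicted = [0] * 1000

acc = accuracy(actual, predicted)
positives_found = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
print("accuracy:", acc, "positives detected:", positives_found, "of 10")
```

The accuracy is 99% even though not a single positive case is detected, which is why metrics such as precision and recall are reported alongside it.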
•Confusion Matrix:
•A confusion matrix is a performance measurement for machine
learning classification problems where the output can be two or
more classes. It is a table with combinations of predicted and actual
values.
•A confusion matrix is defined as the table that is often used to
describe the performance of a classification model on a set of
test data for which the true values are known.
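For the binary case, the four cells of the confusion matrix can be counted directly from the actual and predicted labels. The helper name and the sample labels below are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1      # predicted positive, actually positive
        elif p == positive:
            fp += 1      # predicted positive, actually negative
        elif a == positive:
            fn += 1      # predicted negative, actually positive
        else:
            tn += 1      # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Made-up labels for eight observations.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```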
•Precision :
•It explains how many of the cases predicted as positive actually turned
out to be positive. Precision is useful in cases where False
Positives are a higher concern than False Negatives.
•Precision for a label is defined as the number of true positives
divided by the number of predicted positives, i.e. TP / (TP + FP).
•Recall (Sensitivity):
• It explains how many of the actual positive cases we were able to
predict correctly with our model. Recall is a useful metric in cases
where False Negative is of higher concern than False Positive.
•Recall for a label is defined as the number of true positives divided
by the total number of actual positives, i.e. TP / (TP + FN).
•F1 Score:
•It gives a combined idea about the Precision and Recall metrics. For a
given sum of Precision and Recall, it is maximum when Precision is equal to Recall.
•F1 Score is the harmonic mean of precision and recall:
F1 = 2 * (Precision * Recall) / (Precision + Recall).
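Precision, recall and F1 can be sketched from confusion-matrix counts as follows. The counts are made up, and returning 0.0 when a denominator is zero is a convention assumed here, not something the slides specify.

```python
def precision(tp, fp):
    """True positives divided by all predicted positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """True positives divided by all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts: 8 true positives, 2 false positives, 8 false negatives.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=8)
f1 = f1_score(p, r)
print("precision:", p, "recall:", r, "F1:", f1)
```

Because the harmonic mean is pulled toward the smaller value, the F1 score here sits below the simple average of precision and recall.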
•AUC-ROC :
•The Receiver Operating Characteristic (ROC) is a probability curve that
plots the TPR (True Positive Rate) against the FPR (False Positive Rate)
at various threshold values and separates the 'signal' from the 'noise'.
•The Area Under the Curve (AUC) is the measure of the ability of a
classifier to distinguish between classes. Geometrically, it is the area
enclosed by the ROC curve and the X and Y axes (the region labelled ABDE in the original slide's graph).
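AUC can also be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way by comparing every positive-negative pair, with ties counted as half. The classifier scores are made up.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up classifier scores for actual positives and actual negatives.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
auc = auc_from_scores(pos_scores, neg_scores)
print("AUC:", auc)
```

An AUC of 1.0 means every positive is ranked above every negative, while 0.5 corresponds to random ranking. The pairwise loop is fine for small samples; for large datasets a rank-based computation is the usual choice.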
