20AIPC302
FUNDAMENTAL OF MACHINE LEARNING TECHNIQUES
UNIT 4 - REGRESSION MODELLING
Introduction to regression modelling – Mathematical model for linear regression – Simple linear regression – Multiple linear regression – Improving accuracy of the linear regression model – Polynomial regression – Logistic regression – Maximum likelihood estimation
4.1 REGRESSION:
In this chapter, we will build concepts on the prediction of numerical variables – another key area of supervised learning. This area, known as regression, focuses on solving problems such as predicting the value of real estate, forecasting demand in retail, forecasting the weather, etc. First, you will be introduced to the most popular and simplest algorithm, namely simple linear regression. This model is rooted in the statistical concepts of fitting a straight line and the least squares method. We will explore this algorithm in detail. In the same context, we will also explore the concept of multiple linear regression. We will then briefly touch upon the other important algorithms in regression, namely multivariate adaptive regression splines, logistic regression, and maximum likelihood estimation. By the end of this chapter, you will have gained sufficient knowledge of all these aspects of supervised learning and be ready to start solving problems on your own.
4.2 EXAMPLE OF REGRESSION
New City is the primary hub of the commercial activities in the country. In the last couple of decades, with
increasing globalization, commercial activities have intensified in New City. Together with that, a large number of
people have come and settled in the city with a dream to achieve professional growth in their lives. As an obvious
fall-out, a large number of housing projects have started in every nook and corner of the city.
But the demand for apartments has still outgrown the supply. To benefit from this boom in the real estate business, Karen has started a digital market agency for buying and selling real estate (including apartments, independent houses, town houses, etc.). Initially, when the business was small, she used to interact with buyers and sellers personally and help them arrive at a price quote – either for selling a property (for a seller) or for buying a property (for a buyer). Her long experience in the real estate business helped her develop an intuition about what the correct price quote of a property could be – given the values of certain standard parameters such as the area (sq. m.) of the property, location, floor, number of years since purchase, amenities available, etc.
However, with the huge surge in the business, she is facing a big challenge. She is not able to manage the personal interactions and set the correct price quotes for the properties all by herself. She hired an assistant for managing customer interactions. But the assistant, being new to the real estate business, is struggling with price quotations. How can Karen solve this problem? Fortunately, Karen has a friend, Frank, who is a data scientist with in-depth knowledge of machine learning models. Frank comes up with a solution to Karen’s problem.
He builds a model which can predict the correct value of a property given certain standard inputs such as the area (sq. m.) of the property, location, floor, number of years since purchase, amenities available, etc. Wow, that sounds like Karen herself doing the job! Curious to know what model Frank has used? Yes, you guessed it right. He used a regression model to solve Karen’s real estate price prediction problem. So, we have just discussed one problem which can be solved using regression. In the same way, a bunch of other problems related to the prediction of numerical values can be solved using regression models. In the context of regression, the dependent variable (Y) is the one whose value is to be predicted, e.g. the price quote of the real estate in Karen’s problem.
This variable is presumed to be functionally related to one (say, X) or more independent variables called
predictors. In the context of Karen’s problem, Frank used area of the property, location, floor, etc. as predictors of
the model that he built. In other words, the dependent variable depends on independent variable(s) or predictor(s).
Regression is essentially finding a relationship or association between the dependent variable (Y) and the
independent variable(s) (X), i.e. to find the function ‘f ’ for the association Y = f (X).
COMMON REGRESSION ALGORITHMS:
The most common regression algorithms are
Simple linear regression
Multiple linear regression
Polynomial regression
Multivariate adaptive regression splines
Logistic regression
Maximum likelihood estimation (least squares)
4.3 SIMPLE LINEAR REGRESSION:
As the name indicates, simple linear regression is the simplest regression model which involves only one
predictor. This model assumes a linear relationship between the dependent variable and the predictor variable.
In the context of Karen’s problem, if we take Price of a Property as the dependent variable and the Area of
the Property (in sq. m.) as the predictor variable, we can build a model using simple linear regression.
Assuming a linear association, we can reformulate the model as
Price_Property = a + b × Area_Property
where ‘a’ and ‘b’ are the intercept and slope of the straight line, respectively. Just to recall, a straight line can be defined in slope–intercept form as Y = a + bX, where a = intercept and b = slope of the straight line. The value of the intercept indicates the value of Y when X = 0. It is known as ‘the intercept or Y-intercept’ because it specifies where the straight line crosses the vertical or Y-axis.
Slope of the simple linear regression model
Slope of a straight line represents how much the line in a graph changes in the vertical direction (Y-axis) over a
change in the horizontal direction (X-axis) as shown.
Slope = Change in Y/Change in X
Rise is the change along the Y-axis (Y2 − Y1) and Run is the change along the X-axis (X2 − X1). So, slope is represented as given below:
Slope = Rise/Run = (Y2 − Y1) / (X2 − X1)
Let us find the slope of the graph where the lower point on the line is represented as (−3, −2) and the higher point on the line is represented as (2, 2).
(X1, Y1) = (−3, −2) and (X2, Y2) = (2, 2)
Rise = (Y2 − Y1) = (2 − (−2)) = 2 + 2 = 4
Run = (X2 − X1) = (2 − (−3)) = 2 + 3 = 5
Slope = Rise/Run = 4/5 = 0.8
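To make the arithmetic above concrete, here is a minimal Python sketch of the slope calculation (the function name slope_between is ours, purely for illustration):

```python
def slope_between(p1, p2):
    """Slope of the line through two points (x1, y1) and (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    rise = y2 - y1          # change along the Y-axis
    run = x2 - x1           # change along the X-axis
    return rise / run

# Points from the example: lower point (-3, -2), higher point (2, 2)
print(slope_between((-3, -2), (2, 2)))   # 0.8
```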
There can be two types of slopes in a linear regression model: positive slope and negative slope. Different
types of regression lines based on the type of slope include
Linear positive slope
Curve linear positive slope
Linear negative slope
Curve linear negative slope
Linear positive slope
A positive slope always moves upward on a graph from left to right.
Slope = Rise/Run = (Y2 − Y1) / (X2 − X1) = Delta(Y) / Delta(X)
Scenario 1 for positive slope: Delta(Y) is positive and Delta(X) is positive
Scenario 2 for positive slope: Delta(Y) is negative and Delta(X) is negative
Curve linear positive slope
Curves in these graphs (refer to Fig. 8.4) slope upward from left to right.
Slope = (Y2 − Y1) / (X2 − X1) = Delta(Y) / Delta(X)
Slope for a variable (X) may vary between two graphs, but it will always be positive;
hence, the above graphs are called graphs with curve linear positive slope.
Linear negative slope:
A negative slope always moves downward on a graph from left to right.
As X value (on X-axis) increases, Y value decreases
Slope = Rise/Run = (Y2 − Y1) / (X2 − X1) = Delta(Y) / Delta(X)
Scenario 1 for negative slope: Delta(Y) is positive and Delta(X) is negative
Scenario 2 for negative slope: Delta(Y) is negative and Delta(X) is positive
Curve linear negative slope:
Curves in these graphs (refer to Fig. 8.6) slope downward from left to right.
Slope = (Y2 − Y1) / (X2 − X1) = Delta(Y) / Delta(X)
Slope for a variable (X) may vary between two graphs, but it will always be negative;
hence, the above graphs are called graphs with curve linear negative slope.
No relationship graph:
A scatter graph indicates a ‘no relationship’ curve when it is very difficult to conclude whether the relationship between X and Y is positive or negative.
Error in simple regression
The regression equation model in machine learning uses the above slope–intercept format in algorithms.
X and Y values are provided to the machine, and it identifies the values of a (intercept) and b (slope) by
relating the values of X and Y.
However, identifying the exact values of a and b that fit every observation is not always possible. There will be some error value (ε) associated with the prediction, i.e. Y = a + bX + ε.
This error is called the marginal or residual error.
Now that we have some context of the simple regression model, let us try to explore an example to
understand clearly how to decide the parameters of the model (i.e. values of a and b) for a given problem.
Example of simple regression:
A college professor believes that if the grade for the internal examination is high in a class, the grade for the external examination will also be high. A random sample of 15 students in that class was selected, and the data is given below:
A scatter plot was drawn to explore the relationship between the independent variable (internal marks) mapped
to X-axis and dependent variable (external marks) mapped to Y-axis.
As you can observe from the above graph, the line (i.e. the regression line) does not predict the data exactly
(refer to Fig. 8.8). Instead, it just cuts through the data. Some predictions are lower than expected, while some
others are higher than expected.
Residual is the distance between the predicted point (on the regression line) and the actual point.
As we know, in simple linear regression, the line is drawn using the regression formula Y = a + bX.
If we know the values of ‘a’ and ‘b’, then it is easy to predict the value of Y for any given X by using the above formula.
But the question is how to calculate the values of ‘a’ and ‘b’ for a given set of X and Y values?
A straight line is drawn as close as possible over the points on the scatter plot. Ordinary Least Squares
(OLS) is the technique used to estimate a line that will minimize the error (ε), which is the difference
between the predicted and the actual values of Y.
This is done by summing the errors of each prediction or, more appropriately, the Sum of the Squares of the Errors (SSE), i.e. SSE = Σ(Yi − Ŷi)², and finding the line for which this sum is smallest.
It is observed that the SSE is least when b takes the value
b = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
The corresponding value of ‘a’ calculated using the above value of ‘b’ is
a = Ȳ − b·X̄
where X̄ and Ȳ are the means of X and Y, respectively.
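As a rough sketch of how these formulas translate into code (using NumPy; the arrays x and y below are placeholder observations, not the marks data from this example):

```python
import numpy as np

def ols_fit(x, y):
    """Closed-form OLS estimates for simple linear regression y = a + b*x."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    return a, b

# Placeholder data for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 4.2, 5.8, 8.1, 9.9]
a, b = ols_fit(x, y)
print(a, b)                      # intercept and slope
print(np.polyfit(x, y, 1))       # [slope, intercept] as a cross-check
```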
So, let us calculate the values of a and b for the given example. For the detailed calculation, refer to the section ‘Detailed calculation of regression parameters’ below.
Calculation summary
Sum of X = 299
Sum of Y = 852
Mean of X, X̄ = 19.93
Mean of Y, Ȳ = 56.8
Hence, for the above example, the estimated regression equation is constructed on the basis of the estimated values of a and b:
Ŷ = 19.05 + 1.89 X
So, in the context of the given problem, we can say
Marks in external exam = 19.05 + 1.89 × (Marks in internal exam)
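As a quick usage sketch, a hypothetical internal mark of 20 would give a predicted external mark of about 19.05 + 1.89 × 20 ≈ 56.85:

```python
def predict_external(internal_marks, a=19.05, b=1.89):
    """Predicted external-exam marks from the fitted simple linear regression."""
    return a + b * internal_marks

print(predict_external(20))   # ~56.85 (20 is a hypothetical internal mark)
```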
Detailed calculation of regression parameters
The model built above can be represented graphically as an extended version (refer to Fig. 8.11) and as a zoom-in version.
Interpretation of the intercept
As we have already seen, the simple linear regression model built on the data in the example is
Marks in external exam = 19.05 + 1.89 × (Marks in internal exam)
The value of the intercept from the above equation is 19.05. However, none of the internal marks is 0. So, intercept = 19.05 indicates that 19.05 is the portion of the external examination marks not explained by the internal examination marks.
Interpretation of the slope
Slope measures the estimated change in the average value of Y as a result of a one-unit change in X. Here,
slope = 1.89 tells us that the average value of the external examination marks increases by 1.89 for each additional
1 mark in the internal examination.
Now that we have a complete understanding of how to build a simple linear regression model for a given
problem, it is time to summarize the algorithm.
OLS algorithm
Step 1: Calculate the means of X and Y
Step 2: Calculate the deviations of X and Y from their respective means
Step 3: Get the product of the paired deviations
Step 4: Get the summation of the products
Step 5: Square the deviations of X
Step 6: Get the sum of the squared deviations
Step 7: Divide the output of Step 4 by the output of Step 6 to calculate ‘b’
Step 8: Calculate ‘a’ using the value of ‘b’ (a = Ȳ − b·X̄)
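A minimal pure-Python sketch that follows these eight steps literally (the function name and the sample lists are ours, for illustration only):

```python
def ols_simple(x, y):
    n = len(x)
    mean_x = sum(x) / n                                   # Step 1: means
    mean_y = sum(y) / n
    dev_x = [xi - mean_x for xi in x]                     # Step 2: deviations from the means
    dev_y = [yi - mean_y for yi in y]
    products = [dx * dy for dx, dy in zip(dev_x, dev_y)]  # Step 3: products
    sum_products = sum(products)                          # Step 4: sum of products
    sq_dev_x = [dx ** 2 for dx in dev_x]                  # Step 5: squared deviations of X
    sum_sq_dev_x = sum(sq_dev_x)                          # Step 6: sum of squared deviations
    b = sum_products / sum_sq_dev_x                       # Step 7: slope
    a = mean_y - b * mean_x                               # Step 8: intercept
    return a, b

print(ols_simple([1, 2, 3, 4], [3, 5, 7, 9]))   # (1.0, 2.0) for y = 1 + 2x
```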
Maximum and minimum points of curves:
Maximum points (shown in Fig. 8.13) and minimum points (shown in Fig. 8.14) on a graph are found where the slope of the curve is zero; the slope becomes zero as it changes from a positive to a negative value, or vice versa. The maximum point is the point on the curve of the graph with the highest y-coordinate and a slope of zero. The minimum point is the point on the curve of the graph with the lowest y-coordinate and a slope of zero.
FIG. 8.13 Maximum point of curve: Point 63 is the maximum point for this curve. It has a greater y-coordinate value than any other point on the curve and has a slope of zero.
FIG. 8.14 Minimum point of curve: Point 40 (marked with an arrow) is the minimum point for this curve. It has a lesser y-coordinate value than any other point on the curve and has a slope of zero.
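A small numerical sketch of this idea, assuming a hypothetical curve: we estimate the slope along the curve and locate the point where it is closest to zero.

```python
import numpy as np

x = np.linspace(-5, 5, 1001)
y = -(x - 1) ** 2 + 63            # hypothetical curve with a maximum value of 63 at x = 1
slope = np.gradient(y, x)         # numerical estimate of the slope at each point
i = np.argmin(np.abs(slope))      # index where the slope is closest to zero
print(x[i], y[i])                 # ~1.0, ~63.0 -> the maximum point of the curve
```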
4.4 MULTIPLE LINEAR REGRESSION:
In a multiple regression model, two or more independent variables, i.e. predictors are involved in the
model. If we think in the context of Karen’s problem, in the last section, we came up with a simple linear
regression by considering Price of a Property as the dependent variable and the Area of the Property (in sq. m.) as
the predictor variable. However, location, floor, number of years since purchase, amenities available, etc. are also
important predictors which should not be ignored. Thus, if we consider Price of a Property (in $) as the dependent
variable and Area of the Property (in sq. m.), location, floor, number of years since purchase and amenities
available as the independent variables, we can form a multiple regression equation as shown below:
Price_Property = a + b1 × (Area of the Property) + b2 × (Location) + b3 × (Floor) + b4 × (Number of years since purchase) + b5 × (Amenities available)
The simple linear regression model and the multiple regression model assume that the dependent variable is continuous. The following expression describes the equation involving the relationship with two predictor variables, namely X1 and X2:
Ŷ = a + b1·X1 + b2·X2
The model describes a plane in the three-dimensional space of Ŷ, X1, and X2. Parameter ‘a’ is the intercept of this plane. Parameters ‘b1’ and ‘b2’ are referred to as partial regression coefficients. Parameter b1 represents the change in the mean response corresponding to a unit change in X1 when X2 is held constant. Parameter b2 represents the change in the mean response corresponding to a unit change in X2 when X1 is held constant.
Consider the following example of a multiple linear regression model with two predictor variables, namely
X1 and X2.
The multiple regression estimating equation when there are ‘n’ predictor variables is as follows:
Ŷ = a + b1·X1 + b2·X2 + … + bn·Xn
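A minimal NumPy sketch of fitting such a model with two predictors by least squares (the arrays below are placeholder data, not Karen’s real-estate records):

```python
import numpy as np

# Placeholder data: two predictors X1, X2 and a response Y (roughly Y = 1 + 2*X1 + X2)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([5.1, 5.9, 11.2, 11.8, 16.1])

# Design matrix with a leading column of ones for the intercept 'a'
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares estimates of (a, b1, b2)
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
a, b1, b2 = coef
print(a, b1, b2)   # should be close to 1, 2 and 1 for this placeholder data
```

The column of ones plays the role of the intercept ‘a’; each remaining column supplies one partial regression coefficient.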
While finding the best-fit line, we can also fit a polynomial or a curvilinear function of the predictors instead of a straight line. Such models are known as polynomial regression and curvilinear regression, respectively.
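A brief polynomial regression sketch using NumPy’s polyfit with a degree-2 polynomial (hypothetical data that follow a roughly quadratic trend):

```python
import numpy as np

# Hypothetical data that roughly follow y = x**2 + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 4.9, 10.2, 16.8, 26.1, 37.0])

# Fit y ~ c2*x**2 + c1*x + c0
coeffs = np.polyfit(x, y, deg=2)
poly = np.poly1d(coeffs)

print(coeffs)        # estimated polynomial coefficients (highest degree first)
print(poly(3.5))     # predicted y at x = 3.5
```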
Assumptions in Regression Analysis:
1. The dependent variable (Y) can be calculated/predicted as a linear function of a specific set of independent variables (X’s) plus an error term (ε).
2. The number of observations (n) is greater than the number of parameters (k) to be estimated, i.e. n > k.
3. Relationships determined by regression are only relationships of association based on the data set and are not necessarily relationships of cause and effect.
4. A regression line is valid only over a limited range of data. If the line is extended beyond that range (extrapolation), it may lead to wrong predictions.
5. If the business conditions change and the business assumptions underlying the regression model are no
longer valid, then the past data set will no longer be able to predict future trends.
6. The variance of the error term is the same for all values of X (homoskedasticity).
7. The error term (ε) is normally distributed. This also means that the expected value of the error (ε) is 0.
8. The values of the error (ε) are independent and are not related to any values of X. This means that the error for one particular observation (X, Y) is not related to the error for any other observation (X, Y).
Given the above assumptions, the OLS estimator is the Best Linear Unbiased Estimator (BLUE), and this is
known as the Gauss-Markov theorem.
Main Problems in Regression Analysis:
In multiple regression, there are two primary problems: multicollinearity and heteroskedasticity.
Multicollinearity:
Two variables are perfectly collinear if there is an exact linear relationship between them.
Multicollinearity is the situation in which there is correlation not only between the dependent variable and the independent variables, but also a strong correlation among the independent variables themselves.
A multiple regression equation can make good predictions when there is multicollinearity, but it is
difficult for us to determine how the dependent variable will change if each independent variable is
changed one at a time.
When multicollinearity is present, it increases the standard errors of the coefficients. By inflating the standard errors, multicollinearity makes some variables appear statistically insignificant when they should actually be significant (i.e. they would have lower standard errors without it).
One way to gauge multicollinearity is to calculate the Variance Inflation Factor (VIF), which assesses how much the variance of an estimated regression coefficient increases if the predictors are correlated. If no factors are correlated, the VIFs will all be equal to 1 (a short computation sketch is given after this list).
The assumption of no perfect collinearity states that there is no exact linear relationship among the
independent variables. This assumption implies two aspects of the data on the independent variables.
First, none of the independent variables, other than the variable associated with the intercept term, can
be a constant.
Second, variation in the X’s is necessary. In general, the more variation in the independent variables,
the better will be the OLS estimates in terms of identifying the impacts of the different independent
variables on the dependent variable.
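The following is a minimal NumPy sketch of the VIF calculation mentioned above, assuming VIF_j = 1 / (1 − R_j²), where R_j² is obtained by regressing predictor j on the remaining predictors (the data matrix is hypothetical):

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of the predictor matrix X (n x k)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]                                   # predictor j treated as the response
        others = np.delete(X, j, axis=1)              # remaining predictors
        A = np.column_stack([np.ones(n), others])     # add intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r_squared = 1 - resid.var() / y.var()
        vifs.append(1.0 / (1.0 - r_squared))
    return vifs

# Hypothetical predictors: the second column is almost twice the first (strong collinearity)
X = np.array([[1.0,  2.1, 0.5],
              [2.0,  3.9, 1.2],
              [3.0,  6.2, 0.8],
              [4.0,  7.8, 1.9],
              [5.0, 10.1, 1.1]])
print(vif(X))   # large VIFs for the first two columns indicate multicollinearity
```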
Heteroskedasticity:
Heteroskedasticity refers to the changing variance of the error term.
If the variance of the error term is not constant across observations, the predictions will be erroneous.
In general, for a regression equation to make accurate predictions, the error terms should be independent and identically (normally) distributed (iid).
Mathematically, this assumption is written as
var(ui | X) = σ² for all i, and cov(ui, uj | X) = 0 for all i ≠ j,
where ‘var’ represents the variance, ‘cov’ represents the covariance, ‘u’ represents the error terms, and ‘X’ represents the independent variables.
This assumption is more commonly written as
u ~ iid(0, σ²), i.e. the error terms are independent and identically distributed with mean 0 and constant variance σ².
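A short simulation sketch of what heteroskedasticity looks like in practice (the data are entirely hypothetical; the error spread grows with X, violating the constant-variance assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 200)

# Homoskedastic errors: constant standard deviation
y_homo = 3 + 2 * x + rng.normal(0, 1.0, size=x.size)

# Heteroskedastic errors: standard deviation grows with x
y_hetero = 3 + 2 * x + rng.normal(0, 0.5 * x, size=x.size)

# Compare residual spread in the lower and upper halves of the x range
for name, y in [("homoskedastic", y_homo), ("heteroskedastic", y_hetero)]:
    resid = y - (3 + 2 * x)                 # residuals around the true line
    low, high = resid[x < 5.5], resid[x >= 5.5]
    print(name, round(low.std(), 2), round(high.std(), 2))
```

For the homoskedastic case the two spreads are roughly equal; for the heteroskedastic case the residual spread is clearly larger at higher values of X.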