
Logistic

Uploaded by

zizzy1029
Copyright
© © All Rights Reserved

Logistic Regression:

 • Logistic regression is one of the most popular machine learning algorithms and belongs to
the supervised learning family. It predicts a categorical dependent variable from a given set
of independent variables.
 • Because the dependent variable is categorical, the outcome must be a discrete value such as
Yes or No, 0 or 1, or True or False. Rather than returning exactly 0 or 1, the model returns a
probability that lies between 0 and 1.
 • Instead of fitting a straight regression line, logistic regression fits an "S"-shaped logistic
function whose output is bounded by the two limiting values 0 and 1.
 • Logistic regression is a significant machine learning algorithm because it provides
probabilities and can classify new data using both continuous and discrete data.
 • Logistic regression can classify observations using different types of data and makes it easy
to identify which variables contribute most to the classification. The image below shows the
logistic function.
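
To make the "S" shape concrete, here is a minimal Python sketch of the logistic function. The function name `sigmoid` and the sample points are choices made for this illustration, not part of the original notes.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: squashes any real z into the range 0 to 1."""
    # Branch on the sign of z so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Evaluate the curve at a few points: values approach 0 on the left,
# pass through 0.5 at z = 0, and approach 1 on the right.
for z in (-6, -2, 0, 2, 6):
    print(f"z = {z:>2}  ->  sigmoid(z) = {sigmoid(z):.4f}")
```

The output is exactly 0.5 at z = 0, which is why 0.5 is the usual cut-off between the two classes.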

Assumptions for Logistic Regression:


 • The dependent variable must be categorical in nature.
 • The independent variables should not exhibit multicollinearity.

Logistic Regression Equation:

The logistic regression equation can be obtained from the linear regression equation. The
mathematical steps to get the logistic regression equation are given below:

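We know the equation of the straight line can be written as:

$$\log\left(\frac{p}{1-p}\right) \ \text{or} \ Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$

$$\frac{p}{1-p} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}$$

$$\frac{1-p}{p} = \frac{1}{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}}$$

$$\frac{1}{p} - 1 = e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n)}$$

$$\frac{1}{p} = 1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n)}$$

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n)}}$$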

Where p is the probability that the binary outcome is 1 given the input features x1, x2, ..., xn.
β0 is the intercept or constant.

β1, β2, β3, ..., βn are the coefficients of the variables x1, x2, x3, ..., xn.
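
The derivation can be checked numerically: taking the log-odds of p should return the linear combination β0 + β1x1 + ... + βnxn. The sketch below assumes made-up coefficients and inputs, and the names `predict_proba` and `log_odds` are hypothetical helpers chosen for this illustration.

```python
import math

def predict_proba(x, beta0, betas):
    """p = 1 / (1 + exp(-(beta0 + sum of beta_i * x_i)))."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Inverse of the logistic function: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and inputs, for illustration only.
beta0, betas = -1.0, [0.5, 0.25]
x = [2.0, 4.0]

z = beta0 + sum(b * xi for b, xi in zip(betas, x))  # linear predictor, here 1.0
p = predict_proba(x, beta0, betas)

# The log-odds of p should match z up to floating-point error.
print(f"z = {z}, p = {p:.4f}, log-odds = {log_odds(p):.4f}")
```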


Example: Suppose we want to predict whether a student will be admitted to a university based on their
exam score.
Dataset:

Exam Score (X)    Admitted (Y)
78.02             0
43.89             0
72.90             0
86.31             1
75.34             1

In this dataset, the first column represents the exam score of each student, and the second column
represents whether they were admitted (1) or not (0) to the university.

Solution:
Modeling and Prediction:

Using the logistic regression formula, the model can be represented as:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

Where β0 and β1 are the coefficients that need to be learned from the training data.

X         Y    X²          XY
78.02     0    6087.12     0
43.89     0    1926.332    0
72.90     0    5314.41     0
86.31     1    7449.416    86.31
75.34     1    5676.116    75.34

Sum:
356.46    2    26453.39    161.65

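First we find the regression equation by least squares:

$$Y = \beta_0 + \beta_1 x$$

From the table:

$$\sum x = 356.46, \quad \sum y = 2, \quad \sum xy = 161.65, \quad \sum x^2 = 26453.39, \quad n = 5$$

$$\bar{y} = \frac{\sum y}{n} = \frac{2}{5} = 0.40$$

$$\bar{x} = \frac{\sum x}{n} = \frac{356.46}{5} = 71.292$$

$$\beta_1 = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2} = \frac{(5 \times 161.65) - (356.46 \times 2)}{(5 \times 26453.39) - (356.46)^2} = 0.0183$$

$$\text{Intercept } (\beta_0) = \bar{y} - \beta_1 \bar{x} = 0.40 - (0.0183213 \times 71.292) = -0.9062$$

The intercept uses the slope before rounding (β1 ≈ 0.0183213); with the rounded value 0.0183 the
result would be -0.9046.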
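
The hand calculation above can be reproduced with a few lines of Python. This is a sketch of the same least-squares formulas applied to the five rows of the dataset; the variable names are chosen for this illustration.

```python
# Exam scores (x) and admission labels (y) from the dataset.
x = [78.02, 43.89, 72.90, 86.31, 75.34]
y = [0, 0, 0, 1, 1]
n = len(x)

# Column totals, matching the sums in the table.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Least-squares slope and intercept.
beta1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
beta0 = sum_y / n - beta1 * (sum_x / n)

print(f"beta1 = {beta1:.4f}, beta0 = {beta0:.4f}")  # beta1 = 0.0183, beta0 = -0.9062
```

Because the code keeps the slope at full precision, it arrives at the intercept -0.9062 directly.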
$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} = \frac{1}{1 + e^{-(-0.9062 + 0.0183x)}}$$

For the first row, β0 + β1x = -0.9062 + (0.0183 × 78.02) = 0.5216, so

$$p = \frac{1}{1 + e^{-0.5216}} = 0.6275$$

The remaining rows are computed the same way:

X        Y    X²          XY       β0 + β1x    p
78.02    0    6087.12     0        0.5216      0.6275
43.89    0    1926.332    0        -0.10301    0.4743
72.90    0    5314.41     0        0.42787     0.6054
86.31    1    7449.416    86.31    0.673273    0.6622
75.34    1    5676.116    75.34    0.472522    0.616
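
The probability column can be recomputed with the short sketch below. It assumes the rounded coefficients β0 = -0.9062 and β1 = 0.0183 from the worked example; `admission_probability` is a hypothetical helper name.

```python
import math

BETA0, BETA1 = -0.9062, 0.0183  # rounded coefficients from the worked example

def admission_probability(score: float) -> float:
    """Logistic function applied to the linear predictor beta0 + beta1 * score."""
    z = BETA0 + BETA1 * score
    return 1.0 / (1.0 + math.exp(-z))

scores = [78.02, 43.89, 72.90, 86.31, 75.34]
labels = [0, 0, 0, 1, 1]

# Print one line per student: score, true label, linear predictor, probability.
for score, label in zip(scores, labels):
    z = BETA0 + BETA1 * score
    p = admission_probability(score)
    print(f"x = {score:6.2f}  y = {label}  z = {z:8.4f}  p = {p:.4f}")
```

Note that with a 0.5 cut-off, the scores 78.02 and 72.90 receive probabilities above 0.5 even though their labels are 0, so this fit does not separate the five training rows perfectly.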

After training the model, we can use it to predict the probability of admission for a new
student.
For example, let's say we have a student with an exam score of 20. We plug this
value into the logistic function and calculate the probability.

X = 20

$$P = \frac{1}{1 + e^{-(-0.9062 + 0.0183x)}}$$

$$P = \frac{1}{1 + e^{-(-0.9062 + 0.0183 \times 20)}} = \frac{1}{1 + e^{0.5402}}$$

$$P = 0.3681$$

Because this probability (0.3681) is below the 0.5 threshold, the predicted class is 0:
the student is not admitted to the university.
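
The prediction step, including the 0.5 cut-off, can be sketched as follows. The coefficients are the rounded values from the worked example; the constant `THRESHOLD` and the function `predict` are names assumed for this illustration.

```python
import math

BETA0, BETA1 = -0.9062, 0.0183  # rounded coefficients from the worked example
THRESHOLD = 0.5                 # probabilities at or above this map to class 1

def predict(score: float):
    """Return (probability of admission, predicted class) for one exam score."""
    z = BETA0 + BETA1 * score
    p = 1.0 / (1.0 + math.exp(-z))
    return p, int(p >= THRESHOLD)

p, label = predict(20)
print(f"p = {p:.2f}, predicted class = {label}")  # p is about 0.37, class 0
```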
