
Logistic Regression

~ by Heiletjé van Zyl (VZYHEI003@myuct.ac.za)


Room 5.61, P.D. Hahn Building, Level 5 (South entrance)
Linear regression ~ Assumptions:
(1) Underlying relationship between CONTINUOUS dependent and (either continuous and/or categorical) independent variable/s is linear
(2) Errors are assumed to be:
> Independent
> Normally distributed for all 𝒙's with mean of 0 and constant variance: εᵢ ~ N(0, σ²)

Logistic regression ~ Assumptions:
(1) BINARY dependent variable has a relationship with (either continuous and/or categorical) independent variable/s
∴ Outcome of interest = NOT continuous and so NOT normally distributed…

[Figure: fitted models. Simple: 𝑦 against 𝑥. Multiple: 𝑦 against 𝑥₁ and 𝑥₂.]
Application of LR
On 28 January 1986, the Space Shuttle Challenger exploded 73 seconds after lift-off
~ due to O-ring (circular gasket that sealed the right rocket booster) failure, caused by exposure to very low temperature (30 ℉, about −1.1 ℃)

Morton Thiokol engineers (specifically Roger Boisjoly) warned NASA management about this risk…
BUT NASA management overruled these engineers, estimating only a 1 in 100 000 chance of shuttle failure for any given launch! This subpar statistical reasoning was one of the main reasons the launch still went ahead!

“Take off your engineer hat, and put on your management hat” ~ NASA to Boisjoly
Application of LR

Of interest: to predict whether the Space Shuttle Challenger’s O-rings will fail or not,
contingent on the temperature at the time of the launch.

All of the variables involved:


• 𝒚 (failure) = binary, indicating whether O-ring failure occurred (1) or not (0).
• 𝒙𝟏 (temperature) = temperature at time of Space Shuttle Challenger launch (in ℉)
Application of LR

[Figure: visualizing the data, then superimposing the SLR fit (~ uses identity link) and the LR fit (~ uses logit link)]

> Want to estimate the probability of getting a 1 (i.e., O-ring failure occurring)
> Fitted line needs to be constrained such that it falls between 0 and 1…
Logit Transformation

Logistic regression uses the following logit link function:

logit( pᵢ ) = log( pᵢ / (1 − pᵢ) ) = β₀ + β₁x₁ᵢ + β₂x₂ᵢ + … + βₚxₚᵢ

where pᵢ is the probability for the dependent variable, β₀ is the intercept parameter, β₁, …, βₚ are the slope parameters (related to the different independent variables), and x₁ᵢ, …, xₚᵢ are the independent variables. This transformation enables the constraining of p̂ to be between 0 and 1!

pᵢ / (1 − pᵢ) = odds of the outcome of interest occurring
log( pᵢ / (1 − pᵢ) ) = log-odds of the outcome of interest occurring
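A quick numeric check of the logit and its inverse can be done in base R (a minimal sketch; the probabilities below are arbitrary illustrative values):

p <- c(0.10, 0.50, 0.90)        # arbitrary probabilities
log_odds <- log(p / (1 - p))    # the logit: log-odds (qlogis(p) gives the same)
plogis(log_odds)                # inverse logit: recovers 0.10, 0.50, 0.90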
Logit Transformation
logit( p̂ᵢ ) = log( p̂ᵢ / (1 − p̂ᵢ) ) = β̂₀ + β̂₁x₁ᵢ + β̂₂x₂ᵢ + … + β̂ₚxₚᵢ

But recall the outcome of interest is still the predicted p̂ᵢ (probability)…

∴ NEED TO BACKTRANSFORM
(once model is fitted and β coefficients estimated)

p̂ᵢ = exp(β̂₀ + β̂₁x₁ᵢ + β̂₂x₂ᵢ + … + β̂ₚxₚᵢ) / (1 + exp(β̂₀ + β̂₁x₁ᵢ + β̂₂x₂ᵢ + … + β̂ₚxₚᵢ)) = exp(LO) / (1 + exp(LO))

   = 1 / (1 + exp(−(β̂₀ + β̂₁x₁ᵢ + … + β̂ₚxₚᵢ))) = 1 / (1 + exp(−LO))

where β̂₀ + β̂₁x₁ᵢ + β̂₂x₂ᵢ + … + β̂ₚxₚᵢ = LOG-ODDS (LO)

p̂ᵢ ∈ (0, 1) ∴ p̂ᵢ could take on a value like 0.66…

BUT want to speak in terms of 0 and 1 only! So, then often specify a threshold 𝜋 value (e.g., 0.5) where:

ŷᵢ = 0 if p̂ᵢ ≤ 𝜋
ŷᵢ = 1 if p̂ᵢ > 𝜋
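A minimal R sketch of the back-transformation and the thresholding (the log-odds values and 𝜋 = 0.5 are illustrative):

LO <- c(-1.2, 0.66, 2.5)           # illustrative fitted log-odds values
p_hat <- exp(LO) / (1 + exp(LO))   # backtransform; equivalently plogis(LO)
y_hat <- ifelse(p_hat > 0.5, 1, 0) # classify with threshold pi = 0.5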
Odds versus Log(odds)

pᵢ / (1 − pᵢ) = odds of the outcome of interest occurring
log( pᵢ / (1 − pᵢ) ) = log-odds of the outcome of interest occurring

Use the ln button on the calculator to get this! ~ ln denotes base e (the natural logarithm)
DO NOT USE the log button! ~ log denotes base 10
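The same caution applies in R, where log() is the natural logarithm:

log(exp(1))              # log() is base e (natural logarithm): returns 1
log10(100)               # log10() is base 10: returns 2
log(0.66 / (1 - 0.66))   # log-odds for p = 0.66: about 0.66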
Logit Transformation

In terms of probability: the fitted relationship between p and 𝑥 is S-curved (sigmoid).
In terms of the logit transformation: logit(p) is a straight-line function of 𝑥.

[Figure: the same fitted model shown on both scales]
Ways in which to interpret 𝛽! coefficients of fitted LR model

Log-odds:

log( p̂ᵢ / (1 − p̂ᵢ) ) = β̂₀ + β̂₁x₁ᵢ

> β̂₁ = change in the logarithm of the odds (log-odds) when x₁ᵢ is increased by one unit.
> For more than one unit in x₁ᵢ: 𝑟 × β̂₁ = estimated change in the log-odds when x₁ᵢ is increased by 𝑟 units.

Odds, with factor:

p̂ᵢ / (1 − p̂ᵢ) = exp(β̂₀) × exp(β̂₁x₁ᵢ)

> exp(β̂₁) = factor that the odds get multiplied by for every one-unit increase in x₁ᵢ.
> exp(𝑟 × β̂₁) = factor that the odds get multiplied by for every 𝑟-unit increase in x₁ᵢ.
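A small R sketch of these two scales, using an illustrative slope of 0.25 (not a fitted value):

b1 <- 0.25    # illustrative slope estimate
exp(b1)       # odds factor per one-unit increase: about 1.28
r <- 5
r * b1        # change in log-odds per 5-unit increase: 1.25
exp(r * b1)   # odds factor per 5-unit increase: about 3.49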
Ways in which to interpret factors
p̂ᵢ / (1 − p̂ᵢ) = exp(β̂₀) × exp(β̂₁x₁ᵢ)

exp(β̂₁) = factor that the odds get multiplied by for every one-unit change in x₁ᵢ.

Magnitude of factor exp(β̂₁)    Interpretation
Greater than 1                 Positive effect – increase in odds
Less than 1                    Negative effect – decrease in odds
Equal to 1                     No effect

Examples:
> exp(β̂₁) = 1.8
~ a one-unit increase in some 𝑥 is associated with the odds of the event being multiplied by 1.8, meaning an increase of 80%.

> exp(β̂₁) = 0.6
~ a one-unit increase in some 𝑥 is associated with the odds of the event being multiplied by 0.6, meaning a decrease of 40%.

> exp(β̂₁) = 1
~ a one-unit increase in some 𝑥 is associated with the odds of the event being multiplied by 1, meaning there is no effect.
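The percentage change is (factor − 1) × 100, as a quick R check of the examples above shows:

(1.8 - 1) * 100   # factor 1.8: odds increase by 80%
(0.6 - 1) * 100   # factor 0.6: odds decrease by 40%
(1.0 - 1) * 100   # factor 1: no change in the odds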
Odds Ratio (OR’s)
log( p̂ᵢ / (1 − p̂ᵢ) ) = β̂₀ + β̂₁x₁ᵢ

Assumption that x₁ᵢ is a dichotomous variable, where 1 = exposed and 0 = unexposed.

If p̂₀ = Pr(event | unexposed):
> Odds of the event among the unexposed (substituting 0 into x₁ᵢ): p̂₀ / (1 − p̂₀) = exp(β̂₀)

If p̂₁ = Pr(event | exposed):
> Odds of the event among the exposed (substituting 1 into x₁ᵢ): p̂₁ / (1 − p̂₁) = exp(β̂₀ + β̂₁)

Ratio of the odds of the event for an exposed person relative to an unexposed person:

OR = [ p̂₁ / (1 − p̂₁) ] / [ p̂₀ / (1 − p̂₀) ] = exp(β̂₀ + β̂₁) / exp(β̂₀) = exp(β̂₁)
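In R the OR comes straight out of the fitted model. A sketch, assuming a data frame d with binary columns event and exposed (hypothetical names):

fit <- glm(event ~ exposed, data = d, family = "binomial")
exp(coef(fit)["exposed"])   # odds ratio = exp(beta1-hat)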
Perform LR in R
In R use the function:

glm(y ~ x1 + x2, data = …, family = "binomial")

The coefficient table in the output contains the estimates β̂₀, β̂₁, …
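A sketch for the Challenger example, assuming a data frame orings with a binary column failure (1 = O-ring failure) and temperature in ℉ (the variable names are assumptions):

fit <- glm(failure ~ temperature, data = orings, family = "binomial")
summary(fit)   # coefficient table: beta0-hat (Intercept) and beta1-hat (temperature)
coef(fit)      # for the lecture's fit: 10.87535 and -0.17132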
Using R for Interpretation

log( p̂ᵢ / (1 − p̂ᵢ) ) = 10.87535 − 0.17132 × (Temperature)

THEN BACKTRANSFORMS TO: p̂ᵢ = exp(10.87535 − 0.17132 × Temperature) / (1 + exp(10.87535 − 0.17132 × Temperature))

Alternatively: p̂ᵢ = 1 / (1 + exp(−(10.87535 − 0.17132 × Temperature)))

In terms of log-odds
> On average, the log-odds of O-ring failure occurring decreases by 0.17 units if temperature increases by 1 ℉

In terms of factor (one-unit)
> On average, a 1 ℉ increase in temperature is associated with the odds of O-ring failure changing by a factor of exp(−0.17132) = 0.84, which translates to a 16% decrease in the odds of O-ring failure occurring.

In terms of factor (ten-unit)
> On average, a 10 ℉ increase in temperature is associated with the odds of O-ring failure changing by a factor of exp(−0.17132 × 10) = 0.18, which translates to an 82% decrease in the odds of O-ring failure occurring.
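These factors can be verified in R directly from the fitted slope:

b1 <- -0.17132   # fitted temperature slope from the lecture
exp(b1)          # about 0.84: a 16% decrease in odds per 1 F increase
exp(10 * b1)     # about 0.18: an 82% decrease in odds per 10 F increase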
Checking significance of 𝒙 effect

Of interest: how confident are we that the temperature effect is real?

Compute a 95% confidence interval for β̂₁:

CI = β̂₁ ± z₍α/2₎ × s₍β̂₁₎

Note: NOT from the t-distribution, but the NORMAL!

Raw scale: CI = −0.17132 ± 1.96 × 0.08344 ∴ (−0.335; −0.008)
OR scale: (exp(−0.335); exp(−0.008)) ∴ (0.715; 0.992)

Since the raw-scale interval excludes 0 (equivalently, the OR-scale interval excludes 1), there is evidence at the 5% level that the temperature effect is real.
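In R, the raw-scale interval by hand, and its OR-scale version (confint.default() returns the same normal-approximation interval for a fitted model):

b1 <- -0.17132; se <- 0.08344
ci <- b1 + c(-1, 1) * 1.96 * se   # raw scale: (-0.335, -0.008)
exp(ci)                           # OR scale: (0.715, 0.992)
# For a fitted model `fit`: exp(confint.default(fit))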
Prediction
Of interest: to predict the probability of O-ring failure when the temperature is 30 ℉

Steps:
(1) Estimate the log-odds by substituting in the value of Temperature (i.e., the 𝑥 value of interest)
(2) Substitute the estimated log-odds into the backtransformed p̂ᵢ formula

log( p̂ᵢ / (1 − p̂ᵢ) ) = 10.87535 − 0.17132 × (30) = 5.73575

THEN BACKTRANSFORMS TO: p̂ᵢ = exp(5.73575) / (1 + exp(5.73575))

Alternatively: p̂ᵢ = 1 / (1 + exp(−5.73575))

Both equal to 0.99678…
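In R, assuming the fitted model object fit sketched earlier:

new_x <- data.frame(temperature = 30)
predict(fit, newdata = new_x)                     # step 1: log-odds = 5.73575
predict(fit, newdata = new_x, type = "response")  # step 2: probability = 0.99678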


Confusion Matrix ~ LABEL PROPERLY!!!!!

                               OBSERVED
                          Positive    Negative
PREDICTED   Positive         a           b
            Negative         c           d

Positive Predictive Value: PPV = a / (a + b)
> probability of being positive given that the prediction is positive.
> proportion of true positives (a) among all the predicted positives (a + b)

Negative Predictive Value: NPV = d / (c + d)
> probability of being negative given that the prediction is negative.
> proportion of true negatives (d) among all the predicted negatives (c + d)

Sensitivity = a / (a + c)
> ability of the model to predict a true positive
> proportion of true positives (a) among all the actual (observed) positives (a + c).

Specificity = d / (b + d)
> ability of the model to predict a true negative
> proportion of true negatives (d) among all the actual (observed) negatives (b + d).
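A minimal R helper for the four measures, given the cell counts a, b, c, d (the function name is made up for illustration):

conf_measures <- function(a, b, c, d) {
  c(PPV         = a / (a + b),
    NPV         = d / (c + d),
    Sensitivity = a / (a + c),
    Specificity = d / (b + d))
}
conf_measures(a = 3, b = 1, c = 4, d = 16)   # the pi = 0.5 example on the next slide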
Confusion Matrix ~ Threshold value chosen to be 𝜋 = 0.5

                                  OBSERVED
                           O-rings Fail    O-rings Succeed
PREDICTED  O-rings Fail          3                1
           O-rings Succeed       4               16

(Here the "positive" class is O-ring failure, i.e., the outcome coded 1.)

PPV = a/(a+b) = 3/4 = 0.75
NPV = d/(c+d) = 16/20 = 0.80
Sensitivity = a/(a+c) = 3/7 ≈ 0.4286
Specificity = d/(b+d) = 16/17 ≈ 0.9412
Choosing threshold value (𝜋) = trade-off between sensitivity and specificity

Change in 𝜋    Effect
Decrease       Classify more observations = 1 ∴ ↑ sensitivity and ↓ specificity
Increase       Classify more observations = 0 ∴ ↑ specificity and ↓ sensitivity

OVERALL = MAXIMISE THE MODEL'S ABILITY TO CLASSIFY CORRECTLY!
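A sketch of how the trade-off plays out in R, assuming a vector p_hat of fitted probabilities and a vector y of observed 0/1 outcomes (both names are assumptions):

for (cutoff in c(0.2, 0.5, 0.8)) {                # cutoff plays the role of pi
  y_hat <- as.integer(p_hat > cutoff)             # classify at this threshold
  print(table(Predicted = y_hat, Observed = y))   # one confusion matrix per threshold
}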
More observations classified as 1!

Confusion Matrix ~ Decrease threshold value to 𝜋 = 0.2 (no longer 0.5)

                                  OBSERVED
                           O-rings Fail    O-rings Succeed
PREDICTED  O-rings Fail          6                8
           O-rings Succeed       1                9

PPV = 6/14 ≈ 0.4286
NPV = 9/10 = 0.900
Sensitivity = 6/7 ≈ 0.8571
Specificity = 9/17 ≈ 0.5294

Evidently, decreasing 𝜋 results in the sensitivity increasing (from 0.4286 to 0.8571) and the specificity decreasing (from 0.9412 to 0.5294)…
More observations classified as 0!

Confusion Matrix ~ Increase threshold value to 𝜋 = 0.8 (no longer 0.5)

                                  OBSERVED
                           O-rings Fail    O-rings Succeed
PREDICTED  O-rings Fail          1                0
           O-rings Succeed       6               17

PPV = 1/1 = 1
NPV = 17/23 ≈ 0.7391
Sensitivity = 1/7 ≈ 0.1429
Specificity = 17/17 = 1

Evidently, increasing 𝜋 results in the sensitivity decreasing (from 0.4286 to 0.1429) and the specificity increasing (from 0.9412 to 1)…
