Week 12: Linear Probability Models, Logistic and
Probit
Marcelo Coca Perraillon
University of Colorado
Anschutz Medical Campus
Health Services Research Methods I
HSMP 7607
2019
These slides are part of a forthcoming book to be published by Cambridge
University Press. For more information, go to perraillon.com/PLH. © This
material is copyrighted. Please see the entire copyright notice on the book’s
website.
Updated notes are here: https://clas.ucdenver.edu/marcelo-perraillon/teaching/health-services-research-methods-i-hsmp-7607
Outline
Modeling 1/0 outcomes
The “wrong” but super useful model: Linear Probability Model
Deriving logistic regression
Probit regression as an alternative
Binary outcomes
Binary outcomes are everywhere: whether a person died or not, broke
a hip, has hypertension or diabetes, etc.
We typically want to understand the probability of the binary
outcome given explanatory variables
These are the same types of models we have seen during the
semester; the difference is that so far we have been modeling the conditional
expectation given covariates: E[Y|X] = β0 + β1X1 + ··· + βpXp
Now, we want to model the probability given covariates:
P(Y = 1|X ) = f (β0 + β1 X1 + · · · + βp Xp )
Note the function f() in there
Linear Probability Models
We could actually use our vanilla linear model to do so
If Y is an indicator or dummy variable, then E [Y |X ] is the proportion
of 1s given X , which we interpret as the probability of Y given X
The parameters are changes/effects/differences in the probability of Y
for a unit change in X or for a small change in X
If X is an indicator variable, then the change is from 0 to 1
For example, if we model died_i = β0 + β1 age_i + ε_i, we could interpret
β1 as the change in the probability of death for an additional year of
age
Linear Probability Models
The problem is that we know that this model is not entirely
correct. Recall that in the linear model we assume
Y ∼ N(β0 + β1X1 + ··· + βpXp, σ²) or, equivalently, ε_i ∼ N(0, σ²)
That is, Y is normally distributed conditional on the Xs, or the error
is normally distributed with mean 0
Obviously, a 1/0 variable can’t be normally distributed, and ε_i can’t be
normally distributed either
We also know that we need the normality assumption for
inference, not to get the best betas
The big picture: Using the linear model for a 1/0 outcome is
mostly wrong in the sense that the SEs are not right
Yet, the estimated effects of covariates on the probability of the outcome are
more often than not fine
So the LPM is the wrong but super useful model because changes can
be interpreted in the probability scale
Example
Data from the National Health and Nutrition Examination Survey
(NHANES)
We model the probability of hypertension given age
reg htn age
Source | SS df MS Number of obs = 3,529
-------------+---------------------------------- F(1, 3527) = 432.78
Model | 59.3248737 1 59.3248737 Prob > F = 0.0000
Residual | 483.471953 3,527 .137077389 R-squared = 0.1093
-------------+---------------------------------- Adj R-squared = 0.1090
Total | 542.796826 3,528 .153853976 Root MSE = .37024
------------------------------------------------------------------------------
htn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0086562 .0004161 20.80 0.000 .0078404 .009472
_cons | -.1914229 .0193583 -9.89 0.000 -.2293775 -.1534682
------------------------------------------------------------------------------
An additional year of life increases the probability of hypertension by
0.0087, or about 0.9 percentage points. See, neat
Example
Plot predicted values against observed values (note the jittering)
reg htn age
predict htnhat_lpm
scatter htn age, jitter(5) msize(tiny) || line htnhat_lpm age, sort ///
legend(off) color(red) saving(hlpm.gph, replace)
graph export hlpm.png, replace
* Predicted probability for a 20 y/o (used on the next slide)
di _b[_cons] + _b[age]*20
Linear Probability Models
You can see the first problem with the LPMs
The relationship between age (or any other variable) and the probability
cannot be linear everywhere: probabilities must be between 0 and 1
In this example, the predicted probability of hypertension for a 20 y/o is
-.0182996
Is this a big problem in this example? No, because on average the
probability of hypertension is 0.19, which is not close to zero
Lowess is neat for seeing 0/1 variables conditional on a variable
Lowess
Lowess can show you what the relationship between the indicator
variable and the explanatory variable looks like
* Generate the lowess smooth first so lhtn exists
lowess htn age, gen(lhtn) nograph
scatter htn age, jitter(5) msize(vsmall) || line lhtn age, sort ///
legend(off) color(red) saving(hlowess.gph, replace) ///
title("Lowess")
LPM, more issues
The variance of a 1/0 (binary) variable depends on the values of X so there is
always heteroskedasticity: var(y|x) = p(x)[1 − p(x)]
We know that we need the homoskedasticity assumption for correct SEs and F tests.
So we can correct SEs in LPMs using the robust option (Huber-White
SEs; aka the sandwich estimator)
Still, we do know that the SEs are not totally correct because the errors
are not normally distributed either, even if we somehow correct for
heteroskedasticity
But at the very least, use the robust option by default
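In Stata this just means adding the vce(robust) option; a minimal sketch using the running NHANES example (same model as before):
reg htn age, vce(robust)
* The coefficients are identical to plain OLS; only the SEs change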
So why do we use LPMs?
Not long ago, maybe 10 or 15 years ago (it’s always 10 to 15 to 20
years ago), you couldn’t easily use the alternatives (logistic, probit)
with large datasets
It would take too long to run the models or they wouldn’t run;
researchers would take a sample and run logit or probit as a sensitivity
analysis
The practice still lingers in HSR and health economics
The main reason to keep using the LPM as a first step in modeling is
that the coefficients are easy to interpret
In my experience, if the average of the outcome is not close to 0 or 1,
there is not much difference between the LPM and logit/probit (but SEs can
change, although not by a lot)
But not a lot of good reasons to present LPM results in papers
anymore, except maybe in difference-in-difference models
One more time
Other than interpretation of coefficients or a first pass at modeling,
there are NO GOOD REASONS TO USE THE LPM
Some researchers (ok, economists, mostly) truly love the LPM because
the parameters are easy to interpret and often the effects are close
enough
Yet, in some cases, the effects could be off, too
But it’s the wrong model. Use a probit or logit, period
Logistic or logit model
Logistic models can be derived in several ways, which makes learning
confusing since you can read different versions
In the MLE lecture we derived the model assuming that the 1/0
outcome distributes Bernoulli and that observations are iid. We will
extend that example today
An (almost) identical way is to assume that the outcome comes
from a Binomial distribution since the Binomial is the sum of iid
Bernoulli random variables
A third way is to assume that there is a latent and continuous
variable that distributes logistic (yes, there is also a logistic pdf) or
normal (probit), but we only get to observe a 1 or a 0 depending on whether
the latent variable crosses a threshold
You get to the same model, but the latent interpretation has a
bunch of applications in economics (for example, random utility
models) and psychometrics (the latent variable is “ability” but you
only observe whether a person answers a question correctly, a 1/0)
Recall the Logit MLE from the MLE lecture
Remember that if we have an iid Bernoulli rv we can write the joint
probability of observing the data as:
L(p) = ∏_{i=1}^{n} p^{y_i} (1 − p)^{1−y_i}
As we saw before, taking the log makes the analytical and
computational calculations easier:
lnL(p) = ∑_{i=1}^{n} y_i ln(p) + ∑_{i=1}^{n} (1 − y_i) ln(1 − p)
Using ∑ y_i = nȳ, the log-likelihood simplifies to:
lnL(p) = nȳ ln(p) + (n − nȳ) ln(1 − p)
If we take the derivative with respect to p and set it equal to zero we
find the MLE estimators. The SEs can be calculated using the second
derivatives (see the MLE lecture)
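Filling in that step, the first-order condition and its solution are:
∂lnL/∂p = nȳ/p − (n − nȳ)/(1 − p) = 0, which gives p̂ = ȳ
That is, the MLE of p is just the sample proportion of 1s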
Adding covariates
We can of course make p a function of covariates, but the function
can’t be something like p = β0 + β1X1 + ··· + βpXp since p must
be bounded between 0 and 1
So we use a transformation. One of those transformations is the
logistic response function:
π(x) = e^x / (1 + e^x) = 1 / (1 + e^{−x}), where x is any real number
π(x) is then restricted to be between 0 and 1
Confusion alert: Note that there is a logistic response function and
also the logistic distribution. Here we are using the response function.
In the latent derivation, the logistic distribution is used
See more here:
https://en.wikipedia.org/wiki/Logistic_distribution
Logistic response function
If we constrain the response to be between 0 and 1, it can’t be linear
with respect to X
twoway function y=exp(x) / (1+ exp(x)), range(-10 10) saving(l1.gph, replace)
twoway function y=exp(-x) / (1+ exp(-x)), range(-10 10) saving(l2.gph, replace)
graph combine l1.gph l2.gph, xsize(20) ysize(10)
graph export lboth.png, replace
Logistic or logit model
Notice a couple of things. The effect of x on π is not linear; the effect
depends on the value of x
But we can make the function linear using the so-called logit
transformation
ln(π / (1 − π)) = x
I made you go the other way in one homework. If you solve for π you
get to the logistic response function
More generally, the model is:
ln(p / (1 − p)) = β0 + β1X1 + ··· + βpXp, which transformed is
p = e^{β0+β1X1+···+βpXp} / (1 + e^{β0+β1X1+···+βpXp})
It can also be written as: p = 1 / (1 + e^{−(β0+β1X1+···+βpXp)})
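A quick numerical check in Stata that the two forms are identical (the values −1, 0.1, and x = 2 here are arbitrary):
di exp(-1 + 0.1*2)/(1 + exp(-1 + 0.1*2))
di 1/(1 + exp(-(-1 + 0.1*2)))
* Both display the same probability, about 0.31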
Logistic MLE
When extending the log-likelihood function to make p a function of
covariates we use the logistic response function to constrain p
And that’s what creates a bunch of interpretation problems. The
estimated betas are changes in the log-odds scale
From now on, always, always write logistic models like this:
ln(p_i / (1 − p_i)) = β0 + β1X1i + ··· + βpXpi
That’s what Stata (or SAS or R) estimates. The betas are changes in
the log-odds scale
We will see next class that we could write effects as odds ratios but
I’ll tell you that odds ratios are confusing
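As a quick preview (a sketch using the NHANES model from earlier): exponentiating a logit coefficient gives an odds ratio, which is what Stata’s logistic command reports directly:
qui logit htn age
* The coefficient is in the log-odds scale; exp() converts it to an odds ratio
di exp(_b[age])
logistic htn age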
Yet another way
What if we assume there is a latent (unobserved) variable y∗
that is continuous? Think of it as a measure of illness
If y∗ crosses a threshold, then the person dies. We only observe whether the
person died but we can’t observe the latent variable
We can write this problem as
P(y = 1|X ) = P(β0 + β1 X1 + · · · + βp Xp + u > 0) = P(−u <
β0 + β1 X1 + · · · + βp Xp ) = F (β0 + β1 X1 + · · · + βp Xp )
F() is the cdf of −u. If we assume a logistic distribution, we get logistic
regression; if we assume the cumulative normal, we get a probit model
See Cameron and Trivedi Chapter 14, section 14.3.1
Both models are similar because the logistic distribution and the
normal are almost the same
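A sketch of the latent-variable idea via simulation (the parameters −1 and 0.5 are assumed just for illustration): draw u from a logistic distribution, build y∗, keep only the 1/0 indicator, and logit recovers the latent parameters:
clear
set seed 123456
set obs 5000
gen x = rnormal()
* Logistic draws via the inverse cdf: ln(u/(1-u))
gen u1 = uniform()
gen u = ln(u1/(1-u1))
* Latent variable; we only observe whether it crosses 0
gen ystar = -1 + 0.5*x + u
gen y = (ystar > 0)
* Coefficients should be close to -1 and 0.5
logit y x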
Compare logistic distribution and normal
Stata doesn’t have a logistic distribution but you can simulate any
probability distribution by drawing from the uniform distribution and plugging
the draws into the inverse of the cdf you want
clear
set seed 123456
set obs 5000
gen u = uniform()
* Simulate logistic distribution via the inverse cdf: ln(u/(1-u))
gen l = -ln((1 - u)/u)
sum l
* Simulated normal with same parameters
gen n = rnormal(r(mean), r(sd))
* Plot
kdensity l, bw(0.3) gen(xl dl)
kdensity n, bw(0.3) gen(xn dn)
line dl xl, sort color(red) || line dn xn, sort ///
title("Logistic (red) vs normal distribution") ytitle("Density") ///
xtitle("x") legend(off)
Standard logistic vs standard normal
Does it make much of a difference if we use one vs the other?
* (same simulation code as on the previous slide)
graph export logvsnorm.png, replace
Logistic vs normal
Assuming either one as the latent distribution makes little difference
Big picture
Not a big difference in the probability scale between probit and logit
If you are an economist you run probit models; for the rest of the
world, there is the logistic model
IMPORTANT: There is a big difference in terms of interpreting a
regression output because the coefficients are estimated in different
scales
In the logistic model the effect of a covariate can be made linear in
terms of the odds-ratio; you can’t do the same in probit models
We will see examples of both
Digression: How to simulate a logistic model
clear
set seed 12345678
set obs 10000
gen x1 = rchi2(1)+2
* Make the probability a function of x1
gen pr1 = exp(-1+0.1*x1) /(1+exp(-1+0.1*x1))
* Generate a one-trial binomial (Bernoulli)
gen y = rbinomial(1, pr1)
list y in 1/10
+---+
| y |
|---|
1. | 0 |
2. | 1 |
3. | 1 |
4. | 0 |
5. | 1 |
|---|
6. | 0 |
7. | 0 |
8. | 0 |
9. | 1 |
10. | 0 |
+---+
Simulation
Note that the estimated parameters match the ones I used in the gen step on the
previous slide
logit y x1
Iteration 0: log likelihood = -6351.6567
Iteration 1: log likelihood = -6328.5462
Iteration 2: log likelihood = -6328.4883
Iteration 3: log likelihood = -6328.4883
Logistic regression Number of obs = 10,000
LR chi2(1) = 46.34
Prob > chi2 = 0.0000
Log likelihood = -6328.4883 Pseudo R2 = 0.0036
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .0998315 .0145838 6.85 0.000 .0712478 .1284153
_cons | -1.004065 .0493122 -20.36 0.000 -1.100715 -.9074147
------------------------------------------------------------------------------
Big picture
In the next weeks, we will (again) focus on parameter interpretation
You already have most of the tools to do modeling but we need to
adapt them to these types of models
The tricky part about logistic or probit models or other types of
models is to move from the world in which relationships are linear to
the world in which they are not
There is no R² but there are other ways to check the predictive ability
and fit of models. Mostly, we do it using the estimated log-likelihood
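For example (a sketch using the NHANES model), the Pseudo R2 that Stata reports is McFadden’s measure, built from the stored log-likelihoods; information criteria are another option:
qui logit htn age
* McFadden's pseudo R2: 1 - ll(model)/ll(null)
di 1 - e(ll)/e(ll_0)
* AIC and BIC
estat ic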
Comparing LPM, logit, and probit models
For now, let’s compare models
* LPM
qui reg htn age
est sto lpm
predict hatlpm
* Logit
qui logit htn age
est sto logit
predict hatlog
* Probit
qui probit htn age
est sto prob
predict hatprob
line hatlpm age, sort color(blue) || line hatlog age, sort || ///
line hatprob age, sort legend(off) saving(probs.gph, replace)
graph export prob.png, replace
Predicted probabilities
Note that the predicted probabilities are not linear with probit and logit even
though we wrote the models in the same way
Note that there is practically no difference between the logit and
probit models in this example
But wait, look closely... What about effects?
In the linear model the effect of X (age in this case) is always the
same regardless of age. It’s the slope, both for small changes and when we
use marginal effects
Look at the graph again. Around age 45 the effects will be nearly identical
(similar slopes). But not around 25 or 65
Effect at different points
Comparing logit and LPM for now (not probit)
For the linear model, the effect is, well, linear, so always the same:
0.0086562
qui reg htn age
margins, dydx(age) at(age=(25 45 65))
Conditional marginal effects Number of obs = 3,529
Model VCE : OLS
Expression : Linear prediction, predict()
dy/dx w.r.t. : age
1._at : age = 25
2._at : age = 45
3._at : age = 65
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |
_at |
1 | .0086562 .0004161 20.80 0.000 .0078404 .009472
2 | .0086562 .0004161 20.80 0.000 .0078404 .009472
3 | .0086562 .0004161 20.80 0.000 .0078404 .009472
------------------------------------------------------------------------------
Effect at different points
But for logit the effect will be different. Note that the difference is
quite large
qui logit htn age
margins, dydx(age) at(age=(25 45 65)) vsquish
Conditional marginal effects Number of obs = 3,529
Model VCE : OIM
Expression : Pr(htn), predict()
dy/dx w.r.t. : age
1._at : age = 25
2._at : age = 45
3._at : age = 65
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |
_at |
1 | .0030922 .0001522 20.32 0.000 .0027939 .0033904
2 | .0084346 .0004166 20.25 0.000 .007618 .0092511
3 | .0149821 .0009574 15.65 0.000 .0131057 .0168585
------------------------------------------------------------------------------
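For completeness, the probit version is obtained the same way (output omitted here; the marginal effects are very close to the logit ones):
qui probit htn age
margins, dydx(age) at(age=(25 45 65)) vsquish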
But, does this matter?
Well, yes and no
If you are using the model to understand how age affects the
probability of hypertension, yes, of course it matters
If you want to understand the effect of age on P(htn) for most
people (on average), then it matters less because the average age
in the sample is 45, and at 45 the effect is about the same
Yet another reason to use the LPM as a first pass, not as your final
tool
At the very least you must understand these issues
I’m with Will Manning on this issue. The LPM is the wrong model
(but useful)
Coefficients in the estimation scale?
The coefficients are very different because they are measured in
different scales
. est table lpm logit prob, p
-----------------------------------------------------
Variable | lpm logit prob
-------------+---------------------------------------
_ |
age | .00865616
| 0.0000
_cons | -.19142286
| 0.0000
-------------+---------------------------------------
htn |
age | .0623793 .0352797
| 0.0000 0.0000
_cons | -4.4563995 -2.5542014
| 0.0000 0.0000
-----------------------------------------------------
legend: b/p
Similar in probability scale
We will learn how to make them comparable using the probability
scale, which is what we really care about. The margins command is
computing an average effect across values of age
. * LPM
qui reg htn age
margins, dydx(age)
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0086562 .0004161 20.80 0.000 .0078404 .009472
------------------------------------------------------------------------------
* Logit
qui logit htn age
margins, dydx(age)
------------+----------------------------------------------------------------
age | .0084869 .0004066 20.87 0.000 .00769 .0092839
------------------------------------------------------------------------------
* Probit
qui probit htn age
margins, dydx(age)
-------------+----------------------------------------------------------------
age | .0084109 .0003917 21.47 0.000 .0076432 .0091787
------------------------------------------------------------------------------
Summary
There are many ways to arrive at the logistic model
The big picture is that we are trying to model a probability, which
must be bounded between 0 and 1
If it’s bounded, then the effect of any covariate on the probability
cannot be linear
The tricky part is to learn how to interpret parameters. The short
story is that we estimate (assuming one predictor)
log(p / (1 − p)) = β0 + β1x, but we care about p = e^{β0+β1x} / (1 + e^{β0+β1x})
To make life easier, logistic models are often interpreted using odds
ratios but odds ratios can be misleading
In the probit model, we interpret parameters as shifts in the
cumulative normal, even less intuitive