Statistical Foundations of Business Analytics
Chapter 5: Generalized Linear Models
Tim Ederer
Mini 2, 2024
Tepper Business School
Introduction
Chapters 1-4 gave you a complete toolkit for making inference about β
• Mostly focus on cases where y is continuous
What happens when y is not continuous?
• Binary outcomes: binary choice, credit default,...
• Categorical outcomes: duration, multiple choice,...
Binary Outcomes
Linear Probability Model
What happens when yi is binary?
• Example: yi = 1 if consumer i chooses product A and yi = 0 otherwise
Linear regression model is not appropriate!
• Linear probability model: E[yi |xi ] = P(yi = 1|xi ) = xi′ β
• Can lead to predictions where P(yi = 1|xi ) is below 0 or above 1
Need to think about an alternative model
• Build model where P(yi = 1|xi ) ∈ [0, 1]
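To see the problem concretely, here is a minimal sketch (hypothetical data, in Python rather than the course's R) of fitting a linear probability model by OLS and obtaining a predicted "probability" well above 1:

```python
# hypothetical data: binary outcome y, single regressor x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0, 0, 1, 1, 1]

# OLS slope and intercept for the linear probability model E[y|x] = a + b*x
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sum(
    (xi - xbar) ** 2 for xi in x
)
a = ybar - b * xbar

# predicted "probability" at x = 10 escapes [0, 1]
p_hat = a + b * 10.0
print(p_hat)  # 3.0, far above 1
```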
Latent Outcomes
Assume that there is a continuous latent outcome yi∗ such that
yi = 1 if yi∗ ≥ 0,  and yi = 0 otherwise
Examples
• Choice: yi∗ could be the utility/valuation of a specific product
• Credit default: yi∗ could be the solvency of a company
• yi∗ is normalized such that you buy the product or you default when yi∗ ≥ 0
How does that help us?
Adapting the Linear Regression Model
We can use the linear regression model to relate yi∗ to xi
yi∗ = xi′ β + εi
Under this structure, all you need to know is β!
• Causal analysis: how a change in xi would change yi∗ and eventually yi
• Forecasting: what yi would be under a counterfactual realization of xi
How do we use data on (yi , xi ) to learn about β?
• How can we overcome the challenge that yi∗ is not observed?
Assumptions
We still need EXO and RANK
• E[εi |xi ] = 0 and no perfect collinearity between elements of x
But given that we do not observe yi∗ we need more structure
P(yi = 1|xi ) = P(xi′ β + εi ≥ 0|xi )
Can we recover β from P(yi = 1|xi )?
Probit
Answer is YES if you specify the distribution of ε
• Reminder: this is not needed in the standard linear regression model
Probit model: εi |xi ∼ N (0, 1)
• P(yi = 1|xi ) = Φ(xi′ β)
• Φ(.) is the c.d.f. of the standard normal distribution
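A small sketch of the probit link, evaluating Φ via the error function from the standard library (the values of xi′β below are made up for illustration):

```python
import math

def norm_cdf(z):
    # c.d.f. of the standard normal, via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# probit probabilities P(y=1|x) = Phi(x'beta) for some made-up values of x'beta
for xb in (-2.0, 0.0, 2.0):
    print(xb, norm_cdf(xb))  # always strictly between 0 and 1
```

Because Φ is a c.d.f., the predicted probability can never leave [0, 1], which is exactly what the linear probability model failed to guarantee.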
β is identified under this assumption!
• If we observed the whole population, we could derive β directly
Logit
Alternative to probit: logit model
• Different assumption on distribution of εi
More convenient than probit because of tractable analytical expressions
P(yi = 1|xi ) = exp{xi′β} / (1 + exp{xi′β})   and   P(yi = 0|xi ) = 1 / (1 + exp{xi′β})
β is also identified under this assumption
• Use the fact that log P(yi = 1|xi ) − log P(yi = 0|xi ) = xi′ β
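A quick numerical check of this log-odds identity, using a hypothetical value of xi′β:

```python
import math

def logit_p1(xb):
    # P(y=1|x) under the logit model
    return math.exp(xb) / (1.0 + math.exp(xb))

xb = 0.7  # hypothetical value of x'beta
p1 = logit_p1(xb)
p0 = 1.0 - p1  # equals 1 / (1 + exp{x'beta})
print(math.log(p1) - math.log(p0))  # recovers x'beta = 0.7
```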
What About Inference?
β is identified, now what?
• This does not yet tell us how to make inference about β with our sample
We cannot use OLS for these models
• OLS would only work if we observed yi∗
How can we find an alternative estimator for β?
• Use Maximum Likelihood Estimation (MLE)
• Intuition: find value of β such that the predictions of the model are closest to data
Maximum Likelihood Estimation
Likelihood
We want an estimator that “fits” the data best
• Need to measure how likely it is that our model will predict the observed outcome
Likelihood of individual i: l(β; yi , xi )
• How likely is it, given a value of β, that I observe (yi , xi )?
Likelihood of the sample: ∏i=1,...,n l(β; yi , xi )
• How likely is it, given a value of β, that I observe (yi , xi ) for all i = 1, ..., n?
• If this value is small, the model fits poorly =⇒ we should change β!
Maximum Likelihood Estimator
MLE is the value of β that maximizes the log of the likelihood of the sample
• We take the log for mathematical and computational tractability
Log-likelihood of the sample: L(β; y , X ) = ∑i=1,...,n log l(β; yi , xi )
• Allows us to turn the product into a sum =⇒ easier to compute in R
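A small illustration of why the log matters numerically (the individual likelihood values are made up): the raw product underflows to exactly zero in floating point, while the sum of logs stays finite:

```python
import math

# hypothetical individual likelihood values: each is small but positive
lik = [1e-4] * 200

# raw product of likelihoods underflows to exactly 0.0 in floating point
prod = 1.0
for l in lik:
    prod *= l

# sum of log-likelihoods stays finite and easy to compare across beta values
ll = sum(math.log(l) for l in lik)
print(prod, ll)
```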
The maximum likelihood estimator for β is defined as
β̂ML = arg maxβ L(β; y , X )
Examples: Logit and Probit
Likelihood of individual i in the logit model
l(β; yi , xi ) = ( exp{xi′β} / (1 + exp{xi′β}) )^yi · ( 1 / (1 + exp{xi′β}) )^(1−yi)
Likelihood of individual i in the probit model
l(β; yi , xi ) = Φ(xi′β)^yi · (1 − Φ(xi′β))^(1−yi)
Illustration in R: MLE with Probit Model
Assume that yi∗ = βxi + εi with β = 1
• Probit: εi |xi ∼ N (0, 1)
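The original slide's R illustration is not reproduced here; the following is a comparable sketch in Python that simulates the model above and recovers β by maximizing the probit log-likelihood (a crude grid search stands in for the optimizer an R routine would use):

```python
import math
import random

random.seed(0)

def norm_cdf(z):
    # standard normal c.d.f. via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# simulate the model on the slide: y* = beta*x + eps, beta = 1, eps ~ N(0, 1)
beta_true, n = 1.0, 2000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if beta_true * xi + random.gauss(0.0, 1.0) >= 0.0 else 0 for xi in x]

def loglik(b):
    # probit log-likelihood: sum of yi*log(Phi(b*xi)) + (1-yi)*log(1-Phi(b*xi))
    total = 0.0
    for xi, yi in zip(x, y):
        p = min(max(norm_cdf(b * xi), 1e-12), 1.0 - 1e-12)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# crude grid search over candidate values of beta (stand-in for a real optimizer)
grid = [i / 100.0 for i in range(201)]  # beta in [0, 2]
beta_hat = max(grid, key=loglik)
print(beta_hat)  # should land close to the true beta = 1
```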
Properties of MLE
β̂ ML is unbiased, consistent and efficient
• Only under EXO and RANK
Unbiased and consistent estimator for the variance of the MLE
V̂ar(β̂ML|X ) = −[ ∂²L(β̂; y , X ) / ∂β∂β′ ]⁻¹ = Î⁻¹
β̂ ML is normally distributed for large n
β̂ML|X ∼ N(β, Î⁻¹)
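A sketch of how Î is used in practice: approximate the second derivative of the log-likelihood at the maximum numerically and invert it to get a standard error. The quadratic "log-likelihood" below is a toy stand-in whose curvature is known exactly:

```python
import math

def L(b):
    # toy log-likelihood (hypothetical quadratic, maximized at b = 1);
    # its exact second derivative is -100, so Var = 1/100 and s.e. = 0.1
    return -50.0 * (b - 1.0) ** 2

b_hat, h = 1.0, 1e-4
# central finite-difference approximation of the second derivative at the MLE
hess = (L(b_hat + h) - 2.0 * L(b_hat) + L(b_hat - h)) / h ** 2
var_hat = -1.0 / hess   # inverse of the observed information
se = math.sqrt(var_hat)
print(se)  # ≈ 0.1
```

With the standard error in hand, confidence intervals and tests work exactly as in Chapter 2, thanks to the large-n normality of β̂ML.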
Recap
Linear regression model is not appropriate for binary outcomes
• Can produce predicted probabilities below 0 or above 1
Alternative model: latent variable model
• Impose linear regression model on continuous latent outcome yi∗
We can make inference about β even if we do not observe yi∗
• Need to impose distributional assumption on ε (probit, logit,...)
• β is identified and β̂ ML is unbiased, consistent and efficient
• =⇒ we can make inference about β using the tools from Chapter 2!
Categorical Outcomes
What should we do when yi is a categorical variable?
• Multiple choice: yi is the product chosen by consumer i
• Survival analysis: yi is the duration before an event occurs (credit default, insurance claim)
As in the binary case, the linear regression model is not appropriate
• Rely instead on latent variable model
• Use linear regression model to link continuous latent outcome to x
• Estimate parameters of interest via Maximum Likelihood
Focus of this chapter: multiple choice analysis
• Useful for analysis of consumer behavior, optimal pricing, advertisement strategy
Discrete Choice Model
Consider yi as being consumer i’s choice over J products
yi = j if i chooses product j,  for j = 1, ..., J
Goal: study the relationship between consumer choice yi and (z1 , z2 , ..., zJ )
• zj : product j’s characteristics (e.g. price, quality)
Latent Utility Model
Define yi as a function of uij the utility consumer i gets from buying product j
yi = j if uij ≥ uik for all k = 1, ..., J
Use linear regression model to link uij to zj
uij = zj′ β + εij
Conditional Logit
Assumptions needed
• As always we need EXO and RANK
• Fix distribution of ε: εij i.i.d. ∼ Gumbel(0, 1)
This is called the conditional logit model
P(yi = j|z1 , ..., zJ ) = exp{zj′β} / ∑k=1,...,J exp{zk′β}
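A minimal numerical sketch of these choice probabilities, with a made-up coefficient and product characteristics:

```python
import math

# hypothetical conditional logit: one characteristic per product (say, price)
beta = -0.5          # assumed coefficient (e.g. disutility of price)
z = [1.0, 2.0, 4.0]  # made-up characteristics of products 1..3

num = [math.exp(beta * zj) for zj in z]
probs = [v / sum(num) for v in num]
print(probs)  # valid probabilities: each in (0, 1), summing to 1
```

With a negative β, products with larger z (here, higher prices) receive lower choice probabilities, as the model intends.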
Estimation of Conditional Logit Model
Use Maximum Likelihood to estimate β
• β̂ ML is unbiased, consistent and efficient under EXO and RANK
Likelihood of individual i
l(β; yi , z1 , ..., zJ ) = ∏j=1,...,J ( exp{zj′β} / ∑k=1,...,J exp{zk′β} )^1{yi =j}
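Putting the pieces together, a sketch (hypothetical parameters, grid search in place of a proper optimizer, Python in place of the course's R) that simulates choices from a conditional logit and recovers β by maximum likelihood:

```python
import math
import random

random.seed(1)

# hypothetical setup: J = 3 products, one characteristic z_j, true beta = -0.5
beta_true = -0.5
z = [1.0, 2.0, 4.0]

def choice_probs(b):
    # conditional logit probabilities P(y = j | z1..zJ)
    e = [math.exp(b * zj) for zj in z]
    s = sum(e)
    return [v / s for v in e]

# simulate n consumers' choices from the true choice probabilities
n = 4000
choices = random.choices(range(len(z)), weights=choice_probs(beta_true), k=n)

def loglik(b):
    # log-likelihood of the sample: sum over i of log P(yi | z1..zJ)
    p = choice_probs(b)
    return sum(math.log(p[c]) for c in choices)

# grid search for the MLE (a stand-in for a real optimizer)
grid = [i / 100.0 - 2.0 for i in range(401)]  # beta in [-2, 2]
beta_hat = max(grid, key=loglik)
print(beta_hat)  # should land close to the true beta = -0.5
```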
Summary
This chapter: what happens when yi is not continuous?
• Linear regression model is not appropriate anymore
Alternative: latent variable model
• Need additional assumptions (fix distribution of errors)
• Need to change the estimator (maximum likelihood estimator)
Very useful applications
• Consumer choice analysis, duration analysis, pricing strategy,...
• Essential in economics, finance, strategy, marketing
Thank you and good luck!