Lecture: IV and 2SLS Estimators (Wooldridge’s book chapter 15)
Endogeneity
• The endogeneity issue arises when the key regressor is correlated with the error term:
cov(x, u) ≠ 0 (Endogeneity) (1)
• This can happen when (i) there are omitted variables; (ii) there is reverse causation or
simultaneity; (iii) there is measurement error
• In the presence of endogeneity, the OLS estimator is biased (inconsistent)
β̂ →_p β + bias (2)
or equivalently, causal effect cannot be identified
cov(y, x) = β cov(x, x) + cov(x, u) (3)
β = [cov(y, x) − cov(x, u)] / cov(x, x) ≠ cov(y, x) / cov(x, x) (4)
• The primary goal of econometrics is to resolve the endogeneity (identification) issue
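To make the bias concrete, below is a minimal Stata simulation sketch; all variable names and parameter values are invented for illustration. Here OLS converges to β + cov(x, u)/var(x) = 2 + 1/3 ≈ 2.33, while IV recovers the true slope of 2.
clear
set seed 12345
set obs 10000
generate z = rnormal()            // instrument: exogenous and relevant
generate u = rnormal()            // structural error
generate x = z + u + rnormal()    // endogenous regressor: cov(x, u) = 1 > 0
generate y = 1 + 2*x + u          // true slope is 2
regress y x                       // OLS: slope near 2.33 (biased upward)
ivregress 2sls y (x = z)          // IV/2SLS: slope near 2 (consistent)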
Instrumental Variables (IV) Can Help
• If there is endogeneity, the 2SLS or IV estimator based on valid IVs is consistent (β can be
identified)
• An IV is valid if (i) it is uncorrelated with the error term (exogeneity); (ii) it is correlated with
the key regressor (relevance); (iii) it has no direct effect on y, i.e., it is excluded from the
structural form (exclusion)
• A valid IV is hard to find
• For instance, the number of rainy days is a valid IV for watching TV if (1) it is
uncorrelated with the autism gene (exogeneity); (2) it is correlated with watching TV
(relevance); (3) it has no direct effect on developing autism (exclusion). Notice
that it is allowed to affect autism indirectly through watching TV.
Structural Form vs Reduced Form
Consider a linear model in which x2 is assumed to be exogenous
y = β1 x1 + β2 x2 + u (5)
• We are interested in estimating β1, which measures the marginal effect of x1 on y
• This is a reduced form if x1 is also exogenous. OLS can be applied to the reduced form
• This is a structural form if x1 is endogenous. Most economic models are structural
forms. OLS becomes biased, so instead we may need to find an IV.
IVs
• x2 cannot be used as an IV. It satisfies exogeneity, and maybe relevance, but it does not
satisfy exclusion
• A valid IV should be an exogenous variable that matters for x1 (relevance) but affects y
only indirectly, through its effect on x1 (exclusion)
• β1 is just-identified if there is only one IV (excluded exogenous variable). In this case,
the 2SLS estimator is also called the IV estimator.
• β1 is over-identified if there are multiple IVs.
• β1 is under-identified if there is no excluded exogenous variable.
• For instance, we have over-identification if we know both the number of rainy days and the
number of snowy days. If only one is known, we have just-identification.
Apple Story
• You can think of x1 as a partially rotten apple consisting of two parts: the bad
endogenous part (correlated with u) and the good exogenous part (uncorrelated with u)
• OLS is bad since it uses the whole apple
• IV estimation is good because the IV is used as a knife to remove the endogenous part, and
only the exogenous part is used in the estimation.
• When people ask about your identification strategy, typically they wonder how the bad
part of the apple is removed or how the good part is isolated
• We hope the good part is big, i.e., the IV and x1 are not weakly related
• It is a good idea to use more IVs (over-identification) to isolate a bigger exogenous part of
the apple
Big Picture
[Figure: the structural model drawn as a box containing y, x1, x2, and u; the instrument z sits outside the box, linked to x1 but not to u or y]
• The box defines the structural model in which y depends on x1 , x2 and u.
• x1 is the variable of interest, whose marginal (causal) effect on y we want to quantify.
However, x1 is endogenous because it is linked to u. OLS is biased because of the
x1–u link.
• To solve the endogeneity or identification issue, we need help—an IV variable z which
is outside the box (exclusion), is related to x1 (relevance), and is unrelated to u
(exogeneity)
• Notice that x2 is exogenous because there is no link between x2 and u. x2 cannot be used
as an IV because it is inside the box (fails exclusion). Instead, x2 is called a control
variable (included exogenous variable)
• Critical thinking: what if we do not control for x2 ? (Hint: think about the potential link
between z and x2 )
• You need to draw and justify this big picture if you decide to use IV methodology
Stata
Suppose there are two valid IVs z1 and z2. The Stata command for the 2SLS estimator is
ivreg y (x1 = z1 z2) x2, first
In newer versions of Stata, the equivalent command is ivregress 2sls y (x1 = z1 z2) x2, first.
• It is important to control for x2, which makes the exogeneity condition more likely to
hold for z1 and z2
• The option first reports the first-stage regression, which regresses x1 onto z1, z2, and x2.
The residual of the first-stage regression is the bad part of the apple and can be used to
implement the Hausman test. The weak-IV test is just the F statistic for testing that the
coefficients of z1 and z2 are jointly zero. The fitted value of the first-stage regression is the
good part of the apple, and it is the instrument used in the second stage
• We obtain the 2SLS estimator by regressing y onto the first-stage fitted value and x2 using
OLS (the second stage). The ivreg command does all of this for you; the sketch below
replicates it by hand
• Important: z1, z2 are excluded exogenous variables, while x2 is an included exogenous
variable (control variable).
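The two stages can be replicated by hand; here is a sketch using the same (hypothetical) variable names:
regress x1 z1 z2 x2      // first stage
test z1 z2               // weak-IV check: F statistic for the excluded IVs
predict x1hat, xb        // good part of the apple (fitted value)
predict vhat, residuals  // bad part of the apple (residual, kept for the Hausman test)
regress y x1hat x2       // second stage: reproduces the 2SLS point estimates
One caution: the standard errors from the manual second stage are wrong because they are computed from the wrong residuals (they use x1hat instead of x1); ivreg reports the correct ones.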
Three Little Pigs Story
• Recall the first-stage regression
x = c1 z1 + c2 z2 + · · · + cm zm + included exogenous variables + v
• (Hausman Test): The null hypothesis is that the regressor is exogenous (so OLS is good
and IV is not needed). We run the first-stage regression and save the residual v̂. Then
run an auxiliary regression y = xβ + d v̂ and test H0 : d = 0. A small p-value indicates that
the regressor is endogenous and IV is needed.
• (Stock-Yogo Test): The null hypothesis is that c1 = c2 = . . . = cm = 0, meaning that the
IVs are irrelevant (weak IVs). We reject the null hypothesis if the F statistic exceeds 10
• (Over-identification or Sargan's J Test): The key coefficient is over-identified if the
number of IVs exceeds the number of endogenous regressors by q > 0. In that case we
can test the null hypothesis that all IVs are exogenous. We run the auxiliary regression
û_{2SLS} = a1 z1 + a2 z2 + · · · + am zm + included exogenous regressors
and compute nR² ∼ χ²(q). A big nR² leads to rejection, meaning at least one IV is endogenous.
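A sketch of all three tests in Stata, assuming one endogenous regressor x1 and two IVs z1, z2 (so q = 2 − 1 = 1); variable names are hypothetical:
* Hausman test of regressor exogeneity
regress x1 z1 z2 x2
predict vhat, residuals
regress y x1 x2 vhat
test vhat                   // small p-value: x1 is endogenous, IV is needed
* Weak-IV test (first-stage F)
regress x1 z1 z2 x2
test z1 z2                  // rule of thumb: IVs are not weak if F > 10
* Sargan's J test of over-identification
ivregress 2sls y (x1 = z1 z2) x2
predict uhat, residuals
regress uhat z1 z2 x2
display "nR2 = " e(N)*e(r2) ", p-value = " chi2tail(1, e(N)*e(r2))
After ivregress 2sls, the built-in postestimation commands estat endogenous, estat firststage, and estat overid report essentially the same three diagnostics.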
Math (Optional)
Consider a simple regression y = β0 + β1 x + u, where x is endogenous: cov(x, u) ≠ 0. To
derive a formula for the IV estimator, assume there is only one excluded exogenous variable
z satisfying cov(z, u) = 0 and cov(x, z) ≠ 0. It follows that
cov(y, z) = cov(β0 + β1 x + u, z) = β1 cov(x, z) (6)
β1^{IV} = cov(y, z) / cov(x, z) (7)
Some old school people want to rewrite it as
β1^{IV} = [cov(y, z)/var(z)] / [cov(x, z)/var(z)] = reduced-form OLS estimate / first-stage OLS estimate (8)
When there are multiple instrumental variables, the IV estimator is called the 2SLS estimator
β1^{2SLS} = cov(y, x̂) / cov(x, x̂) = cov(y, x̂) / var(x̂) = OLS estimate of regressing y onto x̂ (9)
where x̂ is the fitted value of regressing x onto the multiple IVs (the first-stage
regression).
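Equation (8) can be verified numerically; a sketch assuming hypothetical variables y, x, z (e.g., from the simulation earlier):
regress y z                  // reduced form
scalar b_reduced = _b[z]
regress x z                  // first stage
scalar b_first = _b[z]
display "beta1_IV by hand = " b_reduced/b_first
ivregress 2sls y (x = z)     // reports the same slope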
(Optional) Matrix Algebra I
• Let X be the matrix of regressors in the structural form, X = (x1, x2). Note x1 is
endogenous while x2 is exogenous
• Let Z be the matrix of all exogenous variables, Z = (z1, z2, x2). Note x2 is the included
exogenous variable, while z1, z2 are excluded exogenous variables
• Define the projection matrix P = Z(Z′Z)^{−1}Z′. The fitted value of the first stage is X̂ = PX.
Note X̂ is exogenous, and the fitted value for x2 is itself
• The second stage uses X̂ as regressors and applies OLS
β̂^{2SLS} = (X̂′X̂)^{−1} X̂′Y (10)
= (X′PX)^{−1} X′PY (11)
= [X′Z(Z′Z)^{−1}Z′X]^{−1} X′Z(Z′Z)^{−1}Z′Y (12)
where we use the fact that P is symmetric and idempotent: P′ = P and PP = P
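The projection-matrix formula can be computed directly in Mata. A minimal sketch, assuming (hypothetical) variables y, x1, x2, z1, z2 in memory and adding a constant; forming the n × n matrix P explicitly is fine for teaching but wasteful in large samples:
mata:
    Y = st_data(., "y")
    n = rows(Y)
    X = (st_data(., ("x1", "x2")), J(n, 1, 1))        // structural regressors + constant
    Z = (st_data(., ("z1", "z2", "x2")), J(n, 1, 1))  // all exogenous variables + constant
    P = Z * invsym(Z' * Z) * Z'                       // projection matrix
    Xhat = P * X                                      // first-stage fitted values
    b = invsym(X' * P * X) * (X' * P * Y)             // 2SLS coefficients, equation (11)
    b
end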
(Optional) Matrix Algebra II
• It follows that
β̂^{2SLS} = β + (X′PX)^{−1} X′PU
so β̂^{2SLS} is consistent if the IVs are valid
• The variance-covariance matrix for β̂^{2SLS} is (assuming homoskedasticity)
var-cov(β̂^{2SLS}) = σ² (X′PX)^{−1}
• The CLT implies that in large samples
β̂^{2SLS} ∼ N(β, σ² (X′PX)^{−1})
and a Wald statistic can be constructed to test H0 : Rβ = r
Wald Test = (Rβ̂^{2SLS} − r)′ [R σ² (X′PX)^{−1} R′]^{−1} (Rβ̂^{2SLS} − r)
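Continuing the Mata sketch above (the Mata workspace persists across mata: ... end blocks), the homoskedastic variance-covariance matrix and standard errors:
mata:
    e  = Y - X * b                   // 2SLS residuals (use X, not Xhat)
    s2 = (e' * e) / (n - cols(X))    // estimate of sigma^2
    V  = s2 * invsym(X' * P * X)     // var-cov matrix under homoskedasticity
    (b, sqrt(diagonal(V)))           // coefficients and standard errors
end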
No Free Lunch (trade-off between consistency and efficiency)
Recall that the first-stage regression is basically a decomposition
x = x̂ + r̂ (13)
which implies the following decomposition of the total sum of squares (TSS)
TSS = ESS + RSS, TSS ≥ ESS (14)
or in this case, loosely speaking, we have
X′X ≥ X′PX, (X′X)^{−1} ≤ (X′PX)^{−1}, var-cov(β̂^{OLS}) ≤ var-cov(β̂^{2SLS}) (15)
In words, the IV estimator is less efficient than the OLS estimator: it has a bigger variance (and
a smaller t value). Intuitively, this is because only part of the apple is eaten.
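The efficiency loss is easy to see on simulated data such as the earlier sketch: the 2SLS standard error is noticeably larger than the OLS one.
regress y x               // biased but precise (small standard error)
ivregress 2sls y (x = z)  // consistent but less precise (larger standard error)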
(Optional) Matrix Algebra III
It is straightforward to account for heteroskedasticity. The robust variance-covariance matrix
for β̂ 2SLS allowing for heteroskedasticity is
robust var-cov(β̂^{2SLS}) = (X′PX)^{−1} X′PΩPX (X′PX)^{−1}
where Ω = E(UU′). To estimate the meat in the middle of that sandwich, use
X′PΩ̂PX = X̂′Ω̂X̂ = Σ_{i=1}^{n} û_i² x̂_i x̂_i′
where û denotes the 2SLS residual
û = Y − X β̂^{2SLS}
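Continuing the same Mata sketch, the robust sandwich can be estimated as follows:
mata:
    u    = Y - X * b                    // 2SLS residuals
    meat = quadcross(Xhat, u:^2, Xhat)  // sum of uhat_i^2 * xhat_i * xhat_i'
    A    = invsym(X' * P * X)           // bread of the sandwich
    Vr   = A * meat * A                 // robust var-cov matrix
    (b, sqrt(diagonal(Vr)))             // coefficients and robust standard errors
end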
(Optional) GMM Estimator
If the errors are not IID (i.e., there is heteroskedasticity or serial correlation), a more efficient
estimator is the generalized method of moments (GMM) estimator
β̂^{GMM} = (X̂′Ω^{−1}X̂)^{−1} X̂′Ω^{−1}Y (16)
= (X′PΩ^{−1}PX)^{−1} X′PΩ^{−1}PY (17)
Basically, GMM combines GLS with IV estimation.
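As a rough illustration of formula (16) only, take Ω̂ = diag(û²) from a first-step 2SLS (pure heteroskedasticity case). This naive plug-in can be unstable when residuals are near zero; in practice one would use a packaged routine such as Stata's ivregress gmm. Continuing the Mata sketch:
mata:
    u  = Y - X * b                  // first-step 2SLS residuals
    w  = 1 :/ (u:^2)                // diagonal of Omega-hat inverse
    bg = invsym(quadcross(Xhat, w, Xhat)) * quadcross(Xhat, w, Y)
    bg                              // second-step GMM/GLS-style estimate
end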