Quantitative Social Science Methods, I,
Lecture Notes: Research Designs for Causal
Inference
Gary King
Institute for Quantitative Social Science
Harvard University
August 17, 2020
GaryKing.org
Components of Causal Estimation Error
Research Designs
Issues in Ideal Designs
Reference
• Kosuke Imai, Gary King, and Elizabeth Stuart. “Misunderstandings among Experimentalists and Observationalists: Balance Test Fallacies in Causal Inference.” Journal of the Royal Statistical Society, Series A, 171, Part 2 (2008): 1–22.
• http://j.mp/MisExpObs
Notation
• Sample 𝑛 units from a finite population of size 𝑁 (typically 𝑁 ≫ 𝑛)
• Observed outcome variable: 𝑌𝑖
• Sample selection: 𝐼𝑖 = 1 if selected, 0 otherwise
• Treatment assignment: 𝑇𝑖 = 1 if in the treated group, 0 if in the control group
• (Assume: treated and control groups are each of size 𝑛/2)
• Potential outcomes: 𝑌𝑖(1) and 𝑌𝑖(0), the values 𝑌𝑖 takes when 𝑇𝑖 is 1 or 0, respectively
• Fundamental problem of causal inference: only one potential outcome is ever observed for each unit:
If 𝑇𝑖 = 0: 𝑌𝑖(0) = 𝑌𝑖 and 𝑌𝑖(1) = ?
If 𝑇𝑖 = 1: 𝑌𝑖(0) = ? and 𝑌𝑖(1) = 𝑌𝑖
• (𝐼𝑖 , 𝑇𝑖 , 𝑌𝑖 ) are random; 𝑌𝑖 (1) and 𝑌𝑖 (0) are fixed.
• Quiz: How can 𝑌𝑖 be random when 𝑌𝑖 (0) and 𝑌𝑖 (1) are fixed?
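A minimal sketch of how the observed outcome is assembled from the two fixed potential outcomes. The four units and their values are hypothetical, chosen only for illustration; the point is that the potential outcomes never change while the observed vector depends on which assignment happens to be drawn.

```python
# Hypothetical fixed potential outcomes for four units: (Y_i(0), Y_i(1)).
potential = [(3, 5), (2, 2), (7, 4), (1, 6)]

def observe(potential, assignment):
    """Return observed Y_i: Y_i(1) if T_i = 1, otherwise Y_i(0)."""
    return [y1 if t == 1 else y0 for (y0, y1), t in zip(potential, assignment)]

# Two different treatment assignments applied to the same fixed units.
y_a = observe(potential, [1, 1, 0, 0])
y_b = observe(potential, [0, 0, 1, 1])

print(y_a)  # observed outcomes under the first assignment
print(y_b)  # observed outcomes under the second assignment
```

The list `potential` is identical in both calls; only the assignment vector differs.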
Quantities of Interest
• Treatment Effect (for unit 𝑖):
TE𝑖 ≡ 𝑌𝑖 (1) − 𝑌𝑖 (0)
• Population Average Treatment Effect
\[ \mathrm{PATE} \equiv \frac{1}{N} \sum_{i=1}^{N} \mathrm{TE}_i \]
• Sample Average Treatment Effect
\[ \mathrm{SATE} \equiv \frac{1}{n} \sum_{i \in \{I_i = 1\}} \mathrm{TE}_i \]
Decomposition of Causal Effect Estimation Error
• Difference in means estimator
\[ D \equiv \bar{Y}_1 - \bar{Y}_0 = \left( \frac{1}{n/2} \sum_{i \in \{I_i = 1,\, T_i = 1\}} Y_i \right) - \left( \frac{1}{n/2} \sum_{i \in \{I_i = 1,\, T_i = 0\}} Y_i \right) \]
• Pretreatment confounders: observed 𝑋 ; unobserved 𝑈
• Decomposition
Δ ≡ PATE − 𝐷 (Estimation error)
= Δ𝑆 + Δ𝑇
= (Δ𝑆𝑋 + Δ𝑆𝑈 ) + (Δ𝑇𝑋 + Δ𝑇𝑈 )
Error due to sample selection (Δ𝑆) and treatment imbalance (Δ𝑇), each split into a component due to observed (𝑋𝑖) and a component due to unobserved (𝑈𝑖) covariates
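The first level of the decomposition can be checked numerically. The sketch below uses a hypothetical population of eight units with made-up potential outcomes, and takes Δ𝑇 to be SATE − 𝐷, which is what Δ = PATE − 𝐷 and Δ𝑆 = PATE − SATE jointly imply. The further split into 𝑋 and 𝑈 components is not attempted here.

```python
from statistics import mean

# Hypothetical finite population (N = 8) with fixed potential outcomes.
y0 = [3, 2, 7, 1, 4, 6, 5, 2]
y1 = [5, 2, 4, 6, 4, 9, 8, 3]
selected = [1, 1, 1, 1, 0, 0, 0, 0]  # I_i: n = 4 sampled units
treated = [1, 0, 1, 0, 0, 0, 0, 0]   # T_i: n/2 = 2 treated units in the sample

units = range(len(y0))
te = [y1[i] - y0[i] for i in units]  # unit-level treatment effects

pate = mean(te)
sate = mean(te[i] for i in units if selected[i])

# Difference in means: Y_i(1) is observed for sampled treated units,
# Y_i(0) for sampled control units.
d = (mean(y1[i] for i in units if selected[i] and treated[i])
     - mean(y0[i] for i in units if selected[i] and not treated[i]))

delta = pate - d       # total estimation error
delta_s = pate - sate  # sample selection error
delta_t = sate - d     # treatment imbalance error (implied definition)

print(delta, delta_s, delta_t)
```

In a real study only `d` is computable; the other quantities require both potential outcomes for every unit.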
Decomposing Selection Error
Δ = Δ𝑆 + Δ𝑇 = (Δ𝑆𝑋 + Δ𝑆𝑈 ) + Δ𝑇
• Definition
\[ \Delta_S \equiv \mathrm{PATE} - \mathrm{SATE} = \frac{N - n}{N}\,(\mathrm{NATE} - \mathrm{SATE}) \]
where NATE is the nonsample ATE
• Δ𝑆 vanishes if
• the sample is a census (𝐼𝑖 = 1 for all observations and 𝑛 = 𝑁);
• SATE = NATE (i.e., nothing to correct); or
• the quantity of interest is switched from PATE to SATE (recommended!)
• Δ𝑆𝑋 = 0 when the empirical distribution of (observed) 𝑋 is identical in population and sample:
\[ \tilde{F}(X \mid I = 0) = \tilde{F}(X \mid I = 1). \]
• Δ𝑆𝑈 = 0 when the empirical distribution of (unobserved) 𝑈 is identical in population and sample:
\[ \tilde{F}(U \mid I = 0) = \tilde{F}(U \mid I = 1). \]
• Unverifiable: 𝑋 is unobserved out of sample; 𝑈 is unobserved
• Δ𝑆𝑋 vanishes when weighting on 𝑋 (provided examples of the needed 𝑋 values exist in the sample)
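The second form of Δ𝑆 in the definition on this slide follows from writing PATE as a weighted average of the sample and nonsample effects. A short derivation, assuming NATE denotes the average of TE𝑖 over the 𝑁 − 𝑛 units with 𝐼𝑖 = 0:

```latex
\begin{align*}
\mathrm{PATE} &= \frac{1}{N}\sum_{i=1}^{N}\mathrm{TE}_i
  = \frac{n}{N}\,\mathrm{SATE} + \frac{N-n}{N}\,\mathrm{NATE}, \\
\Delta_S &= \mathrm{PATE} - \mathrm{SATE}
  = \frac{N-n}{N}\,\mathrm{NATE} - \left(1 - \frac{n}{N}\right)\mathrm{SATE}
  = \frac{N-n}{N}\,(\mathrm{NATE} - \mathrm{SATE}).
\end{align*}
```

Both census (𝑛 = 𝑁) and SATE = NATE set the final expression to zero, matching the conditions listed on the slide.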
Decomposing Treatment Imbalance
Δ = Δ𝑆 + Δ𝑇 = Δ𝑆 + (Δ𝑇𝑋 + Δ𝑇𝑈 )
• Δ𝑇𝑋 = 0 when 𝑋 is balanced between treated and control units:
\[ \tilde{F}(X \mid T = 1, I = 1) = \tilde{F}(X \mid T = 0, I = 1). \]
Verifiable; generated ex ante by blocking or ex post via matching or modeling
• Δ𝑇𝑈 = 0 when 𝑈 is balanced between treated and control units:
\[ \tilde{F}(U \mid T = 1, I = 1) = \tilde{F}(U \mid T = 0, I = 1). \]
Unverifiable; achieved only by assumption or, on average, by random treatment assignment
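The balance condition on 𝑋 is verifiable because both empirical distributions can be tabulated from the sample. The sketch below is one hypothetical way to generate exact balance ex ante: units are grouped into blocks by their 𝑋 value and half of each block is randomly treated. The function name, the toy covariate, and the even block sizes are all assumptions made for illustration.

```python
import random
from collections import Counter

def blocked_assignment(x, rng):
    """Randomly treat half of the units within each block of identical X.

    Assumes every block contains an even number of units.
    """
    blocks = {}
    for i, xi in enumerate(x):
        blocks.setdefault(xi, []).append(i)
    t = [0] * len(x)
    for members in blocks.values():
        rng.shuffle(members)                      # random order within the block
        for i in members[: len(members) // 2]:    # first half is treated
            t[i] = 1
    return t

x = ["a", "a", "b", "b", "b", "b", "c", "c"]  # hypothetical observed covariate
t = blocked_assignment(x, random.Random(0))

# Empirical distributions of X among treated and control units.
treated_dist = Counter(xi for xi, ti in zip(x, t) if ti == 1)
control_dist = Counter(xi for xi, ti in zip(x, t) if ti == 0)
print(treated_dist == control_dist)
```

No analogous check exists for 𝑈, since it is never observed.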
Alternative Quantities of Interest: For Matching
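• Population average treatment effect on the treated
\[ \mathrm{PATT} \equiv \frac{1}{N^*} \sum_{i \in \{T_i = 1\}} \mathrm{TE}_i \]
($N^* = \sum_{i=1}^{N} T_i$: number of treated units in population)
• Sample average treatment effect on the treated
\[ \mathrm{SATT} \equiv \frac{1}{n/2} \sum_{i \in \{I_i = 1,\, T_i = 1\}} \mathrm{TE}_i \]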
• Analogous estimation error decomposition holds:
Δ′ = PATT − 𝐷 = (Δ′𝑆𝑋 + Δ′𝑆𝑈 ) + (Δ′𝑇𝑋 + Δ′𝑇𝑈 )
• Quiz: Why PATT and SATT rather than PATE and SATE for
matching?
• Quiz: How do they differ in randomized experiments?
Effects of Design Components on Estimation Error
Δ = Δ𝑆 + Δ𝑇 = (Δ𝑆𝑋 + Δ𝑆𝑈 ) + (Δ𝑇𝑋 + Δ𝑇𝑈 )
Design Choice                         | Δ𝑆𝑋   | Δ𝑆𝑈   | Δ𝑇𝑋   | Δ𝑇𝑈
Random sampling                       | avg=0 | avg=0 |       |
Complete stratified random sampling   | =0    | avg=0 |       |
Focus on SATE rather than PATE        | =0    | =0    |       |
Weighting for nonrandom sampling      | =0    | =?    |       |
Large sample size                     | →?    | →?    | →?    | →?
Random treatment assignment           |       |       | avg=0 | avg=0
Complete blocking                     |       |       | =0    | =?
Exact matching                        |       |       | =0    | =?
Assumption                            |       |       |       |
No selection bias                     | avg=0 | avg=0 |       |
Ignorability                          |       |       |       | avg=0
No omitted variables                  |       |       |       | =0
(avg=0: equal to zero on average)
Comparing Blocking (i.e., before) and Matching (i.e., after)
• Adding blocking (on pretreatment variables related to the outcome) to random assignment: as efficient or more efficient, and never biased
• Blocking: like regression adjustment in which the functional form and the parameter values are known
• Matching is like blocking, except:
• to avoid selection error: change the QOI from PATE to PATT/SATT
• random treatment assignment following matching: impossible
• exact matching, unlike blocking: depends on good matches being available in already-collected data
• Worst-case scenario: matching on the wrong variables (as with regression adjustment) can increase bias
• Adding matching to a parametric model: reduces model dependence and bias, and sometimes variance too
• Quiz: Which is preferable: Matching or Blocking?
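The efficiency claim for blocking can be checked by brute force on a toy example. The sketch below enumerates every possible assignment under complete randomization and under blocking on a binary 𝑋, then compares the mean and variance of the difference in means. The eight units, their outcomes, and the constant treatment effect of 2 are hypothetical; 𝑋 is deliberately made a strong predictor of the outcome.

```python
from itertools import combinations, product
from statistics import mean, pvariance

# Hypothetical sample: X strongly predicts Y(0); every unit has TE = 2.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y0 = [1, 2, 3, 4, 11, 12, 13, 14]
y1 = [v + 2 for v in y0]
units = range(len(x))

def diff_in_means(treated_units):
    """Difference in means D for a given set of treated unit indices."""
    treated_units = set(treated_units)
    return (mean(y1[i] for i in treated_units)
            - mean(y0[i] for i in units if i not in treated_units))

# Complete randomization: any 4 of the 8 units may be treated.
complete = [diff_in_means(c) for c in combinations(units, 4)]

# Blocking on X: exactly 2 treated units within each X block.
block0 = [i for i in units if x[i] == 0]
block1 = [i for i in units if x[i] == 1]
blocked = [diff_in_means(a + b)
           for a, b in product(combinations(block0, 2), combinations(block1, 2))]

print("mean of D:", mean(complete), mean(blocked))
print("variance of D:", pvariance(complete), pvariance(blocked))
```

Both designs are centered on the true effect in this example; the blocked design simply removes the assignments in which 𝑋 is badly imbalanced.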
Research Designs
The Benefits of Major Research Designs: Overview
Design                                                   | Δ𝑆𝑋 | Δ𝑆𝑈 | Δ𝑇𝑋   | Δ𝑇𝑈
Ideal experiment                                         | →0  | →0  | =0    | →0
Randomized clinical trials (limited or no blocking)      | ≠0  | ≠0  | avg=0 | avg=0
Randomized clinical trials (full blocking)               | ≠0  | ≠0  | =0    | avg=0
Social science field experiment (limited or no blocking) | ≠0  | ≠0  | →0    | →0
Survey experiment (limited or no blocking)               | →0  | →0  | →0    | →0
Observational study (representative data set, well-matched) | ≈0 | ≈0 | ≈0 | ≠0
Observational study (unrepresentative but partially correctable data, well-matched) | ≈0 | ≠0 | ≈0 | ≠0
Observational study (unrepresentative data set, well-matched) | ≠0 | ≠0 | ≈0 | ≠0
• → 0: 𝐸(𝑄) = 0 and lim𝑛→∞ Var(𝑄) = 0
• avg=0: 𝐸(𝑄) = 0
The Ideal Experiment (according to the paper)
• Random selection from well-defined population
• large 𝑛
• blocking on all known confounders
• random treatment assignment within blocks
• 𝐸(Δ𝑆𝑋 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑆𝑋 ) = 0
• 𝐸(Δ𝑆𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑆𝑈 ) = 0
• Δ𝑇𝑋 = 0
• 𝐸(Δ𝑇𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑇𝑈 ) = 0
• Quiz: Is there an even more ideal experiment?
• Hint: How can we make Δ𝑆𝑋 = 0?
An Even More Ideal Experiment (not in the paper)
• Begin with a well-defined population
• New feature: Define sampling strata based on
cross-classification of all known confounders
• Random sampling within strata
• (if each stratum's sample size ∝ its population size, no weights are needed)
• large 𝑛
• blocking on all known confounders
• random treatment assignment within blocks
• Δ𝑆𝑋 = 0
• 𝐸(Δ𝑆𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑆𝑈 ) = 0
• Δ𝑇𝑋 = 0
• 𝐸(Δ𝑇𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑇𝑈 ) = 0
• Wait, why wasn’t this in the paper?
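The stratified sampling step can be sketched as follows. Strata are defined by a single hypothetical confounder with two values, and the same sampling fraction is applied within each stratum, so the sample reproduces the population distribution of 𝑋 exactly and no weights are needed. The function name and the 10% sampling fraction are illustrative assumptions.

```python
import random
from collections import Counter

def proportional_stratified_sample(x, fraction, rng):
    """Sample the same fraction of units at random within every stratum of X."""
    strata = {}
    for i, xi in enumerate(x):
        strata.setdefault(xi, []).append(i)
    sample = []
    for members in strata.values():
        k = round(fraction * len(members))   # stratum sample size
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 40 units with X = "a", 60 with X = "b".
population_x = ["a"] * 40 + ["b"] * 60
sample = proportional_stratified_sample(population_x, 0.1, random.Random(0))

pop_share = {k: v / len(population_x) for k, v in Counter(population_x).items()}
sample_counts = Counter(population_x[i] for i in sample)
sample_share = {k: v / len(sample) for k, v in sample_counts.items()}

print(pop_share)
print(sample_share)
```

With many confounders the cross-classification produces many small strata, so exact proportionality may require rounding stratum sizes.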
Randomized Clinical Trials (Little or no Blocking)
• nonrandom selection
• small 𝑛
• little or no blocking
• random treatment assignment
• Δ𝑆𝑋 ≠ 0
• Δ𝑆𝑈 ≠ 0
• 𝐸(Δ𝑇𝑋 ) = 0
• 𝐸(Δ𝑇𝑈 ) = 0
Randomized Clinical Trials (Full Blocking)
• nonrandom selection
• small 𝑛
• Full blocking
• random treatment assignment
• Δ𝑆𝑋 ≠ 0
• Δ𝑆𝑈 ≠ 0
• Δ𝑇𝑋 = 0
• 𝐸(Δ𝑇𝑈 ) = 0
Social Science Field Experiment
• nonrandom selection
• large 𝑛
• limited or no blocking
• random treatment assignment
• Δ𝑆𝑋 ≠ 0 or change PATE to SATE and Δ𝑆𝑋 = 0
• Δ𝑆𝑈 ≠ 0 or change PATE to SATE and Δ𝑆𝑈 = 0
• 𝐸(Δ𝑇𝑋 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑇𝑋 ) = 0
• 𝐸(Δ𝑇𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑇𝑈 ) = 0
Survey Experiment
• random selection
• large 𝑛
• limited or no blocking
• random treatment assignment
• (only treatments: question wording changes)
• 𝐸(Δ𝑆𝑋 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑆𝑋 ) = 0
• 𝐸(Δ𝑆𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑆𝑈 ) = 0
• 𝐸(Δ𝑇𝑋 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑇𝑋 ) = 0
• 𝐸(Δ𝑇𝑈 ) = 0, lim𝑛→∞ 𝑉 (Δ𝑇𝑈 ) = 0
Observational Study, well-matched
• no stratification, nonrandom selection
• large 𝑛
• no blocking, nonrandom treatment assignment
• Δ𝑆𝑋 ≈ 0 if representative, corrected by weighting, or for
estimating SATE; or ≠ 0 otherwise
• Δ𝑆𝑈 ≠ 0
• Δ𝑇𝑋 ≈ 0 (due to matching well)
• Δ𝑇𝑈 ≠ 0 except by assumption
Issues in Ideal Designs
What is the Best Design?
• Ideal design: rarely feasible
• Effort in experimental studies: random assignment
• Effort in observational studies: knowing, measuring, and
adjusting for 𝑋 (via matching or modeling)
• Achilles' heel of experiments: Δ𝑆, small 𝑛
• Achilles' heel of observational studies: Δ𝑇
• Each design is best suited to its own applications
• Quiz: Astronomers never randomize; is astronomy a science?
Fallacies in Experimental Research
• Failure to block on all available confounders
• incorrectly seen as requiring fewer assumptions (about what
to block on)
• In fact, blocking helps (except in strange situations)
• Blocking on relevant covariates is better, so choose carefully.
• “Block what you can and randomize what you cannot”
• t-test to check balance after random treatment assignment
• blocking vars: balance exactly after treatment assignment; if
you’re checking, you missed an opportunity to increase
efficiency
• if vars become available after treatment assignment: t-test
checks if randomization was done appropriately
• randomization balances on average: any one random assignment is not balanced exactly (which is why it's better to block)
The Balance Test Fallacy in Matching Research
[Figure: two panels, each with "Number of Controls Randomly Dropped" (0 to 400) on the horizontal axis. One panel plots the t-statistic and marks a "Statistical insignificance" region; the other plots, on the math test score scale, the Difference in Means and the QQ Plot Mean Deviation.]
Quiz: randomly dropping observations reduces imbalance??
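One way to approach the quiz is to look at the formula for the t-statistic directly. The sketch below assumes the usual unequal-variance two-sample t-statistic and hypothetical values for the difference in means and the standard deviations; none of the numbers are read off the figure. The imbalance is held fixed and only the number of controls shrinks.

```python
from math import sqrt

def t_statistic(diff, sd_treated, sd_control, n_treated, n_control):
    """Two-sample (unequal-variance) t-statistic for a difference in means."""
    return diff / sqrt(sd_treated**2 / n_treated + sd_control**2 / n_control)

# Hypothetical imbalance, held fixed: random dropping does not change
# the difference in means or the standard deviations in expectation.
diff, sd_treated, sd_control, n_treated = 5.0, 20.0, 20.0, 100

n_controls = [400, 300, 200, 100, 50]
t_values = [t_statistic(diff, sd_treated, sd_control, n_treated, n_c)
            for n_c in n_controls]

for n_c, t in zip(n_controls, t_values):
    print(f"controls kept = {n_c:3d}  t = {t:.2f}")
```

The difference in means is the same in every row, yet the statistic moves toward the "insignificance" region purely because fewer observations remain.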
The Balance Test Fallacy: Explanation
• Hypothesis tests reflect both balance and power; we only want to assess balance
• Balance is observed: no need for a superpopulation or inference
• Simple linear model (for intuition):
• Suppose 𝐸(𝑌 ∣ 𝑇, 𝑋) = 𝜃 + 𝑇𝛽 + 𝑋𝛾
• Bias in the coefficient on 𝑇 from regressing 𝑌 on 𝑇 (without 𝑋): $E(\hat{\beta} - \beta \mid T, X) = G\gamma$ (where 𝐺 contains the coefficients on 𝑇 from a regression of 𝑋 on a constant and 𝑇)
• Imbalance: 𝐺; importance: 𝛾
• If 𝐺 = 0, bias = 0
• If 𝐺 ≠ 0, bias can be any size (due to 𝛾)
• To reduce bias: reduce 𝐺 without limit
• No threshold level is safe
• But pruning too much increases variance
• Quiz: Should we match on vars that do not influence 𝑌 ?
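The bias expression in the linear model above can be verified on a toy data set. With a binary 𝑇, the coefficient on 𝑇 from a regression on a constant and 𝑇 equals the difference in group means, so no regression library is needed. The parameter values and the six observations are hypothetical, and the outcome is generated without noise so that the identity holds exactly rather than in expectation.

```python
from statistics import mean

# Hypothetical parameters of E(Y | T, X) = theta + T*beta + X*gamma.
theta, beta, gamma = 1.0, 2.0, 3.0

t = [1, 1, 1, 0, 0, 0]
x = [2.0, 3.0, 4.0, 1.0, 2.0, 3.0]  # X is imbalanced across treatment groups
y = [theta + ti * beta + xi * gamma for ti, xi in zip(t, x)]  # noise-free

def coef_on_binary(outcome, t):
    """Coefficient on binary t from regressing outcome on a constant and t.

    For a binary regressor this equals the difference in group means.
    """
    return (mean(o for o, ti in zip(outcome, t) if ti == 1)
            - mean(o for o, ti in zip(outcome, t) if ti == 0))

beta_hat = coef_on_binary(y, t)  # regression of Y on T, omitting X
g = coef_on_binary(x, t)         # imbalance G: regression of X on T
bias = beta_hat - beta

print(bias, g * gamma)
```

Scaling `gamma` up or down changes the bias proportionally while leaving `g` untouched, which is why no fixed threshold on imbalance alone is safe.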