0% found this document useful (0 votes)

52 views61 pages

Kernel Matching

L'appariement

Uploaded by

Kingue bébé

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

52 views61 pages

Kernel Matching

L'appariement

Uploaded by

Kingue bébé

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 61

Kernel matching with automatic bandwidth selection

Ben Jann

University of Bern, ben.jann@soz.unibe.ch

2017 London Stata Users Group meeting

London, September 7–8, 2017

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 1

Contents

1 Background
What is Matching?
Multivariate Distance Matching (MDM)
Propensity Score Matching (PSM)
Matching Algorithms
“Why PSM Should Not Be Used for Matching”

2 The kmatch command

Features
Examples
Some Simulation Results

3 Conclusions

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 2

What is Matching?

Matching is an approach to “condition on X ” between a treatment

group and a control group.

Basic idea:
1. For each observation in the treatment group, find “statistical twins” in
the control group with the same (or at least very similar) X values.
2. The Y values of these matching observations are then used to
compute the counterfactual outcome without treatment for the
observation at hand.
3. An estimate for the average treatment effect can be obtained as the
mean of the differences between the observed values and the
“imputed” counterfactual values over all observations.

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 3

What is Matching?

Formally:
1 X h i X
[ =
ATT Yi − Ŷi0 with Ŷi0 = wij Yj
N T =1
i|T =1 j|T =0

1 X h i X
[=
ATC Ŷi1 − Yi with Ŷi1 = wij Yj
N T =0
i|T =0 j|T =1

T =1 T =0
d =N
ATE [+N
· ATT · ATC
[
N N

Different matching algorithms use different definitions of wij .

ATE : average treatment effect; ATT : a.t.e. on the treated; ATC : a.t.e. on the untreated
T : treatment indicator (0/1)
Y : observed outcome; Y 1 ; potential outcome with treatment; Y 0 : p.o. without treatment

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 4

Exact Matching

Exact matching:
(
1/ki if Xi = Xj
wij =
0 else
with ki as the number of observations for which Xi = Xj applies.

The result equivalent to “perfect stratification” or “subclassification”

(see, e.g., Cochran 1968).

Problem: If X contains several variables there is a large probability

that no exact matches can be found for many observations (the
“curse of dimensionality”).

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 5

Multivariate Distance Matching (MDM)

An alternative is to match based on a distance metric that measures

the proximity between observations in the multivariate space of X .

The idea then is to use observations that are “close”, but not
necessarily equal, as matches.

A common approach is to use

q
MD(Xi , Xj ) = (Xi − Xj )0 Σ−1 (Xi − Xj )

as distance metric, where Σ is an appropriate scaling matrix.

I Mahalanobis matching: Σ is the covariance matrix of X .
I Euclidean matching: Σ is the identity matrix.
I Mahalanobis matching is equivalent to Euclidean matching based on
standardized and orthogonalized X .

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 6

Propensity Score Matching (PSM)

(Y 0 , Y 1 ) ⊥⊥ T | X implies (Y 0 , Y 1 ) ⊥
⊥ T | π(X ), where π(X ) is the
treatment probability conditional on X (the “propensity score”)
(Rosenbaum and Rubin 1983).

This simplifies the matching task as we can match on

one-dimensional π(X ) instead of multi-dimensional X .

Procedure
I Step 1: Estimate the propensity score, e.g. using a Logit model.
I Step 2: Apply a matching algorithm using differences in the
propensity score, |π̂(Xi ) − π̂(Xj )|, instead of multivariate distances.

PSM is very popular

I https://scholar.google.ch/scholar?q="propensity+score"+AND+
(matching+OR+matched+OR+match)

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 7

Matching Algorithms

Various matching algorithms can be used to find potential matches

based on MD or π̂(X ) and determine the matching weights wij .
Pair matching (one-to-one matching without replacement)
I For each observation in the treatment group find the closest
observation in the control group. Each control is only used once.

Nearest-neighbor matching (with replacement)

I For each observation in the treatment group find the k closest
observations in the control group. A single control can be used
multiple times. In case of ties, use all ties as matches. k is set by the
researcher.

Caliper matching
I Like nearest-neighbor matching, but only use controls with a distance
smaller than some threshold c.

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 8

Matching Algorithms

Radius matching
I Use all controls with a distance smaller than some threshold c.

Kernel matching
I Like radius matching, but give larger weight to controls with smaller
distances (using some kernel function such as, e.g., the Epanechnikov
kernel).

Optional: remove remaining imbalance after matching using

regression adjustment (a.k.a. “bias correction” in the context of
nearest-neighbor matching).

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 9

“Why PSM Should Not Be Used for Matching”
The message of a recent paper by Gary King and Richard Nielsen is:
Do not use PSM, it is really, really bad.
I The paper: http://j.mp/1sexgVw
I Slides: https://gking.harvard.edu/presentations/
why-propensity-scores-should-not-be-used-matching-6
I Watch it: https://www.youtube.com/watch?v=rBv39pK1iEs

Their argument goes about as follows:

I In experimental language, PSM approximates complete randomization.
I Other methods such as MDM approximate fully blocked
randomization.
I A fully blocked design is more efficient. It leads to less data imbalance
and less “model dependence” (dependence of results on modeling
decisions by the researcher).
I Hence, procedures such as MDM dominate PSM.
I King and Nielsen provide evidence suggesting that PSM performs
shockingly bad.
Ben Jann (University of Bern) Kernel matching London, 07.09.2017 10
Matching: Finding Hidden Randomized Experiments
Types of Experiments

Balance Complete Fully

Covariates: Randomization Blocked
Observed On average Exact
Unobserved On average On average

Fully blocked dominates complete randomization for:

imbalance, model dependence, power, efficiency, bias, research
costs, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

(slides by King and Nielsen)

Goal of Each Matching Method (in Observational Data)
• PSM: complete randomization
• Other methods: fully blocked
• Other matching methods dominate PSM (wait, it gets worse)

6
Ben Jann (University of Bern) Kernel matching London, 07.09.2017 11
Best Case: Mahalanobis Distance Matching

80
C
C C C
C
C C C C CCCCC C CCCCCC C C C
70
CTCC CC
C
C
C CCCC C CCCC CC TT
C C
T
C CCCT
C
C C C CC
C CC CC C CC C C C C
CC CCCC CCCCC TC C CC
CCC CCCCC
CCC CC
C C C CCC CC
C C CC CC CCCC C CC
CC CC CC
C
C
CC C
C
C
C CCCTC
C CCC C C CCCC
C CTC
60 C C T
CC CC C C C CC CCT CC C
CC CCC CCCTC TC CCT
CT C C CCC C
C
C
CCC C CCCCC C CC CC TCCC
C TCC C CCC CCCC C
C CCCCC CCC C C
T
C C
C TCCCC C C C
Age 50 CC
CC C CCC
CC CCTCC CCCC C C C T CC
C CCT CCC
C C T
C CC TCC
C CCC
TC
C CC C
CT
CCC C C C C C C C C
T CC
C C C CCC CC
C CCC
C C C
T C C CC CCC CC C TC
CT
CC
C C C
T
T
C C C C CC C T
C CC
T CC
C C
40 CCCCC C C CC CC
CCC
C C C
T
C CC C TCC CCCC
TC
C C T CCCCCCT T C CC
TC CCC T
CC C CCCC
CC
CCC
T CCCCCC C
CC
C C CCC C C CT C C
C CC
T C TC
C C CCCCCC
CC
C C CCC CC C C TC
CC C T
C C
T C CCCC
CC C C CCC CCC C
T CCCC

(slides by King and Nielsen)

20
12 14 16 18 20 22 24 26 28

Education (years) 9/23

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 12

Best Case: Mahalanobis Distance Matching

70 C
T TT
CC T
C
T
C
C
T
T
C C
T
60 T
C T
C C
T
TT
CT C
C C
T
T
C T
C
TC
Age 50 C
T T
T
C T
C T
C
CT C
C T
T C T
TC
CTC
T
C T CTT
40 CT T
C TC
C
T C
C T C T C
TC T C
T
C
T TC
C T C C
TCT
T T
C

(slides by King and Nielsen)

20
12 14 16 18 20 22 24 26 28

Education (years) 9/23

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 12

Best Case: Propensity Score Matching
80
C
CC C C C C C CCCCCC CC CCC C 1
70
C
C
TCCCC
C
C CCC CCCCC
C CC C T
TCC C CCCC
C C C C CC C CT C
C CC T
CCC C CC CCCCC
C CC C CCC CTCC CCCCC
C
C CCC CCTC
C C
CCCCCCCCC
CC CC CCCC C
C C
C
CCCCC CCCCCC C CC
C
C C C
CCC
C CTC
60 C
C CCCT
C C C
TC C T
C CC
C CCC CCC C
C
C CC C CCCCCC
CC
C C C
TCCC
C
C
T
C C
T
C
T C
C CCC CC
CCCCCC
C
CC C CC C C CT
CC CCC C CCC C
C
CC
T CC TCC
C CC TCCC C CC
Age 50 CC C C
CCC
CC
C CCC
C
CCCCC
TC CCT C C CT C CC
T
CC C
C C
C
C
CT
CCCC C C CCCC
CC CC
T CC CCCC C C
CC
C T
TCC
C
CT
TCC
C
CC
CCCCCCC
C
CC C CCCC
CT
T
C C TTC
C CCCCC
40 CCCC C
C CCC CC C
CCT
CCC C
CC C TCC CCC
TC
CC T CC CCCCC
T T
C
T CC C
C C C
C
T
C C C CCCCCC
CC C
TC CC
CCC
TC C CCCCCC CC T C C
C C
T
C C C CC
CC CCC C C TCC
CCT C
T CCC TC
C CC
30 CC CC C CC C CCCCCCCC CC C

(slides by King and Nielsen)

20
0
12 16 20 24 28
Propensity
Education (years) Score
15/23

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 12

(slides by King and Nielsen)

20
0
12 16 20 24 28
Propensity
Education (years) Score
15/23

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 12

Best Case: Propensity Score Matching is Suboptimal
80

70 T C TT
C CC
C T TC
T C
TC T
60 T C T C C
T C
C
T
C T
T
CTCTT
C C
CT T C
Age 50 CC T T
CT TC T CCT
T TTC C C
CTTT C T C
40 C T
C C C
T T TCCC
T T
C T T C
T C T T C T T C
C TCCT T TC
30

(slides by King and Nielsen)

20
12 16 20 24 28

Education (years)
15/23

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 12

“Why PSM Should Not Be Used for Matching”
Are King and Nielsen right?
I For a given sample size (as in an experiment with fixed budget), fully
blocked randomization is more efficient than complete randomization.
Things are less clear if blocking reduces the sample size, as in
matching.
I The complete randomization analogy only works for observations with
the same propensity score. If X has a strong effect on T , there is a
lot of blocking also in PSM.
I King and Nielson’s examples illustrating the bad performance of PSM
seem to be based on pair matching without replacement. Pair
matching throws away a lot of data. For PSM, pair matching is
particularly bad because a lot of good data (i.e. observations with the
same PS) is thrown away (“random pruning”).
I The performance of PSM should be alright for matching algorithms
that do not engage in random pruning, such as radius or kernel
matching.

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 13

The kmatch command

New matching software for Stata.

Partly written in response to the paper by King and Nielsen.

Available from SSC (ssc install kmatch).

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 14

Key Features
Type of matching
I Multivariate Distance Matching (MDM)
I Propensity Score Matching (PSM)
I MDM combined with PSM
I MDM and PSM combined with exact matching
Matching algorithms
I Kernel matching, including ridge and local-linear matching
I Nearest-neighbor matching, optionally with caliper
I Optional regression adjustment
Several automatic bandwidth selectors for kernel matching
Joint analysis of multiple subgroups and multiple outcome variables
Various post-estimation commands for balancing and
common-support diagnostics
Computationally efficient

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 15

Examples: Mahalanobis-Distance Kernel Matching
Estimation of the “effect” of union membership on wages using the
NLSW 1988 data.
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if industry==2
(4 observations deleted)
. kmatch md union collgrad ttl_exp tenure i.industry i.race south ///
> (wage), nate att
(computing bandwidth ... done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 432 25 457 1105 291 1396 1.3394

Treatment-effects estimation

wage Coef.

ATT .6059013
NATE 1.432913

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 16

Examples: Balancing Statistics
. kmatch summarize
(refitting the model using the generate() option)

Raw Matched(ATT)
Means Treated Untrea~d StdDif Treated Untrea~d StdDif

collgrad .321663 .224212 .219912 .319444 .319444 0

ttl_exp 13.2685 12.7323 .117584 13.3205 13.1425 .039036
tenure 7.89205 6.17658 .29735 7.91744 7.58347 .057888
3.industry .006565 .012178 -.058246 .00463 .00463 0
4.industry .183807 .166905 .044425 .185185 .185185 0
5.industry .105033 .027937 .312944 .085648 .085648 0
6.industry .045952 .169771 -.407129 .048611 .048611 0
7.industry .019694 .102436 -.350657 .020833 .020833 0
8.industry .017505 .035817 -.113785 .009259 .009259 0
9.industry .010941 .040115 -.185669 .011574 .011574 0
10.industry .004376 .008596 -.052551 .002315 .002315 0
11.industry .479212 .356734 .250073 .506944 .506944 0
12.industry .122538 .07235 .169707 .12037 .12037 0
2.race .330416 .244986 .189418 .3125 .3125 0
3.race .017505 .011461 .050566 .006944 .006944 0
south .297593 .466332 -.352408 .291667 .291667 0

Raw Matched(ATT)
Variances Treated Untrea~d Ratio Treated Untrea~d Ratio

collgrad .218674 .174066 1.25628 .217904 .217904 1

ttl_exp 20.5898 21.0001 .980459 19.8177 18.2323 1.08696
tenure 37.2044 29.3629 1.26706 37.0399 34.9543 1.05966
3.industry .006536 .012038 .542928 .004619 .004619 1
4.industry .150351 .139148 1.08052 .151242 .151242 1
5.industry .094207 .027176 3.46656 .078494 .078494 1
6.industry .043936 .14105 .311496 .046355 .046355 1
7.industry .019348 .092008 .210287 .020447 .020447 1
8.industry .017237 .034559 .498769 .009195 .009195 1
9.industry .010845 .038533 .281445 .011467 .011467 1
Ben Jann (University
10.industry .004367of Bern)
.008528 .512039 Kernel
.002315 matching
.002315 1 London, 07.09.2017 17
Examples: Make a Graph of the Balancing Statistics
. mat M = r(M)
. mat V = r(V)
. coefplot matrix(M[,3]) matrix(M[,6]) || matrix(V[,3]) matrix(V[,6]) || , ///
> bylabels("Std. mean difference" "Variance ratio") ///
> noci nolabels byopts(xrescale)
. addplot 1: , xline(0) norescaling legend(order(1 "Raw" 2 "Matched"))
. addplot 2: , xline(1) norescaling

Std. mean difference Variance ratio

collgrad
ttl_exp
tenure
3.industry
4.industry
5.industry
6.industry
7.industry
8.industry
9.industry
10.industry
11.industry
12.industry
2.race
3.race
south
-.4 -.2 0 .2 .4 0 1 2 3 4

Raw Matched

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 18

Examples: Propensity-Score Kernel Matching

. kmatch ps union collgrad ttl_exp tenure i.industry i.race south ///

> (wage), nate att
(computing bandwidth ... done)
Propensity-score kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Covariates: collgrad ttl_exp tenure i.industry i.race south
PS model : logit (pr)
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 431 26 457 1214 182 1396 .00188

Treatment-effects estimation

wage Coef.

ATT .3887224
NATE 1.432913

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 19

Examples: Density Balancing Plot
. kmatch density, lw(*6 *2) lc(*.5 *1)
(refitting the model using the generate() option)
(applying 0-1 boundary correction to density estimation of propensity score)
(bandwidth for propensity score = .06803989)

Raw Matched (ATT)

3
2
Density
1
0

0 .2 .4 .6 .8 0 .2 .4 .6 .8
Propensity score
Untreated Treated

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 20

Examples: Cumulative Distribution Balancing Plot
. kmatch cumul, lw(*6 *2) lc(*.5 *1)
(refitting the model using the generate() option)

1 Raw Matched (ATT)

Cumulative probability
.5
0

0 .2 .4 .6 .8 0 .2 .4 .6 .8
Propensity score
Untreated Treated

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 21

Examples: Balancing Box Plot
. kmatch box
(refitting the model using the generate() option)

Raw Matched (ATT)

.8
.6
Propensity score
.4
.2
0

Untreated Treated

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 22

Examples: Standard Errors
. kmatch md union collgrad ttl_exp tenure i.industry i.race south ///
> (wage), nate ate att atc vce(bootstrap)
(computing bandwidth for treated ... done)
(computing bandwidth for untreated ... done)
(running kmatch on estimation sample)
Bootstrap replications (50)
1 2 3 4 5
.................................................. 50
Multivariate-distance kernel matching Number of obs = 1,853
Replications = 50
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 432 25 457 1105 291 1396 1.3394

Untreated 1386 10 1396 455 2 457 3.3975
Combined 1818 35 1853 1560 293 1853 .

Treatment-effects estimation

Observed Bootstrap Normal-based

wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

ATE .4095729 .1920853 2.13 0.033 .0330928 .7860531

ATT .6059013 .2472069 2.45 0.014 .1213846 1.090418
ATC .3483797 .1893653 1.84 0.066 -.0227695 .7195289
NATE 1.432913 .2333282 6.14 0.000 .9755981 1.890228

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 23

Examples: Postestimation Tests

. lincom ATT-NATE
( 1) ATT - NATE = 0

wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

(1) -.8270117 .1810415 -4.57 0.000 -1.181847 -.4721768

. test ATT = ATC

( 1) ATT - ATC = 0
chi2( 1) = 2.42
Prob > chi2 = 0.1200

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 24

Examples: Nearest-Neighbor Matching (1 Neighbor)
. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), att nn
Multivariate-distance nearest-neighbor matching
Number of obs = 1,853
Neighbors: min = 1
Treatment : union = 1 max = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 457 0 457 328 1068 1396 .

Treatment-effects estimation

wage Coef.

ATT .7246969

. teffects nnmatch (wage collgrad ttl_exp tenure i.industry i.race south) (union), atet
Treatment-effects estimation Number of obs = 1,853
Estimator : nearest-neighbor matching Matches: requested = 1
Outcome model : matching min = 1
Distance metric: Mahalanobis max = 1

AI Robust
wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

ATET
union
(union vs nonunion) .7246969 .2942952 2.46 0.014 .147889 1.301505

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 25

Examples: Nearest-Neighbor Matching (5 Neighbors)
. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), att nn(5)
Multivariate-distance nearest-neighbor matching
Number of obs = 1,853
Neighbors: min = 5
Treatment : union = 1 max = 5
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 457 0 457 870 526 1396 .

Treatment-effects estimation

wage Coef.

ATT .5590823

. teffects nnmatch (wage collgrad ttl_exp tenure i.industry i.race south) (union), atet nn(5)
Treatment-effects estimation Number of obs = 1,853
Estimator : nearest-neighbor matching Matches: requested = 5
Outcome model : matching min = 5
Distance metric: Mahalanobis max = 6

AI Robust
wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

ATET
union
(union vs nonunion) .5590823 .2381752 2.35 0.019 .0922675 1.025897

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 26

Examples: Regression Adjustment
. kmatch md union collgrad ttl_exp tenure i.industry i.race south ///
> (wage = collgrad ttl_exp tenure i.industry i.race south), att nn(5)
Multivariate-distance nearest-neighbor matching
Number of obs = 1,853
Neighbors: min = 5
Treatment : union = 1 max = 5
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 457 0 457 870 526 1396 .

Treatment-effects estimation

wage Coef.

ATT .5288023

adjusted for collgrad ttl_exp tenure i.industry i.race south

. teffects nnmatch (wage collgrad ttl_exp tenure i.industry i.race south) ///
> (union), atet nn(5) biasadj(collgrad ttl_exp tenure i.industry i.race south)
Treatment-effects estimation Number of obs = 1,853
Estimator : nearest-neighbor matching Matches: requested = 5
Outcome model : matching min = 5
Distance metric: Mahalanobis max = 6

AI Robust
wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

ATET
union
(union vs nonunion) .5288023 .2420635 2.18 0.029 .0543666 1.003238

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 27

Examples: MDM and PSM combined

. kmatch md union collgrad ttl_exp tenure (wage), att ///

> psvars(i.industry i.race south) psweight(3)
(computing bandwidth ... done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis (modified)
Covariates: collgrad ttl_exp tenure
PS model : logit (pr)
PS covars : i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 439 18 457 1258 138 1396 .83886

Treatment-effects estimation

wage Coef.

ATT .6408443

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 28

Examples: MDM with Exact Matching

. kmatch md union collgrad ttl_exp tenure (wage), att ematch(industry race south)
(computing bandwidth ... done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure
Exact : industry race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 432 25 457 1103 293 1396 1.3013

Treatment-effects estimation

wage Coef.

ATT .6047374

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 29

Examples: Bandwidth Selection
Default: 1.5 times the 90% quantile of the (non-zero) distances in
pair matching with replacement (Huber et al. 2013, 2015).
. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), ///
> att bwidth(pm)
(computing bandwidth ... done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 432 25 457 1105 291 1396 1.3394

Treatment-effects estimation

wage Coef.

ATT .6059013

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 30

Examples: Bandwidth Selection

Cross validation with respect to the means of X .

. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), ///
> att bwidth(cv)
(computing bandwidth ................ done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 448 9 457 1184 212 1396 1.8888

Treatment-effects estimation

wage Coef.

ATT .6651578

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 31

Examples: Bandwidth Selection
. kmatch cvplot, ms(o) index mlabposition(1) sort

.1
3
.08
.06
MSE
.04

4
1
.02

5 6
7 9158 102
12
11
14
13

1.5 2 2.5 3
Bandwidth

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 32

Examples: Bandwidth Selection

Cross validation with respect to Y (Frölich 2004, 2005).

. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), ///
> att bwidth(cv wage)
(computing bandwidth ................ done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 453 4 457 1289 107 1396 2.433

Treatment-effects estimation

wage Coef.

ATT .6928956

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 33

Examples: Bandwidth Selection
. kmatch cvplot, ms(o) index mlabposition(1) sort

12.6

1
12.4

3
MISE
12.2

2
12

5 4
7 14
8
96
11
113
0
15
12
11.8

1.5 2 2.5 3
Bandwidth

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 34

Examples: Bandwidth Selection
Weighted cross validation with respect to Y (Galdo et al. 2008,
Section 4.2).
. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), ///
> att bwidth(cv wage, weighted)
(computing bandwidth ................ done)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 455 2 457 1356 40 1396 2.7626

Treatment-effects estimation

wage Coef.

ATT .7308166

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 35

Examples: Bandwidth Selection
. kmatch cvplot, ms(o) index mlabposition(1) sort

14 13 4
Weighted MISE

5
2
7
12

6
3
9
1012 11
1413
815
11

1 2 3 4 5
Bandwidth

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 36

Examples: Common Support Diagnostics
. kmatch md union collgrad ttl_exp tenure i.industry i.race south (wage), ///
> att bwidth(0.5)
Multivariate-distance kernel matching Number of obs = 1,853
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 366 91 457 701 695 1396 .5

Treatment-effects estimation

wage Coef.

ATT .3303161

. kmatch csummarize
(refitting the model using the generate() option)

Common support (treated) Standardized difference

Means Matched Unmatc~d Total (1)-(3) (2)-(3) (1)-(2)

collgrad .322404 .318681 .321663 .001585 -.006376 .007962

ttl_exp 13.3929 12.7682 13.2685 .027413 -.110253 .137666
tenure 8.12614 6.95055 7.89205 .038378 -.154356 .192734
3.industry .002732 .021978 .006565 -.047404 .190657 -.238061
4.industry .191257 .153846 .183807 .019212 -.077269 .096481
5.industry .062842 .274725 .105033 -.137462 .552867 -.690329
6.industry .057377 0 .045952 .054507 -.219225 .273732
7.industry .019126 .021978 .019694 -.004083 .016423 -.020506
8.industry .005464 .065934 .017505 -.091714 .368871 -.460585
9.industry .010929 .010989 .010941 -.000115 .000462 -.000577
10.industry 0 .021978 .004376 -.066227 .266363 -.332589
11.industry .554645 .175824 .479212 .15083 -.606636 .757467
12.industry .092896 .241758 .122538 -.090299 .363181 -.45348
2.race .243169 .681319 .330416 -.185284 .745209 -.930494
3.race .002732 .076923 .017505 -.112525 .452572 -.565097
south .29235 .318681 .297593 -.011456 .046074 -.05753
Ben Jann (University of Bern) Kernel matching London, 07.09.2017 37
Examples: Make a Graph of Common Support Statistics
. mat M = r(M)
. coefplot matrix(M[,4]), noci nolabels xline(0) ///
> title("Std. difference between matched and original")

Std. difference between matched and original

collgrad
ttl_exp
tenure
3.industry
4.industry
5.industry
6.industry
7.industry
8.industry
9.industry
10.industry
11.industry
12.industry
2.race
3.race
south
-.2 -.1 0 .1 .2

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 38

Examples: Multiple Outcome Variables
. kmatch md union collgrad ttl_exp tenure i.industry i.race south ///
> (wage hours), nate att
(computing bandwidth ... done)
Multivariate-distance kernel matching Number of obs = 1,852
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 432 25 457 1104 291 1395 1.3392

Treatment-effects estimation

Coef.

wage
ATT .6021049
NATE 1.430823

hours
ATT 1.263759
NATE 1.450303

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 39

Examples: Varying Regression-Adjustment Equations
. kmatch md union collgrad ttl_exp tenure i.industry i.race south ///
> (wage = collgrad ttl_exp tenure) ///
> (hours = i.industry i.race), nate att
(computing bandwidth ... done)
Multivariate-distance kernel matching Number of obs = 1,852
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race south
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

Treated 432 25 457 1104 291 1395 1.3392

Treatment-effects estimation

Coef.

wage
ATT .5152752
NATE 1.430823

hours
ATT 1.263759
NATE 1.450303

wage: adjusted for collgrad ttl_exp tenure

hours: adjusted for i.industry i.race

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 40

Examples: Treatment Effects by Subpopulation
. kmatch md union collgrad ttl_exp tenure i.industry i.race (wage), ///
> att vce(boot) over(south)
(south=0: computing bandwidth ... done)
(south=1: computing bandwidth ... done)
(running kmatch on estimation sample)
Bootstrap replications (50)
1 2 3 4 5
.................................................. 50
Multivariate-distance kernel matching Number of obs = 1,853
Replications = 50
Kernel = epan
Treatment : union = 1
Metric : mahalanobis
Covariates: collgrad ttl_exp tenure i.industry i.race
0: south = 0
1: south = 1
Matching statistics

Matched Controls Band-

Yes No Total Used Unused Total width

0
Treated 306 15 321 625 120 745 1.3199

1
Treated 126 10 136 473 178 651 1.3398

Treatment-effects estimation

Observed Bootstrap Normal-based

wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

0
ATT .4586332 .2808206 1.63 0.102 -.0917652 1.009032

1
ATT .9518705 .334356 2.85 0.004 .2965449 1.607196

. test [0]ATT = [1]ATT

( 1) [0]ATT - [1]ATT = 0
chi2( 1) = 1.36
Prob > chi2 = 0.2433
. lincom [1]ATT - [0]ATT
( 1) - [0]ATT + [1]ATT = 0

wage Coef. Std. Err. z P>|z| [95% Conf. Interval]

(1) .4932373 .4227171 1.17 0.243 -.335273 1.321748

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 41

Simulation
Population data from Swiss census of 2000.
Outcome: Treiman occupational prestige (recoded from ISCO codes
of the current job using command iskotrei by Hendrickx 2002)
(values from 6 to 78; mean 44).
Estimand: ATT of nationality on occupational prestige, with
resident aliens as the treatment group and Swiss nationals as the
control group.
Control variables: gender, age, and highest educational degree.
Population restricted to people between 24 to 60 years old who are
working.
2’308’006 individuals, of which 17.5% belong to the treatment
group.
Draw random samples (N = 500 or 5000) from population and
compute various matching estimators.
Ben Jann (University of Bern) Kernel matching London, 07.09.2017 42
Simulation
Substantial differences between resident aliens and Swiss nationals
on all three covariates.
Propensity score in population (computed from fully stratified data)
Untreated
7
Treated

5
Density

0
0 .1 .2 .3 .4 .5 .6 .7 .8 .9 1
Propensity score

McFadden R 2 = 0.121
Ben Jann (University of Bern) Kernel matching London, 07.09.2017 43
Simulation

Raw mean difference in occupational prestige (NATE): −4.79

Population ATT (computed from fully stratified data): −3.96
There is some treatment effect heterogeneity (ATE = −3.51, ATC
= −3.41)

55 -1
Untreated
Treated

50 -2

Treatment effect
45 -3
Outcome

40 -4

35 -5

30 -6
0 .1 .2 .3 .4 .5 .6 .7 .8 0 .1 .2 .3 .4 .5 .6 .7 .8
Propensity score Propensity score

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 44

Results: Variance
N = 500 N = 5000
Nearest-neighbor
matching

1 neighbor

5 neighbors

Kernel matching MDM

with bias
correction
fixed bandwidth
PSM
pair-matching with bias
bandwidth correction

cross-validation
with respect to X

cross-validation
with respect to Y

weighted CV
with respect to Y

1.5 2 2.5 3 3.5 4 4.5 .15 .2 .25 .3 .35 .4 .45

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 45

Results: Variance

Kernel matching Nearest-neighbor

matching
N = 500 N = 5000

2017-09-12
1 neighbor

The kmatch command 5 neighbors

Kernel matching MDM

with bias

Some Simulation Results

correction
fixed bandwidth
PSM
pair-matching with bias
bandwidth correction

cross-validation
with respect to X

cross-validation
with respect to Y

weighted CV
with respect to Y

1.5 2 2.5 3 3.5 4 4.5 .15 .2 .25 .3 .35 .4 .45

In this slide we can see that for the same algorithm PSM typically is
somewhat less efficient than MDM, but that across algorithms PSM
can also be much more efficient than MDM. For example, kernel
matching PSM has a much smaller variance than 1-nearest-neighbor
MDM. That is, the choice of algorithm matters much more than the
choice between PSM and MDM.

For kernel matching the efficiency differences between PSM and MDM
are only small; additional post-matching regression adjustment further
reduces the differences.
Results: Bias reduction (in percent)
N = 500 N = 5000
Nearest-neighbor
matching

1 neighbor

5 neighbors

Kernel matching MDM

with bias
correction
fixed bandwidth
PSM
pair-matching with bias
bandwidth correction

cross-validation
with respect to X

cross-validation
with respect to Y

weighted CV
with respect to Y

70 80 90 100 110 120 130 95 100 105 110 115 120 125 130 135

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 46

Results: Bias reduction (in percent)

Kernel matching Nearest-neighbor

matching
N = 500 N = 5000

2017-09-12
1 neighbor

The kmatch command 5 neighbors

Kernel matching MDM

with bias

Some Simulation Results

correction
fixed bandwidth
PSM
pair-matching with bias
bandwidth correction

cross-validation
with respect to X

cross-validation
with respect to Y

weighted CV
with respect to Y

70 80 90 100 110 120 130 95 100 105 110 115 120 125 130 135

Here we see that PSM has a bias that does not vanish as the sample
size increases. The reason is that the same propensity-score model
specification is used for both sample sizes. The model is rather simple
(linear effect of age, no interactions) and due to the specific pattern of
the data (in particular, the sharp drop in the outcome variable after
propensity score 0.3) small imprecisions can have substantial effects on
the results. In practice, one would probably use a more refined
specification in the large-sample situation, which would reduce bias.

The bias also vanishes once post-matching regression adjustment is

applied.
Results: Mean squared error
N = 500 N = 5000
Nearest-neighbor
matching

1 neighbor

5 neighbors

Kernel matching MDM

with bias
correction
fixed bandwidth
PSM
pair-matching with bias
bandwidth correction

cross-validation
with respect to X

cross-validation
with respect to Y

weighted CV
with respect to Y

1.5 2 2.5 3 3.5 4 4.5 .15 .2 .25 .3 .35 .4 .45 .5

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 47

Results: Relative standard error
N = 500 N = 5000
Nearest-neighbor
matching (teffects)

1 neighbor

5 neighbors

Nearest-neighbor
matching (bootstrap)

1 neighbor MDM
with bias
5 neighbors correction
Kernel matching
(bootstrap) PSM
with bias
fixed bandwidth correction
pair-matching
bandwidth
cross-validation
with respect to X
cross-validation
with respect to Y
weighted CV
with respect to Y
.9 .95 1 1.05 1.1 1.15 1.2 .95 1 1.05 1.1 1.15 1.2 1.25

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 48

Results: Relative standard error

Kernel matching Nearest-neighbor

matching (teffects)
N = 500 N = 5000

2017-09-12
1 neighbor

The kmatch command

5 neighbors

Nearest-neighbor
matching (bootstrap)

1 neighbor MDM
with bias
5 neighbors

Some Simulation Results

correction
Kernel matching
(bootstrap) PSM
with bias
fixed bandwidth correction
pair-matching
bandwidth
cross-validation
with respect to X
cross-validation
with respect to Y
weighted CV
with respect to Y
.9 .95 1 1.05 1.1 1.15 1.2 .95 1 1.05 1.1 1.15 1.2 1.25

Here we can observe the well-known result that bootstrap standard

errors are biased (too large) for nearest-neighbor matching.
In small samples, also the teffects standard errors seem to be slightly
off (too low) for PSM and for MDM with bias-correction.

For kernel matching, bootstrap standard standard errors are often

somewhat too large, especially in the small sample. The bias is most
pronounced for the estimates using the pair-matching bandwidth
selector. Results are better if the bandwidth is selected by
cross-validation.
Results: Coverage of 95% CIs
N = 500 N = 5000
Nearest-neighbor
matching (teffects)

1 neighbor

5 neighbors

Nearest-neighbor
matching (bootstrap)

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 49

Results: Coverage of 95% CIs

Kernel matching Nearest-neighbor

matching (teffects)
N = 500 N = 5000

2017-09-12
1 neighbor

The kmatch command

5 neighbors

Nearest-neighbor
matching (bootstrap)

1 neighbor MDM
with bias
5 neighbors

Some Simulation Results

Coverage of teffects CIs is a bit too low for PSM (and for MDM with
bias-correction in the small sample).
Bootstrap CIs are too conservative for nearest-neighbor matching.

For kernel matching, coverage is mostly okay, being a bit too

conservative in case of the pair-matching bandwidth selector and
considerably off (anti-conservative) for the PSM estimates without
bias-correction (due to the pronounced bias in these estimates).
Conclusions

Overall, I agree with King and Nielsen that MDM has some
advantages over PSM, but it also has some disadvantages. In
applied research the choice may not be that clear.
- MDM leaves less scope for bias due to post-matching modeling
decisions.
- Theoretical results (see, e.g., Frölich 2007) suggest that MDM will
generally tend to outperform PSM in terms of efficiency (but
differences are likely to be small).
- Less restrictions in terms of possible post-matching analyses.
, Choice of scaling matrix largely arbitrary.
, Computational complexity.

One clear conclusion we can draw, however, is:

Do not use propensity scores for pair matching!
(But don’t use pair matching anyhow.)

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 50

Conclusions

Some additional conclusions from the simulation

I For PSM, application of regression-adjustment seems like a great idea
(reduction of bias and variance); for MDM the advantages of
regression-adjustment are less clear.
I Bootstrap standard error/confidence interval estimation seems to be
mostly ok for kernel/ridge matching; this is in contrast to
nearest-neighbor matching, where bootstrap standard errors are
clearly biased.

To do
I Run some more simulations.
I Variance estimation based on influence functions?
I Better (and faster) bandwidth selection algorithms?
I Explore potential of adaptive bandwidths?

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 51

References I

Cochran, W.G. 1968. The Effectiveness of Adjustment by Subclassification

in Removing Bias in Observational Studies. Biometrics 24(2):295–313.
Frölich, M. 2004. Finite-sample properties of propensity-score matching
and weighting estimators. The Review of Economics and Statistics
86(1):77–90.
Frölich, M. 2005. Matching estimators and optimal bandwidth choice.
Statistics and Computing 15:197-215.
Frölich, M. 2007. On the inefficiency of propensity score matching AStA
91:279–290.
Galdo, J.C., J. Smith, D. Black. 2008. Bandwidth selection and the
estimation of treatment effects with unbalanced data. Annales d’Économie
et de Statistique 91/92:89-216.

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 52

References II

Hendrickx, J. 2002. ISKO: Stata module to recode 4 digit ISCO-88

occupational codes. Statistical Software Components S425802, Boston
College Department of Economics.
Huber, M., M. Lechner, A. Steinmayr. 2015. Radius matching on the
propensity score with bias adjustment: tuning parameters and finite sample
behaviour. Empirical Economics 49:1-31.
Huber, M., M. Lechner, C. Wunsch. 2013. The performance of estimators
based on the propensity score. Journal of Econometrics 175:1-21.
King, G., R. Nielsen. 2016. Why Propensity Scores Should Not Be Used
for Matching. Working Paper. Available from http://j.mp/1sexgVw.
Rosenbaum, P.R., D.B. Rubin. 1983. The Central Role of the Propensity
Score in Observational Studies for Causal Effects. Biometrika 70:41–55.

Ben Jann (University of Bern) Kernel matching London, 07.09.2017 53

PSM
No ratings yet
PSM
21 pages
Stata Guide for Propensity Score Matching
100% (1)
Stata Guide for Propensity Score Matching
15 pages
Propensity Score Matching With Clustered Data in Stata: Bruno Arpino
No ratings yet
Propensity Score Matching With Clustered Data in Stata: Bruno Arpino
37 pages
EH426 AT3 2024 Matching
No ratings yet
EH426 AT3 2024 Matching
31 pages
Germany17 Jann
No ratings yet
Germany17 Jann
84 pages
Propensity Score Matching Guide
100% (1)
Propensity Score Matching Guide
41 pages
Lanners 23 A
No ratings yet
Lanners 23 A
11 pages
Matching and The Propensity Score Handout
No ratings yet
Matching and The Propensity Score Handout
23 pages
Strategy Matching
No ratings yet
Strategy Matching
3 pages
Cem: Coarsened Exact Matching in Stata: 9, Number 4, Pp. 524-546
No ratings yet
Cem: Coarsened Exact Matching in Stata: 9, Number 4, Pp. 524-546
23 pages
Journal of Statistical Software
No ratings yet
Journal of Statistical Software
52 pages
Matching Estimator
No ratings yet
Matching Estimator
38 pages
Data Matching
No ratings yet
Data Matching
74 pages
Slides DS
No ratings yet
Slides DS
334 pages
Propensity Score Matching - Dimewiki - 100606
No ratings yet
Propensity Score Matching - Dimewiki - 100606
2 pages
Ver Invariant and Metric Free Proximities For Data Jss.v025.i11
No ratings yet
Ver Invariant and Metric Free Proximities For Data Jss.v025.i11
22 pages
Introduction To Propensity Score Analysis
No ratings yet
Introduction To Propensity Score Analysis
41 pages
FLAME: A Fast Large-Scale Almost Matching Exactly Approach To Causal Inference
No ratings yet
FLAME: A Fast Large-Scale Almost Matching Exactly Approach To Causal Inference
23 pages
On Kernel-Target Alignment
No ratings yet
On Kernel-Target Alignment
7 pages
Data Matching
No ratings yet
Data Matching
37 pages
Likelihood-Free Adaptive Bayesian Inference Via Nonparametric Distribution Matching
No ratings yet
Likelihood-Free Adaptive Bayesian Inference Via Nonparametric Distribution Matching
61 pages
Chapter 3
No ratings yet
Chapter 3
12 pages
Propensity Score Matching With SPSS
No ratings yet
Propensity Score Matching With SPSS
30 pages
Unit-II DM Techniques
No ratings yet
Unit-II DM Techniques
20 pages
Matching and Selection On Observables Handout
100% (1)
Matching and Selection On Observables Handout
30 pages
Data Analytics Course (IIFT MBA) Full Course Summary - 27072023
No ratings yet
Data Analytics Course (IIFT MBA) Full Course Summary - 27072023
253 pages
Kernel Methods For General Pattern Analysis PDF
No ratings yet
Kernel Methods For General Pattern Analysis PDF
77 pages
Cheat Sheet PSM
No ratings yet
Cheat Sheet PSM
3 pages
Rathje 等 - Making the most of AI and machine learning in organizations and strategy research Supervised machin
No ratings yet
Rathje 等 - Making the most of AI and machine learning in organizations and strategy research Supervised machin
28 pages
Matching Method (PSM) - Mbarara. Toko
No ratings yet
Matching Method (PSM) - Mbarara. Toko
28 pages
Outline: Three Basic Algorithms
No ratings yet
Outline: Three Basic Algorithms
34 pages
Slides QSMII Chapter 3
No ratings yet
Slides QSMII Chapter 3
12 pages
Machine Learning Overview Guide
No ratings yet
Machine Learning Overview Guide
68 pages
Aiml Unit-4
No ratings yet
Aiml Unit-4
82 pages
Modeldepmatch Handout - Pdf#page 20
No ratings yet
Modeldepmatch Handout - Pdf#page 20
110 pages
Lecture 8-9 - Clustering
No ratings yet
Lecture 8-9 - Clustering
43 pages
R Propensity Score Matching Guide
No ratings yet
R Propensity Score Matching Guide
24 pages
DWDM (Unit-4) - 2
No ratings yet
DWDM (Unit-4) - 2
23 pages
Data Mining Technique
No ratings yet
Data Mining Technique
7 pages
Robust Improper Maximum Likelihood Tuning Computation and A Comparison With Other Methods For Robust Gaussian Clustering
No ratings yet
Robust Improper Maximum Likelihood Tuning Computation and A Comparison With Other Methods For Robust Gaussian Clustering
13 pages
Curve Fit
No ratings yet
Curve Fit
212 pages
CHL5230 2025w Lecture 09 v2
No ratings yet
CHL5230 2025w Lecture 09 v2
25 pages
Kernel Methods and Machine Learning S. Y. Kung Download
No ratings yet
Kernel Methods and Machine Learning S. Y. Kung Download
58 pages
Cours2 ML
No ratings yet
Cours2 ML
21 pages
Informational Rescaling of PCA Maps With Application To Genetic Distance
No ratings yet
Informational Rescaling of PCA Maps With Application To Genetic Distance
5 pages
Classification (NaiveBayes KNN SVM DecisionTrees)
No ratings yet
Classification (NaiveBayes KNN SVM DecisionTrees)
105 pages
NeurIPS 2021 Adaptive Conformal Inference Under Distribution Shift Paper
No ratings yet
NeurIPS 2021 Adaptive Conformal Inference Under Distribution Shift Paper
13 pages
Unsupervised
No ratings yet
Unsupervised
14 pages
Ai Notes V
No ratings yet
Ai Notes V
7 pages
21csc305p Machine Learning Unit 5
No ratings yet
21csc305p Machine Learning Unit 5
61 pages
Wallace PropScore 0614f
No ratings yet
Wallace PropScore 0614f
22 pages
Tutorial 7 Machine Learning Algorithms
No ratings yet
Tutorial 7 Machine Learning Algorithms
30 pages
OLS and Matching
No ratings yet
OLS and Matching
20 pages
Propensity Score Matching Guide
No ratings yet
Propensity Score Matching Guide
42 pages
(Monographs On Statistics and Applied Probability (Series) 161) Li, Bing - Sufficient Dimension Reduction - Methods and Applications With R-CRC Press (2018)
100% (1)
(Monographs On Statistics and Applied Probability (Series) 161) Li, Bing - Sufficient Dimension Reduction - Methods and Applications With R-CRC Press (2018)
307 pages
An Animated Guide: Matching Test and Control Subjects: Russell Lavery, Independent Consultant
No ratings yet
An Animated Guide: Matching Test and Control Subjects: Russell Lavery, Independent Consultant
21 pages
rssb12129 Sup 0001 Supinfo
No ratings yet
rssb12129 Sup 0001 Supinfo
39 pages
Entropy Balancing for Stata Users
No ratings yet
Entropy Balancing for Stata Users
32 pages
Diff
No ratings yet
Diff
41 pages
Some Practical Guidance For The Implementation of Propensity Score Matching
No ratings yet
Some Practical Guidance For The Implementation of Propensity Score Matching
33 pages
Psmatch 2
No ratings yet
Psmatch 2
12 pages
Diffusions and Optimization
No ratings yet
Diffusions and Optimization
13 pages
CausalInference w7 Panel
No ratings yet
CausalInference w7 Panel
30 pages
Hirano Imbens Ridder 2003
No ratings yet
Hirano Imbens Ridder 2003
30 pages
Meilleure Pratique Du Diff in Diff
No ratings yet
Meilleure Pratique Du Diff in Diff
17 pages
MASI ContentOutline 2025
No ratings yet
MASI ContentOutline 2025
8 pages
AIML ML Session 4 - Student Common Reference (With More Additional Reading Materials) Part 2
No ratings yet
AIML ML Session 4 - Student Common Reference (With More Additional Reading Materials) Part 2
45 pages
(Ebook) Biostatistics: A Foundation For Analysis in The Health Sciences by Wayne W. Daniel ISBN 9780470105825, 0470105828 PDF Download
No ratings yet
(Ebook) Biostatistics: A Foundation For Analysis in The Health Sciences by Wayne W. Daniel ISBN 9780470105825, 0470105828 PDF Download
147 pages
Entropy, Irreversibility and Inference at The Foundations of Statistical Physics
No ratings yet
Entropy, Irreversibility and Inference at The Foundations of Statistical Physics
12 pages
Pusat Pertumbuhan Kota Dan Hinterland
No ratings yet
Pusat Pertumbuhan Kota Dan Hinterland
9 pages
Wilcoxon Signed Rank Table PDF
No ratings yet
Wilcoxon Signed Rank Table PDF
1 page
Pengaruh Coaching Dan Mentoring Terhadap Kualifikasi Kelulusan Pelatihan Dasar Cpns Guru SD
No ratings yet
Pengaruh Coaching Dan Mentoring Terhadap Kualifikasi Kelulusan Pelatihan Dasar Cpns Guru SD
11 pages
FIDP - Statistics and Probability SY 24-25
No ratings yet
FIDP - Statistics and Probability SY 24-25
13 pages
Fe220 Apr23 With Supplement-1
No ratings yet
Fe220 Apr23 With Supplement-1
20 pages
Homework Assignment-7 Answers
No ratings yet
Homework Assignment-7 Answers
11 pages
Tutorial Time Series Forecasting With Xgboost
No ratings yet
Tutorial Time Series Forecasting With Xgboost
5 pages
Sampling Theory Assignment
No ratings yet
Sampling Theory Assignment
1 page
SAT Suite Question Bank - Problem Solving and Data Analysis AnsResults
No ratings yet
SAT Suite Question Bank - Problem Solving and Data Analysis AnsResults
113 pages
OUTPUT Spss Versi 20
No ratings yet
OUTPUT Spss Versi 20
7 pages
(BA ZG524/MBA ZG538/PDBA ZG538) Advanced Statistical Methods Lecture No: 11 (13-04-24)
No ratings yet
(BA ZG524/MBA ZG538/PDBA ZG538) Advanced Statistical Methods Lecture No: 11 (13-04-24)
43 pages
Chapter 9: Linear Regression and Correlation
No ratings yet
Chapter 9: Linear Regression and Correlation
6 pages
Lec 6
No ratings yet
Lec 6
133 pages
Panel Smooth Threshold Regression Guide
No ratings yet
Panel Smooth Threshold Regression Guide
7 pages
Quiz CH 5-8
No ratings yet
Quiz CH 5-8
5 pages
Solutions Set2
No ratings yet
Solutions Set2
2 pages
10 RD
No ratings yet
10 RD
16 pages
Business Analytics For Decision Making
No ratings yet
Business Analytics For Decision Making
3 pages
Maximum Likelihood Estimate For (MLE) : XX X X XXX X
No ratings yet
Maximum Likelihood Estimate For (MLE) : XX X X XXX X
2 pages
Kernel Methods: Dept. Computer Science & Engineering, Shanghai Jiao Tong University
No ratings yet
Kernel Methods: Dept. Computer Science & Engineering, Shanghai Jiao Tong University
29 pages
Correlations Loretta Kunch Capella University Master of Psychology Spring 2009
No ratings yet
Correlations Loretta Kunch Capella University Master of Psychology Spring 2009
8 pages
ARMA Model
No ratings yet
ARMA Model
12 pages
Lecture 10 PDF
No ratings yet
Lecture 10 PDF
73 pages
CIE Review For : Inferential Statistics
No ratings yet
CIE Review For : Inferential Statistics
7 pages
Introductory Econometrics A Modern Approach 4th Edition Jeffrey M. Wooldridge PDF Available
No ratings yet
Introductory Econometrics A Modern Approach 4th Edition Jeffrey M. Wooldridge PDF Available
127 pages
Data Correlation Analysis
No ratings yet
Data Correlation Analysis
15 pages