Micro-econometric Methods for Policy Evaluation Master 1 Epp

Christian Belzil S2 2010

PC 8 : Propensity score matching and panel discrete choice models


Elise Coudin (elise.coudin@insee.fr) & Eric Strobl (eric.strobl@polytechnique.edu)

1. psmatch2 implements a variety of matching methods (full Mahalanobis or propensity score) to
adjust for pre-treatment observable differences between a group of treated and a group of untreated.
psgraph graphs the propensity score histogram by treatment status, which is useful to check the common
support condition. pstest calculates several measures of the balancing of the conditioning variables
before and after matching, such as t-tests for equality of means in the treated and non-treated groups.
This is useful to check that the estimated propensity score balances the matched treated and
non-treated groups well.
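As a reminder of what these commands estimate, the target parameter can be written as follows. The notation is an assumption made for illustration and does not come from the handout: D is the treatment indicator (dfmfd in this session), X the conditioning variables, Y_1 and Y_0 the potential outcomes with and without treatment, and Y the observed outcome.

```latex
p(X) = \Pr(D = 1 \mid X)
\qquad \text{(propensity score)}

\mathrm{ATT} = E[\,Y_1 - Y_0 \mid D = 1\,]
  = E_{p(X) \mid D = 1}\Big[\, E[\,Y \mid D = 1, p(X)\,] - E[\,Y \mid D = 0, p(X)\,] \,\Big]
```

The second equality holds only if Y_0 is mean-independent of D given p(X) and if p(X) < 1 for the treated (common support), which is why the support and balancing checks described above matter.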
2. psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat, ///
caliper(0.001) norepl
Probit regression Number of obs = 826
LR chi2(8) = 44.66
Prob > chi2 = 0.0000
Log likelihood = -511.14954 Pseudo R2 = 0.0419
------------------------------------------------------------------------------
dfmfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sexhead | -.5856012 .2159713 -2.71 0.007 -1.008897 -.1623052
agehead | .0078299 .0039731 1.97 0.049 .0000427 .0156171
educhead | -.0233796 .0150976 -1.55 0.121 -.0529704 .0062112
lnland | -.1229218 .0274799 -4.47 0.000 -.1767815 -.0690622
vaccess | -.0209349 .2048482 -0.10 0.919 -.4224301 .3805603
pcirr | .2127278 .1603985 1.33 0.185 -.1016474 .527103
rice | -.0094697 .0519575 -0.18 0.855 -.1113046 .0923652
wheat | .0623835 .0398347 1.57 0.117 -.015691 .1404579
_cons | -.2943495 .6845007 -0.43 0.667 -1.635946 1.047247
------------------------------------------------------------------------------
. des _*
storage display value
variable name type format label variable label
-------------------------------------------------------------------------------
_pscore double %10.0g psmatch2: Propensity Score
_treated byte %9.0g _treated psmatch2: Treatment assignment
_support byte %11.0g _support psmatch2: Common support
_weight double %10.0g psmatch2: weight of matched
controls
_n1 float %9.0g psmatch2: ID of nearest neighbor
_nn float %9.0g psmatch2: # matched neighbors
_id float %9.0g psmatch2: Identifier (ID)
_pdif double %10.0g psmatch2: abs(pscore -
pscore[nearest neighbor])

browse

3. . tab _treated _support

psmatch2: | psmatch2: Common
Treatment | support
assignment | Off suppo On suppor | Total
-----------+----------------------+----------
Untreated | 0 539 | 539
Treated | 82 205 | 287
-----------+----------------------+----------
Total | 82 744 | 826

. sum _pscore if _treated==1

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
_pscore | 287 .3816799 .1132882 .1683377 .7348696

. sum _pscore if _treated==0

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
_pscore | 539 .3290559 .1043259 .0797802 .7090652

. sum _pscore if _treated==1 & _support==1

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
_pscore | 205 .3516456 .0888244 .1683377 .681276

. sum _pscore if _treated==0 & _weight==1

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
_pscore | 205 .3518039 .0888879 .1688955 .6814633

Comment: 205 of the 287 treated individuals are matched with 205 controls. Before matching,
the average propensity scores of the treated and non-treated groups differ (.382 vs .329), whereas
they are almost identical for the matched groups (.3516 vs .3518).
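The same check can be made pair by pair with the variables psmatch2 leaves in memory. The lines below are a minimal sketch, assuming the psmatch2 call from item 2 has just been run so that _treated, _support, _weight and _pdif exist. The counts quoted in the comments are those reported in the handout output, not new results.

```stata
* Assumption: psmatch2 ... , caliper(0.001) norepl (item 2) was the last
* matching command run, so its _* variables are in memory.

* Matched treated and matched controls (205 each in the handout output)
count if _treated == 1 & _support == 1
count if _treated == 0 & _weight == 1

* Distance between each matched treated unit and its matched control.
* With caliper(0.001), no matched pair should be further apart than 0.001.
summarize _pdif if _treated == 1 & _support == 1
display "largest within-pair distance: " r(max)
```

Looking at _pdif rather than only at group means shows how tight the individual matches are, which is what the caliper actually controls.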
4. psgraph, bin(100)
Comment: the common support condition requires that both treated and non-treated units exist at
any value of the propensity score. Here, treated and non-treated units with close propensity
scores exist for propensity scores between .2 and .55. Most of the treated with a propensity score
higher than .55 are discarded because they cannot be matched. Note that using another caliper or
allowing for replacement (a given control can be matched with several treated) changes the number
of on-support treated.

[Figure: psgraph histogram of the propensity score (horizontal axis from 0 to .8) by group:
Untreated; Treated: On support; Treated: Off support]
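To see how replacement affects the number of on-support treated, the matching can be rerun without norepl. This is a sketch under stated assumptions: psmatch2 is installed (ssc install psmatch2), the PC 8 data are in memory, and the common option is added here for illustration only, since the handout does not use it.

```stata
* Same propensity score model as in item 2, but matching WITH replacement
* (norepl dropped). The common option additionally drops treated units whose
* score lies outside the range of the controls' scores.
psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat, ///
    caliper(0.001) common
tab _treated _support

* With replacement a control can serve several treated units:
* for nearest-neighbour matching, _weight records how often each control is used.
tab _weight if _treated == 0
```

Comparing the resulting tab _treated _support with the one in item 3 shows directly how many treated units are recovered once controls can be reused.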
5. * treated vs untreated
kdensity _pscore if _treated==1, gen(x1 y1)
kdensity _pscore if _treated==0, legend(lab(1 "untreated")) ///
addplot(line y1 x1, legend(lab(2 "treated")))

[Figure: kernel density estimate of psmatch2: Propensity Score (horizontal axis 0 to .8, vertical
axis Density 0 to 4); curves: untreated, treated; kernel = epanechnikov, bandwidth = 0.0260]

* treated vs untreated, on support only


kdensity _pscore if _treated==1 & _support==1, gen(x2 y2)
kdensity _pscore if _treated==0 & _support==1, legend(lab(1 "untreated on support")) ///
addplot(line y2 x2, legend(lab(2 "treated on support")))

[Figure: kernel density estimate of psmatch2: Propensity Score (horizontal axis 0 to .8, vertical
axis Density 0 to 4); curves: untreated on support, treated on support; kernel = epanechnikov,
bandwidth = 0.0260]

* treated on support vs treated off support


kdensity _pscore if _treated==1 & _support==1, gen(x3 y3)
kdensity _pscore if _treated==1 & _support==0, legend(lab(1 "treated off support")) ///
addplot(line y3 x3, legend(lab(2 "treated on support")))

[Figure: kernel density estimate of psmatch2: Propensity Score (horizontal axis 0 to .8, vertical
axis Density 0 to 4); curves: treated off support, treated on support; kernel = epanechnikov,
bandwidth = 0.0487]

6. . pstest dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat

----------------------------------------------------------------------------
| Mean %reduct | t-test
Variable Sample | Treated Control %bias |bias| | t p>|t|
------------------------+----------------------------------+----------------
dfmfd Unmatched | 1 0 . | . .
Matched | 1 0 . . | . .
| |
sexhead Unmatched | .91986 .97217 -23.3 | -3.44 0.001
Matched | .98049 .97073 4.3 81.3 | 0.64 0.523
| |
agehead Unmatched | 41.328 40.468 7.3 | 0.98 0.327
Matched | 40.702 40.4 2.6 64.8 | 0.26 0.795
| |
educhead Unmatched | 1.9094 2.6698 -23.7 | -3.14 0.002
Matched | 2.1171 2.1024 0.5 98.1 | 0.05 0.960
| |
lnland Unmatched | 2.3735 3.034 -37.2 | -5.00 0.000
Matched | 2.5803 2.6616 -4.6 87.7 | -0.50 0.615
| |
vaccess Unmatched | .95122 .94249 3.9 | 0.53 0.599
Matched | .94146 .93171 4.3 -11.7 | 0.40 0.686
| |
pcirr Unmatched | .46341 .4357 8.8 | 1.21 0.225
Matched | .452 .43785 4.5 49.0 | 0.46 0.644
| |
rice Unmatched | 9.6847 9.7015 -1.8 | -0.24 0.807
Matched | 9.7146 9.7127 0.2 88.4 | 0.02 0.983
| |
wheat Unmatched | 8.757 8.6368 9.5 | 1.26 0.210
Matched | 8.6573 8.8098 -12.1 -26.9 | -1.39 0.165
| |
----------------------------------------------------------------------------

Comment: the means of the conditioning variables are not significantly different between matched
treated and matched untreated, whereas they are for some conditioning variables (sexhead, educhead,
lnland) when comparing unmatched treated and untreated. Hence, the estimated propensity score
balances the matched treated and non-treated groups well.
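The %bias and %reduct |bias| columns can be reproduced by hand. The formulas below follow the standardised bias described in the pstest help file (Rosenbaum and Rubin, 1985). The symbols \bar{x}_T, \bar{x}_C (group means) and s_T^2, s_C^2 (group variances) are notation introduced here. Treating the denominator as identical in the Unmatched and Matched rows is an assumption that the reported figures are consistent with.

```latex
\%\text{bias} = 100 \cdot \frac{\bar{x}_T - \bar{x}_C}{\sqrt{(s_T^2 + s_C^2)/2}}
\qquad
\%\text{reduct}\,|\text{bias}| = 100 \cdot
  \left( 1 - \frac{|\text{bias}_{\text{matched}}|}{|\text{bias}_{\text{unmatched}}|} \right)

% Worked check for sexhead, using the means in the table
% (with a common denominator, only the mean differences matter):
100 \cdot \left( 1 - \frac{|0.98049 - 0.97073|}{|0.91986 - 0.97217|} \right)
  = 100 \cdot \left( 1 - \frac{0.00976}{0.05231} \right) \approx 81.3
```

A negative reduction, as for vaccess (-11.7) and wheat (-26.9), means that the absolute bias is larger after matching than before, even though the matched difference is not significant.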

7. Comment: with a smaller caliper, the number of matched individuals is smaller. Note that the
average educhead of matched treated and matched controls is then significantly different at 5%.
8. . psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat, caliper(0.001) norepl out(lexptot)

Probit regression Number of obs = 826
LR chi2(8) = 44.66
Prob > chi2 = 0.0000
Log likelihood = -511.14954 Pseudo R2 = 0.0419

------------------------------------------------------------------------------
dfmfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sexhead | -.5856012 .2159713 -2.71 0.007 -1.008897 -.1623052
agehead | .0078299 .0039731 1.97 0.049 .0000427 .0156171
educhead | -.0233796 .0150976 -1.55 0.121 -.0529704 .0062112
lnland | -.1229218 .0274799 -4.47 0.000 -.1767815 -.0690622
vaccess | -.0209349 .2048482 -0.10 0.919 -.4224301 .3805603
pcirr | .2127278 .1603985 1.33 0.185 -.1016474 .527103
rice | -.0094697 .0519575 -0.18 0.855 -.1113046 .0923652
wheat | .0623835 .0398347 1.57 0.117 -.015691 .1404579
_cons | -.2943495 .6845007 -0.43 0.667 -1.635946 1.047247
------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
Variable Sample | Treated Controls Difference S.E. T-stat
----------------------------+-----------------------------------------------------------
lexptot Unmatched | 8.19926804 8.2768092 -.077541161 .027811151 -2.79
ATT | 8.18032496 8.247803 -.067478036 .033849895 -1.99
----------------------------+-----------------------------------------------------------
Note: S.E. for ATT does not take into account that the propensity score is estimated.

psmatch2: | psmatch2: Common
Treatment | support
assignment | Off suppo On suppor | Total
-----------+----------------------+----------
Untreated | 0 539 | 539
Treated | 82 205 | 287
-----------+----------------------+----------
Total | 82 744 | 826

. psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat, caliper(0.0001)
norepl out(lexptot)

Probit regression Number of obs = 826
LR chi2(8) = 44.66
Prob > chi2 = 0.0000
Log likelihood = -511.14954 Pseudo R2 = 0.0419

------------------------------------------------------------------------------
dfmfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sexhead | -.5856012 .2159713 -2.71 0.007 -1.008897 -.1623052
agehead | .0078299 .0039731 1.97 0.049 .0000427 .0156171
educhead | -.0233796 .0150976 -1.55 0.121 -.0529704 .0062112
lnland | -.1229218 .0274799 -4.47 0.000 -.1767815 -.0690622
vaccess | -.0209349 .2048482 -0.10 0.919 -.4224301 .3805603
pcirr | .2127278 .1603985 1.33 0.185 -.1016474 .527103
rice | -.0094697 .0519575 -0.18 0.855 -.1113046 .0923652
wheat | .0623835 .0398347 1.57 0.117 -.015691 .1404579
_cons | -.2943495 .6845007 -0.43 0.667 -1.635946 1.047247
------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
Variable Sample | Treated Controls Difference S.E. T-stat
----------------------------+-----------------------------------------------------------
lexptot Unmatched | 8.19926804 8.2768092 -.077541161 .027811151 -2.79
ATT | 8.19833669 8.28574089 -.087404204 .07457537 -1.17
----------------------------+-----------------------------------------------------------
Note: S.E. for ATT does not take into account that the propensity score is estimated.

psmatch2: | psmatch2: Common
Treatment | support
assignment | Off suppo On suppor | Total
-----------+----------------------+----------
Untreated | 0 539 | 539
Treated | 236 51 | 287
-----------+----------------------+----------
Total | 236 590 | 826

Comment: with caliper(0.0001), only 51 treated are matched. Even though the ATT standard errors
reported here do not account for the fact that the propensity score is estimated, the ATT is
negative (-0.087) and not significant at 5% (t = -1.17)!
9. bootstrap r(att), reps(50): psmatch2 dfmfd sexhead agehead educhead lnland vaccess ///
pcirr rice wheat, caliper(0.001) norepl out(lexptot)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50

Bootstrap results Number of obs = 826
Replications = 50
command: psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat, caliper(0.001) norepl
out(lexptot)
_bs_1: r(att)
------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_bs_1 | -.067478 .0424284 -1.59 0.112 -.1506362 .0156802
------------------------------------------------------------------------------

Comment: the estimated ATT is negative and not significant at 5%. Note that the bootstrap
t-statistic (-1.59) is smaller in magnitude than the non-bootstrapped one (-1.99) because the
bootstrap accounts for the fact that the propensity score is estimated.
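The bootstrap line can be checked by hand from numbers already printed above; nothing new is estimated here. The only assumption is the usual normal critical value 1.96 for a 95% interval.

```latex
z_{\text{boot}} = \frac{-0.067478}{0.0424284} \approx -1.59,
\qquad
t_{\text{analytic}} = \frac{-0.067478}{0.033850} \approx -1.99

\text{95\% CI:}\quad -0.067478 \pm 1.96 \times 0.0424284
  \approx [-0.1506,\; 0.0157]
```

The point estimate is the same in both cases. Only the standard error changes (.0424 instead of .0338), and the resulting interval contains zero.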
10. use bangladesh2
. * treatment group: those who change treatment status: dfmfd==1 and l.dfmfd==0
. * eliminate those such that dfmfd98==0 and dfmfd91==1
. * easier to do in wide format
. drop lexptot lnland

. reshape wide
(note: j = 91 98)

Data long -> wide


-----------------------------------------------------------------------------
Number of obs. 1652 -> 826
Number of variables 16 -> 29
j variable (2 values) year -> (dropped)
xij variables:
villid -> villid91 villid98
thanaid -> thanaid91 thanaid98
agehead -> agehead91 agehead98
sexhead -> sexhead91 sexhead98
educhead -> educhead91 educhead98
hhland -> hhland91 hhland98
exptot -> exptot91 exptot98
dfmfd -> dfmfd91 dfmfd98
dmmfd -> dmmfd91 dmmfd98
weight -> weight91 weight98
vaccess -> vaccess91 vaccess98
pcirr -> pcirr91 pcirr98
rice -> rice91 rice98
wheat -> wheat91 wheat98
-----------------------------------------------------------------------------

.
. drop if dfmfd98==0 & dfmfd91==1
(33 observations deleted)

. gen treatment=(dfmfd98==1)*(dfmfd91==0)

. reshape long
(note: j = 91 98)

Data wide -> long


-----------------------------------------------------------------------------
Number of obs. 793 -> 1586
Number of variables 30 -> 17
j variable (2 values) -> year
xij variables:
villid91 villid98 -> villid
thanaid91 thanaid98 -> thanaid
agehead91 agehead98 -> agehead
sexhead91 sexhead98 -> sexhead
educhead91 educhead98 -> educhead
hhland91 hhland98 -> hhland
exptot91 exptot98 -> exptot
dfmfd91 dfmfd98 -> dfmfd
dmmfd91 dmmfd98 -> dmmfd
weight91 weight98 -> weight
vaccess91 vaccess98 -> vaccess
pcirr91 pcirr98 -> pcirr
rice91 rice98 -> rice
wheat91 wheat98 -> wheat
-----------------------------------------------------------------------------

. gen lnland=log(hhland)
. gen lexptot=log(exptot)

11. . psmatch2 treatment sexhead agehead educhead lnland vaccess pcirr rice wheat if year==91,
caliper(0.001) norepl

Probit regression Number of obs = 793
LR chi2(8) = 28.46
Prob > chi2 = 0.0004
Log likelihood = -411.73148 Pseudo R2 = 0.0334

------------------------------------------------------------------------------
treatment | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sexhead | .408713 .2710587 1.51 0.132 -.1225523 .9399783
agehead | -.0133881 .004428 -3.02 0.002 -.0220668 -.0047094
educhead | -.0185048 .0158925 -1.16 0.244 -.0496535 .0126439
lnland | -.0641379 .0300213 -2.14 0.033 -.1229785 -.0052973
vaccess | .1888308 .2330654 0.81 0.418 -.267969 .6456305
pcirr | -.1858468 .1778552 -1.04 0.296 -.5344366 .162743
rice | .1104556 .0580798 1.90 0.057 -.0033787 .2242899
wheat | -.0691978 .0383199 -1.81 0.071 -.1443034 .0059077
_cons | -.9616839 .7575114 -1.27 0.204 -2.446379 .5230112
------------------------------------------------------------------------------
(793 missing values generated)

. pstest sexhead agehead educhead lnland vaccess pcirr rice wheat

----------------------------------------------------------------------------
| Mean %reduct | t-test
Variable Sample | Treated Control %bias |bias| | t p>|t|
------------------------+----------------------------------+----------------
sexhead Unmatched | .97238 .94935 11.9 | 1.31 0.192
Matched | .96753 .96753 0.0 100.0 | -0.00 1.000
| |
agehead Unmatched | 38.055 41.475 -29.0 | -3.40 0.001
Matched | 38.682 38.948 -2.3 92.2 | -0.20 0.841
| |
educhead Unmatched | 2.0994 2.5147 -12.4 | -1.47 0.143
Matched | 2.2597 2.4416 -5.4 56.2 | -0.47 0.639
| |
lnland Unmatched | 2.4441 2.904 -26.1 | -2.98 0.003
Matched | 2.5756 2.526 2.8 89.2 | 0.26 0.796
| |
vaccess Unmatched | .9558 .94281 5.9 | 0.68 0.498
Matched | .95455 .94156 5.9 0.0 | 0.51 0.609
| |
pcirr Unmatched | .42519 .44639 -6.9 | -0.80 0.424
Matched | .41779 .44214 -7.9 -14.9 | -0.72 0.470
| |
rice Unmatched | 9.8055 9.6778 13.8 | 1.63 0.104
Matched | 9.726 9.7494 -2.5 81.7 | -0.22 0.824
| |
wheat Unmatched | 8.5884 8.7083 -8.2 | -1.08 0.282
Matched | 8.8019 8.6802 8.3 -1.5 | 0.97 0.332
| |
----------------------------------------------------------------------------

Comment: the averages of the conditioning variables are not significantly different between
matched treated and matched non-treated. Hence, the estimated propensity score balances the
matched treated and non-treated groups well. Note that some old control households and some young
treatment-group households cannot be matched. Likewise, some treatment-group households with small
farms and some controls with big farms cannot be matched.
12. . gen time=(year==98)
. gen treatmentt2=(treatment==1)*(time==1)
. egen matched=max(_weight), by(nh)

13. . reg lexptot time treatment treatmentt2 if matched==1, robust

Linear regression Number of obs = 616
F( 3, 612) = 12.65
Prob > F = 0.0000
R-squared = 0.0569
Root MSE = .42186

------------------------------------------------------------------------------
| Robust
lexptot | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
time | .205729 .051929 3.96 0.000 .1037483 .3077096
treatment | -.0121564 .0372494 -0.33 0.744 -.0853086 .0609958
treatmentt2 | .0011963 .0679884 0.02 0.986 -.1323226 .1347153
_cons | 8.216379 .0269606 304.75 0.000 8.163433 8.269326
------------------------------------------------------------------------------

. reg lexptot time treatment treatmentt2, robust

Linear regression Number of obs = 1586
F( 3, 1582) = 30.63
Prob > F = 0.0000
R-squared = 0.0511
Root MSE = .46228

------------------------------------------------------------------------------
| Robust
lexptot | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
time | .2084268 .0277563 7.51 0.000 .1539839 .2628698
treatment | -.059309 .0280666 -2.11 0.035 -.1143607 -.0042573
treatmentt2 | .0003589 .0480446 0.01 0.994 -.0938789 .0945967
_cons | 8.262594 .0163468 505.46 0.000 8.23053 8.294658
------------------------------------------------------------------------------

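Comment: the first regression uses a smaller number of observations (616 matched against 1586), but
the results are similar: in both, the coefficient on treatmentt2 is close to zero and not significant.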
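The coefficient on treatmentt2 is the difference-in-differences estimate. The display below spells this out and rebuilds the four cell means from the matched-sample coefficients. The subscripts T, C (treatment and control group) and 91, 98 (year) are notation introduced here, and the figures are rounded to four decimals.

```latex
\text{lexptot}_{it} = \beta_0 + \beta_1\,\text{time}_t + \beta_2\,\text{treatment}_i
  + \beta_3\,(\text{treatment}_i \times \text{time}_t) + \varepsilon_{it}

\beta_3 = \big( \bar{y}_{T,98} - \bar{y}_{T,91} \big) - \big( \bar{y}_{C,98} - \bar{y}_{C,91} \big)

% Cell means implied by the matched-sample regression:
\bar{y}_{C,91} = 8.2164, \qquad \bar{y}_{C,98} = 8.2164 + 0.2057 = 8.4221
\bar{y}_{T,91} = 8.2164 - 0.0122 = 8.2042, \qquad
\bar{y}_{T,98} = 8.2042 + 0.2057 + 0.0012 = 8.4111

\beta_3 = (8.4111 - 8.2042) - (8.4221 - 8.2164) = 0.2069 - 0.2057 = 0.0012
```

Both groups' log expenditure grows by about 0.21 between the two years, so almost nothing is left for the interaction term.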

. xtlogit dfmfd lnland vaccess pcirr rice wheat, fe


note: multiple positive outcomes within groups encountered.
note: 612 groups (1224 obs) dropped because of all positive or
all negative outcomes.
note: lnland omitted because of no within-group variance.

Iteration 0: log likelihood = -77.091704


Iteration 1: log likelihood = -71.198664
Iteration 2: log likelihood = -70.734246
Iteration 3: log likelihood = -70.732988
Iteration 4: log likelihood = -70.732988

Conditional fixed-effects logistic regression Number of obs = 362


Group variable: nh Number of groups = 181

Obs per group: min = 2


avg = 2.0
max = 2

LR chi2(4) = 109.45
Log likelihood = -70.732988 Prob > chi2 = 0.0000

------------------------------------------------------------------------------
dfmfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
vaccess | -2.817089 1.313771 -2.14 0.032 -5.392032 -.2421461
pcirr | 3.974166 1.016353 3.91 0.000 1.98215 5.966181
rice | .5236971 .1259889 4.16 0.000 .2767634 .7706309
wheat | -.5069951 .11704 -4.33 0.000 -.7363894 -.2776009
------------------------------------------------------------------------------

Comment: the conditional fixed-effects logit model is estimated only on the individuals for whom
dfmfd changes between the two periods (181). Time-invariant covariates cannot be included. The
coefficient estimates give the sign of the partial effect of each covariate on the probability that
dfmfd = 1.
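With two periods, the reason only switchers contribute can be seen from the conditional likelihood. This is the standard two-period result for the fixed-effects logit, written in notation introduced here: y_{it} is dfmfd, x_{it} the covariates, \alpha_i the household effect and \Lambda the logistic distribution function.

```latex
\Pr(y_{it} = 1 \mid x_{it}, \alpha_i) = \Lambda(\alpha_i + x_{it}'\beta),
\qquad \Lambda(u) = \frac{e^{u}}{1 + e^{u}}

\Pr(y_{i2} = 1 \mid y_{i1} + y_{i2} = 1,\; x_{i1}, x_{i2}, \alpha_i)
  = \frac{e^{\alpha_i + x_{i2}'\beta}}{e^{\alpha_i + x_{i1}'\beta} + e^{\alpha_i + x_{i2}'\beta}}
  = \Lambda\big( (x_{i2} - x_{i1})'\beta \big)
```

The household effect \alpha_i cancels, households with y_{i1} + y_{i2} equal to 0 or 2 carry no information (the 612 groups dropped in the log), and any covariate with x_{i2} = x_{i1} drops out, which is what the note on lnland reports.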

. xtprobit dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat

Fitting comparison model:

Iteration 0: log likelihood = -1085.6527


Iteration 1: log likelihood = -1019.2266
Iteration 2: log likelihood = -1018.8976
Iteration 3: log likelihood = -1018.8976

Fitting full model:

rho = 0.0 log likelihood = -1018.8976


rho = 0.1 log likelihood = -994.51383
rho = 0.2 log likelihood = -973.3339
rho = 0.3 log likelihood = -955.06732
rho = 0.4 log likelihood = -939.6369
rho = 0.5 log likelihood = -927.21874
rho = 0.6 log likelihood = -918.38527
rho = 0.7 log likelihood = -914.54824
rho = 0.8 log likelihood = -920.11931

Iteration 0: log likelihood = -914.49765


Iteration 1: log likelihood = -887.7076
Iteration 2: log likelihood = -885.07663
Iteration 3: log likelihood = -885.06546
Iteration 4: log likelihood = -885.06546 (backed up)
Iteration 5: log likelihood = -885.06544

Random-effects probit regression Number of obs = 1586


Group variable: nh Number of groups = 793

Random effects u_i ~ Gaussian Obs per group: min = 2


avg = 2.0
max = 2

Wald chi2(8) = 80.96


Log likelihood = -885.06544 Prob > chi2 = 0.0000

------------------------------------------------------------------------------
dfmfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sexhead | -.6398418 .3213653 -1.99 0.046 -1.269706 -.0099773
agehead | .0301637 .0070609 4.27 0.000 .0163246 .0440028
educhead | -.05691 .0282304 -2.02 0.044 -.1122405 -.0015794
lnland | -.3523769 .0611959 -5.76 0.000 -.4723187 -.2324352
vaccess | -.5339855 .2541767 -2.10 0.036 -1.032163 -.0358084
pcirr | 1.140745 .2725301 4.19 0.000 .6065954 1.674894
rice | .162991 .0475735 3.43 0.001 .0697486 .2562335
wheat | -.1849797 .0473762 -3.90 0.000 -.2778354 -.0921241
_cons | -.2928303 .7756034 -0.38 0.706 -1.812985 1.227324
-------------+----------------------------------------------------------------
/lnsig2u | 1.599932 .1961666 1.215452 1.984411
-------------+----------------------------------------------------------------
sigma_u | 2.225465 .2182809 1.836251 2.697177
rho | .8320088 .0274182 .7712622 .8791506
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) = 267.66 Prob >= chibar2 = 0.000

Comment: the random-effects probit model allows for time-invariant covariates but requires that the
random effects be uncorrelated with the covariates and the errors. The coefficient estimates give the
sign of the marginal effects. The marginal effects can be computed.
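One possible way to obtain the marginal effects is sketched below. It assumes the xtprobit model above has just been fitted and a Stata version with margins (11 or later). The choice of predict(pu0), which evaluates the probability with the random effect set to zero, is an illustration and not necessarily what the course intended. The last line is plain arithmetic on the reported sigma_u.

```stata
* Assumption: the xtprobit command above was the last estimation command.

* Average marginal effects on Pr(dfmfd = 1), evaluated at u_i = 0
margins, dydx(*) predict(pu0)

* Cross-check of the reported rho = sigma_u^2 / (1 + sigma_u^2),
* using sigma_u = 2.225465 from the output; this should reproduce
* the reported rho of about .832
display 2.225465^2 / (1 + 2.225465^2)
```

A rho of about .83 means that most of the latent error variance is attributed to the household effect, consistent with the strong rejection of rho=0 in the likelihood-ratio test.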
