Mastering ’Metrics
The Path from Cause to Effect
Joshua D. Angrist
and
Jörn-Steffen Pischke
PRINCETON UNIVERSITY PRESS ▪ PRINCETON AND OXFORD
Copyright © 2015 by Princeton University Press
Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540
In the United Kingdom: Princeton University Press, 6 Oxford Street, Woodstock, Oxfordshire OX20 1TW
press.princeton.edu
Jacket and illustration design by Wanda Espana
Book illustrations by Garrett Scafani
All Rights Reserved

Library of Congress Cataloging-in-Publication Data
Angrist, Joshua David.
Mastering ’metrics : the path from cause to effect / Joshua D. Angrist, Jörn-Steffen Pischke.
pages cm
Includes index.
Summary: “Applied econometrics, known to aficionados as ’metrics, is the original data science. ’Metrics encompasses the statistical methods economists use to untangle cause and effect in human affairs. Through accessible discussion and with a dose of kung fu-themed humor, Mastering ’Metrics presents the essential tools of econometric research and demonstrates why econometrics is exciting and useful. The five most valuable econometric methods, or what the authors call the Furious Five (random assignment, regression, instrumental variables, regression discontinuity designs, and differences in differences), are illustrated through well-crafted real-world examples (vetted for awesomeness by Kung Fu Panda’s Jade Palace). Does health insurance make you healthier? Randomized experiments provide answers. Are expensive private colleges and selective public high schools better than more pedestrian institutions? Regression analysis and a regression discontinuity design reveal the surprising truth. When private banks teeter, and depositors take their money and run, should central banks step in to save them? Differences-in-differences analysis of a Depression-era banking crisis offers a response. Could arresting O. J. Simpson have saved his ex-wife’s life? Instrumental variables methods instruct law enforcement authorities in how best to respond to domestic abuse. Wielding econometric tools with skill and confidence, Mastering ’Metrics uses data and statistics to illuminate the path from cause to effect. Shows why econometrics is important. Explains econometric research through humorous and accessible discussion. Outlines empirical methods central to modern econometric practice. Works through interesting and relevant real-world examples.”
Provided by publisher.
ISBN 978-0-691-15283-7 (hardback : alk. paper)
ISBN 978-0-691-15284-4 (paperback : alk. paper)
1. Econometrics. I. Pischke, Jörn-Steffen. II. Title.
HB139.A53984 2014
330.01′5195--dc23 2014024449
British Library Cataloging-in-Publication Data is available
This book has been composed in Sabon with Helvetica Neue Condensed family display using ZzTEX by Princeton Editorial Associates Inc., Scottsdale, Arizona
Printed on acid-free paper. ♾
Printed in the United States of America
1 3 5 7 9 10 8 6 4 2
http://press.princeton.edu
CONTENTS
List of Figures
List of Tables
Introduction
1 Randomized Trials
1.1 In Sickness and in Health (Insurance)
1.2 The Oregon Trail
Masters of ’Metrics: From Daniel to R. A. Fisher
Appendix: Mastering Inference
2 Regression
2.1 A Tale of Two Colleges
2.2 Make Me a Match, Run Me a Regression
2.3 Ceteris Paribus?
Masters of ’Metrics: Galton and Yule
Appendix: Regression Theory
3 Instrumental Variables
3.1 The Charter Conundrum
3.2 Abuse Busters
3.3 The Population Bomb
Masters of ’Metrics: The Remarkable Wrights
Appendix: IV Theory
4 Regression Discontinuity Designs
4.1 Birthdays and Funerals
4.2 The Elite Illusion
Masters of ’Metrics: Donald Campbell
5 Differences-in-Differences
5.1 A Mississippi Experiment
5.2 Drink, Drank, …
Masters of ’Metrics: John Snow
Appendix: Standard Errors for Regression DD
6 The Wages of Schooling
6.1 Schooling, Experience, and Earnings
6.2 Twins Double the Fun
6.3 Econometricians Are Known by Their … Instruments
6.4 Rustling Sheepskin in the Lone Star State
Appendix: Bias from Measurement Error
Abbreviations and Acronyms
Empirical Notes
Acknowledgments
Index
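FIGURES
1.1 A standard normal distribution
1.2 The distribution of the t-statistic for the mean in a sample of size 10
1.3 The distribution of the t-statistic for the mean in a sample of size 40
1.4 The distribution of the t-statistic for the mean in a sample of size 100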

2.1 The CEF and the regression line
2.2 Variance in X is good
3.1 Application and enrollment data from KIPP Lynn lotteries
3.2 IV in school: the effect of KIPP attendance on math scores
4.1 Birthdays and funerals
4.2 A sharp RD estimate of MLDA mortality effects
4.3 RD in action, three ways
4.4 Quadratic control in an RD design
4.5 RD estimates of MLDA effects on mortality by cause of death
4.6 Enrollment at BLS
4.7 Enrollment at any Boston exam school
4.8 Peer quality around the BLS cutoff
4.9 Math scores around the BLS cutoff
4.10 Thistlethwaite and Campbell’s Visual RD
5.1 Bank failures in the Sixth and Eighth Federal Reserve Districts
5.2 Trends in bank failures in the Sixth and Eighth Federal Reserve Districts
5.3 Trends in bank failures in the Sixth and Eighth Federal Reserve Districts, and the Sixth District’s DD counterfactual
5.4 An MLDA effect in states with parallel trends
5.5 A spurious MLDA effect in states where trends are not parallel
5.6 A real MLDA effect, visible even though trends are not parallel
5.7 John Snow’s DD recipe
6.1 The quarter of birth first stage
6.2 The quarter of birth reduced form
6.3 Last-chance exam scores and Texas sheepskin
6.4 The effect of last-chance exam scores on earnings
TABLES
1.1 Health and demographic characteristics of insured and uninsured couples in the NHIS
1.2 Outcomes and treatments for Khuzdar and Maria
1.3 Demographic characteristics and baseline health in the RAND HIE
1.4 Health expenditure and health outcomes in the RAND HIE
1.5 OHP effects on insurance coverage and health-care use
1.6 OHP effects on health indicators and financial health
2.1 The college matching matrix
2.2 Private school effects: Barron’s matches
2.3 Private school effects: Average SAT score controls
2.4 School selectivity effects: Average SAT score controls
2.5 Private school effects: Omitted variables bias
3.1 Analysis of KIPP lotteries
3.2 The four types of children
3.3 Assigned and delivered treatments in the MDVE
3.4 Quantity-quality first stages
3.5 OLS and 2SLS estimates of the quantity-quality trade-off
4.1 Sharp RD estimates of MLDA effects on mortality
5.1 Wholesale firm failures and sales in 1929 and 1933
5.2 Regression DD estimates of MLDA effects on death rates
5.3 Regression DD estimates of MLDA effects controlling for beer taxes
6.1 How bad control creates selection bias
6.2 Returns to schooling for Twinsburg twins
6.3 Returns to schooling using child labor law instruments
6.4 IV recipe for an estimate of the returns to schooling using a single quarter of birth instrument
6.5 Returns to schooling using alternative quarter of birth instruments
INTRODUCTION

BLIND MASTER PO: Close your eyes. What do you hear?
YOUNG KWAI CHANG CAINE: I hear the water, I hear the birds.
MASTER PO: Do you hear your own heartbeat?
KWAI CHANG CAINE: No.
MASTER PO: Do you hear the grasshopper that is at your feet?
KWAI CHANG CAINE: Old man, how is it that you hear these things?
MASTER PO: Young man, how is it that you do not?
Kung Fu, Pilot
Economists’ reputation for dismality is a bad rap. Economics is as exciting as any science can be: the world is our lab, and the many diverse people in it are our subjects.
The excitement in our work comes from the opportunity to learn about cause and effect in human affairs. The big questions of the day are our questions: Will loose monetary policy spark economic growth or just fan the fires of inflation? Iowa farmers and the Federal Reserve chair want to know. Will mandatory health insurance really make Americans healthier? Such policy kindling lights the fires of talk radio. We approach these questions coolly, however, armed not with passion but with data.
Economists’ use of data to answer cause-and-effect questions constitutes the field of applied econometrics, known to students and masters alike as ’metrics. The tools of the ’metrics trade are disciplined data analysis, paired with the machinery of statistical inference. There is a mystical aspect to our work as well: we’re after truth, but truth is not revealed in full, and the messages the data transmit require interpretation. In this spirit, we draw inspiration from the journey of Kwai Chang Caine, hero of the classic Kung Fu TV series. Caine, a mixed-race Shaolin monk, wanders in search of his U.S.-born half-brother in the nineteenth-century American West. As he searches, Caine questions all he sees in human affairs, uncovering hidden relationships and deeper meanings. Like Caine’s journey, the Way of ’Metrics is illuminated by questions.
Other Things Equal
In a disturbing development you may have heard of, the proportion of American college students completing their degrees in a timely fashion has taken a sharp turn south. Politicians and policy analysts blame falling college graduation rates on a pernicious combination of tuition hikes and the large student loans many students use to finance their studies. Perhaps increased student borrowing derails some who would otherwise stay on track. The fact that the students most likely to drop out of school often shoulder large student loans would seem to substantiate this hypothesis.
You’d rather pay for school with inherited riches than borrowed money if you can. As we’ll discuss in detail, however, education probably boosts earnings enough to make loan repayment bearable for most graduates. How then should we interpret the negative correlation between debt burden and college graduation rates? Does indebtedness cause debtors to drop out? The first question to ask in this context is who borrows the most. Students who borrow heavily typically come from middle and lower income families, since richer families have more savings. For many reasons, students from lower income families are less likely to complete a degree than those from higher income families, regardless of whether they’ve borrowed heavily. We should therefore be skeptical of claims that high debt burdens cause lower college completion rates when these claims are based solely on comparisons of completion rates between those with more or less debt. By virtue of the correlation between family background and college debt, the contrast in graduation rates between those with and without student loans is not an other things equal comparison.
As college students majoring in economics, we first learned the other things equal idea by its Latin name, ceteris paribus. Comparisons made under ceteris paribus conditions have a causal interpretation. Imagine two students identical in every way, so their families have the same financial resources and their parents are similarly educated. One of these virtual twins finances college by borrowing and the other from savings. Because they are otherwise equal in every way (their grandmother has treated both to a small nest egg), differences in their educational attainment can be attributed to the fact that only one has borrowed. To this day, we wonder why so many economics students first encounter this central idea in Latin; maybe it’s a conspiracy to keep them from thinking about it. Because, as this hypothetical comparison suggests, real other things equal comparisons are hard to engineer, some would even say impossibile (that’s Italian not Latin, but at least people still speak it).
Hard to engineer, maybe, but not necessarily impossible. The ’metrics craft uses data to get to other things equal in spite of the obstacles (called selection bias or omitted variables bias) found on the path running from raw numbers to reliable causal knowledge. The path to causal understanding is rough and shadowed as it snakes around the boulders of selection bias. And yet, masters of ’metrics walk this path with confidence as well as humility, successfully linking cause and effect.
Our first line of attack on the causality problem is a randomized experiment, often called a randomized trial. In a randomized trial, researchers change the causal variables of interest (say, the availability of college financial aid) for a group selected using something like a coin toss. By changing circumstances randomly, we make it highly likely that the variable of interest is unrelated to the many other factors determining the outcomes we mean to study. Random assignment isn’t the same as holding everything else fixed, but it has the same effect. Random manipulation makes other things equal hold on average across the groups that did and did not experience manipulation. As we explain in Chapter 1, “on average” is usually good enough.
Randomized trials take pride of place in our ’metrics toolkit. Alas, randomized social experiments are expensive to field and may be slow to bear fruit, while research funds are scarce and life is short. Often, therefore, masters of ’metrics turn to less powerful but more accessible research designs. Even when we can’t practicably randomize, however, we still dream of the trials we’d like to do. The notion of an ideal experiment disciplines our approach to econometric research. Mastering ’Metrics shows how wise application of our five favorite econometric tools brings us as close as possible to the causality-revealing power of a real experiment.
Our favorite econometric tools are illustrated here through a series of well-crafted and important econometric studies. Vetted by Grand Master Oogway of Kung Fu Panda’s Jade Palace, these investigations of causal effects are distinguished by their awesomeness. The methods they use (random assignment, regression, instrumental variables, regression discontinuity designs, and differences-in-differences) are the Furious Five of econometric research. For starters, motivated by the contemporary American debate over health care, the first chapter describes two social experiments that reveal whether, as many policymakers believe, health insurance indeed helps those who have it stay healthy. Chapters 2–5 put our other tools to work, crafting answers to important questions ranging from the benefits of attending private colleges and selective high schools to the costs of teen drinking and the effects of central bank injections of liquidity.
Our final chapter puts the Furious Five to the test by returning to the education arena. On average, college graduates earn about twice as much as high school graduates, an earnings gap that only seems to be growing. Chapter 6 asks whether this gap is evidence of a large causal return to schooling or merely a reflection of the many other advantages those with more education might have (such as more educated parents). Can the relationship between schooling and earnings ever be evaluated on a ceteris paribus basis, or must the boulders of selection bias forever block our way? The challenge of quantifying the causal link between schooling and earnings provides a gripping test match for ’metrics tools and the masters who wield them.
Chapter 1
Randomized Trials
KWAI CHANG CAINE: What happens in a man’s life is already written. A man must move through life as his destiny wills.
OLD MAN: Yet each is free to live as he chooses. Though they seem opposite, both are true.
Kung Fu, Pilot
Our Path
Our path begins with experimental random assignment, both as a framework for causal questions and a benchmark by which the results from other methods are judged. We illustrate the awesome power of random assignment through two randomized evaluations of the effects of health insurance. The appendix to this chapter also uses the experimental framework to review the concepts and methods of statistical inference.
1.1 In Sickness and in Health (Insurance)
The Affordable Care Act (ACA) has proven to be one of the most controversial and interesting policy innovations we’ve seen. The ACA requires Americans to buy health insurance, with a tax penalty for those who don’t voluntarily buy in. The question of the proper role of government in the market for health care has many angles. One is the causal effect of health insurance on health. The United States spends more of its GDP on health care than do other developed nations, yet Americans are surprisingly unhealthy. For example, Americans are more likely to be overweight and die sooner than their Canadian cousins, who spend only about two-thirds as much on care. America is also unusual among developed countries in having no universal health insurance scheme. Perhaps there’s a causal connection here.
Elderly Americans are covered by a federal program called Medicare, while some poor Americans (including most single mothers, their children, and many other poor children) are covered by Medicaid. Many of the working, prime-age poor, however, have long been uninsured. In fact, many uninsured Americans have chosen not to participate in an employer-provided insurance plan.1 These workers, perhaps correctly, count on hospital emergency departments, which cannot turn them away, to address their health-care needs. But the emergency department might not be the best place to treat, say, the flu, or to manage chronic conditions like diabetes and hypertension that are so pervasive among poor Americans. The emergency department is not required to provide long-term care. It therefore stands to reason that government-mandated health insurance might yield a health dividend. The push for subsidized universal health insurance stems in part from the belief that it does.
The ceteris paribus question in this context contrasts the health of someone with insurance coverage to the health of the same person were they without insurance (other than an emergency department backstop). This contrast highlights a fundamental empirical conundrum: people are either insured or not. We don’t get to see them both ways, at least not at the same time in exactly the same circumstances.
In his celebrated poem, “The Road Not Taken,” Robert Frost used the metaphor of a crossroads to describe the causal effects of personal choice:
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;
Frost’s traveler concludes:
Two roads diverged in a wood, and I--
I took the one less traveled by,
And that has made all the difference.
The traveler claims his choice has mattered, but, being only one person, he can’t be sure. A later trip or a report by other travelers won’t nail it down for him, either. Our narrator might be older and wiser the second time around, while other travelers might have different experiences on the same road. So it is with any choice, including those related to health insurance: would uninsured men with heart disease be disease-free if they had insurance? In the novel Light Years, James Salter’s irresolute narrator observes: “Acts demolish their alternatives, that is the paradox.” We can’t know what lies at the end of the road not taken.
We can’t know, but evidence can be brought to bear on the question. This chapter takes you through some of the evidence related to paths involving health insurance. The starting point is the National Health Interview Survey (NHIS), an annual survey of the U.S. population with detailed information on health and health insurance. Among many other things, the NHIS asks: “Would you say your health in general is excellent, very good, good, fair, or poor?” We used this question to code an index that assigns 5 to excellent health and 1 to poor health in a sample of married 2009 NHIS respondents who may or may not be insured.2 This index is our outcome: a measure we’re interested in studying. The causal relation of interest here is determined by a variable that indicates coverage by private health insurance. We call this variable the treatment, borrowing from the literature on medical trials, although the treatments we’re interested in need not be medical treatments like drugs or surgery. In this context, those with insurance can be thought of as the treatment group; those without insurance make up the comparison or control group. A good control group reveals the fate of the treated in a counterfactual world where they are not treated.
The first row of Table 1.1 compares the average health index of insured and uninsured Americans, with statistics tabulated separately for husbands and wives.3 Those with health insurance are indeed healthier than those without, a gap of about 0.3 in the index for men and 0.4 in the index for women. These are large differences when measured against the standard deviation of the health index, which is about 1. (Standard deviations, reported in square brackets in Table 1.1, measure variability in data. The chapter appendix reviews the relevant formula.) These large gaps might be the health dividend we’re looking for.

Fruitless and Fruitful Comparisons
Simple comparisons, such as those at the top of Table 1.1, are often cited as evidence of causal effects. More often than not, however, such comparisons are misleading. Once again the problem is other things equal, or lack thereof. Comparisons of people with and without health insurance are not apples to apples; such contrasts are apples to oranges, or worse.
Among other differences, those with health insurance are better educated, have higher income, and are more likely to be working than the uninsured. This can be seen in panel B of Table 1.1, which reports the average characteristics of NHIS respondents who do and don’t have health insurance. Many of the differences in the table are large (for example, a nearly 3-year schooling gap); most are statistically precise enough to rule out the hypothesis that these discrepancies are merely chance findings (see the chapter appendix for a refresher on statistical significance). It won’t surprise you to learn that most variables tabulated here are highly correlated with health as well as with health insurance status. More-educated people, for example, tend to be healthier as well as being overrepresented in the insured group. This may be because more-educated people exercise more, smoke less, and are more likely to wear seat belts. It stands to reason that the difference in health between insured and uninsured NHIS respondents at least partly reflects the extra schooling of the insured.

TABLE 1.1
Health and demographic characteristics of insured and uninsured couples in the NHIS
Notes: This table reports average characteristics for insured and uninsured married couples in the 2009 National Health Interview Survey (NHIS). Columns (1), (2), (4), and (5) show average characteristics of the group of individuals specified by the column heading. Columns (3) and (6) report the difference between the average characteristic for individuals with and without health insurance (HI). Standard deviations are in brackets; standard errors are reported in parentheses.
Our effort to understand the causal connection between insurance and health is aided by fleshing out Frost’s two-roads metaphor. We use the letter $Y$ as shorthand for health, the outcome variable of interest. To make it clear when we’re talking about specific people, we use subscripts as a stand-in for names: $Y_i$ is the health of individual $i$. The outcome $Y_i$ is recorded in our data. But, facing the choice of whether to pay for health insurance, person $i$ has two potential outcomes, only one of which is observed. To distinguish one potential outcome from another, we add a second subscript: The road taken without health insurance leads to $Y_{0i}$ (read this as “y-zero-i”) for person $i$, while the road with health insurance leads to $Y_{1i}$ (read this as “y-one-i”) for person $i$. Potential outcomes lie at the end of each road one might take. The causal effect of insurance on health is the difference between them, written $Y_{1i} - Y_{0i}$.4
To nail this down further, consider the story of visiting Massachusetts Institute of Technology (MIT) student Khuzdar Khalat, recently arrived from Kazakhstan. Kazakhstan has a national health insurance system that covers all its citizens automatically (though you wouldn’t go there just for the health insurance). Arriving in Cambridge, Massachusetts, Khuzdar is surprised to learn that MIT students must decide whether to opt in to the university’s health insurance plan, for which MIT levies a hefty fee. Upon reflection, Khuzdar judges the MIT insurance worth paying for, since he fears upper respiratory infections in chilly New England. Let’s say that $Y_{0i} = 3$ and $Y_{1i} = 4$ for $i$ = Khuzdar. For him, the causal effect of insurance is one step up on the NHIS scale:
\[ Y_{1,\text{Khuzdar}} - Y_{0,\text{Khuzdar}} = 4 - 3 = 1. \]
Table 1.2 summarizes this information.
TABLE 1.2
Outcomes and treatments for Khuzdar and Maria
                                                    Khuzdar Khalat   Maria Moreño
Potential outcome without insurance: $Y_{0i}$              3               5
Potential outcome with insurance: $Y_{1i}$                 4               5
Treatment (insurance status chosen): $D_i$                 1               0
Actual health outcome: $Y_i$                               4               5
Treatment effect: $Y_{1i} - Y_{0i}$                        1               0

It’s worth emphasizing that Table 1.2 is an imaginary table: some of the information it describes must remain hidden. Khuzdar will either buy insurance, revealing his value of $Y_{1i}$, or he won’t, in which case his $Y_{0i}$ is revealed. Khuzdar has walked many a long and dusty road in Kazakhstan, but even he cannot be sure what lies at the end of those not taken.
Maria Moreño is also coming to MIT this year; she hails from Chile’s Andean highlands. Little concerned by Boston winters, hearty Maria is not the type to fall sick easily. She therefore passes up the MIT insurance, planning to use her money for travel instead. Because Maria has $Y_{0,\text{Maria}} = Y_{1,\text{Maria}} = 5$, the causal effect of insurance on her health is
\[ Y_{1,\text{Maria}} - Y_{0,\text{Maria}} = 5 - 5 = 0. \]
Maria’s numbers likewise appear in Table 1.2.
Since Khuzdar and Maria make different insurance choices, they offer an interesting comparison. Khuzdar’s health is $Y_{\text{Khuzdar}} = Y_{1,\text{Khuzdar}} = 4$, while Maria’s is $Y_{\text{Maria}} = Y_{0,\text{Maria}} = 5$. The difference between them is
\[ Y_{\text{Khuzdar}} - Y_{\text{Maria}} = 4 - 5 = -1. \]
Taken at face value, this quantity, which we observe, suggests Khuzdar’s decision to buy insurance is counterproductive. His MIT insurance coverage notwithstanding, insured Khuzdar’s health is worse than uninsured Maria’s.
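In fact, the comparison between frail Khuzdar and hearty Maria tells us little about the causal effects of their choices. This can be seen by linking observed and potential outcomes as follows:
\[ \begin{aligned} Y_{\text{Khuzdar}} - Y_{\text{Maria}} &= Y_{1,\text{Khuzdar}} - Y_{0,\text{Maria}} \\ &= \underbrace{Y_{1,\text{Khuzdar}} - Y_{0,\text{Khuzdar}}}_{1} + \underbrace{Y_{0,\text{Khuzdar}} - Y_{0,\text{Maria}}}_{-2}. \end{aligned} \]
The second line in this equation is derived by adding and subtracting $Y_{0,\text{Khuzdar}}$, thereby generating two hidden comparisons that determine the one we see. The first comparison, $Y_{1,\text{Khuzdar}} - Y_{0,\text{Khuzdar}}$, is the causal effect of health insurance on Khuzdar, which is equal to 1. The second, $Y_{0,\text{Khuzdar}} - Y_{0,\text{Maria}}$, is the difference between the two students’ health status were both to decide against insurance. This term, equal to −2, reflects Khuzdar’s relative frailty. In the context of our effort to uncover causal effects, the lack of comparability captured by the second term is called selection bias.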
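The decomposition for the two students can be checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers come from Table 1.2, while the dictionary layout and the helper name `observed_health` are hypothetical choices made for this example.

```python
# Potential outcomes for the two students, as given in Table 1.2.
# The dictionary layout is an assumption made for this illustration.
students = {
    "Khuzdar": {"y0": 3, "y1": 4, "insured": True},
    "Maria": {"y0": 5, "y1": 5, "insured": False},
}


def observed_health(name):
    """Return the outcome we actually see: Y1 if insured, Y0 if not."""
    person = students[name]
    return person["y1"] if person["insured"] else person["y0"]


# The comparison we can observe in data.
observed_gap = observed_health("Khuzdar") - observed_health("Maria")

# The two hidden pieces obtained by adding and subtracting Khuzdar's Y0.
causal_effect = students["Khuzdar"]["y1"] - students["Khuzdar"]["y0"]
selection_bias = students["Khuzdar"]["y0"] - students["Maria"]["y0"]

print(observed_gap, causal_effect, selection_bias)  # -1 1 -2
```

The observed gap of −1 is the sum of a causal effect of 1 and a selection bias of −2, so the sign of the raw comparison says nothing reliable about the sign of the effect.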
You might think that selection bias has something to do with our focus on particular individuals instead of on groups, where, perhaps, extraneous differences can be expected to “average out.” But the difficult problem of selection bias carries over to comparisons of groups, though, instead of individual causal effects, our attention shifts to average causal effects. In a group of $n$ people, average causal effects are written $\mathrm{Avg}_n[Y_{1i} - Y_{0i}]$, where averaging is done in the usual way (that is, we sum individual outcomes and divide by $n$):
\[ \mathrm{Avg}_n[Y_{1i} - Y_{0i}] = \frac{1}{n}\sum_{i=1}^{n}\left[Y_{1i} - Y_{0i}\right] = \frac{1}{n}\sum_{i=1}^{n}Y_{1i} - \frac{1}{n}\sum_{i=1}^{n}Y_{0i}. \tag{1.1} \]
The symbol $\sum_{i=1}^{n}$ indicates a sum over everyone from $i = 1$ to $n$, where $n$ is the size of the group over which we are averaging. Note that both summations in equation (1.1) are taken over everybody in the group of interest. The average causal effect of health insurance compares average health in hypothetical scenarios where everybody in the group does and does not have health insurance. As a computational matter, this is the average of individual causal effects like $Y_{1,\text{Khuzdar}} - Y_{0,\text{Khuzdar}}$ and $Y_{1,\text{Maria}} - Y_{0,\text{Maria}}$ for each student in our data.
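As a small numerical check on the averaging described above, the following sketch treats Khuzdar and Maria as a group of n = 2 and computes the average causal effect two ways. The values come from the text; the variable names are hypothetical. Keep in mind that this calculation is possible only in an imaginary table where both potential outcomes are visible for everyone.

```python
from statistics import mean

# Potential outcomes for a tiny group of n = 2 (Khuzdar, then Maria),
# using the values the text assigns to them.
y0 = [3, 5]  # health without insurance
y1 = [4, 5]  # health with insurance

# Average of the individual causal effects Y1i - Y0i.
avg_of_effects = mean(b - a for a, b in zip(y0, y1))

# Difference between the two hypothetical averages:
# everybody insured versus nobody insured.
diff_of_avgs = mean(y1) - mean(y0)

print(avg_of_effects, diff_of_avgs)  # 0.5 0.5
```

Both routes give 0.5, which is the sense in which the two summations in the averaging formula range over the same people.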
An investigation of the average causal effect of insurance naturally begins by comparing the average health of groups of insured and uninsured people, as in Table 1.1. This comparison is facilitated by the construction of a dummy variable, $D_i$, which takes on the values 0 and 1 to indicate insurance status:
\[ D_i = \begin{cases} 1 & \text{if } i \text{ is insured} \\ 0 & \text{otherwise.} \end{cases} \]
We can now write $\mathrm{Avg}_n[Y_i \mid D_i = 1]$ for the average among the insured and $\mathrm{Avg}_n[Y_i \mid D_i = 0]$ for the average among the uninsured. These quantities are averages conditional on insurance status.5
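The average $Y_i$ for the insured is necessarily an average of outcome $Y_{1i}$, but contains no information about $Y_{0i}$. Likewise, the average $Y_i$ among the uninsured is an average of outcome $Y_{0i}$, but this average is devoid of information about the corresponding $Y_{1i}$. In other words, the road taken by those with insurance ends with $Y_{1i}$, while the road taken by those without insurance leads to $Y_{0i}$. This in turn leads to a simple but important conclusion about the difference in average health by insurance status:
\[ \begin{aligned} \text{Difference in group means} &= \mathrm{Avg}_n[Y_i \mid D_i = 1] - \mathrm{Avg}_n[Y_i \mid D_i = 0] \\ &= \mathrm{Avg}_n[Y_{1i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0], \end{aligned} \tag{1.2} \]
an expression highlighting the fact that the comparisons in Table 1.1 tell us something about potential outcomes, though not necessarily what we want to know. We’re after $\mathrm{Avg}_n[Y_{1i} - Y_{0i}]$, an average causal effect involving everyone’s $Y_{1i}$ and everyone’s $Y_{0i}$, but we see average $Y_{1i}$ only for the insured and average $Y_{0i}$ only for the uninsured.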
To sharpen our understanding of equation (1.2), it helps to imagine that health insurance makes everyone healthier by a constant amount, $\kappa$. As is the custom among our people, we use Greek letters to label such parameters, so as to distinguish them from variables or data; this one is the letter “kappa.” The constant-effects assumption allows us to write:
\[ Y_{1i} = Y_{0i} + \kappa, \tag{1.3} \]
or, equivalently, $Y_{1i} - Y_{0i} = \kappa$. In other words, $\kappa$ is both the individual and average causal effect of insurance on health. The question at hand is how comparisons such as those at the top of Table 1.1 relate to $\kappa$.
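Using the constant-effects model (equation (1.3)) to substitute for $\mathrm{Avg}_n[Y_{1i} \mid D_i = 1]$ in equation (1.2), we have:
\[ \begin{aligned} \mathrm{Avg}_n[Y_{1i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0] &= \left\{\kappa + \mathrm{Avg}_n[Y_{0i} \mid D_i = 1]\right\} - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0] \\ &= \kappa + \left\{\mathrm{Avg}_n[Y_{0i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0]\right\}. \end{aligned} \]
This equation reveals that health comparisons between those with and without insurance equal the causal effect of interest ($\kappa$) plus the difference in average $Y_{0i}$ between the insured and the uninsured. As in the parable of Khuzdar and Maria, this second term describes selection bias. Specifically, the difference in average health by insurance status can be written:
\[ \text{Difference in group means} = \text{Average causal effect} + \text{Selection bias}, \]
where selection bias is defined as the difference in average $Y_{0i}$ between the groups being compared.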
How do we know that the difference in means by insurance status is contaminated by selection bias? We know because $Y_{0i}$ is shorthand for everything about person $i$ related to health, other than insurance status. The lower part of Table 1.1 documents important noninsurance differences between the insured and uninsured, showing that ceteris isn’t paribus here in many ways. The insured in the NHIS are healthier for all sorts of reasons, including, perhaps, the causal effects of insurance. But the insured are also healthier because they are more educated, among other things. To see why this matters, imagine a world in which the causal effect of insurance is zero (that is, $\kappa = 0$). Even in such a world, we should expect insured NHIS respondents to be healthier, simply because they are more educated, richer, and so on.
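The imagined world with no causal effect can be simulated. The sketch below is not based on NHIS data: the population, the education shares, the insurance probabilities, and the health levels are all made-up assumptions chosen only to show how a constant effect and selection bias add up to the observed gap between group means.

```python
import random
from statistics import mean


def simulate(kappa, n=10_000, seed=0):
    """Simulate (observed health, Y0, insured) under a constant effect kappa.

    All numbers here are hypothetical: educated people are assumed to be
    both healthier without insurance and more likely to buy insurance.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        educated = rng.random() < 0.5
        y0 = (3.5 if educated else 3.0) + rng.gauss(0, 1)
        insured = rng.random() < (0.8 if educated else 0.4)
        y = y0 + kappa if insured else y0  # constant-effects model
        rows.append((y, y0, insured))
    return rows


def gap_and_bias(rows):
    """Return (difference in group means, selection bias in Y0)."""
    gap = (mean(y for y, _, d in rows if d)
           - mean(y for y, _, d in rows if not d))
    bias = (mean(y0 for _, y0, d in rows if d)
            - mean(y0 for _, y0, d in rows if not d))
    return gap, bias


gap_zero, bias_zero = gap_and_bias(simulate(kappa=0.0))
gap_half, bias_half = gap_and_bias(simulate(kappa=0.5))
print(f"kappa=0.0: gap={gap_zero:.3f}, selection bias={bias_zero:.3f}")
print(f"kappa=0.5: gap={gap_half:.3f}, selection bias={bias_half:.3f}")
```

With a causal effect of zero, the insured still look healthier, and the whole gap is selection bias. With an effect of 0.5, the gap is 0.5 plus that same bias.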
We wrap up this discussion by pointing out the subtle role played by information like that reported in panel B of Table 1.1. This panel shows that the groups being compared differ in ways that we can observe. As we’ll see in the next chapter, if the only source of selection bias is a set of differences in characteristics that we can observe and measure, selection bias is (relatively) easy to fix. Suppose, for example, that the only source of selection bias in the insurance comparison is education. This bias is eliminated by focusing on samples of people with the same schooling, say, college graduates. Education is the same for insured and uninsured people in such a sample, because it’s the same for everyone in the sample.
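A toy calculation makes the same-schooling argument concrete. In the sketch below, health without insurance is assumed to depend on education alone, and the cell counts, health levels, and the effect size `KAPPA` are invented for illustration. Under those assumptions the raw comparison is biased, while comparisons within an education group recover the effect.

```python
from statistics import mean

KAPPA = 0.5  # hypothetical constant causal effect of insurance

# (educated, insured, number of people): made-up counts in which
# educated people are more likely to be insured.
cells = [
    (True, True, 8), (True, False, 2),
    (False, True, 4), (False, False, 6),
]

people = []
for educated, insured, count in cells:
    y0 = 4.0 if educated else 3.0  # Y0 depends only on education here
    y = y0 + KAPPA if insured else y0
    people += [(educated, insured, y)] * count


def gap(rows):
    """Difference in mean health between insured and uninsured rows."""
    insured = [y for _, d, y in rows if d]
    uninsured = [y for _, d, y in rows if not d]
    return mean(insured) - mean(uninsured)


raw_gap = gap(people)
gap_educated = gap([p for p in people if p[0]])
gap_not_educated = gap([p for p in people if not p[0]])
print(round(raw_gap, 3), gap_educated, gap_not_educated)  # 0.917 0.5 0.5
```

The raw gap overstates the effect because the insured group contains more educated people. Holding education fixed removes the bias only because education is, by construction, the sole difference between the groups.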
The subtlety in Table 1.1 arises because when observed differences proliferate, so should our suspicions about unobserved differences. The fact that people with and without health insurance differ in many visible ways suggests that even were we to hold observed characteristics fixed, the uninsured would likely differ from the insured in ways we don’t see (after all, the list of variables we can see is partly fortuitous). In other words, even in a sample consisting of insured and uninsured people with the same education, income, and employment status, the insured might have higher values of $Y_{0i}$. The principal challenge facing masters of ’metrics is elimination of the selection bias that arises from such unobserved differences.
Breaking the Deadlock: Just RANDomize
My doctor gave me 6 months to live … but when I couldn’t pay the bill, he gave me 6 months more.
Walter Matthau
Experimental random assignment eliminates selection bias. The logistics of a randomized experiment, sometimes called a randomized trial, can be complex, but the logic is simple. To study the effects of health insurance in a randomized trial, we’d start with a sample of people who are currently uninsured. We’d then provide health insurance to a randomly chosen subset of this sample, and let the rest go to the emergency department if the need arises. Later, the health of the insured and uninsured groups can be compared. Random assignment makes this comparison ceteris paribus: groups insured and uninsured by random assignment differ only in their insurance status and any consequences that follow from it.
Suppose the MIT Health Service elects to forgo payment and tosses a coin to determine the insurance status of new students Ashish and Zandile (just this once, as a favor to their distinguished Economics Department). Zandile is insured if the toss comes up heads; otherwise, Ashish gets the coverage. A good start, but not good enough, since random assignment of two experimental subjects does not produce insured and uninsured apples. For one thing, Ashish is male and Zandile female. Women, as a rule, are healthier than men. If Zandile winds up healthier, it might be due to her good luck in having been born a woman and unrelated to her lucky draw in the insurance lottery. The problem here is that two is not enough to tango when it comes to random assignment. We must randomly assign treatment in a sample that’s large enough to ensure that differences in individual characteristics like sex wash out.
Two randomly chosen groups, when large enough, are indeed comparable. This fact is due to a powerful statistical property known as the Law of Large Numbers (LLN). The LLN characterizes the behavior of sample averages in relation to sample size. Specifically, the LLN says that a sample average can be brought as close as we like to the average in the population from which it is drawn (say, the population of American college students) simply by enlarging the sample.
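To see the LLN in action, play dice.6 Specifically, roll a fair die once and save the result. Then roll again and average these two results. Keep on rolling and averaging. The numbers 1 to 6 are equally likely (that’s why the die is said to be “fair”), so we can expect to see each value an equal number of times if we play long enough. Since there are six possibilities here, and all are equally likely, the expected outcome is an equally weighted average of each possibility, with weights equal to 1/6:
\[ \left(1 \times \tfrac{1}{6}\right) + \left(2 \times \tfrac{1}{6}\right) + \left(3 \times \tfrac{1}{6}\right) + \left(4 \times \tfrac{1}{6}\right) + \left(5 \times \tfrac{1}{6}\right) + \left(6 \times \tfrac{1}{6}\right) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5. \]
This average value of 3.5 is called a mathematical expectation; in this case, it’s the average value we’d get in infinitely many rolls of a fair die. The expectation concept is important to our work, so we define it formally here.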
MATHEMATICAL EXPECTATION The mathematical expectation of a variable, Yi, written E[Yi], is the population average of this variable. If Yi is a variable generated by a random process, such as throwing a die, E[Yi] is the average in infinitely many repetitions of this process. If Yi is a variable that comes from a sample survey, E[Yi] is the average obtained if everyone in the population from which the sample is drawn were to be enumerated.
When a die is rolled only a few times, the average toss may be far from the corresponding mathematical expectation. Roll two times, for example, and you might get boxcars or snake eyes (two sixes or two ones). These average to values well away from the expected value of 3.5. But as the number of tosses goes up, the average across tosses reliably tends to 3.5. This is the LLN in action (and it’s how casinos make a profit: in most gambling games, you can’t beat the house in the long run, because the expected payout for players is negative). More remarkably, it needn’t take too many rolls or too large a sample for a sample average to approach the expected value. The chapter appendix addresses the question of how the number of rolls or the size of a sample survey determines statistical accuracy.
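The dice game is easy to simulate. What follows is a minimal sketch, assuming Python’s standard random module, an arbitrary seed, and a made-up function name (running_averages); none of these come from the text. It rolls a fair die many times and reports the average after 2, 10, 100, 1,000, and 100,000 rolls, which should settle near the expectation of 3.5.

```python
import random


def running_averages(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times and return the running averages."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    total = 0
    averages = []
    for k in range(1, n_rolls + 1):
        total += rng.randint(1, 6)   # each face 1..6 is equally likely
        averages.append(total / k)   # average of the first k rolls
    return averages


# Mathematical expectation of one roll: equally weighted average of the six faces.
expected_value = sum(range(1, 7)) / 6

averages = running_averages(100_000)
for k in (2, 10, 100, 1_000, 100_000):
    print(f"average after {k:>7,} rolls: {averages[k - 1]:.3f}")
print(f"expected value: {expected_value}")
```

The early averages can land far from 3.5; the later ones should not. Changing the seed changes the path but not the destination.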
In randomized trials, experimental samples are created by sampling from a population we’d like to study rather than by repeating a game, but the LLN works just the same. When sampled subjects are randomly divided (as if by a coin toss) into treatment and control groups, they come from the same underlying population. The LLN therefore promises that those in randomly assigned treatment and control samples will be similar if the samples are large enough. For example, we expect to see similar proportions of men and women in randomly assigned treatment and control groups. Random assignment also produces groups of about the same age and with similar schooling levels. In fact, randomly assigned groups should be similar in every way, including in ways that we cannot easily measure or observe. This is the root of random assignment’s awesome power to eliminate selection bias.
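To see why two subjects are not enough while many are, consider a small simulation. This is only a sketch under stated assumptions: a hypothetical population that is half women, coin-toss assignment, and an invented helper named share_female_gap. It reports the treatment-minus-control gap in the share of women for samples of increasing size.

```python
import random


def share_female_gap(n_subjects, seed=0):
    """Treatment-minus-control gap in the share of women after coin-toss assignment."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_subjects):
        is_female = rng.random() < 0.5                        # hypothetical: half women
        group = treated if rng.random() < 0.5 else control    # the coin toss
        group.append(is_female)
    if not treated or not control:
        return None  # one group is empty, so there is nothing to compare
    return sum(treated) / len(treated) - sum(control) / len(control)


for n in (2, 20, 200, 20_000):
    gap = share_female_gap(n, seed=1)
    message = "no comparison possible" if gap is None else f"gap = {gap:+.3f}"
    print(f"{n:>6} subjects: {message}")
```

With a handful of subjects the gap can be anything; with many thousands it should sit close to zero, which is the LLN doing its work on a characteristic that was never used in the assignment.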
The power of random assignment can be described precisely using the following definition, which is closely related to the definition of mathematical expectation.
CONDITIONAL EXPECTATION The conditional expectation of a variable, Yi, given a dummy variable, Di = 1, is written E[Yi|Di = 1]. This is the average of Yi in the population that has Di equal to 1. Likewise, the conditional expectation of a variable, Yi, given Di = 0, written E[Yi|Di = 0], is the average of Yi in the population that has Di equal to 0. If Yi and Di are variables generated by a random process, such as throwing a die under different circumstances, E[Yi|Di = d] is the average of infinitely many repetitions of this process while holding the circumstances indicated by Di fixed at d. If Yi and Di come from a sample survey, E[Yi|Di = d] is the average computed when everyone in the population who has Di = d is sampled.
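In a sample, the counterpart of a conditional expectation is simply an average taken over the observations with the dummy set to the value of interest. A minimal sketch, using six made-up observations and a hypothetical helper called conditional_average:

```python
def conditional_average(y, d, value):
    """Average of y over the observations whose dummy d equals value."""
    selected = [y_i for y_i, d_i in zip(y, d) if d_i == value]
    return sum(selected) / len(selected)


# Hypothetical health index (y) and insurance dummy (d) for six people.
y = [80, 60, 75, 65, 85, 55]
d = [1, 0, 1, 0, 1, 0]

avg_insured = conditional_average(y, d, 1)    # sample analog of E[Yi | Di = 1]
avg_uninsured = conditional_average(y, d, 0)  # sample analog of E[Yi | Di = 0]
print(avg_insured, avg_uninsured, avg_insured - avg_uninsured)  # prints 80.0 60.0 20.0
```

The numbers are invented for illustration and say nothing about real insured and uninsured groups.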
Because randomly assigned treatment and control groups come from the same underlying population, they are the same in every way, including their expected Y0i. In other words, the conditional expectations, E[Y0i|Di = 1] and E[Y0i|Di = 0], are the same. This in turn means that:
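RANDOM ASSIGNMENT ELIMINATES SELECTION BIAS When Di is randomly assigned, E[Y0i|Di = 1] = E[Y0i|Di = 0], and the difference in expectations by treatment status captures the causal effect of treatment:
$$\begin{aligned}
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] &= E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] \\
&= E[Y_{0i} + \kappa \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] \\
&= \kappa + E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] \\
&= \kappa.
\end{aligned}$$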
Provided the sample at hand is large enough for the LLN to work its magic (so we can replace the conditional averages in equation (1.4) with conditional expectations), selection bias disappears in a randomized experiment. Random assignment works not by eliminating individual differences but rather by ensuring that the mix of individuals being compared is the same. Think of this as comparing barrels that include equal proportions of apples and oranges. As we explain in the chapters that follow, randomization isn’t the only way to generate such ceteris paribus comparisons, but most masters believe it’s the best.
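A simulation makes the point concrete. The sketch below rests on assumptions that are ours alone: outcomes without treatment are drawn from a bell curve, treatment adds a constant amount (called KAPPA in the code and set arbitrarily to 2), and in the nonrandomized case people with better untreated outcomes are more likely to opt in. It compares the treated-minus-control difference in averages under the two assignment rules.

```python
import random

KAPPA = 2.0  # hypothetical constant causal effect of treatment


def average(values):
    return sum(values) / len(values)


def difference_in_means(n_subjects, randomized, seed=0):
    """Treated-minus-control difference in average outcomes."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_subjects):
        y0 = rng.gauss(50, 10)                   # hypothetical outcome without treatment
        if randomized:
            d = rng.random() < 0.5               # coin toss, unrelated to y0
        else:
            d = y0 + rng.gauss(0, 10) > 50       # the healthy tend to opt in
        (treated if d else control).append(y0 + KAPPA if d else y0)
    return average(treated) - average(control)


randomized_gap = difference_in_means(100_000, randomized=True)
selected_gap = difference_in_means(100_000, randomized=False)
print(f"randomized: {randomized_gap:.2f}  self-selected: {selected_gap:.2f}  true effect: {KAPPA}")
```

Under the coin toss, the difference in averages should land close to the assumed effect. Under self-selection it should overshoot by a wide margin, because the groups differ in their untreated outcomes as well as in their treatment status.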
When analyzing data from a randomized trial or any other research design, masters almost always begin with a check on whether treatment and control groups indeed look similar. This process, called checking for balance, amounts to a comparison of sample averages as in panel B of Table 1.1. The average characteristics in panel B appear dissimilar or unbalanced, underlining the fact that the data in this table don’t come from anything like an experiment. It’s worth checking for balance in this manner any time you find yourself estimating causal effects.
Random assignment of health insurance seems like a fanciful proposition. Yet health insurance coverage has twice been randomly assigned to large representative samples of Americans. The RAND Health Insurance Experiment (HIE), which ran from 1974 to 1982, was one of the most influential social experiments in research history. The HIE enrolled 3,958 people aged 14 to 61 from six areas of the country. The HIE sample excluded Medicare participants and most Medicaid and military health insurance subscribers. HIE participants were randomly assigned to one of 14 insurance plans. Participants did not have to pay insurance premiums, but the plans had a variety of provisions related to cost sharing, leading to large differences in the amount of insurance they offered.
The most generous HIE plan offered comprehensive care for free. At the other end of the insurance spectrum, three “catastrophic coverage” plans required families to pay 95% of their health-care costs, though these costs were capped as a proportion of income (or capped at $1,000 per family, if that was lower). The catastrophic plans approximate a no-insurance condition. A second insurance scheme (the “individual deductible” plan) also required families to pay 95% of outpatient charges, but only up to $150 per person or $450 per family. A group of nine other plans had a variety of coinsurance provisions, requiring participants to cover anywhere from 25% to 50% of charges, but always capped at a proportion of income or $1,000, whichever was lower. Participating families enrolled in the experimental plans for 3 or 5 years and agreed to give up any earlier insurance coverage in return for a fixed monthly payment unrelated to their use of medical care.7
The HIE was motivated primarily by an interest in what economists call the price elasticity of demand for health care. Specifically, the RAND investigators wanted to know whether and by how much health-care use falls when the price of health care goes up. Families in the free care plan faced a price of zero, while coinsurance plans cut prices to 25% or 50% of costs incurred, and families in the catastrophic coverage and deductible plans paid something close to the sticker price for care, at least until they hit the spending cap. But the investigators also wanted to know whether more comprehensive and more generous health insurance coverage indeed leads to better health. The answer to the first question was a clear “yes”: health-care consumption is highly responsive to the price of care. The answer to the second question is murkier.
Randomized Results
Randomized field experiments are more elaborate than a coin toss, sometimes regrettably so. The HIE was complicated by having many small treatment groups, spread over more than a dozen insurance plans. The treatment groups associated with each plan are mostly too small for comparisons between them to be statistically meaningful. Most analyses of the HIE data therefore start by grouping subjects who were assigned to similar HIE plans together. We do that here as well.8
A natural grouping scheme combines plans by the amount of cost sharing they require. The three catastrophic coverage plans, with subscribers shouldering almost all of their medical expenses up to a fairly high cap, approximate a no-insurance state. The individual deductible plan provided more coverage, but only by reducing the cap on total expenses that plan participants were required to shoulder. The nine coinsurance plans provided more substantial coverage by splitting subscribers’ health-care costs with the insurer, starting with the first dollar of costs incurred. Finally, the free plan constituted a radical intervention that might be expected to generate the largest increase in health-care usage and, perhaps, health. This categorization leads us to four groups of plans: catastrophic, deductible, coinsurance, and free, instead of the 14 original plans. The catastrophic plans provide the (approximate) no-insurance control, while the deductible, coinsurance, and free plans are characterized by increasing levels of coverage.
As with nonexperimental comparisons, a first step in our experimental analysis is to check for balance. Do subjects randomly assigned to treatment and control groups, in this case to health insurance schemes ranging from little to complete coverage, indeed look similar? We gauge this by comparing demographic characteristics and health data collected before the experiment began. Because demographic characteristics are unchanging, while the health variables in question were measured before random assignment, we expect to see only small differences in these variables across the groups assigned to different plans.
In contrast with our comparison of NHIS respondents’ characteristics by insurance status in Table 1.1, a comparison of characteristics across randomly assigned treatment groups in the RAND experiment shows the people assigned to different HIE plans to be similar. This can be seen in panel A of Table 1.3. Column (1) in this table reports averages for the catastrophic plan group, while the remaining columns compare the groups assigned more generous insurance coverage with the catastrophic control group. As a summary measure, column (5) compares a sample combining subjects in the deductible, coinsurance, and free plans with subjects in the catastrophic plans. Individuals assigned to the plans with more generous coverage are a little less likely to be female and a little less educated than those in the catastrophic plans. We also see some variation in income, but differences between plan groups are mostly small and are as likely to go one way as another. This pattern contrasts with the large and systematic demographic differences between insured and uninsured people seen in the NHIS data summarized in Table 1.1.
The small differences across groups seen in panel A of Table 1.3 seem likely to reflect chance variation that emerges naturally as part of the sampling process. In any statistical sample, chance differences arise because we’re looking at one of many possible draws from the underlying population from which we’ve sampled. A new sample of similar size from the same population can be expected to produce comparisons that are similar, though not identical, to those in the table. The question of how much variation we should expect from one sample to another is addressed by the tools of statistical inference.
TABLE 1.3
Demographic characteristics and baseline health in the RAND HIE
Notes: This table describes the demographic characteristics and baseline health of subjects in the RAND Health Insurance Experiment (HIE). Column (1) shows the average for the group assigned catastrophic coverage. Columns (2)–(5) compare averages in the deductible, cost-sharing, free care, and any insurance groups with the average in column (1). Standard errors are reported in parentheses in columns (2)–(5); standard deviations are reported in brackets in column (1).
The appendix to this chapter briefly explains how to quantify sampling variation with formal statistical tests. Such tests amount to the juxtaposition of differences in sample averages with their standard errors, the numbers in parentheses reported below the differences in averages listed in columns (2)–(5) of Table 1.3. The standard error of a difference in averages is a measure of its statistical precision: when a difference in sample averages is smaller than about two standard errors, the difference is typically judged to be a chance finding compatible with the hypothesis that the populations from which these samples were drawn are, in fact, the same.
Differences that are larger than about two standard errors are said to be statistically significant: in such cases, it is highly unlikely (though not impossible) that these differences arose purely by chance. Differences that are not statistically significant are probably due to the vagaries of the sampling process. The notion of statistical significance helps us interpret comparisons like those in Table 1.3. Not only are the differences in this table mostly small, only two (for proportion female in columns (4) and (5)) are more than twice as large as the associated standard errors. In tables with many comparisons, the presence of a few isolated statistically significant differences is usually also attributable to chance. We also take comfort from the fact that the standard errors in this table are not very big, indicating differences across groups are measured reasonably precisely.
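The two-standard-error rule of thumb can be sketched in a few lines. The standard error formula used here (the square root of the sum of each group’s sample variance divided by its size) is one common choice and is an assumption on our part; the data are made up and are not taken from Table 1.3, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def difference_and_standard_error(treated, control):
    """Difference in sample averages and an (assumed) standard error for it."""
    diff = mean(treated) - mean(control)
    # Assumed formula: sqrt of the summed variance-over-sample-size terms.
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, se


def looks_significant(diff, se, threshold=2.0):
    """Rule of thumb: a gap larger than about two standard errors."""
    return abs(diff) > threshold * se


# Made-up data for illustration only.
control = [1, 2, 3, 4, 5]
similar = [1, 2, 3, 4, 5]
different = [11, 12, 13, 14, 15]

for label, group in (("similar", similar), ("different", different)):
    diff, se = difference_and_standard_error(group, control)
    print(label, diff, round(se, 3), looks_significant(diff, se))
```

A balance check in this spirit flags the second comparison and lets the first one pass. The threshold of two is a convention rather than a law of nature, which is why it is left as a parameter.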
Panel B of Table 1.3 complements the contrasts in panel A with evidence for reasonably good balance in pre-treatment outcomes across treatment groups. This panel shows no statistically significant differences in a pre-treatment index of general health. Likewise, pre-treatment cholesterol, blood pressure, and mental health appear largely unrelated to treatment assignment, with only a couple of contrasts close to statistical significance. In addition, although lower cholesterol in the free group suggests somewhat better health than in the catastrophic group, differences in the general health index between these two groups go the other way (since lower index values indicate worse health). Lack of a consistent pattern reinforces the notion that these gaps are due to chance.
The first important finding to emerge from the HIE was that subjects assigned to more generous insurance plans used substantially more health care. This finding, which vindicates economists’ view that demand for a good should go up when it gets cheaper, can be seen in panel A of Table 1.4.9 As might be expected, hospital inpatient admissions were less sensitive to price than was outpatient care, probably because admissions decisions are usually made by doctors. On the other hand, assignment to the free care plan raised outpatient spending by two-thirds (169/248) relative to spending by those in catastrophic plans, while total medical expenses increased by 45%. These large gaps are economically important as well as statistically significant.
Subjects who didn’t have to worry about the cost of health care clearly consumed quite a bit more of it. Did this extra care and expense make them healthier? Panel B in Table 1.4, which compares health indicators across HIE treatment groups, suggests not. Cholesterol levels, blood pressure, and summary indices of overall health and mental health are remarkably similar across groups (these outcomes were mostly measured 3 or 5 years after random assignment). Formal statistical tests show no statistically significant differences, as can be seen in the group-specific contrasts (reported in columns (2)–(4)) and in the differences in health between those in a catastrophic plan and everyone in the more generous insurance groups (reported in column (5)).
These HIE findings convinced many economists that generous health insurance can have unintended and undesirable consequences, increasing health-care usage and costs, without generating a dividend in the form of better health.10
TABLE 1.4
Health expenditure and health outcomes in the RAND HIE
Notes: This table reports means and treatment effects for health expenditure and health outcomes in the RAND Health Insurance Experiment (HIE). Column (1) shows the average for the group assigned catastrophic coverage. Columns (2)–(5) compare averages in the deductible, cost-sharing, free care, and any insurance groups with the average in column (1). Standard errors are reported in parentheses in columns (2)–(5); standard deviations are reported in brackets in column (1).
1.2 The Oregon Trail
MASTER KAN: Truth is hard to understand.
KWAI CHANG CAINE: It is a fact, it is not the truth. Truth is often hidden, like a shadow in darkness.
Kung Fu, Season 1, Episode 14
The HIE was an ambitious attempt to assess the impact of health insurance on health-care costs and health. And yet, as far as the contemporary debate over health insurance goes, the HIE might have missed the mark. For one thing, each HIE treatment group had at least catastrophic coverage, so financial liability for health-care costs was limited under every treatment. More importantly, today’s uninsured Americans differ considerably from the HIE population: most of the uninsured are younger, less educated, poorer, and less likely to be working. The value of extra health care in such a group might be very different than for the middle class families that participated in the HIE.
One of the most controversial ideas in the contemporary health policy arena is the expansion of Medicaid to cover the currently uninsured (interestingly, on the eve of the RAND experiment, talk was of expanding Medicare, the public insurance program for America’s elderly). Medicaid now covers families on welfare, some of the disabled, other poor children, and poor pregnant women. Suppose we were to expand Medicaid to cover those who don’t qualify under current rules. How would such an expansion affect health-care spending? Would it shift treatment from costly and crowded emergency departments to possibly more effective primary care? Would Medicaid expansion improve health?
Many American states have begun to “experiment” with Medicaid expansion in the sense that they’ve agreed to broaden eligibility, with the federal government footing most of the bill. Alas, these aren’t real experiments, since everyone who is eligible for expanded Medicaid coverage gets it. The most convincing way to learn about the consequences of Medicaid expansion is to randomly offer Medicaid coverage to people in currently ineligible groups. Random assignment of Medicaid seems too much to hope for. Yet, in an awesome social experiment, the state of Oregon recently offered Medicaid to thousands of randomly chosen people in a publicly announced health insurance lottery.
We can think of Oregon’s health insurance lottery as randomly selecting winners and losers from a pool of registrants, though coverage was not automatic, even for lottery winners. Winners won the opportunity to apply for the state-run Oregon Health Plan (OHP), the Oregon version of Medicaid. The state then reviewed these applications, awarding coverage to Oregon residents who were U.S. citizens or legal immigrants aged 19–64, not otherwise eligible for Medicaid, uninsured for at least 6 months, with income below the federal poverty level, and few financial assets. To initiate coverage, lottery winners had to document their poverty status and submit the required paperwork within 45 days.
The rationale for the 2008 OHP lottery was fairness and not research, but it’s no less awesome for that. The Oregon health insurance lottery provides some of the best evidence we can hope to find on the costs and benefits of insurance coverage for the currently uninsured, a fact that motivated research on OHP by MIT master Amy Finkelstein and her coauthors.11
Roughly 75,000 lottery applicants registered for expanded coverage through the OHP. Of these, almost 30,000 were randomly selected and invited to apply for OHP; these winners constitute the OHP treatment group. The other 45,000 constitute the OHP control sample.
The first question that arises in this context is whether OHP lottery winners were more likely to end up insured as a result of winning. This question is motivated by the fact that some applicants qualified for regular Medicaid even without the lottery. Panel A of Table 1.5 shows that about 14% of controls (lottery losers) were covered by Medicaid in the year following the first OHP lottery. At the same time, the second column, which reports differences between the treatment and control groups, shows that the probability of Medicaid coverage increased by 26 percentage points for lottery winners. Column (4) shows a similar increase for the subsample living in and around Portland, Oregon’s largest city. The upshot is that OHP lottery winners were insured at much higher rates than were lottery losers, a difference that might have affected their use of health care and their health.12
The OHP treatment group (that is, lottery winners) used more health-care services than they otherwise would have. This can also be seen in Table 1.5, which shows estimates of changes in service use in the rows below the estimate of the OHP effect on Medicaid coverage. The hospitalization rate increased by about half a percentage point, a modest though statistically significant effect. Emergency department visits, outpatient visits, and prescription drug use all increased markedly. The fact that the number of emergency department visits rose about 10%, a precisely estimated effect (the standard error associated with this estimate, reported in column (4), is .029), is especially noteworthy. Many policymakers hoped and expected health insurance to shift formerly uninsured patients away from hospital emergency departments toward less costly sources of care.
TABLE 1.5
OHP effects on insurance coverage and health-care use
Notes: This table reports estimates of the effect of winning the Oregon Health Plan (OHP) lottery on insurance coverage and use of health care. Odd-numbered columns show control group averages. Even-numbered columns report the regression coefficient on a dummy for lottery winners. Standard errors are reported in parentheses.
Finally, the proof of the health insurance pudding appears in Table 1.6: lottery winners in the statewide sample report a modest improvement in the probability they assess their health as being good or better (an effect of .039, which can be compared with a control mean of .55; the Health is Good variable is a dummy). Results from in-person interviews conducted in Portland suggest these gains stem more from improved mental rather than physical health, as can be seen in the second and third rows in column (4) (the health variables in the Portland sample are indices ranging from 0 to 100). As in the RAND experiment, results from Portland suggest physical health indicators like cholesterol and blood pressure were largely unchanged by increased access to OHP insurance.
TABLE 1.6
OHP effects on health indicators and financial health
Notes: This table reports estimates of the effect of winning the Oregon Health Plan (OHP) lottery on health indicators and financial health. Odd-numbered columns show control group averages. Even-numbered columns report the regression coefficient on a dummy for lottery winners. Standard errors are reported in parentheses.
The weak health effects of the OHP lottery disappointed policymakers who looked to publicly provided insurance to generate a health dividend for low-income Americans. The fact that health insurance increased rather than decreased expensive emergency department use is especially frustrating. At the same time, panel B of Table 1.6 reveals that health insurance provided the sort of financial safety net for which it was designed. Specifically, households winning the lottery were less likely to have incurred large medical expenses or to have accumulated debt generated by the need to pay for health care. It may be this improvement in financial health that accounts for improved mental health in the treatment group.
It also bears emphasizing that the financial and health effects seen in Table 1.6 most likely come from the 25% of the sample who obtained insurance as a result of the lottery. Adjusting for the fact that insurance status was unchanged for many winners shows that gains in financial security and mental health for the one-quarter of applicants who were insured as a result of the lottery were considerably larger than simple comparisons of winners and losers would suggest. Chapter 3, on instrumental variables methods, details the nature of such adjustments. As you’ll soon see, the appropriate adjustment here amounts to the division of win/loss differences in outcomes by win/loss differences in the probability of insurance. This implies that the effect of being insured is as much as four times larger than the effect of winning the OHP lottery (statistical significance is unchanged by this adjustment).
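The adjustment just described is a single division, sketched below. The inputs are the figures quoted in the text (a win/loss difference of .039 in the probability of rating health good or better, and roughly one-quarter of the sample insured as a result of the lottery); the function name is hypothetical, and the result is a back-of-the-envelope illustration rather than an estimate reported in the tables.

```python
def effect_of_being_insured(outcome_gap, coverage_gap):
    """Divide the win/loss gap in an outcome by the win/loss gap in coverage."""
    if coverage_gap == 0:
        raise ValueError("the lottery must change the probability of insurance")
    return outcome_gap / coverage_gap


health_gap = 0.039    # winners minus losers, share rating health good or better
coverage_gap = 0.25   # share of the sample insured as a result of the lottery

insured_effect = effect_of_being_insured(health_gap, coverage_gap)
print(f"effect of winning: {health_gap:.3f}   effect of being insured: {insured_effect:.3f}")
```

Dividing by one-quarter multiplies by four, which is where the “as much as four times larger” comes from. A coverage gap a little above .25 would give a slightly smaller multiple.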
The RAND and Oregon findings are remarkably similar. Two ambitious experiments targeting substantially different populations show that the use of health-care services increases sharply in response to insurance coverage, while neither experiment reveals much of an insurance effect on physical health. In 2008, OHP lottery winners enjoyed small but noticeable improvements in mental health. Importantly, and not coincidentally, OHP also succeeded in insulating many lottery winners from the financial consequences of poor health, just as a good insurance policy should. At the same time, these studies suggest that subsidized public health insurance should not be expected to yield a dramatic health dividend.
MASTER JOSHWAY: In a nutshell, please, Grasshopper.
GRASSHOPPER: Causal inference compares potential outcomes, descriptions of the world when alternative roads are taken.
MASTER JOSHWAY: Do we compare those who took one road with those who took another?
GRASSHOPPER: Such comparisons are often contaminated by selection bias, that is, differences between treated and control subjects that exist even in the absence of a treatment effect.
MASTER JOSHWAY: Can selection bias be eliminated?
GRASSHOPPER: Random assignment to treatment and control conditions eliminates selection bias. Yet even in randomized trials, we check for balance.
MASTER JOSHWAY: Is there a single causal truth, which all randomized investigations are sure to reveal?
GRASSHOPPER: I see now that there can be many truths, Master, some compatible, some in contradiction. We therefore take special note when findings from two or more experiments are similar.

Masters of ’Metrics: From Daniel to R. A. Fisher
The value of a control group was revealed in the Old Testament. The Book of Daniel recounts how Babylonian King Nebuchadnezzar decided to groom Daniel and other Israelite captives for his royal service. As slavery goes, this wasn’t a bad gig, since the king ordered his captives be fed “food and wine from the king’s table.” Daniel was uneasy about the rich diet, however, preferring modest vegetarian fare. The king’s chamberlains initially refused Daniel’s special meals request, fearing that his diet would prove inadequate for one called on to serve the king. Daniel, not without chutzpah, proposed a controlled experiment: “Test your servants for ten days. Give us nothing but vegetables to eat and water to drink. Then compare our appearance with that of the young men who eat the royal food, and treat your servants in accordance with what you see” (Daniel 1, 12–13). The Bible recounts how this experiment supported Daniel’s conjecture regarding the relative healthfulness of a vegetarian diet, though as far as we know Daniel himself didn’t get an academic paper out of it.
Nutrition is a recurring theme in the quest for balance. Scurvy, a debilitating disease caused by vitamin C deficiency, was the scourge of the British Navy. In 1742, James Lind, a surgeon on HMS Salisbury, experimented with a cure for scurvy. Lind chose 12 seamen with scurvy and started them on an identical diet. He then formed six pairs and treated each of the pairs with a different supplement to their daily food ration. One of the additions was an extra two oranges and one lemon (Lind believed an acidic diet might cure scurvy). Though Lind did not use random assignment, and his sample was small by our standards, he was a pioneer in that he chose his 12 study members so they were “as similar as I could have them.” The citrus eaters, Britain’s first limeys, were quickly and incontrovertibly cured, a life-changing empirical finding that emerged from Lind’s data even though his theory was wrong.13
Almost 150 years passed between Lind and the first recorded use of experimental random assignment. This was by Charles Peirce, an American philosopher and scientist, who experimented with subjects’ ability to detect small differences in weight. In a less-than-fascinating but methodologically significant 1885 publication, Peirce and his student Joseph Jastrow explained how they varied experimental conditions according to draws from a pile of playing cards.14
The idea of a randomized controlled trial emerged in earnest only at the beginning of the twentieth century, in the work of statistician and geneticist Sir Ronald Aylmer Fisher, who analyzed data from agricultural experiments. Experimental random assignment features in Fisher’s 1925 Statistical Methods for Research Workers and is detailed in his landmark The Design of Experiments, published in 1935.15

Fisher had many fantastically good ideas and a few bad ones. In addition to explaining the value of random assignment, he invented the statistical method of maximum likelihood. Along with ’metrics master Sewall Wright (and J.B.S. Haldane), he launched the field of theoretical population genetics. But he was also a committed eugenicist and a proponent of forced sterilization (as was regression master Sir Francis Galton, who coined the term “eugenics”). Fisher, a lifelong pipe smoker, was also on the wrong side of the debate over smoking and health, due in part to his strongly held belief that smoking and lung cancer share a common genetic origin. The negative effect of smoking on health now seems well established, though Fisher was right to worry about selection bias in health research. Many lifestyle choices, such as low-fat diets and vitamins, have been shown to be unrelated to health outcomes when evaluated with random assignment.
Appendix: Mastering Inference
YOUNG CAINE: I am puzzled.
MASTER PO: That is the beginning of wisdom.
This is the first of a number of appendices that fill in key econometric and statistical details. You can spend your life studying statistical inference; many masters do. Here we offer a brief sketch of essential ideas and basic statistical tools, enough to understand tables like those in this chapter.
The HIE is based on a sample of participants drawn (more or less) at random from the population eligible for the experiment. Drawing another sample from the same population, we’d get somewhat different results, but the general picture should be similar if the sample is large enough for the LLN to kick in. How can we decide whether statistical results constitute strong evidence or merely a lucky draw, unlikely to be replicated in repeated samples? How much sampling variance should we expect? The tools of formal statistical inference answer these questions. These tools work for all of the econometric strategies of concern to us. Quantifying sampling uncertainty is a necessary step in any empirical project and on the road to understanding statistical claims made by others. We explain the basic inference idea here in the context of HIE treatment effects.
The task at hand is the quantification of the uncertainty associated with a particular sample average and, especially, groups of averages and the differences among them. For example, we’d like to know if the large differences in health-care expenditure across HIE treatment groups can be discounted as a chance finding. The HIE samples were drawn from a much larger data set that we think of as covering the population of interest. The HIE population consists of all families eligible for the experiment (too young for Medicare and so on). Instead of studying the many millions of such families, a much smaller group of about 2,000 families (containing about 4,000 people) was selected at random and then randomly allocated to one of 14 plans or treatment groups. Note that there are two sorts of randomness at work here: the first pertains to the construction of the study sample and the second to how treatment was allocated to those who were sampled. Random sampling and random assignment are closely related but distinct ideas.
A World without Bias
We first quantify the uncertainty induced by random sampling, beginning with a single sample average, say, the average health of everyone in the sample at hand, as measured by a health index. Our target is the corresponding population average health index, that is, the mean over everyone in the population of interest. As we noted on p. 14, the population mean of a variable is called its mathematical expectation, or just expectation for short. For the expectation of a variable, Yi, we write E[Yi]. Expectation is intimately related to formal notions of probability. Expectations can be written as a weighted average of all possible values that the variable Yi can take on, with weights given by the probability these values appear in the population. In our dice-throwing example, these weights are equal and given by 1/6 (see Section 1.1).
Unlike our notation for averages, the symbol for expectation does not reference the sample size. That’s because expectations are population quantities, defined without reference to a particular sample of individuals. For a given population, there is only one E[Yi], while there are many Avgn[Yi], depending on how we choose n and just who ends up in our sample. Because E[Yi] is a fixed feature of a particular population, we call it a parameter. Quantities that vary from one sample to another, such as the sample average, are called sample statistics.
At this point, it’s helpful to switch from Avgn[Yi] to a more compact notation for averages, Ȳ. Note that we’re dispensing with the subscript n to avoid clutter; henceforth, it’s on you to remember that sample averages are computed in a sample of a particular size. The sample average, Ȳ, is a good estimator of E[Yi] (in statistics, an estimator is any function of sample data used to estimate parameters). For one thing, the LLN tells us that in large samples, the sample average is likely to be very close to the corresponding population mean. A related property is that the expectation of Ȳ is also E[Yi]. In other words, if we were to draw infinitely many random samples, the average of the resulting Ȳ across draws would be the underlying population mean. When a sample statistic has expectation equal to the corresponding population parameter, it’s said to be an unbiased estimator of that parameter. Here’s the sample mean’s unbiasedness property stated formally:
UNBIASEDNESS OF THE SAMPLE MEAN E[Ȳ] = E[Yi]
The sample mean should not be expected to be bang on the corresponding population mean: the sample average in one sample might be too big, while in other samples it will be too small. Unbiasedness tells us that these deviations are not systematically up or down; rather, in repeated samples they average out to zero. This unbiasedness property is distinct from the LLN, which says that the sample mean gets closer and closer to the population mean as the sample size grows. Unbiasedness of the sample mean holds for samples of any size.
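The unbiasedness property follows in one line from the fact that the expectation of a sum is the sum of the expectations. The derivation below is a sketch in the notation used above; writing Ȳ explicitly as (1/n) times a sum over i = 1, …, n is our own spelling-out of the sample average, added for illustration.

```latex
\begin{aligned}
E[\bar{Y}]
  &= E\left[\frac{1}{n}\sum_{i=1}^{n} Y_i\right]
   = \frac{1}{n}\sum_{i=1}^{n} E[Y_i]
   = \frac{1}{n}\, n\, E[Y_i]
   = E[Y_i].
\end{aligned}
```

No step here requires n to be large, which is why unbiasedness holds in samples of any size, whereas the LLN is a statement about what happens as n grows.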
Measuring Variability
In addition to averages, we’re interested in variability. To gauge variability, it’s customary to look at average squared deviations from the mean, in which positive and negative gaps get equal weight. The resulting summary of variability is called variance.
The sample variance of Yi in a sample of size n is defined as
$$S(Y_i)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2.$$
The corresponding population variance replaces averages with expectations, giving:
$$V(Y_i) = E\left[\left(Y_i - E[Y_i]\right)^2\right].$$
Like E[Yi], the quantity V(Yi) is a fixed feature of a population, that is, a parameter. It’s therefore customary to christen it in Greek: $V(Y_i) = \sigma_Y^2$, which is read as “sigma-squared-y.”16
Because variances square the data, they can be very large. Multiply a variable by 10 and its variance goes up by a factor of 100. Therefore, we often describe variability using the square root of the variance: this is called the standard deviation, written σY. Multiply a variable by 10 and its standard deviation increases by a factor of 10. As always, the population standard deviation, σY, has a sample counterpart S(Yi), the square root of $S(Y_i)^2$.
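The scaling behavior of variance and standard deviation is easy to check numerically. The sketch below uses a small made-up data set (the values in `y` are hypothetical, chosen only for illustration) and a sample variance that divides by n, as in the definition above.

```python
import math

def sample_variance(values):
    """Average squared deviation from the sample mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / n

def sample_sd(values):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical data, chosen only for illustration.
y = [1.0, 2.0, 4.0, 7.0]
y_times_10 = [10 * v for v in y]

# Ratio of the rescaled variance (and standard deviation) to the original.
var_ratio = sample_variance(y_times_10) / sample_variance(y)
sd_ratio = sample_sd(y_times_10) / sample_sd(y)
print(var_ratio, sd_ratio)
```

Up to floating-point rounding, the variance ratio is 100 and the standard deviation ratio is 10, which is why the standard deviation is measured in the same units as the data.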
Variance is a descriptive fact about the distribution of Yi. (Reminder: the distribution of a variable is the set of values the variable takes on and the relative frequency that each value is observed in the population or generated by a random process.) Some variables take on a narrow set of values (like a dummy variable indicating families with health insurance), while others (like income) tend to be spread out, with some very high values mixed in with many smaller ones.
It’s important to document the variability of the variables you’re working with. Our goal here, however, goes beyond this. We’re interested in quantifying the variance of the sample mean in repeated samples. Since the expectation of the sample mean is E[Yi] (from the unbiasedness property), the population variance of the sample mean can be written as
$$V(\bar{Y}) = E\left[\left(\bar{Y} - E[\bar{Y}]\right)^2\right] = E\left[\left(\bar{Y} - E[Y_i]\right)^2\right].$$
The variance of a statistic like the sample mean is distinct from the variance used for descriptive purposes. We write V(Ȳ) for the variance of the sample mean, while V(Yi) (or $\sigma_Y^2$) denotes the variance of the underlying data. Because the quantity V(Ȳ) measures the variability of a sample statistic in repeated samples, as opposed to the dispersion of raw data, V(Ȳ) has a special name: sampling variance.
Sampling variance is related to descriptive variance, but, unlike descriptive variance, sampling variance is also determined by sample size. We show this by simplifying the formula for V(Ȳ). Start by substituting the formula for Ȳ inside the notation for variance:
$$V(\bar{Y}) = V\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right).$$
To simplify this expression, we first note that random sampling ensures the individual observations in a sample are not systematically related to one another; in other words, they are statistically independent. This important property allows us to take advantage of the fact that the variance of a sum of statistically independent observations, each drawn randomly from the same population, is the sum of their variances. Moreover, because each Yi is sampled from the same population, each draw has the same variance, $\sigma_Y^2$. Finally, we use the property that the variance of a constant (like 1/n) times Yi is the square of this constant times the variance of Yi. From these considerations, we get
$$V(\bar{Y}) = V\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\sigma_Y^2.$$
Simplifying further, we have
$$V(\bar{Y}) = \frac{1}{n^2}\, n\, \sigma_Y^2 = \frac{\sigma_Y^2}{n}. \qquad (1.5)$$
We’ve shown that the sampling variance of a sample average depends on the variance of the underlying observations, $\sigma_Y^2$, and the sample size, n. As you might have guessed, more data means less dispersion of sample averages in repeated samples. In fact, when the sample size is very large, there’s almost no dispersion at all, because when n is large, $\sigma_Y^2/n$ is small. This is the LLN at work: as n approaches infinity, the sample average approaches the population mean, and sampling variance disappears.
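For a fair die, the link between sampling variance and sample size can be verified exactly rather than by simulation. The sketch below enumerates every equally likely sequence of n rolls and computes the variance of the resulting sample means with exact fractions. The function names are our own, and the approach is only practical for small n because the number of sequences grows as 6 to the power n.

```python
from fractions import Fraction
from itertools import product

FACES = [1, 2, 3, 4, 5, 6]  # a fair six-sided die, each face with probability 1/6

def expectation(values):
    """Expectation of a list of equally likely values, as an exact fraction."""
    return sum(Fraction(v) for v in values) / len(values)

def variance(values):
    """Population variance of a list of equally likely values."""
    mu = expectation(values)
    return expectation([(Fraction(v) - mu) ** 2 for v in values])

def sample_means(n):
    """Sample mean for every equally likely sequence of n independent rolls."""
    return [Fraction(sum(rolls), n) for rolls in product(FACES, repeat=n)]

sigma2 = variance(FACES)  # variance of a single roll
sampling_variance = {n: variance(sample_means(n)) for n in (1, 2, 3, 4)}

for n, v in sampling_variance.items():
    # Each line shows the enumerated sampling variance next to sigma^2 / n.
    print(n, v, sigma2 / n)
```

The enumerated sampling variance matches the variance of a single roll divided by n for each sample size tried, and the expectation of the sample mean stays at the population mean throughout.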
In practice, we often work with the standard deviation of the sample mean rather than its variance. The standard deviation of a statistic like the sample average is called its standard error. The standard error of the sample mean can be written as
$$SE(\bar{Y}) = \frac{\sigma_Y}{\sqrt{n}}. \qquad (1.6)$$
Every estimate discussed in this book has an associated standard error. This includes sample means (for which the standard error formula appears in equation (1.6)), differences in sample means (discussed later in this appendix), regression coefficients (discussed in Chapter 2), and instrumental variables and other more sophisticated estimates. Formulas for standard errors can get complicated, but the idea remains simple. The standard error summarizes the variability in an estimate due to random sampling. Again, it’s important to avoid confusing standard errors with the standard deviations of the underlying variables; the two quantities are intimately related yet measure different things.
One last step on the road to standard errors: most population quantities, including the standard deviation in the numerator of (1.6), are unknown and must be estimated. In practice, therefore, when quantifying the sampling variance of a sample mean, we work with an estimated standard error. This is obtained by replacing σY with S(Yi) in the formula for SE(Ȳ). Specifically, the estimated standard error of the sample mean can be written as
$$\widehat{SE}(\bar{Y}) = \frac{S(Y_i)}{\sqrt{n}}.$$
We often forget the qualifier “estimated” when discussing statistics and their standard errors, but that’s still what we have in mind. For example, the numbers in parentheses in Table 1.4 are estimated standard errors for the relevant differences in means.
The t-Statistic and the Central Limit Theorem
Having laid out a simple scheme to measure variability using standard errors, it remains to interpret this measure. The simplest interpretation uses a t-statistic. Suppose the data at hand come from a distribution for which we believe the population mean, E[Yi], takes on a particular value, μ (read this Greek letter as “mu”). This value constitutes a working hypothesis. A t-statistic for the sample mean under the working hypothesis that E[Yi] = μ is constructed as
$$t(\mu) = \frac{\bar{Y} - \mu}{\widehat{SE}(\bar{Y})}.$$
The working hypothesis is a reference point that is often called the null hypothesis. When the null hypothesis is μ = 0, the t-statistic is the ratio of the sample mean to its estimated standard error.
Many people think the science of statistical inference is boring, but in fact it’s nothing short of miraculous. One miraculous statistical fact is that if E[Yi] is indeed equal to μ, then, as long as the sample is large enough, the quantity t(μ) has a sampling distribution that is very close to a bell-shaped standard normal distribution, sketched in Figure 1.1. This property, which applies regardless of whether Yi itself is normally distributed, is called the Central Limit Theorem (CLT). The CLT allows us to make an empirically informed decision as to whether the available data support or cast doubt on the hypothesis that E[Yi] equals μ.
FIGURE 1.1
A standard normal distribution
The CLT is an astonishing and powerful result. Among other things, it implies that the (large-sample) distribution of a t-statistic is independent of the distribution of the underlying data used to calculate it. For example, suppose we measure health status with a dummy variable distinguishing healthy people from sick and that 20% of the population is sick. The distribution of this dummy variable has two spikes, one of height .8 at the value 1 and one of height .2 at the value 0. The CLT tells us that with enough data, the distribution of the t-statistic is smooth and bell-shaped even though the distribution of the underlying data has only two values.
We can see the CLT in action through a sampling experiment. In sampling experiments, we use the random number generator in our computer to draw random samples of different sizes over and over again. We did this for a dummy variable that equals one 80% of the time and for samples of size 10, 40, and 100. For each sample size, we calculated the t-statistic in half a million random samples using .8 as our value of μ.
Figures 1.2–1.4 plot the distribution of 500,000 t-statistics calculated for each of the three sample sizes in our experiment, with the standard normal distribution superimposed. With only 10 observations, the sampling distribution is spiky, though the outlines of a bell-shaped curve also emerge. As the sample size increases, the fit to a normal distribution improves. With 100 observations, the standard normal is just about bang on.
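A scaled-down version of this sampling experiment can be sketched in a few lines. The code below is an illustration rather than the procedure behind the figures: it uses 5,000 draws per sample size instead of half a million, a fixed seed, and a sample standard deviation that divides by n. It also drops samples in which every draw is identical (so the estimated standard error is zero), which is our own assumption; the text does not say how such samples were handled.

```python
import math
import random

def t_statistic(sample, mu):
    """t-statistic for the sample mean; None if the sample has no variation."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((y - mean) ** 2 for y in sample) / n)
    if s == 0:
        return None  # every draw identical, so the estimated SE is zero
    return (mean - mu) / (s / math.sqrt(n))

def share_beyond_two(n, draws=5000, p=0.8, seed=0):
    """Share of simulated t-statistics larger than 2 in absolute value."""
    rng = random.Random(seed)
    stats = []
    for _ in range(draws):
        # A dummy variable that equals 1 with probability p.
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        t = t_statistic(sample, mu=p)
        if t is not None:
            stats.append(t)
    return sum(abs(t) > 2 for t in stats) / len(stats)

shares = {n: share_beyond_two(n) for n in (10, 40, 100)}
print(shares)
```

If the standard normal approximation is good, the share of t-statistics beyond 2 in absolute value should be in the neighborhood of 5%; comparing the three sample sizes gives a rough numerical counterpart to the visual comparison in the figures.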
The standard normal distribution has a mean of 0 and standard deviation of 1. With any standard normal variable, values beyond ±2 are highly unlikely. In fact, realizations larger than 2 in absolute value appear only about 5% of the time. Because the t-statistic is close to normally distributed, we similarly expect it to fall between about −2 and 2 most of the time. Therefore, it’s customary to judge any t-statistic larger than about 2 (in absolute value) as too unlikely to be consistent with the null hypothesis used to construct it. When the null hypothesis is μ = 0 and the t-statistic exceeds 2 in absolute value, we say the sample mean is significantly different from zero. Otherwise, it’s not. Similar language is used for other values of μ as well.
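FIGURE 1.2
The distribution of the t-statistic for the mean in a sample of size 10
Note: This figure shows the distribution of the sample mean of a dummy variable that equals 1 with probability .8.
FIGURE 1.3
The distribution of the t-statistic for the mean in a sample of size 40
Note: This figure shows the distribution of the sample mean of a dummy variable that equals 1 with probability .8.
FIGURE 1.4
The distribution of the t-statistic for the mean in a sample of size 100
Note: This figure shows the distribution of the sample mean of a dummy variable that equals 1 with probability .8.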
We might also turn the question of statistical significance on its side: instead of checking whether the sample is consistent with a specific value of μ, we can construct the set of all values of μ that are consistent with the data. The set of such values is called a confidence interval for E[Yi]. When calculated in repeated samples, the interval
$$\left[\bar{Y} - 2 \times \widehat{SE}(\bar{Y}),\ \bar{Y} + 2 \times \widehat{SE}(\bar{Y})\right]$$
should contain E[Yi] about 95% of the time. This interval is therefore said to be a 95% confidence interval for the population mean. By describing the set of parameter values consistent with our data, confidence intervals provide a compact summary of the information these data contain about the population from which they were sampled.
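The pieces introduced so far (estimated standard error, t-statistic, and confidence interval) fit together in a few lines of code. The sketch below assumes the sample standard deviation divides by n and uses the rule-of-thumb multiplier of 2 from the text; the data in `y` and the function names are hypothetical, made up for illustration.

```python
import math

def mean_and_se(sample):
    """Sample mean and its estimated standard error, S(Y_i) / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((y - mean) ** 2 for y in sample) / n)  # divides by n
    return mean, s / math.sqrt(n)

def t_stat(sample, mu):
    """t-statistic for the working hypothesis that the population mean is mu."""
    mean, se = mean_and_se(sample)
    return (mean - mu) / se

def confidence_interval(sample):
    """Approximate 95% interval: sample mean plus or minus two standard errors."""
    mean, se = mean_and_se(sample)
    return mean - 2 * se, mean + 2 * se

# Hypothetical sample, for illustration only.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, se = mean_and_se(y)
low, high = confidence_interval(y)
print(mean, se, (low, high), t_stat(y, 3.0))
```

By construction, a hypothesized value of μ falls outside this interval exactly when the corresponding t-statistic exceeds 2 in absolute value, so the two tools are two views of the same calculation. With only eight observations the normal approximation is rough, so the example illustrates the arithmetic rather than a trustworthy inference.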
Pairing Off
One sample average is the loneliest number that you’ll ever do. Luckily, we’re usually concerned with two. We’re especially keen to compare averages for subjects in experimental treatment and control groups. We reference these averages with a compact notation, writing Ȳ1 for Avgn[Yi|Di = 1] and Ȳ0 for Avgn[Yi|Di = 0]. The treatment group mean, Ȳ1, is the average for the n1 observations belonging to the treatment group, with Ȳ0 defined similarly. The total sample size is n = n0 + n1.
For our purposes, the difference between Ȳ1 and Ȳ0 is either an estimate of the causal effect of treatment (if Yi is an outcome), or a check on balance (if Yi is a covariate). To keep the discussion focused, we’ll assume the former. The most important null hypothesis in this context is that treatment has no effect, in which case the two samples used to construct treatment and control averages come from the same population. On the other hand, if treatment changes outcomes, the populations from which treatment and control observations are drawn are necessarily different. In particular, they have different means, which we denote μ1 and μ0.
We decide whether the evidence favors the hypothesis that μ1 = μ0 by looking for statistically significant differences in the corresponding sample averages. Statistically significant results provide strong evidence of a treatment effect, while results that fall short of statistical significance are consistent with the notion that the observed difference in treatment and control means is a chance finding. The expression “chance finding” in this context means that in a hypothetical experiment involving very large samples (so large that any sampling variance is effectively eliminated) we’d find treatment and control means to be the same.
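Statistical significance is determined by the appropriate t-statistic. A key ingredient in any t recipe is the standard error that lives downstairs in the t ratio. The standard error for a comparison of means is the square root of the sampling variance of Ȳ1 − Ȳ0. Using the fact that the variance of a difference between two statistically independent variables is the sum of their variances, we have
$$V(\bar{Y}^1 - \bar{Y}^0) = V(\bar{Y}^1) + V(\bar{Y}^0) = \frac{\sigma_Y^2}{n_1} + \frac{\sigma_Y^2}{n_0} = \sigma_Y^2\left[\frac{1}{n_1} + \frac{1}{n_0}\right].$$
The second equality here uses equation (1.5), which gives the sampling variance of a single average. The standard error we need is therefore
$$SE(\bar{Y}^1 - \bar{Y}^0) = \sigma_Y\sqrt{\frac{1}{n_1} + \frac{1}{n_0}}.$$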
In deriving this expression, we’ve assumed that the variances of individual observations are the same in treatment and control groups. This assumption allows us to use one symbol, $\sigma_Y^2$, for the common variance. A slightly more complicated formula allows variances to differ across groups even if the means are the same (an idea taken up again in the discussion of robust regression standard errors in the appendix to Chapter 2).17
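Recognizing that $\sigma_Y^2$ must be estimated, in practice we work with the estimated standard error
$$\widehat{SE}(\bar{Y}^1 - \bar{Y}^0) = S(Y_i)\sqrt{\frac{1}{n_1} + \frac{1}{n_0}},$$
where S(Yi) is the pooled sample standard deviation. This is the sample standard deviation calculated using data from both treatment and control groups combined.
Under the null hypothesis that μ1 − μ0 is equal to the value μ, the t-statistic for a difference in means is
$$t(\mu) = \frac{\bar{Y}^1 - \bar{Y}^0 - \mu}{\widehat{SE}(\bar{Y}^1 - \bar{Y}^0)}.$$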
We use this t-statistic to test working hypotheses about μ1 − μ0 and to construct confidence intervals for this difference. When the null hypothesis is one of equal means (μ = 0), the statistic t(μ) equals the difference in sample means divided by the estimated standard error of this difference. When the t-statistic is large enough to reject a difference of zero, we say the estimated difference is statistically significant. The confidence interval for a difference in means is the difference in sample means plus or minus two standard errors.
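The two-sample recipe can be sketched the same way. The function below follows the description in the text: the pooled standard deviation is computed from treatment and control data combined (dividing by n), and the standard error scales it by the square root of 1/n1 + 1/n0. The function name and the outcome values are hypothetical, made up for illustration.

```python
import math

def difference_in_means(treated, control, mu=0.0):
    """Difference in means, its estimated standard error, and t(mu)."""
    n1, n0 = len(treated), len(control)
    diff = sum(treated) / n1 - sum(control) / n0

    # Pooled S(Y_i): standard deviation of the combined sample, dividing by n.
    pooled = list(treated) + list(control)
    n = n1 + n0
    grand_mean = sum(pooled) / n
    s = math.sqrt(sum((y - grand_mean) ** 2 for y in pooled) / n)

    se = s * math.sqrt(1 / n1 + 1 / n0)
    return diff, se, (diff - mu) / se

# Hypothetical outcomes, for illustration only.
treated = [3.0, 5.0, 7.0, 9.0]
control = [2.0, 4.0, 6.0, 8.0]
diff, se, t = difference_in_means(treated, control)
interval = (diff - 2 * se, diff + 2 * se)
print(diff, se, t, interval)
```

In this made-up example the difference in means is small relative to its standard error, so the t-statistic is well below 2 and the interval of plus or minus two standard errors includes zero.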
Bear in mind that t-statistics and confidence intervals have little to say about whether findings are substantively large or small. A large t-statistic arises when the estimated effect of interest is large but also when the associated standard error is small (as happens when you’re blessed with a large sample). Likewise, the width of a confidence interval is determined by statistical precision as reflected in standard errors and not by the magnitude of the relationships you’re trying to uncover. Conversely, t-statistics may be small either because the difference in the estimated averages is small or because the standard error of this difference is large. The fact that an estimated difference is not significantly different from zero need not imply that the relationship under investigation is small or unimportant. Lack of statistical significance often reflects lack of statistical precision, that is, high sampling variance. Masters are mindful of this fact when discussing econometric results.
1 For more on this surprising fact, see Jonathan Gruber, “Covering the Uninsured in the United States,” Journal of Economic Literature, vol. 46, no. 3, September 2008, pages 571–606.
2 Our sample is aged 26–59 and therefore does not yet qualify for Medicare.
3 An Empirical Notes section after the last chapter gives detailed notes for this table and most of the other tables and figures in the book.
4 Robert Frost’s insights notwithstanding, econometrics isn’t poetry. A modicum of mathematical notation allows us to describe and discuss subtle relationships precisely. We also use italics to introduce repeatedly used terms, such as potential outcomes, that have special meaning for masters of ’metrics.
5 Order the n observations on Yi so that the n0 observations from the group indicated by Di = 0 precede the n1 observations from the Di = 1 group. The conditional average
$$\mathrm{Avg}_n[Y_i \mid D_i = 0] = \frac{1}{n_0}\sum_{i=1}^{n_0} Y_i$$
is the sample average for the n0 observations in the Di = 0 group. The term Avgn[Yi|Di = 1] is calculated analogously from the remaining n1 observations.
6 Six-sided cubes with one to six dots engraved on each side. There’s an app for ’em on your smartphone.
7 Our description of the HIE follows Robert H. Brook et al., “Does Free Care Improve Adults’ Health? Results from a Randomized Controlled Trial,” New England Journal of Medicine, vol. 309, no. 23, December 8, 1983, pages 1426–1434. See also Aviva Aron-Dine, Liran Einav, and Amy Finkelstein, “The RAND Health Insurance Experiment, Three Decades Later,” Journal of Economic Perspectives, vol. 27, Winter 2013, pages 197–222, for a recent assessment.
8 Other HIE complications include the fact that instead of simply tossing a coin (or the computer equivalent), RAND investigators implemented a complex assignment scheme that potentially affects the statistical properties of the resulting analyses (for details, see Carl Morris, “A Finite Selection Model for Experimental Design of the Health Insurance Study,” Journal of Econometrics, vol. 11, no. 1, September 1979, pages 43–61). Intentions here were good, in that the experimenters hoped to insure themselves against chance deviation from perfect balance across treatment groups. Most HIE analysts ignore the resulting statistical complications, though many probably join us in regretting this attempt to gild the random assignment lily. A more serious problem arises from the large number of HIE subjects who dropped out of the experiment and the large differences in attrition rates across treatment groups (fewer left the free plan, for example). As noted by Aron-Dine, Einav, and Finkelstein, “The RAND Experiment,” Journal of Economic Perspectives, 2013, differential attrition may have compromised the experiment’s validity. Today’s “randomistas” do better on such nuts-and-bolts design issues (see, for example, the experiments described in Abhijit Banerjee and Esther Duflo, Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty, Public Affairs, 2011).
9 The RAND results reported here are based on our own tabulations from the HIE public use file, as described in the Empirical Notes section at the end of the book. The original RAND results are summarized in Joseph P. Newhouse et al., Free for All? Lessons from the RAND Health Insurance Experiment, Harvard University Press, 1994.
10 Participants in the free plan had slightly better corrected vision than those in the other plans; see Brook et al., “Does Free Care Improve Health?” New England Journal of Medicine, 1983, for details.
11 See Amy Finkelstein et al., “The Oregon Health Insurance Experiment: Evidence from the First Year,” Quarterly Journal of Economics, vol. 127, no. 3, August 2012, pages 1057–1106; Katherine Baicker et al., “The Oregon Experiment—Effects of Medicaid on Clinical Outcomes,” New England Journal of Medicine, vol. 368, no. 18, May 2, 2013, pages 1713–1722; and Sarah Taubman et al., “Medicaid Increases Emergency Department Use: Evidence from Oregon’s Health Insurance Experiment,” Science, vol. 343, no. 6168, January 17, 2014, pages 263–268.
12 Why weren’t all OHP lottery winners insured? Some failed to submit the required paperwork on time, while about half of those who did complete the necessary forms in a timely fashion turned out to be ineligible on further review.
13 Lind’s experiment is described in Duncan P. Thomas, “Sailors, Scurvy, and Science,” Journal of the Royal Society of Medicine, vol. 90, no. 1, January 1997, pages 50–54.
14 Charles S. Peirce and Joseph Jastrow, “On Small Differences in Sensation,” Memoirs of the National Academy of Sciences, vol. 3, 1885, pages 75–83.
15 Ronald A. Fisher, Statistical Methods for Research Workers, Oliver and Boyd, 1925, and Ronald A. Fisher, The Design of Experiments, Oliver and Boyd, 1935.
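16 Sample variances tend to underestimate population variances. Sample variance is therefore sometimes defined as
$$S(Y_i)^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2,$$
that is, dividing by n − 1 instead of by n. This modified formula provides an unbiased estimate of the corresponding population variance.
17 Using separate variances for treatment and control observations, we have
$$SE(\bar{Y}^1 - \bar{Y}^0) = \sqrt{\frac{V^1(Y_i)}{n_1} + \frac{V^0(Y_i)}{n_0}},$$
where V1(Yi) is the variance of treated observations, and V0(Yi) is the variance of control observations.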
Chapter 2
Regression
KWAI CHANG CAINE: A worker is known by his tools. A shovel for a man who digs. An ax for a woodsman. The econometrician runs regressions.
Our Path
When the path to random assignment is blocked, we look for alternate routes to causal knowledge. Wielded skillfully, ’metrics tools other than random assignment can have much of the causality-revealing power of a real experiment. The most basic of these tools is regression, which compares treatment and control subjects who have the same observed characteristics. Regression concepts are foundational, paving the way for the more elaborate tools used in the chapters that follow. Regression-based causal inference is predicated on the assumption that when key observed variables have been made equal across treatment and control groups, selection bias from the things we can’t see is also mostly eliminated. We illustrate this idea with an empirical investigation of the economic returns to attendance at elite private colleges.
2.1 A Tale of Two Colleges
Students who attended a private four-year college in America paid an average of about $29,000 in tuition and fees in the 2012–2013 school year. Those who went to a public university in their home state paid less than $9,000. An elite private education might be better in many ways: the classes smaller, the athletic facilities newer, the faculty more distinguished, and the students smarter. But $20,000 per year of study is a big difference. It makes you wonder whether the difference is worth it.
The apples-to-apples question in this case asks how much a 40-year-old Massachusetts-born graduate of, say, Harvard, would have earned if he or she had gone to the University of Massachusetts (U-Mass) instead. Money isn’t everything, but, as Groucho Marx observed: “Money frees you from doing things you dislike. Since I dislike doing nearly everything, money is handy.” So when we ask whether the private school tuition premium is worth paying, we focus on the possible earnings gain enjoyed by those who attend elite private universities. Higher earnings aren’t the only reason you might prefer an elite private institution over your local state school. Many college students meet a future spouse and make lasting friendships while in college. Still, when families invest an additional $100,000 or more in human capital, a higher anticipated earnings payoff seems likely to be part of the story.
Comparisons of earnings between those who attend different sorts of schools invariably reveal large gaps in favor of elite-college alumni. Thinking this through, however, it’s easy to see why comparisons of the earnings of students who attended Harvard and U-Mass are unlikely to reveal the payoff to a Harvard degree. This comparison reflects the fact that Harvard grads typically have better high school grades and higher SAT scores, are more motivated, and perhaps have other skills and talents. No disrespect intended for the many good students who go to U-Mass, but it’s damn hard to get into Harvard, and those who do are a special and select group. In contrast, U-Mass accepts and even awards scholarship money to almost every Massachusetts applicant with decent tenth-grade test scores. We should therefore expect earnings comparisons across alma maters to be contaminated by selection bias, just like the comparisons of health by insurance status discussed in the previous chapter. We’ve also seen that this sort of selection bias is eliminated by random assignment. Regrettably, the Harvard admissions office is not yet prepared to turn their admissions decisions over to a random number generator.
The question of whether college selectivity matters must be answered using the data generated by the routine application, admission, and matriculation decisions made by students and universities of various types. Can we use these data to mimic the randomized trial we’d like to run in this context? Not to perfection, surely, but we may be able to come close. The key to this undertaking is the fact that many decisions and choices, including those related to college attendance, involve a certain amount of serendipitous variation generated by financial considerations, personal circumstances, and timing.
Serendipity can be exploited in a sample of applicants on the cusp, who could easily go one way or the other. Does anyone admitted to Harvard really go to their local state school instead? Our friend and former MIT PhD student, Nancy, did just that. Nancy grew up in Texas, so the University of Texas (UT) was her state school. UT’s flagship Austin campus is rated “Highly Competitive” in Barron’s rankings, but it’s not Harvard. UT is, however, much less expensive than Harvard (The Princeton Review recently named UT Austin a “Best Value College”). Admitted to both Harvard and UT, Nancy chose UT over Harvard because the UT admissions office, anxious to boost average SAT scores on campus, offered Nancy and a few other outstanding applicants an especially generous financial aid package, which Nancy gladly accepted.
What are the consequences of Nancy’s decision to accept UT’s offer and decline Harvard’s? Things worked out pretty well for Nancy in spite of her choice of UT over Harvard: today she’s an economics professor at another Ivy League school in New England. But that’s only one example. Well, actually, it’s two: Our friend Mandy got her bachelor’s from the University of Virginia, her home state school, declining offers from Duke, Harvard, Princeton, and Stanford. Today, Mandy teaches at Harvard.
A sample of two is still too small for reliable causal inference. We’d like to compare many people like Mandy and Nancy to many other similar people who chose private colleges and universities. From larger group comparisons, we can hope to draw general lessons. Access to a large sample is not enough, however. The first and most important step in our effort to isolate the serendipitous component of school choice is to hold constant the most obvious and important differences between students who go to private and state schools. In this manner, we hope (though cannot promise) to make other things equal.
Here’s a small-sample numerical example to illustrate the ceteris paribus idea (we’ll have more data when the time comes for real empirical work). Suppose the only things that matter in life, at least as far as your earnings go, are your SAT scores and where you go to school. Consider Uma and Harvey, both of whom have a combined reading and math score of 1,400 on the SAT.1 Uma went to U-Mass, while Harvey went to Harvard. We start by comparing Uma’s and Harvey’s earnings. Because we’ve assumed that all that matters for earnings besides college choice is the combined SAT score, Uma vs. Harvey is a ceteris paribus comparison.
In practice, of course, life is more complicated. This simple example suggests one significant complication: Uma is a young woman, and Harvey is a young man. Women with similar educational qualifications often earn less than men, perhaps due to discrimination or time spent out of the labor market to have children. The fact that Harvey earns 20% more than Uma may be the effect of a superior Harvard education, but it might just as well reflect a male-female wage gap generated by other things.
We’d like to disentangle the pure Harvard effect from these other things. This is easy if the only other thing that matters is gender: replace Harvey with a female Harvard student, Hannah, who also has a combined SAT of 1,400, comparing Uma and Hannah. Finally, because we’re after general conclusions that go beyond individual stories, we look for many similar same-sex and same-SAT contrasts across the two schools. That is, we compute the average earnings difference among Harvard and U-Mass students with the same gender and SAT score. The average of all such group-specific Harvard versus U-Mass differences is our first shot at estimating the causal effect of a Harvard education. This is an econometric matching estimator that controls for (that is, holds fixed) sex and SAT scores. Assuming that, conditional on sex and SAT scores, the students who attend Harvard and U-Mass have similar earnings potential, this estimator captures the average causal effect of a Harvard degree on earnings.
Matchmaker, Matchmaker
Alas, there’s more to earnings than sex, schools, and SAT scores. Since college attendance decisions aren’t randomly assigned, we must control for all factors that determine both attendance decisions and later earnings. These factors include student characteristics, like writing ability, diligence, family connections, and more. Control for such a wide range of factors seems daunting: the possibilities are virtually infinite, and many characteristics are hard to quantify. But Stacy Berg Dale and Alan Krueger came up with a clever and compelling shortcut.2 Instead of identifying everything that might matter for college choice and earnings, they work with a key summary measure: the characteristics of colleges to which students applied and were admitted.
Consider again the tale of Uma and Harvey: both applied to, and were admitted to, U-Mass and Harvard. The fact that Uma applied to Harvard suggests she has the motivation to go there, while her admission to Harvard suggests she has the ability to succeed there, just like Harvey. At least that’s what the Harvard admissions office thinks, and they are not easily fooled.3 Uma nevertheless opts for a cheaper U-Mass education. Her choice might be attributable to factors that are not closely related to Uma’s earnings potential, such as a successful uncle who went to U-Mass, a best friend who chose U-Mass, or the fact that Uma missed the deadline for that easily won Rotary Club scholarship that would have funded an Ivy League education. If such serendipitous events were decisive for Uma and Harvey, then the two of them make a good match.
Dale and Krueger analyzed a large data set called College and Beyond (C&B). The C&B data set contains information on thousands of students who enrolled in a group of moderately to highly selective U.S. colleges and universities, together with survey information collected from the students at the time they took the SAT, about a year before college entry, and information collected in 1996, long after most had graduated from college. The analysis here focuses on students who enrolled in 1976 and who were working in 1995 (most adult college graduates are working). The colleges include prestigious private universities, like the University of Pennsylvania, Princeton, and Yale; a number of smaller private colleges, like Swarthmore, Williams, and Oberlin; and four public universities (Michigan, The University of North Carolina, Penn State, and Miami University in Ohio). The average (1978) SAT scores at these schools ranged from a low of 1,020 at Tulane to a high of 1,370 at Bryn Mawr. In 1976, tuition rates were as low as $540 at the University of North Carolina and as high as $3,850 at Tufts (those were the days).
Table 2.1 details a stripped-down version of the Dale and Krueger matching strategy, in a setup we call the “college matching matrix.” This table lists applications, admissions, and matriculation decisions for a (made-up) list of nine students, each of whom applied to as many as three schools chosen from an imaginary list of six. Three out of the six schools listed in the table are public (All State, Tall State, and Altered State) and three are private (Ivy, Leafy, and Smart). Five of our nine students (numbers 1, 2, 4, 6, and 7) attended private schools. Average earnings in this group are $92,000. The other four, with average earnings of $72,500, went to a public school. The almost $20,000 gap between these two groups suggests a large private school advantage.
TABLE 2.1
The college matching matrix
Note: Enrollment decisions are highlighted in gray.
The students in Table 2.1 are organized in four groups defined by the set of schools to which they applied and were admitted. Within each group, students are likely to have similar career ambitions, while they were also judged to be of similar ability by admissions staff at the schools to which they applied. Within-group comparisons should therefore be considerably more apples-to-apples than uncontrolled comparisons involving all students.
The three group A students applied to two private schools, Leafy and Smart, and one public school, Tall State. Although these students were rejected at Leafy, they were admitted to Smart and Tall State. Students 1 and 2 went to Smart, while student 3 opted for Tall State. The students in group A have high earnings, and probably come from upper middle class families (a signal here is that they applied to more private schools than public). Student 3, though admitted to Smart, opted for cheaper Tall State, perhaps to save her family money (like our friends Nancy and Mandy). Although the students in group A have done well, with high average earnings and a high rate of private school attendance, within group A, the private school differential is negative: (110 + 100)/2 − 110 = −5, in other words, a gap of −$5,000.
The comparison in group A is one of a number of possible matched comparisons in the table. Group B includes two students, each of whom applied to one private and two public schools (Ivy, All State, and Altered State). The students in group B have lower average earnings than those in group A. Both were admitted to all three schools to which they applied. Number 4 enrolled at Ivy, while number 5 chose Altered State. The earnings differential here is $30,000 (60 − 30 = 30). This gap suggests a substantial private school advantage.
Group C includes two students who applied to a single school (Leafy), where they were admitted and enrolled. Group C earnings reveal nothing about the effects of private school attendance, because both students in this group attended private school. The two students in group D applied to three schools, were admitted to two, and made different choices. But these two students chose All State and Tall State, both public schools, so their earnings also reveal nothing about the value of a private education. Groups C and D are uninformative, because, from the perspective of our effort to estimate a private school treatment effect, each is composed of either all-treated or all-control individuals.
Groups A and B are where the action is in our example, since these groups include public and private school students who applied to and were admitted to the same set of schools. To generate a single estimate that uses all available data, we average the group-specific estimates. The average of −$5,000 for group A and $30,000 for group B is $12,500. This is a good estimate of the effect of private school attendance on average earnings, because, to a large degree, it controls for applicants’ choices and abilities.
The simple average of treatment-control differences in groups A and B isn’t the only well-controlled comparison that can be computed from these two groups. For example, we might construct a weighted average which reflects the fact that group B includes two students and group A includes three. The weighted average in this case is calculated as
$$\left(\frac{3}{5} \times -5{,}000\right) + \left(\frac{2}{5} \times 30{,}000\right) = 9{,}000.$$
By emphasizing larger groups, this weighting scheme uses the data more efficiently and may therefore generate a statistically more precise summary of the private-public earnings differential.
The most important point in this context is the apples-to-apples and
oranges-to-oranges nature of the underlying matched comparisons.
Apples in group A are compared to other group A apples, while oranges
in group B are compared only with oranges. In contrast, naive
comparisons that simply compare the earnings of private and public
school students generate a much larger gap of $19,500 when computed
using all nine students in the table. Even when limited to the five
students in groups A and B, the uncontrolled comparison generates a gap
of $20,000 (20 = (110 + 100 + 60)/3 − (110 + 30)/2). These much
larger uncontrolled comparisons reflect selection bias: students who
apply to and are admitted to private schools have higher earnings
wherever they ultimately chose to go.
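The uncontrolled five-student comparison can be reproduced with the short sketch below, which pools groups A and B and ignores group membership entirely. Earnings are in thousands of dollars as quoted in the text; the variable names are assumptions for illustration.

```python
# Earnings (thousands of dollars) for the five students in groups A and B,
# split only by the type of school attended, with group membership ignored.
private_earnings = [110, 100, 60]   # two from group A, one from group B
public_earnings = [110, 30]         # one from group A, one from group B

private_mean = sum(private_earnings) / len(private_earnings)   # 90
public_mean = sum(public_earnings) / len(public_earnings)      # 70

# Naive gap: mixes the effect of private school with the fact that
# private school students come disproportionately from high-earning group A.
naive_gap = private_mean - public_mean

print(naive_gap)
```

The pooled gap of 20 exceeds the $12,500 average of the within-group gaps because group A, the higher-earning group, supplies two of the three private school students.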
Evidence of selection bias emerges from a comparison of average
earnings across (instead of within) groups A and B. Average earnings in
group A, where two-thirds apply to private schools, are around
$107,000. Average earnings in group B, where two-thirds apply to public
schools, are only $45,000. Our within-group comparisons reveal that
much of this shortfall is unrelated to students’ college attendance
decisions. Rather, the cross-group differential is explained by a
combination of ambition and ability, as reflected in application
decisions and the set of schools to which students were admitted.
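The cross-group averages follow directly from the earnings quoted in this section. A minimal sketch, with illustrative variable names and earnings in thousands of dollars:

```python
# Earnings (thousands of dollars) by applicant group, regardless of the
# school each student ended up attending.
group_a = [110, 100, 110]
group_b = [60, 30]

mean_a = sum(group_a) / len(group_a)   # about 106.7, "around $107,000"
mean_b = sum(group_b) / len(group_b)   # 45.0

# Cross-group shortfall, which the within-group comparisons suggest is
# largely unrelated to the choice between private and public school.
shortfall = mean_a - mean_b

print(round(mean_a, 1), mean_b, round(shortfall, 1))
```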
2.2 Make Me a Match, Run Me a Regression
Regression is the tool that masters pick up first, if only to provide a
benchmark for more elaborate empirical strategies. Although regression
is a many-splendored thing, we think of it as an automated matchmaker.
Specifically, regression estimates are weighted averages of multiple
matched comparisons of the sort constructed for the groups in our
stylized matching matrix (the appendix to this chapter discusses a
closely related connection between regression and mathematical
expectation).
The key ingredients in the regression recipe are
▪ the dependent variable, in this case, student i’s earnings later in
life, also called the outcome variable (denoted by Y_i);
▪ the treatment variable, in this case, a dummy variable that
indicates students who attended a private college or university
(denoted by P_i); and
▪ a set of control variables, in this case, variables that identify sets
of schools to which students applied and were admitted.
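To make the three ingredients concrete, here is a minimal sketch that regresses earnings on the private school dummy with applicant-group membership as the control, using only the five students in groups A and B. The specification is an assumption for illustration, not a regression reported in this excerpt. The coefficient is computed by partialling out the group control (subtracting group means from both the outcome and the treatment, the Frisch-Waugh route) rather than by calling a regression library.

```python
from collections import defaultdict

# (group, P_i, Y_i): applicant group, private school dummy, earnings in dollars.
# Only groups A and B are used, since both contain treated and control students.
data = [
    ("A", 1, 110_000),
    ("A", 1, 100_000),
    ("A", 0, 110_000),
    ("B", 1, 60_000),
    ("B", 0, 30_000),
]

# Collect students by the control variable (applicant group).
by_group = defaultdict(list)
for group, p, y in data:
    by_group[group].append((p, y))

# Partial out the group control: demean P and Y within each group, then
# regress demeaned Y on demeaned P. With a constant and group dummies as
# controls, this gives the same coefficient on P as the full regression.
numerator = 0.0
denominator = 0.0
for members in by_group.values():
    p_bar = sum(p for p, _ in members) / len(members)
    y_bar = sum(y for _, y in members) / len(members)
    for p, y in members:
        numerator += (p - p_bar) * (y - y_bar)
        denominator += (p - p_bar) ** 2

beta = numerator / denominator   # estimated private school effect

print(round(beta))
```

On these five observations the coefficient works out to about $10,000. That falls between the two within-group gaps of −$5,000 and $30,000, consistent with the description of regression as a weighted average of matched comparisons.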
In our matching matrix, the five students in groups A and B (Table
2.1) contribute useful data, while students in groups C and D can be
discarded. In a data set containing those left after discarding groups C
and D, a single variable indicating the students in group A tells us which
