An Introduction to
Causal Inference in Tech
Emily Glassberg Sands
emily@coursera.org, @emilygsands
November 2016
About me
● Harvard Economics PhD
● Data Science Manager @
Coursera
econometrics
causal inference
experimental design
labor markets & education
Does X drive Y?
● Did PR coverage drive sign-ups?
● Does mobile app improve retention?
● Does customer support increase sales?
● Would lowering price increase revenues?
● ...
Inspired by work with Duncan Gilchrist, Economist and Data Scientist @ Wealthfront
Raw Correlation
▪ Are users engaging with X more likely
to have outcome Y?
▫ Plot Y against X
▫ corr(X, Y)
▪ But beware confounding variables
“Impact” of Mobile App Usage on Retention

Mobile Usage?   MoM Retention
No              35%
Yes             40%

Selection bias?
Testing
▪ Randomly assign an experience to some
users and not others
▪ Estimate the causal effect of the
experience on the outcome
▪ Often the best path forward…
...but not in all cases
Limitations of A/B Testing
Consider user
experience
Consider ethics
Consider effect
on user trust
5 Econometric Methods for
Causal Inference
1. Controlled Regression
2. Regression Discontinuity Design
3. Difference-in-Differences
4. Fixed Effects Regression
5. Instrumental Variables
Method 1:
Controlled Regression
Idea: Control directly for the confounding
variables in a regression of Y on X
Assumption: The distribution of outcomes, Y, is
conditionally independent of treatment, X, given
the confounders, C
Example: Effect of live chat support on sales.
▪ Age is a confounder →
upward bias if we regress sales on chat support alone
▪ Add a control for age
In R:
fit <- lm(Y ~ X + C, data = ...)
summary(fit)
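To see why the control matters, here is a minimal simulation sketch in Python (not from the deck; variable names and effect sizes are illustrative): a confounder C drives both X and Y, the naive regression of Y on X is biased upward, and controlling for C recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
C = rng.normal(size=n)                       # confounder (e.g., age)
X = 0.8 * C + rng.normal(size=n)             # treatment partly driven by C
Y = 2.0 * X + 3.0 * C + rng.normal(size=n)   # true effect of X on Y is 2.0

def ols(y, *regressors):
    """OLS coefficients, intercept first."""
    Z = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

naive = ols(Y, X)[1]          # omits C -> biased well above 2.0
controlled = ols(Y, X, C)[1]  # controls for C -> close to 2.0
```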
Pitfall 1: “Missing” controls →
Omitted Variable Bias
Can we tell how much of a problem this is?
▪ If adding proxies increases (adjusted)
R-squared without impacting the estimate, could
be OK...*
*Oster (2015) provides a formal treatment.
✓ Adding controls does NOT change point estimate
▪ ...but if adding proxies to regression impacts
coefficient on X, regression won’t suffice.
Adding controls DOES change point estimate
Relationship between Instructor & Enrollee Gender
Dependent variable: Share of Enrollments Female

                      (1)        (2)
Any Instructor F      .090***    .035***
                      (0.0076)   (0.0074)
Controls              NO         YES
Adjusted R-squared    0.07       0.74
Base Group Mean       0.32       0.32
Watch for omitted variables
biasing the coefficient of interest
Pitfall 2: “Bad” controls →
Included Variable Bias
Example:
▪ Suppose “interest in product” is a confounder
▪ Control for proportion of emails opened?
Not if it is directly impacted by treatment!
Leave out “controls” that are not
themselves fixed at the time
treatment was determined
Method 2:
Regression Discontinuity Design
Idea: Focus on a cut-off point that can be
thought of as a local randomized experiment
Example: Effect of passing a course on income?
▪ A/B test? Randomly passing some and failing
others is unethical
▪ Controlled regression? Key unobservables
like ability and motivation
Example cont’d:
Passing cutoff → natural experiment!
▪ A user earning 69 is similar to a user earning 70
▪ Use the discontinuity to estimate the causal effect
In R:
library(rdd)
RDestimate(Y ~ D, data = …,
subset = …, cutpoint = …)
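The same idea can be hand-rolled as a simulation sketch in Python (illustrative numbers, not the rdd package): fit a local linear regression with separate slopes on each side of the cutoff and read the jump off the treatment coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
score = rng.uniform(0, 100, n)               # running variable (course grade)
D = (score >= 70).astype(float)              # passed iff score >= 70
Y = 20 + 0.1 * score + 5.0 * D + rng.normal(size=n)  # true jump at cutoff = 5.0

h = 10.0                                     # bandwidth around the cutoff
keep = np.abs(score - 70) <= h
s, d, y = score[keep] - 70, D[keep], Y[keep]

# local linear regression: intercept, jump, and separate slopes on each side
Z = np.column_stack([np.ones_like(s), d, s, s * d])
beta = np.linalg.lstsq(Z, y, rcond=None)[0]
rdd_estimate = beta[1]                       # coefficient on D, close to 5.0
```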
Note on Validity - A/B Testing

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   Randomized correctly, i.e. samples balanced
External validity   Unbiased for full population         Experimental group representative of overall
Note on Validity - Regression Discontinuity Design

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   1. Imprecise control of assignment
                                                         2. No confounding discontinuities
External validity   Unbiased for full population         Homogeneous treatment effects
Method 2:
Internal Validity in RDD
Assumption 1: Imprecise control of assignment,
AKA no manipulation at the threshold
▪ Users cannot control whether just above
versus just below the cutoff
In example: Users cannot control their grade around
the cutoff (e.g., by asking for a re-grade).
How can we tell?
Check 1: Mass just below ~= Mass just above
✓ Even mass around cut-off
✗ Agency over assignment
Check 2: Composition of users in two buckets
similar along key observable dimension(s)
✓ Similar on observable
✗ Different on observable
Check for manipulation at the
threshold
1. Mass just below ~= Mass just above?
2. Just below vs. just above similar on key observables?
Assumption 2: No confounding discontinuities
▪ Being just above (versus just below) the cutoff
should not influence other features
In example: Assumes passing is the only
differentiator between a 69 and a 70
Watch out for confounding
discontinuities
Method 2:
External Validity in RDD
LATE: RDD estimates Local Average Treatment
Effect (LATE)
▪ “Local” around the cut-off
If treatment effects are heterogeneous, the
estimate may not apply to the full group.
But the interventions we’d consider would often
occur on the margin anyway
Estimated effect is “local” average
treatment effect around cut-off
Method 3:
Difference-in-Differences
Idea: Comparison of pre and post outcomes
between treatment and control groups
Example: Effect of lowering price on revenue?
▪ A/B test? Could, but may be perceived as
unfair
▪ Alternative: Quasi-experimental design + DD
Example cont’d:
▪ Change price + RDD in time? But if there is
co-timed marketing, a feature launch, or an external
shock, the counterfactual is no longer obvious...
Example cont’d:
▪ DD design: Change price in some geos (e.g.,
countries) but not others
▪ Use control markets to compute the
counterfactual in treatment markets
DD is more robust than RDD, so
design for DD where feasible
(Figure: outcome over time for control vs. treatment markets, with the date of the price change marked)
In R:
fit <- lm(Y ~ treatment +
post +
I(treatment * post),
data = … )
summary(fit)
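The interaction coefficient in that regression is exactly the difference of differences in the four group means. A Python simulation sketch (made-up effect sizes, not the deck's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
treatment = rng.integers(0, 2, n).astype(float)  # treated vs. control market
post = rng.integers(0, 2, n).astype(float)       # before vs. after the change
# parallel trends: both groups drift up 1.0 post; true treatment effect is 2.0
Y = (10 + 0.5 * treatment + 1.0 * post
     + 2.0 * treatment * post + rng.normal(size=n))

Z = np.column_stack([np.ones(n), treatment, post, treatment * post])
dd_estimate = np.linalg.lstsq(Z, Y, rcond=None)[0][3]  # close to 2.0

# identical to the difference of differences in cell means
m = lambda t, p: Y[(treatment == t) & (post == p)].mean()
dd_means = (m(1, 1) - m(1, 0)) - (m(0, 1) - m(0, 0))
```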
In R (with time trends):
fit <- lm(Y ~ time +
treatment +
I((time >= 0) * treatment),
data = … )
summary(fit)
Note on Validity - Difference-in-Differences

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   Parallel trends
External validity   Unbiased for full population         Homogeneous treatment effect
Method 3:
Internal Validity in DD
Assumption: Parallel trends
▪ Absent treatment, both groups would follow the same trends
In example: Treatment and control markets
would have followed the same trends had there
been no price change
How can we tell?
Pre-experiment:
▪ Make treatment and control similar
▫ Stratified randomization.
1. Stratify based on key attributes
2. Randomize within strata
3. Pool across strata
▫ Matched pairs. Historically followed similar
trends and/or are expected to respond
similarly to internal or external shocks
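The stratification steps above can be sketched in Python (stratum names and sizes are invented for illustration): randomize to treatment within each stratum, so the groups are balanced on the key attribute by construction.

```python
import numpy as np

rng = np.random.default_rng(5)
# 1. stratify markets on a key attribute (here, region)
region = np.repeat(["NA", "EU", "APAC"], [50, 30, 20])

# 2. randomize to treatment/control within each stratum
assign = np.empty(region.size, dtype=int)
for r in np.unique(region):
    idx = rng.permutation(np.flatnonzero(region == r))
    assign[idx[: idx.size // 2]] = 1   # treatment
    assign[idx[idx.size // 2:]] = 0    # control

# 3. pool across strata: treatment share is 1/2 within every region
balance = {r: assign[region == r].mean() for r in np.unique(region)}
```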
Pre-experiment (cont):
▪ Check graphically & statistically that
pre-experiment trends parallel
✓ Parallel trends
✗ NOT parallel trends
Design DD for parallel trends
1. Set-up: stratified randomization, matched pairs
2. Check: parallel trends ex ante
Post roll-out:
Problem 1: Confounder(s) in certain treatment or
control market(s), e.g., launch localized payments
Solution 1: Exclude those observations.
Post roll-out (cont):
Problem 2: Confounder(s) in a subset of treatment
and control market(s), e.g., Euro value plunges
Solution 2: Difference-in-Difference-in-Differences (triple differencing)
Consider excluding confounded
observations, or triple differencing
Method 3:
External Validity in DD
Assumption: Homogeneous treatment effects, as
with RDD
Pricing caveat: General equilibrium effects? In the
experiment, existing users are influenced by the price change
▫ Can cut on new users only
▫ See the Pricing Post for more pricing tips
Method 3:
Extension: Bayesian Approach
Idea: Construct a Bayesian structural time-series
model and use it to predict the counterfactual
Open-source resource: Google’s CausalImpact
Example: Discrete shock in given market, e.g.,
▪ PR announcement in India
▪ New partnership with Singaporean
government
A/B testing infeasible; CausalImpact compares
pre/post in treated/untreated markets
Method 4:
Fixed Effects Regression
Idea: A special type of controlled regression
▪ most commonly used with panel data
▪ often to capture heterogeneity across
individuals (or products) that is fixed over time
Example: Estimate the effect of price on conversion
▪ 1(pay) = α + β·1(price = $49) + X′Γ
▫ X is a vector of product dummies (fixed effects)
▫ Γ is a vector of product-specific intercepts
Note: Requires meaningful variation in X after
controlling for fixed effects.
In R:
fit <- lm(Y ~ X + factor(SKU),
data = …)
summary(fit)
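Equivalently, the dummy-variable regression above can be computed by demeaning within each product (the "within" estimator). An illustrative Python sketch (invented numbers) where price is correlated with a product-level quality effect, so pooled OLS is biased but the fixed-effects estimate is not:

```python
import numpy as np

rng = np.random.default_rng(3)
n_sku, t = 200, 50
sku = np.repeat(np.arange(n_sku), t)             # panel: 200 products x 50 periods
alpha = rng.normal(size=n_sku)[sku]              # product fixed effects
X = 0.5 * alpha + rng.normal(size=sku.size)      # price correlated with product
Y = 1.5 * X + alpha + rng.normal(size=sku.size)  # true effect of X is 1.5

def within(v):
    """Subtract each product's mean (the within transformation)."""
    means = np.bincount(sku, weights=v) / np.bincount(sku)
    return v - means[sku]

Xd, Yd = within(X), within(Y)
fe_estimate = (Xd @ Yd) / (Xd @ Xd)              # close to 1.5
pooled = np.linalg.lstsq(
    np.column_stack([np.ones(X.size), X]), Y, rcond=None)[0][1]  # biased upward
```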
Note on Validity - Fixed Effects

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   No time-varying omitted variables
                                                         correlated with X (strict exogeneity)
External validity   Unbiased for full population         Homogeneous treatment effects
Method 5:
Instrumental Variables
Idea: “Instrument” for the X of interest with some
feature, Z, that drives Y only through its effect
on X; back out the effect of X on Y
Requirements:
▪ Strong first stage: Z meaningfully affects X
▪ Exclusion restriction: Z affects Y only
through its effect on X
Implementation:
1. Instrument for X with Z
2. Estimate the effect of (instrumented) X on Y
In R:
library(AER)
fit <- ivreg(Y ~ X | Z, data = …)
summary(fit, vcov = sandwich,
        df = Inf, diagnostics = TRUE)
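Under the hood, ivreg is two-stage least squares. A Python simulation sketch (illustrative numbers): X is confounded by an unobservable U, a randomized instrument Z shifts X, and 2SLS recovers the true effect while plain OLS does not.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
U = rng.normal(size=n)                       # unobserved confounder
Z = rng.integers(0, 2, n).astype(float)      # instrument, e.g. an old A/B test arm
X = 0.5 * Z + U + rng.normal(size=n)         # strong first stage; X confounded by U
Y = 1.0 * X + 2.0 * U + rng.normal(size=n)   # true effect of X on Y is 1.0

# stage 1: regress X on Z; stage 2: regress Y on fitted X
A = np.column_stack([np.ones(n), Z])
Xhat = A @ np.linalg.lstsq(A, X, rcond=None)[0]
B = np.column_stack([np.ones(n), Xhat])
iv_estimate = np.linalg.lstsq(B, Y, rcond=None)[0][1]   # close to 1.0

ols_estimate = np.linalg.lstsq(
    np.column_stack([np.ones(n), X]), Y, rcond=None)[0][1]  # biased upward
```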
Sample output: (shown as a screenshot in the original slides)
Instruments in the real world? Often look to policies
Y          X                   Instrument                      Economist(s)
Earnings   Education           Vietnam draft lottery           Angrist
Earnings   Education           Compulsory schooling laws       Angrist & Krueger
Earnings   Education           Quarter of birth                Angrist & Krueger
Crime      Prison populations  Prison overcrowding litigation  Levitt
Crime      Police              Electoral cycles                Levitt
Instruments in tech? Everywhere! Especially old
A/B tests
Y                   X                               Instrument       Data Scientist
Platform retention  Having friends on the platform  Referral test 1  You!
                                                    Referral test 2  You!
                                                    Referral test 3  You!
                                                    ...              ...
Note on Validity - Instrumental Variables

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   1. Strong first stage
                                                         2. Exclusion restriction
External validity   Unbiased for full population         Homogeneous treatment effect
Method 5:
Internal Validity in IV
Assumption 1: Strong first stage
▪ The experiment we chose was “successful” at driving X
Why it matters: If Z is not a strong predictor of X,
the second-stage estimate will be biased.
How can we tell? Check the F-statistic on the first-stage
regression; the rule of thumb is F > 10
▪ `diagnostics = TRUE` in R will include a test for
weak instruments
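With a single instrument, the first-stage F-statistic is just the squared t-statistic on Z. A Python sketch of the check on simulated data (coefficients are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5_000
Z = rng.normal(size=n)
X = 0.3 * Z + rng.normal(size=n)      # first stage: X = pi0 + pi1 * Z + v

A = np.column_stack([np.ones(n), Z])
beta = np.linalg.lstsq(A, X, rcond=None)[0]
resid = X - A @ beta
sigma2 = resid @ resid / (n - 2)      # residual variance
se = np.sqrt(sigma2 * np.linalg.inv(A.T @ A)[1, 1])
F = (beta[1] / se) ** 2               # first-stage F; want it well above 10
```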
Assumption 2: Exclusion restriction
▪ Z affects Y only through X
How can we tell? There is no test; we have to go on logic
In the example:
✓ Control group got an otherwise equivalent email
✗ Control group got no email
Method 5:
External Validity in IV
LATE: IV estimates a Local Average Treatment
Effect (LATE)
▪ Relevant for the group impacted by the
instrument
If treatment effects are heterogeneous, the
estimate may not apply to the full group.
But the interventions we’d consider would often
occur on the margin anyway
Method 5:
Make-Your-Own-Instrument!
Instrumental variables via randomized encouragement designs
Extensions: ML + Causal Inference
Traditionally distinct literatures:
▪ Machine learning focuses on prediction
▫ Nonparametric prediction methods
▫ Cross-validation for model selection
▪ Economics and statistics focus on causality
Weaknesses of classic causal approaches:
▪ Fail with many covariates
▪ Model selection is unprincipled
ML + Causal Inference = <3
Extensions & New Directions:
ML + Causal Inference
Idea: Where there are many possible instruments,
use LASSO (penalized least squares) to select among
them
Benefits:
▪ Less prone to data mining → more robust
▪ Stronger first stage → less weak instrument
bias
Extensions & New Directions:
ML + Causal Inference: LASSO
Example: Want to estimate social spillovers in
movie consumption.
▪ Causal effect of viewership on later viewership?
▪ Instrument for viewership with weather
(Figure: effect of weather shocks on viewership)
Challenge: Potential set of instruments is large
▫ Risk of overfitting (e.g., including all)
▫ Risk of data mining (e.g., hand-picking)
Solution: Implement LASSO methods to estimate
optimal instruments in linear IV models with many
instruments
Extensions & New Directions:
ML + Causal Inference: Trees
Idea: Where treatment effects are heterogeneous,
use trees to identify subgroups
Example: Want to identify a partition of the
covariate space into subgroups based on
treatment effect heterogeneity
Solution: Athey & Imbens’ (2015) Causal Trees
▪ like regression trees, but focused on the MSE of
the treatment effect
▪ output is a treatment effect & CI by subgroup
Extensions & New Directions:
ML + Causal Inference: Forest
Idea: Extension of trees; want a personalized
estimate of the treatment effect
Solution: Wager & Athey (2015) Causal Forests
▪ estimate is the CATE (conditional average
treatment effect)
▪ predictions are asymptotically normal
▪ predictions are centered on the true effect
Home-grown resources
▪ Sample R code and simulation output available on GitHub
▪ Detailed context and more examples available on Medium
▪ References and reading list available here
Thanks!!
Any questions?
You can find me at @emilygsands &
emily@coursera.org

Research Design and ValidityResearch Design and Validity
Research Design and Validity
 
Pmp exam prep Pdf- 2
Pmp exam prep Pdf- 2Pmp exam prep Pdf- 2
Pmp exam prep Pdf- 2
 
Optimization Seminar.pptx
Optimization Seminar.pptxOptimization Seminar.pptx
Optimization Seminar.pptx
 
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.pptMarket Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
 
Evaluation Methods
Evaluation MethodsEvaluation Methods
Evaluation Methods
 
Construction of composite index: process & methods
Construction of composite index:  process & methodsConstruction of composite index:  process & methods
Construction of composite index: process & methods
 
Biostatistics ppt.pptx
Biostatistics ppt.pptxBiostatistics ppt.pptx
Biostatistics ppt.pptx
 
Risk Based Loan Approval Framework
Risk Based Loan Approval FrameworkRisk Based Loan Approval Framework
Risk Based Loan Approval Framework
 

ODSC Causal Inference Workshop (November 2016) (1)

  • 1. An Introduction to Causal Inference in Tech Emily Glassberg Sands emily@coursera.org, @emilygsands November 2016
  • 2. About me ● Harvard Economics PhD ● Data Science Manager @ Coursera econometrics causal inference experimental design labor markets & education
  • 3. Does X drive Y? ● Did PR coverage drive sign-ups? ● Does mobile app improve retention? ● Does customer support increase sales? ● Would lowering price increase revenues? ● ... Inspired by work with Duncan Gilchrist, Economist and Data Scientist @ Wealthfront
  • 4. Does X drive Y? 4 Raw Correlation ▪ Users engaging with X more likely to have outcome Y? ▫ Plot Y against X ▫ corr(X, Y) ▪ But beware confounding variables
  • 5. “Impact” of Mobile App Usage on Retention Mobile Usage? MoM Retention No 35% Yes 40% Selection Bias?
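The selection bias hinted at on this slide is easy to reproduce in a toy simulation (Python, with made-up numbers): suppose highly engaged users both adopt the mobile app and retain more, while the app itself has zero causal effect. The raw retention gap still appears.

```python
import random

random.seed(0)
n = 20_000

mobile_retained, mobile_n = 0, 0
web_retained, web_n = 0, 0
for _ in range(n):
    engagement = random.random()          # unobserved confounder
    uses_app = engagement > 0.6           # engaged users adopt the app
    # Retention depends ONLY on engagement -- the app has zero causal effect
    retained = random.random() < 0.20 + 0.40 * engagement
    if uses_app:
        mobile_n += 1; mobile_retained += retained
    else:
        web_n += 1; web_retained += retained

print(f"app users retained: {mobile_retained / mobile_n:.2f}")
print(f"non-app retained:   {web_retained / web_n:.2f}")
```

Despite a true effect of zero, app users retain noticeably better, purely through selection.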
  • 6.
  • 7. Does X drive Y? 7 Testing ▪ Randomly assign some users and not others an experience ▪ Estimate the causal effect of the experience on the outcome ▪ Often best path forward… ...but not in all cases
  • 8. Limitations of A/B Testing Consider user experience Consider ethics Consider effect on user trust
  • 9. 5 Econometric Methods for Causal Inference Controlled Regression Difference-in- Differences Fixed Effects Regression Instrumental Variables 9 Regression Discontinuity Design
  • 11. Method 1: Controlled Regression 11 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Idea: Control directly for the confounding variables in a regression of Y on X Assumption: Distribution of outcomes, Y, conditionally independent of treatment, X, given the confounders, C
  • 12. Method 1: Controlled Regression 12 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Example: Effect of live chat support on sales. ▪ Age confounder → Upward bias if regress sales on chat support ▪ Add control for age In R: fit <- lm(Y ~ X + C, data = ...) summary(fit)
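The chat-support example can be sketched in a short simulation (Python, invented coefficients): older users both use chat more and buy more, so the naive regression of sales on chat is biased upward. Partialling age out of both variables (the Frisch-Waugh-Lovell equivalent of adding it as a control in `lm(Y ~ X + C)`) recovers the true effect.

```python
import random

def slope(x, y):
    """OLS slope of y on x (single regressor plus intercept)."""
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

def residualize(y, x):
    """Residuals from regressing y on x (with intercept)."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - (a + b * xi) for yi, xi in zip(y, x)]

random.seed(1)
n = 5_000
TRUE_EFFECT = 2.0
C = [random.gauss(0, 1) for _ in range(n)]                 # confounder: age
X = [c + random.gauss(0, 1) for c in C]                    # chat usage, driven by age
Y = [TRUE_EFFECT * x + 3.0 * c + random.gauss(0, 1) for x, c in zip(X, C)]

naive = slope(X, Y)                                        # biased upward
controlled = slope(residualize(X, C), residualize(Y, C))   # close to 2.0
print(f"naive: {naive:.2f}  controlled: {controlled:.2f}")
```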
  • 13. Method 1: Controlled Regression 13 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pitfall 1: “Missing” controls → Omitted Variable Bias Can we tell how much of a problem? ▪ If adding proxies increases (adjusted) R-squared without impacting estimate, could be ok...* *Oster 15 provides a formal treatment.
  • 14. 14 ✓ Adding controls does NOT change point estimate
  • 15. Method 1: Controlled Regression 15 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables ▪ ...but if adding proxies to regression impacts coefficient on X, regression won’t suffice. Adding controls DOES change point estimate Relationship between Instructor & Enrollee Gender Share Enrollments F Any Instructor F .090*** .035*** (0.0076) (0.0074) Controls NO YES Adjusted R-squared 0.07 0.74 Base Group Mean 0.32 0.32
  • 16. Watch for omitted variables biasing coefficient of interest 16
  • 17. Method 1: Controlled Regression 17 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pitfall 2: “Bad” controls → Included Variable Bias Example: ▪ Suppose “interest in product” is confounder ▪ Control for proportion of emails opened? Not if directly impacted by treatment!
  • 18. Leave out “controls” that are themselves not fixed at the time treatment was determined 18
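The "bad control" pitfall can also be simulated (Python, illustrative numbers): here the treatment X is randomly assigned, so the simple regression is already unbiased; conditioning on email opens, which is itself affected by both X and the unobserved interest, distorts the estimate.

```python
import random

def slope(x, y):
    """OLS slope of y on x (single regressor plus intercept)."""
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

def residualize(y, x):
    """Residuals from regressing y on x (with intercept)."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - (a + b * xi) for yi, xi in zip(y, x)]

random.seed(2)
n = 5_000
TRUE_EFFECT = 1.0
interest = [random.gauss(0, 1) for _ in range(n)]   # unobserved confounder
X = [random.gauss(0, 1) for _ in range(n)]          # randomized treatment
# Email opens are affected by the treatment itself -- a "bad" control
opens = [x + i + random.gauss(0, 1) for x, i in zip(X, interest)]
Y = [TRUE_EFFECT * x + i + random.gauss(0, 1) for x, i in zip(X, interest)]

unadjusted = slope(X, Y)                                          # unbiased (~1.0)
bad_control = slope(residualize(X, opens), residualize(Y, opens)) # biased
print(f"unadjusted: {unadjusted:.2f}  with bad control: {bad_control:.2f}")
```

Adding the post-treatment "control" pulls the estimate well away from the truth even though randomization had already removed all confounding.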
  • 20. Method 2: Regression Discontinuity Design 20 Idea: Focus on a cut-off point that can be thought of as a local randomized experiment Example: Effect of passing course on income? ▪ A/B test? Randomly passing some, failing others unethical ▪ Controlled regression? Key unobservables like ability and motivation Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 21. Method 2: Regression Discontinuity Design 21 Example cont’d: Passing cutoff → natural experiment!! ▪ User earning 69 similar to user earning 70 ▪ Use discontinuity to estimate causal effect In R: Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables library(rdd) RDestimate(Y ~ D, data = …, subset = …, cutpoint = …)
  • 22. Method 2: Regression Discontinuity Design 22 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Threshold
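The comparison-at-the-cutoff idea can be sketched without the rdd package (Python, invented numbers): income trends smoothly in grade, plus a jump for passing at 70; comparing narrow bands on either side of the cutoff approximately recovers the jump. A real analysis would fit local linear regressions on each side, as RDestimate does, to remove the small trend bias this crude version leaves in.

```python
import random

random.seed(3)
CUTOFF, JUMP = 70, 5.0

below, above = [], []
for _ in range(50_000):
    grade = random.uniform(0, 100)            # running variable
    passed = grade >= CUTOFF
    # Smooth trend in grade, plus a discontinuous jump for passing
    income = 20 + 0.3 * grade + JUMP * passed + random.gauss(0, 2)
    if CUTOFF - 1 <= grade < CUTOFF:
        below.append(income)
    elif CUTOFF <= grade < CUTOFF + 1:
        above.append(income)

rdd_estimate = sum(above) / len(above) - sum(below) / len(below)
print(f"estimated jump at cutoff: {rdd_estimate:.2f}")
```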
  • 23. Note on Validity - A/B testing 23 Type Definition Assumptions Internal validity Unbiased for subpopulation studied Randomized correctly, i.e. samples balanced External validity Unbiased for full population Experimental group representative of overall
  • 24. Note on Validity - Regression Discontinuity Design 24 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Imprecise control of assignment 2. No confounding discontinuities External validity Unbiased for full population Homogeneous treatment effects
  • 25. Method 2: Internal Validity in RDD 25 Assumption 1: Imprecise control of assignment, AKA no manipulation at the threshold ▪ Users cannot control whether just above versus just below the cutoff In example: Cannot control grade around the cutoff (e.g., asking for re-grade). How can we tell? Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 26. Method 2: Internal Validity in RDD 26 Check 1: Mass just below ~= Mass just above Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables ✓ Even mass around cut-off Agency over assignment
  • 27. Method 2: Internal Validity in RDD 27 Check 2: Composition of users in two buckets similar along key observable dimension(s) Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables ✓ Similar on observable Different on observable
  • 28. Check for manipulation at the threshold 28 1. Mass just below ~= Mass just above? 2. Just below vs. just above similar on key observables?
  • 29. Method 2: Internal Validity in RDD 29 Assumption 2: No confounding discontinuities ▪ Being just above (versus just below) the cutoff should not influence other features In example: Assumes passing is the only differentiator between a 60 and a 70 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 30. Watch out for confounding discontinuities 30
  • 31. 31 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Imprecise control of assignment 2. No confounding discontinuities External validity Unbiased for full population Homogeneous treatment effects Note on Validity - Regression Discontinuity Design
  • 32. Method 2: External Validity in RDD 32 LATE: RDD estimates the Local Average Treatment Effect (LATE) ▪ “Local” around the cut-off If treatment effects are heterogeneous, the estimate may not apply to the full group. But the interventions we’d consider often occur at the margin anyway Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 33. Estimated effect is “local” average treatment effect around cut-off 33
  • 35. Method 3: Difference-in-Differences 35 Idea: Comparison of pre and post outcomes between treatment and control groups Example: Effect of lowering price on revenue? ▪ A/B test? Could, but may be perceived as unfair ▪ Alternative: Quasi-experimental design + DD Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 36. 36
  • 37. Method 3: Difference-in-Differences 37 Idea: Comparison of pre and post outcomes between treatment and control groups Example: Effect of lowering price on revenue? ▪ A/B test? Could, but may be perceived as unfair ▪ Alternative: Quasi-experimental design + DD Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 38. Method 3: Difference-in-Differences 38 Example con’t: ▪ Change price + RDD? But if co-timed marketing, feature launch, external shock …counterfactual no longer obvious... Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 39. Method 3: Difference-in-Differences 39 Example con’t: ▪ DD design. Change price in some geos (e.g., countries) but not others Use control markets to compute counterfactual in treatment markets Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 40. DD more robust than RDD so design for DD where feasible 40
  • 41. Method 3: Difference-in-Differences 41 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Control markets Treatment markets Date of Change
  • 42. Method 3: Difference-in-Differences 42 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables In R: fit <- lm(Y ~ treatment + post + I(treatment * post), data = … ) summary(fit)
  • 43. Method 3: Difference-in-Differences 43 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables In R (with time trends): fit <- lm(Y ~ time + treatment + I((time >= 0) * treatment), data = … ) summary(fit)
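The interaction coefficient in the regressions above equals the simple double difference, which can be checked by hand (Python, invented revenue numbers): treatment markets start from a higher baseline, both groups share a common time trend, and the DD estimate isolates the price effect while a naive pre/post comparison does not.

```python
import random

random.seed(5)
TRUE_EFFECT, TREND = 3.0, 2.0

def cell(mean, n=2_000):
    """Average outcome for one group/period cell of n noisy observations."""
    return sum(mean + random.gauss(0, 1) for _ in range(n)) / n

ctrl_pre   = cell(10.0)
ctrl_post  = cell(10.0 + TREND)
treat_pre  = cell(15.0)                          # treated markets start higher
treat_post = cell(15.0 + TREND + TRUE_EFFECT)    # common trend + price effect

naive = treat_post - treat_pre                   # contaminated by the time trend
dd = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(f"naive pre/post: {naive:.2f}   diff-in-diff: {dd:.2f}")
```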
  • 44. Method 3: Difference-in-Differences 44 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 45. Method 3: Difference-in-Differences 45 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Control markets Treatment markets Date of Change
  • 46. Note on Validity - Difference-in-Differences 46 Type Definition Assumptions Internal validity Unbiased for subpopulation studied Parallel trends External validity Unbiased for full population Homogeneous treatment effect
  • 47. Method 3: Internal Validity in DD 47 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Assumption: Parallel trends ▪ Absent treatment, same trends In example: Treatment and control markets would have followed same trends if no price change How can we tell?
  • 48. Method 3: Internal Validity in DD 48 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pre-experiment: ▪ Make treatment and control similar ▫ Stratified randomization. 1. Stratify based on key attributes 2. Randomize within strata 3. Pool across strata ▫ Matched pairs. Historically followed similar trends and/or are expected to respond similarly to internal or external shocks
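The three-step stratified-randomization recipe on this slide can be sketched directly (Python; the market names and "size" strata are hypothetical): stratify on a key attribute, randomize within each stratum, then pool, so treatment and control are balanced by construction.

```python
import random

random.seed(4)
# Hypothetical markets with a key attribute (size) to stratify on
markets = {"IN": "large", "US": "large", "BR": "large", "FR": "large",
           "SG": "small", "CL": "small", "KE": "small", "NO": "small"}

assignment = {}
strata = set(markets.values())
for stratum in strata:                 # 1. stratify based on key attributes
    members = sorted(m for m, s in markets.items() if s == stratum)
    random.shuffle(members)            # 2. randomize within strata
    half = len(members) // 2
    for m in members[:half]:
        assignment[m] = "treatment"
    for m in members[half:]:
        assignment[m] = "control"

# 3. pool across strata: each stratum contributes equal treatment/control counts
print(assignment)
```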
  • 49. Method 3: Internal Validity in DD 49 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pre-experiment (cont): ▪ Check graphically & statistically that pre-experiment trends parallel ✓ Parallel trends NOT parallel trends
  • 50. Design DD for parallel trends 50 1. Set-up: stratified randomization, matched pairs 2. Check: parallel trends ex ante
  • 51. Method 3: Internal Validity in DD 51 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Post roll-out: Problem 1: Confounder(s) in certain treatment or control market(s), e.g., launch localized payments Solution 1: Exclude those observations.
  • 52. Method 3: Internal Validity in DD 52 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Post roll-out (cont): Problem 2: Confounder(s) in subset of treatment and control market(s), e.g., Euro value plunges Solution 2: Difference-in-Difference-in-Difference
  • 53. Consider excluding confounded observations, or triple differencing 53
  • 54. Note on Validity - Difference-in-Differences 54 Type Definition Assumptions Internal validity Unbiased for subpopulation studied Parallel trends External validity Unbiased for full population Homogeneous treatment effect
  • 55. Method 3: External Validity in DD 55 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Assumption: Homogeneous treatment effects, as with RDD Pricing caveat: General Equilibrium? In experiment, users influenced by price change ▫ Can cut on new users only ▫ See Pricing Post for more pricing tips
  • 56. Method 3: Extension: Bayesian Approach 56 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Idea: Construct a Bayesian structural time-series model and use to predict counterfactual Open source resource: Google’s CausalImpact
  • 57. Method 3: Extension: Bayesian Approach 57 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Example: Discrete shock in given market, e.g., ▪ PR announcement in India ▪ New partnership with Singaporean government A/B testing infeasible; CausalImpact compares pre/post in treated/untreated markets
  • 59. Method 4: Fixed Effects Regression 59 Idea: Special type of controlled regression ▪ most commonly used with panel data ▪ often to capture heterogeneity across individuals (or products) fixed over time Example: Estimate effect of price on conversion ▪ 1(pay) = α + β*1($49) + X’Γ ▫ X is vector of product fixed effects ▫ Γ is a vector of product-specific intercepts Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 60. Method 4: Fixed Effects Regression 60 In R: Note: Requires meaningful variation in X after controlling for fixed effects. Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables fit <- lm(Y ~ X + factor(SKU), data = …) summary(fit)
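The SKU dummies in `factor(SKU)` are numerically equivalent to demeaning within product, which makes for a compact simulation (Python, made-up data): premium products carry both higher prices and higher conversion, so pooled OLS masks the true negative price effect; the within-product estimator recovers it.

```python
import random

def slope(x, y):
    """OLS slope of y on x (single regressor plus intercept)."""
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

random.seed(6)
TRUE_EFFECT = -1.0
prices, outcomes, demeaned_p, demeaned_y = [], [], [], []
for _ in range(50):                    # 50 hypothetical SKUs
    quality = random.gauss(0, 1)       # fixed product effect
    p = [quality + random.gauss(0, 1) for _ in range(100)]   # price varies over time
    y = [TRUE_EFFECT * pi + 2.0 * quality + random.gauss(0, 1) for pi in p]
    prices += p; outcomes += y
    mp = sum(p) / len(p); my = sum(y) / len(y)
    demeaned_p += [pi - mp for pi in p]    # subtract product means = fixed effects
    demeaned_y += [yi - my for yi in y]

pooled = slope(prices, outcomes)           # biased toward zero by product quality
within = slope(demeaned_p, demeaned_y)     # close to -1.0
print(f"pooled: {pooled:.2f}   within-product: {within:.2f}")
```

Note how the within estimate relies entirely on price variation inside each product, which is the "meaningful variation in X after controlling for fixed effects" the next slide warns about.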
  • 61. Note on Validity - Fixed Effects 61 Type Definition Assumptions Internal validity Unbiased for subpopulation studied No unobserved confounders that vary within unit over time (strict exogeneity) External validity Unbiased for full population Homogeneous treatment effects
  • 63. Method 5: Instrumental Variables 63 Idea: “Instrument” for X of interest with some feature, Z, that drives Y only through its effect on X; back out effect of X on Y Requirements: ▪ Strong first stage: Z meaningfully affects X ▪ Exclusion restriction: Z affects Y only through its effect on X Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 64. Method 5: Instrumental Variables 64 Implementation: 1. Instrument for X with Z 2. Estimate the effect of (instrumented) X on Y In R: Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables library(AER) fit <- ivreg(Y ~ X | Z, data = …) summary(fit, vcov = sandwich, df = Inf, diagnostics = TRUE)
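Behind ivreg, the just-identified IV estimate is simply the reduced form over the first stage, which can be verified by hand (Python; the setup loosely mirrors the referral-test example and all numbers are invented): a randomized encouragement Z shifts X, an unobserved confounder U drives both X and Y, so OLS is biased, while the Wald ratio cov(Z,Y)/cov(Z,X) recovers the true effect.

```python
import random

def cov(a, b):
    n = len(a); ma = sum(a) / n; mb = sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

random.seed(7)
n = 20_000
TRUE_EFFECT = 2.0
Z = [random.random() < 0.5 for _ in range(n)]   # randomized encouragement (old A/B test)
U = [random.gauss(0, 1) for _ in range(n)]      # unobserved confounder
X = [z + u + random.gauss(0, 1) for z, u in zip(Z, U)]   # strong first stage
Y = [TRUE_EFFECT * x + 3.0 * u + random.gauss(0, 1) for x, u in zip(X, U)]

ols = cov(X, Y) / cov(X, X)     # biased upward by U
iv = cov(Z, Y) / cov(Z, X)      # Wald / 2SLS estimate
print(f"OLS: {ols:.2f}   IV: {iv:.2f}")
```

Because Z is randomized and (by construction here) affects Y only through X, the exclusion restriction holds exactly in this simulation; in practice it must be argued, not tested.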
  • 65. Method 5: Instrumental Variables 65 Sample output: Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 66. Method 5: Instrumental Variables 66 Instruments in real world? Often look to policies Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Y X Instrument Economist(s) Earnings Education Vietnam Draft lottery Angrist Compulsory schooling laws Angrist & Krueger Quarter of birth Angrist & Krueger Crime Prison populations Prison overcrowding litigation Levitt Police Electoral cycles Levitt
  • 67. Method 5: Instrumental Variables 67 Instruments in tech? Everywhere! Especially old A/B tests Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 68. Method 5: Instrumental Variables 68 Instruments in tech? Everywhere! Especially old A/B tests Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Y X Instrument Data Scientist Platform retention Having friends on the platform Referral test 1 You! Referral test 2 You! Referral test 3 You! ... ...
  • 69. Note on Validity - Instrumental Variables 69 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Strong first stage 2. Exclusion restriction External validity Unbiased for full population Homogeneous treatment effect
  • 70. Method 5: Internal Validity in IV 70 Assumption 1: Strong first stage ▪ Experiment we chose “successful” at driving X Why it matters: If Z is not a strong predictor of X, the second stage estimate will be biased. How can we tell? Check the F-statistic on the first stage regression; should be > 11 (rule-of-thumb) ▪ `diagnostics = TRUE` in summary() will include a test of weak instruments Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 71. Method 5: Internal Validity in IV 71 Assumption 2: Exclusion restriction ▪ Z affects Y only through X How can we tell? No test; have to go on logic In the example: ✓ Control group got otherwise equivalent email Control group got no email Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 72. Note on Validity - Instrumental Variables 72 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Strong first stage 2. Exclusion restriction External validity Unbiased for full population Homogeneous treatment effects
  • 73. Method 5: External Validity in IV 73 LATE: IV estimates a Local Average Treatment Effect (LATE) ▪ Relevant for the group impacted by the instrument If treatment effects are heterogeneous, the estimate may not apply to the full group. But the interventions we’d consider often occur at the margin anyway Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 74. Method 5: Make-Your-Own-Instrument! 74 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Instrumental variables via randomized encouragement +
  • 75. Extensions: ML + Causal Inference 75
  • 76. Traditionally distinct literatures: ▪ Machine Learning focuses on prediction ▫ Nonparametric prediction methods ▫ Cross-validation for model selection ▪ Economics and statistics focus on causality Weaknesses of classic causal approaches: ▪ Fail with many covariates ▪ Model selection unprincipled ML + Causal Inference = <3 Extensions & New Directions: ML + Causal Inference
  • 77. Idea: In cases where many possible instrument sets, use LASSO (penalized least squares) to select instruments Benefits: ▪ Less prone to data mining → more robust ▪ Stronger first stage → less weak instrument bias Extensions & New Directions: ML + Causal Inference: LASSO
  • 78. Example: Want to estimate social spillovers in movie consumption. ▪ Causal effect of viewership on later viewership? ▪ Instrument for viewership with weather Extensions & New Directions: ML + Causal Inference: LASSO
  • 79. Extensions & New Directions: ML + Causal Inference: LASSO Effect of weather shocks on viewership
  • 80. Example: Want to estimate social spillovers in movie consumption. ▪ Causal effect of viewership on later viewership? ▪ Instrument for viewership with weather Challenge: Potential set of instruments large ▫ Risk of overfitting (e.g., including all) ▫ Risk of data mining (e.g., hand-picking) Solution: Implement LASSO methods to estimate optimal instruments in linear IV models with many instruments Extensions & New Directions: ML + Causal Inference: LASSO
  • 81. Extensions & New Directions: ML + Causal Inference: Trees Idea: In cases where heterogeneous treatment effects, use trees to identify subgroups Example: Want to identify a partition of the covariate space into subgroups based on treatment effect heterogeneity Solution: Athey & Imbens’ (2015) Causal Trees ▪ like regression trees but focuses on MSE of treatment effect ▪ output is treatment effect & CI by subgroup
  • 82. Extensions & New Directions: ML + Causal Inference: Forest Idea: Extension of trees; want personalized estimate of treatment effect Solution: Wager & Athey (2015) Causal Forests ▪ estimate is CATE (conditional average treatment effect) ▪ predictions are asymptotically normal ▪ predictions centered on the true effect
  • 83. Sample R code and simulation output available on GitHub Detailed context and more examples available on Medium References and reading list available here Home grown resources
  • 84. 84 Thanks!! Any questions? You can find me at @emilygsands & emily@coursera.org