An Introduction to
Causal Inference in Tech
Emily Glassberg Sands
emily@coursera.org, @emilygsands
November 2016
About me
● Harvard Economics PhD
● Data Science Manager @
Coursera
econometrics
causal inference
experimental design
labor markets & education
Does X drive Y?
● Did PR coverage drive sign-ups?
● Does mobile app improve retention?
● Does customer support increase sales?
● Would lowering price increase revenues?
● ...
Inspired by work with Duncan Gilchrist, Economist and Data Scientist @ Wealthfront
Raw Correlation
▪ Are users engaging with X more likely
to have outcome Y?
▫ Plot Y against X
▫ corr(X, Y)
▪ But beware confounding variables
“Impact” of Mobile App Usage on Retention

Mobile Usage?   MoM Retention
No              35%
Yes             40%

Selection bias?
Testing
▪ Randomly assign an experience to some
users and not others
▪ Estimate the causal effect of the
experience on the outcome
▪ Often the best path forward…
...but not in all cases
Limitations of A/B Testing
Consider user
experience
Consider ethics
Consider effect
on user trust
5 Econometric Methods for
Causal Inference
1. Controlled Regression
2. Regression Discontinuity Design
3. Difference-in-Differences
4. Fixed Effects Regression
5. Instrumental Variables
Method 1:
Controlled Regression
Idea: Control directly for the confounding
variables in a regression of Y on X
Assumption: The distribution of outcomes, Y, is
conditionally independent of treatment, X, given
the confounders, C
Example: Effect of live chat support on sales.
▪ Age is a confounder →
upward bias if we regress sales on chat support alone
▪ Add a control for age
In R:
fit <- lm(Y ~ X + C, data = ...)
summary(fit)
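To see why the control matters, here is a minimal simulation sketch in Python (not from the deck; variable names and effect sizes are illustrative): a confounder C drives both X and Y, the naive regression of Y on X is biased upward, and controlling for C recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
C = rng.normal(size=n)                       # confounder (e.g., age)
X = 0.8 * C + rng.normal(size=n)             # treatment partly driven by C
Y = 2.0 * X + 3.0 * C + rng.normal(size=n)   # true effect of X on Y is 2.0

def ols(y, *regressors):
    """OLS coefficients, intercept first."""
    Z = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

naive = ols(Y, X)[1]          # omits C -> biased well above 2.0
controlled = ols(Y, X, C)[1]  # controls for C -> close to 2.0
```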
Pitfall 1: “Missing” controls →
Omitted Variable Bias
Can we tell how much of a problem this is?
▪ If adding proxies increases (adjusted)
R-squared without impacting the estimate, could
be OK...*
*Oster (2015) provides a formal treatment.
✓ Adding controls does NOT change point estimate
▪ ...but if adding proxies to regression impacts
coefficient on X, regression won’t suffice.
Adding controls DOES change point estimate
Relationship between Instructor & Enrollee Gender
Dependent variable: Share of Enrollments Female

                      (1)        (2)
Any Instructor F      .090***    .035***
                      (0.0076)   (0.0074)
Controls              NO         YES
Adjusted R-squared    0.07       0.74
Base Group Mean       0.32       0.32
Watch for omitted variables
biasing the coefficient of interest
Pitfall 2: “Bad” controls →
Included Variable Bias
Example:
▪ Suppose “interest in product” is a confounder
▪ Control for proportion of emails opened?
Not if it is directly impacted by treatment!
Leave out “controls” that are not
themselves fixed at the time
treatment was determined
Method 2:
Regression Discontinuity Design
Idea: Focus on a cut-off point that can be
thought of as a local randomized experiment
Example: Effect of passing a course on income?
▪ A/B test? Randomly passing some and failing
others is unethical
▪ Controlled regression? Key unobservables
like ability and motivation
Example cont’d:
Passing cutoff → natural experiment!
▪ A user earning 69 is similar to a user earning 70
▪ Use the discontinuity to estimate the causal effect
In R:
library(rdd)
RDestimate(Y ~ D, data = …,
subset = …, cutpoint = …)
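The same idea can be hand-rolled as a simulation sketch in Python (illustrative numbers, not the rdd package): fit a local linear regression with separate slopes on each side of the cutoff and read the jump off the treatment coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
score = rng.uniform(0, 100, n)               # running variable (course grade)
D = (score >= 70).astype(float)              # passed iff score >= 70
Y = 20 + 0.1 * score + 5.0 * D + rng.normal(size=n)  # true jump at cutoff = 5.0

h = 10.0                                     # bandwidth around the cutoff
keep = np.abs(score - 70) <= h
s, d, y = score[keep] - 70, D[keep], Y[keep]

# local linear regression: intercept, jump, and separate slopes on each side
Z = np.column_stack([np.ones_like(s), d, s, s * d])
beta = np.linalg.lstsq(Z, y, rcond=None)[0]
rdd_estimate = beta[1]                       # coefficient on D, close to 5.0
```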
Note on Validity - A/B Testing

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   Randomized correctly, i.e. samples balanced
External validity   Unbiased for full population         Experimental group representative of overall
Note on Validity - Regression Discontinuity Design

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   1. Imprecise control of assignment
                                                         2. No confounding discontinuities
External validity   Unbiased for full population         Homogeneous treatment effects
Method 2:
Internal Validity in RDD
Assumption 1: Imprecise control of assignment,
AKA no manipulation at the threshold
▪ Users cannot control whether just above
versus just below the cutoff
In example: Users cannot control their grade around
the cutoff (e.g., by asking for a re-grade).
How can we tell?
Check 1: Mass just below ~= Mass just above
✓ Even mass around cut-off
✗ Agency over assignment
Check 2: Composition of users in two buckets
similar along key observable dimension(s)
✓ Similar on observable
✗ Different on observable
Check for manipulation at the
threshold
1. Mass just below ~= Mass just above?
2. Just below vs. just above similar on key observables?
Assumption 2: No confounding discontinuities
▪ Being just above (versus just below) the cutoff
should not influence other features
In example: Assumes passing is the only
differentiator between a 69 and a 70
Watch out for confounding
discontinuities
Method 2:
External Validity in RDD
LATE: RDD estimates Local Average Treatment
Effect (LATE)
▪ “Local” around the cut-off
If treatment effects are heterogeneous, the
estimate may not apply to the full group.
But the interventions we’d consider would often
occur on the margin anyway
Estimated effect is “local” average
treatment effect around cut-off
Method 3:
Difference-in-Differences
Idea: Comparison of pre and post outcomes
between treatment and control groups
Example: Effect of lowering price on revenue?
▪ A/B test? Could, but may be perceived as
unfair
▪ Alternative: Quasi-experimental design + DD
Example cont’d:
▪ Change price + RDD in time? But if there is
co-timed marketing, a feature launch, or an external
shock, the counterfactual is no longer obvious...
Example cont’d:
▪ DD design: Change price in some geos (e.g.,
countries) but not others
▪ Use control markets to compute the
counterfactual in treatment markets
DD is more robust than RDD, so
design for DD where feasible
(Figure: outcome over time for control vs. treatment markets, with the date of the price change marked)
In R:
fit <- lm(Y ~ treatment +
post +
I(treatment * post),
data = … )
summary(fit)
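The interaction coefficient in that regression is exactly the difference of differences in the four group means. A Python simulation sketch (made-up effect sizes, not the deck's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
treatment = rng.integers(0, 2, n).astype(float)  # treated vs. control market
post = rng.integers(0, 2, n).astype(float)       # before vs. after the change
# parallel trends: both groups drift up 1.0 post; true treatment effect is 2.0
Y = (10 + 0.5 * treatment + 1.0 * post
     + 2.0 * treatment * post + rng.normal(size=n))

Z = np.column_stack([np.ones(n), treatment, post, treatment * post])
dd_estimate = np.linalg.lstsq(Z, Y, rcond=None)[0][3]  # close to 2.0

# identical to the difference of differences in cell means
m = lambda t, p: Y[(treatment == t) & (post == p)].mean()
dd_means = (m(1, 1) - m(1, 0)) - (m(0, 1) - m(0, 0))
```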
In R (with time trends):
fit <- lm(Y ~ time +
treatment +
I((time >= 0) * treatment),
data = … )
summary(fit)
Note on Validity - Difference-in-Differences

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   Parallel trends
External validity   Unbiased for full population         Homogeneous treatment effect
Method 3:
Internal Validity in DD
Assumption: Parallel trends
▪ Absent treatment, both groups would follow the same trends
In example: Treatment and control markets
would have followed the same trends had there
been no price change
How can we tell?
Pre-experiment:
▪ Make treatment and control similar
▫ Stratified randomization.
1. Stratify based on key attributes
2. Randomize within strata
3. Pool across strata
▫ Matched pairs. Historically followed similar
trends and/or are expected to respond
similarly to internal or external shocks
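The stratification steps above can be sketched in Python (stratum names and sizes are invented for illustration): randomize to treatment within each stratum, so the groups are balanced on the key attribute by construction.

```python
import numpy as np

rng = np.random.default_rng(5)
# 1. stratify markets on a key attribute (here, region)
region = np.repeat(["NA", "EU", "APAC"], [50, 30, 20])

# 2. randomize to treatment/control within each stratum
assign = np.empty(region.size, dtype=int)
for r in np.unique(region):
    idx = rng.permutation(np.flatnonzero(region == r))
    assign[idx[: idx.size // 2]] = 1   # treatment
    assign[idx[idx.size // 2:]] = 0    # control

# 3. pool across strata: treatment share is 1/2 within every region
balance = {r: assign[region == r].mean() for r in np.unique(region)}
```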
Pre-experiment (cont):
▪ Check graphically & statistically that
pre-experiment trends parallel
✓ Parallel trends
✗ NOT parallel trends
Design DD for parallel trends
1. Set-up: stratified randomization, matched pairs
2. Check: parallel trends ex ante
Post roll-out:
Problem 1: Confounder(s) in certain treatment or
control market(s), e.g., launch localized payments
Solution 1: Exclude those observations.
Post roll-out (cont):
Problem 2: Confounder(s) in a subset of treatment
and control market(s), e.g., Euro value plunges
Solution 2: Difference-in-Difference-in-Differences (triple differencing)
Consider excluding confounded
observations, or triple differencing
Method 3:
External Validity in DD
Assumption: Homogeneous treatment effects, as
with RDD
Pricing caveat: General equilibrium effects? In the
experiment, existing users are influenced by the price change
▫ Can cut on new users only
▫ See the Pricing Post for more pricing tips
Method 3:
Extension: Bayesian Approach
Idea: Construct a Bayesian structural time-series
model and use it to predict the counterfactual
Open-source resource: Google’s CausalImpact
Example: Discrete shock in given market, e.g.,
▪ PR announcement in India
▪ New partnership with Singaporean
government
A/B testing infeasible; CausalImpact compares
pre/post in treated/untreated markets
Method 4:
Fixed Effects Regression
Idea: A special type of controlled regression
▪ most commonly used with panel data
▪ often to capture heterogeneity across
individuals (or products) that is fixed over time
Example: Estimate the effect of price on conversion
▪ 1(pay) = α + β·1(price = $49) + X′Γ
▫ X is a vector of product dummies (fixed effects)
▫ Γ is a vector of product-specific intercepts
Note: Requires meaningful variation in X after
controlling for fixed effects.
In R:
fit <- lm(Y ~ X + factor(SKU),
data = …)
summary(fit)
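Equivalently, the dummy-variable regression above can be computed by demeaning within each product (the "within" estimator). An illustrative Python sketch (invented numbers) where price is correlated with a product-level quality effect, so pooled OLS is biased but the fixed-effects estimate is not:

```python
import numpy as np

rng = np.random.default_rng(3)
n_sku, t = 200, 50
sku = np.repeat(np.arange(n_sku), t)             # panel: 200 products x 50 periods
alpha = rng.normal(size=n_sku)[sku]              # product fixed effects
X = 0.5 * alpha + rng.normal(size=sku.size)      # price correlated with product
Y = 1.5 * X + alpha + rng.normal(size=sku.size)  # true effect of X is 1.5

def within(v):
    """Subtract each product's mean (the within transformation)."""
    means = np.bincount(sku, weights=v) / np.bincount(sku)
    return v - means[sku]

Xd, Yd = within(X), within(Y)
fe_estimate = (Xd @ Yd) / (Xd @ Xd)              # close to 1.5
pooled = np.linalg.lstsq(
    np.column_stack([np.ones(X.size), X]), Y, rcond=None)[0][1]  # biased upward
```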
Note on Validity - Fixed Effects

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   No time-varying omitted variables
                                                         correlated with X (strict exogeneity)
External validity   Unbiased for full population         Homogeneous treatment effects
Method 5:
Instrumental Variables
Idea: “Instrument” for the X of interest with some
feature, Z, that drives Y only through its effect
on X; back out the effect of X on Y
Requirements:
▪ Strong first stage: Z meaningfully affects X
▪ Exclusion restriction: Z affects Y only
through its effect on X
Implementation:
1. Instrument for X with Z
2. Estimate the effect of (instrumented) X on Y
In R:
library(AER)
fit <- ivreg(Y ~ X | Z, data = …)
summary(fit, vcov = sandwich,
        df = Inf, diagnostics = TRUE)
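Under the hood, ivreg is two-stage least squares. A Python simulation sketch (illustrative numbers): X is confounded by an unobservable U, a randomized instrument Z shifts X, and 2SLS recovers the true effect while plain OLS does not.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
U = rng.normal(size=n)                       # unobserved confounder
Z = rng.integers(0, 2, n).astype(float)      # instrument, e.g. an old A/B test arm
X = 0.5 * Z + U + rng.normal(size=n)         # strong first stage; X confounded by U
Y = 1.0 * X + 2.0 * U + rng.normal(size=n)   # true effect of X on Y is 1.0

# stage 1: regress X on Z; stage 2: regress Y on fitted X
A = np.column_stack([np.ones(n), Z])
Xhat = A @ np.linalg.lstsq(A, X, rcond=None)[0]
B = np.column_stack([np.ones(n), Xhat])
iv_estimate = np.linalg.lstsq(B, Y, rcond=None)[0][1]   # close to 1.0

ols_estimate = np.linalg.lstsq(
    np.column_stack([np.ones(n), X]), Y, rcond=None)[0][1]  # biased upward
```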
Sample output: (shown as a screenshot in the original slides)
Instruments in the real world? Often look to policies
Y          X                   Instrument                      Economist(s)
Earnings   Education           Vietnam draft lottery           Angrist
Earnings   Education           Compulsory schooling laws       Angrist & Krueger
Earnings   Education           Quarter of birth                Angrist & Krueger
Crime      Prison populations  Prison overcrowding litigation  Levitt
Crime      Police              Electoral cycles                Levitt
Instruments in tech? Everywhere! Especially old
A/B tests
Y                   X                               Instrument       Data Scientist
Platform retention  Having friends on the platform  Referral test 1  You!
                                                    Referral test 2  You!
                                                    Referral test 3  You!
                                                    ...              ...
Note on Validity - Instrumental Variables

Type                Definition                           Assumptions
Internal validity   Unbiased for subpopulation studied   1. Strong first stage
                                                         2. Exclusion restriction
External validity   Unbiased for full population         Homogeneous treatment effect
Method 5:
Internal Validity in IV
Assumption 1: Strong first stage
▪ The experiment we chose was “successful” at driving X
Why it matters: If Z is not a strong predictor of X,
the second-stage estimate will be biased.
How can we tell? Check the F-statistic on the first-stage
regression; the rule of thumb is F > 10
▪ `diagnostics = TRUE` in R will include a test for
weak instruments
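With a single instrument, the first-stage F-statistic is just the squared t-statistic on Z. A Python sketch of the check on simulated data (coefficients are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5_000
Z = rng.normal(size=n)
X = 0.3 * Z + rng.normal(size=n)      # first stage: X = pi0 + pi1 * Z + v

A = np.column_stack([np.ones(n), Z])
beta = np.linalg.lstsq(A, X, rcond=None)[0]
resid = X - A @ beta
sigma2 = resid @ resid / (n - 2)      # residual variance
se = np.sqrt(sigma2 * np.linalg.inv(A.T @ A)[1, 1])
F = (beta[1] / se) ** 2               # first-stage F; want it well above 10
```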
Assumption 2: Exclusion restriction
▪ Z affects Y only through X
How can we tell? There is no test; we have to go on logic
In the example:
✓ Control group got an otherwise equivalent email
✗ Control group got no email
Method 5:
External Validity in IV
LATE: IV estimates a Local Average Treatment
Effect (LATE)
▪ Relevant for the group impacted by the
instrument
If treatment effects are heterogeneous, the
estimate may not apply to the full group.
But the interventions we’d consider would often
occur on the margin anyway
Method 5:
Make-Your-Own-Instrument!
Instrumental variables via randomized encouragement designs
Extensions: ML + Causal Inference
Traditionally distinct literatures:
▪ Machine learning focuses on prediction
▫ Nonparametric prediction methods
▫ Cross-validation for model selection
▪ Economics and statistics focus on causality
Weaknesses of classic causal approaches:
▪ Fail with many covariates
▪ Model selection is unprincipled
ML + Causal Inference = <3
Extensions & New Directions:
ML + Causal Inference
Idea: Where there are many possible instruments,
use LASSO (penalized least squares) to select among
them
Benefits:
▪ Less prone to data mining → more robust
▪ Stronger first stage → less weak instrument
bias
Extensions & New Directions:
ML + Causal Inference: LASSO
Example: Want to estimate social spillovers in
movie consumption.
▪ Causal effect of viewership on later viewership?
▪ Instrument for viewership with weather
(Figure: effect of weather shocks on viewership)
Challenge: Potential set of instruments is large
▫ Risk of overfitting (e.g., including all)
▫ Risk of data mining (e.g., hand-picking)
Solution: Implement LASSO methods to estimate
optimal instruments in linear IV models with many
instruments
Extensions & New Directions:
ML + Causal Inference: Trees
Idea: Where treatment effects are heterogeneous,
use trees to identify subgroups
Example: Want to identify a partition of the
covariate space into subgroups based on
treatment effect heterogeneity
Solution: Athey & Imbens’ (2015) Causal Trees
▪ like regression trees, but focused on the MSE of
the treatment effect
▪ output is a treatment effect & CI by subgroup
Extensions & New Directions:
ML + Causal Inference: Forest
Idea: Extension of trees; want a personalized
estimate of the treatment effect
Solution: Wager & Athey (2015) Causal Forests
▪ estimate is the CATE (conditional average
treatment effect)
▪ predictions are asymptotically normal
▪ predictions are centered on the true effect
Home-grown resources
▪ Sample R code and simulation output available on GitHub
▪ Detailed context and more examples available on Medium
▪ References and reading list available here
Thanks!!
Any questions?
You can find me at @emilygsands &
emily@coursera.org

Research Design and ValidityResearch Design and Validity
Research Design and Validity
 
Pmp exam prep Pdf- 2
Pmp exam prep Pdf- 2Pmp exam prep Pdf- 2
Pmp exam prep Pdf- 2
 
Optimization Seminar.pptx
Optimization Seminar.pptxOptimization Seminar.pptx
Optimization Seminar.pptx
 
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.pptMarket Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
 
Evaluation Methods
Evaluation MethodsEvaluation Methods
Evaluation Methods
 
Construction of composite index: process & methods
Construction of composite index:  process & methodsConstruction of composite index:  process & methods
Construction of composite index: process & methods
 
Biostatistics ppt.pptx
Biostatistics ppt.pptxBiostatistics ppt.pptx
Biostatistics ppt.pptx
 
Risk Based Loan Approval Framework
Risk Based Loan Approval FrameworkRisk Based Loan Approval Framework
Risk Based Loan Approval Framework
 

ODSC Causal Inference Workshop (November 2016) (1)

  • 1. An Introduction to Causal Inference in Tech Emily Glassberg Sands emily@coursera.org, @emilygsands November 2016
  • 2. About me ● Harvard Economics PhD ● Data Science Manager @ Coursera econometrics causal inference experimental design labor markets & education
  • 3. Does X drive Y? ● Did PR coverage drive sign-ups? ● Does mobile app improve retention? ● Does customer support increase sales? ● Would lowering price increase revenues? ● ... Inspired by work with Duncan Gilchrist, Economist and Data Scientist @ Wealthfront
  • 4. Does X drive Y? 4 Raw Correlation ▪ Users engaging with X more likely to have outcome Y? ▫ Plot Y against X ▫ corr(X, Y) ▪ But beware confounding variables
  • 5. “Impact” of Mobile App Usage on Retention Mobile Usage? MoM Retention No 35% Yes 40% Selection Bias?
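The selection bias hinted at on this slide is easy to reproduce in a toy simulation (Python, with made-up numbers): suppose highly engaged users both adopt the mobile app and retain more, while the app itself has zero causal effect. The raw retention gap still appears.

```python
import random

random.seed(0)
n = 20_000

mobile_retained, mobile_n = 0, 0
web_retained, web_n = 0, 0
for _ in range(n):
    engagement = random.random()          # unobserved confounder
    uses_app = engagement > 0.6           # engaged users adopt the app
    # Retention depends ONLY on engagement -- the app has zero causal effect
    retained = random.random() < 0.20 + 0.40 * engagement
    if uses_app:
        mobile_n += 1; mobile_retained += retained
    else:
        web_n += 1; web_retained += retained

print(f"app users retained: {mobile_retained / mobile_n:.2f}")
print(f"non-app retained:   {web_retained / web_n:.2f}")
```

Despite a true effect of zero, app users retain noticeably better, purely through selection.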
  • 6.
  • 7. Does X drive Y? 7 Testing ▪ Randomly assign some users and not others an experience ▪ Estimate the causal effect of the experience on the outcome ▪ Often best path forward… ...but not in all cases
  • 8. Limitations of A/B Testing Consider user experience Consider ethics Consider effect on user trust
  • 9. 5 Econometric Methods for Causal Inference Controlled Regression Difference-in- Differences Fixed Effects Regression Instrumental Variables 9 Regression Discontinuity Design
  • 11. Method 1: Controlled Regression 11 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Idea: Control directly for the confounding variables in a regression of Y on X Assumption: Distribution of outcomes, Y, conditionally independent of treatment, X, given the confounders, C
  • 12. Method 1: Controlled Regression 12 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Example: Effect of live chat support on sales. ▪ Age confounder → Upward bias if regress sales on chat support ▪ Add control for age In R: fit <- lm(Y ~ X + C, data = ...) summary(fit)
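The chat-support example can be sketched in a short simulation (Python, invented coefficients): older users both use chat more and buy more, so the naive regression of sales on chat is biased upward. Partialling age out of both variables (the Frisch-Waugh-Lovell equivalent of adding it as a control in `lm(Y ~ X + C)`) recovers the true effect.

```python
import random

def slope(x, y):
    """OLS slope of y on x (single regressor plus intercept)."""
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

def residualize(y, x):
    """Residuals from regressing y on x (with intercept)."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - (a + b * xi) for yi, xi in zip(y, x)]

random.seed(1)
n = 5_000
TRUE_EFFECT = 2.0
C = [random.gauss(0, 1) for _ in range(n)]                 # confounder: age
X = [c + random.gauss(0, 1) for c in C]                    # chat usage, driven by age
Y = [TRUE_EFFECT * x + 3.0 * c + random.gauss(0, 1) for x, c in zip(X, C)]

naive = slope(X, Y)                                        # biased upward
controlled = slope(residualize(X, C), residualize(Y, C))   # close to 2.0
print(f"naive: {naive:.2f}  controlled: {controlled:.2f}")
```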
  • 13. Method 1: Controlled Regression 13 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pitfall 1: “Missing” controls → Omitted Variable Bias Can we tell how much of a problem? ▪ If adding proxies increases (adjusted) R-squared without impacting estimate, could be ok...* *Oster 15 provides a formal treatment.
  • 14. 14 ✓ Adding controls does NOT change point estimate
  • 15. Method 1: Controlled Regression 15 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables ▪ ...but if adding proxies to regression impacts coefficient on X, regression won’t suffice. Adding controls DOES change point estimate Relationship between Instructor & Enrollee Gender Share Enrollments F Any Instructor F .090*** .035*** (0.0076) (0.0074) Controls NO YES Adjusted R-squared 0.07 0.74 Base Group Mean 0.32 0.32
  • 16. Watch for omitted variables biasing coefficient of interest 16
  • 17. Method 1: Controlled Regression 17 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pitfall 2: “Bad” controls → Included Variable Bias Example: ▪ Suppose “interest in product” is confounder ▪ Control for proportion of emails opened? Not if directly impacted by treatment!
  • 18. Leave out “controls” that are themselves not fixed at the time treatment was determined 18
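The "bad control" pitfall can also be simulated (Python, illustrative numbers): here the treatment X is randomly assigned, so the simple regression is already unbiased; conditioning on email opens, which is itself affected by both X and the unobserved interest, distorts the estimate.

```python
import random

def slope(x, y):
    """OLS slope of y on x (single regressor plus intercept)."""
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

def residualize(y, x):
    """Residuals from regressing y on x (with intercept)."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - (a + b * xi) for yi, xi in zip(y, x)]

random.seed(2)
n = 5_000
TRUE_EFFECT = 1.0
interest = [random.gauss(0, 1) for _ in range(n)]   # unobserved confounder
X = [random.gauss(0, 1) for _ in range(n)]          # randomized treatment
# Email opens are affected by the treatment itself -- a "bad" control
opens = [x + i + random.gauss(0, 1) for x, i in zip(X, interest)]
Y = [TRUE_EFFECT * x + i + random.gauss(0, 1) for x, i in zip(X, interest)]

unadjusted = slope(X, Y)                                          # unbiased (~1.0)
bad_control = slope(residualize(X, opens), residualize(Y, opens)) # biased
print(f"unadjusted: {unadjusted:.2f}  with bad control: {bad_control:.2f}")
```

Adding the post-treatment "control" pulls the estimate well away from the truth even though randomization had already removed all confounding.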
  • 20. Method 2: Regression Discontinuity Design 20 Idea: Focus on a cut-off point that can be thought of as a local randomized experiment Example: Effect of passing course on income? ▪ A/B test? Randomly passing some, failing others unethical ▪ Controlled regression? Key unobservables like ability and motivation Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 21. Method 2: Regression Discontinuity Design 21 Example cont’d: Passing cutoff → natural experiment!! ▪ User earning 69 similar to user earning 70 ▪ Use discontinuity to estimate causal effect In R: Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables library(rdd) RDestimate(Y ~ D, data = …, subset = …, cutpoint = …)
  • 22. Method 2: Regression Discontinuity Design 22 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Threshold
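The comparison-at-the-cutoff idea can be sketched without the rdd package (Python, invented numbers): income trends smoothly in grade, plus a jump for passing at 70; comparing narrow bands on either side of the cutoff approximately recovers the jump. A real analysis would fit local linear regressions on each side, as RDestimate does, to remove the small trend bias this crude version leaves in.

```python
import random

random.seed(3)
CUTOFF, JUMP = 70, 5.0

below, above = [], []
for _ in range(50_000):
    grade = random.uniform(0, 100)            # running variable
    passed = grade >= CUTOFF
    # Smooth trend in grade, plus a discontinuous jump for passing
    income = 20 + 0.3 * grade + JUMP * passed + random.gauss(0, 2)
    if CUTOFF - 1 <= grade < CUTOFF:
        below.append(income)
    elif CUTOFF <= grade < CUTOFF + 1:
        above.append(income)

rdd_estimate = sum(above) / len(above) - sum(below) / len(below)
print(f"estimated jump at cutoff: {rdd_estimate:.2f}")
```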
  • 23. Note on Validity - A/B testing 23 Type Definition Assumptions Internal validity Unbiased for subpopulation studied Randomized correctly, i.e. samples balanced External validity Unbiased for full population Experimental group representative of overall
  • 24. Note on Validity - Regression Discontinuity Design 24 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Imprecise control of assignment 2. No confounding discontinuities External validity Unbiased for full population Homogeneous treatment effects
  • 25. Method 2: Internal Validity in RDD 25 Assumption 1: Imprecise control of assignment, AKA no manipulation at the threshold ▪ Users cannot control whether just above versus just below the cutoff In example: Cannot control grade around the cutoff (e.g., asking for re-grade). How can we tell? Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 26. Method 2: Internal Validity in RDD 26 Check 1: Mass just below ~= Mass just above Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables ✓ Even mass around cut-off Agency over assignment
  • 27. Method 2: Internal Validity in RDD 27 Check 2: Composition of users in two buckets similar along key observable dimension(s) Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables ✓ Similar on observable Different on observable
  • 28. Check for manipulation at the threshold 28 1. Mass just below ~= Mass just above? 2. Just below vs. just above similar on key observables?
  • 29. Method 2: Internal Validity in RDD 29 Assumption 2: No confounding discontinuities ▪ Being just above (versus just below) the cutoff should not influence other features In example: Assumes passing is the only differentiator between a 60 and a 70 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 30. Watch out for confounding discontinuities 30
  • 31. 31 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Imprecise control of assignment 2. No confounding discontinuities External validity Unbiased for full population Homogeneous treatment effects Note on Validity - Regression Discontinuity Design
  • 32. Method 2: External Validity in RDD 32 LATE: RDD estimates the Local Average Treatment Effect (LATE) ▪ “Local” around the cut-off If treatment effects are heterogeneous, the estimate may not apply to the full group. But the interventions we’d consider often occur at the margin anyway Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 33. Estimated effect is “local” average treatment effect around cut-off 33
  • 35. Method 3: Difference-in-Differences 35 Idea: Comparison of pre and post outcomes between treatment and control groups Example: Effect of lowering price on revenue? ▪ A/B test? Could, but may be perceived as unfair ▪ Alternative: Quasi-experimental design + DD Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 36. 36
  • 37. Method 3: Difference-in-Differences 37 Idea: Comparison of pre and post outcomes between treatment and control groups Example: Effect of lowering price on revenue? ▪ A/B test? Could, but may be perceived as unfair ▪ Alternative: Quasi-experimental design + DD Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 38. Method 3: Difference-in-Differences 38 Example con’t: ▪ Change price + RDD? But if co-timed marketing, feature launch, external shock …counterfactual no longer obvious... Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 39. Method 3: Difference-in-Differences 39 Example con’t: ▪ DD design. Change price in some geos (e.g., countries) but not others Use control markets to compute counterfactual in treatment markets Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 40. DD more robust than RDD so design for DD where feasible 40
  • 41. Method 3: Difference-in-Differences 41 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Control markets Treatment markets Date of Change
  • 42. Method 3: Difference-in-Differences 42 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables In R: fit <- lm(Y ~ treatment + post + I(treatment * post), data = … ) summary(fit)
  • 43. Method 3: Difference-in-Differences 43 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables In R (with time trends): fit <- lm(Y ~ time + treatment + I((time >= 0) * treatment), data = … ) summary(fit)
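The interaction coefficient in the regressions above equals the simple double difference, which can be checked by hand (Python, invented revenue numbers): treatment markets start from a higher baseline, both groups share a common time trend, and the DD estimate isolates the price effect while a naive pre/post comparison does not.

```python
import random

random.seed(5)
TRUE_EFFECT, TREND = 3.0, 2.0

def cell(mean, n=2_000):
    """Average outcome for one group/period cell of n noisy observations."""
    return sum(mean + random.gauss(0, 1) for _ in range(n)) / n

ctrl_pre   = cell(10.0)
ctrl_post  = cell(10.0 + TREND)
treat_pre  = cell(15.0)                          # treated markets start higher
treat_post = cell(15.0 + TREND + TRUE_EFFECT)    # common trend + price effect

naive = treat_post - treat_pre                   # contaminated by the time trend
dd = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(f"naive pre/post: {naive:.2f}   diff-in-diff: {dd:.2f}")
```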
  • 44. Method 3: Difference-in-Differences 44 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 45. Method 3: Difference-in-Differences 45 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Control markets Treatment markets Date of Change
  • 46. Note on Validity - Difference-in-Differences 46 Type Definition Assumptions Internal validity Unbiased for subpopulation studied Parallel trends External validity Unbiased for full population Homogeneous treatment effect
  • 47. Method 3: Internal Validity in DD 47 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Assumption: Parallel trends ▪ Absent treatment, same trends In example: Treatment and control markets would have followed same trends if no price change How can we tell?
  • 48. Method 3: Internal Validity in DD 48 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pre-experiment: ▪ Make treatment and control similar ▫ Stratified randomization. 1. Stratify based on key attributes 2. Randomize within strata 3. Pool across strata ▫ Matched pairs. Historically followed similar trends and/or are expected to respond similarly to internal or external shocks
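The three-step stratified-randomization recipe on this slide can be sketched directly (Python; the market names and "size" strata are hypothetical): stratify on a key attribute, randomize within each stratum, then pool, so treatment and control are balanced by construction.

```python
import random

random.seed(4)
# Hypothetical markets with a key attribute (size) to stratify on
markets = {"IN": "large", "US": "large", "BR": "large", "FR": "large",
           "SG": "small", "CL": "small", "KE": "small", "NO": "small"}

assignment = {}
strata = set(markets.values())
for stratum in strata:                 # 1. stratify based on key attributes
    members = sorted(m for m, s in markets.items() if s == stratum)
    random.shuffle(members)            # 2. randomize within strata
    half = len(members) // 2
    for m in members[:half]:
        assignment[m] = "treatment"
    for m in members[half:]:
        assignment[m] = "control"

# 3. pool across strata: each stratum contributes equal treatment/control counts
print(assignment)
```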
  • 49. Method 3: Internal Validity in DD 49 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Pre-experiment (cont): ▪ Check graphically & statistically that pre-experiment trends parallel ✓ Parallel trends NOT parallel trends
  • 50. Design DD for parallel trends 50 1. Set-up: stratified randomization, matched pairs 2. Check: parallel trends ex ante
  • 51. Method 3: Internal Validity in DD 51 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Post roll-out: Problem 1: Confounder(s) in certain treatment or control market(s), e.g., launch localized payments Solution 1: Exclude those observations.
  • 52. Method 3: Internal Validity in DD 52 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Post roll-out (cont): Problem 2: Confounder(s) in subset of treatment and control market(s), e.g., Euro value plunges Solution 2: Difference-in-Difference-in-Difference
  • 53. Consider excluding confounded observations, or triple differencing 53
  • 54. Note on Validity - Difference-in-Differences 54 Type Definition Assumptions Internal validity Unbiased for subpopulation studied Parallel trends External validity Unbiased for full population Homogeneous treatment effect
  • 55. Method 3: External Validity in DD 55 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Assumption: Homogeneous treatment effects, as with RDD Pricing caveat: General Equilibrium? In experiment, users influenced by price change ▫ Can cut on new users only ▫ See Pricing Post for more pricing tips
  • 56. Method 3: Extension: Bayesian Approach 56 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Idea: Construct a Bayesian structural time-series model and use to predict counterfactual Open source resource: Google’s CausalImpact
  • 57. Method 3: Extension: Bayesian Approach 57 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Example: Discrete shock in given market, e.g., ▪ PR announcement in India ▪ New partnership with Singaporean government A/B testing infeasible; CausalImpact compares pre/post in treated/untreated markets
  • 59. Method 4: Fixed Effects Regression 59 Idea: Special type of controlled regression ▪ most commonly used with panel data ▪ often to capture heterogeneity across individuals (or products) fixed over time Example: Estimate effect of price on conversion ▪ 1(pay) = α + β*1($49) + X’Γ ▫ X is vector of product fixed effects ▫ Γ is a vector of product-specific intercepts Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 60. Method 4: Fixed Effects Regression 60 In R: Note: Requires meaningful variation in X after controlling for fixed effects. Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables fit <- lm(Y ~ X + factor(SKU), data = …) summary(fit)
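The SKU dummies in `factor(SKU)` are numerically equivalent to demeaning within product, which makes for a compact simulation (Python, made-up data): premium products carry both higher prices and higher conversion, so pooled OLS masks the true negative price effect; the within-product estimator recovers it.

```python
import random

def slope(x, y):
    """OLS slope of y on x (single regressor plus intercept)."""
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

random.seed(6)
TRUE_EFFECT = -1.0
prices, outcomes, demeaned_p, demeaned_y = [], [], [], []
for _ in range(50):                    # 50 hypothetical SKUs
    quality = random.gauss(0, 1)       # fixed product effect
    p = [quality + random.gauss(0, 1) for _ in range(100)]   # price varies over time
    y = [TRUE_EFFECT * pi + 2.0 * quality + random.gauss(0, 1) for pi in p]
    prices += p; outcomes += y
    mp = sum(p) / len(p); my = sum(y) / len(y)
    demeaned_p += [pi - mp for pi in p]    # subtract product means = fixed effects
    demeaned_y += [yi - my for yi in y]

pooled = slope(prices, outcomes)           # biased toward zero by product quality
within = slope(demeaned_p, demeaned_y)     # close to -1.0
print(f"pooled: {pooled:.2f}   within-product: {within:.2f}")
```

Note how the within estimate relies entirely on price variation inside each product, which is the "meaningful variation in X after controlling for fixed effects" the next slide warns about.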
  • 61. Note on Validity - Fixed Effects 61 Type Definition Assumptions Internal validity Unbiased for subpopulation studied No unobserved confounders that vary within unit over time (strict exogeneity) External validity Unbiased for full population Homogeneous treatment effects
  • 63. Method 5: Instrumental Variables 63 Idea: “Instrument” for X of interest with some feature, Z, that drives Y only through its effect on X; back out effect of X on Y Requirements: ▪ Strong first stage: Z meaningfully affects X ▪ Exclusion restriction: Z affects Y only through its effect on X Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 64. Method 5: Instrumental Variables 64 Implementation: 1. Instrument for X with Z 2. Estimate the effect of (instrumented) X on Y In R: Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables library(AER) fit <- ivreg(Y ~ X | Z, data = …) summary(fit, vcov = sandwich, df = Inf, diagnostics = TRUE)
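Behind ivreg, the just-identified IV estimate is simply the reduced form over the first stage, which can be verified by hand (Python; the setup loosely mirrors the referral-test example and all numbers are invented): a randomized encouragement Z shifts X, an unobserved confounder U drives both X and Y, so OLS is biased, while the Wald ratio cov(Z,Y)/cov(Z,X) recovers the true effect.

```python
import random

def cov(a, b):
    n = len(a); ma = sum(a) / n; mb = sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

random.seed(7)
n = 20_000
TRUE_EFFECT = 2.0
Z = [random.random() < 0.5 for _ in range(n)]   # randomized encouragement (old A/B test)
U = [random.gauss(0, 1) for _ in range(n)]      # unobserved confounder
X = [z + u + random.gauss(0, 1) for z, u in zip(Z, U)]   # strong first stage
Y = [TRUE_EFFECT * x + 3.0 * u + random.gauss(0, 1) for x, u in zip(X, U)]

ols = cov(X, Y) / cov(X, X)     # biased upward by U
iv = cov(Z, Y) / cov(Z, X)      # Wald / 2SLS estimate
print(f"OLS: {ols:.2f}   IV: {iv:.2f}")
```

Because Z is randomized and (by construction here) affects Y only through X, the exclusion restriction holds exactly in this simulation; in practice it must be argued, not tested.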
  • 65. Method 5: Instrumental Variables 65 Sample output: Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 66. Method 5: Instrumental Variables 66 Instruments in real world? Often look to policies Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Y X Instrument Economist(s) Earnings Education Vietnam Draft lottery Angrist Compulsory schooling laws Angrist & Krueger Quarter of birth Angrist & Krueger Crime Prison populations Prison overcrowding litigation Levitt Police Electoral cycles Levitt
  • 67. Method 5: Instrumental Variables 67 Instruments in tech? Everywhere! Especially old A/B tests Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 68. Method 5: Instrumental Variables 68 Instruments in tech? Everywhere! Especially old A/B tests Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Y X Instrument Data Scientist Platform retention Having friends on the platform Referral test 1 You! Referral test 2 You! Referral test 3 You! ... ...
  • 69. Note on Validity - Instrumental Variables 69 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Strong first stage 2. Exclusion restriction External validity Unbiased for full population Homogeneous treatment effect
  • 70. Method 5: Internal Validity in IV 70 Assumption 1: Strong first stage ▪ Experiment we chose “successful” at driving X Why it matters: If Z is not a strong predictor of X, the second stage estimate will be biased. How can we tell? Check the F-statistic on the first stage regression; should be > 11 (rule-of-thumb) ▪ `diagnostics = TRUE` in summary() will include a test of weak instruments Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 71. Method 5: Internal Validity in IV 71 Assumption 2: Exclusion restriction ▪ Z affects Y only through X How can we tell? No test; have to go on logic In the example: ✓ Control group got otherwise equivalent email Control group got no email Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 72. Note on Validity - Instrumental Variables 72 Type Definition Assumptions Internal validity Unbiased for subpopulation studied 1. Strong first stage 2. Exclusion restriction External validity Unbiased for full population Homogeneous treatment effects
  • 73. Method 5: External Validity in IV 73 LATE: IV estimates a Local Average Treatment Effect (LATE) ▪ Relevant for the group impacted by the instrument If treatment effects are heterogeneous, the estimate may not apply to the full group. But the interventions we’d consider often occur at the margin anyway Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables
  • 74. Method 5: Make-Your-Own-Instrument! 74 Method 1: Controlled Regression Method 2: Regression Discontinuity Design Method 3: Difference-in- Differences Method 4: Fixed Effects Regression Method 5: Instrumental Variables Instrumental variables via randomized encouragement +
  • 75. Extensions: ML + Causal Inference 75
  • 76. Traditionally distinct literatures: ▪ Machine Learning focuses on prediction ▫ Nonparametric prediction methods ▫ Cross-validation for model selection ▪ Economics and statistics focus on causality Weaknesses of classic causal approaches: ▪ Fail with many covariates ▪ Model selection unprincipled ML + Causal Inference = <3 Extensions & New Directions: ML + Causal Inference
  • 77. Idea: In cases where many possible instrument sets, use LASSO (penalized least squares) to select instruments Benefits: ▪ Less prone to data mining → more robust ▪ Stronger first stage → less weak instrument bias Extensions & New Directions: ML + Causal Inference: LASSO
  • 78. Example: Want to estimate social spillovers in movie consumption. ▪ Causal effect of viewership on later viewership? ▪ Instrument for viewership with weather Extensions & New Directions: ML + Causal Inference: LASSO
  • 79. Extensions & New Directions: ML + Causal Inference: LASSO Effect of weather shocks on viewership
  • 80. Example: Want to estimate social spillovers in movie consumption. ▪ Causal effect of viewership on later viewership? ▪ Instrument for viewership with weather Challenge: Potential set of instruments large ▫ Risk of overfitting (e.g., including all) ▫ Risk of data mining (e.g., hand-picking) Solution: Implement LASSO methods to estimate optimal instruments in linear IV models with many instruments Extensions & New Directions: ML + Causal Inference: LASSO
  • 81. Extensions & New Directions: ML + Causal Inference: Trees Idea: In cases where heterogeneous treatment effects, use trees to identify subgroups Example: Want to identify a partition of the covariate space into subgroups based on treatment effect heterogeneity Solution: Athey & Imbens’ (2015) Causal Trees ▪ like regression trees but focuses on MSE of treatment effect ▪ output is treatment effect & CI by subgroup
  • 82. Extensions & New Directions: ML + Causal Inference: Forest Idea: Extension of trees; want personalized estimate of treatment effect Solution: Wager & Athey (2015) Causal Forests ▪ estimate is CATE (conditional average treatment effect) ▪ predictions are asymptotically normal ▪ predictions centered on the true effect
  • 83. Sample R code and simulation output available on GitHub Detailed context and more examples available on Medium References and reading list available here Home grown resources
  • 84. 84 Thanks!! Any questions? You can find me at @emilygsands & emily@coursera.org