This document introduces causal inference methods for determining whether a treatment or experience causes an outcome. It discusses five econometric methods: controlled regression, regression discontinuity design, difference-in-differences, fixed effects regression, and instrumental variables. For each method, it provides an example, explanation of the identifying assumptions, and tips for checking the internal validity of the method. The document emphasizes the importance of experimental design and testing assumptions to make causal inferences from non-experimental data.
1. An Introduction to Causal Inference in Tech
Emily Glassberg Sands
emily@coursera.org, @emilygsands
November 2016
2. About me
● Harvard Economics PhD
● Data Science Manager @ Coursera
● Econometrics, causal inference, experimental design, labor markets & education
3. Does X drive Y?
● Did PR coverage drive sign-ups?
● Does mobile app improve retention?
● Does customer support increase sales?
● Would lowering price increase revenues?
● ...
Inspired by work with Duncan Gilchrist, Economist and Data Scientist @ Wealthfront
4. Does X drive Y?
Raw Correlation
▪ Users engaging with X more likely to have outcome Y?
▫ Plot Y against X
▫ corr(X, Y)
▪ But beware confounding variables
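A quick first pass in R (illustrative, not from the deck; assumes a hypothetical data frame df with columns X and Y):
plot(df$X, df$Y)  # eyeball the relationship
cor(df$X, df$Y)   # raw correlation -- suggestive, not causal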
5. “Impact” of Mobile App Usage on Retention

Mobile Usage?   MoM Retention
No              35%
Yes             40%

Selection Bias?
7. Does X drive Y?
Testing
▪ Randomly assign some users and not others an experience
▪ Estimate the causal effect of the experience on the outcome
▪ Often best path forward… but not in all cases
8. Limitations of A/B Testing
▪ Consider user experience
▪ Consider ethics
▪ Consider effect on user trust
11. Method 1: Controlled Regression
Idea: Control directly for the confounding variables in a regression of Y on X
Assumption: Distribution of outcomes, Y, conditionally independent of treatment, X, given the confounders, C
12. Method 1: Controlled Regression
Example: Effect of live chat support on sales.
▪ Age confounder → upward bias if we regress sales on chat support alone
▪ Add control for age
In R:
fit <- lm(Y ~ X + C, data = ...)
summary(fit)
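To see the bias concretely, here is a small simulation sketch (not from the deck; all names illustrative). The confounder C drives both treatment and outcome, so the uncontrolled regression overstates the effect:
set.seed(1)
n <- 10000
C <- rnorm(n)                    # confounder, e.g., age
X <- rbinom(n, 1, plogis(C))     # treatment more likely when C is high
Y <- 1 * X + 2 * C + rnorm(n)    # true treatment effect is 1
coef(lm(Y ~ X))["X"]             # biased upward: omits C
coef(lm(Y ~ X + C))["X"]         # close to 1 once C is controlled for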
13. Method 1: Controlled Regression
Pitfall 1: “Missing” controls → Omitted Variable Bias
Can we tell how much of a problem?
▪ If adding proxies increases (adjusted) R-squared without impacting the estimate, could be ok...*
*Oster (2015) provides a formal treatment.
20. Method 2: Regression Discontinuity Design
Idea: Focus on a cut-off point that can be thought of as a local randomized experiment
Example: Effect of passing course on income?
▪ A/B test? Randomly passing some, failing others unethical
▪ Controlled regression? Key unobservables like ability and motivation
21. Method 2: Regression Discontinuity Design
Example cont’d: Passing cutoff → natural experiment!
▪ User earning 69 similar to user earning 70
▪ Use discontinuity to estimate causal effect
In R:
library(rdd)
RDestimate(Y ~ D, data = …,
subset = …, cutpoint = …)
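A self-contained sketch on simulated data (variable names illustrative; RDestimate takes the running variable on the right-hand side and estimates the jump at the cutpoint):
library(rdd)
set.seed(1)
grade <- runif(1000, 0, 100)                                   # running variable
income <- 30 + 5 * (grade >= 70) + 0.1 * grade + rnorm(1000)   # jump of 5 at the cutoff
summary(RDestimate(income ~ grade, cutpoint = 70))             # recovers ~5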
23. Note on Validity - A/B testing

Type               Definition                           Assumptions
Internal validity  Unbiased for subpopulation studied   Randomized correctly, i.e. samples balanced
External validity  Unbiased for full population         Experimental group representative of overall population
24. Note on Validity - Regression Discontinuity Design

Type               Definition                           Assumptions
Internal validity  Unbiased for subpopulation studied   1. Imprecise control of assignment; 2. No confounding discontinuities
External validity  Unbiased for full population         Homogeneous treatment effects
25. Method 2: Internal Validity in RDD
Assumption 1: Imprecise control of assignment, AKA no manipulation at the threshold
▪ Users cannot control whether they fall just above versus just below the cutoff
In example: Cannot control grade around the cutoff (e.g., by asking for a re-grade).
How can we tell?
26. Method 2: Internal Validity in RDD
Check 1: Mass just below ≈ mass just above
✓ Even mass around cut-off    ✗ Bunching suggests agency over assignment
27. Method 2: Internal Validity in RDD
Check 2: Composition of users in the two buckets similar along key observable dimension(s)
✓ Similar on observable    ✗ Different on observable
28. Check for manipulation at the threshold
1. Mass just below ≈ mass just above?
2. Just below vs. just above similar on key observables?
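Check 1 can be formalized with the McCrary (2008) density test, which the rdd package implements; a sketch using the hypothetical grade running variable from above:
library(rdd)
# Tests for a discontinuity in the density of the running variable at the
# cutoff; a small p-value is a red flag for manipulation
DCdensity(grade, cutpoint = 70)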
29. Method 2: Internal Validity in RDD
Assumption 2: No confounding discontinuities
▪ Being just above (versus just below) the cutoff should not influence other features
In example: Assumes passing is the only differentiator between a 69 and a 70
32. Method 2: External Validity in RDD
LATE: RDD estimates a Local Average Treatment Effect (LATE)
▪ “Local” around the cut-off
If treatment effects are heterogeneous, the estimate may not be applicable to the full group.
But interventions we’d consider would often occur on the margin anyway
35. Method 3: Difference-in-Differences
Idea: Comparison of pre and post outcomes between treatment and control groups
Example: Effect of lowering price on revenue?
▪ A/B test? Could, but may be perceived as unfair
▪ Alternative: Quasi-experimental design + DD
46. Note on Validity - Difference-in-Differences

Type               Definition                           Assumptions
Internal validity  Unbiased for subpopulation studied   Parallel trends
External validity  Unbiased for full population         Homogeneous treatment effect
47. Method 3: Internal Validity in DD
Assumption: Parallel trends
▪ Absent treatment, treatment and control follow the same trends
In example: Treatment and control markets would have followed same trends if no price change
How can we tell?
48. Method 3: Internal Validity in DD
Pre-experiment:
▪ Make treatment and control similar
▫ Stratified randomization:
1. Stratify based on key attributes
2. Randomize within strata
3. Pool across strata
▫ Matched pairs: pairs that historically followed similar trends and/or are expected to respond similarly to internal or external shocks
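The DD estimate itself is the coefficient on the interaction of treatment and the post period. A minimal sketch on simulated market-level data (not from the deck; names illustrative):
set.seed(1)
df <- expand.grid(market = 1:100, post = 0:1)
df$treat <- as.numeric(df$market <= 50)              # half the markets get the price change
df$Y <- 10 + 2 * df$treat + 3 * df$post +
  1.5 * df$treat * df$post + rnorm(nrow(df))         # true DD effect is 1.5
summary(lm(Y ~ treat * post, data = df))             # treat:post estimates ~1.5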
55. Method 3: External Validity in DD
Assumption: Homogeneous treatment effects, as with RDD
Pricing caveat: General equilibrium? In experiment, users influenced by price change
▫ Can cut on new users only
▫ See Pricing Post for more pricing tips
56. Method 3: Extension: Bayesian Approach
Idea: Construct a Bayesian structural time-series model and use it to predict the counterfactual
Open source resource: Google’s CausalImpact
57. Method 3: Extension: Bayesian Approach
Example: Discrete shock in a given market, e.g.,
▪ PR announcement in India
▪ New partnership with Singaporean government
A/B testing infeasible; CausalImpact compares pre/post in treated/untreated markets
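A minimal CausalImpact sketch on simulated series (using the package's documented API; the data here are made up). The first column is the treated market's metric, the second a control series it tracks pre-shock:
library(CausalImpact)
set.seed(1)
x <- 100 + arima.sim(model = list(ar = 0.5), n = 100)   # untreated control series
y <- 1.2 * x + rnorm(100)                               # treated series tracks x...
y[71:100] <- y[71:100] + 10                             # ...until a shock at t = 71
impact <- CausalImpact(cbind(y, x), pre.period = c(1, 70), post.period = c(71, 100))
summary(impact)                                         # estimated lift ~10
plot(impact)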
59. Method 4: Fixed Effects Regression
Idea: Special type of controlled regression
▪ most commonly used with panel data
▪ often to capture heterogeneity across individuals (or products) fixed over time
Example: Estimate effect of price on conversion
▪ 1(pay) = α + β·1($49) + X′Γ
▫ X is a vector of product fixed effects
▫ Γ is a vector of product-specific intercepts
60. Method 4: Fixed Effects Regression
Note: Requires meaningful variation in X after controlling for fixed effects.
In R:
fit <- lm(Y ~ X + factor(SKU),
data = …)
summary(fit)
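For intuition, a simulation sketch (illustrative names, not from the deck) in which an unobserved product-level quality drives both the price flag and conversion; the SKU fixed effects absorb it:
set.seed(1)
sku <- rep(1:20, each = 50)
quality <- rnorm(20)[sku]                         # product-level unobservable
X <- as.numeric(runif(1000) < plogis(quality))    # price flag correlates with quality
Y <- 0.5 * X + quality + rnorm(1000)              # true effect of X is 0.5
coef(lm(Y ~ X))["X"]                              # biased: picks up quality
coef(lm(Y ~ X + factor(sku)))["X"]                # ~0.5 with SKU fixed effects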
61. Note on Validity - Fixed Effects

Type               Definition                           Assumptions
Internal validity  Unbiased for subpopulation studied   No unobserved time-varying confounders (treatment exogenous given the fixed effects)
External validity  Unbiased for full population         Homogeneous treatment effects
63. Method 5: Instrumental Variables
Idea: “Instrument” for the X of interest with some feature, Z, that drives Y only through its effect on X; back out the effect of X on Y
Requirements:
▪ Strong first stage: Z meaningfully affects X
▪ Exclusion restriction: Z affects Y only through its effect on X
64. Method 5: Instrumental Variables
Implementation:
1. Instrument for X with Z
2. Estimate the effect of (instrumented) X on Y
In R:
library(AER)
fit <- ivreg(Y ~ X | Z, data = …)
summary(fit, vcov = sandwich,
df = Inf, diagnostics = TRUE)
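A self-contained sketch of the two-stage logic on simulated data (all names illustrative): a randomized test Z shifts take-up of X, and IV recovers the effect of X on Y despite an unobserved confounder:
library(AER)
set.seed(1)
n <- 10000
Z <- rbinom(n, 1, 0.5)                           # old A/B test assignment
U <- rnorm(n)                                    # unobserved confounder
X <- as.numeric(0.8 * Z + U + rnorm(n) > 0.5)    # treatment take-up
Y <- 1 * X + U + rnorm(n)                        # true effect of X is 1
coef(lm(Y ~ X))["X"]                             # biased by U
fit <- ivreg(Y ~ X | Z)
summary(fit, diagnostics = TRUE)                 # weak-instruments F-test included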
68. Method 5: Instrumental Variables
Instruments in tech? Everywhere! Especially old A/B tests

Y                   X                               Instrument       Data Scientist
Platform retention  Having friends on the platform  Referral test 1  You!
                                                    Referral test 2  You!
                                                    Referral test 3  You!
                                                    ...              ...
69. Note on Validity - Instrumental Variables

Type               Definition                           Assumptions
Internal validity  Unbiased for subpopulation studied   1. Strong first stage; 2. Exclusion restriction
External validity  Unbiased for full population         Homogeneous treatment effect
70. Method 5: Internal Validity in IV
Assumption 1: Strong first stage
▪ Experiment we chose “successful” at driving X
Why it matters: If Z is not a strong predictor of X, the second-stage estimate will be biased.
How can we tell? Check the F-statistic on the first-stage regression; should be > 10 (rule of thumb)
▪ `diagnostics = TRUE` in R will include a test of weak instruments
71. Method 5: Internal Validity in IV
Assumption 2: Exclusion restriction
▪ Z affects Y only through X
How can we tell? No test; have to go on logic
In the example:
✓ Control group got otherwise equivalent email
✗ Control group got no email
73. Method 5: External Validity in IV
LATE: IV estimates a Local Average Treatment Effect (LATE)
▪ Relevant for the group impacted by the instrument
If treatment effects are heterogeneous, the estimate may not be applicable to the full group.
But interventions we’d consider would often occur on the margin anyway
76. Extensions & New Directions: ML + Causal Inference
Traditionally distinct literatures:
▪ Machine Learning focuses on prediction
▫ Nonparametric prediction methods
▫ Cross-validation for model selection
▪ Economics and statistics focus on causality
Weaknesses of classic causal approaches:
▪ Fail with many covariates
▪ Model selection unprincipled
ML + Causal Inference = <3
77. Extensions & New Directions: ML + Causal Inference: LASSO
Idea: In cases where there are many possible instrument sets, use LASSO (penalized least squares) to select instruments
Benefits:
▪ Less prone to data mining → more robust
▪ Stronger first stage → less weak-instrument bias
78. Extensions & New Directions: ML + Causal Inference: LASSO
Example: Want to estimate social spillovers in movie consumption.
▪ Causal effect of viewership on later viewership?
▪ Instrument for viewership with weather
79. Extensions & New Directions: ML + Causal Inference: LASSO
[Figure: Effect of weather shocks on viewership]
80. Extensions & New Directions: ML + Causal Inference: LASSO
Example: Want to estimate social spillovers in movie consumption.
▪ Causal effect of viewership on later viewership?
▪ Instrument for viewership with weather
Challenge: Potential set of instruments large
▫ Risk of overfitting (e.g., including all)
▫ Risk of data mining (e.g., hand-picking)
Solution: Implement LASSO methods to estimate optimal instruments in linear IV models with many instruments
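One way to sketch the idea (a simplification of Belloni, Chen, Chernozhukov & Hansen's procedure, not their full post-LASSO inference; all names illustrative): LASSO picks the relevant columns of a wide instrument matrix for the first stage, then 2SLS uses only those:
library(glmnet)
library(AER)
set.seed(1)
n <- 2000; p <- 100
Z <- matrix(rnorm(n * p), n, p)                  # many candidate instruments
U <- rnorm(n)                                    # unobserved confounder
x <- Z[, 1] + 0.5 * Z[, 2] + U + rnorm(n)        # only two instruments matter
y <- 1 * x + U + rnorm(n)                        # true effect of x is 1
cvfit <- cv.glmnet(Z, x)                         # LASSO first stage
sel <- which(coef(cvfit, s = "lambda.min")[-1] != 0)
Zsel <- Z[, sel, drop = FALSE]
summary(ivreg(y ~ x | Zsel))                     # 2SLS on selected instruments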
81. Extensions & New Directions: ML + Causal Inference: Trees
Idea: In cases with heterogeneous treatment effects, use trees to identify subgroups
Example: Want to identify a partition of the covariate space into subgroups based on treatment-effect heterogeneity
Solution: Athey & Imbens’ (2015) Causal Trees
▪ like regression trees but focused on the MSE of the treatment effect
▪ output is treatment effect & CI by subgroup
82. Extensions & New Directions: ML + Causal Inference: Forests
Idea: Extension of trees; want a personalized estimate of the treatment effect
Solution: Wager & Athey (2015) Causal Forests
▪ estimate is the CATE (conditional average treatment effect)
▪ predictions are asymptotically normal
▪ predictions centered on the true effect
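These ideas now ship in the grf package (released after this talk); a minimal sketch on simulated data with a known heterogeneous effect:
library(grf)
set.seed(1)
n <- 2000; p <- 10
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)                    # randomized treatment
tau <- pmax(X[, 1], 0)                    # true effect varies with X1
Y <- tau * W + X[, 2] + rnorm(n)
cf <- causal_forest(X, Y, W)
tau.hat <- predict(cf)$predictions        # personalized CATE estimates
average_treatment_effect(cf)              # doubly robust ATE with std. error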
83. Home grown resources
▪ Sample R code and simulation output available on GitHub
▪ Detailed context and more examples available on Medium
▪ References and reading list available here