Here, there, causality is everywhere
AMIT SHARMA, MICROSOFT RESEARCH
http://www.amitsharma.in
@amt_shrma
My route to causality
Building recommender systems in social networks
Conducting user experiments
Estimating impact of recommendations and social feeds
Causality is everywhere
Spans every branch of science.
◦ Economics
◦ Political science
◦ Study of human behavior
◦ Biology and medicine
◦ Computer science (?)
Spans centuries of thought.
◦ Aristotle: “To know, is to know the final cause.”
It took until the 1930s to come up with the randomized experiment (Fisher).
Still early days for estimating causal effects from observational data.
Causality in economics
David Card. The causal effect of education on earnings (1999)
Conley and Heerwig. The Long-Term Effects of Military
Conscription on Mortality: Estimates From the Vietnam-Era Draft
Lottery (2012)
Causality in political science
Darrell West. Air Wars (2013)
Chattopadhyay and Duflo. Women as Policy Makers:
Evidence from a Randomized Policy Experiment in
India (2004)
Causality in human behavior
Thistlethwaite and Campbell. Effect of public recognition
of scholastic achievement (1960)
Christakis and Fowler. The collective dynamics
of smoking in a large social network (2008)
Causality in biology and
medicine
Effect of Vitamin D deficiency on colon cancer
Effect of heart attack surgery on long-term
health of patient
Causality in web applications
Sharma and Cosley. Distinguishing between personal preference
and homophily in online activity feeds (2016).
Sharma, Hofman and Watts. Estimating the causal impact of
recommender systems (2015).
Counterfactual reasoning
Correlation question: How well can X predict Y?
◦ Machine learning, Statistical estimation.
Interventionist question: If X is changed to X’, what will be
the value of Y?
◦ Experiments, Reinforcement learning, Contextual bandits.
Counterfactual question: If X had been X’, what would the
value of Y have been?
◦ Today’s focus.
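The counterfactual question can be made concrete with a toy structural equation. The sketch below is only an illustration: the equation Y = 2X + U, the function names, and the numbers are assumptions, not part of the talk. It follows the standard three-step recipe for counterfactuals (abduction, action, prediction).

```python
# Toy structural equation (an assumption for illustration): Y = 2*X + U,
# where U is an unobserved, unit-specific noise term.

def f_y(x, u):
    """Structural equation for Y."""
    return 2 * x + u


def counterfactual_y(x_obs, y_obs, x_new):
    """Answer 'had X been x_new, what would Y have been?' for one unit."""
    # 1. Abduction: recover this unit's noise term from what was observed.
    u = y_obs - 2 * x_obs
    # 2. Action: replace X with the hypothetical value x_new.
    # 3. Prediction: recompute Y with the same noise term.
    return f_y(x_new, u)


# A unit observed with X=1, Y=5 has U=3; we then ask about X'=2.
y_cf = counterfactual_y(x_obs=1, y_obs=5, x_new=2)
print(y_cf)
```

The counterfactual keeps the unit's own noise term fixed, which is what separates it from the interventionist question asked about a fresh unit.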
Estimating causal effects from
observational data
Why is causal inference hard?
◦ Simpson’s paradox
The language of graphical models
◦ Backdoor criterion
◦ Frontdoor criterion
Common approaches for causal inference
◦ Conditioning
◦ Mechanism-based
◦ Natural Experiments
Example: Estimating causal impact of recommender systems
Estimating the effectiveness of
kidney stone treatment
               Treatment A      Treatment B
Small stones   93% (81/87)      87% (234/270)
Large stones   73% (192/263)    69% (55/80)
Both           78% (273/350)    83% (289/350)
Julious and Mullee. Confounding and Simpson’s Paradox (1994).
http://en.wikipedia.org/wiki/Simpson’s_paradox
Two treatments for kidney stones
Treatment A : 78% effective
Treatment B : 83% effective
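The reversal in the kidney stone table can be checked directly from the counts. This is a minimal sketch using only the numbers in the table; the variable and function names are illustrative.

```python
# (successes, patients) per stone size and treatment, from the table.
data = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, total):
    return successes / total


def aggregate_rate(treatment):
    """Success rate pooled over both stone sizes."""
    s = sum(data[size][treatment][0] for size in data)
    n = sum(data[size][treatment][1] for size in data)
    return rate(s, n)


# Within each stone size, A does better than B ...
a_wins_each_stratum = all(
    rate(*data[size]["A"]) > rate(*data[size]["B"]) for size in data
)
# ... yet pooled over stone sizes, B looks better than A.
b_wins_overall = aggregate_rate("B") > aggregate_rate("A")

for size in data:
    print(size, round(rate(*data[size]["A"]), 3), round(rate(*data[size]["B"]), 3))
print("both", round(aggregate_rate("A"), 3), round(aggregate_rate("B"), 3))
print(a_wins_each_stratum, b_wins_overall)
```

Treatment A was given mostly to large-stone patients (263 of 350), who recover less often under either treatment, and that imbalance is what drives the pooled comparison.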
Estimating ad placement on a
search engine
Suppose we would like to optimize the set of ads shown for a query,
rather than optimize each ad individually.
Click probability estimates: q1 (1st ad), q2 (2nd ad)
Does q2 depend on q1?
Confounders in ad placement
Let us define two groups with 2000 queries each:
◦ High q1: (149/2000) CTR on second ad
◦ Low q1: (124/2000) CTR on second ad
          Low q1             High q1
Low q2    5.1% (92/1823)     4.8% (71/1500)
High q2   18.1% (32/176)     15.6% (78/500)
Both      6.2% (124/2000)    7.5% (149/2000)
Bottou et al. Counterfactual reasoning and learning systems
(2013).
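The same check can be run on the ad counts, together with a simple stratification adjustment. This is a sketch under one assumption: that the q2 group is the variable to adjust for, as the table layout suggests. Pooled rates are recomputed from the stratum counts as transcribed, so they may differ slightly from the "Both" row.

```python
# (clicks on second ad, impressions) per (q1 group, q2 group), from the table.
counts = {
    ("low_q1", "low_q2"): (92, 1823),
    ("low_q1", "high_q2"): (32, 176),
    ("high_q1", "low_q2"): (71, 1500),
    ("high_q1", "high_q2"): (78, 500),
}
Q1_LEVELS = ("low_q1", "high_q1")
Q2_LEVELS = ("low_q2", "high_q2")


def raw_ctr(q1):
    """CTR on the second ad, pooled over q2 groups."""
    clicks = sum(counts[(q1, z)][0] for z in Q2_LEVELS)
    shown = sum(counts[(q1, z)][1] for z in Q2_LEVELS)
    return clicks / shown


def adjusted_ctr(q1):
    """Stratum CTRs reweighted by the overall share of each q2 group."""
    total = sum(n for _, n in counts.values())
    result = 0.0
    for z in Q2_LEVELS:
        p_z = sum(counts[(g, z)][1] for g in Q1_LEVELS) / total
        clicks, shown = counts[(q1, z)]
        result += (clicks / shown) * p_z
    return result


for q1 in Q1_LEVELS:
    print(q1, round(raw_ctr(q1), 4), round(adjusted_ctr(q1), 4))
```

With these counts the raw comparison favors high q1, while the comparison after reweighting by q2 favors low q1.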
Causal graphical models: a
framework for causality
Structural equation modeling (SEM)
X = q1
Y = CTR on second ad
Which variables to condition
on?
Observed variables
◦ Which observed variables?
◦ As we will see, conditioning on all variables may not be correct.
Known unknowns:
◦ Age, Past diseases, Food intake
Unknown unknowns:
◦ What else could impact recovery from kidney stones?
◦ Genetic markers?
Connections to Bayesian
networks
Markov assumption: Probability of an effect is independent of
everything else given its direct causes.
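Written out, the Markov assumption says that each variable is independent of its non-descendants given its parents in the graph, so the joint distribution factorizes over the graph. The notation below is an assumption for illustration: pa(x_i) denotes the direct causes (parents) of x_i.

```latex
P(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} P\big(x_i \mid \mathrm{pa}(x_i)\big)
```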
Two approaches:
◦ Backdoor criterion
◦ Frontdoor criterion
Graphical Models and common
methods for causal estimation
Condition on observed covariates
• Stratification
• Matching
• Regression (?)
Mechanism-based strategies
• Path-based approaches
Natural experiments
• As-if experiments
• Instrumental variables
• Regression discontinuity
I. Conditioning on observed
covariates
Corresponds to Backdoor criterion.
a) Stratification
Condition on different levels of socio-economic status.
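Stratification corresponds to the backdoor adjustment formula. In the notation below (an assumption for illustration), X is the treatment, Y the outcome, and Z a set of observed covariates, such as socio-economic status, that satisfies the backdoor criterion.

```latex
P\big(y \mid \mathrm{do}(x)\big) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

The effect is estimated within each level of Z, and the stratum estimates are then averaged using the overall distribution of Z.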
b) Matching
Socio-Economic status is a function of parents’ income,
locality and other observed indicators.
b) Matching
Model propensity to attend a particular school.
P_school = f(PI, Loc, …)
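A minimal sketch of matching on a propensity score, using hypothetical toy records. All values and names are assumptions, and the propensity is estimated as a simple cell frequency rather than a fitted model f.

```python
from collections import defaultdict

# Hypothetical toy records: (parent_income, locality, attended_school, outcome).
records = (
    [("high", "urban", 1, 9.0)] * 3 + [("high", "urban", 0, 8.0)] * 1
    + [("low", "rural", 1, 6.0)] * 1 + [("low", "rural", 0, 5.0)] * 3
)


def mean(values):
    return sum(values) / len(values)


# Step 1: estimate the propensity P(attend | PI, Loc) as the share of
# attendees in each covariate cell (a stand-in for a fitted model).
cell_treatment = defaultdict(list)
for pi, loc, treated, _ in records:
    cell_treatment[(pi, loc)].append(treated)
propensity = {cell: mean(t) for cell, t in cell_treatment.items()}

# Step 2: group units by propensity score, so treated and control units
# are compared only when their scores match exactly.
by_score = defaultdict(lambda: {0: [], 1: []})
for pi, loc, treated, outcome in records:
    by_score[propensity[(pi, loc)]][treated].append(outcome)


def matched_effect():
    """Within-score outcome difference, weighted by number of treated units."""
    num, den = 0.0, 0
    for groups in by_score.values():
        if groups[0] and groups[1]:  # need both treated and control units
            n_treated = len(groups[1])
            num += n_treated * (mean(groups[1]) - mean(groups[0]))
            den += n_treated
    return num / den


# Naive comparison that ignores the covariates.
naive = mean([r[3] for r in records if r[2] == 1]) - mean(
    [r[3] for r in records if r[2] == 0]
)
print(naive, matched_effect())
```

In this toy data the naive difference is inflated because high-income families both attend more often and have higher outcomes. Comparing only units with equal propensity removes that imbalance.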
c) Regression
Condition on observed covariates by
adding them as independent variables
in regression.
Works only if the true causal relationship between variables is linear.
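A small sketch of conditioning through regression, on hypothetical noiseless data in which the true relationship is linear by construction (Y = 2X + 3Z, with Z confounding X and Y). The data and the helper name `ols` are assumptions for illustration.

```python
import numpy as np

# Hypothetical noiseless data: Z influences both X and Y, and Y = 2*X + 3*Z.
z = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
x = np.array([0, 1, 0, 1, 2, 3, 2, 3], dtype=float)
y = 2 * x + 3 * z


def ols(columns, target):
    """Least-squares coefficients, with the intercept in position 0."""
    design = np.column_stack([np.ones(len(target))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


naive_slope = ols([x], y)[1]        # regression of Y on X alone
adjusted_slope = ols([x, z], y)[1]  # Z added as an independent variable
print(round(naive_slope, 3), round(adjusted_slope, 3))
```

Adding Z recovers the coefficient on X only because the data were generated from a linear model. With a nonlinear relationship the same regression could remain biased.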
II. Mechanism-based
strategies
Corresponds to Frontdoor criterion.
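For reference, the frontdoor adjustment formula is given below. The symbols are assumptions for illustration: M is the mechanism (mediator) through which X affects Y, and the sum over x' ranges over all values of X.

```latex
P\big(y \mid \mathrm{do}(x)\big) \;=\; \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```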
III. Natural Experiments
Look for experiments happening in the real world.
Promise greater generalizability than controlled lab experiments.
Require greater care to ensure validity of causal identification.
a) (As-if) random experiments
b) Regression discontinuity
c) Instrumental variables
Shock!
Increase in
traffic
Summary: Two graphical criteria explain all of the conventional approaches
A principled, succinct framework for causality.
Allows arbitrary functional forms for relationships between variables.
Leads to clear statements about causal assumptions.
If a causal effect can be identified, it can be derived using do-calculus
(helpful for bigger graphs).
Product recommendations on Amazon
Do recommendations expose
people to new products?
Do recommendations lead to
more purchases?
Counterfactual reasoning
What would have happened if there were no recommendations?
X = Activity on the current item that the user is viewing
Y = Activity on the recommended item
UX = Latent properties of X
UY = Latent Properties of Y
Why is estimating effects of recommendations difficult using observational data?
If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX.
Figure: A causal graphical model for the impact of recommendations, with nodes AX, AY, UX and UY (ref. Pearl 09)
AX = Visits on a product X on Amazon
AY = Recommendation click-throughs from X
to Y
UX = Consumer demand for X
UY = Consumer demand for Y
Example: Looking for a machine learning book
Observed clickthrough data due to recommendations do not tell the full story.
For example, let’s assume I just completed the Artificial Intelligence book by Russell and Norvig and now I want to learn more about machine learning.
Xi: Focal Product
Yj: Recommended Products
Causal link
Convenience link
Revisit link
Wasted link
There could also be irrelevant links.
The Shock strategy (I.V.)
If direct visits to product Yj are nearly constant, then we
can assume that the convenience clicks to Yj will be
nearly constant.
Thus,
The Shock strategy
We cannot say much during normal traffic for a product. But if a product experiences a spike in
visits and its recommended product does not, then we can demonstrate a method to compute
the causal clickthrough rate.
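The argument above can be turned into a back-of-the-envelope calculation. The estimator below is an assumption, since the slide's own formula is not in this transcript. If convenience clicks to Yj stay constant during a shock to Xi, the extra click-throughs are attributed to the extra visits. The function name and all numbers are hypothetical.

```python
def causal_ctr_from_shock(visits_before, visits_during,
                          clicks_before, clicks_during):
    """Extra recommendation click-throughs per extra visit to the focal product.

    Assumes convenience clicks are the same before and during the shock,
    so they cancel in the difference (an assumption, as is the formula).
    """
    extra_visits = visits_during - visits_before
    if extra_visits <= 0:
        raise ValueError("no shock: visits to the focal product did not increase")
    extra_clicks = clicks_during - clicks_before
    return extra_clicks / extra_visits


# Hypothetical product: visits jump from 100 to 1000 per day, and
# click-throughs to the recommended product rise from 15 to 60.
naive_ctr = 60 / 1000  # counts convenience clicks as if caused by the recommendation
estimate = causal_ctr_from_shock(100, 1000, 15, 60)
print(naive_ctr, estimate)
```

The difference-based estimate comes out below the naive click-through rate whenever some clicks would have happened without the recommendation.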
Data description
Dataset: Anonymized Amazon URL log data from Bing toolbar for opted-in
users.
Nine months (Sept. 1 2013 to May 31 2014).
URL structure allows us to determine:
◦ Type of page visited (product, search, cart, bestsellers, wishlist)
◦ Type of referral to a product (recommendation, search, none, others)
After filtering out bots, sellers, authors, publishers and unpopular products (<5
visits):
◦ Number of products = 1.38 M
◦ Number of users = 2.1M
◦ 60 product categories (such as Books, Toys, Electronics)
Implementing the strategy: The shock criteria
Large: Visits during a shock must exceed 5 times the median traffic for a product.
Sudden: Visits during a shock must be 5 times the last day’s traffic and 5 times the last week’s traffic.
Sane: Visits from at least 10 unique users and on 5 different days before and after a shock.
4776 shocks to 4126 products
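The three criteria can be sketched as a filter over a product's daily visit counts. Several details are assumptions: "last week's traffic" is read as the mean daily visits over the previous seven days, "5 different days" is read as at least five days with any visits on each side of the shock, and the function and parameter names are hypothetical.

```python
from statistics import median


def is_shock(daily_visits, day, unique_users, factor=5, min_users=10, min_days=5):
    """Check the Large, Sudden and Sane criteria for one candidate day.

    daily_visits: visits per day for one product; day: index of the candidate.
    """
    if day < 7:
        return False  # assumption: require a full week of history
    visits = daily_visits[day]

    # Large: more than `factor` times the product's median daily traffic.
    large = visits > factor * median(daily_visits)

    # Sudden: at least `factor` times the previous day and the previous week
    # (weekly traffic taken as the mean over the last seven days).
    last_week_mean = sum(daily_visits[day - 7:day]) / 7
    sudden = (visits >= factor * daily_visits[day - 1]
              and visits >= factor * last_week_mean)

    # Sane: enough unique users, and activity on enough days on both sides.
    active_before = sum(1 for v in daily_visits[:day] if v > 0)
    active_after = sum(1 for v in daily_visits[day + 1:] if v > 0)
    sane = (unique_users >= min_users
            and active_before >= min_days
            and active_after >= min_days)

    return large and sudden and sane


spiky = [10, 12, 9, 11, 10, 13, 10, 12, 80, 11, 10, 12, 9, 11]
mild = [10, 12, 9, 11, 10, 13, 10, 12, 30, 11, 10, 12, 9, 11]
print(is_shock(spiky, day=8, unique_users=40))
print(is_shock(mild, day=8, unique_users=40))
```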
Implementing the strategy: The shock
criteria
Additionally, we want direct visits to Yj to be constant. The maximum change in direct visits to Yj should not be bigger
than the size of the shock.
When beta=1, ideally causal. When beta=0, all bets are off.
Good shock Bad shock (filtered out at beta=0.7)
Results: Fraction of causal clickthroughs by category
The majority of the clickthroughs are due to convenience.
Within any category, 5% or lower is a more accurate estimate of clickthroughs caused by recommendations.
Robustness checks
Shocks may not be representative
◦ The distributions of users, popularity, and the affinity between users and products
do not differ much for shocked products (except that shocked products are, on average,
more popular).
Shocks may be caused by deals which make the focal product more
attractive
◦ Verification using referrals from log data (e.g. bookbub.com) and manual
inspection of past prices (from camelcamelcamel.com)
Shocks may be a property of the weird holiday season.
◦ They occur throughout the data, although with more frequency during the
holidays.
Graphical models form a succinct,
sound and complete framework
for reasoning about causality.
They can also be practical.
THANK YOU!

More Related Content

What's hot

Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...Krishnaram Kenthapadi
 
The Next Big Thing in AI - Causality
The Next Big Thing in AI - CausalityThe Next Big Thing in AI - Causality
The Next Big Thing in AI - CausalityVaticle
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolAmit Sharma
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationSara Hooker
 
Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)Krishnaram Kenthapadi
 
Data Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksData Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksHima Patel
 
Time series forecasting with machine learning
Time series forecasting with machine learningTime series forecasting with machine learning
Time series forecasting with machine learningDr Wei Liu
 
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Praxitelis Nikolaos Kouroupetroglou
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionGianluca Bontempi
 
Curse of dimensionality
Curse of dimensionalityCurse of dimensionality
Curse of dimensionalityNikhil Sharma
 
初心者による初心者のための「質的データの二変量解析」
初心者による初心者のための「質的データの二変量解析」初心者による初心者のための「質的データの二変量解析」
初心者による初心者のための「質的データの二変量解析」Yasuyuki Okumura
 
Fairness and Bias in Machine Learning
Fairness and Bias in Machine LearningFairness and Bias in Machine Learning
Fairness and Bias in Machine LearningSurya Dutta
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsMd. Main Uddin Rony
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionVARUN KUMAR
 
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Edureka!
 
Over fitting underfitting
Over fitting underfittingOver fitting underfitting
Over fitting underfittingSivapriyaS12
 

What's hot (20)

Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (WS...
 
The Next Big Thing in AI - Causality
The Next Big Thing in AI - CausalityThe Next Big Thing in AI - Causality
The Next Big Thing in AI - Causality
 
Probability Theory for Data Scientists
Probability Theory for Data ScientistsProbability Theory for Data Scientists
Probability Theory for Data Scientists
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End tool
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and Evaluation
 
Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)
 
Data Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksData Quality for Machine Learning Tasks
Data Quality for Machine Learning Tasks
 
Time series forecasting with machine learning
Time series forecasting with machine learningTime series forecasting with machine learning
Time series forecasting with machine learning
 
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series Prediction
 
Anomaly detection
Anomaly detectionAnomaly detection
Anomaly detection
 
Curse of dimensionality
Curse of dimensionalityCurse of dimensionality
Curse of dimensionality
 
初心者による初心者のための「質的データの二変量解析」
初心者による初心者のための「質的データの二変量解析」初心者による初心者のための「質的データの二変量解析」
初心者による初心者のための「質的データの二変量解析」
 
Fairness and Bias in Machine Learning
Fairness and Bias in Machine LearningFairness and Bias in Machine Learning
Fairness and Bias in Machine Learning
 
Naive Bayes Presentation
Naive Bayes PresentationNaive Bayes Presentation
Naive Bayes Presentation
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
 
Timeseries forecasting
Timeseries forecastingTimeseries forecasting
Timeseries forecasting
 
Over fitting underfitting
Over fitting underfittingOver fitting underfitting
Over fitting underfitting
 

Viewers also liked

Causal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesCausal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesAmit Sharma
 
Data mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.comData mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.comAmit Sharma
 
From prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsFrom prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsAmit Sharma
 
Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)colegiommc
 
11 al 15 de julio
11 al 15 de julio11 al 15 de julio
11 al 15 de juliocolegiommc
 
Agenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzoAgenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzocolegiommc
 
типы химических связей
типы химических связейтипы химических связей
типы химических связейOlga Pishchik
 
Logistica elecciones 2014
Logistica elecciones 2014Logistica elecciones 2014
Logistica elecciones 2014colegiommc
 
Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Димка Куликов
 
бенефис почтенной книге
бенефис почтенной книгебенефис почтенной книге
бенефис почтенной книгеДимка Куликов
 
The role of social connections in shaping our preferences
The role of social connections in shaping our preferencesThe role of social connections in shaping our preferences
The role of social connections in shaping our preferencesAmit Sharma
 
Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Димка Куликов
 
Auditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographicsAuditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographicsAmit Sharma
 
фотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезафотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезаДимка Куликов
 
From Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsFrom Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsOfficeReports
 

Viewers also liked (20)

Causal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesCausal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practices
 
Data mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.comData mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.com
 
From prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsFrom prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systems
 
Semana 24
Semana 24Semana 24
Semana 24
 
Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)
 
11 al 15 de julio
11 al 15 de julio11 al 15 de julio
11 al 15 de julio
 
Agenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzoAgenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzo
 
Semana 19
Semana 19Semana 19
Semana 19
 
типы химических связей
типы химических связейтипы химических связей
типы химических связей
 
Logistica elecciones 2014
Logistica elecciones 2014Logistica elecciones 2014
Logistica elecciones 2014
 
Semana 20 (1)
Semana 20 (1)Semana 20 (1)
Semana 20 (1)
 
Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Обзор периодической печати колледжа.
Обзор периодической печати колледжа.
 
бенефис почтенной книге
бенефис почтенной книгебенефис почтенной книге
бенефис почтенной книге
 
гид2013
гид2013гид2013
гид2013
 
The role of social connections in shaping our preferences
The role of social connections in shaping our preferencesThe role of social connections in shaping our preferences
The role of social connections in shaping our preferences
 
тюз
тюзтюз
тюз
 
Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ.
 
Auditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographicsAuditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographics
 
фотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезафотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулеза
 
From Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsFrom Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and Icons
 

Similar to Causal inference in practice

What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldPyData
 
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift Conference
 
Measuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsMeasuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsAmit Sharma
 
Time Series Analysis
Time Series AnalysisTime Series Analysis
Time Series AnalysisAmanda Reed
 
Causal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereCausal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereAmit Sharma
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxwilcockiris
 
Business Optimization via Causal Inference
Business Optimization via Causal InferenceBusiness Optimization via Causal Inference
Business Optimization via Causal InferenceHanan Shteingart
 
Increasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing biasIncreasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing biasWilte Zijlstra
 
Sensitivity Analysis
Sensitivity AnalysisSensitivity Analysis
Sensitivity AnalysisBeth Johnson
 
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docxfelicidaddinwoodie
 
Measuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What DoesMeasuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What DoesJody Keyser
 
Research Design and Validity
Research Design and ValidityResearch Design and Validity
Research Design and ValidityHora Tjitra
 
A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...Contact Centre Management Group
 
The Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practiceThe Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practiceAmit Sharma
 
Possible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And JulietPossible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And JulietJamie Jackson
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Saurabh Mishra
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AISeth Grimes
 

Similar to Causal inference in practice (20)

What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
 
Measuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsMeasuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systems
 
Time Series Analysis
Time Series AnalysisTime Series Analysis
Time Series Analysis
 
Causal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereCausal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhere
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
 
Business Optimization via Causal Inference
Business Optimization via Causal InferenceBusiness Optimization via Causal Inference
Business Optimization via Causal Inference
 
Increasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing biasIncreasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing bias
 
Sensitivity Analysis
Sensitivity AnalysisSensitivity Analysis
Sensitivity Analysis
 
Slalom
SlalomSlalom
Slalom
 
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
 
Measuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What DoesMeasuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What Does
 
Research Design and Validity
Research Design and ValidityResearch Design and Validity
Research Design and Validity
 
A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...
 
The Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practiceThe Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practice
 
Possible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And JulietPossible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And Juliet
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AI
 

Causal inference in practice

  • 1. Here, there, causality is everywhere AMIT SHARMA, MICROSOFT RESEARCH http://www.amitsharma.in @amt_shrma
  • 2. My route to causality Building recommender systems in social networks Conducting user experiments Estimating impact of recommendations and social feeds
  • 3. Causality is everywhere Spans every branch of science. ◦ Economics ◦ Political science ◦ Study of human behavior ◦ Biology and medicine ◦ Computer science (?) Spans centuries of thought. ◦ Aristotle: “To know, is to know the final cause.” Took us until 1930s to come up with the randomized experiment (Fisher). Still early days for estimating causal effects from observational data.
  • 4. Causality in economics David Card. The causal effect of education on earnings (1999) Conley and Heerwig. The Long-Term Effects of Military Conscription on Mortality: Estimates From the Vietnam-Era Draft Lottery (2012)
  • 5. Causality in political science Darrell West. Air Wars (2013) Chattopadhyay and Duflo. Women as Policy Makers: Evidence from a Randomized Policy Experiment in India (2004)
  • 6. Causality in human behavior Thistlethwaite and Campbell. Effect of public recognition of scholastic achievement (1960) Christakis and Fowler. The collective dynamics of smoking in a large social network (2008)
  • 7. Causality in biology and medicine Effect of Vitamin D deficiency on colon cancer Effect of heart attack surgery on long-term health of patient
  • 8. Causality in web applications Sharma and Cosley. Distinguishing between personal preference and homophily in online activity feeds (2016). Sharma, Hofman and Watts. Estimating the causal impact of recommender systems (2015).
  • 9. Counterfactual reasoning Correlation question: How well can X predict Y? ◦ Machine learning, Statistical estimation. Interventionist question: If X is changed to X’, what will be the value of Y? ◦ Experiments, Reinforcement learning, Contextual bandits. Counterfactual question: If X would have been X’, what would be the value of Y? ◦ Today’s focus.
  • 10. Estimating causal effects from observational data Why is causal inference hard? ◦ Simpson’s paradox The language of graphical models ◦ Backdoor criterion ◦ Frontdoor criterion Common approaches for causal inference ◦ Conditioning ◦ Mechanism-based ◦ Natural Experiments Example: Estimating causal impact of recommender systems
  • 11. Estimating the effectiveness of kidney stone treatment
      Two treatments for kidney stones. Treatment A: 78% effective. Treatment B: 83% effective.
                      Treatment A      Treatment B
      Small stones    93% (81/87)      87% (234/270)
      Large stones    73% (192/263)    69% (55/80)
      Both            78% (273/350)    83% (289/350)
      Julious and Mullee. Confounding and Simpson’s Paradox (1994). http://en.wikipedia.org/wiki/Simpson’s_paradox
  • 12. Estimating ad placement on a search engine Suppose we would like to optimize the set of ads shown for a query, rather than optimize each ad individually. Click probability estimates: q1 for the ad in the 1st position, q2 for the ad in the 2nd position. Does q2 depend on q1?
  • 13. Confounders in ad placement
      Let us define two groups with 2000 queries each:
      ◦ High q1: (149/2000) CTR on second ad
      ◦ Low q1: (124/2000) CTR on second ad
                  Low q1              High q1
      Low q2      5.1% (92/1823)      4.8% (71/1500)
      High q2     18.1% (32/176)      15.6% (78/500)
      Both        6.2% (124/2000)     7.5% (149/2000)
      Bottou et al. Counterfactual reasoning and learning systems (2013).
  • 14. Causal graphical models: a framework for causality Structural equation modeling (SEM) X = q1 Y = CTR on second ad
  • 15. Which variables to condition on? Observed variables ◦ Which observed variables? ◦ As we will see, conditioning on all observed variables may not be correct. Known unknowns: ◦ Age, Past diseases, Food intake Unknown unknowns: ◦ What else could impact recovery from kidney stones? ◦ Genetic markers?
  • 16. Which variables to condition on?
  • 17. Connections to Bayesian networks Markov assumption: Probability of an effect is independent of everything else given its direct causes. Two approaches: --Backdoor criterion --Frontdoor criterion
  • 18. Graphical Models and common methods for causal estimation Condition on observed covariates • Stratification • Matching • Regression (?) Mechanism-based strategies • Path-based approaches Natural experiments • As-if experiments • Instrumental Variables • Regression discontinuity
  • 19. I. Conditioning on observed covariates Corresponds to Backdoor criterion.
  • 20. a) Stratification Condition on different levels of socio-economic status.
  • 21. b) Matching Socio-Economic status is a function of parents’ income, locality and other observed indicators.
  • 22. b) Matching Model propensity to attend a particular school. Pschool = f(PI, Loc, …)
  • 23. c) Regression Condition on observed covariates by adding them as independent variables in regression. Works only if true causal relationship between variables is linear.
  • 25. III. Natural Experiments Look for experiments happening in the real world. Promise greater generalizability than controlled lab experiments. Require greater care to ensure validity of causal identification.
  • 29. Summary: Two graphical criteria explain all of conventional approaches A principled, succinct framework for causality. Allows arbitrary functional forms for relationships between variables. Leads to clear statements about causal assumptions. If a causal effect can be identified, it can be derived using do-calculus (helpful for bigger graphs).
  • 30. Product recommendations on Amazon Do recommendations expose people to new products? Do recommendations lead to more purchases?
  • 31. Counterfactual reasoning What would have happened in case there were no recommendations?
  • 32. Why is estimating effects of recommendations difficult using observational data? X = Activity on current item that the user is viewing. Y = Activity on the recommended item. UX = Latent properties of X. UY = Latent properties of Y. If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX. Figure: a causal graphical model for the impact of recommendations, with nodes AX, AY, UX and UY (ref. Pearl 09).
  • 33. AX = Visits on a product X on Amazon. AY = Recommendation click-throughs from X to Y. UX = Consumer demand for X. UY = Consumer demand for Y. If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX. Figure: a causal graphical model for the impact of recommendations, with nodes AX, AY, UX and UY.
  • 34. Example: Looking for a machine learning book Observed clickthrough data due to recommendations do not tell the full story. For example, let’s assume I just completed the Artificial Intelligence book by Russell and Norvig and now I want to learn more about machine learning.
  • 35.
  • 36. Xi: Focal Product Yj: Recommended Products
  • 37. Xi: Focal Product Yj: Recommended Products Link types: Causal Link, Convenience Link, Revisit Link, Wasted Link. There could also be irrelevant links.
  • 38. The Shock strategy (I.V.) If direct visits to product Yj are nearly constant, then we can assume that the convenience clicks to Yj will be nearly constant. Thus,
  • 39. The Shock strategy We cannot say much during normal traffic for a product. But if a product experiences a spike in visits and its recommended product does not, then we can demonstrate a method to compute the causal clickthrough rate.
  • 40. Data description Dataset: Anonymized Amazon URL log data from Bing toolbar for opted-in users. Eight months (Sept. 1 2013 to May 31 2014). URL structure allows us to determine: ◦ Type of page visited (product, search, cart, bestsellers, wishlist) ◦ Type of referral to a product (recommendation, search, none, others) After filtering out bots, sellers, authors, publishers and unpopular products (<5 visits): ◦ Number of products = 1.38 M ◦ Number of users = 2.1M ◦ 60 product categories (such as Books, Toys, Electronics)
  • 41. Implementing the strategy: The shock criteria Large: Visits during a shock must exceed 5 times the median traffic for a product Sudden: Visits during a shock must be 5 times the last day’s traffic and 5 times the last week’s traffic Sane: Visits from at least 10 unique users and on 5 different days before and after a shock 4776 shocks to 4126 products
  • 42. Implementing the strategy: The shock criteria Additionally, we want direct visits to Yj to be constant. The maximum change in direct visits to Yj should not be bigger than a fraction beta of the size of the shock. When beta=0, ideally causal. When beta=1, all bets are off. Figure panels: a good shock, and a bad shock (filtered out at beta=0.7).
  • 43. Results: Fraction of causal clickthroughs by category Majority of the clickthroughs are due to convenience. Within any category, 5% or lower is a more accurate estimate of clickthroughs caused by recommendations.
  • 44. Robustness checks Shocks may not be representative ◦ The distribution of users, popularity and the affinity between users and products do not differ much (except that shocked products are, on average, more popular). Shocks may be caused by deals which make the focal product more attractive ◦ Verification using referrals from log data (e.g. bookbub.com) and manual inspection of past prices (from camelcamelcamel.com) Shocks may be a property of the unusual holiday season. ◦ They occur throughout the data, although with more frequency during the holidays.
  • 45. Graphical models form a succinct, sound and complete framework for reasoning about causality. They can also be practical. THANK YOU! AMIT SHARMA, MICROSOFT RESEARCH http://www.amitsharma.in @amt_shrma
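
The kidney stone numbers on slide 11 can be checked numerically. The following is a minimal Python sketch, not part of the original talk: it compares the pooled success rate with a rate adjusted for stone size, which is what conditioning under the backdoor criterion amounts to here. It assumes stone size is the only confounder; the names `outcomes`, `crude_rate` and `adjusted_rate` are illustrative.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size, copied from slide 11
# (Julious and Mullee, 1994).
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def crude_rate(treatment):
    """Success rate pooled over stone sizes, ignoring the confounder."""
    successes = sum(s for s, _ in outcomes[treatment].values())
    patients = sum(n for _, n in outcomes[treatment].values())
    return Fraction(successes, patients)


def adjusted_rate(treatment):
    """Stratified rate: sum over z of P(success | treatment, z) * P(z).

    ASSUMPTION: stone size z is the only confounder, so adjusting for it
    blocks every backdoor path between treatment and recovery.
    """
    total = sum(n for t in outcomes.values() for _, n in t.values())
    rate = Fraction(0)
    for size in ("small", "large"):
        s, n = outcomes[treatment][size]
        # P(z): share of all patients (both treatments) with this stone size.
        size_count = sum(outcomes[t][size][1] for t in outcomes)
        rate += Fraction(s, n) * Fraction(size_count, total)
    return rate


for t in ("A", "B"):
    print(t, f"crude={float(crude_rate(t)):.3f}",
          f"adjusted={float(adjusted_rate(t)):.3f}")
```

Pooled over stone sizes, Treatment B looks better; once each stratum is weighted by the overall share of small and large stones, Treatment A comes out ahead, which is the reversal the slide illustrates.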

Editor's Notes

  1. Similarly, you can think about personalized, adaptive books.
  2. We are looking for causal clickthroughs
  3. I just read AI by Russell. Now I search. So there could be convenient, revisits, causal and wasted links.
  4. I just read AI by Russell. Now I search. So there could be convenient, revisits, causal and wasted links.
  5. I just read AI by Russell. Now I search. So there could be convenient, revisits, causal and wasted links. In case of a book, think of the search results as the contents in a book in the normal order. And maybe we want to personalize that.
  6. Going back to the causal diagram.
  7. Talk about filtering …how there were som
  8. Change the figure.
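
The "Large" and "Sudden" shock criteria on slide 41 can be sketched in Python as below. This is a hedged illustration, not the code used in the study: it reads "last week's traffic" as the mean of the previous seven days and takes the median over the whole series, both of which are assumptions. The "Sane" criterion needs per-user data and is left out. The function name `find_shocks` and the `factor` parameter are hypothetical.

```python
from statistics import median


def find_shocks(daily_visits, factor=5):
    """Return indices of days that look like shocks for one product.

    daily_visits: visit counts, one per day, oldest first.
    ASSUMPTIONS (not stated on the slides): "last week's traffic" is the
    mean of the previous 7 days, and the median covers the full series.
    """
    baseline = median(daily_visits)
    shocks = []
    for day in range(7, len(daily_visits)):
        today = daily_visits[day]
        yesterday = daily_visits[day - 1]
        last_week = sum(daily_visits[day - 7:day]) / 7
        # Large: visits exceed `factor` times the median traffic.
        large = today > factor * baseline
        # Sudden: visits are at least `factor` times yesterday's traffic
        # and `factor` times last week's traffic.
        sudden = today >= factor * yesterday and today >= factor * last_week
        if large and sudden:
            shocks.append(day)
    return shocks


visits = [10, 12, 9, 11, 10, 13, 10, 11, 80, 30, 12, 10]
print(find_shocks(visits))
```

In this toy series only the jump to 80 visits qualifies; the following day (30 visits) fails the "Large" test against the median of 11.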