The Impact of Computing Systems | Causal inference in practice
Computing and machine learning systems affect almost all parts of our lives and society at large. How do we formulate and estimate the impact of these systems? This talk introduces causal inference as a methodology for answering such questions, with examples of applying it to estimate the impact of recommender systems, online social media feeds, search engines, and public-health interventions in India.
1.
The Impact of Computing Systems:
Causal inference in practice
Amit Sharma
Microsoft Research
www.amitsharma.in
Twitter: @amt_shrma
Email: amshar@microsoft.com
Summer School on Human-Centered AI
http://www.hcixb.org/
2.
I. How little we know about the systems we build
II. How can causal inference help?
4.
What is the impact of these systems on our lives?
Efficiency. Convenience. Inclusion. Fairness. Accountability. Transparency.
5.
What will be the impact of computing systems on their lives?
6.
(New?) social science of a world mediated by computing systems
Programming
Data science
Machine learning
Sensors and Systems
Sociology
Psychology
Ethics
Political Science
Economics
Development Studies
7.
Many different communities
• Human Computer Interaction (HCI)
• Human Factors in Computing Systems (CHI)
• Computer Supported Cooperative Work (CSCW)
• Science and Technology Studies (STS)
• Computational Social Science (CSS)
• Information & Communication Technology and Development (ICTD)
• Computing and Sustainable Societies (COMPASS)
9.
My path
“Intelligent systems that help people”
Recommendation systems
Social networking platforms
Prediction
Can we predict what you’ll be interested in?
“How much do recommender systems shape people’s decisions?”
“How much does a social NewsFeed influence people’s information access?”
“How do the recommender systems affect sellers on a platform?”
“How do you know that recommendations are having a positive impact?”
Causation
Can we estimate the effect of our recommendations?
10.
I. How little we know about
the systems we build
II. How can causal inference
help?
11.
1. What’s the right decision?
Use the social feed to predict a user's future activity (e.g., Likes).
• Future Likes = f(items in social feed) + 𝜖
This gives a highly predictive model.
“Would changing what a person sees in their feed change what they Like?”
a) Yes
b) No
c) Maybe, maybe not
12.
Prediction != Decision-making
Would changing what people see in the feed affect what a user likes? Maybe, maybe not (!)
[Two causal diagrams: (a) Homophily → Items in Social Feed and Homophily → Items liked by a user: predictability due to homophily; (b) Items in Social Feed → Items liked by a user: predictability due to feed influence.]
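The homophily confound above can be illustrated with a small simulation (the numbers and structure are invented for illustration, not from the talk): a latent interest drives both what appears in the feed and what the user likes, so the feed predicts likes well even though it has no causal effect on them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Homophily: a user's latent interest drives BOTH what friends put in the
# feed AND what the user likes; the feed has NO causal effect on likes here.
interest = rng.normal(size=n)
in_feed = (interest + rng.normal(scale=0.5, size=n)) > 0
liked = (interest + rng.normal(scale=0.5, size=n)) > 0

# The feed is highly predictive of likes...
predictive_acc = (in_feed == liked).mean()               # ~0.8, far above chance

# ...but an intervention that randomizes the feed predicts nothing:
random_feed = rng.random(n) > 0.5
acc_under_intervention = (random_feed == liked).mean()   # ~0.5, chance level
```

The gap between the two accuracies is exactly the "predictability due to homophily" in the diagram: it survives prediction but vanishes under intervention.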
14.
Comparing old versus new algorithm
Old Algorithm (A): 50/1000 (5.0%) | New Algorithm (B): 54/1000 (5.4%)
15.
Change in Success Rate by activity level
Low-activity users:  Old Algorithm (A): 10/400 (2.5%) | New Algorithm (B): 4/200 (2.0%)
High-activity users: Old Algorithm (A): 40/600 (6.7%) | New Algorithm (B): 50/800 (6.25%)
[Bar chart: success rate (SR) by group.]
16.
Is Algorithm A better? Which algorithm will you choose?
                         Old Algorithm (A)   New Algorithm (B)
CTR, low-activity users:  10/400 (2.5%)       4/200 (2.0%)
CTR, high-activity users: 40/600 (6.7%)       50/800 (6.25%)
Total CTR:                50/1000 (5.0%)      54/1000 (5.4%)
17.
Is Algorithm A still better? Simpson's paradox
                         Old Algorithm (A)              New Algorithm (B)
CTR, low-activity users:  Low-income: 1/200 (0.5%)      Low-income: 4/100 (4.0%)
                          High-income: 9/200 (4.5%)     High-income: 0/100 (0.0%)
CTR, high-activity users: Low-income: 10/500 (2.0%)     Low-income: 45/600 (7.5%)
                          High-income: 30/100 (30.0%)   High-income: 5/200 (2.5%)
Total CTR:                50/1000 (5.0%)                54/1000 (5.4%)
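The reversal in the tables above can be checked directly; a quick sketch using the slide's numbers:

```python
# (clicks, impressions) per (activity, income) stratum, from the tables above.
A = {("low", "low_inc"): (1, 200), ("low", "high_inc"): (9, 200),
     ("high", "low_inc"): (10, 500), ("high", "high_inc"): (30, 100)}
B = {("low", "low_inc"): (4, 100), ("low", "high_inc"): (0, 100),
     ("high", "low_inc"): (45, 600), ("high", "high_inc"): (5, 200)}

def ctr(strata):
    strata = list(strata)  # allow generator inputs
    clicks = sum(c for c, _ in strata)
    imps = sum(n for _, n in strata)
    return clicks / imps

total_A, total_B = ctr(A.values()), ctr(B.values())      # 5.0% vs 5.4%: B wins overall
low_A = ctr(v for k, v in A.items() if k[0] == "low")    # 2.5%
low_B = ctr(v for k, v in B.items() if k[0] == "low")    # 2.0%
high_A = ctr(v for k, v in A.items() if k[0] == "high")  # ~6.7%
high_B = ctr(v for k, v in B.items() if k[0] == "high")  # 6.25%
# Overall B wins, yet A wins within every activity level; splitting further
# by income flips some sub-strata yet again -- Simpson's paradox.
```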
18.
E.g., Algorithm A could have been shown at different times than B. There could be other hidden causal variations.
Answer (as usual): Maybe, maybe not.
19.
Example: Simpson's paradox in Reddit
Average comment length decreases over time. But for each yearly cohort of users, comment length increases over time.
21.
I. How little we know about the systems we build
II. How can causal inference help?
22.
Causality: An enigma that has attracted scholars for centuries
23.
From interventions to algorithmic interventions
What is the effect of a taxi-app's matching algorithm on people's incomes? What is the effect of algorithmic screening on a patient's health? What is the influence of an online social feed on a person's behavior?
24.
A practical definition
Definition: X causes Y iff changing X leads to a change in Y, keeping everything else constant.
The causal effect is the magnitude by which Y is changed by a unit change in X.
This is called the “interventionist” interpretation of causality.
http://plato.stanford.edu/entries/causation-mani/
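A tiny simulation makes the interventionist definition concrete (a sketch with made-up structural equations, not from the talk): intervening on X while keeping the rest of the system fixed recovers the true effect, while the naive observational slope does not.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n, do_x=None):
    # Toy structural model: Z -> X, Z -> Y, and X -> Y with true effect 2.0.
    z = rng.normal(size=n)
    x = z + rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 2.0 * x + 3.0 * z + rng.normal(size=n)
    return x, y

# Interventionist definition: change X by one unit, keep everything else fixed.
_, y1 = simulate(200_000, do_x=1.0)
_, y0 = simulate(200_000, do_x=0.0)
causal_effect = y1.mean() - y0.mean()            # ~2.0, the true effect

# The naive observational slope conflates X's effect with the confounder Z.
x, y = simulate(200_000)
naive_slope = np.cov(x, y)[0, 1] / np.var(x)     # ~3.5, biased upward
```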
26.
Powerful statistical frameworks
For more details, check out the KDD tutorial on causal inference by Emre Kiciman and me:
https://causalinference.gitlab.io/kdd-tutorial/
27.
Running example: Estimating the effect of an algorithm
28.
Lookback: We need answers to “what if” questions
http://plato.stanford.edu/entries/causation-counterfactual/
33.
But randomized experiments can be infeasible, costly, or even unethical…
34.
So how about comparing with a similar user instead of a random one?
35.
Continuing example: Effect of the algorithm on CTR
Does new Algorithm B increase CTR for recommendations on Windows Store, compared to old Algorithm A?
36.
Previous example: Effect of the algorithm on CTR
Does new Algorithm B increase CTR for recommendations on Windows Store, compared to old Algorithm A?
37.
Assumptions needed to estimate the effect of the algorithm
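Under the (strong) assumption that activity level is the only confounder of which algorithm a user saw, the adjustment can be sketched as a stratified, re-weighted comparison using the earlier table's numbers:

```python
# (clicks, impressions) by activity level, from the earlier table.
strata = {
    "low":  {"A": (10, 400), "B": (4, 200)},
    "high": {"A": (40, 600), "B": (50, 800)},
}

# Crude CTRs pool everything and favor B.
def crude(algo):
    clicks = sum(s[algo][0] for s in strata.values())
    imps = sum(s[algo][1] for s in strata.values())
    return clicks / imps

crude_A, crude_B = crude("A"), crude("B")   # 0.050 vs 0.054

# Back-door adjustment: weight each stratum's CTR by that stratum's share
# of ALL users, assuming activity level is the only confounder.
total = sum(s["A"][1] + s["B"][1] for s in strata.values())  # 2000 users

def adjusted(algo):
    est = 0.0
    for s in strata.values():
        c, n = s[algo]
        weight = (s["A"][1] + s["B"][1]) / total
        est += (c / n) * weight
    return est

adj_A, adj_B = adjusted("A"), adjusted("B")  # adjustment reverses the ranking
```

With the same stratum weights applied to both algorithms, A's adjusted CTR exceeds B's, reversing the crude comparison; the conclusion is only as good as the no-other-confounders assumption.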
42.
I. How little we know about the systems we build
II. How can causal inference help?
43.
Example 1: Causal effect of a social news feed
Amit Sharma, Dan Cosley (2016). Distinguishing Between Personal Preferences and Social Influence in Online Activity Feeds (Honorable Mention for Best Paper award). Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing.
44.
Example 1: Causal effect of a social newsfeed
[Diagram: user u's ego network of friends f1–f5, contrasted with a comparable set of non-friends n1–n5.]
45.
Example 2: Is a search engine fair to all its users?
Rishabh Mehrotra, Ashton Anderson, Fernando Diaz, Amit Sharma, Hanna Wallach, Emine Yilmaz (2017).
Auditing Search Engines for Differential Satisfaction Across Demographics. Proceedings of the 26th International
Conference on World Wide Web (Industry Track).
46.
Tricky: straightforward optimization can lead to differential performance
• The search engine uses a standard metric: time spent on the clicked result page as an indicator of satisfaction.
• Goal: estimate the difference in user satisfaction between two demographic groups.
• Suppose older users issue more “retirement planning” queries.
[Illustration: for such queries, users aged >50 make up 80% of users; users aged <30 make up 10%.]
47.
Overall metrics can hide differential satisfaction
• Average user satisfaction for “retirement planning” may be high. But,
• Average satisfaction for younger users = 0.7
• Average satisfaction for older users = 0.2
49.
Pitfalls with overall metrics
• They conflate two separate effects:
• Natural demographic variation caused by differing traits among the demographic groups, e.g.:
• Different queries issued
• Different information need for the same query
• Even at the same satisfaction level, demographic A tends to click more than demographic B
• Systemic differences in user satisfaction due to the search engine itself
50.
Utilize work from causal inference
[Causal diagram with nodes: Demographics, Information Need, Query, Search Results, User satisfaction, Metric.]
51.
I. Context Matching: selecting for activity with near-identical context
[The same causal diagram, with a Context node added.]
52.
[The same causal diagram with Context.]
For any two users from different demographics, require:
1. Same query
2. Same information need:
   a. Control for user intent: same final SAT click
   b. Only consider navigational queries
3. Identical top-8 search results
Data: 1.2M impressions, 19K unique queries, 617K users
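A minimal sketch of the context-matching idea (the toy log rows and field names are hypothetical): group impressions by identical query-plus-results context, keep only contexts observed for both demographics, then compare satisfaction within each matched context.

```python
from collections import defaultdict

# Hypothetical impression log rows: (query, top_results_id, demographic, satisfied)
impressions = [
    ("weather", "r1", "young", 1), ("weather", "r1", "old", 0),
    ("weather", "r1", "young", 1), ("weather", "r1", "old", 1),
    ("news", "r2", "young", 0), ("news", "r2", "old", 0),
    ("news", "r3", "young", 1),   # no matched older user -> context is dropped
]

def mean(xs):
    return sum(xs) / len(xs)

# Group by identical context (query + identical top results).
by_context = defaultdict(lambda: defaultdict(list))
for query, results, demo, sat in impressions:
    by_context[(query, results)][demo].append(sat)

# Keep only contexts observed for BOTH demographics; compare within them.
diffs = [mean(g["young"]) - mean(g["old"])
         for g in by_context.values() if {"young", "old"} <= g.keys()]
avg_gap = mean(diffs)   # per-context satisfaction gap, averaged
```

Because each difference is computed within a near-identical context, demographic variation in queries and results is held fixed, leaving the systemic gap the audit is after.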
54.
Example 3: Effect of a recommendation system
55.
Confounding: observed click-throughs may be due to correlated demand
[Causal diagram: Demand for The Road → Visits to The Road → Rec. visits to No Country for Old Men ← Demand for No Country for Old Men; the two demands are correlated.]
56.
Observational click-through rate overestimates the causal effect
Amit Sharma, Jake M Hofman, Duncan J Watts (2018). Split-door criterion: Identification of causal effects
through auxiliary outcomes. The Annals of Applied Statistics.
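The overestimate is easy to reproduce in a toy simulation (all numbers invented for illustration): when demand for the two products is correlated, observed click-through among visitors exceeds the true causal conversion rate of the recommendation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Correlated demand for a focal product ("The Road") and a recommended one
# ("No Country for Old Men"), driven by a shared latent taste.
latent = rng.normal(size=n)
demand_focal = latent + rng.normal(size=n)
demand_rec = latent + rng.normal(size=n)

visit_focal = demand_focal > 1.0
# Recommendation clicks: a true causal nudge converts 5% of focal visitors,
# but visitors may also click because they independently wanted the item.
causal_click = visit_focal & (rng.random(n) < 0.05)
own_demand_click = visit_focal & (demand_rec > 1.5)
rec_visit = causal_click | own_demand_click

observed_ctr = rec_visit[visit_focal].mean()   # what the logs show
true_causal_rate = 0.05                        # the effect we actually wanted
```

The observed click-through rate bundles the causal nudge with visits the user would have made anyway, which is exactly the bias the split-door criterion is designed to remove.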
57.
Example 4: Prioritizing tuberculosis patients
for followup
• TB is the leading infectious cause of death globally
• TB treatment takes 6 months or more
• Poor adherence to treatment increases risk of relapse, drug
resistance, and death
• India’s government TB program has used Directly Observed
Treatment (DOT) to monitor adherence, but effort-intensive
for patients and providers
Jackson A Killian, Bryan Wilder, Amit Sharma, Vinod Choudhary, Bistra Dilkina, Milind Tambe (2019). Learning to
Prescribe Interventions for Tuberculosis Patients using Digital Adherence Data. Proc. KDD 2019.
58.
Background: How 99Dots works
* Slide content sourced from Everwell.
59.
Combination of Caller ID and numbers called shows that doses are in the patient's hands.
Background: How 99Dots works
* Slide content sourced from Everwell.
60.
Two questions
• “How to help health workers reprioritize their interventions?”
• “Looking at a week’s data, can we predict adherence for the next week?”
61.
Machine learning task
• Input (t-7, t):
• Demographic features (age, gender, location)
• Call details (number of calls, time of calls, days between calls, etc.)
• Output (t, t+7):
• Number of calls in the next week
The model obtains nearly 0.85 AUC.
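A sketch of the kind of last-week features such a model might consume (the helper, field names, and the example patient are all hypothetical, not from the 99Dots system):

```python
# Build simple last-week features per patient from the inputs listed above.
def weekly_features(call_days, call_hours, age, female):
    n_calls = len(call_days)
    gaps = [b - a for a, b in zip(call_days, call_days[1:])]
    return {
        "n_calls": n_calls,                                   # calls made this week
        "max_gap": max(gaps) if gaps else 7,                  # longest silent stretch
        "mean_hour": sum(call_hours) / n_calls if n_calls else None,
        "age": age,
        "female": female,
    }

f = weekly_features(call_days=[0, 1, 3, 4, 6],
                    call_hours=[9, 9, 10, 9, 11], age=34, female=1)
```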
62.
A tale of two worlds
• Person makes no calls in week 1, intervention, starts making calls in week 2
• Person makes no calls in week 1, intervention, no calls in week 2
63.
A causal model for interventions
[Causal diagram: Person's Behavior (t-1) → Call to 99Dots (t-1) → Health worker's intervention → Person's Behavior (t) → Call to 99Dots (t).]
64.
Domain-based filtering solution
• 99Dots records a suggested attention level for each patient:
• High: 4 or more calls missed in the last week
• Medium: 1 to 4 calls missed in the last week
• Low: no missed calls
Medium → High?
• Given last week's data, can we predict whether a person moves from Medium to High attention?
65.
A more complex model with lower accuracy, but it is able to save more missed doses
66.
Example 5: What is the effect of peer support on mental health forums?
67.
Talklife: thousands of “counselling” conversations online
• A social network for peer support
• People experiencing mental distress can post on Talklife and get support from their peers.
• Global network, but also has Indian users
• Can we identify patterns of successful peer support conversations?
“Moments of cognitive change”
Yada Pruksachatkun, Sachin R. Pendse, Amit Sharma (2019). Moments of Change: Analyzing Peer-Based Cognitive
Support in Online Mental Health Forums. Proceedings of the 2019 CHI Conference on Human Factors in Computing
Systems.
68.
Summary
People + Computing
• Our lives are being mediated by computing systems, often using
predictive models.
• The impact can shape the future of our society!
• But their impact is far from obvious.
• Naïve prediction metrics can lead us astray.
Need causal reasoning + understanding context
69.
Thank you
Amit Sharma
@amt_shrma
www.amitsharma.in
• Our lives are being mediated by computing systems, often using
predictive models.
• The impact can shape the future of our society!
• But their impact is far from obvious.
• Naïve prediction metrics can lead us astray.
Need causal reasoning + understanding context