Matching Methods and Natural Experiments
Examples of Causal Inference from Social Media
Ingmar Weber
@ingmarweber
Given at OSSM Workshop at ICWSM’17
https://www.microsoft.com/en-us/research/event/ossm17/
Featuring joint work with Tiago Cunha, Gisele Pappa, Yelena Mejova and Sofiane Abbar
Slides partly based on joint WWW’16 tutorial with M. Strohmaier, C. Wagner and L. Aiello
[Figure: performance over time, with the point of training marked]
Did the training cause an increase in performance?
Estimating Causal Effect
Causal effect for unit i: Y_i(T) - Y_i(C)
Problem: Y(T) and Y(C) are only observed for different units i and j
Ideal Solution: Run an Experiment
Gold Standard Method for causal inference
Randomly assign subjects to either treatment or control group
Assume both groups large enough to wash out differences in covariates
[More on best assignment strategies later]
If done correctly, no need for fancy analysis.
Average(treatment group) - Average(control group)
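A minimal simulated sketch of the difference-in-means estimate above. All numbers (sample size, effect size, the `skill` covariate) are made up for illustration and are not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_effect = 2.0  # hypothetical treatment effect

# A covariate that influences the outcome (e.g. prior skill).
skill = rng.normal(size=n)

# Random assignment: one fair coin flip per subject, independent of skill.
treated = rng.integers(0, 2, size=n).astype(bool)

# Outcome = baseline + covariate influence + treatment effect + noise.
outcome = 5.0 + 3.0 * skill + true_effect * treated + rng.normal(size=n)

# Average(treatment group) - Average(control group)
estimate = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimated effect: {estimate:.2f} (true effect: {true_effect})")
```

Because assignment ignores `skill`, both groups have roughly the same skill distribution and the plain difference lands close to the true effect.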
Limitations of Experiments
• Expensive
• Not all treatments are possible and ethical
• Internal validity is high, but external validity, i.e. generalization,
is often limited
• Non-interference assumption is often violated in social
science field experiments, i.e. i’s treatment affects j’s
outcome
Alternative Solution:
Causal Inference from Observational Data
Approach 1: Natural Experiments
Has nature already done the work for us?
Is there a (partly) random observed assignment?
Approach 2: Matching Methods
Try to find pairs of as-similar-as-possible participants.
One happened to get treated, the other not
Part 1: Natural Experiments
1854 Cholera Outbreak in London
Assumption at the time: cholera is an air-borne disease
John Snow’s hypothesis: cholera is a water-borne disease
Famous use of mapping
Water company            # houses   Deaths/10k
Southwark and Vauxhall   40,046     315
Lambeth                  26,107     37
Assignment to water company as-if random
• Could differ from one house to next
• Tenants don’t know their company
Lambeth’s inlet upstream = clean water
S&V’s inlet downstream = contaminated water
First (?) use of natural experiment!
There was one significant anomaly –
none of the workers in the nearby
Broad Street brewery contracted
cholera. They were given a daily
allowance of beer, and did not
consume water from the nearby well.
The water used in the brewing process
is boiled during mashing which kills
cholera bacteria. [Wikipedia]
Natural Experiments with Instrumental Variables
Study by Angrist 1990:
What is the effect of military service (M) on lifetime
earnings (E)?
+: Improve self discipline? Get external recognition?
Join network of “alumni”? – Could increase earnings.
-: Lose actual job experience? Become traumatized?
Lose touch with society? – Could decrease earnings.
Why not just compute:
Exp. earnings for military joiners – exp. earnings for others
E [ E | M=1] - E [ E | M=0 ]
Limits of Linear Regression
Assume the structural equation:
E = α + β * M + ε
Error term ε stands for all exogenous factors that affect E
when M is held constant
Crucial assumption: Cov(M,ε) = 0 (not correlated)
Actual model:
[Diagram: P = earning potential (unobserved), M = military service, E = lifetime earnings; P influences both M and E, and M influences E]
Limits of Linear Regression
(P)otential   (M)ilitary service   (E)arnings   count
High = 1      Yes = 1              40,000       20
High = 1      No = 0               30,000       80
Low = 0       Yes = 1              20,000       80
Low = 0       No = 0               10,000       20
E [ E | M=1] = (20*40k + 80*20k)/100 = 24k
E [ E | M=0 ] = (80*30k + 20*10k)/100 = 26k
Fitted linear regression: E = 26k - 2k*M + ε
At this stage you have enforced Cov(M,ε)=0
True structural equation:
E = 10k + 10k*M + 20k*P
Example of
Simpson’s Paradox
Wrong assumption => biased estimator
Cov(M,P) = -0.15, r = -0.6
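The numbers on this slide can be re-derived from the toy table. The sketch below expands the four table rows into one record per person and fits both regressions with NumPy; variable names are illustrative.

```python
import numpy as np

# Rows of the toy table: (P, M, earnings, count)
rows = [(1, 1, 40_000, 20), (1, 0, 30_000, 80),
        (0, 1, 20_000, 80), (0, 0, 10_000, 20)]
counts = [r[3] for r in rows]

# Expand to one record per person.
P = np.repeat([r[0] for r in rows], counts).astype(float)
M = np.repeat([r[1] for r in rows], counts).astype(float)
E = np.repeat([r[2] for r in rows], counts).astype(float)

# Naive comparison: E[E | M=1] - E[E | M=0]
naive = E[M == 1].mean() - E[M == 0].mean()

# Regression that omits P: reproduces the biased slope on M.
X_short = np.column_stack([np.ones_like(M), M])
coef_short, *_ = np.linalg.lstsq(X_short, E, rcond=None)

# Regression that includes P: recovers the structural equation.
X_full = np.column_stack([np.ones_like(M), M, P])
coef_full, *_ = np.linalg.lstsq(X_full, E, rcond=None)

cov_MP = np.cov(M, P, bias=True)[0, 1]   # population covariance
r_MP = np.corrcoef(M, P)[0, 1]

print("naive difference:", naive)
print("short regression (intercept, M):", coef_short)
print("full regression (intercept, M, P):", coef_full)
print("Cov(M,P):", cov_MP, "r:", r_MP)
```

The sign of the coefficient on M flips once P is included, which is the Simpson's paradox pattern named on the slide. In the real setting P is unobserved, so this fix is not available.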
Instrumental Variable
An instrumental variable should be strongly correlated with
the included endogenous regressor (L <-> M), but should not
affect the outcome variable directly (L <-> E)
[Diagram: L -> M -> E, with P influencing both M and E]
L is the draft lottery (can be 0 or 1)
L is an instrument for the causal effect of M on E
Angrist, Joshua D. (1990). "Lifetime Earnings and the Vietnam Draft Lottery: Evidence from Social
Security Administrative Records". American Economic Review 80 (3)
Diagram legend: E = lifetime earnings, P = earning potential, M = military service, L = draft lottery
Instrumental Variable
Observe:
E[ E | L=1 ] - E[ E | L=0 ] = -$2,000/year
Done?
Winning the lottery != Doing Military Service
Let’s call the military service “treatment”
Winning the lottery = assigned-to-treat
a% of population = Always-Treats, E_A is unaffected
n% of population = Never-Treats, E_N is unaffected
c% of population = Compliers, E_C^T vs. E_C^C
a + n + c = 100
Goal: Estimate E_C^T vs. E_C^C
Instrumental Variable
With a, n, c as fractions of the population:
E[ E | L=1 ] = a*E_A + n*E_N + c*E_C^T
E[ E | L=0 ] = a*E_A + n*E_N + c*E_C^C
E_C^T - E_C^C = (E[ E | L=1 ] - E[ E | L=0 ]) / c
How to compute c?
# (M = 1 & L =1 ) = (c + a) * # (L=1)
# (M = 1 & L =0 ) = a * # (L = 0)
c = P( M = 1 | L = 1) – P(M = 1 | L = 0)
“Wald Estimator”
if L is binary
Plausibility check: what if L is perfect instrument?
Example: winning the lottery => 90% chance of joining military
& 10% joining without invitation
& $2,200 per year difference
$2,200 / 0.8 = $2,750 (what if 0.00001?)
Angrist found that military service decreases earnings by about
$2,741 per year
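The plausibility check above as a small function. This is a sketch of the Wald estimator for a binary instrument; the function and argument names are illustrative, and the inputs are the example numbers from the slide, not Angrist's data.

```python
def wald_estimate(mean_y_l1, mean_y_l0, p_treated_l1, p_treated_l0):
    """Wald estimator for a binary instrument L.

    Numerator: difference in mean outcome between L=1 and L=0.
    Denominator: share of compliers, c = P(M=1 | L=1) - P(M=1 | L=0).
    """
    compliers = p_treated_l1 - p_treated_l0
    if compliers == 0:
        raise ValueError("instrument does not move the treatment (c = 0)")
    return (mean_y_l1 - mean_y_l0) / compliers


# Slide example: 90% join after winning the lottery, 10% join anyway,
# and a $2,200 per year difference in mean earnings between the two groups.
effect = wald_estimate(2200.0, 0.0, 0.9, 0.1)
print(effect)

# A weak instrument (tiny c) blows the same difference up enormously.
weak = wald_estimate(2200.0, 0.0, 0.10001, 0.1)
print(weak)
```

The second call illustrates the "what if 0.00001?" remark: with almost no compliers, small noise in the numerator translates into huge swings in the estimate.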
Instrumental Variable
Binary: (E[E|L=1] – E[E|L=0]) / (E[M|L=1] – E[M|L=0])
Non-Binary: Cov(E,L)/Cov(M,L)
Instrumental Variable – Multi-Variate
δ is a consistent (= asymptotically unbiased) estimator
of the causal effect of M on E
β and L are vectors
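A simulated sketch of the single-instrument estimator Cov(E,L)/Cov(M,L) next to the plain regression slope. The data-generating process (coefficients, thresholds, sample size) is invented for illustration; it only mirrors the structure of the military-service example, with P unobserved and negatively related to M.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

P = rng.normal(size=n)              # unobserved earning potential
L = rng.integers(0, 2, size=n)      # lottery: random, independent of P

# Service is more likely after winning the lottery and less likely
# for people with high potential (assumed, for illustration).
M = (0.8 * L - 0.5 * P + rng.normal(size=n) > 0.4).astype(float)

# True causal effect of M on E is +10 in this simulation.
E = 10.0 + 10.0 * M + 20.0 * P + rng.normal(size=n)

# Plain regression slope of E on M: biased because Cov(M, P) != 0.
ols_slope = np.cov(E, M)[0, 1] / np.var(M, ddof=1)

# Instrumental-variable estimate.
iv_slope = np.cov(E, L)[0, 1] / np.cov(M, L)[0, 1]

print(f"OLS slope: {ols_slope:.2f}")
print(f"IV slope:  {iv_slope:.2f} (true effect: 10)")
```

The regression slope even has the wrong sign here, while the instrument-based ratio lands near the simulated effect, at the cost of a noticeably larger variance.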
Example 1: Emotional Contagion
• Homophily?
• Friends are friends because their “emotions are in sync”
• Both friends hate Mondays, like Friday evenings, …
• Common exposure?
• Friends read the same news, watch the same shows
• Happy/sad because of common external factors
• Social influence?
• Seeing your friend happy makes you happy
• Your friends force you to smile back
[Diagrams: Friends’ expression and User’s expression]
No manipulation of user experience!
Example 1: Emotional Contagion
[Diagram: External variable, Friends’ expression, User’s expression]
• Social influence? Homophily? Common exposure?
• Use meteorological data as an instrument
Emotion on Facebook
• Classify semantic content of status updates using LIWC
• Emotion: fraction of posts with positive/negative words
Coviello et al., PLoS ONE 2014, “Detecting Emotional Contagion in Massive Social Networks”
Slides provided by Lorenzo Coviello. Thanks! Later partially modified.
y_jt = user j’s happiness at time t, fraction of posts with positive/negative words
j = user whose emotion we’re predicting
i = a friend of user j
t = time window of interest
Θ_t = time-related fixed effect (there are “happy times”)
f_j = user-related fixed effect (there are “happy users”)
δ_jt = degree of user j at time t (friends come and go)
a_ijt = strength of relationship at time t between i and j
Cumulative effect of
a user on their friends
Individual-Level Model
Computationally demanding. One observation per (user, time) pair
g = city whose aggregate emotion we’re predicting
Θ_t = time-related fixed effect (there are “happy times”)
f_g = city-related fixed effect (there are “happy cities”)
n_g = number of users in city g
Equation annotations: average emotion in city g at time t; average strength of relationship between i and an individual in city g
City-Level Model
Y_gt = average emotional influence on an individual in city g
Explanatory variable (social contagion)
Instrumental variable z_gt
[Diagram: Friends’ rain z_gt -> Friends’ emotion -> Users’ emotion]
Two-stage regression
• RAIN: 1 if it rained, 0 otherwise
Rainfall as Instrumental Variable
Instrument z (friends’ rain) should not affect users’ emotion when
friends’ emotion is held constant
[Diagram: Friends’ rain z -> Friends’ emotion -> Users’ emotion]
Break correlation between friends’ rain and user’s rain
– Restrict data to (city,day) WITHOUT rain
– Restrict data to (city,day) WITH rain
Exclusion Restriction
But weather in g and in friends’ cities could be correlated.
=> Weather directly influences emotions in g!
Results
rain decreases positive posts by 1%
a positive post leads to another 1.7
Open Issues
What are we measuring?
– People complain about rain on Twitter, ok. Does that mean they are
unhappy?
Just conversational dynamics? Topical contagion?
– User A: All the rain is making me depressed.
– User B: Poor guys who have to suffer in the rain.
Hidden weather variables
“it is overcast or not” might correlate between friends
even when fixing rainfall
Example 2: Physical Activity Social Networks
Aral & Nicolaides, Nature Comm. 2017, “Exercise Contagion in a Global Social Network”
Receive Updates About Friends’ Activity
[Diagram: Friends’ Activity and User’s Activity]
[Diagram: External variable, Friends’ Activity, User’s Activity]
• Social influence? Homophily? Common exposure?
• Use meteorological data as an instrument
Data
Some undisclosed physical activity social network
5 years of data
1.1M users
359M km run
2.1M geographically located ties with weather
-> very sparse network!
How to define “good weather”?
No rain? Less rain than usual?
Not cold? Not too hot? “Nicer” than usual?
Compute percentiles for city-specific
precipitation and temperature
Use LASSO to select predictive features
i.e. feature-select the instruments to use
Predictive of friends’ activity
Equation annotations: average activity of i’s friends at time t; weather influence; stuff to be explained later
Exclusion restriction
Problem: friends and user have similar weather
– Then we’re including “common exposure”
Solution: only keep uncorrelated city pairs
cutoff ρ < .025
How to define your (friends’) activity?
Distance run? Time run? Pace? Cals burned?
– Try them all. Separately.
How to aggregate your friends’ activity?
– They use average of shared runs
– Could try lots of other alternatives (but very sparse)
Equation annotations: user i’s friends’ physical activity at time t; degree of user i at time t; link matrix at time t; sum over all users j
The Details
A_it = activity of individual i on day t
A^p_it = a
ν_t = time fixed effects (holidays, marathon days, …)
η_i = user fixed effects (personal habits, motivation, …)
ω_it = exogenous factors, i.e. weather
X_it = time varying characteristics, e.g. degree
X^p_it = time varying/independent factors of peers, e.g. age, country
Baseline: estimate beta using ordinary least squares regression on this model
Better: two stage least squares regression
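A rough sketch of two-stage least squares with one observed control, written with plain NumPy. It is not the paper's model (no fixed effects, no LASSO-selected instruments); the simulated data and all names are assumptions for illustration only.

```python
import numpy as np


def two_stage_least_squares(y, x_endog, instruments, controls):
    """Return the 2SLS coefficient on the endogenous regressor.

    y:           outcome, shape (n,)
    x_endog:     endogenous regressor, shape (n,)
    instruments: shape (n, k)
    controls:    exogenous controls without a constant, shape (n, m)
    """
    n = len(y)
    exog = np.hstack([np.ones((n, 1)), controls])

    # Stage 1: predict the endogenous regressor from instruments + controls.
    stage1_X = np.hstack([exog, instruments])
    gamma, *_ = np.linalg.lstsq(stage1_X, x_endog, rcond=None)
    x_hat = stage1_X @ gamma

    # Stage 2: regress the outcome on the predicted regressor + controls.
    stage2_X = np.hstack([exog, x_hat[:, None]])
    beta, *_ = np.linalg.lstsq(stage2_X, y, rcond=None)
    return beta[-1]


rng = np.random.default_rng(2)
n = 50_000
u = rng.normal(size=n)              # unobserved homophily / common exposure
z = rng.normal(size=(n, 1))         # friends' weather (instrument)
x = rng.normal(size=(n, 1))         # an observed control, e.g. degree

peer = 1.0 * z[:, 0] + u + rng.normal(size=n)                  # friends' activity
own = 0.3 * peer + 0.5 * x[:, 0] + u + rng.normal(size=n)      # true peer effect: 0.3

# Baseline: ordinary least squares, biased upward by u.
ols_X = np.hstack([np.ones((n, 1)), x, peer[:, None]])
beta_ols = np.linalg.lstsq(ols_X, own, rcond=None)[0][-1]

beta_2sls = two_stage_least_squares(own, peer, z, x)
print(f"OLS: {beta_ols:.3f}, 2SLS: {beta_2sls:.3f} (true: 0.3)")
```

Note that standard errors taken from the second-stage regression are not valid as they stand; dedicated IV routines correct for the estimated first stage.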
Global Results
Results Broken Down by …
Robustness Tests Performed
Ensure that the instrument is “strong”
– Cragg-Donald Wald F statistic (Stock & Yogo, 2005)
Exogeneity
– Check (remaining) friends’ weather not predictive
Alternative instrument
– Use a good/bad weather binary setting
Falsification tests
– Friends’ future activity and weather has no influence
– Shuffle network to create “false friends”
Example 3: Weather and Icecream Contagion
Y. Mejova, S. Abbar, I. Weber, under construction …
#icecream on Instagram
2014/06/05 - 2015/10/19
Select 15 cities
143,122 posts
73,691 users
3,452,113 following
connections
Compute 7-day
running average
Binarize: weather is
good if > running
average
Binarize friends
activity: true if at
least one friend
posts on that day
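One possible reading of the two binarization steps, as a sketch. It assumes the running average is taken over the 7 days before the current day and that days without a full window are flagged as not good; the slides do not specify either detail.

```python
import numpy as np


def good_weather_flags(values, window=7):
    """True on days whose value exceeds the mean of the previous `window` days."""
    values = np.asarray(values, dtype=float)
    flags = np.zeros(len(values), dtype=bool)
    for t in range(window, len(values)):
        flags[t] = values[t] > values[t - window:t].mean()
    return flags


def any_friend_posted(posts_by_friend):
    """posts_by_friend: (n_friends, n_days) post counts; True if any friend posted."""
    return np.asarray(posts_by_friend).sum(axis=0) > 0


# Toy daily temperatures: a flat week, then a warm day, then a cool day.
temps = [10, 10, 10, 10, 10, 10, 10, 12, 9]
flags = good_weather_flags(temps)

# Two friends, three days: only one post, on the second day.
friend_active = any_friend_posted([[0, 1, 0], [0, 0, 0]])

print(flags)
print(friend_active)
```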
WT03 - Thunder
WESF - Water equivalent of snowfall (tenths of mm)
WT04 - Ice pellets, sleet, snow pellets, or small hail
PRCP - Precipitation (tenths of mm)
WT05 - Hail (may include small hail)
WT06 - Glaze or rime
WT07 - Dust, volcanic ash, blowing dust, blowing sand, or blowing obstruction
WT08 - Smoke or haze
SNWD - Snow depth (mm)
WT09 - Blowing or drifting snow
WT10 - Tornado, waterspout, or funnel cloud
WT11 - High or damaging winds
TMAX - Maximum temperature (tenths of degrees C)
WT13 - Mist
SNOW - Snowfall (mm)
WT14 - Drizzle
WT15 - Freezing drizzle
WT16 - Rain (may include freezing rain, drizzle, and freezing drizzle)
TOBS - Temperature at the time of observation (tenths of degrees C)
WT17 - Freezing rain
WT18 - Snow, snow pellets, snow grains, or ice crystals
WT19 - Unknown source of precipitation
AWND - Average daily wind speed (tenths of meters per second)
WT21 - Ground fog
WT22 - Ice fog or freezing fog
WT01 - Fog, ice fog, or freezing fog (may include heavy fog)
WESD - Water equivalent of snow on the ground (tenths of mm)
WT02 - Heavy fog or heavy freezing fog (not always distinguished from fog)
PSUN - Daily percent of possible sunshine (percent)
TMIN - Minimum temperature (tenths of degrees C)
TSUN - Daily total sunshine (minutes)
Instrumental variable: weather
[Figure: average daily temperature in New York in black (left y-axis) and the ratio of #icecream to #food in red (right y-axis)]
Preliminary Results
Weird! Effect size increases when using weather as IV …
Natural Experiments Summary
Natural experiments can be powerful alternatives to
experiments
Find randomized variables that are highly correlated
with your regressor but affect your outcome only through it
Think carefully about violations of the exclusion restriction
Perform robustness checks and falsification tests
Part 2: Matching Methods
Matching Methods
Among given “organic” data (e.g. human trace data), can we
find a subset that looks as if it was generated by an experiment?
matching == pruning
Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2007. “Matching as Nonparametric
Preprocessing for Reducing Model Dependence in Parametric Causal Inference.” Political Analysis 15: 199–236.
Copy at http://j.mp/jPupwz
[Figure: position (outcome) vs. education in years (1-dimensional covariate); marked points were treated with special training]
Does Special Training Help Job Promotion?
Gary King, “Why Propensity Scores Should Not Be Used for Matching”, Methods Colloquium,
2015, https://www.youtube.com/watch?v=rBv39pK1iEs
Linear Regression
[Figure: position (p) vs. education (e)]
Correcting for education, the treated group has higher positions.
p = c + β*e + γ*is_treated
γ = estimated treatment effect; is_treated is a binary variable
Quadratic Regression
[Figure: position (p) vs. education (e)]
Model Dependence
Too much freedom given to analyst.
Reason: Imbalance of covariates
Correcting for education, the treated group has lower positions.
p = c + β1*e + β2*e^2 + γ*is_treated
Regression After Pruning
[Figure: position (p) vs. education (e)]
1) Preprocessing (matching)
2) Estimation of effects (regression models)
Matching has reduced model dependence!
Matching Approximates Randomized
Experiment
Completely randomized:
Flip a coin for each patient. Heads -> “T”, tails -> “C”.
Could get unlucky: all men assigned “T”
Fully blocked experiment:
First pair up similar patients, same gender, age, …
Then flip a coin for each pair. One gets “T”, one “C”.
Balances the known covariates.
Both balance unknown covariates.
Fully blocked experiment dominates complete
randomization!
Distance Matching
Approximates fully blocked experiment
Many Variations:
Optimal match, greedy match,
match 1:1 or 1:many, and so on
Prune bad matches with
distance > threshold (“caliper”)
[Figure axes: age, education]
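A small sketch of one of the variations listed above: greedy 1:1 nearest-neighbour matching without replacement, with a caliper. Treated units are processed in index order, which is a simplifying assumption; the toy (age, education) values are invented.

```python
import numpy as np


def greedy_caliper_match(treated, control, caliper):
    """Greedy 1:1 matching; returns a list of (treated_idx, control_idx) pairs.

    Each treated unit takes its nearest unused control unit.
    The pair is kept only if their distance is at most `caliper`.
    """
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    # Pairwise Euclidean distances, shape (n_treated, n_control).
    dists = np.linalg.norm(treated[:, None, :] - control[None, :, :], axis=2)

    pairs, used = [], set()
    for t in range(len(treated)):
        candidates = [c for c in range(len(control)) if c not in used]
        if not candidates:
            break
        best = min(candidates, key=lambda c: dists[t, c])
        if dists[t, best] <= caliper:
            pairs.append((t, best))
            used.add(best)
        # Otherwise the treated unit is pruned: no acceptable match exists.
    return pairs


# Toy data: columns are (age, years of education).
treated_units = [[30, 12], [50, 16], [70, 10]]
control_units = [[31, 12], [49, 17], [20, 8], [51, 16]]

pairs = greedy_caliper_match(treated_units, control_units, caliper=2.0)
print(pairs)
```

The third treated unit has no control within the caliper and is dropped, which is the pruning step. Optimal matching would instead minimise the total distance over all pairs.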
Mahalanobis Distance
Euclidean distance doesn’t make sense when different
dimensions are on different scales.
(yearly income, age, gender, body weight, …)
Distance dominated by largest values
Conceptual fix: first rescale each dimension to N(0,1)
Ok, but may also want to correct for collinearity
In practice: could use “expert scaling” and Euclidean distance.
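A short sketch of the Mahalanobis distance with NumPy, using the sample covariance of the data itself. The toy data and the function name are illustrative; the point is that rescaling a column leaves the distances unchanged, unlike Euclidean distance.

```python
import numpy as np


def mahalanobis_to_reference(X, x0):
    """Mahalanobis distance from every row of X to the point x0.

    The covariance matrix is estimated from X, which handles both
    different scales and correlation between the dimensions.
    """
    X = np.asarray(X, dtype=float)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    diff = X - np.asarray(x0, dtype=float)
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(squared, 0.0))


rng = np.random.default_rng(3)
# Two covariates on very different scales, e.g. yearly income and age.
X = rng.normal(size=(200, 2)) * np.array([10_000.0, 5.0])

d_original = mahalanobis_to_reference(X, X[0])

# Express income in thousands and age in months: distances do not change.
X_rescaled = X * np.array([0.001, 12.0])
d_rescaled = mahalanobis_to_reference(X_rescaled, X_rescaled[0])

print(np.allclose(d_original, d_rescaled))
```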
Example 1: Comment Quality on the Internet
Some sites no longer have them
Some sites still have them
But many no longer show downvotes
What made “comments go sour”?
Is there an effect of the votes received on a comment?
Re “operant conditioning” (punishment & reward)
“I believe that restricting immigration of highly
qualified people could hurt our economy.”
“Trump is a sh*thead.”
“According to a 2015 scientific study [reference]
However, [user1] makes a valid point that …”
79 up-votes
1 down-vote
2 up-votes
100 down-votes
Cheng, Danescu-Niculescu-Mizil, Leskovec. ICWSM’14.
“How Community Feedback Shapes User Behavior”
Approach: Match Similar Posts
1. Automatically quantify a post’s quality
2. Match pairs of posts of similar quality
One receives positive feedback
One negative feedback
Q: What happens next?
Data
Data from Disqus
CNN, Breitbart, IGN, Allkpop
1.2M threads, 42M comments, 140M votes, 1.8M users
Quantifying a Post’s Quality
1st: Measure community feedback
– P? -N? P-N? P/(P+N)?
– Ask AMT workers to rate feedback received
2nd: Build a text-only model to predict q
– R^2 = .22 (using a separate test set)
– Compared to AMT labels q’: q R^2 = .25, p R^2 = .12
p=
Match on Predicted Quality q
|q(a_0) - q(b_0)| ≤ 10^-4
Also: # words, # past posts, % positive votes
for CNN
Observed Effects
Both changes in (objective) quality and in community perception!
Changes in Activity
Negative feedback accelerates
commenting rate
Negative feedback keeps
users for longer
Negative feedback leads
to retaliation
But … matching seems to be imperfect.
Rate of giving positive feedback not balanced!
Hints at unbalanced latent factor
A perfect storm of a downward spiral!
Example 2: Social Feedback & Weight Loss
Cunha, Weber, Pappa. WWW’17 WebSci Track. “A Warm Welcome Matters!
The Link Between Social Feedback and Weight Loss in /r/loseit”
Support for Newcomers
So I've been working on losing weight since
December, but since June I've been in a rut
:(
3 points 0 comments submitted 4 years ago by moonyDP to r/loseit
Okay, so I was diagnosed back in December with GERD,
and my doctor told me it would help to lose weight. I'm 5'
8" and, at the time, was around 175-180. …
I'm 23 and weigh 550lbs. Please help
455 points 204 comments submitted 2 months ago by
Ecurtis936'5" 550Lbs Male to r/loseit
Starting weight: 550lbs Goal Weight: 250lbs
Just to tell you a little about myself; I'm 23 years old
6'5" and sadly weigh 560lbs. I work at a call center,
sitting in a desk for 10 hours a day. …
Data Collection
5 years of data (August 2010 to
October 2014)
107,886 unique users
70,949 posts and 922,245 comments
Metadata (timestamp, user name,
voting score and history of badges)
Define Treatment and Control
Look at first post of a user in the community
Treatment = received comments
Sparsity: 96% of posts received a comment
Re-Define:
–Treatment = received at least 4 comments
–4,657 treatment and 1,468 control
Covariates Choice
Matching only balances matched variables
– Important choice of what to match on
Build LASSO regularized model to predict receiving
“treatment”
Use LDA topics, LIWC, question words, post size,
sentiment
– 98 variables in total
Final model: 20 variables (selected by LASSO)
Use coefficient values as covariate weights
Prune by Matching
Use cosine similarity for matching
– Weighted by LASSO coefficients
Use 1-to-Many matching
– To avoid throwing out data
Use a caliper to only keep “similar
enough” matches
– Extreme case: exact match
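A sketch of how weighted cosine similarity with 1-to-many matching and a caliper could look. The exact weighting used in the paper is not given on the slide; here the weights are assumed to be non-negative (e.g. absolute LASSO coefficients), and the names `anchors` and `pool` are hypothetical.

```python
import numpy as np


def weighted_cosine(a, b, w):
    """Cosine similarity where dimension k contributes with weight w[k] >= 0."""
    a, b, w = (np.asarray(v, dtype=float) for v in (a, b, w))
    denom = np.sqrt(np.sum(w * a * a)) * np.sqrt(np.sum(w * b * b))
    return float(np.sum(w * a * b) / denom) if denom > 0 else 0.0


def one_to_many_match(anchors, pool, w, min_similarity):
    """For each anchor, keep all pool units at least `min_similarity` similar.

    Anchors without any sufficiently similar unit are pruned.
    """
    matches = {}
    for i, a in enumerate(anchors):
        similar = [j for j, p in enumerate(pool)
                   if weighted_cosine(a, p, w) >= min_similarity]
        if similar:
            matches[i] = similar
    return matches


# A zero weight makes the second covariate irrelevant for the similarity.
sim = weighted_cosine([1, 5], [2, -3], [1, 0])

matches = one_to_many_match(
    anchors=[[1, 0], [0, 1]],
    pool=[[2, 0], [0, 3], [1, 1]],
    w=[1, 1],
    min_similarity=0.9,
)
print(sim, matches)
```

Raising `min_similarity` towards 1 approaches the exact-match extreme and prunes more data; lowering it keeps more data at the cost of worse balance.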
Balance Check
Compute standardized mean difference
Small d_c = similar values of c in treatment and
control group
Remaining bias for variable c is considered to be
insignificant if d_c is smaller than 0.1
Note: don’t use a significance test! Else “too little
data => no significant difference”
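A sketch of the balance check for a single covariate. It assumes the common definition of the standardized mean difference with a pooled standard deviation; the slide does not show the exact formula used in the paper.

```python
import numpy as np


def standardized_mean_difference(treated_values, control_values):
    """d_c = (mean_T - mean_C) / sqrt((var_T + var_C) / 2) for one covariate c."""
    t = np.asarray(treated_values, dtype=float)
    c = np.asarray(control_values, dtype=float)
    pooled_sd = np.sqrt((t.var(ddof=1) + c.var(ddof=1)) / 2.0)
    if pooled_sd == 0:
        return 0.0 if t.mean() == c.mean() else float("inf")
    return float((t.mean() - c.mean()) / pooled_sd)


def is_balanced(treated_values, control_values, threshold=0.1):
    """Apply the |d_c| < 0.1 rule of thumb from the slide."""
    return abs(standardized_mean_difference(treated_values, control_values)) < threshold


d_shifted = standardized_mean_difference([2, 3, 4], [1, 2, 3])
d_same = standardized_mean_difference([1, 2, 3], [1, 2, 3])
print(d_shifted, d_same, is_balanced([1, 2, 3], [1, 2, 3]))
```

Unlike a p-value, d_c does not shrink merely because matching has thrown away data, which is the reason for the warning about significance tests.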
Estimate Effect Size
Effect on return rate
25,647 users present in Group 1. 18,000 treatment and 7,647
control.
[Figures: balance check; effect size]
Estimate Effect Size
Effect on weight loss
6,143 users present in Group 2. 4,657 treatment and 1,468 control.
26%, or an absolute mean difference of 9 lbs.
[Figures: balance check; effect size]
Mediation Analysis
Used a Sobel Test to check for mediation
No statistically significant mediation effect found
[Diagram: Social Feedback -> Weight Loss, with Engagement in Community as the candidate mediator]
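A sketch of the Sobel test statistic. Here a is the estimated effect of the treatment on the mediator and b the effect of the mediator on the outcome (controlling for treatment), each with its standard error. The input numbers are invented and unrelated to the study's results.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return (z, two-sided p-value) for the indirect effect a*b."""
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


z, p_value = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

In the diagram's terms, a would come from regressing engagement on social feedback and b from regressing weight loss on engagement and social feedback.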
Qualitative Evidence
Community survey: “What do you like about /r/loseit?”
After Our Study
Limitations
Using badges to track weight loss
–What if they don’t update badges?
Determining the start of the weight loss journey
–What if lost weight before first post?
Our choice of covariates
–Can only correct for known
covariates
Observability of returning users
–No return does not equal no weight loss
Matching Methods Summary
• Matching methods help to approximate causality
• Problems
– Researchers have lots of freedom on how to match
– Most matching methods have been developed for low
number of covariates
– Worst case: random pruning -> increases imbalance ->
increases bias and model dependence
• Test for balance of observed covariates
• Compare results from different matching methods,
different dimensionality reduction methods, different
models
– Avoid model dependence and method dependence!
There is No Magic Bullet
https://twitter.com/johnmyleswhite/status/854419974995050496
Thanks!

Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
INSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere UniversityINSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere University
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
 
Climate extremes likely to drive land mammal extinction during next supercont...
Climate extremes likely to drive land mammal extinction during next supercont...Climate extremes likely to drive land mammal extinction during next supercont...
Climate extremes likely to drive land mammal extinction during next supercont...
 
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
 

Matching Methods and Natural Experiments - Examples of Causal Inference from Social Media

  • 1. Matching Methods and Natural Experiments Examples of Causal Inference from Social Media Ingmar Weber @ingmarweber Given at OSSM Workshop at ICWSM’17 https://www.microsoft.com/en-us/research/event/ossm17/ Featuring joint work with Tiago Cunha, Gisele Pappa, Yelena Mejova and Sofiane Abbar Slides partly based on joint WWW’16 tutorial with M. Strohmaier, C. Wagner and L. Aiello
  • 2. performance time Training Did the training cause an increase in performance? Estimating Causal Effect Yi(T) Yi(C)
  • 4. Ideal Solution: Run an Experiment Gold Standard Method for causal inference Randomly assign subjects to either treatment or control group Assume both groups large enough to wash out differences in covariates [More on best assignment strategies later] If done correctly, no need for fancy analysis. Average(treatment group) - Average(control group)
  • 5. Limitations of Experiments • Expensive • Not all treatments are possible and ethical • Internal validity is high, but external validity, i.e. generalizability, is often limited • Non-interference assumption is often violated in social science field experiments, i.e. i’s treatment affects j’s outcome
  • 6. Alternative Solution: Causal Inference from Observational Data Approach 1: Natural Experiments Has nature already done the work for us? Is there a (partly) random observed assignment? Approach 2: Matching Methods Try to find pairs of as-similar-as-possible participants. One happened to get treated, the other not
  • 7. Part 1: Natural Experiments
  • 8. 1854 Cholera Outbreak in London Assumption at time: cholera is air-borne disease John Snow’s hypothesis: cholera is water-borne disease Famous use of mapping Water company (# houses, deaths/10k): Southwark and Vauxhall (40,046 houses, 315 deaths/10k); Lambeth (26,107 houses, 37 deaths/10k) Assignment to water company as-if random • Could differ from one house to next • Tenants don’t know their company Lambeth’s inlet upstream = clean water S&V’s inlet downstream = contaminated water First (?) use of natural experiment! There was one significant anomaly – none of the workers in the nearby Broad Street brewery contracted cholera. They were given a daily allowance of beer, and did not consume water from the nearby well. The water used in the brewing process is boiled during mashing which kills cholera bacteria. [Wikipedia]
  • 9. Natural Experiments with Instrumental Variables Study by Angrist 1990: What is the effect of military service (M) on lifetime earnings (E)? +: Improve self discipline? Get external recognition? Join network of “alumni”? – Could increase earnings. -: Lose actual job experience? Become traumatized? Lose touch with society? – Could decrease earnings. Why not just compute: Exp. earnings for military joiners – exp. earnings for others E [ E | M=1] - E [ E | M=0 ]
  • 10. Limits of Linear Regression Assume the structural equation: E = α + β * M + ε Error term ε stands for all exogenous factors that affect E when M is held constant Crucial assumption: Cov(M,ε) = 0 (not correlated) Actual model (diagram): M = military service, P = earning potential (unobserved), E = lifetime earnings
  • 11. Limits of Linear Regression Table of (P)otential, (M)ilitary service, (E)arnings, count: P = 1 (high), M = 1 (yes): E = 40,000, count = 20; P = 1 (high), M = 0 (no): E = 30,000, count = 80; P = 0 (low), M = 1 (yes): E = 20,000, count = 80; P = 0 (low), M = 0 (no): E = 10,000, count = 20. E[ E | M=1 ] = (20*40k + 80*20k)/100 = 24k E[ E | M=0 ] = (80*30k + 20*10k)/100 = 26k Fitted linear regression: E = 26k - 2k*M + ε At this stage you have enforced Cov(M,ε)=0 True structural equation: E = 10k + 10k*M + 20k*P Example of Simpson’s Paradox Wrong assumption => biased estimator Cov(M,P) = -0.15, r = -0.6
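The arithmetic on this slide can be replayed in a few lines. The sketch below uses only the four cells of the slide's table; the helper name `mean_earnings` is made up for illustration. It compares the naive difference in means with the difference inside each stratum of P.

```python
# Cells of the slide's table: (P, M) -> (earnings, count).
cells = {
    (1, 1): (40_000, 20),
    (1, 0): (30_000, 80),
    (0, 1): (20_000, 80),
    (0, 0): (10_000, 20),
}


def mean_earnings(condition):
    """Count-weighted mean earnings over the cells where condition(p, m) is true."""
    selected = [(e, n) for (p, m), (e, n) in cells.items() if condition(p, m)]
    total = sum(e * n for e, n in selected)
    count = sum(n for _, n in selected)
    return total / count


# Naive comparison, ignoring earning potential P.
naive_effect = mean_earnings(lambda p, m: m == 1) - mean_earnings(lambda p, m: m == 0)

# Comparison within each level of P (what the structural equation describes).
effect_high_p = mean_earnings(lambda p, m: p == 1 and m == 1) - mean_earnings(
    lambda p, m: p == 1 and m == 0
)
effect_low_p = mean_earnings(lambda p, m: p == 0 and m == 1) - mean_earnings(
    lambda p, m: p == 0 and m == 0
)

print(naive_effect, effect_high_p, effect_low_p)
```

The naive estimate has the opposite sign of the within-stratum effect, which is the Simpson's paradox pattern the slide points to.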
  • 12. Instrumental Variable An instrumental variable should be strongly correlated with the included endogenous regressors (L <-> M), but not directly with the outcome variable (L <-> E) L is the draft lottery (can be 0 or 1) L is an instrument for the causal effect of M on E Diagram labels: L = draft lottery, M = military service, P = earning potential, E = lifetime earnings Angrist, Joshua D. (1990). "Lifetime Earnings and the Vietnam Draft Lottery: Evidence from Social Security Administrative Records". American Economic Review 80 (3)
  • 13. Instrumental Variable Observe: E[ E | L=1 ] - E[ E | L=0 ] = -$2,000/year Done? Winning the lottery != Doing Military Service Let’s call the military service “treatment” Winning the lottery = assigned-to-treat a% of population = Always-Treats, E_A is unaffected n% of population = Never-Treats, E_N is unaffected c% of population = Compliers, E_C^T vs. E_C^C a + n + c = 100 Goal: Estimate E_C^T vs. E_C^C
  • 14. Instrumental Variable (with a, n, c as fractions of the population) E[ E | L=1 ] = a*E_A + n*E_N + c*E_C^T E[ E | L=0 ] = a*E_A + n*E_N + c*E_C^C E_C^T - E_C^C = (E[ E | L=1 ] - E[ E | L=0 ]) / c How to compute c? # (M = 1 & L = 1) = (c + a) * # (L = 1) # (M = 1 & L = 0) = a * # (L = 0) c = P(M = 1 | L = 1) – P(M = 1 | L = 0)
  • 15. Instrumental Variable “Wald Estimator” if L is binary: (E[E|L=1] – E[E|L=0]) / (E[M|L=1] – E[M|L=0]) Non-binary: Cov(E,L)/Cov(M,L) Plausibility check: what if L is a perfect instrument? Example: winning the lottery => 90% chance of joining military & 10% joining without invitation & $2,200 per year difference $2,200 / 0.8 = $2,750 (what if 0.00001?) Angrist found that military service decreases earnings by about $2,741 per year
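A minimal sketch of the binary-instrument Wald estimator, using the numbers from the slide's plausibility check. The function and argument names are illustrative assumptions, not taken from the talk.

```python
def wald_estimate(outcome_mean_l1, outcome_mean_l0, treated_share_l1, treated_share_l0):
    """Wald estimator for a binary instrument L.

    Numerator: difference in mean outcome between L=1 and L=0.
    Denominator: difference in treatment rates, i.e. the complier share c.
    """
    complier_share = treated_share_l1 - treated_share_l0
    if complier_share == 0:
        raise ValueError("instrument does not move treatment; estimate undefined")
    return (outcome_mean_l1 - outcome_mean_l0) / complier_share


# Slide example: 90% join when drafted, 10% join anyway, $2,200/year difference.
# Only the difference of the outcome means matters, so 0 is used as the baseline.
effect = wald_estimate(2200.0, 0.0, 0.9, 0.1)
print(round(effect, 2))

# The "what if 0.00001?" remark: a weak instrument blows up the estimate.
weak = wald_estimate(2200.0, 0.0, 0.10001, 0.1)
print(weak > 1e6)
```

The second call illustrates why instrument strength is checked separately: a tiny denominator turns small noise in the numerator into an enormous effect estimate.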
  • 16. Instrumental Variable – Multi-Variate δ is a consistent (= asymptotically unbiased) estimator and estimates the causal effect of M on E β and L are vectors
  • 17. Example 1: Emotional Contagion • Homophily? • Friends are friends because their “emotions are in sync” • Both friends hate Mondays, like Friday evenings, … • Common exposure? • Friends read the same news, watch the same shows • Happy/sad because of common external factors • Social influence? • Seeing your friend happy makes you happy • Your friends force you to smile back Friends’ expression User’s expression
  • 18. Friends’ expression User’s expression No manipulation of user experience! Example 1: Emotional Contagion External variable Friends’ expression User’s expression • Social influence? Homophily? Common exposure? • Use meteorological data as an instrument
  • 19. Emotion on Facebook • Classify semantic content of status updates using LIWC • Emotion: fraction of posts with positive/negative words Coviello et al., PLoS ONE 2014, “Detecting Emotional Contagion in Massive Social Networks” Slides provided by Lorenzo Coviello. Thanks! Later partially modified.
  • 20. yjt = user j’s happiness at time t, fraction of posts with positive/negative words j = user whose emotion we’re predicting i = a friend of user j t = time window of interest Θt = time-related fixed effect (there are “happy times”) fj = user-related fixed effect (there are “happy users”) 𝛿jt = degree of user j at time t (friends come and go) aijt = strength of relationship at time t between i and j Cumulative effect of a user on their friends Individual-Level Model Computationally demanding. One observation per (user, time) pair
  • 21. g = city whose aggregate emotion we’re predicting Θt = time-related fixed effect (there are “happy times”) fg = city-related fixed effect (there are “happy cities”) ng = number of users in city g average emotion in city g at time t average strength of relationship between i and an individual in city g City-Level Model Ygt = average emotional influence on an individual in city g
  • 22. Explanatory variable (social contagion) Instrumental variable zgt Friends’ rain zgt Friends’ emotion Users’ emotion Two-stage regression • RAIN: 1 if it rained, 0 otherwise Rainfall as Instrumental Variable
  • 23. Exclusion Restriction Instrument z should not affect users’ emotion when friends’ emotion is held constant But weather in g and in friends’ cities could be correlated. => Weather directly influences emotions in g! Break correlation between friends’ rain and user’s rain – Restrict data to (city,day) WITHOUT rain – Restrict data to (city,day) WITH rain Diagram labels: friends’ rain z, friends’ emotion, users’ emotion
  • 27. Results rain decreases positive posts by 1% a positive post leads to another 1.7
  • 28. Open Issues What are we measuring? – People complain about rain on Twitter, ok. Does that mean they are unhappy? Just conversational dynamics? Topical contagion? – User A: All the rain is making me depressed. – User B: Poor guys who have to suffer in the rain. Hidden weather variables “it is overcast or not” might correlate between friends even when fixing rainfall
  • 29. Example 2: Physical Activity Social Networks Aral & Nicolaides, Nature Comm. 2017, “Exercise Contagion in a Global Social Network”
  • 30. Receive Updates About Friends’ Activity
  • 32. Data Some undisclosed physical activity social network 5 years of data 1.1M users 359M km run 2.1M geographically located ties with weather -> very sparse network!
  • 33. How to define “good weather”? No rain? Less rain than usual? Not cold? Not too hot? “Nicer” than usual? Compute percentiles for city-specific precipitation and temperature Use LASSO to select predictive features i.e. feature-select the instruments to use Predictive of friends’ activity Formula annotations: average activity of i’s friends at time t; weather influence; stuff to be explained later
  • 34. Exclusion restriction Problem: friends and user have similar weather – Then we’re including “common exposure” Solution: only keep uncorrelated city pairs cutoff ρ < .025
  • 35. How to define your (friends’) activity? Distance run? Time run? Pace? Cals burned? – Try them all. Separately. How to aggregate your friends’ activity? – They use average of shared runs – Could try lots of other alternatives (but very sparse) User i’s friends’ physical activity at time t Degree of user i at time t Link matrix at time t Sum over all users j
  • 36. The Details Ait = activity of individual i on day t Ap it = a νt = time fixed effects (holidays, marathon days, …) ηi = user fixed effects (personal habits, motivation, …) ωit = exogenous factors, i.e. weather Xit = time varying characteristics, e.g. degree Xp it = time varying/independent factors of peers, e.g. age, country Baseline: estimate beta using ordinary least squares regression on this model Better: two stage least squares regression
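To make the contrast between ordinary least squares and two-stage least squares concrete, here is a small simulation sketch. It is not the model from the paper: it uses a single regressor, a single binary instrument, and synthetic data with a hidden confounder `u`. All variable names and coefficients are assumptions chosen for illustration.

```python
import numpy as np


def ols_slope(y, x):
    """Slope of a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    # Stage 1: predict the regressor from the instrument.
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ gamma
    # Stage 2: regress the outcome on the predicted regressor.
    X_hat = np.column_stack([np.ones(n), x_hat])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta[1]


rng = np.random.default_rng(0)
n = 50_000
z = rng.integers(0, 2, n).astype(float)        # instrument, e.g. "good weather"
u = rng.normal(size=n)                         # unobserved confounder
x = 1.0 * z + u + rng.normal(size=n)           # regressor, e.g. friends' activity
y = 2.0 * x - 3.0 * u + rng.normal(size=n)     # outcome; true effect of x is 2.0

ols_estimate = ols_slope(y, x)
iv_estimate = two_stage_least_squares(y, x, z)
print(ols_estimate, iv_estimate)
```

In this setup the OLS slope is pulled far below the true effect by the confounder, while the 2SLS estimate lands close to 2.0, because the instrument only moves `y` through `x`.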
  • 39. Robustness Tests Performed Ensure that the instrument is “strong” – Cragg-Donald Wald F statistic (Stock & Yogo, 2005) Exogeneity – Check (remaining) friends’ weather not predictive Alternative instrument – Use a good/bad weather binary setting Falsification tests – Friends’ future activity and weather has no influence – Shuffle network to create “false friends”
  • 40. Example 3: Weather and Icecream Contagion Y. Mejova, S. Abbar, I. Weber, under construction … #icecream on Instagram
  • 41. 2014/06/05 - 2015/10/19 Select 15 cities 143,122 posts 73,691 users 3,452,113 following connections
  • 42. Compute 7-day running average Binarize: weather is good if > running average Binarize friends’ activity: true if at least one friend posts on that day Weather variables: WT03 - Thunder; WESF - Water equivalent of snowfall (tenths of mm); WT04 - Ice pellets, sleet, snow pellets, or small hail; PRCP - Precipitation (tenths of mm); WT05 - Hail (may include small hail); WT06 - Glaze or rime; WT07 - Dust, volcanic ash, blowing dust, blowing sand, or blowing obstruction; WT08 - Smoke or haze; SNWD - Snow depth (mm); WT09 - Blowing or drifting snow; WT10 - Tornado, waterspout, or funnel cloud; WT11 - High or damaging winds; TMAX - Maximum temperature (tenths of degrees C); WT13 - Mist; SNOW - Snowfall (mm); WT14 - Drizzle; WT15 - Freezing drizzle; WT16 - Rain (may include freezing rain, drizzle, and freezing drizzle); TOBS - Temperature at the time of observation (tenths of degrees C); WT17 - Freezing rain; WT18 - Snow, snow pellets, snow grains, or ice crystals; WT19 - Unknown source of precipitation; AWND - Average daily wind speed (tenths of meters per second); WT21 - Ground fog; WT22 - Ice fog or freezing fog; WT01 - Fog, ice fog, or freezing fog (may include heavy fog); WESD - Water equivalent of snow on the ground (tenths of mm); WT02 - Heavy fog or heavy freezing fog (not always distinguished from fog); PSUN - Daily percent of possible sunshine (percent); TMIN - Minimum temperature (tenths of degrees C); TSUN - Daily total sunshine (minutes)
  • 43. Instrumental variable: weather average daily temperature in New York in black (left y-axis) and the ratio of #icecream to #food in red (right y-axis).
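The two binarization steps on this slide could look roughly like the sketch below. The slide does not say how the window is aligned, so a trailing window that includes the current day (and is shorter at the start of the series) is an assumption here, as are the function names and the toy data.

```python
def binarize_against_running_average(values, window=7):
    """Flag each day whose value exceeds the trailing running average.

    Assumption: the window covers the current day and up to window-1 days before it.
    """
    flags = []
    for i, value in enumerate(values):
        recent = values[max(0, i - window + 1): i + 1]
        flags.append(value > sum(recent) / len(recent))
    return flags


def any_friend_posted(posts_per_friend):
    """posts_per_friend: one list of daily post counts per friend (equal lengths).

    Returns one boolean per day: True if at least one friend posted that day.
    """
    return [any(count > 0 for count in day) for day in zip(*posts_per_friend)]


daily_max_temp = [20, 20, 20, 26, 18]           # toy values, degrees C
good_weather = binarize_against_running_average(daily_max_temp)
friends_active = any_friend_posted([[0, 1, 0], [0, 0, 0]])
print(good_weather, friends_active)
```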
  • 44. Preliminary Results Weird! Effect size increases when using weather as IV …
  • 45. Natural Experiments Summary Natural experiments can be powerful alternatives to experiments Find randomized variables that are highly correlated with your regressor but not with your outcome Carefully think of violations to exclusion criterion Perform robustness checks and falsification tests
  • 46.
  • 47. Part 2: Matching Methods
  • 48. Matching Methods Among given “organic” data (e.g. human trace data), can we find a subset that looks as if it had been generated by an experiment? matching == pruning
  • 49. Does Special Training Help Job Promotion? Figure labels: outcome = position; 1-dimensional covariate = education (in years); marked points = treated with special training. Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2007. “Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference.” Political Analysis 15: 199–236. Copy at http://j.mp/jPupwz Gary King, “Why Propensity Scores Should Not Be Used for Matching”, Methods Colloquium, 2015, https://www.youtube.com/watch?v=rBv39pK1iEs
  • 50. Linear Regression position(p) education (e) Correcting for education, the treated group has higher positions. p = c + β*e + γ*is_treated γ = estimated treatment effect binary variable
  • 51. Quadratic Regression position(p) education (e) Model Dependence Too much freedom given to analyst. Reason: Imbalance of covariates Correcting for education, the treated group has lower positions. p = c + β1*e + β2*e^2 + γ*is_treated
  • 52. Regression After Pruning position(p) education (e) 1) Preprocessing (matching) 2) Estimation of effects (regression models) Matching has reduced model dependence!
  • 53. Matching Approximates Randomized Experiment Completely randomized: Flip a coin for each patient. Heads -> “T”, tails -> “C”. Could get unlucky: all men assigned “T” Fully blocked experiment: First pair up similar patients, same gender, age, … Then flip a coin for each pair. One gets “T”, one “C”. Balances the known covariates. Both balance unknown covariates. Fully blocked experiment dominates complete randomization!
  • 54. Distance Matching Approximates fully blocked experiment Many Variations: Optimal match, greedy match, match 1:1 or 1:many, and so on Prune bad matches with distance > threshold (“caliper”) Figure axes: age, education
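One of the variations named on this slide, greedy 1:1 matching with a caliper, can be sketched as follows. This is only an illustration: plain Euclidean distance on (age, education) tuples, made-up data, and a hypothetical function name `greedy_caliper_match`.

```python
import math


def greedy_caliper_match(treated, control, caliper):
    """Greedy 1:1 matching without replacement.

    treated, control: lists of covariate tuples, e.g. (age, education).
    Pairs are formed from closest to farthest; pairs farther apart than
    the caliper are never formed, so unmatched units are pruned.
    Returns a list of (treated_index, control_index) pairs.
    """
    candidates = sorted(
        (math.dist(t, c), ti, ci)
        for ti, t in enumerate(treated)
        for ci, c in enumerate(control)
    )
    used_treated, used_control, pairs = set(), set(), []
    for distance, ti, ci in candidates:
        if distance > caliper:
            break  # all remaining candidates are even farther apart
        if ti in used_treated or ci in used_control:
            continue
        used_treated.add(ti)
        used_control.add(ci)
        pairs.append((ti, ci))
    return pairs


treated_units = [(30, 12), (45, 16), (60, 10)]   # (age, years of education)
control_units = [(31, 12), (44, 18), (20, 8)]
matches = greedy_caliper_match(treated_units, control_units, caliper=3.0)
print(matches)
```

The third treated unit has no control within the caliper and is dropped, which is the "matching == pruning" idea in miniature.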
  • 55. Mahalanobis Distance Euclidean distance doesn’t make sense when different dimensions are on different scales. (yearly income, age, gender, body weight, …) Distance dominated by largest values Conceptual fix: first rescale each dimension to N(0,1) Ok, but maybe want to correct for collinearity In practice: could use “expert scaling” and Euclidean distance.
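A short sketch of the Mahalanobis distance, d(x, y) = sqrt((x - y)^T S^-1 (x - y)) with S the covariance matrix of the covariates. The covariance matrix and the example points below are made up to show the rescaling effect.

```python
import numpy as np


def mahalanobis(x, y, cov):
    """Mahalanobis distance between x and y for covariance matrix cov."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve cov @ v = diff instead of inverting cov explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


# Toy covariance: the first dimension has variance 4, the second variance 1,
# and the two are uncorrelated.
S = np.array([[4.0, 0.0],
              [0.0, 1.0]])

d_first = mahalanobis([2.0, 0.0], [0.0, 0.0], S)   # 2 units along the wide dimension
d_second = mahalanobis([0.0, 2.0], [0.0, 0.0], S)  # 2 units along the narrow dimension
print(d_first, d_second)
```

The same raw offset counts for less along the dimension with larger variance. With real data, `S` would typically be estimated from the covariates, for example with `np.cov(data, rowvar=False)`.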
  • 56. Example 1: Comment Quality on the Internet Some sites no longer have them Some sites still have them But many no longer show downvotes
  • 57. What made “comments go sour”? Is there an effect of the votes received on a comment? Re “operant conditioning” (punishment & reward) “I believe that restricting immigration of highly qualified people could hurt our economy.” “Trump is a sh*thead.” “According to a 2015 scientific study [reference] However, [user1] makes a valid point that …” 79 up-votes 1 down-vote 2 up-votes 100 down-votes Cheng, Danescu-Niculescu-Mizil, Leskovec. ICWSM’14. “How Community Feedback Shapes User Behavior”
  • 58. Approach: Match Similar Posts 1. Automatically quantify a post’s quality 2. Match pairs of posts of similar quality One receives positive feedback One negative feedback Q: What happens next?
  • 59. Data Data from Disqus CNN, Breitbart, IGN, Allkpop 1.2M threads, 42M comments, 140M votes, 1.8M users
  • 60. Quantifying a Post’s Quality 1st: Measure community feedback – P? -N? P-N? P/(P+N)? – Ask AMT workers to rate feedback received 2nd: Build a text-only model to predict q – R2 = .22 (using a separate test set) – Compared to AMT labels q’: q R2 =.25, p R2 = .12 p=
  • 61. Match on Predicted Quality q |q(a0) - q(b0)| ≤ 10^-4 Also: # words, # past posts, % positive votes for CNN
  • 62. Observed Effects Both changes in (objective) quality and in community perception!
  • 63. Changes on Activity Negative feedback accelerates commenting rate Negative feedback keeps users for longer Negative feedback leads to retaliation But … matching seems to be imperfect. Rate of giving positive feedback not balanced! Hints at unbalanced latent factor A perfect storm of a downward spiral!
  • 64. Example 2: Social Feedback & Weight Loss Cunha, Weber, Pappa. WWW’17 WebSci Track. “A Warm Welcome Matters! The Link Between Social Feedback and Weight Loss in /r/loseit”
  • 65. Support for Newcomers So I've been working on losing weight since December, but since June I've been in a rut :( 3 points 0 comments submitted 4 years ago by moonyDP to r/loseit Okay, so I was diagnosed back in December with GERD, and my doctor told me it would help to lose weight. I'm 5' 8" and, at the time, was around 175-180. … I'm 23 and weigh 550lbs. Please help 455 points 204 comments submitted 2 months ago by Ecurtis936'5" 550Lbs Male to r/loseit Starting weight: 550lbs Goal Weight: 250lbs Just to tell you a little about myself; I'm 23 years old 6'5" and sadly weigh 560lbs. I work at a call center, sitting in a desk for 10 hours a day. …
  • 66. Data Collection 5 years of data (August 2010 to October 2014) 107,886 unique users 70,949 posts and 922,245 comments Metadata (timestamp, user name, voting score and history of badges)
  • 67. Define Treatment and Control Look at first post of a user in the community Treatment = received comments Sparsity: 96% of posts received a comment Re-Define: –Treatment = received at least 4 comments –4,657 treatment and 1,468 control
  • 68. Covariates Choice Matching only balances matched variables – Important choice of what to match on Build LASSO regularized model to predict receiving “treatment” Use LDA topics, LIWC, question words, post size, sentiment – 98 variables in total Final model 20 variables (selected by LASSO) Use coefficient values as covariate weights
  • 69. Prune by Matching Use cosine similarity for matching – Weighted by LASSO coefficients Use 1-to-Many matching – To avoid throwing out data Use a caliper to only keep “similar enough” matches – Extreme case: exact match
  • 70. Balance Check Compute standardized mean difference Small dc = similar values of c in treatment and control group Remaining bias for variable c is considered to be insignificant if dc is smaller than 0.1 Note: don’t use a significance test! Else “too little data => no significant difference”
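The slide does not spell out the formula, so the sketch below assumes one common definition of the standardized mean difference: the difference in group means divided by the square root of the average of the two sample variances. The 0.1 threshold is the one stated on the slide; function names and data are illustrative.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated_values, control_values):
    """d_c = (mean_T - mean_C) / sqrt((var_T + var_C) / 2), an assumed definition."""
    pooled_sd = math.sqrt((variance(treated_values) + variance(control_values)) / 2)
    if pooled_sd == 0:
        return 0.0 if mean(treated_values) == mean(control_values) else math.inf
    return (mean(treated_values) - mean(control_values)) / pooled_sd


def is_balanced(treated_values, control_values, threshold=0.1):
    """Balance rule from the slide: |d_c| below 0.1 counts as balanced."""
    return abs(standardized_mean_difference(treated_values, control_values)) < threshold


# Toy covariate, e.g. post length in hundreds of words.
d_shifted = standardized_mean_difference([2, 3, 4, 5, 6], [1, 2, 3, 4, 5])
d_same = standardized_mean_difference([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
print(round(d_shifted, 3), d_same)
```

Unlike a significance test, this quantity does not shrink toward "no difference" merely because the sample is small, which is the point of the slide's warning.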
  • 71. Estimate Effect Size Effect on return rate 25,647 users present in Group 1. 18,000 treatment and 7,647 control. Balance check Effect size
  • 72. Estimate Effect Size Effect on weight loss 6,143 users present in Group 2. 4,657 treatment and 1,468 control. 26%, or an absolute mean difference of 9 lbs. Balance check Effect size
  • 73. Mediation Analysis Used a Sobel Test to check for mediation No statistically significant mediation effect found Social Feedback Weight Loss Engagement in Community
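For reference, the Sobel statistic is usually computed as z = a*b / sqrt(b^2 * se_a^2 + a^2 * se_b^2), where a is the effect of the treatment on the mediator and b the effect of the mediator on the outcome, each with its standard error. The sketch below implements that textbook formula; the input numbers are invented and are not results from the study.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided p-value for the indirect effect a*b."""
    z = (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical path coefficients and standard errors, for illustration only.
z_stat, p_val = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(round(z_stat, 3), p_val < 0.01)
```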
  • 74. Qualitative Evidence Community survey: “What do you like about /r/loseit?”
  • 76. Limitations Using badges to track weight loss –What if they don’t update badges? Determining the start of weight loss journey –What if lost weight before first post? Our choice of covariates –Can only correct for known covariates Observability of returning users –No return does not equal no weight loss
  • 77. Matching Methods Summary • Matching methods help to approximate causality • Problems – Researchers have lots of freedom on how to match – Most matching methods have been developed for low number of covariates – Worst case: random pruning → increases imbalance → increases bias and model dependence • Test for balance of observed covariates • Compare results from different matching methods, different dimensionality reduction methods, different models – Avoid model dependence and method dependence!
  • 78. There is No Magic Bullet https://twitter.com/johnmyleswhite/status/854419974995050496

Editor's Notes

  1. In the Vietnam War, people were selected via a lottery based on the day they were born.
  2. S = covariance matrix