Weitao Duan
LinkedIn
Creating Economic Opportunity for Every LinkedIn Member amid
Network Interferences: Journey and Learning
The world’s largest professional network
143M+ United States
14M+ Canada
30M+ Brazil
11M+ Mexico
6M+ Colombia
4M+ Chile
6M+ Argentina
90M+ Europe
6M+ South Africa
1M+ Kenya
1M+ Nigeria
2M+ Egypt
2M+ Saudi Arabia
1M+ Israel
6M+ Turkey
1M+ Morocco
45M+ India
36M+ China
1M+ Hong Kong
1M+ Republic of Korea
1M+ Japan
9M+ Australia
1M+ New Zealand
5M+ Philippines
9M+ Indonesia
3M+ Malaysia
2M+ Singapore
Land a New Job
Share a Post
Follow Companies
Digest Updates from My Connections
Connect with Someone
Learn a New Skill
60K Schools
575M Members
26M Companies
50K Skills
15M Open Jobs
190B Updates viewed
LINKEDIN’S VISION
Create economic opportunity for every member of the global workforce
LINKEDIN’S MISSION
Connect the world’s professionals to be more productive and successful
Strong Experiment Culture
We experiment on UI changes, relevance algorithms, backend changes, and even bug fixes.
Advanced Experiment Infrastructure
We have a state-of-the-art in-house platform to meet the growing need for experimentation.
Leverage the Power of Data to Create Economic Opportunity for Every Member
10,000+ Metrics Computed
500+ Daily Active Experiments
10+ TB Metric and Experiment Assignment Data Processed
Data is in Our DNA
Shared Challenge among Social Networks
Network Interference
Comment
Reshare
Message
Like
Post
Treatment: encourages posts
Anna (Treatment) posts more, for example by posting an article.
Weitao (Control) is connected to Anna and visits more.
SUTVA (the stable unit treatment value assumption) no longer holds!
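The Anna and Weitao example can be illustrated with a small expected-value calculation. This is only a sketch under an assumed model that is not in the slides: a member's metric is a baseline, plus a direct effect if the member is treated, plus a spillover proportional to the share of treated neighbors. The function names and the numeric values are hypothetical.

```python
def expected_outcome(treated: bool, treated_neighbor_share: float,
                     base: float = 1.0, direct: float = 0.10,
                     spillover: float = 0.05) -> float:
    """Expected metric under an ASSUMED linear-in-exposure model."""
    return base + direct * treated + spillover * treated_neighbor_share


def bernoulli_estimate(ramp: float) -> float:
    # Under Bernoulli randomization, treated and control members both have,
    # on average, a `ramp` share of treated neighbors, so the spillover
    # term cancels out of the treatment-minus-control difference.
    return expected_outcome(True, ramp) - expected_outcome(False, ramp)


def true_effect() -> float:
    # Everyone treated versus no one treated.
    return expected_outcome(True, 1.0) - expected_outcome(False, 0.0)


print("Bernoulli estimate:", round(bernoulli_estimate(0.5), 4))
print("True effect:", round(true_effect(), 4))
```

Under these assumptions the Bernoulli comparison recovers only the direct effect, while the effect of a full rollout also includes the spillover.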
Untangle the Nuances of Network Effect
Cluster-based: cluster members and randomize on clusters
Ego Cluster: focus on network effect within ego clusters
Edge-level: analyze interference on the edge level
Cluster-based
Cluster members and randomize on clusters
[Figure: LinkedIn members assigned to Control (A) and Treatment (B) under Bernoulli randomization and under cluster-based randomization]
Bernoulli Randomization vs. Cluster-based Randomization
Each design runs an A/B test on 50% of the population: Bernoulli randomization gives $\Delta_{\text{Bernoulli}}$ and cluster-based randomization gives $\Delta_{\text{cluster-based}}$.
[1] Testing for arbitrary interference on experimentation platforms. Jean Pouget-Abadie, Martin Saveski, Guillaume Saint-Jacques, Weitao Duan, Ya Xu, Souvik Ghosh, and Edoardo M. Airoldi.
[2] Detecting Network Effects: Randomizing over Randomized Experiments. Martin Saveski, Jean Pouget-Abadie, Guillaume Saint-Jacques, Weitao Duan, Ya Xu, Souvik Ghosh, and Edoardo M. Airoldi. KDD 2017.
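The difference between the two randomization schemes can be sketched as follows. This is a minimal illustration, not the production design: the cluster labels are assumed to be given, and `bernoulli_assign`, `cluster_assign`, and `share_of_edges_cut` are hypothetical helper names.

```python
import random


def bernoulli_assign(member_ids, ramp=0.5, seed=0):
    """Assign each member to treatment independently with probability `ramp`."""
    rng = random.Random(seed)
    return {m: rng.random() < ramp for m in member_ids}


def cluster_assign(member_to_cluster, ramp=0.5, seed=0):
    """Assign whole clusters to treatment; members inherit their cluster's arm."""
    rng = random.Random(seed)
    clusters = sorted(set(member_to_cluster.values()))
    cluster_arm = {c: rng.random() < ramp for c in clusters}
    return {m: cluster_arm[c] for m, c in member_to_cluster.items()}


def share_of_edges_cut(edges, assignment):
    """Share of edges whose two endpoints landed in different arms."""
    cut = sum(assignment[a] != assignment[b] for a, b in edges)
    return cut / len(edges)


# Toy graph (assumed): two clusters of three members, one cross-cluster edge.
member_to_cluster = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

print("cut (Bernoulli):", share_of_edges_cut(edges, bernoulli_assign(member_to_cluster)))
print("cut (cluster):  ", share_of_edges_cut(edges, cluster_assign(member_to_cluster)))
```

With cluster assignment, only edges that cross cluster boundaries can connect a treated member to a control member.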
Good idea, but...
Cluster-based randomization was not built into the production environment, because:
Low Power: the number of clusters is too small to detect the network effect.
Many Edges Cut: clusters are not perfectly isolated from each other, so some network effects were not captured.
High Management Cost: clustering and setting up the experiment take considerable effort.
Shift from Clusters to Ego Networks
Ego Cluster: focus on network effect within ego clusters
Ego-net based approach
Ego Network: an ego (the central member) together with its alters (the members connected to the ego)
2019. Using ego-clusters to measure network effects at LinkedIn. Guillaume Saint-Jacques, Maneesh Varshney, Jeremy Simpson, Ya Xu.
Ego-net based approach
Step 1: We pick some ego networks in the graph (think ~100K).
Step 2: We only treat alters (e.g. feed relevance models encouraging or discouraging comments, shares, likes, etc.).
Step 3: We only compare egos.
[Figure: a control ego net next to a treatment ego net]
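The three steps can be sketched on a toy graph. The greedy, non-overlapping selection below is an assumption made for illustration; the slides do not say how ego networks are chosen, and `pick_ego_networks` and `assign_alters` are hypothetical names.

```python
import random


def pick_ego_networks(neighbors, max_egos, seed=0):
    """Step 1 (assumed greedy rule): pick egos whose networks do not overlap."""
    rng = random.Random(seed)
    candidates = sorted(neighbors)
    rng.shuffle(candidates)
    used, ego_nets = set(), {}
    for ego in candidates:
        net = {ego} | set(neighbors[ego])
        if used.isdisjoint(net):
            ego_nets[ego] = sorted(neighbors[ego])
            used |= net
        if len(ego_nets) == max_egos:
            break
    return ego_nets


def assign_alters(ego_nets, seed=0):
    """Step 2: flip one coin per ego network and apply it to the alters only."""
    rng = random.Random(seed)
    ego_arm, alter_treated = {}, {}
    for ego, alters in ego_nets.items():
        arm = rng.random() < 0.5
        ego_arm[ego] = arm  # label used in Step 3; the ego itself is not treated
        for alter in alters:
            alter_treated[alter] = arm
    return ego_arm, alter_treated


# Toy symmetric adjacency list (assumed data).
neighbors = {
    "a": ["b", "c"], "b": ["a"], "c": ["a"],
    "d": ["e", "f"], "e": ["d"], "f": ["d"],
}
ego_nets = pick_ego_networks(neighbors, max_egos=2)
ego_arm, alter_treated = assign_alters(ego_nets)
print(ego_nets, ego_arm, alter_treated)
```

Step 3 then compares a metric across egos grouped by `ego_arm`. The egos receive no treatment themselves, so any difference between the two groups comes through their alters.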
Ego-net based approach
Control ego nets: observe metric $M_i$ from ego i, metric $M_j$ from ego j, …
Treatment ego nets: observe metric $M_p$ from ego p, metric $M_q$ from ego q, …
• Under H0 (no network effect), the difference in the metric between treatment egos and control egos is zero.
• Use a two-sample t-test to test whether the difference is 0.
• The difference is the network effect we have captured with the ego cluster design.
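A minimal sketch of the ego comparison, using only the Python standard library. Two choices here are assumptions rather than something the slides specify: the test uses the unequal-variance (Welch) form of the two-sample statistic, and the p-value uses a normal approximation, which is reasonable only when there are many egos per arm. The metric values are made up.

```python
import math
from statistics import mean, variance


def welch_t_test(control, treatment):
    """Return (difference in means, t statistic, approximate two-sided p-value)."""
    n_c, n_t = len(control), len(treatment)
    diff = mean(treatment) - mean(control)
    # Standard error of the difference with unequal variances (Welch).
    se = math.sqrt(variance(control) / n_c + variance(treatment) / n_t)
    t_stat = diff / se
    # Normal approximation to the t distribution (assumes large samples).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return diff, t_stat, p_value


# Hypothetical per-ego metric values.
control_egos = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment_egos = [2.0, 3.0, 4.0, 5.0, 6.0]
diff, t_stat, p_value = welch_t_test(control_egos, treatment_egos)
print(f"difference={diff:.3f}, t={t_stat:.3f}, p~{p_value:.3f}")
```

For small samples, an exact t distribution (for example from a statistics library) would be the safer choice for the p-value.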
1-hop: Measurable!
Treatment encourages comments and shares.
The Alter (Treatment) comments on and reshares the Ego's posts more, so the Ego posts more.
2+ hop: Cannot be measured.
Treatment encourages more posts.
The Alter's Alter (Treatment) posts more, the Alter comments and reshares more, and the Ego visits more.
Ego Cluster Learning
1-hop: captures the “1-hop” network effect (which could be the majority of the network effect).
> 4X: the captured network portion can be 4x higher than the effect measured under Bernoulli randomization. Sometimes the network effect takes the opposite sign from the Bernoulli estimate (and with a bigger magnitude).
Feed Specific: works well in feed experiments, but not in other product areas.
Edge Level Analysis
Edge-level: analyze interference on the edge level
Motivating Example
Treatment: (shown as an image in the original slide)
Control: Nothing
How many more messages are sent in the ecosystem?
Parameters govern the flow of messages
q: the increase in probability of sending an initial message (theoretical lift of an experiment)
α: probability of response
β: base rate at which messages are sent
Simulating the flow of a single message
In this cartoon example, James is in treatment and Anna is in control.
James: Happy Birthday!
Anna: Thanks!
James: You’re welcome!
1) James sends the initial message with probability $\beta (1+q)$.
2) Anna receives “Happy Birthday” with probability $\beta (1+q)$. She sends “Thanks” with probability $\alpha$.
3) The message “Thanks” exists with probability $\alpha \beta (1+q)$.
4) When James receives “Thanks”, he replies with probability $\alpha$.
5) The probability that “You’re welcome” exists depends on the initial send, the probability of Anna responding, AND the probability of James responding: $\alpha \cdot (\alpha \beta (1+q)) = \alpha^2 \beta (1+q)$.
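The five steps of the birthday exchange reduce to multiplying by the response probability at each reply. The sketch below computes the probability that each message in the chain exists; the function name and the numeric values are hypothetical.

```python
def message_chain_probabilities(beta, q, alpha, depth=3):
    """Probability that the 1st, 2nd, ..., depth-th message in a thread exists.

    beta:  base rate at which initial messages are sent
    q:     increase in the probability of sending an initial message
    alpha: probability of responding to a received message
    """
    probs = [beta * (1 + q)]           # initial message from the treated sender
    for _ in range(depth - 1):
        probs.append(alpha * probs[-1])  # each reply needs the previous message
    return probs


# Illustrative values only.
happy_birthday, thanks, youre_welcome = message_chain_probabilities(
    beta=0.1, q=0.2, alpha=0.5
)
print(happy_birthday, thanks, youre_welcome)
```

Each later message is less likely than the one before it by a factor of α, which is why the total number of messages forms a geometric series.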
● Notation:
○ $X_1$ is used for variables seen when treatment is rolled out (observed)
○ $X_0$ is for variables when treatment does nothing (never observed, but needed!)
● We decompose $\mathrm{TtoT}_1 = \mathrm{TtoT}_0 + \mathrm{TtoT}_0 \, q_1 (1 + \alpha + \alpha^2 + \dots)$
● We assume $\mathrm{TtoT}_0 = \mathrm{CtoC}_0 = \mathrm{CtoC}_1$ (with some normalizations)
● We get: $\mathrm{TtoT}_1 = \mathrm{CtoC}_1 + \mathrm{CtoC}_1 \, q_1 (1 + \alpha + \alpha^2 + \dots)$
● We can compute total lift as a function of observables only!
● We can decompose it into $q_1$ and $\alpha$.
*Caveat: this assumes that members do not react to how “treated” their network is, only to the treated status of individuals.
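The series 1 + α + α² + … is geometric, so for 0 ≤ α < 1 it sums to 1/(1 − α) and the total lift becomes q1/(1 − α). The sketch below checks the closed form against a truncated sum and inverts it to recover q1 from a total lift when α is known. The function names and numbers are illustrative assumptions.

```python
def total_lift(q, alpha):
    """Closed form of q * (1 + alpha + alpha**2 + ...) for 0 <= alpha < 1."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1) for the series to converge")
    return q / (1 - alpha)


def truncated_lift(q, alpha, terms):
    """Partial sum of the same series, keeping only the first `terms` terms."""
    return q * sum(alpha ** k for k in range(terms))


def initial_lift(total, alpha):
    """Recover q from the total lift when the response probability is known."""
    return total * (1 - alpha)


q1, alpha = 0.02, 0.5
print(total_lift(q1, alpha), truncated_lift(q1, alpha, 60))
print(initial_lift(total_lift(q1, alpha), alpha))
```

With α = 0.5, replies double the effect of the initial lift.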
Estimating from observed data
The 4 Edge Elements: TtoT, TtoC, CtoT, CtoC
Members in the treatment group (T) send messages both to other members in Treatment (T) and to members in Control (C).
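The four edge elements can be tallied from a message log by labeling each directed sender-to-receiver pair with the arms of its two endpoints. This is a minimal sketch with made-up data; `edge_totals` and the input shapes are assumptions.

```python
def edge_totals(messages, treated):
    """Sum message counts into the four edge elements.

    messages: dict mapping (sender, receiver) to a message count
    treated:  dict mapping member to True (Treatment) or False (Control)
    """
    totals = {"TtoT": 0, "TtoC": 0, "CtoT": 0, "CtoC": 0}
    for (sender, receiver), count in messages.items():
        key = ("T" if treated[sender] else "C") + "to" + ("T" if treated[receiver] else "C")
        totals[key] += count
    return totals


# Hypothetical assignment and message counts.
treated = {"a": True, "b": True, "c": False, "d": False}
messages = {("a", "b"): 3, ("a", "c"): 1, ("c", "d"): 2, ("d", "a"): 4}
print(edge_totals(messages, treated))
```

Only the TtoT and CtoC totals come from edges whose two endpoints share the same arm.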
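True Lift
True Lift: the expected lift to messages sent for the ecosystem if all members have the new feature
Estimated with only “clean” C-C or T-T edges
[Figure: messages sent, normalized by (theoretical) edge count, for CtoC (by Control) and TtoT (by Treatment); the gap between the two is the delta]
● $\mathrm{TtoT}_1 = \mathrm{CtoC}_1 + \mathrm{CtoC}_1 \, q_1 (1 + \alpha + \alpha^2 + \dots)$
● $\mathrm{TtoT}_1 = \mathrm{CtoC}_1 + \mathrm{CtoC}_1 \cdot \text{total lift}$
● With the right normalizations, this gives: (formula shown as an image in the original slide)
$M_{ij}$: messages from member i to member j
R: ramp percentage of “treatment”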
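The slide's final formula is not recoverable from this text, so the sketch below is not the author's estimator. It illustrates one possible normalization under a stated assumption: with Bernoulli assignment at ramp R, roughly a share R² of sender-receiver pairs is T-T and a share (1 − R)² is C-C, so each total is divided by its share before comparing. The function name and inputs are hypothetical.

```python
def edge_level_lift(ttot_messages, ctoc_messages, ramp):
    """Lift of normalized TtoT volume over normalized CtoC volume.

    ASSUMPTION: T-T pairs make up ramp**2 of all sender-receiver pairs and
    C-C pairs make up (1 - ramp)**2, so totals are divided by those shares.
    """
    if not 0 < ramp < 1:
        raise ValueError("ramp must be strictly between 0 and 1")
    ttot_rate = ttot_messages / ramp ** 2
    ctoc_rate = ctoc_messages / (1 - ramp) ** 2
    return ttot_rate / ctoc_rate - 1


# Made-up totals: at a 50% ramp the two shares are equal, so the lift is
# just the ratio of the raw totals minus one.
print(edge_level_lift(ttot_messages=104, ctoc_messages=100, ramp=0.5))
```

At uneven ramps the normalization matters, because the T-T and C-C pools differ in size even when per-edge behavior is identical.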
Corrected Lift is Larger
For experiments with a 50% ramp, the
average Bernoulli lift to messages sent
was 1.04%. By contrast, the corrected
true lift was 1.56%.
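As a quick arithmetic check on the two reported figures, the relative gap between the corrected true lift and the Bernoulli lift can be computed directly. The helper name is hypothetical.

```python
def relative_gap(true_lift, bernoulli_lift):
    """How much larger the true lift is, relative to the Bernoulli lift."""
    return true_lift / bernoulli_lift - 1


# Figures reported for 50% ramp experiments: 1.04% and 1.56%.
gap = relative_gap(true_lift=0.0156, bernoulli_lift=0.0104)
print(f"true lift is {gap:.0%} higher than the Bernoulli lift")
```

So at a 50% ramp, the corrected lift is about half again as large as the Bernoulli estimate.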
True Lift Is (always) Larger than Bernoulli Lift
[Figure: true lift versus Bernoulli lift, by ramp type]
Ramp type:
Even = both variants have 50% of traffic
Uneven = one variant has more than 25% but less than 50% of traffic
Very uneven = one variant has less than 25% of traffic
Edge Level Analysis
25%+ Higher: the true lifts can be 25% to 50% higher than the lifts measured under Bernoulli randomization.
Simplicity: True Lift ≈ Message Sent Lift + Message Received Lift at 50% ramp.
Messaging Specific: assumptions, such as no influence on C-C edges, need to hold.
Untangle the Nuances of Network Effect
Cluster-based: cluster members and randomize on clusters
Ego Cluster: focus on network effect within ego clusters
Edge-level: analyze interference on the edge level
There’s more to explore and study
Cluster-based, Ego Cluster, Edge-level
Downstream: uncover the downstream effects, e.g. Invitation Sends -> Member Session
Two (Three) sided Market: Job seeker <-> Recruiter, Member <-> Marketer