2. The world’s largest professional network
143M+ United States
14M+ Canada
30M+ Brazil
11M+ Mexico
6M+ Colombia
4M+ Chile
6M+ Argentina
90M+ Europe
6M+ South Africa
1M+ Kenya
1M+ Nigeria
2M+ Egypt
2M+ Saudi Arabia
1M+ Israel
6M+ Turkey
1M+ Morocco
45M+ India
36M+ China
1M+ Hong Kong
1M+ Republic of Korea
1M+ Japan
9M+ Australia
1M+ New Zealand
5M+ Philippines
9M+ Indonesia
3M+ Malaysia
2M+ Singapore
3. Land a New Job
Share a Post
Follow Companies
Digest Updates from My Connections
Connect with Someone
Learn a New Skill
7. Strong Experiment Culture
We experiment on UI changes, relevance algorithms, backend changes,
and even bug fixes.
Advanced Experiment Infrastructure
We have a state-of-the-art in-house platform to meet the growing need for
experimentation
Leverage the Power of Data to Create Economic Opportunity for Every Member
10000+ Metrics Computed
500+ Daily Active Experiments
10+ TB Metric and Experiment Assignment Data Processed
Data is in Our DNA
8. Shared Challenge among Social Networks
Network Interference
9. Shared Challenge among Social Networks
Network Interference
Comment
Reshare
Message
Like
Post
11. Untangle the Nuances of Network Effect
Cluster-based: Cluster members and randomize on clusters
Ego Cluster: Focus on network effect within ego clusters
Edge-level: Analyze interference on edge level
21. Cluster-based Randomization
Bernoulli Randomization
50% 50%
ΔBernoulli Δcluster-based
[1] Testing for arbitrary interference on experimentation platforms.
Jean Pouget-Abadie, Martin Saveski, Guillaume Saint-Jacques, Weitao Duan,
Ya Xu, Souvik Ghosh, and Edoardo M. Airoldi.
[2] Detecting Network Effects: Randomizing over Randomized Experiments.
Martin Saveski, Jean Pouget-Abadie, Guillaume Saint-Jacques, Weitao Duan,
Ya Xu, Souvik Ghosh, and Edoardo M. Airoldi. KDD 2017.
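The difference between the two designs can be sketched in a few lines. This is a minimal illustration, not the production setup: the function names, the toy graph, and the 50/50 split are assumptions made for the example. Bernoulli randomization assigns each member independently, while cluster-based randomization assigns whole clusters, so edges inside a cluster are never cut.

```python
import random


def bernoulli_assign(members, p=0.5, seed=0):
    """Assign each member to an arm independently of everyone else."""
    rng = random.Random(seed)
    return {m: ("treatment" if rng.random() < p else "control") for m in members}


def cluster_assign(member_to_cluster, p=0.5, seed=0):
    """Assign whole clusters to an arm; members inherit their cluster's arm."""
    rng = random.Random(seed)
    clusters = sorted(set(member_to_cluster.values()))
    cluster_arm = {
        c: ("treatment" if rng.random() < p else "control") for c in clusters
    }
    return {m: cluster_arm[c] for m, c in member_to_cluster.items()}


def cut_edge_fraction(edges, assignment):
    """Share of edges whose two endpoints ended up in different arms."""
    if not edges:
        return 0.0
    cut = sum(1 for a, b in edges if assignment[a] != assignment[b])
    return cut / len(edges)


# Hypothetical toy graph: two clusters, with edges only inside clusters.
member_to_cluster = {"a": 0, "b": 0, "c": 1, "d": 1}
edges = [("a", "b"), ("c", "d")]

by_cluster = cluster_assign(member_to_cluster)
by_member = bernoulli_assign(member_to_cluster)
print(cut_edge_fraction(edges, by_cluster), cut_edge_fraction(edges, by_member))
```

In a real graph clusters are not perfectly isolated, so some edges between clusters are still cut under the cluster-based design.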
22. Good idea, but...
Cluster-based was not built into the production environment, because:
Low Power: The number of clusters is too small to detect the network effect
Many Edges Cut: Clusters are not perfectly isolated from each other, so some network effects were not captured
High Management Cost: Considerable effort goes into clustering and setting up the experiment
23. Shift from Clusters to Ego Networks
Ego Cluster: Focus on network effect within ego clusters
24. Ego-net based approach
Ego Network
2019. Using ego-clusters to measure network effects at LinkedIn.
Guillaume Saint-Jacques, Maneesh Varshney, Jeremy Simpson, Ya Xu.
Ego
Alter
25. Ego-net based approach
Step 1: We pick some ego-networks in the graph (think ~100K)
Step 2: We only treat alters (e.g. feed relevance models encouraging or
discouraging comments/shares/likes etc.)
Step 3: We only compare egos
Control ego net
Treatment ego net
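The three steps can be sketched as follows. This is only an illustration under stated assumptions: the greedy selection, the requirement that ego-networks do not overlap, the 50/50 split, and all function names are choices made for this example, not details confirmed by the slides.

```python
import random


def pick_ego_networks(adjacency, max_egos, seed=0):
    """Step 1 (sketch): greedily pick ego-networks that share no members.

    Non-overlap is an assumption of this example.
    """
    rng = random.Random(seed)
    candidates = sorted(adjacency)
    rng.shuffle(candidates)
    used = set()
    ego_nets = {}
    for ego in candidates:
        if len(ego_nets) >= max_egos:
            break
        members = {ego} | set(adjacency[ego])
        if members & used:
            continue  # overlaps an ego-network that was already picked
        ego_nets[ego] = set(adjacency[ego])
        used |= members
    return ego_nets


def assign_ego_networks(ego_nets, p=0.5, seed=0):
    """Step 2 (sketch): pick an arm per ego-network and apply it to alters only."""
    rng = random.Random(seed)
    ego_net_arm = {}  # arm label of each ego's network, used for the comparison
    alter_variant = {}  # variant actually served to each alter
    for ego in sorted(ego_nets):
        arm = "treatment" if rng.random() < p else "control"
        ego_net_arm[ego] = arm
        for alter in ego_nets[ego]:
            alter_variant[alter] = arm
    return ego_net_arm, alter_variant


# Hypothetical toy graph as an adjacency list.
adjacency = {
    "a": ["b", "c"], "b": ["a"], "c": ["a"],
    "d": ["e"], "e": ["d"], "f": [],
}
ego_nets = pick_ego_networks(adjacency, max_egos=10)
ego_net_arm, alter_variant = assign_ego_networks(ego_nets)
```

Step 3 then compares a metric measured on the egos only, grouped by `ego_net_arm`.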
26. Ego-net based approach
Control ego net: Observe Metric Mi from Ego i, Metric Mj from Ego j, …
Treatment ego net: Observe Metric Mp from Ego p, Metric Mq from Ego q, …
• Under H0 (no network effect), the metric difference between treatment and control egos is zero.
• Use a two-sample t-test to test whether the difference = 0.
• The difference is the network effect we have captured with the ego cluster design.
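The comparison of egos can be sketched with the standard library alone. This is a minimal sketch under assumptions: it uses the unequal-variance (Welch) form of the two-sample statistic and a normal approximation for the p-value, which is reasonable only when there are many egos. The function name and the toy numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def ego_network_effect(control_ego_metrics, treatment_ego_metrics):
    """Return (difference in means, test statistic, two-sided p-value).

    Assumption: the p-value uses a normal approximation to the t distribution.
    """
    diff = mean(treatment_ego_metrics) - mean(control_ego_metrics)
    se = sqrt(
        variance(treatment_ego_metrics) / len(treatment_ego_metrics)
        + variance(control_ego_metrics) / len(control_ego_metrics)
    )
    if se == 0:
        raise ValueError("metrics have zero variance; the statistic is undefined")
    t_stat = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return diff, t_stat, p_value


# Hypothetical per-ego metric values, one number per ego.
control = [1.0, 2.0, 3.0]
treatment = [2.0, 3.0, 4.0]
diff, t_stat, p_value = ego_network_effect(control, treatment)
print(diff, round(t_stat, 3), round(p_value, 3))
```

Here `diff` plays the role of the captured network effect, and a small p-value is evidence against H0.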
29. Ego Cluster Learning
1-hop: Captures “1-hop” network effect (could be the majority of the network effect)
> 4X: The captured network effect can be more than 4x what Bernoulli randomization measures. Sometimes the network effect takes the opposite sign from the Bernoulli estimate (and with bigger magnitude)
Feed Specific: Works well in feed experiments, but not in other product areas
32. Parameters govern the flow of messages
q - the increase in probability of sending an initial
message (theoretical lift of an experiment)
𝛼 - probability of response
𝛽 - base rate at which messages are sent
Simulating the flow of a single message
In this cartoon example, James is
in treatment, Anna is in control
1) James sends the initial message with probability = 𝛽 * (1+q)
2) Anna receives “Happy Birthday!” with probability 𝛽 * (1+q).
She sends “Thanks!” with probability = 𝛼
3) The message “Thanks!” exists with probability = 𝛼 * 𝛽 * (1+q)
4) When James receives “Thanks!”, he replies with probability = 𝛼
5) The probability that “You’re welcome!” exists depends on the initial send, the
probability of Anna responding AND the probability of James responding:
probability = 𝛼 * (𝛼 * 𝛽 * (1+q))
James: Happy Birthday!
Anna: Thanks!
James: You’re welcome!
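The chain of probabilities above can be reproduced in a short sketch. The parameter values (𝛽 = 0.1, q = 0.2, 𝛼 = 0.5), the function names, and the cap on thread length are assumptions for illustration only. The first function computes the probability that each message in the thread exists; the second simulates one thread.

```python
import random


def message_chain_probabilities(beta, q, alpha, depth=3):
    """Probability that the 1st, 2nd, ... message of a thread exists."""
    p = beta * (1 + q)  # the initial send from a treated member
    probabilities = []
    for _ in range(depth):
        probabilities.append(p)
        p *= alpha  # each reply happens with probability alpha
    return probabilities


def simulate_thread(beta, q, alpha, rng, max_messages=50):
    """Simulate one thread and return the number of messages exchanged."""
    if rng.random() >= beta * (1 + q):
        return 0  # no initial message was sent
    count = 1
    while count < max_messages and rng.random() < alpha:
        count += 1
    return count


# Hypothetical parameter values.
beta, q, alpha = 0.1, 0.2, 0.5
print(message_chain_probabilities(beta, q, alpha))

rng = random.Random(0)
trials = 100_000
average_messages = sum(
    simulate_thread(beta, q, alpha, rng) for _ in range(trials)
) / trials
print(average_messages)
```

With these values the three probabilities are about 0.12, 0.06 and 0.03, matching steps 1, 3 and 5. The simulated average should land near 𝛽 * (1+q) / (1 − 𝛼) = 0.24.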
33. Estimating from observed data
● Notation:
○ X1 is used for variables seen when treatment is rolled
out (observed)
○ X0 is for variables when treatment does nothing (never
observed, but needed!)
● We decompose TtoT1 = TtoT0 + TtoT0 * q1 * (1 + α + α² + …)
● We assume TtoT0 = CtoC0 = CtoC1 (with some normalizations)
● We get: TtoT1 = CtoC1 + CtoC1 * q1 * (1 + α + α² + …)
● Total lift: q1 * (1 + α + α² + …)
● We can compute total lift as a function of observables only!
● We can decompose it into q1 and α.
*Caveat: this assumes that members do not react to how ‘treated’ their
network is, only to the treated status of individuals.
The 4 Edge Elements: TtoT, TtoC, CtoT, CtoC
Members in the treatment group (T) send messages both to other members in
Treatment (T) and to members in Control (C)
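For 0 ≤ α < 1 the series 1 + α + α² + … sums to 1 / (1 − α), so the total lift in the decomposition above equals q1 / (1 − α). The sketch below works through that arithmetic. It assumes the TtoT1 and CtoC1 inputs have already been normalized as the slide requires; the function names and the numbers are illustrative, not from the talk.

```python
def total_lift(q1, alpha):
    """Closed form of q1 * (1 + alpha + alpha**2 + ...) for 0 <= alpha < 1."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1) for the series to converge")
    return q1 / (1 - alpha)


def total_lift_from_edges(t_to_t_1, c_to_c_1):
    """Solve TtoT1 = CtoC1 + CtoC1 * total_lift for total_lift.

    Assumption: both inputs are already normalized and comparable.
    """
    return t_to_t_1 / c_to_c_1 - 1


def initial_lift_from_total(total, alpha):
    """Recover q1 from the total lift once alpha is known."""
    return total * (1 - alpha)


# Hypothetical values: a 1% initial lift and a 50% response probability.
q1, alpha = 0.01, 0.5
partial_sum = q1 * sum(alpha**k for k in range(60))
print(total_lift(q1, alpha), partial_sum)
```

With these numbers a 1% initial lift becomes a 2% total lift once replies are counted.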
34. True Lift
True Lift: the expected lift to messages sent
for the ecosystem if all members have the
new feature
Estimated with only ‘clean’ C-C or T-T edges
[Chart: messages sent, normalized by (theoretical) edge count, for CtoC (by Control) and TtoT (by Treatment); the gap between the two is the delta]
● TtoT1 = CtoC1 + CtoC1 * q1 * (1 + α + α² + …)
● TtoT1 = CtoC1 + CtoC1 * total lift
● With the right normalizations, this gives:
Mij: messages from member i to member j
R: ramp percentage of “treatment”
35. Corrected Lift is Larger
For experiments with a 50% ramp, the
average Bernoulli lift to messages sent
was 1.04%. By contrast, the corrected
true lift was 1.56%.
True Lift Is (always) Larger than Bernoulli Lift
Ramp type
Even = both variants have 50% of traffic
Uneven = one variant has more than 25% but less than 50% of traffic
Very uneven = one variant has less than 25% of traffic
Bernoulli Lift
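As a quick arithmetic check on the 50%-ramp figures quoted on this slide, the relative gap between the corrected true lift and the Bernoulli lift can be computed directly. The variable names are illustrative.

```python
bernoulli_lift_pct = 1.04  # average Bernoulli lift to messages sent, in percent
true_lift_pct = 1.56  # corrected true lift, in percent

# How much larger the true lift is, relative to the Bernoulli estimate.
relative_gap = true_lift_pct / bernoulli_lift_pct - 1
print(f"{relative_gap:.0%}")  # prints "50%"
```

In this example the Bernoulli estimate captures only about two thirds of the true lift.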
36. Edge Level Analysis
25%+ Higher: The true lifts can be 25% to 50% higher than under Bernoulli Randomization
Simplicity: True Lift ≈ Message Sent Lift + Message Received Lift at 50% ramp
Messaging Specific: Assumptions, such as no influence on CC edge, need to hold
37. Untangle the Nuances of Network Effect
Cluster-based: Cluster members and randomize on clusters
Ego Cluster: Focus on network effect within ego clusters
Edge-level: Analyze interference on edge level
38. There’s more to explore and study
Cluster-based, Ego Cluster, Edge-level
Downstream: Uncover the downstream, e.g. Invitation Sends -> Member Session
Two (Three) sided Market: Job seeker <-> Recruiter, Member <-> Marketer