Personalizing LinkedIn Feed
Presenter: Qi He (qhe@linkedin.com)
Other authors: Deepak Agarwal, Bee-Chung Chen, Zhenhao Hua, Guy Levanon, Yiming Ma, Pannagadatta Shivaswamy, Hsiao-Ping Tseng, Jaewon Yang, Liang Zhang
In SIGKDD, Aug 2015, Sydney
LinkedIn Confidential ©2015 All Rights Reserved 1
LinkedIn Feed
Professional network
Heterogeneous updates
▪ More than 40 types
▪ Share articles, like activities, connection updates, etc.
Challenges
▪ Large scale (300M+ members)
▪ Personalized relevance
▪ Freshness, diversity, user fatigue
How do we rank activities in a personalized way?
Personalization Overview
▪ What to show to our members?
Personalization and ranking based on click-through rate (CTR), i.e., maximizing the number of clicks per page view, which is user-specific.
▪ Methodologies to predict CTR
No personalization on activities
– time
– global popularity of updates
(user, context)-specific affinity
No Personalization
▪ Reverse chronological ranking
– Fresh but not relevant
▪ Ranking by social popularity
– Likes are a useful signal
– CTR is not monotonically related to the number of likes
– Not all activities have likes
▪ Ranking by update type popularity
– Update type taxonomy: (actor type, verb type, object type)
– Connection: (member, connect, member)
– Opinion: (member, like, article)
➢ Social popularity: the CTR at #likes=0 is normalized to CTR=1.0, so CTR=1.6 means a +60% CTR increase.
➢ Update type popularity: the average CTR of all types is normalized to CTR=1.0.
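The normalization used in the notes above can be sketched in a few lines. This is only an illustration: the function name `relative_ctr`, the bucket labels, and the click and impression counts are all hypothetical and are not taken from the slides.

```python
def relative_ctr(clicks, impressions, baseline):
    """Return each bucket's CTR divided by the CTR of the baseline bucket."""
    ctr = {bucket: clicks[bucket] / impressions[bucket] for bucket in clicks}
    return {bucket: value / ctr[baseline] for bucket, value in ctr.items()}


# Hypothetical counts per "number of likes" bucket (not from the slides).
example_clicks = {"0": 50, "1-5": 80}
example_impressions = {"0": 10_000, "1-5": 10_000}

# The "0" bucket is the baseline, so its relative CTR is 1.0 by construction.
example_relative = relative_ctr(example_clicks, example_impressions, baseline="0")
```

With these made-up counts the "1-5" bucket comes out at roughly 1.6, which is how a "+60% CTR increase" would be read off the normalized scale.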
Personalization: (user, context)-specific affinities
▪ Viewer-ActivityType affinity: personal preference for activity types
▪ Viewer-Actor affinity: personal preference for the actor of an activity
Viewer – ActivityType Affinity Model
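$a_{ik}$ = the likelihood score that viewer $i$ clicks on activity type $k$.
1) Direct estimate
$\hat{a}_{ik} = \frac{\mathrm{click}_{ik}}{\mathrm{imp}_{ik}} = \frac{C_{ik}}{I_{ik}}$, for large sample sizes.
2) Feature-based model
$\hat{a}_{ik} = f(x_{ik}; \theta)$
3) Gamma-Poisson model: CTR correction factor + feature-based CTR
$\hat{a}_{ik} = f(x_{ik}; \theta) \times g_{ik}$
$C_{ik} \sim \mathrm{Poisson}(E_{ik} \times g_{ik})$, $E_{ik}$ = expected clicks
$g_{ik} \sim \mathrm{Gamma}(\mathrm{mean} = 1, \mathrm{var} = 1/\gamma)$, the correction factor
$\hat{g}_{ik} = \frac{\gamma + C_{ik}}{\gamma + E_{ik}} = \frac{\gamma + C_{ik}}{\gamma + \sum_{e \in I_{ik}} f(x_e; \theta)}$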
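A minimal sketch of the Gamma-Poisson correction, assuming the feature-based CTR predictions for a viewer's past impressions of one activity type are already available as a list. The function names, the prior strength `gamma=10.0`, and the example numbers are hypothetical.

```python
def correction_factor(clicks, predicted_ctrs, gamma):
    """Posterior-mean correction factor (gamma + C_ik) / (gamma + E_ik).

    clicks: observed clicks C_ik for viewer i on activity type k.
    predicted_ctrs: feature-based CTR f(x_e; theta) for each past impression e.
    gamma: prior strength; the prior on the factor has mean 1 and variance 1/gamma.
    """
    expected_clicks = sum(predicted_ctrs)  # E_ik
    return (gamma + clicks) / (gamma + expected_clicks)


def corrected_affinity(feature_ctr, clicks, predicted_ctrs, gamma):
    """Feature-based CTR scaled by the viewer-specific correction factor."""
    return feature_ctr * correction_factor(clicks, predicted_ctrs, gamma)


# Hypothetical viewer: 100 impressions predicted at 2% CTR each, 3 observed clicks.
example_factor = correction_factor(3, [0.02] * 100, gamma=10.0)
# A viewer with no history keeps the feature-based CTR unchanged.
cold_factor = correction_factor(0, [], gamma=10.0)
```

The factor is pulled toward 1 when a viewer has few impressions and moves toward the observed-to-expected click ratio as evidence accumulates, which is how the model copes with sparse data.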
Viewer – Actor Affinity Model
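viewer $i$, actor $j$, activity type $k$, activity $t$
$P(Y = 1) = \sigma(\beta X_{ijkt})$, i.e., $Y \sim \mathrm{Bernoulli}(\sigma(\beta X_{ijkt}))$
$\beta X_{ijkt} = \beta_{ij} X_{ij} + \beta_{ijk} X_{ijk} + \beta_t X_t$
If $X_{ijk}$ exists: $\hat{a}_{ijk} = \beta_{ij} X_{ij} + \beta_{ijk} X_{ijk}$ (Viewer-Actor-ActivityType affinity)
Otherwise: $\hat{a}_{ij} = \beta_{ij} X_{ij}$ (Viewer-Actor affinity)
$X_{ij}, X_{ijk}$: interaction features (warm-start)
$X_{ij}$: member profile features (cold-start)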
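The fallback between the two affinities can be sketched as follows. This assumes the coefficients are plain weight vectors aligned with their feature vectors; the function names and all numbers are hypothetical.

```python
import math


def dot(weights, features):
    """Inner product of a weight vector and a feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def affinity(beta_ij, x_ij, beta_ijk=None, x_ijk=None):
    """Viewer-Actor affinity, refined by activity type when X_ijk exists."""
    score = dot(beta_ij, x_ij)
    if x_ijk is not None:
        score += dot(beta_ijk, x_ijk)
    return score


def click_probability(affinity_score, beta_t, x_t):
    """Logistic click probability: sigmoid(affinity + beta_t . X_t)."""
    return 1.0 / (1.0 + math.exp(-(affinity_score + dot(beta_t, x_t))))


# Hypothetical weights and features (not from the slides).
pair_only = affinity([0.5, 1.0], [2.0, 1.0])
with_type = affinity([0.5, 1.0], [2.0, 1.0], beta_ijk=[0.25], x_ijk=[4.0])
example_probability = click_probability(with_type, beta_t=[-3.0], x_t=[1.0])
```

Keeping the affinity as a separate additive term is what lets it be computed ahead of time per pair and combined with activity-level features when scoring.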
Viewer – Actor Affinity Features
▪ Warm-start features
– Number of past interactions (clicks, shares, likes, …)
– Number of past impressions
– Over multiple time windows
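Counting past events over several time windows might look like the sketch below. The window lengths of 1, 7, and 30 days, the function name, and the timestamps are assumptions for illustration; the slides do not state which windows are used.

```python
from datetime import datetime, timedelta


def windowed_counts(event_times, now, windows_days=(1, 7, 30)):
    """Count events that fall within each trailing window ending at `now`."""
    counts = {}
    for days in windows_days:
        window = timedelta(days=days)
        counts[days] = sum(
            1 for t in event_times if timedelta(0) <= now - t <= window
        )
    return counts


# Hypothetical click timestamps for one (viewer, actor) pair.
example_now = datetime(2015, 8, 10)
example_clicks = [
    example_now - timedelta(hours=12),
    example_now - timedelta(days=3),
    example_now - timedelta(days=20),
    example_now - timedelta(days=90),
]
example_counts = windowed_counts(example_clicks, example_now)
```

The same function could be applied to impressions, shares, and likes separately, giving one count per (event kind, window) combination as a feature.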
Viewer – Actor Affinity Features
▪ Cold-start features
– Viewer profile × actor profile
➢ Education
➢ Jobs
➢ Location
➢ Skills
➢ …
– Social network of (viewer, actor)
➢ Number of common friends
➢ Number of the viewer's neighbors that took actions on the same actor
➢ …
[Chart labels: top N profile features; number of connections who acted on the same actor]
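The two social-network features listed for cold start can be sketched with plain sets. The data layout (a dict of connection sets and a dict of acted-on actors per member) and all identifiers are hypothetical.

```python
def social_features(viewer, actor, connections, actions):
    """Return (common connections, viewer's neighbors who acted on the actor).

    connections: member id -> set of connected member ids.
    actions: member id -> set of actor ids that member has acted on.
    """
    viewer_neighbors = connections.get(viewer, set())
    actor_neighbors = connections.get(actor, set())
    common = len(viewer_neighbors & actor_neighbors)
    acted = sum(1 for n in viewer_neighbors if actor in actions.get(n, set()))
    return common, acted


# Hypothetical toy graph (not from the slides).
example_connections = {"v": {"a", "b", "c"}, "actor": {"b", "c", "d"}}
example_actions = {"a": {"actor"}, "b": {"actor", "x"}, "c": {"x"}}
example_common, example_acted = social_features(
    "v", "actor", example_connections, example_actions
)
```

Neither feature needs any direct interaction between the viewer and the actor, which is why they remain usable when the pair has no history.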
Jointly Train Click Prediction Model
[Diagram: BIG DATA is split into Partition 1, Partition 2, Partition 3, …, Partition K; a logistic regression is fit on each partition; a consensus computation combines the per-partition models.]
ADMM: Alternating Direction Method of Multipliers
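A small sketch of consensus ADMM for logistic regression, in the spirit of the partition-then-consensus picture above. Everything specific here is an assumption: the L2 penalty `lam` on the consensus vector, the penalty parameter `rho`, the use of plain gradient descent for the local fits, and the synthetic data. It is not a description of the production system.

```python
import numpy as np


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


def local_update(A, y, x, v, rho, steps=50):
    """Approximately minimize mean logistic loss + (rho/2)*||x - v||^2."""
    n = A.shape[0]
    # Upper bound on the curvature of the local objective gives a safe step size.
    lipschitz = 0.25 * np.linalg.norm(A, 2) ** 2 / n + rho
    for _ in range(steps):
        grad = A.T @ (sigmoid(A @ x) - y) / n + rho * (x - v)
        x = x - grad / lipschitz
    return x


def admm_logistic(partitions, dim, rho=0.1, lam=0.01, iters=100):
    """Consensus ADMM: one local model per partition, one shared vector z."""
    k = len(partitions)
    xs = np.zeros((k, dim))  # per-partition coefficients
    us = np.zeros((k, dim))  # scaled dual variables
    z = np.zeros(dim)        # consensus coefficients
    for _ in range(iters):
        for i, (A, y) in enumerate(partitions):
            xs[i] = local_update(A, y, xs[i], z - us[i], rho)
        # Consensus step: closed form for the penalty (lam/2)*||z||^2.
        z = rho * (xs + us).sum(axis=0) / (lam + k * rho)
        us += xs - z
    return z, xs


# Synthetic click data with hypothetical true coefficients, split into 4 partitions.
rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0, 0.5])
A_all = rng.normal(size=(2000, 3))
y_all = (rng.random(2000) < sigmoid(A_all @ w_true)).astype(float)
parts = [(A_all[i::4], y_all[i::4]) for i in range(4)]
z_hat, xs_hat = admm_logistic(parts, dim=3)
```

Each partition only ever touches its own rows, and the only shared state is the consensus vector and the dual variables, which is what makes this pattern suit data too large for one machine.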
Affinity Deployment Framework
▪ Offline
– Daily update
➢ Hourly: +0.1%
➢ 2-day: -0.4%
– Viewer-ActivityType
➢ 300M x 50: type affinity
– Viewer-Actor
➢ Pairs with actions in the past half a year
➢ Tens of billions for desktop and mobile
➢ Top 10K scores for heavy viewers (only 0.08% offline metric loss)
Online workflow
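Keeping only the top-scoring actors for a heavy viewer, as in the "Top 10K scores" item above, could be sketched like this. The function name and the toy scores are hypothetical; only the default cap of 10,000 comes from the slide.

```python
import heapq


def truncate_affinities(actor_scores, k=10_000):
    """Keep the k highest-scoring actors for one viewer."""
    if len(actor_scores) <= k:
        return dict(actor_scores)
    top = heapq.nlargest(k, actor_scores.items(), key=lambda item: item[1])
    return dict(top)


# Hypothetical scores for one viewer, truncated to the top 2 for illustration.
example_scores = {"actor_a": 0.9, "actor_b": 0.1, "actor_c": 0.5}
example_top = truncate_affinities(example_scores, k=2)
```

`heapq.nlargest` avoids sorting the full score table, which matters when a single viewer has far more candidate pairs than the cap.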
Desktop A/B Tests
Viewer-ActivityType affinity vs. no affinity
Viewer-Actor affinity vs. Viewer-ActivityType affinity
Viewer-Actor-ActivityType affinity vs. Viewer-Actor affinity + Viewer-ActivityType affinity
Mobile A/B Tests
Viewer-ActivityType affinity vs. no affinity
Viewer-Actor-ActivityType affinity vs. Viewer-ActivityType affinity
Summary
▪ Conclusions
– Personalization of finer granularity achieves higher CTR.
– Scalability and data sparsity are two major concerns of production design.
▪ Future Work
– Activity-dependent personalization, e.g., the affinity between the viewer and the content topic of an activity.
– Personalization at the viewer id level, e.g., each viewer has her own personalization model.
Q&A
Qi He (qhe@linkedin.com)


Editor's Notes

  1. fatigue [fəˈtiɡ]
  2. Chronological [ˌkrɑ:nəˈlɑ:dʒɪkl]
  3. A balance between cold-start features and warm-start features.