LinkedIn Endorsements: Reputation, Virality, and Social Tagging

Endorsements are a one-click system to recognize someone for their skills and expertise on LinkedIn, the largest professional online social network. This is one of the latest “data features” in LinkedIn’s portfolio, and the endorsement ecosystem generates a large graph of reputation signals and viral user activity.

In this talk, we’ll examine the practical aspects of building a data feature like Endorsements. We’ll talk about marrying product design and data, diving deep into several of the lessons we’ve learned along the way, all using skills & endorsements as an empirical case study. We’ll include technical detail on our approaches and on how we combine crowdsourcing, machine learning, and large-scale distributed systems to recommend topics to users.

Presentation Transcript

  • LinkedIn Endorsements: Reputation, Virality, and Social Tagging. O’Reilly Strata, February 28, 2013. Sam Shah (@sam_shah) and Pete Skomoroch (@peteskomoroch). ©2012 LinkedIn Corporation. All Rights Reserved.
  • Sam Shah, Principal Engineer and Engineering Manager (@sam_shah, www.linkedin.com/in/shahsam). Peter Skomoroch, Principal Data Scientist (@peteskomoroch, www.linkedin.com/in/peterskomoroch).
  • LinkedIn: The Professional Profile of Record. 200M+ members; 200M member profiles.
  • LinkedIn’s Latest Data Product: Skill Endorsements
  • Viral Growth: 800M Endorsements in 4 Months
  • Data Amplifies Desire
    1. Desire + Social Proof
    2. Viral Loops + Network Effects
    3. Data Foundation + Recommendation Algorithms
  • 1) Desire & Social Proof
  • 2) Viral Loops & Network Effects (diagram: A endorses B; B is notified via email, news feed, and notification; B “accepts” the endorsement; endorsement recommendations are shown to B; B endorses C and D)
  • 3) Data Foundation: Skills & Suggested Skills
  • Data Foundation: LinkedIn Skills
  • Social Tagging Accelerates Adoption (chart labels: skill marketing, skill recommendations, virality only, suggested endorsements)
  • Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • Unsupervised Topic Discovery from Profiles (diagram label: Extract)
  • Building the Skills Dictionary
    What is the skills dictionary?
    – A growing taxonomy of skills
    – Generated by mining profiles and maintained by the Skills team at LinkedIn
    – Created using clustering and crowdsourcing
    – Multiple phrases, acronyms, and misspellings map to a single standardized skill; 250+ different phrases map to “Microsoft Office”
    Pipeline (diagram): Profile (specialties) → Tokenization → Clustering → Crowdsourcing → Taxonomy
  • Topic Clustering & Phrase Sense Disambiguation
  • Skills Dictionary: Microsoft Office (Skill ID = 366). Phrases that map to it:
    – ms office
    – ms office suite
    – computer skills including ms office
    – office 97
    – microsoft office user
    – mac office
    – microsoft office 2003 & 2007
    – microsoft office suits
    – microsoft ofice
    – microsoft ofiice
    – ms office certified
    – office 98
    – …
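The phrase-to-skill mapping on this slide can be sketched as a plain lookup table. This is a minimal illustration, not LinkedIn's implementation: the phrases and Skill ID 366 come from the slide, while the names `SKILLS_DICTIONARY`, `normalize`, and `standardize` and the whitespace and case normalization are assumptions.

```python
import re

# Standardized skill as (skill ID, display name); ID 366 is taken from the slide.
MICROSOFT_OFFICE = (366, "Microsoft Office")

# Hypothetical excerpt of the dictionary: variant phrase -> standardized skill.
# The slide says 250+ phrases map to this skill; only a few are listed here.
SKILLS_DICTIONARY = {
    "ms office": MICROSOFT_OFFICE,
    "ms office suite": MICROSOFT_OFFICE,
    "office 97": MICROSOFT_OFFICE,
    "mac office": MICROSOFT_OFFICE,
    "microsoft office suits": MICROSOFT_OFFICE,
    "microsoft ofice": MICROSOFT_OFFICE,
    "microsoft ofiice": MICROSOFT_OFFICE,
    "ms office certified": MICROSOFT_OFFICE,
}


def normalize(phrase: str) -> str:
    """Lowercase the phrase and collapse runs of whitespace (an assumed cleanup step)."""
    return re.sub(r"\s+", " ", phrase.strip().lower())


def standardize(phrase: str):
    """Return the (skill ID, name) pair for a known variant, or None if unknown."""
    return SKILLS_DICTIONARY.get(normalize(phrase))
```

A lookup table only covers variants that are already known; per the slides, discovering those variants in the first place is the job of clustering and crowdsourcing.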
  • Deduplication Signals from Mechanical Turk
  • Sample Task for Mechanical Turk Workers
  • Skill Phrase Deduplication
  • Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • Skills Classification
    – Use skill dictionary metadata to tag, standardize, and infer skills
    – Run classifiers for each skill on member profiles
    Example skills: Public Speaking, Ruby on Rails, Entrepreneurship, Microsoft Office, AP Style
  • Document Tagging (example: a profile)
    Tagging: extract potential skill phrases from text. Example text: “Lead designer and engineer for the implementation of a user-centric, fully-configurable UI for data aggregation and reporting. Developed over 20 SaaS custom applications using Python, Javascript and RoR.”
    – Tokenization: phrases of up to 6 words (here: JavaScript, RoR, SaaS, Python)
    – Skills Tagger: standardize unambiguous phrase variants (ror, rubyonrails, ruby on rails development, ruby rails, ruby on rail → Ruby on Rails); output: skills (unordered)
    – Skills Classifier; output: skills (ranked by relevance)
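The tokenization and tagging steps on this slide can be sketched as follows. This is a rough illustration under stated assumptions: the 6-word phrase limit and the Ruby on Rails variants come from the slide, while the tokenizer regex, the `VARIANTS` table layout, the entries for JavaScript, Python, and SaaS, and all function names are hypothetical.

```python
import re

MAX_PHRASE_WORDS = 6  # the slide says phrases are up to 6 words long

# Hypothetical variant table: lowercase phrase -> standardized skill name.
VARIANTS = {
    "ror": "Ruby on Rails",
    "rubyonrails": "Ruby on Rails",
    "ruby on rails development": "Ruby on Rails",
    "ruby rails": "Ruby on Rails",
    "ruby on rail": "Ruby on Rails",
    "javascript": "JavaScript",  # assumed entry
    "python": "Python",          # assumed entry
    "saas": "SaaS",              # assumed entry
}


def tokenize(text):
    """Split text into lowercase word tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def candidate_phrases(tokens, max_words=MAX_PHRASE_WORDS):
    """Yield every contiguous phrase of 1..max_words tokens."""
    for start in range(len(tokens)):
        for length in range(1, max_words + 1):
            if start + length > len(tokens):
                break
            yield " ".join(tokens[start:start + length])


def tag_skills(text):
    """Return the unordered set of standardized skills found in the text."""
    return {VARIANTS[p] for p in candidate_phrases(tokenize(text)) if p in VARIANTS}
```

Calling `tag_skills` on the second sentence of the slide's example profile yields an unordered set of skills; ranking them by relevance is left to the classifier stage.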
  • Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • Skills Classification on Member Profiles
    The skills classifier computes the likelihood that a member has a skill, based on the member’s profile, other profiles that share common attributes, and their connections.
    – Tagging: tokenize free text into phrase tags (input: profile text)
    – Standardization: transform tags into potential skills
    – Inference: rank skills by likelihood (input: profile attributes & network signals)
  • Skill Inference
    How suggested/inferred skills work:
    – Profiles with skills help build a massive dataset of attributes (attribute: skills).
    Example with a title:
      Title              | Skill | Occurrences
      Software Engineer  | Java  | 100 000
      Software Engineer  | C++   | 88 000
      …                  | …     | …
    Pipeline (diagram): Profile → Extract attributes → Feature vectors (Company ID, Title ID, Groups ID, Industry ID, …) → Skills Classifier → Skills (ranked by likelihood)
  • Skill Inference
    How suggested/inferred skills work:
    – The skill likelihood is a conditional model
    – Probabilities are combined using a Naïve Bayes classifier
    Example: if you are an engineer at Apple, you probably know about iPhone Development.
    Pipeline (diagram): Profile → Extract attributes → Feature vectors (Company ID, Title ID, Groups ID, Industry ID, …) → Skills Classifier → Skills (ranked by likelihood)
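A minimal sketch of combining attribute-conditional probabilities with Naive Bayes, assuming each (profile, skill) pair counts as one training example and using add-alpha smoothing. The class name, the smoothing scheme, and the toy profiles are all hypothetical; the slides do not describe the actual model beyond "Naïve Bayes".

```python
import math
from collections import Counter, defaultdict


class SkillNaiveBayes:
    """Toy Naive Bayes: score(skill) = log P(skill) + sum of log P(attribute | skill)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # add-alpha smoothing (assumed)
        self.skill_counts = Counter()             # skill -> number of profiles listing it
        self.attr_counts = defaultdict(Counter)   # skill -> (field, value) -> count
        self.values_per_field = defaultdict(set)  # field -> distinct values seen

    def fit(self, profiles):
        """profiles: iterable of (attributes dict, list of skills)."""
        for attributes, skills in profiles:
            for field, value in attributes.items():
                self.values_per_field[field].add(value)
            for skill in skills:
                self.skill_counts[skill] += 1
                for item in attributes.items():
                    self.attr_counts[skill][item] += 1
        return self

    def log_score(self, skill, attributes):
        total = sum(self.skill_counts.values())
        score = math.log(self.skill_counts[skill] / total)  # log prior
        for field, value in attributes.items():
            vocab = len(self.values_per_field[field]) + 1   # +1 bucket for unseen values
            count = self.attr_counts[skill][(field, value)]
            score += math.log(
                (count + self.alpha) / (self.skill_counts[skill] + self.alpha * vocab)
            )
        return score

    def rank(self, attributes):
        """Return all known skills, most likely first."""
        return sorted(
            self.skill_counts,
            key=lambda skill: self.log_score(skill, attributes),
            reverse=True,
        )


# Invented toy data, for illustration only.
TOY_PROFILES = [
    ({"title": "Software Engineer", "company": "Apple"}, ["iPhone Development", "Java"]),
    ({"title": "Software Engineer", "company": "Apple"}, ["iPhone Development"]),
    ({"title": "Software Engineer", "company": "Acme"}, ["Java"]),
    ({"title": "Accountant", "company": "Acme"}, ["Microsoft Office"]),
]
model = SkillNaiveBayes().fit(TOY_PROFILES)
```

On this toy data, `model.rank({"title": "Software Engineer", "company": "Apple"})` places iPhone Development first, mirroring the slide's "engineer at Apple" example. Working in log space avoids underflow when many attributes are multiplied together.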
  • Skill Suggestions for Your LinkedIn Profile: 4% conversion; 49% conversion
  • Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • Social Tagging via Skill Endorsements
  • Suggesting Endorsements
    – People-skill combinations in a member’s network
    – Binary classification
    – Features:
      – Skill inference score
      – Company overlap
      – School overlap
      – Group overlap
      – Industry and functional area similarity
      – Title similarity
      – Site interactions
      – Co-interactions
    Pipeline (diagram): Candidate generation → Feature vectors (Company, Title, Groups, Industry, …) → Classifier → Suggested Endorsements (ranked by likelihood)
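The candidate generation and binary classification steps can be sketched as below. This is only an illustration: the feature names echo the slide, but the logistic form, the weights, the bias, and every function name are assumptions, and a real system would learn its parameters from labeled endorsement data.

```python
import math

# Hypothetical weights and bias, chosen by hand for illustration.
WEIGHTS = {
    "skill_inference_score": 2.0,
    "company_overlap": 1.0,
    "school_overlap": 0.5,
    "title_similarity": 0.8,
}
BIAS = -2.0


def endorsement_probability(features):
    """Logistic score for one (person, skill) candidate; missing features count as 0."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def generate_candidates(network_skills):
    """Yield every (person, skill) pair in a member's network."""
    for person, skills in network_skills.items():
        for skill in skills:
            yield person, skill


def suggest_endorsements(network_skills, featurize, top_k=3):
    """Rank candidates by predicted probability and return the top_k pairs."""
    scored = [
        (endorsement_probability(featurize(person, skill)), person, skill)
        for person, skill in generate_candidates(network_skills)
    ]
    scored.sort(reverse=True)
    return [(person, skill) for _, person, skill in scored[:top_k]]
```

Here `featurize` is a caller-supplied function returning a feature dictionary for a (person, skill) pair, which keeps feature extraction separate from scoring.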
  • Social Tagging Accelerates Adoption (chart labels: skill marketing, skill recommendations, skill endorsements)
  • Can We Find Influencers in Venture Capital?
  • Which Skills Are Important for a Data Scientist?
  • What Technologies Are Professionals Adopting?
  • Data Amplifies Desire
    1. Desire + Social Proof
    2. Viral Loops + Network Effects
    3. Data Catalyst + Recommendation Algorithms
  • Infrastructure
    • Apache Hadoop: Parallel processing architecture
    • Apache Kafka: Ingress pipes
    • Azkaban: Hadoop scheduler
    • Voldemort: Egress database
    • Apache Pig: High-level MR language
    • DataFu: Convenience routines
    http://data.linkedin.com
    R. Sumbaly, J. Kreps, and S. Shah. “The ‘Big Data’ ecosystem at LinkedIn”. In SIGMOD 2013 (to appear).
  • Learning More: data.linkedin.com