Skills, Reputation, and Search

This keynote presentation describes the critical role that search and Lucene have in building next-generation products that understand reputation and relevance. We also describe how data science and machine learning have been applied at LinkedIn to collect, interpret, and index data around topical reputation.

Lucene Revolution is the biggest open source conference dedicated to Apache Lucene/Solr.

Published in: Technology, Education

Transcript

  • 1. Skills, Reputation, and Search. Pete Skomoroch, Principal Data Scientist, LinkedIn
  • 2. Vision: Create Economic Opportunity for Every Professional (graphic label: Location)
  • 3. LinkedIn: The Professional Profile of Record. 200+M Members, 200M Member Profiles. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 4. LinkedIn Search: Connecting Talent with Opportunity
  • 5. Skills Correlated with the Job Title “Data Scientist”
  • 6. Skills Related to “Big Data”
  • 7. Information Retrieval
  • 8. Soul Retrieval
  • 9. (no slide text)
  • 10. Lucene on LinkedIn
  • 11. Lucene Endorsement Graph
  • 12. Solr on LinkedIn
  • 13. Solr Endorsement Graph
  • 14. Reputation: Building the Endorsement Graph
  • 15. Viral Growth: 1 Billion Endorsements in 5 Months
  • 16. How Did We Gather this Data? 1. Desire + Social Proof; 2. Viral Loops + Network Effects; 3. Data Foundation + Recommendation Algorithms
  • 17. 1) Desire & Social Proof
  • 18. 2) Viral Loops & Network Effects. Diagram labels: A endorses B; B notified; B “accepts” endorsement; B endorses C; B endorses D; endorsement recommendations; email notification; news feed
  • 19. 3) Data Foundation: Skills & Suggested Skills
  • 20. Data Foundation: LinkedIn Skills
  • 21. Social Tagging Accelerates Adoption: suggested endorsements, skill recommendations, skill marketing, virality only
  • 22. Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • 23. Skill Discovery: Unsupervised Topics from Profiles (diagram label: Extract)
  • 24. Topic Clustering & Phrase Sense Disambiguation
  • 25. Deduplication Signals from Mechanical Turk
  • 26. Sample Task for Mechanical Turk Workers
  • 27. Skill Phrase Deduplication
  • 28. Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • 29. Tagging Skill Phrases. Tagging: extract potential skill phrases from text; standardize unambiguous phrase variants. Example text: “Lead designer and engineer for the implementation of a user-centric, fully-configurable UI for data aggregation and reporting. Developed over 20 SaaS custom applications using Python, Javascript and RoR.” Tagged skills: JavaScript, RoR, SaaS, Python. Variants standardized to Ruby on Rails: ror, rubyonrails, ruby on rails development, ruby rails, ruby on rail. Diagram labels: Document (ex: Profile), Tokenization, Skills Tagger, Phrases (up to 6 words), Skills Classifier, Skills (unordered), Skills (ranked by relevance)
  • 30. Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • 31. Skill Inference. How suggested/inferred skills work: the skill likelihood is a conditional model; probabilities are combined using a Naïve Bayes classifier. If you are an engineer at Apple, you probably know about iPhone Development. Diagram labels: Profile, Extract attributes (Company ID, Title ID, Groups ID, Industry ID, …), Feature Vectors, Skills Classifier, Skills (ranked by likelihood)
  • 32. Skill Recommendations for Your LinkedIn Profile: 49% Conversion; 4% Conversion
  • 33. Outline: skill discovery, skill tagging, skill recommendations, suggested endorsements
  • 34. Social Tagging via Skill Endorsements
  • 35. Social Tagging Accelerates Adoption: skill endorsements, skill recommendations, skill marketing
  • 36. Data Amplifies Desire: 1. Desire + Social Proof; 2. Viral Loops + Network Effects; 3. Data Catalyst + Recommendation Algorithms
  • 37. Over 58 Million Profiles are now Tagged with Skills
  • 38. All This Data Flows Back Into Our Lucene Index
  • 39. Helping us Connect Talent & Opportunity (graphic label: Location)
  • 40. Questions? We’re hiring: data.linkedin.com, @peteskomoroch
  • 41. CONTACT: Pete Skomoroch, @peteskomoroch, http://data.linkedin.com
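
Slide 29 describes tagging as extracting candidate phrases of up to 6 words from text and standardizing unambiguous variants. The sketch below is one way that step could look, not LinkedIn's implementation. The `CANONICAL` table, `tokenize`, and `tag_skills` are hypothetical names; only the Ruby on Rails variants and the 6-word limit come from the slide. The relevance-ranking classifier from the slide is not modeled here.

```python
import re

# Hypothetical variant table: lowercase phrase variant -> canonical skill name.
# The Ruby on Rails variants are from slide 29; other entries are illustrative.
CANONICAL = {
    "ror": "Ruby on Rails",
    "rubyonrails": "Ruby on Rails",
    "ruby on rails development": "Ruby on Rails",
    "ruby rails": "Ruby on Rails",
    "ruby on rail": "Ruby on Rails",
    "ruby on rails": "Ruby on Rails",
    "python": "Python",
    "javascript": "JavaScript",
    "saas": "SaaS",
}
MAX_PHRASE_WORDS = 6  # slide 29: phrases of up to 6 words


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tag_skills(text):
    """Return canonical skills in order of first appearance.

    At each position the longest matching phrase wins, so
    "ruby on rails development" is consumed as one variant.
    """
    tokens = tokenize(text)
    found = []
    i = 0
    while i < len(tokens):
        longest = min(MAX_PHRASE_WORDS, len(tokens) - i)
        for n in range(longest, 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in CANONICAL:
                skill = CANONICAL[phrase]
                if skill not in found:
                    found.append(skill)
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no phrase starts here; move on one token
    return found


if __name__ == "__main__":
    sample = ("Developed over 20 SaaS custom applications "
              "using Python,Javascript and RoR.")
    print(tag_skills(sample))
```

A dictionary lookup only handles variants that are already known to be unambiguous. Ambiguous phrases would need the disambiguation and deduplication signals mentioned on slides 24 to 27.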
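
Slide 31 says inferred skills come from a conditional model whose probabilities are combined with a Naïve Bayes classifier over profile attributes such as company and title. The following is a minimal sketch of that idea under stated assumptions. The class name `SkillNaiveBayes`, the attribute keys, the Laplace smoothing, and the toy training data are all illustrative and not taken from the talk.

```python
import math
from collections import Counter, defaultdict


class SkillNaiveBayes:
    """Rank skills by P(skill) * prod_i P(attribute_i = value_i | skill)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing (assumed)
        self.total = 0                              # number of training profiles
        self.skill_counts = Counter()               # skill -> profiles with it
        self.feature_counts = defaultdict(Counter)  # skill -> (attr, value) -> count
        self.values = defaultdict(set)              # attr -> distinct values seen

    def fit(self, profiles):
        """profiles: iterable of (features dict, list of skills) pairs."""
        for features, skills in profiles:
            self.total += 1
            for attr, value in features.items():
                self.values[attr].add(value)
            for skill in set(skills):
                self.skill_counts[skill] += 1
                for item in features.items():
                    self.feature_counts[skill][item] += 1
        return self

    def rank(self, features):
        """Return skills sorted from most to least likely for these features."""
        scores = {}
        for skill, n in self.skill_counts.items():
            log_p = math.log(n / self.total)  # prior P(skill)
            for attr, value in features.items():
                seen = self.feature_counts[skill][(attr, value)]
                n_values = len(self.values.get(attr, ()))
                # Smoothed estimate of P(attr = value | skill)
                log_p += math.log((seen + self.alpha) /
                                  (n + self.alpha * n_values))
            scores[skill] = log_p
        return sorted(scores, key=scores.get, reverse=True)


# Toy training data (hypothetical), echoing the slide's Apple example.
TRAINING = [
    ({"company": "Apple", "title": "Engineer"},
     ["iPhone Development", "Objective-C"]),
    ({"company": "Apple", "title": "Engineer"}, ["iPhone Development"]),
    ({"company": "Acme", "title": "Analyst"}, ["Excel"]),
    ({"company": "Acme", "title": "Engineer"}, ["Python"]),
]

if __name__ == "__main__":
    model = SkillNaiveBayes().fit(TRAINING)
    print(model.rank({"company": "Apple", "title": "Engineer"}))
```

Because the evidence term P(features) is the same for every skill, sorting by the prior times the per-attribute likelihoods gives the same order as sorting by the naive-Bayes posterior. Log probabilities are summed rather than multiplying raw probabilities to avoid underflow when many attributes are used.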