Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Talent Search and Recommendation Systems at LinkedIn: Practical Challenges and Lessons Learned


Published on

*** Please check out our LinkedIn Engineering blog post: ***

LinkedIn Talent Solutions business contributes to around 65% of LinkedIn’s annual revenue, and provides tools for job providers to reach out to potential candidates and for job seekers to find suitable career opportunities. LinkedIn’s job ecosystem has been designed as a platform to connect job providers and job seekers, and to serve as a marketplace for efficient matching between potential candidates and job openings. A key mechanism to help achieve these goals is the LinkedIn Recruiter product, which enables recruiters to search for relevant candidates and obtain candidate recommendations for their job postings.

We highlight a few unique information retrieval, system, and modeling challenges associated with talent search and recommendation systems.

In this talk, we will present how we formulated and addressed the problems, the overall system design and architecture, the challenges encountered in practice, and the lessons learned from the production deployment of these systems at LinkedIn. By presenting our experiences of applying techniques at the intersection of recommender systems, information retrieval, machine learning, and statistical modeling in a large-scale industrial setting and highlighting the open problems, we hope to stimulate further research and collaborations within the SIGIR community.

Published in: Technology

Talent Search and Recommendation Systems at LinkedIn: Practical Challenges and Lessons Learned

  1. 1. Talent Search and Recommendation Systems at LinkedIn Practical Challenges and Lessons Learned Qi Guo, Sahin Cem Geyik, Bo Hu, Cagri Ozcaglar, Ketan Thakkar, Xianren Wu, Krishnaram Kenthapadi AI @ LinkedIn +SIGIR 2018
  2. 2. + The Team Qi Guo Sahin Cem Geyik Bo Hu Cagri Ozcaglar Ketan Thakkar Xianren Wu Krishnaram Kenthapadi
  3. 3. Contents • Introduction • Ranking Models for Talent Search • Personalization • Talent Search Architecture • Summary
  4. 4. Introduction
  5. 5. LinkedIn Talent Solution: ~65% of LinkedIn’s Annual Revenue A H I R I N G E C O S Y S T E M
  6. 6. LinkedIn Recruiter MAJOR PRODUCT A Talent Search and Recommendation System
  7. 7. Recruiter Search • Criteria-Based Search • A recruiter has specific requisitions to fill • Candidate Recommendation System • A recruiter may want many qualified candidates, goes through pages • Considers Both Sides of the Talent Marketplace • Talents are limited resources
  8. 8. # of InMail Accepts OPTIMIZATION OBJECTIVE: 3. Accept 2. Send InMail Recruiter Candidate 1. Search
  9. 9. Ranking Models for Talent Search
  10. 10. Number of InMail Accepts Per Seat: 30% YoY O V E R A L L I M P R O V E M E N T
  11. 11. Go Non-Linear with Tree Model • Before: Linear Model optimized for NDCG with Coordinate Ascent • After: XGBoost Tree Model • Captures feature interactions • XGBoost: gradient boosting tree models for richer model complexity • Online Results: METRIC PRECISION@5 PRECISION@25 OVERALL ACCEPT Lift +7.5% +7.4% +5.1% P-Value 2.1e-4 4.8e-4 0.01
  12. 12. Search for “Dentist”, a Software Engineer ranks high P R O B L E M O B S E R V E D
  13. 13. Search for “Dentist”, a Software Engineer ranks high P R O B L E M O B S E R V E D • Focused too much on promoting active job-seeking candidate • We want our ranking to be more context-aware f( , , ) => Accept? Reject? Recruiter Context Query Context
  14. 14. Context-Aware Ranking – Pairwise Training f( , , )1 - f( , , )2Recruiter Context Query Context { Shared Context => • Pair up two candidates from the same search request: Accept? 1 Accept? 2?>
  15. 15. Context-Aware Ranking • Before: Pointwise XGBoost • After: Pairwise XGBoost with Context-Aware Features • Recruiter Context: Personalization features • Query Context: Query-Candidate matching features • Online Results: METRIC PRECISION@5 PRECISION@25 OVERALL ACCEPT Lift +18.2% +13.7% +8% P-Value 1e-16 1.1e-11 9.6e-4
  16. 16. Search for “Machine Learning Engineer”, desirable to include some Data Scientists P R O B L E M O B S E R V E D
  17. 17. Representation Learning • Fuzzy semantic match on title ids, skill ids, company ids etc. • Unsupervised Graph Embedding • Co-Occurrence Graph based on profile data
  18. 18. Representation Learning • Before: XGBoost • After: XGBoost with Title Similarity Feature • Based on unsupervised graph embedding • Online Results: METRIC PRECISION@5 PRECISION@25 OVERALL ACCEPT Lift +2% +1.8% +3% P-Value 0.2 0.25 0.11
  19. 19. Deep Learning? • Differentiable Programming with TensorFlow • Flexible for model engineering • Offline result does not justify the effort yet. • Offline Results (Pairwise NN v.s. Pointwise XGBoost): METRIC PRECISION@1 PRECISION@5 PRECISION@25 Lift +5.3% +2.8% +1.7%
  20. 20. Personalization for Talent Search
  21. 21. Entity-Level Personalization with GLMix • GLMix: Generalized Linear Mixed Models • GLMix: global model + per-entity models • We added per-recruiter model and per-contract/company model
  22. 22. Entity-Level Personalization with GLMix • Model Ensemble • Nonlinearity via tree interaction features • Each leaf node is a feature • Offline Results (GLMix vs. Pairwise XGBoost): METRIC PRECISION@1 PRECISION@5 PRECISION@25 Lift +8.5% +4.7% +2.0%
  23. 23. Using Recruiter Search requires a lot of skills. P R O B L E M O B S E R V E D
  24. 24. A Stream of Recommended Candidates Recommended Matches SIMPLIFIED EXPERIENCE
  25. 25. In-Session Personalization • Step 1: Segment the Space • Query Intent Clustering • Step 2: Evaluate each segment • Multi-Armed Bandits • Step 3: Modify each segment • Term Weight Updates
  26. 26. In-Session Personalization: Results
  27. 27. Talent Search Architecture
  28. 28. Search and Retrieval Architecture • LinkedIn’s Galene is built on top of Lucene. • Three main components: • Search index on searcher • The fanout queries through broker, and • Live updates to the index using live-updater. • Query language is similar to Lucene with OR, AND, NOT. • The search index contains two types of fields: • Inverted Fields • Forward Fields
  29. 29. Search and Retrieval Architecture • Static Rank • An auxiliary rank for members to help with retrieving at scale • Based on member profile and activity • Early termination • Index partitioned into N-shards, each retrieves and scores candidates • Not all members in a shard can be retrieved, so query is early terminated on the basis of static rank. • Galene Facet Counting: • Galene supports facet counting (such as region, titles, etc) for any given query. • Uses statistical counting approximation based on sample in each shard
  30. 30. Layered Ranking Architecture • L1: Better to scoop into the talent pool and score/rank more candidates. • L2: Refines the short-listed talent to apply more dynamic features using external cache.
  31. 31. Summary
  32. 32. Summary • Talent Search • Criteria Search, Recommendation System, Marketplace • Talent Search Ranking • Context-Aware Pairwise Training • Representation Learning & Deep Learning • GLMix Personalization • In-Session Personalization
  33. 33. + Thank You Qi Guo Sahin Cem Geyik Bo Hu Cagri Ozcaglar Ketan Thakkar Xianren Wu Krishnaram Kenthapadi