Talent Search and Recommendation Systems at LinkedIn: Practical Challenges and Lessons Learned

Qi Guo, Sahin Cem Geyik, Bo Hu, Cagri Ozcaglar,
Ketan Thakkar, Xianren Wu, Krishnaram Kenthapadi
AI @ LinkedIn
SIGIR 2018

The Team
Qi Guo Sahin Cem Geyik Bo Hu Cagri Ozcaglar
Ketan Thakkar Xianren Wu Krishnaram Kenthapadi

Contents
• Introduction
• Ranking Models for Talent Search
• Personalization
• Talent Search Architecture
• Summary

LinkedIn Talent Solutions:
~65% of LinkedIn’s Annual Revenue
A HIRING ECOSYSTEM

LinkedIn Recruiter
MAJOR PRODUCT
A Talent Search and
Recommendation System

Recruiter Search
• Criteria-Based Search
• A recruiter has specific requisitions to fill
• Candidate Recommendation System
• A recruiter may want many qualified candidates and goes through pages of results
• Considers Both Sides of the Talent Marketplace
• Talents are limited resources

OPTIMIZATION OBJECTIVE: # of InMail Accepts
1. Search (Recruiter)
2. Send InMail (Recruiter to Candidate)
3. Accept (Candidate)

Ranking Models for Talent Search

Number of InMail Accepts Per Seat: 30% YoY
OVERALL IMPROVEMENT

Go Non-Linear with Tree Model
• Before: Linear Model optimized for NDCG with Coordinate Ascent
• After: XGBoost Tree Model
• Captures feature interactions
• XGBoost: gradient boosting tree models for richer model complexity
• Online Results:
METRIC    PRECISION@5   PRECISION@25   OVERALL ACCEPT
Lift      +7.5%         +7.4%          +5.1%
P-Value   2.1e-4        4.8e-4         0.01
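
The PRECISION@k columns can be read with a small sketch like the one below. It assumes precision@k means the fraction of the top-k ranked candidates whose InMail was accepted; the slides do not spell out the definition, and the function name and input format are hypothetical.

```python
def precision_at_k(accepted_flags, k):
    """Fraction of the top-k ranked candidates whose InMail was accepted.

    accepted_flags: booleans in ranked order, best-ranked candidate first
    (hypothetical input format, not taken from the slides).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = accepted_flags[:k]
    # Divide by k, not len(top_k), so short result lists are not rewarded.
    return sum(1 for accepted in top_k if accepted) / k


# Toy ranked list: True marks a candidate who accepted the InMail.
ranked = [True, False, True, False, False, True]
print(precision_at_k(ranked, 5))
```

A lift of +7.5% in PRECISION@5 would then be a relative increase of this quantity, averaged over searches, between the treatment and control models.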

Search for “Dentist”, a Software Engineer ranks high
PROBLEM OBSERVED

Search for “Dentist”, a Software Engineer ranks high
• The model focused too much on promoting active job-seeking candidates
• We want our ranking to be more context-aware
f(Recruiter Context, Query Context, Candidate) => Accept? Reject?

Context-Aware Ranking – Pairwise Training
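• Pair up two candidates from the same search request:
f(Recruiter Context, Query Context, Candidate 1) - f(Recruiter Context, Query Context, Candidate 2) => Is Accept 1 > Accept 2?
• Recruiter Context and Query Context are the shared context of the pair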
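
A minimal sketch of pairwise training data and a pairwise loss follows. It pairs candidates from the same search request and scores them with a linear function as a stand-in; the deck's model is a pairwise XGBoost, and the data layout, function names, and logistic loss here are illustrative assumptions.

```python
import math
from itertools import combinations


def make_pairs(results):
    """Build (preferred, other) feature pairs from ONE search request.

    results: list of (feature_vector, accepted) tuples that share the same
    recruiter and query context (hypothetical data layout).
    """
    pairs = []
    for (x_a, y_a), (x_b, y_b) in combinations(results, 2):
        if y_a == y_b:
            continue  # a tie carries no preference signal
        pairs.append((x_a, x_b) if y_a else (x_b, x_a))
    return pairs


def score(weights, x):
    """Linear stand-in for the ranking function f."""
    return sum(w * v for w, v in zip(weights, x))


def pairwise_logistic_loss(weights, pairs):
    """Mean of log(1 + exp(-(f(preferred) - f(other)))) over all pairs."""
    if not pairs:
        return 0.0
    total = 0.0
    for preferred, other in pairs:
        margin = score(weights, preferred) - score(weights, other)
        total += math.log1p(math.exp(-margin))
    return total / len(pairs)
```

Because both candidates in a pair come from the same request, anything that depends only on the recruiter or the query is identical on both sides, so the model is pushed to learn from features that relate the candidate to that context.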

Context-Aware Ranking
• Before: Pointwise XGBoost
• After: Pairwise XGBoost with Context-Aware Features
• Recruiter Context: Personalization features
• Query Context: Query-Candidate matching features
• Online Results:
METRIC    PRECISION@5   PRECISION@25   OVERALL ACCEPT
Lift      +18.2%        +13.7%         +8%
P-Value   1e-16         1.1e-11        9.6e-4

Search for “Machine Learning Engineer”,
desirable to include some Data Scientists

Representation Learning
• Fuzzy semantic match on title ids, skill ids, company ids etc.
• Unsupervised Graph Embedding
• Co-Occurrence Graph based on profile data
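
The idea of a fuzzy match from co-occurrence can be sketched as follows. This is not the graph embedding method from the deck: it compares raw co-occurrence count vectors with cosine similarity as a simple stand-in, and the entity ids and toy profiles are made up for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_vectors(profiles):
    """Map each entity id to a Counter of the entities it shares a profile with."""
    vectors = defaultdict(Counter)
    for entities in profiles:
        for a, b in combinations(sorted(set(entities)), 2):
            vectors[a][b] += 1  # undirected edge a-b, weighted by count
            vectors[b][a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical profiles: each is a bag of title and skill ids.
profiles = [
    ["title:ml_engineer", "skill:python", "skill:tensorflow"],
    ["title:data_scientist", "skill:python", "skill:statistics"],
    ["title:ml_engineer", "skill:python", "skill:statistics"],
    ["title:dentist", "skill:oral_surgery"],
]
vectors = cooccurrence_vectors(profiles)
print(cosine(vectors["title:ml_engineer"], vectors["title:data_scientist"]))
print(cosine(vectors["title:ml_engineer"], vectors["title:dentist"]))
```

In this toy data the two titles that share skills come out similar while the unrelated title scores zero, which is the behaviour a learned embedding is meant to provide at scale.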

Representation Learning
• Before: XGBoost
• After: XGBoost with Title Similarity Feature
• Based on unsupervised graph embedding
• Online Results:
Lift +2% +1.8% +3%
P-Value 0.2 0.25 0.11

Deep Learning?
• Differentiable Programming with TensorFlow
• Flexible for model engineering
• Offline result does not justify the effort yet.
• Offline Results (Pairwise NN vs. Pointwise XGBoost):
METRIC    PRECISION@1   PRECISION@5   PRECISION@25
Lift      +5.3%         +2.8%         +1.7%

Personalization for
Talent Search

Entity-Level Personalization with GLMix
• GLMix: Generalized Linear Mixed Models
• GLMix: global model + per-entity models
• We added per-recruiter model and per-contract/company model
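
A rough sketch of how a global model and per-entity models might combine at scoring time is shown below. It assumes a logistic link and that every model sees the same sparse feature dictionary; both are simplifications, and all names are hypothetical.

```python
import math


def dot(weights, features):
    """Sparse dot product; features missing from the model contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def glmix_score(features, recruiter_id, contract_id,
                global_w, recruiter_w, contract_w):
    """sigmoid(global term + per-recruiter term + per-contract term).

    recruiter_w / contract_w map an entity id to that entity's own weights.
    An entity with no trained model contributes nothing, so the score falls
    back to the global model.
    """
    logit = dot(global_w, features)
    logit += dot(recruiter_w.get(recruiter_id, {}), features)
    logit += dot(contract_w.get(contract_id, {}), features)
    return 1.0 / (1.0 + math.exp(-logit))
```

The per-entity terms act as corrections on top of the global model: a recruiter with enough history gets personalized weights, while a new recruiter is scored by the global model alone.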

Entity-Level Personalization with GLMix
• Model Ensemble
• Nonlinearity via tree interaction features
• Each leaf node is a feature
• Offline Results (GLMix vs. Pairwise XGBoost):
METRIC    PRECISION@1   PRECISION@5   PRECISION@25
Lift      +8.5%         +4.7%         +2.0%
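
"Each leaf node is a feature" can be illustrated with the sketch below: every tree routes an example to exactly one leaf, and that leaf becomes a binary feature for a downstream linear model. The nested-dict tree format and feature names are hypothetical, not the XGBoost model format.

```python
def leaf_index(tree, features):
    """Walk a binary decision tree and return the id of the leaf reached."""
    node = tree
    while "leaf" not in node:
        go_left = features[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def leaf_features(trees, features):
    """One binary indicator per tree, named after the leaf the example lands in."""
    return {f"tree{t}_leaf{leaf_index(tree, features)}": 1.0
            for t, tree in enumerate(trees)}


# Two toy trees (hypothetical splits).
TREES = [
    {"feature": "title_match", "threshold": 0.5,
     "left": {"leaf": 0}, "right": {"leaf": 1}},
    {"feature": "years", "threshold": 5,
     "left": {"leaf": 0},
     "right": {"feature": "title_match", "threshold": 0.8,
               "left": {"leaf": 1}, "right": {"leaf": 2}}},
]
print(leaf_features(TREES, {"title_match": 0.9, "years": 3}))
```

Since a leaf encodes a conjunction of split conditions, a linear model over these indicators picks up feature interactions that it could not express on the raw features.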

Using Recruiter Search requires a lot of skill.

Recommended Matches
SIMPLIFIED EXPERIENCE
A Stream of Recommended Candidates

In-Session Personalization
• Step 1: Segment the Space
• Query Intent Clustering
• Step 2: Evaluate each segment
• Multi-Armed Bandits
• Step 3: Modify each segment
• Term Weight Updates
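
Step 2 can be sketched with a standard bandit policy. The slides only say "Multi-Armed Bandits"; the example below uses UCB1 as one possible choice, treats each query intent cluster as an arm, and assumes a reward in [0, 1] derived from recruiter feedback on a shown candidate. Class and arm names are hypothetical.

```python
import math


class UCB1:
    """UCB1 bandit: one arm per query intent cluster (assumed setup)."""

    def __init__(self, arms):
        self.counts = {arm: 0 for arm in arms}
        self.rewards = {arm: 0.0 for arm in arms}

    def select(self):
        """Pick the next cluster to draw a recommended candidate from."""
        for arm, count in self.counts.items():
            if count == 0:
                return arm  # try every arm once before using the bound
        total = sum(self.counts.values())

        def upper_bound(arm):
            mean = self.rewards[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(self.counts, key=upper_bound)

    def update(self, arm, reward):
        """Record feedback (reward in [0, 1]) for the chosen cluster."""
        self.counts[arm] += 1
        self.rewards[arm] += reward
```

Within a session, the loop would call `select()`, show a candidate from that cluster, and feed the recruiter's response back through `update()`, so clusters that earn positive feedback are shown more often while the others are still explored occasionally.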

In-Session Personalization: Results

Search and Retrieval Architecture
• LinkedIn’s Galene is built on top of Lucene.
• Three main components:
• The search index on the searcher,
• Query fanout through the broker, and
• Live updates to the index through the live-updater.
• The query language is similar to Lucene's, with OR, AND, and NOT operators.
• The search index contains two types of fields:
• Inverted Fields
• Forward Fields
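
The difference between the two field types can be sketched with a toy in-memory index. This is not the Galene or Lucene API: the class, its methods, and the stored attributes are hypothetical and only illustrate inverted lookup (term to documents), forward lookup (document to attributes), and boolean queries.

```python
from collections import defaultdict


class TinyIndex:
    """Toy index with inverted fields and forward fields (illustrative only)."""

    def __init__(self):
        self.inverted = defaultdict(set)  # term -> ids of documents containing it
        self.forward = {}                 # document id -> stored attributes

    def add(self, doc_id, terms, attributes):
        for term in terms:
            self.inverted[term].add(doc_id)
        self.forward[doc_id] = attributes

    def search(self, all_of=(), any_of=(), none_of=()):
        """Evaluate (AND of all_of) AND (OR of any_of) AND NOT (OR of none_of)."""
        result = set(self.forward)  # start from every document
        for term in all_of:
            result &= self.inverted.get(term, set())
        if any_of:
            result &= set().union(*(self.inverted.get(t, set()) for t in any_of))
        for term in none_of:
            result -= self.inverted.get(term, set())
        return sorted(result)


index = TinyIndex()
index.add(1, {"engineer", "python"}, {"static_rank": 0.9})
index.add(2, {"engineer", "java"}, {"static_rank": 0.7})
index.add(3, {"dentist"}, {"static_rank": 0.8})
print(index.search(all_of=["engineer"], none_of=["java"]))
```

Inverted fields answer "which members match this term", while forward fields answer "what do we know about this member", which is what scoring needs once a member has been retrieved.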

Search and Retrieval Architecture
• Static Rank
• An auxiliary rank for members to help with retrieving at scale
• Based on member profile and activity
• Early termination
• The index is partitioned into N shards; each shard retrieves and scores candidates
• Not all members in a shard can be retrieved, so the query is terminated early on the basis of static rank.
• Galene Facet Counting:
• Galene supports facet counting (such as region, titles, etc.) for any given query.
• Uses statistical counting approximation based on sample in each shard
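
Early termination and sample-based facet counting within one shard can be sketched together. The sketch assumes the shard is stored in descending static-rank order and that facet counts are scaled up proportionally from the scanned prefix; the slides do not describe the actual estimator, and all names and data are hypothetical.

```python
from collections import Counter


def retrieve_with_early_termination(shard, matches, limit):
    """Scan a shard in static-rank order and stop after `limit` matches.

    shard: list of member dicts, pre-sorted by descending static rank.
    Returns (retrieved members, number of members scanned).
    """
    retrieved = []
    scanned = 0
    for member in shard:
        if len(retrieved) >= limit:
            break  # early termination: lower static-rank members are skipped
        scanned += 1
        if matches(member):
            retrieved.append(member)
    return retrieved, scanned


def approximate_facet_counts(retrieved, scanned, shard_size, facet):
    """Scale facet counts from the scanned sample up to the whole shard."""
    scale = shard_size / scanned if scanned else 0.0
    counts = Counter(member[facet] for member in retrieved)
    return {value: round(count * scale) for value, count in counts.items()}


# Toy shard, already sorted by static rank.
shard = [
    {"id": 1, "static_rank": 0.9, "title": "engineer", "region": "US"},
    {"id": 2, "static_rank": 0.8, "title": "dentist", "region": "US"},
    {"id": 3, "static_rank": 0.7, "title": "engineer", "region": "EU"},
    {"id": 4, "static_rank": 0.6, "title": "engineer", "region": "US"},
    {"id": 5, "static_rank": 0.5, "title": "engineer", "region": "EU"},
    {"id": 6, "static_rank": 0.4, "title": "dentist", "region": "EU"},
]
hits, scanned = retrieve_with_early_termination(
    shard, lambda m: m["title"] == "engineer", limit=2)
print(approximate_facet_counts(hits, scanned, len(shard), "region"))
```

The proportional scale-up is only accurate when the scanned prefix resembles the rest of the shard; because the prefix is ordered by static rank, a production estimator would need to account for that bias.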

Layered Ranking Architecture
• L1: Scoops into the talent pool to score and rank a larger set of candidates.
• L2: Refines the short-listed candidates by applying more dynamic features, fetched from an external cache.
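
The two layers can be sketched as a cheap first pass over many candidates followed by a richer second pass over a short list. The scoring functions, the cache lookup, and the feature names below are placeholders, not the production models.

```python
def layered_rank(candidates, l1_score, l2_score, fetch_dynamic, l1_keep, page_size):
    """Two-layer ranking sketch.

    candidates: list of dicts with at least an "id" key.
    fetch_dynamic: callable id -> dict of extra features (stands in for the
    external cache lookup).
    """
    # L1: cheap score over the full retrieved set, keep only a short list.
    shortlist = sorted(candidates, key=l1_score, reverse=True)[:l1_keep]
    # L2: enrich only the short list with dynamic features, then re-rank.
    enriched = [{**c, **fetch_dynamic(c["id"])} for c in shortlist]
    return sorted(enriched, key=l2_score, reverse=True)[:page_size]


# Toy data: "match" is a static feature, "recent_activity" lives in the cache.
candidates = [{"id": 1, "match": 0.9}, {"id": 2, "match": 0.8}, {"id": 3, "match": 0.1}]
cache = {1: {"recent_activity": 0.1}, 2: {"recent_activity": 0.9}}
top = layered_rank(
    candidates,
    l1_score=lambda c: c["match"],
    l2_score=lambda c: c["match"] + c["recent_activity"],
    fetch_dynamic=lambda cid: cache.get(cid, {"recent_activity": 0.0}),
    l1_keep=2,
    page_size=2,
)
print([c["id"] for c in top])
```

Restricting the cache lookups to the L1 short list keeps the expensive features affordable, at the cost that a candidate dropped by L1 can never be recovered by L2.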

Summary
• Talent Search
• Criteria Search, Recommendation System, Marketplace
• Talent Search Ranking
• Context-Aware Pairwise Training
• Representation Learning & Deep Learning
• GLMix Personalization
• In-Session Personalization

Thank You
Qi Guo Sahin Cem Geyik Bo Hu Cagri Ozcaglar
Ketan Thakkar Xianren Wu Krishnaram Kenthapadi
