Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Recruiters, Job Seekers and Spammers: Innovations in Job Search at LinkedIn


Published on

ECIR 2013 workshop keynote

Published in: Technology, Business
  • Be the first to comment

Recruiters, Job Seekers and Spammers: Innovations in Job Search at LinkedIn

  1. 1. Recruiters, Job Seekers and Spammers:Innovations in Job Search at LinkedIn Daria Sorokina Senior Data Scientist LinkedIn
  2. 2. Part I: Recruiters“Multiple Objective Optimization in RecommendationSystems”, Mario Rodriguez, Christian Posse, EthanZhang. RecSys‟12
  3. 3. TalentMatch Job Posting Member Profiles Ranked Talent Talent Match
  4. 4. TalentMatch Model Job Postingtitle industry …geo descriptioncompany functional area Text similarity features CandidateGeneral Current Positionexpertise titlespecialties summaryeducation tenure lengthheadline industrygeo functional areaexperience … The model can be trained on user activity signals like job ad clicks or job applications
  5. 5. TalentMatch Utility = fn(email rate, reply rate) Email Rate Recruiter Reply Job Problem! Rate seeker?
  6. 6. Job Seeker Intent PASSIVE NON-JOB- SEEKER ACTIVEModel: time till the job changeo How long will this person stay in this job after this date?o Trained on past job positions from our users profileso Accelerated failure time (AFT) modelo æ ö Ti = exp çå bk xik + sei ÷ è k ø
  7. 7. Job-SeekerFeatureExample:Attrition byIndustry Probability Time
  8. 8. TalentMatch Utility fn(email rate, reply rate)Job-Seeking Intent:16x reply rate oncareer-related mail Reply Rate
  9. 9. How: ControlledRe-ranking Ranking Score DistributionsTalent Match rankingMatch Score1, Item X, 0.98, Non-Seeker2, Item Y, 0.91, Non-Seeker--------------------------------------- Divergenc3, Item Z, 0.89, Active e score Re-ranking function f() optimize for bothImproved rankingMatch Score, Reranking Score1, Item X, 0.98, 0.98, Non-Seeker Objective Score:2, Item Z, 0.89, 0.93, Active #Active in top N--------------------------------------------3, Item Y, 0.91, 0.91, Non-Seeker
  10. 10. Part II: Job SeekersLearning to Rank. Fast and personalized.
  11. 11. Job Search.Query “Data Scientist LinkedIn”
  12. 12. Learning To Rank Regular approach – A data point is a pair: {Query, Document} – Data label: “Is this document relevant for this query?”  Can be done by crowdsourcing Job Search reality – A data point is a triple: {Query, Job position, User} – Data label: “Is this job relevant for this user who asked this query?”  Depends on the user‟s location, industry, seniority…  Too much to ask from a random person  Have to collect labels from user signals
  13. 13. We use simplified version of FairPairs(Radlinski, Joachims AAAI‟06) Clicked! ✔flipped ✗  Each pair is flipped with a 50% chance ✔not flipped  Choose pairs where ✔ only the lower document is clicked ✗ label 0not flipped  Save 1 positive (lower) ✔ label 1 and 1 negative (upper) results for the labeled ✗ data set flipped ✗
  14. 14. Fair Pairs data is not enough for training The user clicks or skips only whatever is shown Bad results are not shown So there will be no “really bad” negatives in the training data We need to add them! For queries with many results, add all results from the last page as “easy negatives” label 0 label 0 label 0 … … label 0
  15. 15. Learning To Rank – Training a Model Best models for LTR are complex ensembles of trees – See results of Yahoo Learning to Rank „10 competition – LambdaMART, BagBoo, Additive Groves, MatrixNet … Complex models come at a cost – It takes long to calculate predictions – Requires a lot of optimization, often used with multi-level ranking Can we train a simple model that will resemble a complex one? – Train a complex model – Get insights on what it looks like – Modify a simple model accordingly
  16. 16. Training a Simple Model using a Complex Model Base simple model – logistic or linear regression p log = b0 + b1 x1 + b2 x2 +... + bn xn 1- p – Does not handle well features with non-linear effects – Does not handle interactions (e.g., if-then-else rules) Target complex model – Additive Groves – (Sorokina, Caruana, Riedewald ECML‟07)(1/N)· +…+ + (1/N)· +…+ +…+ (1/N)· +…+ – Comes with interaction detection and effect visualization tools
  17. 17. Improving LR – Feature Transformations Additive Groves can model and visualize non-linear effects  Approximate the effect curve average prediction with a polynomial transform T(x) – anything simple will do  Apply T(x) to the original feature values feature values average prediction Now the feature effect is linear Regression model will love it! b0 + b1 T(x1 )+ b2 x2 +... + bn xn T(x) values
  18. 18. Improving LR – Interaction Splits Additive Groves‟ interaction detection tool produces a list of strong interactions and corresponding joint effect plots average prediction X2=1  Effect of X1 is stronger when X2 = 0  Simple regression will not capture this  Often such X2 interacts with other features as well values of feature X1 X2=?  Solution:  Build separate models for different values of X2 b0 + b1 x1 +... + bn xn a0 + a1 x1 +...+ an xn
  19. 19. Improving LR – Tree with LR leaves and transforms Both operations (effect transforms and interaction splits) can be applied multiple times in any order Resulting model – a simple tree with regression model leaves X2=? b0 + b1 T(x1 )+...+ bn xn X10< 0.1234 ? a0 + a1 P(x1 )+...+ anQ(xn ) g 0 + g1 R(x1 )+...+ g nQ(xn ) Gives a significant boost to the performance of the basic LR model
  20. 20. TreeExtra package A set of machine learning tools – Additive Groves ensemble – Interaction detection – Effect and interaction visualization – Created by Daria Sorokina while in Cornell, CMU, Yandex, LinkedIn from 2006 to 2013
  21. 21. Part III: SpammersFighting black SEO
  22. 22. Search Spam
  23. 23. Search Spam
  24. 24. Search Spam
  25. 25. Training data for the search spam classifier Find the queries targeted by spammers. – 10,000 most common non-name queries. – Spammers love optimizing for [marketing] – But not so much for [david smith] Look at top results for a generic user. – i.e., show unpersonalized search results. Label data by crowdsourcing. – Definition of spam is non-personalized Train a model – Spam scores are recalculated offline once in a while – So the model complexity is not an issue – Additive Groves works well. (Could use any ensemble of trees)
  26. 26. ROC curve. Choosing thresholds. 1Spam score threshold 0.9 0.8 a 0.7 0.6 0.5 b 0.4 0.30<a<b<1 0.2 0.1 0 0 0.2 0.4 0.6 0.8 1
  27. 27. Integrating the Spam Score into Relevance Spam model yields a probability between 0 and 1. Convert spam score into a factor – [0.0 <= score <= a]  not a spammer,  factor = 1.0 – [b <= score <= 1.0]  Spammer  factor = 0.0 – [a <= score <= b]  Suspicious  linearly scale score from [a, b] to [1, 0] Multiply relevance score by factor
  28. 28. We are hiring!