Recruiters, Job Seekers and Spammers: Innovations in Job Search at LinkedIn
ECIR 2013 workshop keynote

Published in: Technology, Business

  • …This meaningfully contributes to the growth of our three diverse revenue streams: Talent Solutions, Marketing Solutions, and Premium Subscriptions.
    Talent Solutions: As the world’s largest professional network, LinkedIn is the single best place to connect with passive and high-quality active job candidates. LinkedIn Talent Solutions improves the efficiency of recruiting the best at scale, giving recruiting teams a competitive edge in the war for talent. [LinkedIn uniquely possesses an unprecedented wealth of accurate and up-to-date professional information, the full extent of which can only be accessed through LinkedIn Talent Solutions. We provide the tools that enable recruiting teams to understand their target audience, position their company as the employer of choice, engage with relevant, high-quality candidates at scale and accurately measure their results.]
    Marketing Solutions: LinkedIn Marketing Solutions helps advertisers and marketers reach influential, affluent and highly educated audiences in a very relevant and engaging way. LinkedIn has the most valuable audience by composition anywhere on the internet. This global and unique asset gets products and services in front of the right professional at the right time. [It has three competitive advantages over any other online platform: scale, accuracy, and portfolio. Scale: over 200 million members that are all professionals; one of the most influential, affluent and highly educated audiences on the Web; more decision makers, higher average household incomes, and more college or post-college graduates than U.S. visitors of many leading business websites. Accuracy: rich profile-based targeting allows advertisers to reach very specific audiences; targeting options include geography, job function, industry, company size, seniority, age, gender, company name, LinkedIn Group and even job title. Portfolio: a suite of high-impact and engaging products helps advertisers get their messages across, from text-based ads to massive display campaigns to socially driven branding opportunities like Company Pages and Groups.]
    Premium Subscriptions: LinkedIn Premium Subscriptions are tailored for an array of member needs and segments, providing additional tools that help customers be better at what they do every day and more productive and successful in their careers.
    The aim of our three business lines, and of all the products and services we develop, is to connect talent and opportunity at a massive scale, and to make the world’s professionals more productive and successful. In doing so, we aspire to create economic opportunity for every professional in the world.
  • So, here is a high-level overview of Talent Match. Someone comes to the site and posts a job. We then scour the entire member database looking for the members who best match that job, and we recommend a ranked list of those members to the job poster.
  • This is how we do this matching. We combine the job and the candidate into a single feature vector, where each feature is a similarity measure between an attribute of the job and an attribute of the candidate, and then we find the relative importance of these features using a supervised learning method like logistic regression trained on a user activity signal such as job applications. This gives us a model that knows how to differentiate good job-member pairs from bad job-member pairs.
  • Let’s go over the facets of the utility function of the Talent Match system. First, the snippet needs to be good enough to convince the job poster to purchase the recommendations. That’s the booking rate. Then, once purchased, the job poster gets to look at the full profile of the recommended candidate and decides whether or not they are indeed a good match for the job. If the candidate is a good match, the job poster may then decide to email the candidate regarding the job opportunity. That’s the email rate. Finally, if the candidate is interested, the candidate will reply positively to the job poster, giving us the reply rate. Now that the link is established, they can take it from there. But from our perspective, these 3 steps are required for there to be relevant engagement within this system.
    Out of the 3 facets of the utility function, the reply rate was identified as needing improvement. Job posters were complaining that they were emailing candidates, but the candidates were not replying enough. This was the problem we needed to solve. We figured the booking rate and the email rate were well accounted for by the existing TalentMatch model, but even if someone is a great match for the job, that does not mean they are going to reply. So, we thought that maybe people were not replying because they were not looking for a job. What if we could determine whether someone was a job seeker, and then include more of those people in the recommendations?
  • So, we had already developed a model that computes the job seeking propensity for each member, and we affectionately refer to this model as Flightmeter. It turns out that many people who are open to new opportunities do not self-identify as job seekers, so this model helps us identify those people. You can think of the job seeking propensity as the probability that the member will switch positions in the next month. We also output a segmentation of this probability into actives, passives, and non-job-seekers, and we consider actives and passives to have a high job seeking intent.
    This Flightmeter model is completely different from the TalentMatch model. It is a survival model where the entity whose survival we’re analyzing is a job, or more specifically, a position. Based on data derived from the lifetime of millions of positions, we model the duration of a position as a function of various features in what is known as an accelerated failure time model, and this allows us to compute the probability that a given position will end within the next month.
  • There are many signals that we can use to compute the job seeking intent. We may have the user’s job seeking activity on the site: are they searching or applying for jobs? Those are obvious signals. But we have others. For example, we know that different industries have different attrition rates. This plot includes a few representative industries and their survival curves. The survival curve gives the probability that someone will still be at their position X months down the road if they start that position today.
    These are survival curves for a few of the most extreme industries, some of the most hazardous including “political organization” and “animation” and some of the least hazardous including “alternative medicine” and “ranching”. In the “political organization” industry, which is the red line at the bottom, more than 50% of people don’t last 2 years in a given position.
  • So, intuitively, it makes sense to suggest users who are job seekers in TalentMatch. To confirm our intuition, we ran the numbers and saw that users with a high job seeking intent (actives and passives) have a much higher rate of reply to career-related emails than non-job-seekers (16 times the reply rate). And this is exactly the facet of the utility function of TalentMatch that we are interested in improving. So, what we want to do is incorporate the job seeker intent into the TalentMatch model, and we want to do so without negatively affecting the booking rate and the email rate.
  • So, what we want is a controlled perturbation of the ranking output by the TalentMatch model, and this is how we are going to do it: given the TalentMatch ranking, we run a perturbation function on it that generates another ranking, the perturbed ranking, which optimizes for a metric we’re interested in (in the case of TalentMatch, it’s the number of users with high job seeking intent in the top-12 recommendations). Given the 2 rankings and their distributions of match scores, we can compute the distance between them using a variety of metrics, for example KL divergence or Euclidean distance. This divergence score is what will help us make sure we are not negatively affecting the quality of the recommendations. Notice how, in the perturbed ranking, item Z was bumped from its original third position, below the cutoff line, to the second position, and so whereas before we had 2 non-seekers above the cutoff, meaning they would be recommended, now we have a non-seeker and an active. Also notice that the perturbation is minimal. We should feel comfortable bumping item Z to the second position, but not to the first position.
    There are then 3 functions that we need to define: the perturbation function, the divergence function, and the objective function. The parameters of the perturbation function are what we will be estimating based on the performance established by the divergence and objective measures: we want high scores on the objective and low scores on the divergence.
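
The controlled perturbation described in the last note can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model from the talk: the perturbation function is assumed to be a simple additive `boost` on the match score of actives and passives, the divergence is the Euclidean distance between the top-N match scores before and after re-ranking, and the names `Candidate`, `rerank`, `objective`, `divergence` and `pick_boost`, as well as the grid and the divergence cap, are hypothetical. The three items and their scores come from the re-ranking slide.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    match_score: float
    seeker: bool  # True for actives and passives (high job seeking intent)


def rerank(candidates, boost):
    """Perturbation function (assumed form): add `boost` to seekers' scores, then sort."""
    rescored = [(c.match_score + (boost if c.seeker else 0.0), c) for c in candidates]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in rescored]


def objective(ranking, top_n):
    """Objective function: number of seekers above the cutoff."""
    return sum(1 for c in ranking[:top_n] if c.seeker)


def divergence(original, perturbed, top_n):
    """Divergence function: Euclidean distance between top-N match scores, by position."""
    return math.sqrt(sum((o.match_score - p.match_score) ** 2
                         for o, p in zip(original[:top_n], perturbed[:top_n])))


def pick_boost(candidates, top_n, max_divergence, grid):
    """Pick the first boost in `grid` that improves the objective within the divergence cap."""
    baseline = rerank(candidates, 0.0)
    best_boost, best_ranking = 0.0, baseline
    best_objective = objective(baseline, top_n)
    for boost in grid:
        ranking = rerank(candidates, boost)
        if divergence(baseline, ranking, top_n) > max_divergence:
            continue  # perturbation too large: recommendation quality would suffer
        score = objective(ranking, top_n)
        if score > best_objective:
            best_boost, best_ranking, best_objective = boost, ranking, score
    return best_boost, best_ranking


# Items from the re-ranking slide; the cutoff is after position 2.
candidates = [
    Candidate("Item X", 0.98, False),
    Candidate("Item Y", 0.91, False),
    Candidate("Item Z", 0.89, True),
]
boost, ranking = pick_boost(candidates, top_n=2, max_divergence=0.05,
                            grid=[0.01, 0.04, 0.10])
print(boost, [c.name for c in ranking])
```

With these hypothetical settings, a boost of 0.04 moves Item Z to second place with a re-ranking score of about 0.93, as on the slide, while a boost of 0.10 would move it to first place and is rejected by the divergence cap.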
  • Transcript

    • 1. Recruiters, Job Seekers and Spammers: Innovations in Job Search at LinkedIn. Daria Sorokina, Senior Data Scientist, LinkedIn
    • 2. Part I: Recruiters. “Multiple Objective Optimization in Recommendation Systems”, Mario Rodriguez, Christian Posse, Ethan Zhang. RecSys ’12
    • 3. TalentMatch. [Diagram: Job Posting and Member Profiles → Talent Match → Ranked Talent]
    • 4. TalentMatch Model
        – Job Posting: title, geo, company, industry, description, functional area, …
        – Candidate, General: expertise, specialties, education, headline, geo, experience
        – Candidate, Current Position: title, summary, tenure length, industry, functional area, …
        – Text similarity features (between job posting fields and candidate fields)
        – The model can be trained on user activity signals like job ad clicks or job applications
    • 5. TalentMatch Utility = fn(email rate, reply rate). [Diagram: Recruiter, Email Rate, Job seeker?, Reply Rate (Problem!)]
    • 6. Job Seeker Intent
        Segments: ACTIVE, PASSIVE, NON-JOB-SEEKER
        Model: time till the job change
        – How long will this person stay in this job after this date?
        – Trained on past job positions from our users’ profiles
        – Accelerated failure time (AFT) model
        – $T_i = \exp\left(\sum_k b_k x_{ik} + s e_i\right)$
    • 7. Job-Seeker Feature Example: Attrition by Industry. [Plot: survival curves by industry; axes: Probability vs. Time]
    • 8. TalentMatch Utility = fn(email rate, reply rate). Job-Seeking Intent: 16x reply rate on career-related mail
    • 9. How: Controlled Re-ranking
        Talent Match ranking (rank, item, match score, segment):
          1, Item X, 0.98, Non-Seeker
          2, Item Y, 0.91, Non-Seeker
          ---------------------------------------
          3, Item Z, 0.89, Active
        Re-ranking function f(): optimize for both
        – Divergence score (between the ranking score distributions)
        – Objective score: #Active in top N
        Improved ranking (rank, item, match score, reranking score, segment):
          1, Item X, 0.98, 0.98, Non-Seeker
          2, Item Z, 0.89, 0.93, Active
          --------------------------------------------
          3, Item Y, 0.91, 0.91, Non-Seeker
    • 10. Part II: Job Seekers. Learning to Rank. Fast and personalized.
    • 11. Job Search. Query: “Data Scientist LinkedIn”
    • 12. Learning To Rank
        Regular approach
        – A data point is a pair: {Query, Document}
        – Data label: “Is this document relevant for this query?”
            Can be done by crowdsourcing
        Job Search reality
        – A data point is a triple: {Query, Job position, User}
        – Data label: “Is this job relevant for this user who asked this query?”
            Depends on the user’s location, industry, seniority…
            Too much to ask from a random person
            Have to collect labels from user signals
    • 13. We use a simplified version of FairPairs (Radlinski, Joachims, AAAI ’06)
        – Each pair is flipped with a 50% chance
        – Choose pairs where only the lower document is clicked
        – Save 1 positive (lower, label 1) and 1 negative (upper, label 0) result for the labeled data set
        [Diagram: result pairs marked “flipped” or “not flipped”, with clicked (✔) and skipped (✗) documents]
    • 14. Fair Pairs data is not enough for training
        – The user clicks or skips only whatever is shown
        – Bad results are not shown
        – So there will be no “really bad” negatives in the training data
        – We need to add them!
        – For queries with many results, add all results from the last page as “easy negatives” (label 0)
    • 15. Learning To Rank – Training a Model
        Best models for LTR are complex ensembles of trees
        – See results of Yahoo Learning to Rank ’10 competition
        – LambdaMART, BagBoo, Additive Groves, MatrixNet …
        Complex models come at a cost
        – It takes long to calculate predictions
        – Requires a lot of optimization, often used with multi-level ranking
        Can we train a simple model that will resemble a complex one?
        – Train a complex model
        – Get insights on what it looks like
        – Modify a simple model accordingly
    • 16. Training a Simple Model using a Complex Model
        Base simple model – logistic or linear regression
        – $\log\frac{p}{1-p} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
        – Does not handle features with non-linear effects well
        – Does not handle interactions (e.g., if-then-else rules)
        Target complex model – Additive Groves
        – (Sorokina, Caruana, Riedewald, ECML ’07)
        – [Diagram: average of N groves, each grove a sum of trees]
        – Comes with interaction detection and effect visualization tools
    • 17. Improving LR – Feature Transformations
        Additive Groves can model and visualize non-linear effects
        – Approximate the effect curve with a polynomial transform T(x) – anything simple will do
        – Apply T(x) to the original feature values
        [Plots: average prediction vs. feature values; average prediction vs. T(x) values]
        Now the feature effect is linear. Regression model will love it!
        – $b_0 + b_1 T(x_1) + b_2 x_2 + \dots + b_n x_n$
    • 18. Improving LR – Interaction Splits
        Additive Groves’ interaction detection tool produces a list of strong interactions and corresponding joint effect plots
        [Plot: average prediction vs. values of feature X1, with separate curves by X2]
        – Effect of X1 is stronger when X2 = 0
        – Simple regression will not capture this
        – Often such X2 interacts with other features as well
        Solution: build separate models for different values of X2
        – [Tree: split on X2 = ?, with leaves $b_0 + b_1 x_1 + \dots + b_n x_n$ and $a_0 + a_1 x_1 + \dots + a_n x_n$]
    • 19. Improving LR – Tree with LR leaves and transforms
        – Both operations (effect transforms and interaction splits) can be applied multiple times in any order
        – Resulting model – a simple tree with regression model leaves
        – [Tree: splits on X2 = ? and X10 < 0.1234 ?, with leaves $b_0 + b_1 T(x_1) + \dots + b_n x_n$, $a_0 + a_1 P(x_1) + \dots + a_n Q(x_n)$ and $g_0 + g_1 R(x_1) + \dots + g_n Q(x_n)$]
        – Gives a significant boost to the performance of the basic LR model
    • 20. TreeExtra package
        A set of machine learning tools
        – Additive Groves ensemble
        – Interaction detection
        – Effect and interaction visualization
        – Created by Daria Sorokina while at Cornell, CMU, Yandex, LinkedIn from 2006 to 2013
    • 21. Part III: Spammers. Fighting black SEO
    • 22. Search Spam
    • 23. Search Spam
    • 24. Search Spam
    • 25. Training data for the search spam classifier
        Find the queries targeted by spammers.
        – 10,000 most common non-name queries.
        – Spammers love optimizing for [marketing]
        – But not so much for [david smith]
        Look at top results for a generic user.
        – i.e., show unpersonalized search results.
        Label data by crowdsourcing.
        – Definition of spam is non-personalized
        Train a model
        – Spam scores are recalculated offline once in a while
        – So the model complexity is not an issue
        – Additive Groves works well. (Could use any ensemble of trees)
    • 26. ROC curve. Choosing thresholds. [Plot: ROC curve with two spam score thresholds a and b marked, 0 < a < b < 1]
    • 27. Integrating the Spam Score into Relevance
        Spam model yields a probability between 0 and 1.
        Convert spam score into a factor
        – [0.0 <= score <= a] → not a spammer → factor = 1.0
        – [b <= score <= 1.0] → spammer → factor = 0.0
        – [a <= score <= b] → suspicious → linearly scale score from [a, b] to [1, 0]
        Multiply relevance score by factor
    • 28. We are hiring!
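
The conversion of the spam score into a relevance factor on slide 27 can be sketched as follows. This is a minimal illustration: the function names `spam_factor` and `adjusted_relevance` are hypothetical, and the thresholds `a = 0.4` and `b = 0.8` are made-up values, since the slides only state that 0 < a < b < 1.

```python
def spam_factor(score, a, b):
    """Map a spam probability to a multiplicative relevance factor.

    score <= a      -> 1.0 (not a spammer)
    score >= b      -> 0.0 (spammer)
    a < score < b   -> linear scale from 1.0 at a down to 0.0 at b (suspicious)
    """
    if not 0.0 < a < b < 1.0:
        raise ValueError("thresholds must satisfy 0 < a < b < 1")
    if score <= a:
        return 1.0
    if score >= b:
        return 0.0
    return (b - score) / (b - a)


def adjusted_relevance(relevance, spam_score, a, b):
    """Multiply the relevance score by the spam factor."""
    return relevance * spam_factor(spam_score, a, b)


# Hypothetical thresholds; the slides do not give concrete values.
A, B = 0.4, 0.8
for spam_score in (0.2, 0.6, 0.9):
    print(spam_score, adjusted_relevance(10.0, spam_score, A, B))
```

A result with a spam score below `a` keeps its full relevance, one above `b` drops to zero, and a suspicious result in between is demoted in proportion to how close its score is to `b`.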