A Scalable, High-performance Algorithm for Hybrid Job Recommendations
Toon De Pessemier, Kris Vanhecke, Luc Martens
iMinds – Ghent University, Belgium
toon.depessemier@ugent.be
September 2016
Introduction: Job recommendations
Not a classic recommender story
Not a classic solution
• Specific metadata characteristics
  • Discipline, industry, career level, …
• Detailed user profile
  • Experience, education (university degree), employment
• Limited availability in time (active_during_test)
• Various user-item interactions
  • Click, bookmark, reply, delete
  • Specific meaning of delete (click on “X” → load new item)
• Impressions
  • Recommendations generated by XING’s recommender → bias
Our goals
• XING’s evaluation measure
  • Reflects typical XING use case
• Scalable
  • Number of users and items
  • Dataset = subset of XING users
• Incremental updates
  • Continuous stream of new job items
  • Updating models instead of recalculating
• Fast score calculation
  • New job items → fast distribution to target users
  • Limited computational resources
Findings
• Challenge = prediction task
  • ≠ recommendation task
  • No influence on user behavior
  • Recommendations are not evaluated by the user
• Important quality metrics are not evaluated
  • Usefulness
    Risk: items already discovered by the user (items that the user has already interacted with can be recommended)
  • Diversity
    Risk: too much of the same
  • Serendipity
    Risk: items that are difficult to find but interesting are unfairly evaluated as “poor recommendations”
Findings
• The information value of impressions is limited
  • Recommendations of the existing job recommender
    • Bias towards XING’s algorithm
    • Less diverse
  • Subset of recommendations
    • No guarantee that the user has seen the item
  • Users who are not cold-start → better results if only the interactions are used
• Penalty for items with limited visibility
  • Low visibility → low probability of interaction
  • Low visibility → penalty → better results
  • Item visibility estimated by the number of interactions in the training set
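The visibility penalty above could be sketched roughly as follows. The slides only say that visibility is estimated from the number of training-set interactions, so the linear ramp and the `min_visible` parameter are illustrative assumptions, not the authors' penalty.

```python
from collections import Counter

def visibility_counts(interactions):
    """Count training-set interactions per item as a proxy for item visibility."""
    return Counter(item for _user, item in interactions)

def penalized_score(raw_score, item, counts, min_visible=5):
    """Scale a score down for items with fewer than `min_visible` interactions.

    Assumption: a linear ramp from 0 (never seen) to 1 (at least `min_visible`).
    """
    factor = min(1.0, counts.get(item, 0) / min_visible)
    return raw_score * factor

# Hypothetical training interactions as (user, item) pairs.
interactions = [("u1", "jobA"), ("u2", "jobA"), ("u3", "jobA"),
                ("u4", "jobA"), ("u5", "jobA"), ("u1", "jobB")]
counts = visibility_counts(interactions)

popular = penalized_score(2.0, "jobA", counts)  # 5 interactions, no penalty
rare = penalized_score(2.0, "jobB", counts)     # 1 interaction, scaled by 1/5
```

Because the counts are plain per-item tallies, they can be incremented as new interactions arrive, which fits the incremental-update goal stated earlier.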
Findings
• Influence of the user’s region
  • Expected: interest in jobs located in the user’s home region or in adjacent regions
  • Observed: many interactions with jobs located in non-adjacent or far-away regions
  • E.g. users from Lower Saxony → jobs in Baden-Württemberg
• Many cold-start users
  • No interactions, no impressions (9.7%)
  • Content-based (CB) recommendation based on explicit profile
    • Risk: profile too general or too specific
    • Risk: profile not updated by the user
Findings
• Traditional classification does not work
  • Positive class: click, bookmark, reply
  • Negative class: delete
  • Recommendations: items most typical of the positive class
  • Poor score
• Reasoning: meaning of the delete action
  • Click on the X button in the recommendation list
  • A new recommendation is loaded and displayed
  • Deletes are not sampled from the complete set of job offers but from the recommendations (bias: these items are more similar to the user’s interests than random items)
  • Not necessarily disinterest on the part of the user
  • Intention of the click: get a new recommendation
Content-Based Recommender
• Based on feature matching
  • Explicit user profile
  • Interactions → counter for each feature
• Interaction weight
  • Updating counters
  • Delete=0, click=1, bookmark=10, reply=10 (no significant effect of deletes)
  • Positive counters (pos_{f,u}) → item has feature
  • Negative counters (neg_{f,u}) → item does not have feature
• Score calculation
  • α = 0.5 (positive counters are more important than negative counters)
  • IDF = inverse document frequency, based on the feature frequency across all jobs
  • N = total number of items
  • n_f = number of items with feature f
  • w_f = weight per feature type (tag, discipline, industry, …)
  • u = user
  • i = item
$$score(u,i) = \frac{1}{|\{f \in i\}|} \sum_{f \in i} w_f \left( pos_{f,u} - \alpha \, neg_{f,u} \right) \log \frac{N}{n_f}$$
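A minimal sketch of this score in Python, assuming that the IDF factor multiplies the whole counter difference (the extracted slide text does not show the grouping) and that features are represented as hypothetical `(feature_type, value)` pairs:

```python
import math

def cb_score(item_features, pos, neg, n_items, feature_counts, type_weights, alpha=0.5):
    """Content-based score of one item for one user.

    item_features:  (feature_type, value) pairs describing the item
    pos / neg:      the user's positive and negative counters per feature
    feature_counts: n_f, the number of items carrying each feature
    type_weights:   w_f, one weight per feature type
    """
    item_features = list(item_features)
    if not item_features:
        return 0.0
    total = 0.0
    for feature in item_features:
        n_f = feature_counts.get(feature, 0)
        if n_f == 0:
            continue  # assumption: skip features without a known frequency
        idf = math.log(n_items / n_f)
        w_f = type_weights.get(feature[0], 1.0)
        total += w_f * (pos.get(feature, 0.0) - alpha * neg.get(feature, 0.0)) * idf
    return total / len(item_features)

# Hypothetical data: one job with two features, 100 items in total.
job = [("discipline", "IT"), ("industry", "Telecom")]
pos = {("discipline", "IT"): 3, ("industry", "Telecom"): 1}
neg = {("industry", "Telecom"): 2}
feature_counts = {("discipline", "IT"): 10, ("industry", "Telecom"): 50}
type_weights = {"discipline": 2.0, "industry": 1.0}

score = cb_score(job, pos, neg, n_items=100,
                 feature_counts=feature_counts, type_weights=type_weights)
```

In this example the Telecom feature contributes nothing, because its positive counter (1) is cancelled by α times its negative counter (0.5 × 2), so the score is driven entirely by the rarer IT feature.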
Content-based calculation
• Profile
  • Offline calculation
  • Incremental updates of counters
• IDF
  • Slightly varying over time
  • Periodic updates
• Target items
  • Active items
  • Minimum matching threshold (positive counters and item have at least X features in common)
• Algorithm running in parallel for different users
  • Fast calculation of the recommendations
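The incremental counter updates and the minimum matching threshold of the content-based recommender might look like the sketch below. The `UserProfile` class is hypothetical, and the rule for negative counters (increment features the profile already tracks but the item lacks) is one possible reading of the slides, not a confirmed detail.

```python
from collections import defaultdict

# Interaction weights as listed on the slides.
WEIGHTS = {"delete": 0, "click": 1, "bookmark": 10, "reply": 10}

class UserProfile:
    """Per-user feature counters that are updated one interaction at a time."""

    def __init__(self):
        self.pos = defaultdict(float)
        self.neg = defaultdict(float)

    def update(self, action, item_features):
        weight = WEIGHTS[action]
        if weight == 0:
            return  # deletes carry no weight
        item_features = set(item_features)
        # Assumption: a tracked feature that this item lacks counts as negative evidence.
        for feature in list(self.pos):
            if feature not in item_features:
                self.neg[feature] += weight
        for feature in item_features:
            self.pos[feature] += weight

    def matches(self, item_features, min_common=2):
        """True if the item shares at least `min_common` features with the positive counters."""
        common = sum(1 for f in item_features if self.pos.get(f, 0) > 0)
        return common >= min_common

profile = UserProfile()
profile.update("click", {"discipline:IT", "industry:Telecom"})
profile.update("bookmark", {"discipline:IT", "region:Flanders"})
profile.update("delete", {"discipline:Law"})

candidate_ok = profile.matches({"discipline:IT", "industry:Telecom", "career:Senior"})
candidate_skip = profile.matches({"discipline:IT", "career:Senior"})
```

Filtering candidates with `matches` before scoring keeps the per-user work small, and since each profile is independent, users can be processed in parallel as the slides describe.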
Collaborative filtering: KNN
• Traditional KNN
  • Distance based on interactions
• Our KNN solution
  • Distance based on interactions and metadata
    • 2 items are similar if users have interacted with both
    • 2 items are similar if they have metadata features in common
      Feature distance: factor $\log \frac{N}{n_f}$
  • Fine-grained distance function
    • Risk of ties is reduced
• Method:
  • For each candidate item:
    • Calculate distance to the k nearest items that the user has positively interacted with
  • Select items with shortest distance
  • $score(u,i) = \frac{1}{k} \sum_{k} \frac{Dist_{max} - Dist_{i,k}}{Dist_{max}}$
• Based on Weka framework
  • BallTree implementation of the NearestNeighbourSearch package
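The KNN score can be illustrated with a small Python sketch. The slides do not give the exact distance function, so `hybrid_distance` below is a toy stand-in (Jaccard distance on interacting users plus an IDF-weighted feature mismatch), and all data is hypothetical; only `knn_score` follows the formula on the slide.

```python
import math

def hybrid_distance(a, b, users_of, features_of, n_items, feature_counts):
    """Toy item distance in [0, 2]: interaction part plus metadata part (assumption)."""
    users_a, users_b = users_of[a], users_of[b]
    all_users = users_a | users_b
    shared_users = len(users_a & users_b) / len(all_users) if all_users else 0.0

    def idf(feature):
        return math.log(n_items / feature_counts[feature])

    feats_a, feats_b = features_of[a], features_of[b]
    all_weight = sum(idf(f) for f in feats_a | feats_b)
    shared_weight = sum(idf(f) for f in feats_a & feats_b)
    shared_feats = shared_weight / all_weight if all_weight > 0 else 0.0
    return (1.0 - shared_users) + (1.0 - shared_feats)

def knn_score(candidate, liked_items, dist, k, dist_max):
    """Average of (dist_max - d) / dist_max over the k nearest liked items."""
    nearest = sorted(dist(candidate, j) for j in liked_items)[:k]
    if not nearest:
        return 0.0
    # Divides by the number of neighbours found, which equals k when enough exist.
    return sum((dist_max - min(d, dist_max)) / dist_max for d in nearest) / len(nearest)

users_of = {"j1": {"u1", "u2"}, "j2": {"u2", "u3"}, "c": {"u2"}}
features_of = {"j1": {"IT", "Senior"}, "j2": {"Telecom"}, "c": {"IT", "Telecom"}}
feature_counts = {"IT": 10, "Telecom": 50, "Senior": 20}

def dist(a, b):
    return hybrid_distance(a, b, users_of, features_of, 100, feature_counts)

score_k1 = knn_score("c", ["j1", "j2"], dist, k=1, dist_max=2.0)
score_k2 = knn_score("c", ["j1", "j2"], dist, k=2, dist_max=2.0)
```

Mixing a continuous metadata term into the distance is what makes ties less likely than with a purely interaction-based distance, which is the point the slide makes about a fine-grained distance function.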
KNN calculation
• Item distances
  • Offline calculation
  • Slightly varying over time
  • If partially computed distance > threshold → stop calculation
• Score calculation
  • Fast if distances are precomputed
• Algorithm running in parallel for different users
  • Fast calculation of the recommendations
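The early-stopping rule for item distances can be sketched as follows, assuming the distance is a sum of non-negative terms so that a partial sum above the threshold can never come back under it. The squared-difference terms are only a hypothetical example of such a distance.

```python
def bounded_distance(terms, threshold):
    """Sum non-negative terms; return None as soon as the partial sum exceeds the threshold."""
    total = 0.0
    for term in terms:
        total += term
        if total > threshold:
            return None  # the pair is too far apart; skip the remaining terms
    return total

def squared_diffs(a, b):
    """Lazily yield per-dimension squared differences (hypothetical distance terms)."""
    for x, y in zip(a, b):
        yield (x - y) ** 2

near = bounded_distance(squared_diffs([1, 2, 3], [1, 3, 5]), threshold=10.0)
far = bounded_distance(squared_diffs([0, 0, 0, 0], [3, 0, 0, 0]), threshold=4.0)
```

Because the terms are produced lazily, the second call stops after the first dimension and never evaluates the rest.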
Results and fallback
• CB: 286,041.10
• KNN: 298,316.85
• Hybrid: 344,264.37
• Fallback for cold-start users:
  • No interactions:
    • KNN based on interactions is not possible (26.5% of users)
    • No interactions → use impressions (16.8% of users)
    • Solution without fallback to impressions (only based on profile): 292,909.26
  • No interactions and no impressions (9.7% of the users):
    • Hybrid → CB
  • CB cannot generate recommendations:
    • For 1485 users
    • Recommend the 30 most popular items (most positive interactions)
    • Without fallback to most popular recommender: 344,241.51
    • Most popular recommender as the only solution: 73,298.13
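The fallback chain for cold-start users could be wired up roughly as below. The three recommender functions are hypothetical stubs, and how the hybrid combines CB and KNN is not specified in the slides, so this sketch only shows the order of the fallbacks.

```python
def recommend(user, interactions, impressions, hybrid, content_based, most_popular, n=30):
    """Pick a recommender based on the data available for the user."""
    # Interactions first; impressions only as a substitute when there are none.
    history = interactions.get(user) or impressions.get(user)
    if history:
        recs = hybrid(user, history)
    else:
        recs = content_based(user)  # profile only: no interactions, no impressions
    if not recs:
        recs = most_popular(n)      # last resort when nothing could be generated
    return recs[:n]

# Hypothetical data and stub recommenders for illustration.
interactions = {"u1": ["jobA"]}
impressions = {"u2": ["jobB"]}
profile_matches = {"u3": ["jobC"]}

def hybrid(user, history):
    return [f"like-{item}" for item in history]

def content_based(user):
    return profile_matches.get(user, [])

def most_popular(n):
    return ["top1", "top2", "top3"][:n]

results = {u: recommend(u, interactions, impressions, hybrid, content_based, most_popular)
           for u in ["u1", "u2", "u3", "u4"]}
```

Here `u1` is served from interactions, `u2` from impressions, `u3` from the explicit profile, and `u4` falls through to the most popular items.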
Questions?
