
Content-Based approaches for Cold-Start Job Recommendations


ACM RecSys Challenge 2017
Lunatic Goats @PoliMi
M. Bianchi, F. Cesaro, F. Ciceri, M. Dagrada, A. Gasparin, D. Grattarola, I. Inajjar, A. M. Metelli, L. Cella

Published in: Data & Analytics

  1. Content-Based approaches for Cold-Start Job Recommendations
     ACM RecSys Challenge 2017
     Lunatic Goats @PoliMi
     M. Bianchi, F. Cesaro, F. Ciceri, M. Dagrada, A. Gasparin, D. Grattarola, I. Inajjar, A. M. Metelli, L. Cella
  2. Task Outline
     ● Cold-start recommendation scenario:
       ○ job posting recommendations;
       ○ focus on getting positive interactions;
       ○ penalized for negative interactions;
       ○ rewarded for recruiter interest.
     ● Two phases:
       ○ Offline - predictions for fixed sets of items and users.
       ○ Online - daily recommendations to variable sets of users.
  3. Data Analysis - Impressions vs Interactions
     ● Impressions: ~97% of the data, little to no information contained (discarded).
     ● Interactions: ~3% of the data.
     ● Interactions divided into:
       ○ positive interactions (types 1, 2 and 3);
       ○ negative interactions (type 4);
       ○ recruiter interest (type 5).
     ● Interactions treated with an implicit approach.
  4. Local Validation
     ● Split the dataset into training and validation sets.
     ● Random sampling procedure:
       ○ randomly select target items from the dataset;
       ○ remove all interactions with these items;
       ○ pick target users as a subset of those who have interactions with these items.
     ● Preserve the user-item ratio.
     ● No cross-validation: the dataset is too large.
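The sampling procedure on the Local Validation slide can be sketched as follows. This is a minimal illustration, not the team's code: the function name `split_cold_start`, the `(user_id, item_id)` pair representation and the `seed` parameter are all assumptions. The caller is expected to choose `n_target_items` and `n_target_users` so that the user-item ratio of the official target sets is preserved.

```python
import random

def split_cold_start(interactions, n_target_items, n_target_users, seed=0):
    """Hold out items (and some of their users) to imitate a cold-start test set.

    `interactions` is a list of (user_id, item_id) pairs (assumed format).
    """
    rng = random.Random(seed)
    items = sorted({item for _, item in interactions})
    target_items = set(rng.sample(items, n_target_items))

    # Training data: every interaction that does not involve a target item.
    train = [(u, i) for u, i in interactions if i not in target_items]
    # Held-out data: interactions with the target items (the ground truth).
    held_out = [(u, i) for u, i in interactions if i in target_items]

    # Target users are a subset of those who interacted with a target item.
    candidates = sorted({u for u, _ in held_out})
    target_users = set(rng.sample(candidates, min(n_target_users, len(candidates))))
    validation = [(u, i) for u, i in held_out if u in target_users]
    return train, validation, target_items, target_users
```

Because the target items have no interactions left in the training set, they behave like cold-start items during local evaluation.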
  5. Solution - Preprocessing
     ● One-hot encoding of both user and item features.
     ● Feature aggregation.
     ● TF-IDF application.
     ● Negative user filtering: removing heavy deleters.
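As an illustration of the TF-IDF step applied to one-hot features, the sketch below weights each active feature by its inverse document frequency. The exact TF-IDF variant used by the team is not stated on the slide; the plain `log(N / df)` form, the binary term frequency, and the name `tfidf_weights` are assumptions.

```python
import math
from collections import Counter

def tfidf_weights(item_features):
    """Weight one-hot features by inverse document frequency.

    `item_features` maps item_id -> set of active (one-hot) feature names.
    Returns item_id -> {feature: weight}; term frequency is binary (assumed).
    """
    n_items = len(item_features)
    # Document frequency: in how many items each feature is active.
    doc_freq = Counter(f for feats in item_features.values() for f in feats)
    idf = {f: math.log(n_items / df) for f, df in doc_freq.items()}
    return {item: {f: idf[f] for f in feats} for item, feats in item_features.items()}
```

Features shared by every item get weight 0, so common attributes contribute little to similarity while rare ones dominate.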
  6. Solution Overview
  7. Solution - Negative Recommendation
     ● The scoring function heavily penalizes negative (type 4) interactions.
     ● Using a CBF approach, predict type 4 interactions.
     ● Ensemble these predictions with a negative weight.
  8. Solution – Content Based Filtering algorithms (CBF)
     Recommend to a user items similar to the ones he/she likes.
     ● Run separately on positive (CBF+) and negative (CBF-) interactions.
     ● Tanimoto similarity between items.
     ● Recommendation performed for filtered users only.
     ● Penalize heavy clickers.
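For binary (one-hot) features, the Tanimoto similarity of two items is the size of their shared feature set divided by the size of the union. The sketch below uses that standard definition and a simple summed-similarity score; the slide's own formula is not preserved in the transcript, so the unweighted form, the scoring rule and the names `tanimoto` and `cbf_scores` are assumptions.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of binary features."""
    common = len(a & b)
    union = len(a) + len(b) - common
    return common / union if union else 0.0

def cbf_scores(user_history, candidates, features):
    """Score each candidate item by its summed similarity to the user's items.

    `user_history` lists items the user interacted with; `features` maps
    item_id -> set of feature names (assumed representation).
    """
    return {
        c: sum(tanimoto(features[c], features[j]) for j in user_history)
        for c in candidates
    }
```

Running the same scoring on positive and on negative interaction histories would give the CBF+ and CBF- signals respectively.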
  9. Solution – Profile Matching (PM)
     Recommend to a user items matching his/her profile.
     ● Cosine similarity between user and item.
     ● Items’ tags and titles compared with users’ jobroles.
     ● Recommendation performed for filtered users only.
     ● Unlike CBF, PM can also produce recommendations for cold-start users.
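Profile matching can be sketched as a cosine similarity between a user's jobroles and an item's tags and title, all treated as bags of tokens. This is only an illustration under assumptions: the token-count vectors, the equal weighting of tags and title, and the names `cosine` and `profile_match` are not taken from the slides.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0

def profile_match(user_jobroles, item_tags, item_title):
    """Compare a user's jobrole tokens with an item's tag and title tokens."""
    user_vec = Counter(user_jobroles)
    item_vec = Counter(item_tags) + Counter(item_title)
    return cosine(user_vec, item_vec)
```

Since the score depends only on profile attributes, it needs no interaction history, which is why this kind of matching also works for cold-start users.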
  10. Solution – Collaborative Filtering algorithms
      ● CF cannot be run directly in a cold-start scenario.
      ● Content-based microclustering approach:
        ○ for each cold-start item associate the interactions of the top 5 CBF-similar non-cold-start items;
        ○ run standard CF algorithms.
      ● CF algorithms:
        ○ CF with item cosine similarity;
        ○ iALS (Implicit Alternating Least Squares).
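The microclustering step above can be sketched as follows: each cold-start item borrows the interactions of its most similar non-cold-start ("warm") items, after which a standard CF algorithm can be run on the augmented data. The function name `borrow_interactions`, the dict-of-sets representation and the pluggable `similarity` callable are assumptions; `k=5` follows the slide.

```python
def borrow_interactions(cold_items, warm_interactions, similarity, k=5):
    """Give each cold-start item the users of its k most similar warm items.

    `warm_interactions` maps warm item_id -> set of user_ids.
    `similarity(cold, warm)` is any content-based item similarity (assumed).
    """
    borrowed = {}
    for cold in cold_items:
        # Rank warm items by content similarity to the cold item, keep the top k.
        neighbours = sorted(
            warm_interactions, key=lambda warm: similarity(cold, warm), reverse=True
        )[:k]
        # The cold item inherits the union of its neighbours' users.
        borrowed[cold] = set().union(*(warm_interactions[w] for w in neighbours))
    return borrowed
```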
  11. Solution - Ensemble Structure
      ● Divide algorithms by nature.
      ● Normalize and weight each layer.
      ● Generate upper layers by adding lower layers.
      ● Output 100 best scores.
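A single layer of such an ensemble might look like the sketch below: each block's scores are normalized, multiplied by the block's weight, summed, and the best-scoring items are returned. The max-absolute-value normalization, the flat (one-layer) structure, and the names `normalize` and `ensemble` are assumptions; the slides do not specify them. A negative weight, as used for the predicted type 4 interactions, simply subtracts that block's scores.

```python
def normalize(scores):
    """Scale scores so that the largest absolute value is 1 (assumed scheme)."""
    peak = max((abs(s) for s in scores.values()), default=0.0)
    return {k: s / peak for k, s in scores.items()} if peak else dict(scores)

def ensemble(blocks, weights, top_n=100):
    """Combine per-algorithm item scores for one user into a ranked item list.

    `blocks` maps algorithm name -> {item_id: score};
    `weights` maps algorithm name -> weight (may be negative).
    """
    combined = {}
    for name, scores in blocks.items():
        for item, s in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + weights[name] * s
    # Keep the top_n items with the highest combined score.
    return sorted(combined, key=combined.get, reverse=True)[:top_n]
```

Stacking layers would amount to feeding the combined scores of one call into the `blocks` of the next.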
  12. Solution - Parameter Tuning
      ● Ensemble tuning:
        ○ 9 weights (one for each block), reduced to 6 due to normalization;
        ○ non-differentiable scoring function;
        ○ gradient-free optimization methods:
          ■ Genetic Algorithms - quick and acceptable results;
          ■ Powell’s Conjugate Direction method - slower but superior results.
      ● Individual algorithms tuning:
        ○ greedy search on local test.
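To illustrate gradient-free tuning of ensemble weights against a black-box score, here is a deliberately small genetic algorithm. It is a sketch under assumptions, not the team's optimizer: the population size, one-point crossover, Gaussian mutation and the name `genetic_search` are all illustrative choices.

```python
import random

def genetic_search(score, n_weights, generations=30, pop_size=20, seed=0):
    """Maximize a black-box `score(weights)` with a tiny genetic algorithm."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1, 1) for _ in range(n_weights)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # elitism: keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights) if n_weights > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover (new list)
            child[rng.randrange(n_weights)] += rng.gauss(0, 0.1)  # mutation
            children.append(child)
        population = parents + children
    return max(population, key=score)
```

Because the best half always survives, the best score never decreases between generations. For Powell's method, SciPy's `scipy.optimize.minimize` accepts `method="Powell"` and minimizes, so the negated score would be passed to it.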
  13. Online - Changes to ensemble
      ● Normalization type.
      ● Cutting for each user before items.
      ● Excluding slower algorithms: pushing recommendations promptly gives more exposure and therefore better scores.
  14. Architecture & Runtime
      ● The recommender runs on VMs with 8 cores and 16 GB of RAM.
      ● The only exceptions are content-based microclustering and iALS, which run on a VM with 8 cores and 64 GB of RAM.
      ● Code is heavily optimized to use memory efficiently (sparse matrix representations, efficient matrix operations).
      ● This keeps runtimes short.
  15. Scores - Local vs Offline

      Algorithm       Local score   Leaderboard score   Execution time
      CBF+                  57852               60257           13 min
      CBF-                  -1330               -8529            4 min
      PM                    17260               16777            7 min
      CF                    42213               39250           12 min
      iALS                  48081               52411          150 min
      XING Baseline         14742               14395           40 min
      Ensemble              60625               71372            2 min
  16. Results and Conclusions
      ● 2nd place in the online phase;
      ● 1st place in the offline phase.
      ● Points of strength:
        ○ speed (in particular offline ~20 min);
        ○ ease of implementation.
      ● Extensions:
        ○ feature weighting (user personalized, feature interaction);
        ○ time decay models.