
Recommender System Challenge

A description of our work on a recommender-system challenge for music recommendation, hosted on Kaggle.


  1. RecSys Challenge. MultiBeerBandits: Emanuele Ghelfi, Leonardo Arcari. https://github.com/MultiBeerBandits/recsys_challenge_2017. January 23, 2018, Politecnico di Milano.
  2. Our Approach
  3. Exploration. Basic models:
     • User-based filtering
     • Item-based filtering
     • Content-based filtering
     • SVD (content and collaborative)
     • SLIM (RMSE and BPR)
     • Matrix Factorization (RMSE and BPR)
     • Factorization Machine
  4. Exploration. Hybrid models:
     • ICM + URM Content-Based
     • CSLIM BPR
     • Ranked-list merging
     • Similarity ensemble
  5. What Worked
  6. ICM + URM Content-Based
  7. ICM + URM Content-Based. Rationale: the similarity between two items is strongly influenced by their attributes and by the number of users they have in common.
     $ICM_{augmented} = \begin{bmatrix} \alpha_1 \cdot ICM \\ \alpha_2 \cdot URM \end{bmatrix}$
     Here users are seen as features of an item.
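A minimal Python sketch of this stacking, assuming the ICM is a (n_features × n_items) sparse matrix and the URM is (n_users × n_items); the function name and shapes are our assumptions, not from the slides.

    import scipy.sparse as sps

    def augmented_icm(icm, urm, alpha_1, alpha_2):
        # Assumed shapes: icm is (n_features x n_items), urm is
        # (n_users x n_items). Stacking row-wise makes every user
        # an extra feature of each item column.
        return sps.vstack([alpha_1 * icm, alpha_2 * urm], format="csr")

Item-item cosine similarity is then computed on the columns of the augmented matrix, exactly as in a plain content-based model.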
  8. ICM + URM Content-Based. Successful preprocessing:
     • Feature weighting (hyperparameters to tune)
     • TF-IDF + top-K filtering on tags (see the sketch below)
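A minimal sketch of the TF-IDF + top-K step, assuming a binary (n_tags × n_items) tag matrix; the IDF formula and the in-place per-column filtering are our choices, not the slides'.

    import numpy as np
    import scipy.sparse as sps

    def tfidf_topk(tag_icm, k):
        # Weight each tag by inverse document frequency, then keep
        # only the k highest-weighted tags of each item column.
        n_items = tag_icm.shape[1]
        df = np.asarray((tag_icm > 0).sum(axis=1)).ravel()
        idf = np.log(n_items / (df + 1e-8))
        weighted = (sps.diags(idf) @ tag_icm).tocsc()
        for j in range(n_items):
            col = weighted.data[weighted.indptr[j]:weighted.indptr[j + 1]]
            if col.size > k:
                col[np.argsort(col)[:-k]] = 0.0  # zero all but the top k
        weighted.eliminate_zeros()
        return weighted.tocsr()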
  9. ICM + URM Content-Based. Unsuccessful preprocessing:
     • TF-IDF on all the features
     • Clustering on continuous-valued features (duration and playcount)
     • Inference on missing data (album, duration and playcount)
     • Aggregated features (tags) [1]
     [1] D. Grattarola et al., 2017. Content-Based Approaches for Cold-Start Job Recommendations.
  10. Tuning process: Raw Data → Feature Weighting → TF-IDF (tags) → Top-K (tags) → Cosine similarity → Evaluate MAP@5, with hyperparameter tuning (Random Forest) driving the loop.
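The slides do not detail how the Random Forest tunes hyperparameters; one plausible reading is a surrogate model fitted on already-evaluated configurations and used to rank fresh random candidates. A sketch under that assumption, where evaluate_map5 and sample_params are hypothetical helpers:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def tune(evaluate_map5, sample_params, n_init=20, n_rounds=30):
        # evaluate_map5(p): run the pipeline with hyperparameters p
        # and return MAP@5; sample_params(): draw a random config.
        X = [sample_params() for _ in range(n_init)]
        y = [evaluate_map5(p) for p in X]
        for _ in range(n_rounds):
            surrogate = RandomForestRegressor(n_estimators=200)
            surrogate.fit(np.array(X), np.array(y))
            candidates = np.array([sample_params() for _ in range(1000)])
            best = candidates[np.argmax(surrogate.predict(candidates))]
            X.append(best)                 # evaluate the most promising
            y.append(evaluate_map5(best))
        return X[int(np.argmax(y))]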
  11. CSLIM BPR
  12. CSLIM BPR. Preprocessing, tag aggregation:
     • Select the top-K most popular tags
     • For each tuple T of tags, compute the logical AND of their item indicators
     • The result is a new feature describing the items that share the tags in T (see the sketch below)
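A minimal sketch of the aggregation, assuming a binary (n_tags × n_items) tag matrix; top_k, tuple_size and the names are illustrative.

    import numpy as np
    import scipy.sparse as sps
    from itertools import combinations

    def aggregate_tags(tag_icm, top_k=10, tuple_size=2):
        popularity = np.asarray(tag_icm.sum(axis=1)).ravel()
        top_tags = np.argsort(popularity)[-top_k:]
        dense = tag_icm.tocsr()[top_tags].toarray().astype(bool)
        new_features = []
        for tags in combinations(range(top_k), tuple_size):
            # Indicator of the items that share every tag in the tuple.
            row = np.logical_and.reduce(dense[list(tags)])
            if row.any():
                new_features.append(row.astype(np.float64))
        return sps.csr_matrix(np.vstack(new_features))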
  13. CSLIM BPR. Our improvements:
     • Implemented a momentum update rule (sketched below)
     • Samples used to train the model are drawn from the ICM and the URM with different probabilities (in our case, sampling more from the ICM than from the URM gave better results)
     Results: even though this approach is model-based, we achieved results similar to those of the memory-based one, ICM + URM Content-Based.
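A sketch of the two improvements; the triple pools, the p_icm value and the learning rates are assumptions, not the slides' actual settings.

    import numpy as np

    rng = np.random.default_rng(0)

    def draw_sample(icm_triples, urm_triples, p_icm=0.7):
        # Draw a BPR training triple, preferring the ICM pool.
        pool = icm_triples if rng.random() < p_icm else urm_triples
        return pool[rng.integers(len(pool))]

    def momentum_step(params, velocity, grad, lr=0.05, beta=0.9):
        velocity = beta * velocity + grad  # accumulate past gradients
        params = params + lr * velocity    # ascend the BPR-OPT objective
        return params, velocity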
  14. Training process: Raw Data → Feature Aggregation → BPR-OPT → Model, optimized with RMSProp.
  15. Similarity Ensemble
  16. Similarity Ensemble. Rationale:
     • Combining different sets of features is very hard: some features occur far more often than others, and stating their relative importance is even harder
     • Different models capture different aspects of the similarity between items
     Our approach was to optimize each model separately and combine their similarity matrices.
  17. Similarity Ensemble. The model:
     $S_{ensemble} = \sum_{i \in M} \beta_i S_i$
     where M is the set of models. The coefficients $\beta_i$ are hyperparameters that we tune with a Random Forest. Properties (see the sketch below):
     • The similarity of each item with the others must be normalized
     • For each item in $S_{ensemble}$ we keep only the top-K similar items
     • $S_{ensemble}$ is normalized again
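A sketch of those three properties, assuming sims is a list of sparse item-item similarity matrices and betas their tuned weights; the l1 row normalization and top_k=100 are our assumptions.

    import numpy as np
    from sklearn.preprocessing import normalize

    def similarity_ensemble(sims, betas, top_k=100):
        # Normalize each model's similarities, then combine.
        s = sum(beta * normalize(si, norm="l1", axis=1)
                for beta, si in zip(betas, sims)).tocsr()
        for i in range(s.shape[0]):
            row = s.data[s.indptr[i]:s.indptr[i + 1]]
            if row.size > top_k:                  # keep top-K per item
                row[np.argsort(row)[:-top_k]] = 0.0
        s.eliminate_zeros()
        return normalize(s, norm="l1", axis=1)    # normalize again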
  18. Ensemble diagram: the raw data and the ICM feed the individual models (CBF, SLIM BPR, MF, IBF, UCBF), whose similarity matrices are weighted by $\beta_i$ and summed into $S_{ensemble}$.
  19. Similarity Ensemble with Clustering. [Figure 1: performance of several models on users clustered with respect to different features.]
  20. Similarity Ensemble with Clustering. The model:
     $S_{ensemble_j} = \sum_{i \in M} \beta_{ij} S_i$
     $\hat{r}_u^j = r_u \cdot S_{ensemble_j}$
     Let u be a user belonging to cluster j; the prediction $\hat{r}_u^j$ is computed from the ensemble similarity matrix personalized for that cluster ($S_{ensemble_j}$).
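A one-line sketch of the cluster-wise prediction; urm, cluster_of and s_ensembles are assumed names.

    def predict_scores(u, urm, cluster_of, s_ensembles):
        # s_ensembles[j] is the ensemble matrix tuned for cluster j.
        j = cluster_of[u]
        return urm[u] @ s_ensembles[j]  # r_hat_u = r_u . S_ensemble_j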
  21. Lessons Learned
  22. Lessons Learned:
     • Exploration/exploitation trade-off
     • No Free Lunch theorem (no single superior algorithm)
     • Do ensemble
  23. Thank you all!
