
Hybrid Solution of the Cold-Start Problem in Context-Aware Recommender Systems



  1. UMAP Doctoral Consortium - July 2014, Aalborg, Denmark
     Hybrid Solution of the Cold-Start Problem in Context-Aware Recommender Systems
     Matthias Braunhofer
     Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100 Bolzano, Italy
     mbraunhofer@unibz.it
  2. Outline
     • Context-Aware Recommenders and the Cold-Start Problem
     • Related Work
     • Basic Context-Aware Rating Prediction Models
     • Hybrid Context-Aware Rating Prediction Models
     • Conclusions and Open Issues
  4. Context-Aware Recommender Systems
     • Context-Aware Recommender Systems (CARSs) aim to provide better recommendations by exploiting contextual information (e.g., weather)
     • Rating prediction function is: R: Users × Items × Context → Ratings
     • Three basic approaches:
         • Contextual pre-filtering
         • Contextual post-filtering
         • Contextual modelling
  5. Cold-Start Problem
     • CARSs suffer from the cold-start problem
         • New user problem: How do you recommend to a new user?
         • New item problem: How do you recommend a new item with no ratings?
         • New context problem: How do you recommend in a new context?
     [Figure: example rating matrices with known ratings (1-5) and unknown ratings (?)]
  6. Our Solution: Hybrid CARS
     • Ultimate goal: design and development of hybrid CARSs that combine different CARS algorithms depending on their estimated strengths and weaknesses to predict a user’s rating for an item given a particular cold-start situation
     • Example:
     [Figure: a (user, item, context) tuple is passed to CARS 1 and CARS 2; the hybrid CARS combines their two scores into the final score]
  7. Key Steps
     • Identify candidate basic context-aware rating prediction models
     • Analyse candidate rating prediction models (what are their strengths and weaknesses under cold-start situations?)
     • Design, develop and evaluate novel hybrid CARSs
     • Integrate the best-performing hybrid CARS into our STS (South Tyrol Suggests) mobile app
     • Evaluate it through a live user study
  9. Related Work
     Cold-starting CARSs
     • … using additional data
         • Active learning (Elahi et al., 2013)
         • Cross-domain rec. (Enrich et al., 2013)
         • User / item attributes (Woerndl et al., 2009)
     • … better processing known data
         • Context similarities (Zheng et al., 2013) (Codina et al., 2013)
  11. CAMF-CC (Baltrunas et al., 2011)
      • CAMF-CC (Context-Aware Matrix Factorization for item categories) is a variant of CAMF that extends standard Matrix Factorization (MF) by incorporating baseline parameters for contextual condition-item category pairs
      $$\hat{r}_{u i c_1, \dots, c_k} = q_i^T p_u + \mu + b_i + b_u + \sum_{t \in T(i)} \sum_{j=1}^{k} b_{t c_j}$$
      where:
      • $q_i$: latent factor vector of item i
      • $p_u$: latent factor vector of user u
      • $\mu$: overall average rating
      • $b_i$: baseline for item i
      • $b_u$: baseline for user u
      • $T(i)$: set of categories associated to item i
      • $b_{t c_j}$: baseline for the item category-contextual condition pair $t c_j$
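A minimal Python sketch of the CAMF-CC prediction rule on this slide may help make the summation concrete. The function name, the dictionary layout of `b_tc`, and all numeric values are assumptions made for illustration; they are not taken from the authors' implementation.

```python
def dot(a, b):
    """Inner product of two equally long lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def predict_camf_cc(q_i, p_u, mu, b_i, b_u, b_tc, categories, conditions):
    """CAMF-CC score: q_i^T p_u + mu + b_i + b_u + sum of b_tc over (t, c_j)."""
    # Assumption of this sketch: unseen (category, condition) pairs contribute 0.
    context_bias = sum(
        b_tc.get((t, c), 0.0) for t in categories for c in conditions
    )
    return dot(q_i, p_u) + mu + b_i + b_u + context_bias


# Toy, made-up parameters for one (user, item, context) tuple.
camf_cc_score = predict_camf_cc(
    q_i=[0.5, -0.25],
    p_u=[1.0, 0.5],
    mu=3.5,
    b_i=0.25,
    b_u=-0.5,
    b_tc={("museum", "rainy"): 0.5, ("museum", "weekend"): 0.25},
    categories=["museum"],
    conditions=["rainy", "weekend"],
)
```

The double sum runs over every category of the item and every contextual condition in the target situation, so an item with several categories accumulates one baseline per (category, condition) pair.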
  12. SPF (Codina et al., 2013)
      • SPF (Semantic Pre-Filtering) is a contextual pre-filtering method that, given a target contextual situation, uses a standard MF model learnt from all the ratings tagged with contextual situations identical or similar to the target one
      • Conjecture: addresses cold-start problems caused by exact pre-filtering
      • Key step: similarity calculation
      Condition-to-item co-occurrence matrix (one row per condition):
           1     -0.5    2      1
          -2      0.5   -2     -1.5
          -2      0.5   -1     -1
      Cosine similarity between conditions:
           1     -0.96  -0.84
          -0.96   1      0.96
          -0.84   0.96   1
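The similarity step can be sketched in Python using the numbers shown on this slide. Reading the twelve co-occurrence values as three condition rows of four items each is an assumption; it is the layout under which plain cosine similarity reproduces the slide's similarity values. The function and variable names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equally long numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# One row per contextual condition, one column per item (assumed layout).
co_occurrence = [
    [1, -0.5, 2, 1],
    [-2, 0.5, -2, -1.5],
    [-2, 0.5, -1, -1],
]

# Pairwise condition-to-condition similarities, rounded as on the slide.
similarity = [[cosine(r, s) for s in co_occurrence] for r in co_occurrence]
rounded = [[round(v, 2) for v in row] for row in similarity]
```

In a pre-filtering setting, ratings tagged with conditions whose similarity to the target condition is high enough would then be kept for training; the threshold itself is not given on the slide.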
  13. Category-based CAMF-CC
      • It is a novel variant of CAMF-CC that incorporates additional sources of information about the items, i.e., category or genre information
      • Conjecture: alleviates the new item problem of CAMF-CC
      $$\hat{r}_{u i c_1, \dots, c_k} = \Big(q_i + \sum_{t \in T(i)} x_t\Big)^T p_u + \mu + b_i + b_u + \sum_{t \in T(i)} \sum_{j=1}^{k} b_{t c_j}$$
      where:
      • $q_i$: latent factor vector of item i
      • $T(i)$: set of categories associated to item i
      • $x_t$: latent factor vector of item category t
      • $p_u$: latent factor vector of user u
      • $\mu$: overall average rating
      • $b_i$: baseline for item i
      • $b_u$: baseline for user u
      • $b_{t c_j}$: baseline for the item category-contextual condition pair $t c_j$
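To illustrate why category factors may help with new items, here is a small Python sketch of the category-based prediction rule. All names and values are hypothetical, and the item vector is set to zero to mimic an item with no ratings.

```python
def predict_category_camf_cc(q_i, x, p_u, mu, b_i, b_u, b_tc, categories, conditions):
    """Score with the item vector enriched by its category vectors x[t]."""
    item_vec = list(q_i)
    for t in categories:
        # Add the latent vector of each category the item belongs to.
        item_vec = [v + w for v, w in zip(item_vec, x[t])]
    interaction = sum(v * p for v, p in zip(item_vec, p_u))
    context_bias = sum(
        b_tc.get((t, c), 0.0) for t in categories for c in conditions
    )
    return interaction + mu + b_i + b_u + context_bias


# A brand-new item: q_i and b_i carry no learnt signal (toy values).
new_item_score = predict_category_camf_cc(
    q_i=[0.0, 0.0],
    x={"museum": [0.5, -0.25]},
    p_u=[1.0, 0.5],
    mu=3.5,
    b_i=0.0,
    b_u=0.0,
    b_tc={},
    categories=["museum"],
    conditions=["rainy"],
)
```

Even with an all-zero item vector, the score differs from the global average because the category vector still interacts with the user's factors.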
  14. Demographics-based CAMF-CC
      • It is a novel variant of CAMF-CC that profiles users through known user attributes (e.g., age group, gender, personality traits)
      • Conjecture: alleviates the new user problem of CAMF-CC
      $$\hat{r}_{u i c_1, \dots, c_k} = q_i^T \Big(p_u + \sum_{a \in A(u)} y_a\Big) + \mu + b_i + b_u + \sum_{t \in T(i)} \sum_{j=1}^{k} b_{t c_j}$$
      where:
      • $q_i$: latent factor vector of item i
      • $p_u$: latent factor vector of user u
      • $A(u)$: set of user attributes
      • $y_a$: latent factor vector of user attribute a
      • $\mu$: overall average rating
      • $b_i$: baseline for item i
      • $b_u$: baseline for user u
      • $T(i)$: set of categories associated to item i
      • $b_{t c_j}$: baseline for the item category-contextual condition pair $t c_j$
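The demographics-based rule mirrors the category-based one on the user side. The following Python sketch uses hypothetical attribute names and toy values, with a zero user vector standing in for a user who has not rated anything yet.

```python
def predict_demographics_camf_cc(q_i, p_u, y, attributes, mu, b_i, b_u,
                                 b_tc, categories, conditions):
    """Score with the user vector enriched by attribute vectors y[a]."""
    user_vec = list(p_u)
    for a in attributes:
        # Add the latent vector of each known attribute of the user.
        user_vec = [v + w for v, w in zip(user_vec, y[a])]
    interaction = sum(q * v for q, v in zip(q_i, user_vec))
    context_bias = sum(
        b_tc.get((t, c), 0.0) for t in categories for c in conditions
    )
    return interaction + mu + b_i + b_u + context_bias


# A brand-new user: p_u and b_u carry no learnt signal (toy values).
new_user_score = predict_demographics_camf_cc(
    q_i=[1.0, 0.5],
    p_u=[0.0, 0.0],
    y={"age:18-25": [0.25, 0.0], "gender:female": [0.25, 0.5]},
    attributes=["age:18-25", "gender:female"],
    mu=3.5,
    b_i=0.0,
    b_u=0.0,
    b_tc={},
    categories=[],
    conditions=[],
)
```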
  15. Evaluation: Discussion
      • Offline evaluation of cold-start performance of CARSs is a complex task:
          • Not done before
          • Requires large (enough) contextually-tagged rating datasets with user and item attributes
          • Must consider multiple perspectives: new users, new items, new contextual situations, mixtures of elementary cold-start cases, different degrees of coldness, different types of user and item attribute information available
  16. Evaluation: Used Datasets
      • 2 contextually-tagged rating datasets

                                STS                     LDOS-CoMoDa
                                (Elahi et al., 2013)    (Odić et al., 2013)
        Domain                  POIs                    Movies
        Rating scale            1-5                     1-5
        Ratings                 2,422                   2,296
        Users                   305                     121
        Items                   238                     1,232
        Contextual factors      14                      12
        Contextual conditions   57                      49
        Contextual situations   880                     1,969
        User attributes         7                       4
        Item features           1                       7
  17. Evaluation: Evaluation Procedure
      • Five-fold cross-validation in which only a subset of each testing fold is used, depending on the cold-start situation under consideration
          • Divide the ratings into five cross-validation folds
          • For each fold k = 1, 2, …, 5
              • Use all ratings except those in fold k to train the prediction models
              • Calculate the Mean Absolute Error (MAE) on those ratings in fold k that come from new users, new items and new contextual situations, respectively
          • Users, items or contextual situations are new if they have at most n ratings in the training set, with n ranging from 0 to 10
      • Advantage: allows testing for different degrees of coldness
      • Drawback: the already small testing sets are filtered and become even smaller
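The filtering step of this procedure can be sketched in Python as follows. This is an illustrative sketch for a single train/test fold, not the authors' evaluation code: the rating record layout, the helper names, and the constant global-mean predictor are all assumptions.

```python
from collections import Counter


def cold_start_mae(train, test, predict, key, n):
    """MAE over test ratings whose entity has at most n training ratings.

    `key` selects the entity (user, item or context) of a rating record.
    Returns None when no test rating qualifies as cold.
    """
    counts = Counter(key(r) for r in train)
    cold = [r for r in test if counts[key(r)] <= n]
    if not cold:
        return None
    return sum(abs(predict(r) - r["rating"]) for r in cold) / len(cold)


# Toy fold: three training ratings and three testing ratings.
train = [
    {"user": "u1", "item": "i1", "context": "sunny", "rating": 4},
    {"user": "u1", "item": "i2", "context": "rainy", "rating": 2},
    {"user": "u2", "item": "i1", "context": "sunny", "rating": 3},
]
test = [
    {"user": "u3", "item": "i1", "context": "sunny", "rating": 5},
    {"user": "u2", "item": "i2", "context": "rainy", "rating": 4},
    {"user": "u1", "item": "i3", "context": "sunny", "rating": 3},
]

# Stand-in predictor: always returns the training mean.
global_mean = sum(r["rating"] for r in train) / len(train)
predict = lambda r: global_mean

mae_new_users_n0 = cold_start_mae(train, test, predict, lambda r: r["user"], 0)
mae_new_users_n1 = cold_start_mae(train, test, predict, lambda r: r["user"], 1)
mae_new_items_n0 = cold_start_mae(train, test, predict, lambda r: r["item"], 0)
```

Raising `n` from 0 to 10 admits progressively warmer users, items or contexts into the evaluated subset, which is how different degrees of coldness can be compared; in the full procedure this would be repeated for each of the five folds.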
  18. Evaluation: Obtained Results (1/3)
      MAEs for new users
      [Figure: two panels (CoMoDa, STS) plotting MAE against user profile size (0-10) for MF, CAMF-CC, SPF, Category-based CAMF-CC and Demographics-based CAMF-CC]
  19. Evaluation: Obtained Results (2/3)
      MAEs for new items
      [Figure: two panels (CoMoDa, STS) plotting MAE against item profile size (0-10) for MF, CAMF-CC, SPF, Category-based CAMF-CC and Demographics-based CAMF-CC]
  20. Evaluation: Obtained Results (3/3)
      MAEs for new contextual situations
      [Figure: two panels (CoMoDa, STS) plotting MAE against context profile size (0-10) for MF, CAMF-CC, SPF, Category-based CAMF-CC and Demographics-based CAMF-CC]
  22. Heuristic Switching*
      • Main idea: use a stable heuristic to switch between the basic CARS algorithms depending on the encountered cold-start situation
      [Figure: decision tree that routes a (user, item, context) tuple through the tests "New user?", "New item?" and "New context?" to one of CAMF-CC, Content-CAMF-CC, Demogr.-CAMF-CC, or Content-CAMF-CC & Demogr.-CAMF-CC; the score of the selected algorithm becomes the final score]
      * Described in our short paper submitted to ACM RecSys 2014
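A switching hybrid of this kind can be sketched in a few lines of Python. The exact branch order of the slide's decision tree is not recoverable from this transcript, so the rule below is an assumption: it uses the same three tests and the same four outcomes, and it averages the two scores in the combined case, which is also an assumption.

```python
def heuristic_switch(is_new_user, is_new_item, is_new_context, scores):
    """Pick a final score from precomputed per-algorithm scores.

    `scores` maps the hypothetical keys "camf_cc", "content" and
    "demographics" to the predictions of the respective basic models.
    """
    if is_new_item and (is_new_user or is_new_context):
        # Mixed cold-start case: combine both attribute-aware models.
        return (scores["content"] + scores["demographics"]) / 2
    if is_new_item:
        return scores["content"]
    if is_new_user or is_new_context:
        return scores["demographics"]
    # No cold-start condition detected: fall back to plain CAMF-CC.
    return scores["camf_cc"]


toy_scores = {"camf_cc": 3.0, "content": 4.0, "demographics": 2.0}
warm = heuristic_switch(False, False, False, toy_scores)
new_item = heuristic_switch(False, True, False, toy_scores)
new_user = heuristic_switch(True, False, False, toy_scores)
mixed = heuristic_switch(True, True, False, toy_scores)
```

Because the rule is fixed in advance, it needs no training of its own; the trade-off is that it cannot adapt when a basic model behaves unexpectedly for a particular user or item.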
  23. Adaptive Weighted*
      • Main idea: adaptively weight each basic CARS algorithm based on how well it performs for the user, item and contextual situation in question
      [Figure: a (user, item, context) tuple is scored by CAMF-CC, SPF, Content-CAMF-CC and Demogr.-CAMF-CC (algorithms layer); one adapter per algorithm turns each score into a (score, weight) pair (adaptive layer); the pairs are summed into the final score (aggregation)]
      * Described in our paper submitted to ACM RecSys 2014 Doctoral Symposium
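The aggregation step can be illustrated with a short Python sketch. How the adapters derive their weights is not specified on the slide; the inverse-error weighting below is an assumption chosen only to show the mechanics of a weighted combination, and all names and values are hypothetical.

```python
def adaptive_weighted(scores, errors, eps=1e-6):
    """Weighted average of algorithm scores.

    `errors` holds a per-algorithm error estimate for the current user,
    item and context (for example a local MAE); lower error gives a higher
    weight. The inverse-error rule is an assumption of this sketch.
    """
    weights = {name: 1.0 / (errors[name] + eps) for name in scores}
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in scores) / total


toy_scores = {"camf_cc": 2.0, "spf": 4.0}
equal_trust = adaptive_weighted(toy_scores, {"camf_cc": 0.5, "spf": 0.5})
more_trust_in_camf = adaptive_weighted(toy_scores, {"camf_cc": 0.5, "spf": 1.0})
```

With equal error estimates the result is the plain mean of the scores; halving one algorithm's error roughly doubles its share of the final score.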
  24. Evaluation: Used Datasets
      • 3 contextually-tagged rating datasets

                                STS                     LDOS-CoMoDa            Music
                                (Elahi et al., 2013)    (Odić et al., 2013)    (Baltrunas et al., 2011)
        Domain                  POIs                    Movies                 Music
        Rating scale            1-5                     1-5                    1-5
        Ratings                 2,534                   2,296                  4,012
        Users                   325                     121                    139
        Items                   249                     1,232                  139
        Contextual factors      14                      12                     8
        Contextual conditions   57                      49                     26
        Contextual situations   931                     1,969                  26
        User attributes         7                       4                      10
        Item features           1                       7                      2
  25. Evaluation: Evaluation Procedure
      • Randomly split users / items / contexts into training set and testing set → creates a set of users / items / contexts in the testing set that have no ratings in the training set
      • Advantage: the entire rating dataset can be used
      • Drawback: can’t test for different degrees of coldness
      [Figure: three example rating matrices, each divided into training set and testing set: new user test, new item test, new context test]
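A possible Python sketch of this entity-level split follows. The tuple layout of a rating, the function name, the 20% test fraction and the fixed seed are assumptions made for illustration.

```python
import random


def split_by_entity(ratings, key, test_fraction=0.2, seed=0):
    """Hold out whole entities so that test entities have no training ratings.

    `key` selects the entity (user, item or context) of a rating tuple.
    """
    entities = sorted({key(r) for r in ratings})
    rng = random.Random(seed)
    rng.shuffle(entities)
    n_test = max(1, int(len(entities) * test_fraction))
    held_out = set(entities[:n_test])
    train = [r for r in ratings if key(r) not in held_out]
    test = [r for r in ratings if key(r) in held_out]
    return train, test


# Toy data: (user, item, context, rating), 5 users with 3 ratings each.
ratings = [(f"u{i % 5}", f"i{i % 3}", "sunny", 1 + i % 5) for i in range(15)]
train, test = split_by_entity(ratings, key=lambda r: r[0])
train_users = {r[0] for r in train}
test_users = {r[0] for r in test}
```

Passing `key=lambda r: r[1]` or `key=lambda r: r[2]` instead would produce the new item or new context test in the same way. Every rating ends up in exactly one of the two sets, which matches the stated advantage that the entire dataset is used.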
  26. Evaluation: Summary of Obtained Results
      • Significant differences in normalised Discounted Cumulative Gain (nDCG) and MAE between basic CARS algorithms across different cold-start cases
          • Content-based CAMF-CC works best for the new item situation
          • Demographics-CAMF-CC works best both for the new user and new context situation
      • Hybridisation techniques can improve performance
          • In almost all cases, they outperformed the state-of-the-art CARS algorithms (i.e., CAMF-CC and SPF), thus easing the problem of model selection
  28. Conclusions
      • Basic CARS algorithms perform very differently in the different cold-start situations
      • Knowledge of strengths and weaknesses of each basic CARS algorithm in the various cold-start situations allows the development of hybrid techniques
      • The first hybrid CARS algorithms we developed and tested are able to outperform the state-of-the-art CARS algorithms (i.e., CAMF-CC and SPF)
  29. Open Issues
      • Review further knowledge sources that may provide additional information about users, items and contextual situations
      • Check the availability of large-scale, contextually-tagged datasets with item and user attributes
      • Revise the evaluation procedure and evaluation metrics used
      • Identify the best-performing hybridisation method for cold-start situations
      • Design and execute a live user study
  30. Questions or Comments?
      Thank you.
