Data Fusion for Dealing with the Recommendation Problem

Keynote at the #umap2016 Workshop on Multi-dimensional Information Fusion for User Modeling and Personalization

  1. 1. Data Fusion for Dealing with the Recommendation Problem. Denis Parra, PUC Chile. Keynote for the IFUP Workshop on Multi-dimensional Information Fusion for User Modeling and Personalization, UMAP 2016, Halifax, Canada
  2. 2. In this talk • Recommendation of articles with user-controlled fusion • Fusing data in the music domain • Fusion for e-marketplaces in virtual worlds • How to integrate time into collaborative filtering? 7/17/16 D. Parra, IFUP keynote, UMAP 2016 2
  3. 3. Part 1: Recommendation of Articles with User-Controlled Fusion
  4. 4. Recommendation of Articles • Problem: a) Traditional user feedback is (was?) difficult to obtain, b) Sparsity • There are several potential sources of recommendation, but mostly from the items: • Content • Co-citations, co-authorship • Etc. • Our approach: give users control over what to fuse. • Would it work? • How much data combination is optimal? • Does visual representation affect behavior/accuracy?
  5. 5. References • Verbert, K., Parra, D., Brusilovsky, P., & Duval, E. (2013). Visualizing recommendations to support exploration, transparency and controllability. In Proceedings of the 2013 International Conference on Intelligent User Interfaces (pp. 351-362). ACM. • Parra, D., Brusilovsky, P., & Trattner, C. (2014). See what you want to see: visual user-driven approach for hybrid recommendation. In Proceedings of the 19th International Conference on Intelligent User Interfaces (pp. 235-240). ACM. • Verbert, K., Parra, D., & Brusilovsky, P. (2014). The effect of different set-based visualizations on user exploration of recommendations. In Proceedings of the Joint Workshop on Interfaces and Human Decision Making in Recommender Systems (pp. 37-44).
  6. 6. TalkExplorer • Implemented initially for a user study at ACM Hypertext 2012 for Conference Navigator. • Main question to address: Do users consider the fusion of several sources of data when choosing relevant items?
  7. 7. Recap – Conference Navigator: Program, Proceedings, Author List, Recommendations
  8. 8. TalkExplorer Interface
  9. 9. TalkExplorer - Entities: Tags, Recommender Agents, Users
  10. 10. TalkExplorer – Central Canvas • Canvas Area: Intersections of Different Entities [Figure labels: recommender, user, cluster with intersection of entities, cluster (of talks) associated with only one entity]
  11. 11. TalkExplorer - Articles. Items: talks explored by the user
  12. 12. TalkExplorer Studies I & II • Study I • Controlled experiment: users were asked to discover relevant talks by exploring the three types of entities: tags, recommender agents and users. • Conducted at Hypertext and UMAP 2012 (21 users) • Subjects familiar with visualizations and RecSys • Study II • Field study: users were left free to explore the interface. • Conducted at LAK 2012 and ECTEL 2013 (18 users) • Subjects familiar with visualizations, but less so with RecSys
  13. 13. Evaluation: Intersections & Effectiveness • What do we call an “Intersection”? • We used the number of explorations on intersections and their effectiveness, defined as: [equation shown on the slide]
  14. 14. Results of Studies I & II • Effectiveness increases with intersections of more entities • Effectiveness wasn’t affected in the field study (study 2) • … but exploration distribution was affected
  15. 15. SetFusion • The main motivation was to investigate a simpler way to visualize recommendations from several sources. Would that improve “effectiveness”? • 3 studies were conducted • Field study at CSCW 2013 • Controlled user study with iConference series • Field study at UMAP 2013
  16. 16. SetFusion
  17. 17. SetFusion - I • Traditional Ranked List: papers sorted by relevance. It combines 3 recommendation approaches.
  18. 18. SetFusion - II • Sliders: allow the user to control the importance of each data source or recommendation method • Interactive Venn Diagram: allows the user to inspect and filter the recommended papers. Actions available: - Filter item list by clicking on an area - Highlight a paper by mouse-over on a circle - Scroll to paper by clicking on a circle - Indicate bookmarked papers
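
The slider-controlled fusion on this slide can be sketched as a weighted sum of per-method scores. This is only an illustration under stated assumptions, not SetFusion's actual implementation: the method names (`method_a`, `method_b`, `method_c`), the paper ids, the scores and the normalization by total slider weight are all hypothetical.

```python
def fuse_scores(method_scores, slider_weights):
    """Combine per-method relevance scores into one ranked list.

    method_scores: dict method -> dict paper_id -> score in [0, 1]
    slider_weights: dict method -> non-negative weight chosen by the user
    """
    total = sum(slider_weights.values())
    if total == 0:
        return []  # every slider at zero: nothing to rank
    fused = {}
    for method, scores in method_scores.items():
        weight = slider_weights.get(method, 0.0) / total  # normalized weight
        for paper, score in scores.items():
            fused[paper] = fused.get(paper, 0.0) + weight * score
    # Highest fused score first; ties broken by paper id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from three recommendation methods.
scores = {
    "method_a": {"p1": 0.9, "p2": 0.4},
    "method_b": {"p2": 0.8, "p3": 0.5},
    "method_c": {"p2": 0.6, "p3": 0.9},
}
ranking = fuse_scores(scores, {"method_a": 1.0, "method_b": 1.0, "method_c": 1.0})
```

With equal weights, a paper recommended by all three methods (`p2`) rises to the top; moving a slider to zero removes that method's influence on the ranking.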
  19. 19. SetFusion Controlled Study • 40 users, within-subjects study, simulated iConference attendance
  20. 20. Controlled Study Main Results • Controlling and fusing sources of relevance produces more bookmarks: • 58.44% of bookmarks after using sliders • 28.08% of bookmarks after using Venn diagram
  21. 21. Controlled Study Main Results • Users prefer articles recommended by a fusion of methods, in both conditions, but the effect is stronger with the visualization
  22. 22. SetFusion – UMAP 2013 • Field Study: let users freely explore the interface - ~50% (50 users) tried the SetFusion recommender - 28% (14 users) bookmarked at least one paper - On average, users explored 14.9 talks and bookmarked 7.36 talks.
  23. 23. TalkExplorer vs. SetFusion: Clustermap vs. Venn diagram
  24. 24. TalkExplorer vs. SetFusion • Comparing distributions of explorations: in studies 1 and 2 with TalkExplorer we observed an important change in the distribution of explorations.
  25. 25. TalkExplorer vs. SetFusion • Comparing distributions of explorations. Comparing the field studies: - In TalkExplorer, 84% of the explorations over intersections were performed over clusters of 1 item - In SetFusion, this was only 52%, compared to 48% (18% + 30%) for multiple intersections; difference not statistically significant
  26. 26. Take-aways • We showed that intersections of several contexts of relevance help to discover relevant items. • The visual paradigm used can have a strong effect on user behavior: we need to keep working on visual representations that promote exploration without increasing the cognitive load on users.
  27. 27. Part 2: Fusing Data in the Music Domain
  28. 28. References • Parra-Santander, D., & Amatriain, X. (2011). Walk the Talk: Analyzing the relation between implicit and explicit feedback for preference elicitation. Proceedings of UMAP 2011, Girona, Spain. • Parra, D., Karatzoglou, A., Amatriain, X., & Yavuz, I. (2011). Implicit feedback recommendation via implicit-to-explicit ordinal logistic regression mapping. Proceedings of the CARS Workshop, RecSys, Chicago, IL, USA, 2011.
  29. 29. Introduction (back in 2011) • Most recommender system approaches rely on explicit information from users, but… • Explicit feedback: scarce (people are not especially eager to rate or to provide personal info) • Implicit feedback: less scarce, but (Hu et al., 2008): - There’s no negative feedback … and if you watch a TV program just once or twice? - Noisy … but explicit feedback is also noisy (Amatriain et al., 2009) - Preference & Confidence … we aim to map the I.F. to preference (our main goal) - Lack of evaluation metrics … if we can map I.F. and E.F., we can have a comparable evaluation
  30. 30. Introduction (Today) • Is it possible to map implicit behavior to explicit preference (ratings)? These data can eventually be fused into a single compact model. • OUR APPROACH: Study with Last.fm users • Part I: Ask users to rate 100 albums (how to sample) • Part II: Build a model to map collected implicit feedback and context to explicit feedback
  31. 31. Walk the Talk (2011) • Albums they listened to during the last: 7 days, 3 months, 6 months, year, overall • For each album in the list we obtained: # of user plays (in each period), # of global listeners and # of global plays
  32. 32. Walk the Talk - 2 • Requirements: 18 y.o., scrobbles > 5000
  33. 33. Quantization of Data for Sampling • What items should they rate? Item (album) sampling: • Implicit Feedback (IF): playcount for a user on a given album. Changed to scale [1-3], 3 means being more listened to. • Global Popularity (GP): global playcount for all users on a given album. Changed to scale [1-3], 3 means being more listened to. • Recentness (R): time elapsed since the user played a given album. Changed to scale [1-3], 3 means being listened to more recently.
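
The quantization to a [1-3] scale described above could be sketched as follows. The slide does not state the cut points, so splitting at rank terciles is an assumption made here for illustration, and the function name and sample playcounts are hypothetical.

```python
def quantize_1_to_3(values):
    """Map each value to level 1, 2 or 3 by rank tercile (3 = highest).

    Assumption: cut points are the terciles of the observed values.
    """
    if not values:
        return []
    ranked = sorted(values)
    n = len(ranked)
    low_cut = ranked[n // 3]          # values below this get level 1
    high_cut = ranked[(2 * n) // 3]   # values at or above this get level 3
    levels = []
    for v in values:
        if v < low_cut:
            levels.append(1)
        elif v < high_cut:
            levels.append(2)
        else:
            levels.append(3)
    return levels


# Hypothetical per-album playcounts for one user (IF).
playcounts = [2, 150, 40, 7, 900, 65]
if_levels = quantize_1_to_3(playcounts)

# Recentness: fewer elapsed days should map to a higher level,
# so the elapsed time is negated before quantizing.
days_since_play = [1, 400, 30]
re_levels = quantize_1_to_3([-d for d in days_since_play])
```

The same function serves IF and GP directly; only recentness needs the sign flip, because a smaller elapsed time means a more recent play.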
  34. 34. Regression Analysis • Models: M1: implicit feedback; M2: implicit feedback & recentness; M3: implicit feedback, recentness, global popularity; M4: interaction of implicit feedback & recentness • Including recentness (RE) increases R² by more than 10% [M1 -> M2] • Including GP increases R², but not much compared to IF + RE [M1 -> M3] • Not including GP, but including the interaction between IF and RE, increases the variance of the dependent variable explained by the regression model [M2 -> M4]
  35. 35. Regression Analysis • We tested the conclusions of the regression analysis by predicting the score and checking RMSE in 10-fold cross-validation. • The results of the regression analysis are supported.

      | Model | RMSE1 | RMSE2 |
      |---|---|---|
      | User average | 1.5308 | 1.1051 |
      | M1: Implicit feedback | 1.4206 | 1.0402 |
      | M2: Implicit feedback + recentness | 1.4136 | 1.034 |
      | M3: Implicit feedback + recentness + global popularity | 1.4130 | 1.0338 |
      | M4: Interaction of implicit feedback * recentness | 1.4127 | 1.0332 |
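
The evaluation setup on this slide can be illustrated with a small scaffold: the feature sets of M1 to M4, an RMSE function, and a 10-fold index split. It does not fit the regression itself. The function names are hypothetical, and treating M4 as main effects plus the IF × RE product is an assumption about what "interaction" means here.

```python
import math


def design_row(if_level, re_level, gp_level, model):
    """Feature vector (with intercept) for one (user, album) observation."""
    if model == "M1":
        return [1.0, if_level]
    if model == "M2":
        return [1.0, if_level, re_level]
    if model == "M3":
        return [1.0, if_level, re_level, gp_level]
    if model == "M4":
        # Assumption: main effects plus their product, no global popularity.
        return [1.0, if_level, re_level, if_level * re_level]
    raise ValueError(f"unknown model: {model}")


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold f tests every k-th item."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test
```

In use, each fold would fit the chosen model on the `train` rows and report `rmse` on the `test` rows, with the per-fold values averaged for comparison against the user-average baseline.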
  36. 36. Part II: Extension of Walk the Talk • Implicit Feedback Recommendation via Implicit-to-Explicit OLR Mapping (RecSys 2011, CARS Workshop) • Consider ratings as ordinal variables • Use mixed models to account for non-independence of observations • Compare with a state-of-the-art implicit feedback algorithm
  37. 37. Recalling the 1st study (5/5) • Prediction of rating by multiple linear regression, evaluated with RMSE. • Results showed that implicit feedback (play count of the album by a specific user) and recentness (how recently an album was listened to) were important factors; global popularity had a weaker effect. • Results also showed that listening style (whether the user preferred to listen to single tracks, CDs, or either) was an important factor, while the other factors examined were not.
  38. 38. ... but • Linear regression didn’t account for the nested nature of ratings • And ratings were treated as continuous, when they are actually ordinal. [Figure: ratings grouped by user, from User 1 to User n]
  39. 39. So, Ordinal Logistic Regression! • Actually Mixed-Effects Ordinal Multinomial Logistic Regression • Mixed-effects: nested nature of ratings • We obtain a distribution over ratings (ordinal multinomial) for each (user, item) pair -> we predict the rating using the expected value. • … And we can compare the inferred ratings with a method that directly uses implicit information (playcounts) to recommend (Hu, Koren et al., 2008)
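
The "distribution over ratings, then expected value" step can be sketched with a standard cumulative-logit (proportional odds) formulation. This is a generic illustration and not necessarily the exact model from the paper: the thresholds, the linear predictor `eta` and the rating scale starting at 1 are assumptions.

```python
import math


def rating_distribution(eta, thresholds):
    """P(rating = k) under a cumulative-logit model.

    eta: linear predictor for one (user, item) pair; in a mixed-effects
         model it would include the user's random effect (assumed here).
    thresholds: increasing cut points; yields len(thresholds) + 1 levels.
    """
    # P(rating <= k) = sigmoid(threshold_k - eta); the top level closes at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)
    probabilities, previous = [], 0.0
    for c in cumulative:
        probabilities.append(c - previous)
        previous = c
    return probabilities


def expected_rating(eta, thresholds, lowest=1):
    """Predicted rating: expectation of the ordinal distribution."""
    probabilities = rating_distribution(eta, thresholds)
    return sum((lowest + k) * p for k, p in enumerate(probabilities))


# Hypothetical cut points for a five-level scale.
cut_points = [-2.0, -1.0, 1.0, 2.0]
neutral = expected_rating(0.0, cut_points)
favourable = expected_rating(1.5, cut_points)
```

A larger `eta` (for example, more plays and more recent listening) shifts probability mass toward higher ratings, so the expected rating increases.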
  40. 40. Ordinal Regression for Mapping • Model • Predicted value
  41. 41. Datasets • D1: users, albums, if, re, gp, ratings, demographics/consumption • D2: users, albums, if, re, gp, NO RATINGS.
  42. 42. Results
  43. 43. Conclusions (after 5 years) • Fusion of implicit feedback (scrobbles) and recency can help to make more precise recommendations • Models like the one by Gurbanov and Ricci presented this year at UMAP offer a more compact way to work with these data: “Modeling and Predicting User Actions in Recommender Systems” by Tural Gurbanov, Francesco Ricci, Meinhard Ploner • Evaluation is still a challenge!
  44. 44. Part 3: Data Fusion for Virtual Worlds
  45. 45. References • Lacic, E., Kowald, D., Eberhard, L., Trattner, C., Parra, D., & Marinho, L. B. (2015). Utilizing online social network and location-based data to recommend products and categories in online marketplaces. In Mining, Modeling, and Recommending 'Things' in Social Media (pp. 96-115). Springer International Publishing. • Trattner, C., Parra, D., Eberhard, L., & Wen, X. (2014, April). Who will trade with whom?: Predicting buyer-seller interactions in online trading platforms through social networks. In Proceedings of the 23rd International Conference on World Wide Web (pp. 387-388). ACM.
  46. 46. Second Life: Social Network, Marketplace, Virtual World (Christoph Trattner, Know-Center, Graz, Austria)
  47. 47. Dataset (Task: Item recommendation)
  48. 48. Recommendation Approaches • User-based Collaborative Filtering, where • Hybrid approaches (combine features)
  49. 49. Similarity Features - I
  50. 50. Similarity Features - II
  51. 51. Hybrids
  52. 52. Different Task: Predict Buyer-Seller
  53. 53. Predict Buyer-Sellers: AUC Results
  54. 54. Summary • These studies show that social network data is very important for certain types of recommendations. • Due to the lack of available cross-service data in the real world, data from Second Life can potentially serve as a proxy for building models for the real world.
  55. 55. Part 4: Fusion of Time into CF
  56. 56. References • Larrain, S., Trattner, C., Parra, D., Graells-Garrido, E., & Nørvåg, K. (2015). Good Times Bad Times: A Study on Recency Effects in Collaborative Filtering for Social Tagging. In Proceedings of the 9th ACM Conference on Recommender Systems (pp. 269-272). ACM.
  57. 57. Time-Aware Collaborative Filtering • Collaborative Filtering (user- and item-based) considers all transactions equally important • But transactions that happened too long ago might be less important in shaping the user model… [Figure: ratings of the active user, User_1 and User_2; Item 1 consumed 2 years ago, Item 2 consumed 1 month ago]
  58. 58. Two Concepts for Time-Aware CF • Items consumed recently might be more important than items consumed a long time ago. • When and how to incorporate time in user- and item-based collaborative filtering?
  59. 59. When and How in UB-CF • Step 1: Find similar users. Weight transactions based on recency difference. [Figure: user × item rating matrix]
  60. 60. When and How in UB-CF • Step 2: Similar users found. Recommend items with high ratings and consumed recently. [Figure: user × item rating matrix]
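
Step 2 of user-based CF, favoring items that similar users rated highly and consumed recently, might look like the sketch below. It is a hedged illustration rather than the authors' implementation: the scoring formula (similarity × rating × decay), the power decay with exponent 0.5, and all data values are assumptions.

```python
def power_decay(age_days, exponent=0.5):
    """Weight in (0, 1] that shrinks as a transaction gets older (assumed form)."""
    return (1.0 + age_days) ** (-exponent)


def recommend(neighbors, seen, now_day, decay, top_n=3):
    """Rank unseen items from already-found similar users.

    neighbors: list of (similarity, {item: (rating, consumed_day)})
    seen: items the active user already consumed
    """
    scores = {}
    for similarity, history in neighbors:
        for item, (rating, day) in history.items():
            if item in seen:
                continue
            age = now_day - day
            scores[item] = scores.get(item, 0.0) + similarity * rating * decay(age)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical neighbors: item "a" is highly rated but mostly consumed long ago.
neighbors = [
    (0.9, {"a": (5, 10), "b": (4, 990)}),
    (0.5, {"a": (3, 980), "c": (5, 995)}),
]
with_time = recommend(neighbors, seen={"x"}, now_day=1000, decay=power_decay)
without_time = recommend(neighbors, seen={"x"}, now_day=1000, decay=lambda age: 1.0)
```

Comparing `with_time` and `without_time` shows how the recency weighting can reorder the list: the old but highly rated item drops below items consumed in the last few days.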
  61. 61. When and How in IB-CF • Step 1: Find similar items sim(items(user i)). Weight items based on recency. [Figure: user × item rating matrix; one item consumed 1 week ago, another consumed 1 year ago]
  62. 62. When and How in IB-CF • Step 2: Find similar items (Item 1). Weight items based on recency difference. [Figure: user × item rating matrix]
  63. 63. Decay functions • Exponential • Power • Linear • Logistic • BLL
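
The slide names the decay families without giving their formulas, so the sketch below uses common textbook forms as an assumption. All parameter names and default values (`rate`, `exponent`, `horizon`, `midpoint`, `steepness`, `d`) are hypothetical; BLL follows the base-level learning equation from the ACT-R cognitive architecture.

```python
import math


def exponential_decay(t, rate=0.01):
    """exp(-rate * t): constant proportional loss per time unit."""
    return math.exp(-rate * t)


def power_decay(t, exponent=0.5):
    """(1 + t)^(-exponent): fast early drop, long tail."""
    return (1.0 + t) ** (-exponent)


def linear_decay(t, horizon=365.0):
    """Falls linearly to zero at `horizon`, then stays at zero."""
    return max(0.0, 1.0 - t / horizon)


def logistic_decay(t, midpoint=50.0, steepness=0.1):
    """Reverse S-curve: close to 1 before `midpoint`, close to 0 after it."""
    return 1.0 / (1.0 + math.exp(steepness * (t - midpoint)))


def bll_activation(ages, d=0.5):
    """Base-level learning: ln(sum of age^(-d)) over past usages (ages > 0)."""
    return math.log(sum(age ** (-d) for age in ages))


# Weight of a transaction made 10 days ago versus 100 days ago.
for decay in (exponential_decay, power_decay, linear_decay, logistic_decay):
    print(decay.__name__, round(decay(10), 4), round(decay(100), 4))
```

Unlike the other four, BLL aggregates over all past usages of an item, so it rewards both recency and frequency.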
  64. 64. Parameters and fitting [Figure: distribution of days from bookmark; median = 50 days]
  65. 65. Evaluation: Datasets
  66. 66. Evaluation: Results I
  67. 67. Evaluation: Results II
  68. 68. Summary • Best results: post-filtering combined with power decay. • Pre- and post-filtering produce a strong effect, but UB-CF is more susceptible than IB-CF to the effect of filtering, especially pre-filtering. • The hybridization of UB and IB makes the recommendations more robust. • Future work: fit parameters on a per-user basis rather than a per-dataset basis.
  69. 69. Wrapping up
  70. 70. • Visual approaches for user-controllable data fusion can work, but there’s room to find effective visual-interactive combinations. • In the music domain and other domains, time and recency can work very well for recommendation. • …but using time requires adequate modeling of the decay functions. • Information from virtual worlds could be used as a proxy to build models and use them for transfer learning.
  71. 71. Promising work at UMAP 2016 • Using semantic information: extend the work of Musto et al. (UMAP 2016) to support better and more explainable models. • Combine taxonomies with implicit/explicit feedback using compact graphical models (co-authored by G. Guo) • Extend models with time and other sources of feedback (Gurbanov et al.)
  72. 72. Ideas for Data Fusion • Combining multimodal information within the same embedding using deep learning has given great results in the visual processing + NLP fields: • Visual Q&A • Automatic Captioning of Pictures • Reference: Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C. L., & Parikh, D. (2015). VQA: Visual question answering. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2425-2433).
  73. 73. Thanks! dparras@uc.cl http://dparra.sitios.ing.uc.cl/
