User Engagement as Evaluation: a Ranking or a Regression Problem?

Slides presenting the winning approach of the Recsys Challenge 2014 workshop, presented at the RecSys 2014 conference on Oct 10, in Foster City (CA, USA) by Frédéric Guillou.


  1. User Engagement as Evaluation: a Ranking or a Regression Problem? Frederic Guillou, Romaric Gaudel, Jeremie Mary and Philippe Preux. Recsys, 10/10/2014. F. Guillou, R. Gaudel, J. Mary, P. Preux. A Ranking or a Regression Problem? October 10, 2014. 1 / 17
  2. Plan. 1 Introduction. 2 Recsys Challenge: Description; User Engagement as Evaluation; The Retweet Effect. 3 Method: LambdaMART Model; Regression Approach. 4 Experiments: Experimental Results; Relevant Features. 5 Conclusion.
  3. Introduction. Standard approaches for recommender systems try to predict ratings or interests by minimizing prediction error. Such approaches have flaws: good predictions do not necessarily mean good recommendations. Learning to Rank (LTR): use a set of query-item pairs, each described by a set of input features and a score, and try to predict the most accurate ranking of items for a query.
  4. Recsys Challenge / Description. A specific recommendation task: find the items with the highest user engagement, that is, rank a set of tweets according to their success (retweets, favorites). Use a regression or an LTR method? Data: tweets from the MovieTweetings dataset, which represent ratings given in the IMDb app. Input features are metadata about the user, the tweet and the movie.
  6. Recsys Challenge / User Engagement as Evaluation. The overall evaluation is the average of the NDCG@10 calculated for each user.
     Metric                       Value
     Tweets                       170,285
     Unique users                 22,079
     Unique items                 13,618
     Tweets with 0 engagement     162,107 (95.2%)
     Unsuccessful users           17,502 (79.27%)
     [Figure: Distribution of user engagement for successful tweets; bins 1, 2, 3, 4, 5-14, 15-51.]
     Almost a binary classification problem? Low number of successful tweets: most users only have tweets with zero engagement. Even among successful tweets, the engagement is mostly 1.
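The evaluation described on this slide (NDCG@10 per user, then averaged) can be sketched as follows. This is a minimal illustration, not the challenge's official scorer: the exponential gain `2**g - 1`, the tuple layout of `rows`, and the `empty_value` convention for users whose tweets all have zero engagement are assumptions. The default of 1.0 is chosen only because scoring such users as 0 would seem hard to reconcile with reported averages above 0.8 when most users are unsuccessful.

```python
import math
from collections import defaultdict


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain of the first k gains, taken in ranked order."""
    # Assumption: exponential gain 2**g - 1 with a log2 position discount.
    return sum((2 ** g - 1) / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, k=10, empty_value=1.0):
    """NDCG@k of one user's list; empty_value is used when the ideal DCG is 0."""
    ideal = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    if ideal == 0:
        return empty_value  # assumption: every ranking of all-zero tweets is equally good
    return dcg_at_k(ranked_gains, k) / ideal


def mean_ndcg(rows, k=10, empty_value=1.0):
    """rows: iterable of (user_id, predicted_score, true_engagement) tuples."""
    per_user = defaultdict(list)
    for user, score, engagement in rows:
        per_user[user].append((score, engagement))
    values = []
    for items in per_user.values():
        # Rank the user's tweets by predicted score, highest first.
        ranked = [eng for _, eng in sorted(items, key=lambda t: t[0], reverse=True)]
        values.append(ndcg_at_k(ranked, k, empty_value))
    return sum(values) / len(values)
```

For example, `mean_ndcg([("u1", 0.9, 3), ("u1", 0.1, 0)])` scores a single user whose only successful tweet is ranked first.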
  8. Recsys Challenge / User Engagement as Evaluation. [Figure: Distribution of the number of tweets per user; bins 1, 2, 3-10, 11-20, >20.] Some cases where any ranking is equivalent: the user does not have any successful tweet; the user has only one tweet, or only tweets with the exact same engagement.
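The cases listed on this slide share one property: the user's tweets carry a single distinct engagement value. A small sketch of a filter built on that observation, where the `(user_id, engagement)` row layout and the function name are hypothetical:

```python
from collections import defaultdict


def users_with_informative_ranking(rows):
    """rows: iterable of (user_id, engagement) pairs, one per tweet.

    Returns the users for whom the order of tweets can change NDCG,
    i.e. users with at least two distinct engagement values.
    """
    engagement_values = defaultdict(set)
    for user, engagement in rows:
        engagement_values[user].add(engagement)
    # One distinct value covers all three cases: no successful tweet,
    # a single tweet, or several tweets with identical engagement.
    return {user for user, values in engagement_values.items() if len(values) > 1}
```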
  9. Recsys Challenge / The Retweet Effect. Some of the tweets are retweets, not original tweets. This has strong consequences: all these tweets have a strictly positive engagement, and the retweet count of the tweet is directly accessible through the original tweet metadata.
     Metric                                 Value
     Number of retweets                     1,808
     Percentage among successful tweets     22.10%
     Artificial successful users            28.44%
     After removal of the retweet effect and cleaning of users, 2,792 users provide a clean ranking.
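As a quick consistency check, the percentages quoted in the slides can be recomputed from the raw counts. The only assumption is that a "successful" tweet or user is simply one that is not counted as zero-engagement or unsuccessful.

```python
# Counts as given in the slides.
tweets = 170_285
zero_engagement_tweets = 162_107
users = 22_079
unsuccessful_users = 17_502
retweets = 1_808

# Assumption: "successful" means "not in the zero-engagement / unsuccessful group".
successful_tweets = tweets - zero_engagement_tweets
successful_users = users - unsuccessful_users

share_zero_tweets = 100 * zero_engagement_tweets / tweets
share_unsuccessful_users = 100 * unsuccessful_users / users
share_retweets_among_successful = 100 * retweets / successful_tweets

print(f"successful tweets: {successful_tweets}, successful users: {successful_users}")
print(f"zero-engagement tweets: {share_zero_tweets:.1f}%")
print(f"unsuccessful users: {share_unsuccessful_users:.2f}%")
print(f"retweets among successful tweets: {share_retweets_among_successful:.1f}%")
```

The recomputed shares agree with the slide values to about one decimal place.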
  11. Method / LambdaMART. A listwise approach: consider the whole list of items as a learning instance and try to optimize a performance measure directly. A combination of two previous algorithms. MART: a pointwise approach based on a boosted tree model; the output is a linear combination of the outputs of a set of regression trees; it can be viewed as performing gradient descent in function space using regression trees. LambdaRank: a method based on neural networks; it expresses gradients based on the ranks of documents; the lambda terms are rules defining how to change the ranks of items in order to optimize a performance measure.
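To make the "lambda terms" concrete, here is a rough sketch of how such terms are commonly computed for one user's list: each pair of items with different engagement contributes a push proportional to a RankNet-style logistic factor times the NDCG change obtained by swapping the two items. This follows the usual LambdaRank formulation as an illustration only; the function names, the `sigma` parameter, the exponential gain, and the use of untruncated NDCG are assumptions, not details taken from the slides. In LambdaMART, regression trees would then be fitted to these values at each boosting step.

```python
import math


def position_discount(rank):
    """DCG discount for a 0-based rank."""
    return 1.0 / math.log2(rank + 2)


def lambda_forces(scores, gains, sigma=1.0):
    """Per-item push (positive means: raise this item's score) for one list."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank = {item: pos for pos, item in enumerate(order)}
    ideal = sum((2 ** g - 1) * position_discount(pos)
                for pos, g in enumerate(sorted(gains, reverse=True)))
    forces = [0.0] * n
    if ideal == 0:
        return forces  # no engagement at all: every ranking is equivalent
    for i in range(n):
        for j in range(n):
            if gains[i] <= gains[j]:
                continue  # only pairs where item i should be ranked above item j
            # |change in NDCG| if items i and j swapped positions.
            delta = abs((2 ** gains[i] - 2 ** gains[j])
                        * (position_discount(rank[i]) - position_discount(rank[j]))) / ideal
            # Logistic factor: large when the pair is currently mis-ordered.
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            push = sigma * rho * delta
            forces[i] += push
            forces[j] -= push
    return forces
```

With `scores=[0.1, 0.9]` and `gains=[1, 0]`, the first item is relevant but ranked last, so it receives a positive push and the second item an equal negative one.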
  13. Method / LambdaMART. Build the model: train on a modified dataset with only useful users. Very small dataset, with most users having only a few successful tweets. LambdaMART score: 0.838. LambdaMART cannot capture exactly the information about the retweet effect.
  15. Method / Regression methods. The retweet effect can be taken into account through three steps: (1) remove the retweet count of the original tweet from the features; (2) modify the user engagement by subtracting the retweet count of the original tweet; (3) add the retweet count of the original tweet to the result returned by the learned model. Pointwise approaches: build a regression model and try to predict the score of items. Use of Linear Regression and Random Forests: Lin / WrapLin: 0.806 / 0.843; RF / WrapRF: 0.823 / 0.858.
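The three steps above, presumably what the "Wrap" variants refer to, can be sketched as a thin wrapper around any regressor. Everything named here is hypothetical: the row keys `feature`, `retweet_count` and `engagement`, the single-feature least-squares base learner standing in for Linear Regression or Random Forests, and the convention that `retweet_count` is 0 for tweets that are not retweets.

```python
def fit_least_squares_1d(xs, ys):
    """Stand-in base learner (assumption): fits y ~ a * x + b on one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
             if var_x else 0.0)
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_wrapped(rows, fit=fit_least_squares_1d):
    """rows: list of dicts with keys 'feature', 'retweet_count', 'engagement'."""
    # Step 1: the retweet count of the original tweet is not used as a feature.
    xs = [row["feature"] for row in rows]
    # Step 2: the target is the engagement minus the original tweet's retweet count.
    targets = [row["engagement"] - row["retweet_count"] for row in rows]
    model = fit(xs, targets)

    def predict(row):
        # Step 3: add the retweet count back to the model's output.
        return model(row["feature"]) + row["retweet_count"]

    return predict
```

The design point is that the model only has to learn the engagement not already explained by the known retweet count, which is then restored at prediction time.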
  16. Experiments / Results. Table: NDCG@10 on test dataset.
     Model                               NDCG@10
     Retweet                             0.806
     λ-MART                              0.838
     RF / WrapRF                         0.823 / 0.858
     Lin / WrapLin                       0.806 / 0.843
     Mean(λ-MART, Retweet)               0.876
     Mean(λ-MART, WrapRF, WrapLin)       0.874
     OptAvg(λ-MART, WrapRF, WrapLin)     0.876
     OptAvg(λ-MART, RF, Retweet)         0.878
     The improvement through wrapping shows the importance of the retweet effect. All the best NDCG scores use the LTR method LambdaMART.
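The Mean and OptAvg rows combine the scores of several models. The slides do not spell out how OptAvg is computed; one plausible reading, sketched here as an assumption, is a weighted average whose weights are picked by a grid search to maximize a validation metric. The function names, the `steps` grid resolution, and the `metric` callback are all hypothetical.

```python
from itertools import product


def blend(score_lists, weights):
    """Weighted sum of per-item scores; score_lists holds one list per model."""
    return [sum(w * s for w, s in zip(weights, item_scores))
            for item_scores in zip(*score_lists)]


def mean_blend(score_lists):
    """Plain average of the models' scores."""
    n_models = len(score_lists)
    return blend(score_lists, [1.0 / n_models] * n_models)


def opt_avg(score_lists, metric, steps=10):
    """Grid search over non-negative weights summing to 1.

    metric: callable taking the blended scores and returning a value to
    maximize (for instance a mean NDCG@10 on held-out data).
    """
    best_weights, best_value = None, float("-inf")
    for combo in product(range(steps + 1), repeat=len(score_lists)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = [c / steps for c in combo]
        value = metric(blend(score_lists, weights))
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value
```

Since only the ordering of scores matters for ranking, models whose outputs live on very different scales may need rescaling before blending; that step is left out of this sketch.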
  17. Experiments / Relevant Features with Random Forests. [Figure: three bar charts of feature importance, axis from 0.0 to 0.6.] All features (retweeted_fav, retweeted_retweet, listed_count, followers_count, retweeted_status, lang_ar, rating, votes, release_date, created_at): features of the original tweet contribute the most. Without retweet (followers_count, listed_count, favourites_count, lang_ar, rating, media, friends_count, user_mentions, time_tweet_scrap, created_at): user features are the important ones. Without user features (rating, lang_ar, time_tweet_movierelease, release_date, media, user_mentions, budget, time_tweet_scrap, lang_fa, votes): the rating given to the movie by the user contributes the most.
  18. Conclusion. Recommendation problems are Learning to Rank problems. The best approach builds upon an LTR strategy. We can combine LambdaMART and Random Forests through a linear combination. Scarcity of data makes the robustness of the model difficult to prove. The success of Twitter or IMDb should make data collection easier, allowing bigger datasets and further analysis and experiments.
  19. Thank you for your attention. Questions?
  21. Discussions. Temporal aspects: the difference in time interval between the train and test sets influences prediction or evaluation. Time possibly has a large impact on some features: a user has more chances to gather followers over a longer period of time; some movies are released at a specific time of the year. A sequential approach? The evolution of a user or the popularity of a movie can be seen as sequential problems. NDCG does not capture this information ⇒ see metrics such as regret.
  23. Thank you for your attention. Now is the real question time.