
CLEF NewsREEL 2016 Overview

Overview of the CLEF NewsREEL 2016 lab, presented on 5 September at CLEF 2016 in Évora, Portugal.


  1. News REcommendation Evaluation Lab (NewsREEL): Lab Overview. Frank Hopfgartner, Benjamin Kille, Andreas Lommatzsch, Martha Larson, Torben Brodt, Jonas Seiler
  2. Recommender Systems. Recommender systems (or recommendation systems) are a subclass of information filtering systems that seek to predict the "rating" or "preference" that a user would give to an item.
  3. Items (Set-based recommenders)
  4. Items (Streams)
  5. Outline: Recommender Systems • Evaluation • NewsREEL scenario • NewsREEL 2016
  6. How do we evaluate? Academia: static, often rather old datasets • offline evaluation • focus on algorithms and precision. Industry: dynamic datasets • online A/B testing • focus on user satisfaction.
  7. Example: Recommending sites in Évora. Sé Catedral • Capela dos Ossos • Templo Romano • Palácio de Dom Manuel I. (Image source: Wikipedia)
  8. Offline Evaluation: dataset construction. 1. Choose a time point t0 to split the dataset; 2. Classify ratings before t0 as the training set; 3. Classify ratings after t0 as the test set. (Example rating matrix: items Centro Historico, Capela dos Ossos, Templo Romano, Sé Catedral, Almendres Cromlech; ratings by Cristiano: 2, 4, 5, 2; Marta: 5, 3; Luis: 1, 2.)
  9. Offline Evaluation: benchmarking the recommendation task. 1. Train a rating function f(u,i) using the training set; 2. Predict ratings for all pairs (u,i) in the test set; 3. Compute RMSE(f) over all rating predictions. (A sketch of this procedure follows the slide list.)
  10. Drawbacks of offline evaluation: • Ignores users' previous interactions/preferences • Does not consider trends/shifting preferences • ... • Technical challenges are ignored. ("And what about me?")
  11. Example: Online evaluation
  12. Online Evaluation (A/B testing). Compare the performance of variants A and B, e.g., based on profit margins • Click-through rate • User retention time • Required resources • User satisfaction. (A CTR sketch follows the slide list.)
  13. Drawbacks of online evaluation: • Large user base and costly infrastructure required • Different evaluation metrics required • Comparison to offline evaluation challenging. ("And what about me?")
  14. Evaluation challenges: • Academia and industry apply different evaluation approaches • Limited transfer from the offline to the online scenario • Multi-dimensional benchmarking • Combination of different evaluation approaches
  15. Outline: Recommender Systems • Evaluation • NewsREEL scenario • NewsREEL 2016
  16. CLEF NewsREEL. In CLEF NewsREEL, participants can develop stream-based news recommendation algorithms and have them benchmarked (a) online by millions of users over the period of a few months, and (b) offline by simulating a live stream.
  17. NewsREEL scenario. (Image: courtesy of T. Brodt, plista)
  18. NewsREEL scenario: profit = clicks on recommendations; benchmarking metric: click-through rate (CTR). (Diagram: users request articles, the portals request recommendations; a CTR sketch follows the slide list.)
  19. Task 2: Offline Evaluation • Traffic and content updates of nine German-language news content provider websites • Traffic: reading articles, clicking on recommendations • Updates: adding and updating news articles • Simulation of the data stream using the Idomaar framework • Participants have to predict interactions with the data stream • Quality measured as the ratio of successful predictions to the total number of predictions (a sketch of this measure follows the slide list)
  20. Simulation process. (Diagram: Idomaar simulates the stream and issues article requests.)
  21. Idomaar stream simulation
  22. Task 1: Online Evaluation • Provide recommendations for visitors of the news portals of plista's customers • Ten portals (local news, sports, business, technology) • Communication via the Open Recommendation Platform (ORP) • Benchmark your own performance against other participants and baseline algorithms during three pre-defined evaluation windows • Best algorithms determined in a final evaluation period • Standard evaluation metrics (a sketch of an ORP-style endpoint follows the slide list)
  23. Real-Time Recommendation. T. Brodt and F. Hopfgartner, "Shedding Light on a Living Lab: The CLEF NewsREEL Open Recommendation Platform," in Proc. of IIiX 2014, Regensburg, Germany, pp. 223-226, 2014.
  24. Outline: Recommender Systems • Evaluation • NewsREEL scenario • NewsREEL 2016
  25. Participation
  26. Task 1 – First evaluation window
  27. Task 1 – Second evaluation window
  28. Task 1 – Third evaluation window
  29. NewsREEL session, tomorrow, 1:30pm – 3:30pm: • NewsREEL presentations – Online Algorithms and Data Analysis – Frameworks and Algorithms • Evaluation results and task winners • Joint session: LL4IR & NewsREEL: New Ideas
  30. More Information • http://orp.plista.com • http://www.clef-newsreel.org • http://www.crowdrec.eu • http://sigir.org/files/forum/2015D/p129.pdf • Thank you
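
The offline procedure on slides 8 and 9 can be made concrete with a small sketch. The snippet below splits a rating log at a time point t0, fits a deliberately simple item-mean predictor as a stand-in for the trained rating function f(u,i), and scores it with RMSE; the tuple layout of the ratings and the baseline predictor are illustrative assumptions, not part of the lab definition.

```python
import math

# Assumed layout: each rating is a (user, item, rating, timestamp) tuple.
def split_by_time(ratings, t0):
    """Slide 8: ratings up to t0 form the training set, the rest the test set."""
    train = [r for r in ratings if r[3] <= t0]
    test = [r for r in ratings if r[3] > t0]
    return train, test

def fit_item_means(train):
    """Stand-in for training f(u,i): predict each item's mean training rating."""
    sums, counts = {}, {}
    for _, item, rating, _ in train:
        sums[item] = sums.get(item, 0.0) + rating
        counts[item] = counts.get(item, 0) + 1
    global_mean = sum(rating for _, _, rating, _ in train) / len(train)
    item_means = {item: sums[item] / counts[item] for item in sums}
    return lambda user, item: item_means.get(item, global_mean)

def rmse(f, test):
    """Slide 9: RMSE(f) over all rating predictions on the test set."""
    errors = [(f(user, item) - rating) ** 2 for user, item, rating, _ in test]
    return math.sqrt(sum(errors) / len(errors))
```

A run would look like `train, test = split_by_time(ratings, t0)`, `f = fit_item_means(train)`, `print(rmse(f, test))`; any learned rating model can be dropped in place of the item-mean baseline.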
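
Slides 12 and 18 both rely on the click-through rate, i.e., clicks on recommendations divided by recommendations shown. A minimal version of the metric and of an A/B comparison is sketched below; the counts are invented purely for illustration.

```python
def click_through_rate(clicks, impressions):
    """CTR = clicks on recommendations / recommendations shown."""
    return clicks / impressions if impressions else 0.0

# Hypothetical counts for two recommender variants in an A/B test.
ctr_a = click_through_rate(clicks=420, impressions=50_000)
ctr_b = click_through_rate(clicks=515, impressions=50_000)
winner = "B" if ctr_b > ctr_a else "A"
print(f"CTR(A) = {ctr_a:.4f}, CTR(B) = {ctr_b:.4f} -> variant {winner} wins on CTR")
```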
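
For Task 2 (slide 19), quality is the ratio of successful predictions to the total number of predictions made while the recorded stream is replayed. The sketch below evaluates a recommender over a simplified event sequence; the event fields and the notion of success used here (a recommended article is the one the same user clicks next) are simplifying assumptions, not the exact Idomaar message format.

```python
def offline_prediction_quality(events, recommend):
    """events: time-ordered dicts with keys 'type' ('request' or 'click'),
    'user', and 'item'; recommend(user) returns a list of recommended item IDs."""
    hits, total = 0, 0
    last_recs = {}  # user -> items most recently recommended to that user
    for event in events:
        if event["type"] == "request":
            last_recs[event["user"]] = set(recommend(event["user"]))
            total += 1
        elif event["type"] == "click" and event["item"] in last_recs.get(event["user"], set()):
            hits += 1
    return hits / total if total else 0.0
```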
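
In Task 1 (slide 22), plista pushes messages to an HTTP endpoint operated by each participant via ORP. The sketch below shows the general shape of such an endpoint using Flask; the route, message types, and field names are placeholders rather than the actual ORP schema, and the most-recently-published strategy is only a trivial baseline.

```python
from collections import deque
from flask import Flask, jsonify, request

app = Flask(__name__)
recent_items = deque(maxlen=100)  # most recently published article IDs

@app.route("/orp", methods=["POST"])
def handle_orp_message():
    msg = request.get_json(force=True)
    if msg.get("type") == "item_update":            # a new or updated news article
        recent_items.appendleft(msg["item_id"])
        return jsonify({"status": "ok"})
    if msg.get("type") == "recommendation_request":
        limit = msg.get("limit", 6)
        # Trivial baseline: return the most recently published articles.
        return jsonify({"recs": list(recent_items)[:limit]})
    return jsonify({"status": "ignored"})           # impressions, clicks, errors, ...

if __name__ == "__main__":
    app.run(port=8080)
```

In the live setting the same process has to answer within ORP's response-time limit, which is why lightweight, incrementally updated models are common among participants.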
