When Recommenders Met Big Data: an Architectural Proposal and Evaluation [CERI '14 Slides]

Slides of the CERI 2014 paper:

Daniel Valcarce, Javier Parapar, Álvaro Barreiro. When Recommenders Met Big Data: an Architectural Proposal and Evaluation. Proceedings of the 3rd Spanish Conference on Information Retrieval, CERI 2014, pp. 73-84, A Coruña, Spain, 19 - 20 June, 2014. ISBN 978-84-9749-591-2.

http://www.dc.fi.udc.es/~dvalcarce/pubs/valcarce-etal-ceri2014.pdf

  1. When Recommenders Met Big Data: An Architectural Proposal and Evaluation. Daniel Valcarce, Javier Parapar, Álvaro Barreiro. CERI 2014, 3rd Spanish Conference on Information Retrieval, A Coruña, June 2014.
  2. Table of Contents. Introduction (Motivation; Recommender Systems). Recommender System Architecture (Overview; Front-end; Recommendation engine; Storage). Experiments and results (Rating Insertion; Recommendation Generation; Recommendation Serving). Conclusions and Future Work.
  3. Motivation. According to Shareaholic, in 2013 web traffic generated by search engines dropped 6%, while traffic from social networks increased by more than 100%. Users used to query for what they wanted; now they want personalised recommendations.
  4. Recommender Systems. Objective: predict user preferences over items. Approaches: content-based (uses properties of the items); collaborative filtering (based on similar users); hybrid (a combination of both).
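As a rough illustration of the "based on similar users" idea, the following Python sketch predicts a missing rating as a similarity-weighted average of other users' ratings. The toy data, the names `ratings`, `cosine` and `predict`, and the choice of cosine similarity are all assumptions made for this example; the slides do not specify a similarity measure here.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the slides.
ratings = {
    "ana":   {"film_a": 5, "film_b": 3, "film_c": 4},
    "brais": {"film_a": 4, "film_b": 3, "film_c": 5, "film_d": 4},
    "carme": {"film_a": 1, "film_b": 5, "film_d": 2},
}

def cosine(u, v):
    """Cosine similarity between two users over the items both have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of the ratings other users gave to `item`.

    Returns None when no other user has rated the item.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None

prediction = predict("ana", "film_d")
```

In this toy data the prediction for "ana" leans towards the rating given by "brais", whose tastes are the closest to hers.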
  7. Our work. A recommender architecture proposal for Big Data. Specific technologies are detailed for each component. An efficiency study of MySQL Cluster and Cassandra as alternatives for storing ratings and recommendations in the proposed architecture.
  8. Generic Recommender System Architecture: front-end, storage, recommendation engine.
  9. Our goals. Scalability: more machines → more computational power; Big Data capable. High availability: fault tolerance; no single point of failure.
  11. Our proposal: Front-end. Use cases: search items, emit ratings, get recommendations. Proposed architecture: distributed web application (Django); redundant load balancers (Perlbal); two levels of cache, namely a reverse proxy cache (Varnish) and a distributed memory cache (Memcached).
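The two cache levels can be pictured as two lookups that are tried before the database is touched. The sketch below is only a simulation: plain dictionaries stand in for Varnish, Memcached and the database, and every name (`page_cache`, `object_cache`, `database`, `handle_request`, the `user:42` key) is hypothetical. It does not use any real Varnish or Memcached API.

```python
# Plain dicts stand in for the two cache tiers and the database;
# a real deployment would sit behind Varnish and use a Memcached client.
page_cache = {}      # stand-in for the reverse proxy cache (whole responses)
object_cache = {}    # stand-in for the distributed memory cache (query results)
database = {"user:42": ["film_a", "film_b"]}  # hypothetical stored recommendations
stats = {"db_reads": 0}

def load_recommendations(user_key):
    """Second cache level: return cached query results, else read the database."""
    if user_key in object_cache:
        return object_cache[user_key]
    stats["db_reads"] += 1
    value = database[user_key]
    object_cache[user_key] = value
    return value

def handle_request(url, user_key):
    """First cache level: return the cached response for `url` if present."""
    if url in page_cache:
        return page_cache[url]
    body = "Recommended: " + ", ".join(load_recommendations(user_key))
    page_cache[url] = body
    return body

first = handle_request("/recs/42", "user:42")    # misses both caches, reads the database
second = handle_request("/recs/42", "user:42")   # served from the page cache
page_cache.clear()                               # simulate expiry in the first level
third = handle_request("/recs/42", "user:42")    # served from the object cache
```

After these three requests the database has been read once: the second request is absorbed by the first level and the third by the second level.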
  13. Our proposal: Recommendation Engine. Recommendations are precalculated and stored; a batch process refreshes the suggestions regularly. Use of the MapReduce distributed model, the state-of-the-art paradigm for large-scale data processing. Hadoop: open-source MapReduce implementation. Mahout: scalable machine learning library.
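To make the MapReduce model concrete, here is a single-process Python sketch of a map, shuffle and reduce pass that counts how often two items are rated by the same user, a quantity that item-based recommenders commonly build on. It is not Hadoop or Mahout code; the input records and the function names `map_phase`, `reduce_phase` and `run` are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input records: (user, item) pairs.
records = [
    ("u1", "a"), ("u1", "b"), ("u1", "c"),
    ("u2", "a"), ("u2", "b"),
    ("u3", "b"), ("u3", "c"),
]

def map_phase(user, items):
    """Emit ((item_x, item_y), 1) for every pair of items one user rated."""
    for pair in combinations(sorted(items), 2):
        yield pair, 1

def reduce_phase(key, values):
    """Sum the counts emitted for one item pair."""
    return key, sum(values)

def run(records):
    # Group items by user (in a real pipeline this would be an earlier job).
    by_user = defaultdict(set)
    for user, item in records:
        by_user[user].add(item)
    # Shuffle: collect all mapped values under their key.
    grouped = defaultdict(list)
    for user, items in by_user.items():
        for key, value in map_phase(user, items):
            grouped[key].append(value)
    # Reduce: one call per key.
    return dict(reduce_phase(key, values) for key, values in grouped.items())

cooccurrence = run(records)
```

Because each map call only sees one user and each reduce call only sees one key, both phases could be spread over many machines, which is the property the slide relies on.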
  15. Our proposal: Storage Component I. Information to be stored: common web application data (e.g., user profiles); a large amount of ratings and recommendations; data about items. Requirements: read scalability and fault tolerance (replication); write scalability (sharding); linear scalability with the number of nodes.
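Sharding and replication can be combined by hashing each key to a primary node and copying it to the following nodes. The Python sketch below uses simple modulo placement with 4 nodes and a replication factor of 2, the values the experiment slide reports. The node names and the function `replicas_for` are hypothetical, and the placement rule is a simplification chosen for this example, not the actual scheme of Cassandra or MySQL Cluster.

```python
import hashlib

NODES = ["node0", "node1", "node2", "node3"]  # 4 machines, as in the experiment slide
REPLICATION_FACTOR = 2

def replicas_for(key, nodes=NODES, rf=REPLICATION_FACTOR):
    """Return the nodes holding `key`: a hashed primary plus rf - 1 successors."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]

def readable_after_failure(key, failed_node):
    """True if at least one replica of `key` survives the loss of `failed_node`."""
    return any(node != failed_node for node in replicas_for(key))

placement = replicas_for("user:42:film_a")
```

Writes are spread because different keys hash to different primaries (sharding), and any single node can fail without losing a key because every key lives on two distinct nodes (replication).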
  16. Our proposal: Storage Component II. Proposed technologies: relational database (MySQL Cluster); NoSQL column store (Cassandra); inverted indexes (Solr).
  19. Experiment: storing ratings and recommendations. Candidates: MySQL Cluster and Cassandra. Dataset: Netflix Prize (100M ratings, 480k users, 17.7k films). Cluster configuration: 4 machines, replication factor 2.
  20. Rating Insertion. Figure: average insertion rate obtained by inserting from 10 to 100 million ratings using 8 concurrent requests. [Plot: x-axis, number of ratings (1e+07 to 1e+08); y-axis, milliseconds per insertion (0.00 to 0.30); series: MySQL Cluster 8, Cassandra 8.]
  21. Rating Insertion. Figure: average insertion rate obtained by inserting from 10 to 100 million ratings using 8, 16, 32 and 64 concurrent requests. [Plot: x-axis, number of ratings (1e+07 to 1e+08); y-axis, milliseconds per insertion (0.00 to 0.30); series: MySQL Cluster 8, 16, 32 and 64; Cassandra 8, 16, 32 and 64.]
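A measurement of this kind can be sketched as a timing harness that pushes inserts through a pool of concurrent workers and divides the wall-clock time by the number of inserts. The Python code below is an assumption about how such a harness might look, not the authors' benchmark: `fake_insert` is a hypothetical stand-in that only sleeps, and no database client is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_insert(rating):
    """Hypothetical stand-in for a database insert; it only simulates latency."""
    time.sleep(0.0001)

def average_ms_per_insertion(ratings, workers, insert=fake_insert):
    """Run `insert` over all ratings with `workers` threads; return ms per insert."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() consumes the iterator so every insert has finished
        # before the clock is stopped.
        list(pool.map(insert, ratings))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms / len(ratings)

# Hypothetical (user, item, rating) triples.
sample = [(user, item, 4) for user in range(20) for item in range(10)]
results = {workers: average_ms_per_insertion(sample, workers)
           for workers in (8, 16, 32, 64)}
```

Swapping `fake_insert` for a function that writes to a real store would yield one point per concurrency level, comparable in shape to the series listed in the figure.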
  23. Recommendation Generation. Table: times for Mahout's item-based collaborative filtering algorithm reading from and writing to the database directly. Cassandra: 68.85 min in total, 8.6 ms per recommendation. MySQL Cluster: crashed (no times obtained).
  24. Recommendation Generation. Table: times for Mahout's item-based collaborative filtering algorithm. Cassandra: 68.85 min in total, 8.6 ms per recommendation. MySQL Cluster*: 274.73 min in total, 34.3 ms per recommendation. *Using Sqoop, a tool for transferring bulk data between the Hadoop Distributed File System and relational databases.
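The two columns of this table appear to be related by a simple division. Assuming that "per recommendation" means per user, and taking the roughly 480k users of the Netflix Prize dataset quoted in the slides, the per-recommendation figures can be reproduced from the total times. This reading is an assumption, not something the slides state; the Python check below only shows that the numbers are consistent with it.

```python
USERS = 480_000  # approximate user count quoted in the slides (assumed divisor)

def ms_per_recommendation(total_minutes, users=USERS):
    """Convert a total run time in minutes to milliseconds per user."""
    return total_minutes * 60_000 / users

cassandra_ms = ms_per_recommendation(68.85)    # about 8.6 ms
mysql_ms = ms_per_recommendation(274.73)       # about 34.3 ms
slowdown = 274.73 / 68.85                      # MySQL Cluster with Sqoop vs Cassandra
```

Under that assumption the MySQL Cluster run with Sqoop took roughly four times as long as the Cassandra run.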
  25. Recommendation Serving. Figure: average serving rate obtained by querying the top 10 recommended items for 25 million users using 8, 16, 32 and 64 concurrent requests. [Plot: x-axis, number of threads (8, 16, 32, 64); y-axis, milliseconds per serving (0.00 to 0.30); series: MySQL Cluster, Cassandra.]
  27. Conclusions and Future Work. We have proposed a highly scalable and fault-tolerant platform for recommender systems. We have benchmarked Cassandra and MySQL Cluster in the context of recommender systems. Future work: study and benchmark more parts of the proposed platform; develop more effective recommender algorithms on the platform.
