Introduction to Crab - Python Framework for Building Recommender Systems


Published on

Presentation about Crab - Framework in Python for Building Recommender Systems at Scipy Conference 2011 , Austin Tx, USA.

Published in: Technology
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Introduction to Crab - Python Framework for Building Recommender Systems

  1. 1. Crab A Python Framework for Building Recommendation Engines Scipy 2011, Austin TXMarcel Caraciolo Ricardo Caspirro Bruno Melo @marcelcaraciolo @ricardocaspirro @brunomelo 1
  2. 2. What is Crab ? A python framework for building recommendation enginesA Scikit module for collaborative, content and hybrid filtering Mahout Alternative for Python Developers :D Open-Source under the BSD license 2
  3. 3. When started ?It began one year agoCommunity-driven, 4 membersSince April,2011 the open-source labs Muriçoca incorporated itSince April,2011 rewritting it as Scikit 3
  4. 4. Knowing ScikitsScikits are Scipy Toolkits - independent and projects hosted under a common namespace. Scikits Image Scikits MlabWrap Scikits AudioLab Scikit Learn .... 4
  5. 5. Knowing Scikits Scikit-Learn Machine Learning Algorithms + scientific Python packages (Numpy, Scipy and Matplotlib) goal: Incorporate the Crab as Scikit and incorporate some parts of them at Scikit-learn 5
  6. 6. Why Recommendations ?The world is an over-crowded place !"#$%&()$*+$,-$&.#/0&%)#)$1(,0# 6
  7. 7. Why Recommendations * +,&-.$/).#&0#/"1.#$%234(".# ? $/)#5(&6 7&.2.#"$4,#)$8 We are overloaded * 93((3&/.#&0#:&3".;#5&&<.# $/)#:-.34#2%$4<.#&/(3/"Thousands of news articles and blog posts each day * =/#>$/&3;#?#@A#+B#4,$//"(.;# 2,&-.$/).#&0#7%&6%$:.# Millions of movies, books and music tracks online "$4,#)$8 Several Places, Offers and Events * =/#C"1#D&%<;#.""%$(# Even Friends sometimes we are overloaded ! 2,&-.$/).#&0#$)#:"..$6".# ."/2#2&#-.#7"%#)$8 7
  8. 8. Why Recommendations ?We really need and consume only a few of them! “A lot of times, people don’t know what they want until you show it to them.” Steve Jobs “We are leaving the Information age, and entering into the Recommendation age.” Chris Anderson, from book Long Tail 8
  9. 9. Why Recommendations ?Can Google help ? Yes, but only when we really know what we are looking for But, what’s does it mean by “interesting” ?Can Facebook help ? Yes, I tend to find my friends’ stuffs interesting What if i had only few friends and what they like do not always attract me ?Can experts help ? Yes, but it won’t scale well. But it is what they like, not me! Exactly same advice! 9
  10. 10. Why Recommendations ? Recommendation SystemsSystems designed to recommend to me something I may like 10
  11. 11. Why Recommendations ? !"#$%&"$"()*#*+,) Recommendation Systems -+*#)+. -#/) 0#)1# !2 23&4"+)1 5,6 7),*%"&863 Graph Representation 11
  12. 12. The current CrabCollaborative Filtering algorithms User-Based, Item-Based and Slope OneEvaluation of the Recommender Algorithms Precision, Recall, F1-Score, RMSE Precision-Recall Charts 12
  13. 13. The current Crab Precision-Recall Charts 13
  14. 14. The current Crab 14
  15. 15. The current CrabUsing REST APIs to deploy the recommender django-piston, django-rest, django-tastypie 15
  16. 16. Crab is already in production Brazilian Social Network called Educational network with more than 60.000 students and 3000 video-classes Running on Python + Numpy + Scipy and DjangoBackend for RecommendationsMongoDB - mongoengine Daily Recommendations with Explanations 16
  17. 17. Evaluating your recommender Crab implements the most used recommender metrics. Precision, Recall, F1-Score, RMSE Using matplotlib for a plotter utility Implement new metricsSimulations support maybe (??) 17
  18. 18. Evaluating your recommenderAll you have to do is implement your Evaluator 18
  19. 19. Distributing the recommendation computationsUse Hadoop and Map-Reduce intensively Investigating the Yelp mrjob framework the Netflix and novel standard-of-the-art used Matrix Factorization, Singular Value Decomposition (SVD), Boltzman machinesThe most commonly used is Slope One technique. 19
  20. 20. Why migrate ?Old Crab running only using Pure Python Recommendations demand heavy maths calculations and lots of processingCompatible with Numpy and Scipy libraries High Standard and popular scientific libraries optimized for scientific calculations in PythonScikits projects are amazing! Active Communities, Scientific Conferences and updated projects (e.g. scikit-learn)Turn the Crab framework visible for the community Join the scientific researchers and machine learning developers around the Globe coding with Python to help us in this project Be Fast and Furious 20
  21. 21. Why migrate ?Numpy optimized with PyPy 2.x - 48.x Faster 21
  22. 22. How are we working ? Sprints, Online Discussions and Issues 22
  23. 23. Future Releases Planned Release 0.1 Collaborative Filtering Algorithms working, sample datasets to load and test Planned Release 0.11 Evaluation of Recommendation Algorithms and Database Models support Planned Release 0.12 Recommendation as Services with REST APIs.... 23
  24. 24. Join us!1. Read our Wiki Page Check out our current sprints and open issues Forks, Pull Requests mandatory4. Join us at #muricoca or at our discussion list 24
  25. 25. Crab A Python Framework for Building Recommendation Engines Caraciolo Ricardo Caspirro Bruno Melo @marcelcaraciolo @ricardocaspirro @brunomelo {marcel, ricardo,bruno} 25