Personalizationwith Big Data21/5/13.2013Yossi CohenCTO 3Base(Taldor Group)
Who we are & what we do Projects in cloud environment SAAS, Multi tenant, lots of users >20 Big Data projects in variou...
You must have seen this…3
4
5
6
7
Some examples8
The biggest one9
Another one10
Lessons Users have different tastes Users don’t have patience We need to show them things they like… Approaches Model...
Collaborative Filtering is … Given a user’spreferences for items,guess which otheritems would be highlypreferred Only ne...
Collaborative Filtering is …Sean likes “Scarface” a lotRobin likes “Scarface” somewhatGrant likes “The Notebook” not at al...
Recommending people food14
Item-Based Algorithm Recommend items similar to a user’shighly-preferred items15
Item-Based Algorithm Have user’s preference for items Know all items and can compute weightedaverage to estimate user’s ...
Item-Item Similarity Could be based on content… Two foods similar if both sweet, both cold BUT: In collaborative filter...
As matrix math User’s preferences are a vector Each dimension corresponds to one item Dimension value is the preference...
As matrix math16 9 16 5 69 30 19 3 216 19 23 5 45 3 5 10 206 2 4 20 916 animals ate bothhot dogs and icecream10 animals at...
Hadoop way20
Apache Mahout is … Machine learning … Collaborative filtering(recommenders) Clustering Classification Frequent item s...
Thank you For more information:Yossi@3base.co.ilwww.taldor.co.il
Upcoming SlideShare
Loading in...5
×

Yossi cohen 3 base

244

Published on

Published in: Technology, Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
244
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Yossi cohen 3 base

  1. 1. Personalizationwith Big Data21/5/13.2013Yossi CohenCTO 3Base(Taldor Group)
  2. 2. Who we are & what we do Projects in cloud environment SAAS, Multi tenant, lots of users >20 Big Data projects in various technologies(NoSQL, Hadoop etc.) End to end solution Architecture & design Implementation Maintenance2
  3. 3. You must have seen this…3
  4. 4. 4
  5. 5. 5
  6. 6. 6
  7. 7. 7
  8. 8. Some examples8
  9. 9. The biggest one9
  10. 10. Another one10
  11. 11. Lessons Users have different tastes Users don’t have patience We need to show them things they like… Approaches Modeling Content Based Collaborative Filtering11
  12. 12. Collaborative Filtering is … Given a user’spreferences for items,guess which otheritems would be highlypreferred Only needspreferences;users and itemsopaque Many algorithms!12
  13. 13. Collaborative Filtering is …Sean likes “Scarface” a lotRobin likes “Scarface” somewhatGrant likes “The Notebook” not at all…(123,654,5.0)(789,654,3.0)(345,876,1.0)…(345,654,4.5)…(Magic)Grant may like “Scarface” quite a bit…13
  14. 14. Recommending people food14
  15. 15. Item-Based Algorithm Recommend items similar to a user’shighly-preferred items15
  16. 16. Item-Based Algorithm Have user’s preference for items Know all items and can compute weightedaverage to estimate user’s preference What is the item – item similarity notion?for every item i that u has no preference for yetfor every item j that u has a preference forcompute a similarity s between i and jadd us preference for j, weighted by s,to a running averagereturn top items, ranked by weighted average16
  17. 17. Item-Item Similarity Could be based on content… Two foods similar if both sweet, both cold BUT: In collaborative filtering, based only onpreferences (numbers) Pearson correlation between ratings ? Log-likelihood ratio ? Simple co-occurrence:Items similar when appearing often in the same user’s setof preferences17
  18. 18. As matrix math User’s preferences are a vector Each dimension corresponds to one item Dimension value is the preference value Item-item co-occurrences are a matrix Row i / column j is count of item i / j co-occurrence Estimating preferences:co-occurrence matrix × preference (column) vector18
  19. 19. As matrix math16 9 16 5 69 30 19 3 216 19 23 5 45 3 5 10 206 2 4 20 916 animals ate bothhot dogs and icecream10 animals ateblueberries05520135251220607019
  20. 20. Hadoop way20
  21. 21. Apache Mahout is … Machine learning … Collaborative filtering(recommenders) Clustering Classification Frequent item set mining and more … at scale Much implemented on Hadoop Efficient data structures21
  22. 22. Thank you For more information:Yossi@3base.co.ilwww.taldor.co.il

×