
Popular != significant Pablo Musa


Published in: Technology

  1. @DataBeersMLG, 18-May-2017. Pablo Musa: Popular != Significant
  2. Pablo Musa
     • MSc. Computer Science
     • Backend Developer
     • Software Architect
     • Infra Lover
     • 2 years Hadoop DevOps
     • 3 years Elastic Enthusiast
  3. Motivation
     • Marketing, Recommendation, Analysis, ...
     • It is all about understanding the "audience": the group, the niche
     • Sometimes understanding the group can be hard
     • In which city should I focus my marketing campaign?
     • A user watched "Mar Adentro": what should we recommend next?
  4. Popularity
     • Are blue cars more common in London or Birmingham?
     • We will probably find more blue cars in London. However, that is not because Londoners like them more than others do, but because London has a huge population.
     • Are there more people accessing "elastic.co" in London or Birmingham?
  5. Popularity
     • A user watched "Mar Adentro": what movies should we recommend next?
     • We can look at the movies most commonly watched by users who also watched "Mar Adentro" and use them as recommendations. Will that be good?
  6. Be careful.
  7. Be careful. Make sure you do the correct analysis.
  8. Popular
     • Are blue cars more common in London or Birmingham?
     • We can look into the most common movies that...?
     • Common: occurring, found, or done often; prevalent
  9. Top Popular Cities per Color
  10. Significant
     • Are blue cars more significant in London or Birmingham?
     • We can look into the most significant movies that...?
     • Significant: sufficiently great or important to be worthy of attention; noteworthy.
  11. Top Significant Cities per Color
  12. Watched: 999 users watched. Recommended: ?
  13. Watched: 999 users watched. Recommended: 708 also watched; 701 also watched; 699 also watched.
  14. Watched: 999 users watched. Recommended: 371 also watched; 263 also watched; 354 also watched.
  15. Watched: 999 users watched. Recommended: 371 also watched (bg:2437, sc:7.45); 263 also watched (bg:1311, sc:7.04); 354 also watched (bg:2619, sc:6.28).
  16. Significant / Popular
  17. What about bands?
  18. What about bands?
  19. The Coldplay Effect
  20. The Coldplay Effect Fixed
  21. Another way to understand the Coldplay effect
     • q=random, s=10%
     • q=mecano, s=all
  22. Another way to understand the Coldplay effect
     • q=[mecano, amaral, pereza, coldplay], s=all
     • q=[mecano, amaral, pereza, coldplay], s=500
  23. Another way to understand the Coldplay effect
     • q=[mecano, amaral, pereza, coldplay], s=all: NOISE
     • q=[mecano, amaral, pereza, coldplay], s=500: SIGNAL
  24. References
     • https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations-bucket-terms-aggregation.html
     • https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations-bucket-significantterms-aggregation.html
     • https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations-bucket-sampler-aggregation.html
     • Datasets
     • https://grouplens.org/datasets/movielens/20M/
     • http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-360K.html
     • http://data.dft.gov.uk/
  25. Thanks!! Pablo Musa @pablitomusa
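
Slides 12 to 15 contrast raw "also watched" counts with a background count (bg) and a score (sc). The slides do not say how the score was computed. The sketch below shows one way such a score can be derived, using the JLH heuristic that the Elasticsearch significant_terms aggregation applies by default. The background total of 138,493 (the number of users in MovieLens 20M) is an assumption, and the function name, variable names, and movie labels are illustrative only.

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """JLH heuristic: absolute change times relative change in frequency."""
    fg_pct = fg_count / fg_total   # share inside the "also watched" group
    bg_pct = bg_count / bg_total   # share across the whole data set
    if bg_pct == 0 or fg_pct <= bg_pct:
        return 0.0                 # not more frequent than in the background
    return (fg_pct - bg_pct) * (fg_pct / bg_pct)


FG_TOTAL = 999        # users who watched the seed movie (from the slides)
BG_TOTAL = 138_493    # ASSUMPTION: total users; not stated in the slides

# (also-watched count, background count) taken from slide 15;
# the movie labels are placeholders, the slides show posters only.
candidates = {
    "movie_a": (371, 2437),
    "movie_b": (263, 1311),
    "movie_c": (354, 2619),
}

# Rank by raw co-watch count ("popular").
by_popularity = sorted(
    candidates, key=lambda m: candidates[m][0], reverse=True
)

# Rank by JLH score ("significant").
by_significance = sorted(
    candidates,
    key=lambda m: jlh_score(candidates[m][0], FG_TOTAL,
                            candidates[m][1], BG_TOTAL),
    reverse=True,
)
```

With these inputs the two rankings differ: the title with 263 co-watchers moves ahead of the one with 354, because its background count is much smaller.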
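
Slides 21 to 23 compare the same query with s=all and s=500. One plausible reading is that s is the shard_size of a sampler aggregation wrapped around significant_terms, both of which appear in the references. The sketch below builds such a request body as a plain Python dict. The field name `artist`, the aggregation names, the helper name, and the interpretation of s are all assumptions, not taken from the slides.

```python
def build_request(artists, shard_size=None, field="artist"):
    """Build a search body: significant terms for users matching `artists`.

    If shard_size is given, only the top-scoring `shard_size` documents
    per shard feed the significance analysis (sampler aggregation).
    """
    significant = {"significant_terms": {"field": field}}

    if shard_size is None:
        # "s=all": analyse every matching document.
        aggs = {"related": significant}
    else:
        # "s=500": analyse only the best-matching sample.
        aggs = {
            "sample": {
                "sampler": {"shard_size": shard_size},
                "aggs": {"related": significant},
            }
        }

    return {
        "size": 0,  # only aggregation results are needed
        "query": {"terms": {field: artists}},
        "aggs": aggs,
    }


query_artists = ["mecano", "amaral", "pereza", "coldplay"]
body_all = build_request(query_artists)
body_sampled = build_request(query_artists, shard_size=500)
```

The resulting dicts could be sent as the JSON body of a search request. Sampling keeps only the best-matching documents, which under this reading is what separates the SIGNAL panel from the NOISE panel on slide 23.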
