
Music Personalization At Spotify


Here are the slides from my talk at RecSys 2016.

Published in: Technology

  1. 1. Music Personalization @ Spotify Vidhya Murali @vid052 RecSys 2016
  2. 2. Spotify’s Big Data ‣ Started in 2006, now available in 58 countries ‣ 100+ million active users, 35+ million paid subscribers ‣ 30+ million songs in our catalog, ~20K added every day ‣ 2+ billion playlists ‣ 1 TB of log data every day ‣ Hadoop cluster with ~2500 nodes
  3. 3. 30 Million Tracks…
  4. 4. What to recommend?
  5. 5. What to recommend?
  6. 6. Personalization @ Spotify Features: Discover, Discover Weekly, Fresh Finds, Home, Radio, Release Radar
  7. 7. Approaches ‣ Manual Curation by Experts ‣ Metadata (e.g., Label Provided Data, News, Blogs) ‣ Audio Signals ‣ Collaborative Filtering ‣ Hybrid
  8. 8. Latent Factor Models: a “compact” representation for each user and item (song) as an f-dimensional vector
  9. 9. Latent Factor Models: a “compact” representation for each user and item (song) as an f-dimensional vector. [Figure: matrix of m users by songs; labels: Vidhya, Rise]
  10. 10. Latent Factor Models: a “compact” representation for each user and item (song) as an f-dimensional vector. [Figure: matrix of m users by songs; labels: Vidhya, Rise] User Vector Matrix X: (m x f)
  11. 11. Latent Factor Models: a “compact” representation for each user and item (song) as an f-dimensional vector. [Figure: matrix of m users by songs; labels: Vidhya, Rise] User Vector Matrix X: (m x f); Song Vector Matrix Y: (n x f)
  12. 12. Latent Factor Models: a “compact” representation for each user and item (song) as an f-dimensional vector (here, f = 2). [Figure: matrix of m users by songs; labels: Vidhya, Rise] User Vector Matrix X: (m x f); Song Vector Matrix Y: (n x f)
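
As a rough illustration of the latent factor slides, the sketch below builds a user matrix X (m x f) and a song matrix Y (n x f) with f = 2 and scores every user-song pair as a dot product. All sizes and values are made up for illustration, and the talk does not say how the factors are trained, so this only shows the shapes involved.

```python
import numpy as np

# Hypothetical sizes: m users, n songs, f latent factors (f = 2 as on the slide).
m, n, f = 3, 4, 2

# User Vector Matrix X: (m x f). Values are invented for illustration.
X = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.4, 0.6]])

# Song Vector Matrix Y: (n x f). Values are invented for illustration.
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.7, 0.3],
              [0.3, 0.7]])

# Predicted affinity of every user for every song: an (m x n) matrix.
scores = X @ Y.T

# Index of the highest-scoring song for each user.
best_song_per_user = scores.argmax(axis=1)
```

Each row of `scores` can be sorted to produce a per-user ranking; in practice X and Y would be learned from listening data rather than written by hand.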
  13. 13. NLP Models on News and Blogs
  14. 14. NLP Models work great on Playlists!
  15. 15. NLP Models work great on Playlists! ‣ Document: Playlist
  16. 16. NLP Models work great on Playlists! ‣ Document: Playlist ‣ Word: Song
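
The playlist-as-document, song-as-word analogy can be sketched with a small latent semantic analysis example. The talk does not name a specific NLP model, so the choice of a truncated SVD over a song-by-playlist matrix is an assumption, and the playlists and song names are hypothetical.

```python
import numpy as np

# Hypothetical playlists ("documents") made of song ids ("words").
playlists = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["d", "e"], ["e", "f"]]

songs = sorted({song for playlist in playlists for song in playlist})
index = {song: i for i, song in enumerate(songs)}

# Song-by-playlist occurrence matrix (the analogue of a term-document matrix).
M = np.zeros((len(songs), len(playlists)))
for j, playlist in enumerate(playlists):
    for song in playlist:
        M[index[song], j] = 1.0

# Truncated SVD: keep the k strongest latent dimensions as song vectors.
U, S, _ = np.linalg.svd(M, full_matrices=False)
k = 2
song_vectors = U[:, :k] * S[:k]

def cosine(song_1, song_2):
    """Cosine similarity between the latent vectors of two songs."""
    v1, v2 = song_vectors[index[song_1]], song_vectors[index[song_2]]
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
```

Songs that share playlists (such as "a" and "b") end up close together, while songs that never co-occur (such as "a" and "d") end up nearly orthogonal.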
  17. 17. Deep Learning on Audio [1] http://benanne.github.io/2014/08/05/spotify-cnns.html
  18. 18. BlackBoxing Algorithms
  19. 19. Music in Latent Space
  20. 20. Vectors: a “COMPACT” representation of the musical fingerprint of users and items. [Figure: normalized song vectors]
  21. 21. Vectors: a “COMPACT” representation of the musical fingerprint of users and items. [Figure: normalized song vectors and a user vector]
  22. 22. Why Vectors? ‣ Encode higher-order dependencies ‣ Users and items in the same latent space: user-item recommendations, item-item similarities ‣ Easy to scale up: complexity is linear in the number of latent factors
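
A minimal sketch of item-item similarity in a latent space, assuming cosine similarity over normalized song vectors (the slides mention normalized vectors but do not name the similarity measure). The vectors are invented; each pairwise comparison is one dot product, so its cost grows linearly with the number of latent factors.

```python
import numpy as np

# Hypothetical song vectors with f = 2 latent factors.
song_vectors = np.array([[3.0, 4.0],
                         [6.0, 8.0],
                         [4.0, -3.0]])

# Scale every song vector to unit length.
norms = np.linalg.norm(song_vectors, axis=1, keepdims=True)
normalized = song_vectors / norms

# For unit vectors the dot product equals cosine similarity.
# Entry [i, j] is the similarity between song i and song j.
similarity = normalized @ normalized.T
```

Songs 0 and 1 point in the same direction and get a similarity of 1, while song 2 is orthogonal to both and gets a similarity of 0.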
  23. 23. Recommendations [Figure: normalized song vectors and a user vector]
  24. 24. Recommendations [Figure: normalized song vectors and a user vector]
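
One way to read the recommendation slides is as a nearest-neighbor lookup: score each normalized song vector against the user vector and keep the top k. The function name `recommend` and all values below are hypothetical, and a production system would likely use an approximate nearest-neighbor index rather than this brute-force scan.

```python
import numpy as np

def recommend(user_vector, song_vectors, k):
    """Return the indices and scores of the k best-scoring songs for one user."""
    norms = np.linalg.norm(song_vectors, axis=1, keepdims=True)
    normalized = song_vectors / norms
    scores = normalized @ user_vector   # one score per song
    top = np.argsort(-scores)[:k]       # highest scores first
    return top.tolist(), scores[top].tolist()

# Hypothetical data: four songs and one user in a 2-dimensional latent space.
song_vectors = np.array([[1.0, 0.0],
                         [0.0, 2.0],
                         [3.0, 3.0],
                         [-1.0, 0.0]])
user_vector = np.array([2.0, 1.0])

top_ids, top_scores = recommend(user_vector, song_vectors, k=2)
```

Here `top_ids` is `[2, 0]`: song 2 is best aligned with the user vector, followed by song 0.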
  25. 25. Ranking ‣ Similarity score can be used for ranking
  26. 26. Ranking ‣ Similarity score can be used for ranking ‣ Balance relevance, diversity, popularity, freshness
  27. 27. Ranking ‣ Similarity score can be used for ranking ‣ Balance relevance, diversity, popularity, freshness ‣ Heuristic based
  28. 28. Ranking ‣ Similarity score can be used for ranking ‣ Balance relevance, diversity, popularity, freshness ‣ Heuristic based ‣ Multi-armed bandit (MAB) ‣ Interactions: impressions, clicks, streams
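
The slide names multi-armed bandits and interaction signals but gives no algorithm. The sketch below assumes Thompson sampling with a Beta posterior over click-through rate, fed by impressions and clicks; the function name `thompson_pick` and the counts are hypothetical.

```python
import random

def thompson_pick(stats, rng):
    """Pick one item by sampling a click-through rate for each candidate.

    stats maps item -> (impressions, clicks). Each item gets a draw from
    Beta(1 + clicks, 1 + impressions - clicks); the largest draw wins.
    """
    best_item, best_draw = None, -1.0
    for item, (impressions, clicks) in stats.items():
        draw = rng.betavariate(1 + clicks, 1 + impressions - clicks)
        if draw > best_draw:
            best_item, best_draw = item, draw
    return best_item

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical interaction counts: (impressions, clicks).
stats = {
    "song_a": (1000, 300),  # well explored, high click rate
    "song_b": (1000, 50),   # well explored, low click rate
    "song_c": (5, 1),       # barely explored, so the estimate is uncertain
}

picks = [thompson_pick(stats, rng) for _ in range(1000)]
```

Well-explored items with a low click rate are rarely picked, while a barely explored item such as `song_c` still gets shown often enough to learn more about it, which is one way to trade relevance against freshness.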
  29. 29. Music Personalization Data Flow
  31. 31. Challenges Unique to Spotify ‣ Scale of catalog ‣ Music is “niche” ‣ Music consumption correlates heavily with users’ context ‣ Repeated consumption of music is NOT uncommon
  32. 32. Challenge Accepted! ‣ Cold start problem for both users and new music/upcoming artists: Content-Based Signals, Real-Time Recommendations ‣ Measuring Quality: Implicit: A/B Test Metrics; Explicit: Feedback from social forums ‣ Scam Attacks: Rule-based model to detect scammers ‣ Human choices are not always predictable: Faith in humanity
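
For the implicit quality signal (A/B test metrics), one common check is whether a conversion-style metric differs between control and treatment. The sketch below assumes a two-proportion z-test; the talk does not say which metrics or tests are used, and the counts are invented.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: users who streamed a recommended track.
z, p_value = two_proportion_z(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
```

A small `p_value` suggests the difference between the two groups is unlikely to be noise, though a real experiment would also need to account for multiple metrics and repeated looks at the data.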
  33. 33. What Next? ‣ Personalization! ‣ Content signals such as lyrics, audio, images ‣ Expanded Catalog: Shows, Podcasts ‣ New Markets
  34. 34. We are hiring!
  35. 35. Thank You! You can reach me at: Email: vidhya@spotify.com, Twitter: @vid052
