Personalized Playlists @ Spotify
Rohan Agrawal
RE-WORK Machine Intelligence Summit • New York • Nov 2, 2016
Spotify in Numbers
• Started in 2006, now available in 59 markets
• 100+ Million active users
• 30 Million + tracks
• 20,000 new songs added per day
• 2+ Billion user generated playlists
What to recommend?
Personalization @ Spotify
‣ Features:
• Discover Weekly
• Release Radar
• Discover Page
• Playlist Recommendations
• Radio
• Concerts
• Recommendations …
Focus on track recommendations
‣ Discover Weekly
‣ Release Radar
Our ML Models seem to be working!
Today, we’ll talk about 3 types of models
‣ Latent Factor Models
‣ Deep Learning Audio models
‣ NLP models (which are also latent factor models …)
Let's start off with Latent Factor Models
A "compact" representation for each user and item (song): f-dimensional vectors
[Figure: the user–song matrix (rows: m users, e.g. "Rohan"; columns: n songs, e.g. "Track a") factorized into latent vectors]
User Vector Matrix X: (m × f) · Song Vector Matrix Y: (n × f)
If we were to visualize a few Artist Latent Factors
Implicit Feedback (Hu et al. 2008)
‣ If a user u listens to an item i, the dot product of the user vector and the item vector should be as close to 1 as possible.
‣ Also takes into account the confidence that user u likes item i.
‣ Solve with Alternating Gradient Descent or Alternating Least Squares.
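The alternating-least-squares route can be sketched in a few lines. This is a toy illustration of the Hu et al. formulation (confidence c_ui = 1 + α·r_ui, binary preference p_ui), not Spotify's production code — the play-count matrix, α, λ, and dimensions are all invented:

```python
import numpy as np

# Toy implicit-feedback play-count matrix: 4 users x 5 tracks (invented data).
R = np.array([
    [5, 0, 0, 1, 0],
    [0, 3, 0, 0, 2],
    [4, 0, 0, 2, 0],
    [0, 1, 5, 0, 0],
], dtype=float)

m, n = R.shape
f = 2                              # latent dimensions
alpha, lam = 40.0, 0.1             # confidence scaling, regularization
P = (R > 0).astype(float)          # binary preference p_ui
C = 1.0 + alpha * R                # confidence c_ui

rng = np.random.default_rng(0)
X = rng.normal(scale=0.1, size=(m, f))   # user vectors
Y = rng.normal(scale=0.1, size=(n, f))   # song vectors

def solve_side(fixed, Cmat, Pmat, lam):
    # For each row u, solve (F^T C_u F + lam*I) v_u = F^T C_u p_u exactly.
    out = np.zeros((Cmat.shape[0], fixed.shape[1]))
    I = lam * np.eye(fixed.shape[1])
    for u in range(Cmat.shape[0]):
        Cu = np.diag(Cmat[u])
        out[u] = np.linalg.solve(fixed.T @ Cu @ fixed + I,
                                 fixed.T @ Cu @ Pmat[u])
    return out

for _ in range(15):                # alternate: fix songs, solve users; then swap
    X = solve_side(Y, C, P, lam)
    Y = solve_side(X, C.T, P.T, lam)

pred = X @ Y.T                     # dot products; observed pairs drift toward 1
```

Observed (user, track) pairs end up with dot products near 1, unobserved pairs near 0, weighted by confidence.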
Logistic Matrix Factorization (Johnson 2014)
‣ Model the probability of a user clicking on an item with the logistic function.
‣ Maximize the likelihood of the observations R, given …
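Spelled out (a sketch of Johnson's formulation; x_u and y_i are the latent vectors, β_u and β_i are user and item biases):

```latex
p(l_{ui} \mid x_u, y_i, \beta_u, \beta_i)
  = \frac{\exp(x_u y_i^{\top} + \beta_u + \beta_i)}
         {1 + \exp(x_u y_i^{\top} + \beta_u + \beta_i)}
```

The likelihood of the observations R is then maximized with respect to X, Y, and the biases, with each term weighted by the confidence in that observation.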
Recent Advances in MF
‣ Different loss functions (rank loss)
‣ Use of side information (demographics, metadata)
‣ Use of context (where, when)
‣ Deep Learning CF models
Deep Learning on Audio
http://benanne.github.io/2014/08/05/spotify-cnns.html
Document : User Session
Word : Song
NLP Models For Recommendations
Word2Vec (Mikolov et al. 2013)
‣ Each word / track has an input and an output vector representation.
‣ The output is a vector space with similar items living close to each other in cosine distance (and awesome vector-algebra properties).
[Figure: skip-gram architecture with a softmax output layer]
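The document-is-a-session, word-is-a-song mapping boils down to generating skip-gram training pairs from listening sessions. A minimal sketch (sessions and track IDs are invented for illustration):

```python
# Treat each listening session as a "document" and each track ID as a
# "word": skip-gram training pairs are (center track, nearby track).
def skipgram_pairs(sessions, window=2):
    pairs = []
    for session in sessions:
        for i, center in enumerate(session):
            lo, hi = max(0, i - window), min(len(session), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, session[j]))
    return pairs

sessions = [
    ["track_a", "track_b", "track_c"],
    ["track_b", "track_c", "track_d"],
]
pairs = skipgram_pairs(sessions, window=1)
```

These pairs are what a word2vec-style model (e.g. skip-gram with a softmax or negative-sampling output) would train on; tracks that co-occur in sessions end up close in the learned vector space.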
Sequential Data? RNNs?
‣ The output layer is the same as in word2vec: a softmax. Make a prediction of the next item based on the hidden state.
‣ Recurrent connection.
‣ Learn output weights and biases (b's) for each item.
https://erikbern.com/2014/06/28/recurrent-neural-networks-for-collaborative-filtering/
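The forward pass can be sketched as follows (random weights and invented dimensions — nothing here is taken from the linked post): the recurrent connection updates the hidden state at each track, and a softmax over all items predicts the next one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, d_emb, d_hid = 6, 4, 8                    # toy sizes

E = rng.normal(scale=0.1, size=(n_items, d_emb))   # item input embeddings
U = rng.normal(scale=0.1, size=(d_hid, d_emb))     # input -> hidden
W = rng.normal(scale=0.1, size=(d_hid, d_hid))     # recurrent connection
V = rng.normal(scale=0.1, size=(n_items, d_hid))   # hidden -> output weights
b = np.zeros(n_items)                              # per-item output bias

def next_item_probs(sequence):
    """Run a listening sequence through the RNN; return softmax over items."""
    h = np.zeros(d_hid)
    for item in sequence:
        h = np.tanh(W @ h + U @ E[item])           # recurrent state update
    logits = V @ h + b
    z = np.exp(logits - logits.max())              # numerically stable softmax
    return z / z.sum()

p = next_item_probs([0, 3, 2])                     # distribution over next item
```

Training would backpropagate a cross-entropy loss on the actually-played next item through these weights; only the forward pass is shown.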
User Representations?
‣ Word2vec can output word / track representations, but what about user representations?
‣ Simple aggregation (bag of words)? Averaging problems.
‣ Doc2Vec? Retrain every time there is new user activity.
‣ Clustering? Loses vector-addition information.
‣ Learn the user vector through an RNN?
Another RNN approach
‣ Assume item vectors are fixed
‣ Try to learn the next item vector in the sequence
‣ Long-term intents: train the RNN to predict further ahead in the future
Challenges, what lies ahead
Side information in embedding models, remove regional
biases, external genre information, lyrics, Facebook /
Twitter account data, [ cover art, who knows :) ]
Deep Learning
Transfer Learning
Outlier Detection
Thank You!
You can reach me @
Email: rohanag@spotify.com
Twitter: @rohanag