RecSysFr #3
Recommendation @Deezer
RecSysFr, Paris, June 22nd, 2016
B. Mathieu, Head of Data Science
Deezer
/01 Deezer overview
Some technical numbers
● 420 employees in 20 cities
● 5M albums
● 40M tracks
● 100M playlists
● 16M MAU
● 6M subscribers
● ~500 servers
● 4.5 PB storage for audio files
● 1.5 TB of logs / day
● ~1B requests / day
● ~30k new albums each week
● Hadoop cluster with 1.5 PB storage, 4 TB RAM, 1000+ vcores
/02 Recommendation opportunities
Interactive Radios
● Interactive recommendation
● Understand user feedback
/03 Algorithms and Evaluation
Architecture overview
Content data:
- Tags
- Popularity
User data:
- Taste model
- Hot tracks
- Behaviors
Build tracklist
- Data cache
- User action history
- Update user models
- Consolidate tags data
- Build indexes
Action logs
A/B Tests evaluation metrics
● % of users listening for more than 10 min
● % of users who reconnected on more than 3 days in the last week
● % of users who do a like / dislike
=> take care of statistical confidence!
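The warning about statistical confidence can be made concrete with a small sketch. The metrics listed here are all proportions of users, so one common way to compare two test groups is a two-proportion z-test. The slides do not say which test was used at Deezer; the function name and the user counts below are hypothetical and serve only as an illustration.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). Assumes independent groups that are large
    enough for the normal approximation to hold.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis "both groups behave the same"
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variance: every user has the same outcome")
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: users who listened for more than 10 min in each group
z, p_value = two_proportion_z_test(4000, 10000, 4150, 10000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value suggests the difference between groups is unlikely to be noise alone. When several metrics are checked on the same test, the significance threshold should be tightened accordingly.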
Offline testing / benchmarking
● A/B tests are costly and slow
● We want to test more cases
Offline testing:
● Set up a benchmarking methodology
● Freeze the data and evaluate algorithms against users' future actions
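The "freeze data, evaluate on future actions" idea can be sketched as a time-based split. Everything here is an assumption made for illustration: events are (user, track, timestamp) tuples, precision@k stands in for whatever metric was actually used, and `most_popular` is a hypothetical baseline recommender, not Deezer's algorithm.

```python
from collections import Counter, defaultdict


def split_by_time(events, freeze_ts):
    """Split (user, track, timestamp) events at the freeze timestamp."""
    past = [e for e in events if e[2] < freeze_ts]
    future = [e for e in events if e[2] >= freeze_ts]
    return past, future


def most_popular(user, past):
    """Hypothetical baseline: popular tracks the user has not played yet."""
    seen = {track for u, track, _ in past if u == user}
    counts = Counter(track for _, track, _ in past)
    return [track for track, _ in counts.most_common() if track not in seen]


def precision_at_k(recommend, past, future, k=10):
    """Mean precision@k over users with at least one future action."""
    future_by_user = defaultdict(set)
    for user, track, _ in future:
        future_by_user[user].add(track)
    scores = []
    for user, relevant in future_by_user.items():
        # The recommender only ever sees data from before the freeze
        recs = recommend(user, past)[:k]
        hits = sum(1 for track in recs if track in relevant)
        scores.append(hits / k)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data: the algorithm sees events before t=10 and is scored on the rest
events = [
    ("u1", "a", 1), ("u2", "a", 2), ("u2", "b", 3), ("u3", "b", 4),
    ("u3", "a", 5), ("u1", "b", 10), ("u2", "c", 11),
]
past, future = split_by_time(events, freeze_ts=10)
print(precision_at_k(most_popular, past, future, k=1))
```

Because the split is by time rather than random, the recommender never sees information from after the freeze, which is closer to how it would behave in production.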
Evaluation funnel (diagram):
● Offline testing: candidates => best offline candidates
● User study: best offline candidates => best user study candidates
● A/B testing: best user study candidates => final choice
2013 - Shani, Gunawardana
Thanks for your attention
Enjoy RecSysFr #3 @Deezer!
http://www.deezer.com/jobs
