
Storm at Spotify

Slides for the NYC Storm user group meetup @spotify, Mar 25, 2014

Published in: Technology

  1. Storm at Spotify
     March 25, 2014
     Neville Li, neville@spotify.com, @sinisa_lyh
  2. About Me
     • @Spotify since 2011
     • Recommendation Team
     • Data & Backend
     • Storm, Scalding, Spark, Scala…
  3. Spotify in numbers
     Started in 2006, available in 55 markets
     20+ million songs, 20,000 added per day
     24+ million active users, 6+ million subscribers
     1.5 billion playlists
  4. Big Data @spotify
     600 node cluster
     Every day:
     • 400GB service logs
     • 4.5TB user data
     • 5,000 Hadoop jobs
     • 61TB generated
  5. What is Storm? In data-layman’s terms
     • Real time stream processing
     • Like Hadoop without HDFS
     • Like Map/Reduce with many reducer steps
     • Fault tolerant & guaranteed message processing
     Photo © Blaine Courts http://www.flickr.com/photos/blainecourts/8417266909/
  6. Storm @spotify
     • storm-0.8.0
     • 22 node cluster
     • 15+ topologies
     • 200,000+ tuples per second
     • recommendation, ads, monitoring, analytics, etc.
  7. First Storm Application @Spotify
     “Never Gonna Give You Up” Rick Astley Map
  8. RT Market Launch Stats
  9. Other Uses
     • Trending tracks
     • Email campaign
     • App performance tracking
     • UX tracking
  10. Anatomy of A Storm Topology
      From play to recommendation
  11. Social Listening, Take 1
      • PUB/SUB
      • Almost real-time
      • Spammy
      • Hard to scale
      All characters appearing in this work are fictitious. Any resemblance to real persons, living or dead, is purely coincidental.
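
The "Like Map/Reduce with many reducer steps" analogy on slide 5, and the "trending tracks" use on slide 9, can be illustrated with a small toy pipeline. This is only a sketch in plain Python and does not use Storm's API (Storm topologies are normally written against its own spout and bolt interfaces). The event data, the function names, and the number of tasks are all hypothetical and chosen purely for illustration; the slides do not describe how Spotify's topologies are built.

```python
from collections import Counter

# Hypothetical play events as (user, track) pairs. Not real data.
PLAYS = [
    ("alice", "track_a"),
    ("bob", "track_b"),
    ("carol", "track_a"),
    ("alice", "track_b"),
    ("bob", "track_a"),
]


def play_spout(events):
    """Source step: emit one tuple per play event (stands in for a spout)."""
    for user, track in events:
        yield (user, track)


def track_bolt(stream):
    """First processing step: keep only the field the next step groups on."""
    for _user, track in stream:
        yield (track, 1)


def count_bolt(stream, num_tasks=2):
    """Second step: route each tuple to a task by hashing the track name,
    so the same track always reaches the same task (similar in spirit to
    grouping a stream by field), and emit the running count."""
    tasks = [Counter() for _ in range(num_tasks)]
    for track, n in stream:
        task = tasks[hash(track) % num_tasks]
        task[track] += n
        yield (track, task[track])


def top_track_bolt(stream):
    """Third step: remember the latest count per track and emit the
    current leader after every update (ties broken by track name)."""
    latest = {}
    for track, count in stream:
        latest[track] = count
        yield max(latest.items(), key=lambda kv: (kv[1], kv[0]))


def run_topology(events):
    """Wire the steps together; each tuple flows through every step in turn."""
    stream = top_track_bolt(count_bolt(track_bolt(play_spout(events))))
    return list(stream)


RESULT = run_topology(PLAYS)
print(RESULT[-1])
```

Because the steps are chained generators, each event passes through all of them before the next event is read, which loosely mirrors tuple-at-a-time stream processing rather than a batch job. Fault tolerance and guaranteed message processing, which slide 5 lists as Storm features, are not modelled here.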