behind the scenes	


                 Spotify	

             Ricardo Santos	

                @ricardovice
“spotifiera”, anyone?
Main goals	

•  A big catalogue, tons of music	

•  Available everywhere	

•  Great user experience	

•  More convenient than piracy	

•  Fast	

•  Reliable, high availability	

•  Scalable to many, many users
Business idea	

•  Free ad-funded version	

•  Paid subscription where users get:	

  •  No advertisements	

  •  Mobile access	

  •  Offline playback	

  •  API access
“music itself is going to become
like running water or electricity”	

          David Bowie, 2002
Accessibility	

•  People should always be able to access
  music	

  •  Whenever they want	

  •  Wherever they are
The catalogue	

•  All content is delivered by labels	

•  Currently over 10 million tracks	

•  Growing every day, around 10k per day	

•  96-320 kbps audio streams; most are Ogg
  Vorbis q5, 160 kbps
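
As a rough worked example of what these bitrates mean in bytes, here is a minimal sketch. The function name and the four-minute track length are assumptions for illustration, not figures from the talk, and Vorbis is variable-bitrate, so real files will deviate from this constant-rate estimate.

```python
def track_size_bytes(bitrate_kbps: int, duration_seconds: int) -> int:
    """Approximate size of a constant-bitrate stream, ignoring container overhead."""
    bits = bitrate_kbps * 1000 * duration_seconds  # kbps means 1000 bits per second
    return bits // 8


if __name__ == "__main__":
    # A hypothetical four-minute (240 s) track at the bitrates mentioned above.
    for kbps in (96, 160, 320):
        print(kbps, "kbps ->", track_size_bytes(kbps, 240) / 1_000_000, "MB")
```

At 160 kbps this works out to about 20 kB per second of audio.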
that all sounds cool,
but let’s talk engineering!
“It’s Easy, Really.” 	

    Blaine Cook, 2007
Handling Growth	

•  Scaling is not an exact science	

•  There is no such thing as a magic formula	

•  Usage patterns differ	

•  There is always a limit to what you can
   handle	

•  Fail gracefully	

•  Continuous evolution process
Usage patterns	

Typically, some services are more demanding
than others. This can be due to:

•  Higher popularity	

•  Higher complexity	

•  Both combined
Decoupling	

•  Divide and conquer!	

•  Resources assigned individually	

•  Using the right tools to address each
   problem	

•  Organization and delegation	

•  Problems are isolated	

•  Easier to handle growth
Decoupling	

Spotify’s internal services include:	

•  Access Point	

•  User	

•  Playlist	

•  Search	

•  Browse	

Can you guess which one is the most complex?
Playlist!	

Though it may sound simple, by far the most
 demanding:	

•  For each user there are several playlists	

•  Push notifications	

•  Offline writing	

•  Conflict resolution without user interaction
Metadata services	

Search and Browse allow users to find music	

•  Both handle read requests	

•  But their usage and responses differ	

•  Each needs a data source optimized for
   its queries, called an index

•  These are hard to maintain, easier to
   regenerate
Speed thrills
Latency matters	

•  High latency is a problem, not only in First
  Person Shooters	

•  Increasing the latency of Google searches by
  100-400 ms decreased usage by 0.2-0.6%
  (Jake Brutlag, 2009)

•  Slow performance is one of the major
  reasons users abandon services	

•  Users don't come back
Focus on low latency	

•  Our SLA is maintained by monitoring
  latency on the client side	

•  On average, the human notion of
  “instantly” is 200ms	

•  The current median latency to begin to
  play a track in Spotify is 265ms	

•  Due to disk lookup, at times it's actually
  faster to start playing a track from network
  than from disk
Playing a track	

•  Check local cache	

•  Request first piece from Spotify servers	

•  Meanwhile, search P2P for remainder	

•  Switch between servers and P2P as needed

•  Towards the end of a track, start pre-
  fetching the next one via P2P rather than
  our servers
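
The steps above can be sketched as a small source-selection routine. This is only an illustration of the order of preference described on the slide: the names `Source`, `choose_source`, and `prefetch_source`, and the `buffer_low` flag, are hypothetical and not part of the Spotify client.

```python
from enum import Enum


class Source(Enum):
    CACHE = "local cache"
    SERVER = "Spotify servers"
    P2P = "P2P network"


def choose_source(piece_index: int, in_cache: bool, buffer_low: bool) -> Source:
    """Pick where to fetch one piece of the current track from (illustrative)."""
    if in_cache:
        return Source.CACHE    # check the local cache first
    if piece_index == 0:
        return Source.SERVER   # the first piece is requested from the servers
    if buffer_low:
        return Source.SERVER   # switch back to the servers when P2P is not enough
    return Source.P2P          # the remainder is searched for on the P2P network


def prefetch_source() -> Source:
    """The next track is pre-fetched via P2P rather than the servers."""
    return Source.P2P
```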
When to start playing?	

•  Trade-off between stutter and latency

•  Look at last 15 min of transfer rates	

•  Model as Markov chain and simulate	

•  Coupled with some heuristics
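
A minimal sketch of the idea, under stated assumptions: recent transfer rates are treated as the states of a Markov chain (each observed rate is followed by one of the rates that followed it in the history), and many simulated downloads estimate how often the buffer would run dry if playback started now. The one-sample-per-second granularity, the function names, and all parameters are assumptions; the actual model and heuristics are not described in the slides.

```python
import random
from collections import defaultdict


def build_chain(rates):
    """Map each observed rate to the list of rates that followed it."""
    chain = defaultdict(list)
    for current, following in zip(rates, rates[1:]):
        chain[current].append(following)
    return chain


def stutter_probability(rates, buffered, remaining, bitrate, runs=1000, seed=0):
    """Estimate the chance of a buffer underrun if playback starts now.

    rates     -- recent transfer rates in bytes/second, one sample per second,
                 already quantized into a small set of values (assumption)
    buffered  -- bytes already downloaded
    remaining -- bytes still to download
    bitrate   -- playback consumption in bytes/second, must be > 0
    """
    rng = random.Random(seed)
    chain = build_chain(rates)
    stutters = 0
    for _ in range(runs):
        state = rng.choice(rates)
        buf, left = buffered, remaining
        while left > 0:
            received = min(state, left)
            left -= received
            buf += received - bitrate  # one second of download and playback
            if buf < 0:
                stutters += 1
                break
            # Follow an observed transition; if this rate was only seen as the
            # last sample, restart from a random sample of the history.
            state = rng.choice(chain.get(state) or rates)
    return stutters / runs
```

Playback would then start once the estimate drops below some acceptable threshold, for example `stutter_probability(history, buffered, remaining, 20_000) < 0.01`; the threshold is likewise an assumption.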
Production storage	

•  Production storage is a cache with fast
  drives and lots of RAM

•  Serves the most popular content	

•  A cache miss will generate a request to
  master storage, slightly higher latency	

•  Production storage is available in several
  data centers to ensure closeness to the
  user (latency wise)
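
The relationship between the two storage tiers can be sketched as a read-through cache. The slides do not say how production storage decides what to keep; the least-recently-used eviction below, the class name, and the `fetch_from_master` callback are assumptions chosen for brevity.

```python
from collections import OrderedDict


class ProductionCache:
    """Read-through cache in front of a slower master store (illustrative)."""

    def __init__(self, capacity, fetch_from_master):
        self.capacity = capacity
        self.fetch_from_master = fetch_from_master  # called on a cache miss
        self.tracks = OrderedDict()
        self.misses = 0

    def get(self, track_id):
        if track_id in self.tracks:
            self.tracks.move_to_end(track_id)  # mark as recently used
            return self.tracks[track_id]
        self.misses += 1
        data = self.fetch_from_master(track_id)  # slightly higher latency
        self.tracks[track_id] = data
        if len(self.tracks) > self.capacity:
            self.tracks.popitem(last=False)  # evict the least recently used
        return data
```

Popular tracks stay in the cache because every hit moves them to the fresh end of the ordering, so only rarely requested content falls through to master storage.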
Master storage	

•  Works as a DHT, with some redundancy	

•  Contains all available tracks but has slower
  drives and access	

•  Tracks are kept in several formats, adding
  up to around 290TB
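
The slide only says that master storage works as a DHT with some redundancy. One common way to get that behaviour is consistent hashing with replication, sketched below; this is a stand-in technique, not a description of Spotify's storage, and the node names, the `replicas` parameter, and the use of SHA-256 are all assumptions.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Place a key on the hash ring using the first 8 bytes of its SHA-256."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class Ring:
    """Consistent-hash ring that stores each track on several nodes."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.points = sorted((_position(n), n) for n in nodes)
        self.keys = [p for p, _ in self.points]

    def nodes_for(self, track_id):
        """Return the nodes responsible for a track, walking clockwise."""
        start = bisect.bisect_right(self.keys, _position(track_id))
        count = min(self.replicas, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]
```

With `replicas=2`, every track lives on two distinct nodes, so losing one node does not make the track unavailable.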
P2P helps	

•  Easier to scale	

•  Fewer servers

•  Less bandwidth	

•  Better uptime	

•  Lower costs

•  Fun!
P2P overview	

•  Not a piracy network, all tracks are added
  by Spotify	

•  Used on all desktop clients (no mobile)	

•  Each client is connected to at most 60 others

•  All nodes are equals (no super nodes)	

•  A track is downloaded from several peers
P2P custom protocol	

•  Ask for most urgent pieces first	

•  If a peer is slow, re-request from new
  peers	

•  When buffers run low, download from
  central servers	

•  If loading from servers, estimate at what
  point P2P will catch up	

•  If buffers are very low, stop uploading
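
The rules above can be sketched as a handful of small decisions. The thresholds, the timeout handling, and every function name here are hypothetical; the slides give the rules but not the numbers.

```python
LOW_BUFFER_SECONDS = 10.0      # assumed threshold for "buffers run low"
VERY_LOW_BUFFER_SECONDS = 3.0  # assumed threshold for "buffers are very low"


def most_urgent_first(missing_pieces, playback_piece):
    """Order missing pieces so those needed soonest for playback come first."""
    return sorted(missing_pieces, key=lambda p: (p < playback_piece, p))


def slow_requests(pending, now, timeout):
    """Pieces whose request has been outstanding too long; re-request these
    from new peers. `pending` maps piece index to the time it was requested."""
    return sorted(p for p, sent_at in pending.items() if now - sent_at > timeout)


def download_source(buffered_seconds):
    """Fall back to the central servers when the buffer runs low."""
    return "server" if buffered_seconds < LOW_BUFFER_SECONDS else "p2p"


def may_upload(buffered_seconds):
    """Stop uploading to peers when the local buffer is very low."""
    return buffered_seconds >= VERY_LOW_BUFFER_SECONDS
```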
P2P finding peers	

•  Partial central tracker (BitTorrent-style)	

•  Broadcast query in small neighborhood
  (Gnutella-style)	

•  Two mechanisms result in higher
  availability

•  Limited broadcast for local (LAN) peer
  discovery (cherry on top...)
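
A toy sketch of why two lookup mechanisms help: the tracker only remembers some of the peers holding a track, so a query to directly connected neighbours can fill in what the tracker misses. The data structures, the `limit` parameter, and the function name are assumptions made for illustration.

```python
def find_peers(track_id, tracker, neighbours, limit=10):
    """Combine tracker results with answers from nearby peers.

    tracker    -- dict: track id -> list of peers the tracker remembers (partial)
    neighbours -- dict: neighbouring peer -> set of track ids it holds
    """
    found = list(tracker.get(track_id, []))
    for peer, tracks in neighbours.items():
        if track_id in tracks and peer not in found:
            found.append(peer)
    return found[:limit]
```

If the tracker has no entry for a track, the neighbourhood query can still return peers, and vice versa.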
P2P security	

•  The P2P network needs to be a safe and
  trusted one	

•  All exchanged files have to come originally
  from Spotify	

•  All peers should be trusted Spotify clients
Security through obscurity

•  Our client needs to be able to read
  metadata and play music	

•  At the same time we have to prevent
  reverse engineers from doing the same

•  Therefore, we can't openly discuss the
  details
but…	

•  Closed environment	

•  Integrity of downloaded files is checked	

•  Data transfers are encrypted	

•  Usernames are not exposed in P2P
  network, all peers assigned pseudonym	

•  Software obfuscation makes life difficult for
  reverse engineers
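
The talk deliberately leaves out the details, so the following is only a generic sketch of what checking the integrity of a downloaded piece can look like. The choice of SHA-256, the idea that the expected digest comes from a trusted source, and the function name are all assumptions.

```python
import hashlib
import hmac


def piece_is_intact(data: bytes, expected_sha256_hex: str) -> bool:
    """Compare a piece received from a peer against a digest obtained from a
    trusted source (assumed here to be the service itself, not the peer)."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex)
```

A piece that fails the check would simply be discarded and requested again from another source.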
Software obfuscation
So, what's the outcome?

•  At over 10 million users the responses are	

  •  55.4% from client cache	

  •  35.8% from the P2P network	

  •  8.8% from the servers
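
A quick piece of arithmetic on the figures above shows how much of the actual download traffic the P2P network absorbs. The variable names are illustrative; the three percentages are the ones on the slide.

```python
shares = {"client cache": 55.4, "P2P network": 35.8, "servers": 8.8}

total = sum(shares.values())  # the three shares should add up to 100%
not_cached = shares["P2P network"] + shares["servers"]
p2p_share_of_downloads = shares["P2P network"] / not_cached * 100

if __name__ == "__main__":
    print(f"fetched over the network: {not_cached:.1f}%")
    print(f"of which from P2P: {p2p_share_of_downloads:.1f}%")
```

In other words, of the responses that are not served from the local cache, roughly four out of five come from peers rather than from the servers.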
Oh, and we have cake as well! :D

spotify.com/jobs
jobs@spotify.com
I'd like to know more...	

•  Get in touch with us	

•  Check out Gunnar Kreitz's slides and
  academic papers on the subject:

http://www.csc.kth.se/~gkreitz/spotify-p2p10/
Thanks!	

http://commons.wikimedia.org/wiki/File:Surprised_young_cat.JPG	


http://commons.wikimedia.org/wiki/File:Chicken_February_2009-1.jpg	


http://xkcd.com/257/

g2nightmarescribd
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 

Recently uploaded (20)

Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 

Spotify: behind the scenes

  • 1. behind the scenes Spotify Ricardo Santos @ricardovice
  • 3. Main goals •  A big catalogue, tons of music •  Available everywhere •  Great user experience •  More convenient than piracy •  Fast •  Reliable, high availability •  Scalable to many, many users
  • 4. Business idea •  Free ad-funded version •  Paid subscription where users get: •  No advertisements •  Mobile access •  Offline playback •  API access
  • 5. “music itself is going to become like running water or electricity” David Bowie, 2002
  • 6. Accessibility •  People should always be able to access music •  Whenever they want •  Wherever they are
  • 7.
  • 8.
  • 9.
  • 10. The catalogue •  All content is delivered by labels •  Currently over 10 million tracks •  Growing every day, around 10k per day •  96-320 kbps audio streams, most are Ogg Vorbis q5, 160kbps
  • 11. that all sounds cool, but let’s talk engineering!
  • 12. “It’s Easy, Really.” Blaine Cook, 2007
  • 13. Handling Growth •  Scaling is not an exact science •  There is no such thing as a magic formula •  Usage patterns differ •  There is always a limit to what you can handle •  Fail gracefully •  Continuous evolution process
  • 14. Usage patterns Typically, some services are more demanding than others, this can be due to: •  Higher popularity •  Higher complexity •  Both combined
  • 15. Decoupling •  Divide and conquer! •  Resources assigned individually •  Using the right tools to address each problem •  Organization and delegation •  Problems are isolated •  Easier to handle growth
  • 16. Decoupling Spotify’s internal services include: •  Access Point •  User •  Playlist •  Search •  Browse Can you guess which one is the most complex?
  • 18. Playlist! Though it may sound simple, by far the most demanding: •  For each user there are several playlists •  Push notifications •  Offline writing •  Conflict resolution without user interaction
  • 19. Metadata services Search and Browse allow users to find music •  Both handle read requests •  But their usage and responses differ •  Data sources should be optimized for each of these, called indices •  These are hard to maintain, easier to regenerate
  • 20.
  • 22. Latency matters •  High latency is a problem, not only in First Person Shooters •  Increasing the latency of Google searches by 100–400 ms decreased usage by 0.2–0.6% (Jake Brutlag, 2009) •  Slow performance is one of the major reasons users abandon services •  Users don't come back
  • 23. Focus on low latency •  Our SLA is maintained by monitoring latency on the client side •  On average, the human notion of “instantly” is 200ms •  The current median latency to begin to play a track in Spotify is 265ms •  Due to disk lookup, at times it's actually faster to start playing a track from network than from disk
  • 24. Playing a track •  Check local cache •  Request first piece from Spotify servers •  Meanwhile, search P2P for remainder •  Switch between servers and P2P as needed •  Towards the end of a track, start pre-fetching the next one via P2P rather than our servers
  • 25. When to start playing? •  Trade-off between stutter and latency •  Look at last 15 min of transfer rates •  Model as Markov chain and simulate •  Coupled with some heuristics
  • 26.
  • 27. Production storage •  Production storage is a cache with fast drives and lots of RAM •  Serves the most popular content •  A cache miss will generate a request to master storage, with slightly higher latency •  Production storage is available in several data centers to ensure closeness to the user (latency-wise)
  • 28. Master storage •  Works as a DHT, with some redundancy •  Contains all available tracks but has slower drives and access •  Tracks are kept in several formats, adding up to around 290TB
  • 29.
  • 30. P2P helps •  Easier to scale •  Fewer servers •  Less bandwidth •  Better uptime •  Lower costs •  Fun!
  • 31. P2P overview •  Not a piracy network, all tracks are added by Spotify •  Used on all desktop clients (no mobile) •  Each client connected to at most 60 others •  All nodes are equals (no super nodes) •  A track is downloaded from several peers
  • 32. P2P custom protocol •  Ask for most urgent pieces first •  If a peer is slow, re-request from new peers •  When buffers run low, download from central servers •  If loading from servers, estimate at what point P2P will catch up •  If buffers are very low, stop uploading
  • 33. P2P finding peers •  Partial central tracker (BitTorrent-style) •  Broadcast query in small neighborhood (Gnutella-style) •  Using two mechanisms results in higher availability •  Limited broadcast for local (LAN) peer discovery (cherry on top...)
  • 34. P2P security •  The P2P network needs to be a safe and trusted one •  All exchanged files have to come originally from Spotify •  All peers should be trusted Spotify clients
  • 35. Security through obscurity •  Our client needs to be able to read metadata and play music •  At the same time we have to prevent reverse engineers from doing the same •  Therefore, we can't openly discuss the details
  • 36. but… •  Closed environment •  Integrity of downloaded files is checked •  Data transfers are encrypted •  Usernames are not exposed in the P2P network, all peers are assigned a pseudonym •  Software obfuscation makes life difficult for reverse engineers
  • 38. So, what's the outcome? •  At over 10 million users the responses are •  55.4% from client cache •  35.8% from the P2P network •  8.8% from the servers
  • 39.
  • 40. Oh, and we have cake as well! :D spotify.com/jobs jobs@spotify.com
  • 41. I'd like to know more... •  Get in touch with us •  Check out Gunnar Kreitz's slides and academic papers on the subject: http://www.csc.kth.se/~gkreitz/spotify-p2p10/
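
Slide 25 ("When to start playing?") describes modelling recent transfer rates as a Markov chain and simulating the download to trade stutter against latency. The slides give no implementation details, so the following is only a rough sketch of that idea under stated assumptions: rates are sampled once per second in kB/s, they are grouped into 50 kB/s buckets to form the chain's states, and playback consumes 20 kB per second (160 kbps). The function names, the bucket size, and the 1% risk threshold are all hypothetical and not taken from Spotify's client.

```python
import random
from collections import defaultdict

def build_chain(rates, bucket=50):
    """Discretise observed rates (kB/s) into buckets and record transitions."""
    states = [r // bucket * bucket for r in rates]
    chain = defaultdict(list)
    for cur, nxt in zip(states, states[1:]):
        chain[cur].append(nxt)  # duplicates encode empirical probabilities
    return chain, states[-1]

def stutter_probability(chain, start, buffered_kb, remaining_kb,
                        playback_kb_per_s=20, runs=500, seed=0):
    """Simulate playback second by second; return the share of runs that underrun."""
    rng = random.Random(seed)
    stutters = 0
    for _ in range(runs):
        state, buf, left = start, buffered_kb, remaining_kb
        while left > 0:
            successors = chain.get(state)
            if successors:  # states never seen with a successor simply persist
                state = rng.choice(successors)
            got = min(state, left)
            left -= got
            buf += got - playback_kb_per_s
            if left == 0:
                break  # everything is downloaded, no further underrun possible
            if buf < 0:
                stutters += 1
                break
    return stutters / runs

def min_start_buffer(chain, start, track_kb, max_risk=0.01, step_kb=20):
    """Smallest initial buffer (kB) whose simulated stutter risk is acceptable."""
    for buffered in range(0, track_kb + 1, step_kb):
        risk = stutter_probability(chain, start, buffered, track_kb - buffered)
        if risk <= max_risk:
            return buffered
    return track_kb

# Hypothetical samples: a steady 100 kB/s link versus a dead link.
fast_chain, fast_start = build_chain([100] * 30)
dead_chain, dead_start = build_chain([0] * 30)
print(min_start_buffer(fast_chain, fast_start, track_kb=1000))
print(min_start_buffer(dead_chain, dead_start, track_kb=100))
```

On a link that comfortably exceeds the playback rate, the sketch starts playing immediately; on a dead link it waits for the whole track. The "some heuristics" the slide mentions are not modelled here.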