Playlists at Spotify
Using Cassandra to store version controlled objects at large scale
Jimmy Mårdell <yarin@spotify.com>

#CassandraEU

October 18, 2013
Intro

About me
• Jimmy Mårdell
• Software Engineer
• 3 years at Spotify


About Spotify
• 24 million active users
– 6 million paying subscribers
• 4 000 servers in 4 data centers
• Over 1 billion playlists created


Contents
• Why version control?
• Playlists at Spotify
• Cassandra data model
• Lessons learned

Why version control?

What is version control?
• “Version control is the management of changes to documents” (Wikipedia)
• Stand-alone (most common)
– Git, Subversion, etc.

• Embedded
– Google Docs

Embedded usage
• Collaborative editing
• Undo functionality
• Performance
• Business logic depends on document history

Playlists at Spotify

Playlists

Playlist challenges
• More than 1 billion playlists
• >40 000 requests/second at peak
• Offline mode
• Concurrent changes

Playlist client-server
• Every playlist is a version-controlled object
• All playlists are synced on login
– Fetch all new changes

Playlist client-server
• Local queue of playlist modifications
– Clients optimistically accept changes - fast UI

• Queue flushed to server when possible
– Offline changes
– Fault tolerant

Playlist version control
Representation of a playlist in the backend (newest revision first; each change is followed by the resulting track list):

  3,038f...: REM(from=2, len=1)          -> A, C
  2,19ca...: MOV(from=2, to=1, len=1)    -> A, C, B
  1,4ed2...: ADD(ix=0, track=A,B,C)      -> A, B, C
  0,ROOT
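The revision chain above can be replayed with a few list operations. This is a minimal sketch, not the backend's implementation: the function names are made up for illustration, and the index semantics of MOV (the `to` index is interpreted after the moved tracks have been taken out) are an assumption chosen because it reproduces the three track lists on the slide.

```python
def add(tracks, ix, new):
    """ADD: insert the new tracks at index ix."""
    return tracks[:ix] + list(new) + tracks[ix:]


def mov(tracks, frm, to, length):
    """MOV: take `length` tracks starting at `frm` and reinsert them at `to`.

    Assumption: `to` is an index into the list after the moved tracks
    have been removed.
    """
    moved = tracks[frm:frm + length]
    rest = tracks[:frm] + tracks[frm + length:]
    return rest[:to] + moved + rest[to:]


def rem(tracks, frm, length):
    """REM: delete `length` tracks starting at `frm`."""
    return tracks[:frm] + tracks[frm + length:]


def replay(ops):
    """Apply the changes oldest-first, recording the playlist after each one."""
    tracks, states = [], []
    for fn, kwargs in ops:
        tracks = fn(tracks, **kwargs)
        states.append(tracks)
    return states


# Revisions 1, 2 and 3 from the slide, applied on top of 0,ROOT (empty playlist).
STATES = replay([
    (add, {"ix": 0, "new": ["A", "B", "C"]}),
    (mov, {"frm": 2, "to": 1, "length": 1}),
    (rem, {"frm": 2, "length": 1}),
])
```

`STATES` then holds the playlist contents after each of the three revisions.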

Playlist branching
• Concurrent changes
– Offline

• Conflict resolution
– Operational Transformation

• Clients oblivious of branches

[Diagram: concurrent changes A and B branch from a common revision; the branches are merged using the transformed changes A’ and B’]

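To make the A/B/A’/B’ picture concrete, here is a toy operational transformation for two concurrent ADD operations only. It is an illustrative sketch, not Spotify's merge algorithm: the `Add` type, the `transform` rule and the tie-break (A's tracks go first when both insert at the same index) are all assumptions, and MOV and REM are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Add:
    ix: int
    tracks: Tuple[str, ...]


def apply(playlist: List[str], op: Add) -> List[str]:
    return playlist[:op.ix] + list(op.tracks) + playlist[op.ix:]


def transform(op: Add, other: Add, op_wins_ties: bool) -> Add:
    """Rewrite `op` so it can be applied after `other` has already been applied.

    If `other` inserted at an earlier index (or at the same index and wins
    the tie), everything `op` refers to has shifted right by len(other.tracks).
    """
    shifts = other.ix < op.ix or (other.ix == op.ix and not op_wins_ties)
    return Add(op.ix + len(other.tracks), op.tracks) if shifts else op


def merge(base: List[str], a: Add, b: Add) -> List[str]:
    """Apply A then B', and B then A'; both orders must converge."""
    via_a = apply(apply(base, a), transform(b, a, op_wins_ties=False))  # A, B'
    via_b = apply(apply(base, b), transform(a, b, op_wins_ties=True))   # B, A'
    assert via_a == via_b, "transform must make both orders converge"
    return via_a


MERGED = merge(["A", "B", "C"], Add(1, ("X",)), Add(2, ("Y",)))
MERGED_TIE = merge(["A", "B", "C"], Add(1, ("X",)), Add(1, ("Y",)))
```

The point of the sketch is the convergence property: whichever branch is applied first, the transformed other branch leads to the same playlist, which is what lets clients stay oblivious of branches.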
Cassandra data model

Cassandra at Spotify
• Playlist was the first system to use Cassandra
– Now we use it a lot...

• Started with Cassandra 0.7
• Using a limited set of Cassandra features
– No super columns
– No CQL

Planning a data model
• Start with the queries!
• Three common playlist queries
– SYNC: Get all changes since a particular revision
– GET: Get the most recent snapshot
– APPEND: Add/move/delete tracks
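As a rough illustration of the SYNC query, the sketch below walks parent pointers from the latest revision back to the revision the client already has. It is an in-memory stand-in, not the service's code: `CHANGES` and `changes_since` are hypothetical names, and a single linear history (no branches) is assumed.

```python
# revision -> (parent revision, operation), using the revisions from the slides
CHANGES = {
    "1,4ed2": ("0,ROOT", "ADD(ix=0, track=A,B,C)"),
    "2,19ca": ("1,4ed2", "MOV(from=2, to=1, len=1)"),
    "3,038f": ("2,19ca", "REM(from=2, len=1)"),
}


def changes_since(changes, head, known_rev):
    """Return [(revision, op), ...] oldest-first for everything after known_rev."""
    pending = []
    rev = head
    while rev != known_rev:
        if rev not in changes:
            # Walked past the root without meeting the client's revision.
            raise ValueError(f"{known_rev} is not an ancestor of {head}")
        parent, op = changes[rev]
        pending.append((rev, op))
        rev = parent
    pending.reverse()
    return pending


TO_SEND = changes_since(CHANGES, head="3,038f", known_rev="1,4ed2")
```

A client that is already at the head gets an empty list back, which is the common case the later slides optimize for.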

Playlist data model

CF playlist_change
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    1,4ed2...  parent=0,ROOT     op=ADD(ix=0, track=A,B,C)
    2,19ca...  parent=1,4ed2...  op=MOV(from=2, to=1, len=1)
    3,038f...  parent=2,19ca     op=REM(from=2, len=1)

Playlist data model

CF playlist_change
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    1,4ed2...  parent=0,ROOT     op=ADD(ix=0, track=A,B,C)
    2,19ca...  parent=1,4ed2...  op=MOV(from=2, to=1, len=1)
    3,038f...  parent=2,19ca     op=REM(from=2, len=1)

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    1,8a20...  prnt=0,ROOT     op=...
    2,b783...  prnt=1,8a20...  op=...
    2,dd07...  prnt=1,8a20...  op=...
    3,39ef...  prnt=2,dd07...  op=...
    3,5a9c...  prnt=2,b783...  op=...
    4,03fc...  prnt=2,39ef..., prnt=3,5a9c...

Playlists in Cassandra
• Which revision is the latest?
– Changes with no children

• Multiple heads possible!
– Heads may appear anywhere within the row
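The "changes with no children" rule can be sketched as a scan over one playlist_change row. This is only an illustration: `find_heads` and the two dictionaries are hypothetical, with revision names taken from the slides, and it shows why a dedicated index is attractive, since every column in the row has to be inspected.

```python
def find_heads(parents_by_rev):
    """Return the revisions that no other revision names as a parent."""
    has_child = {p for parents in parents_by_rev.values() for p in parents}
    return sorted(rev for rev in parents_by_rev if rev not in has_child)


# Linear history: a single head.
LINEAR = {
    "1,4ed2": ["0,ROOT"],
    "2,19ca": ["1,4ed2"],
    "3,038f": ["2,19ca"],
}

# Two concurrent changes on top of the same parent: two heads.
BRANCHED = {
    "1,8a20": ["0,ROOT"],
    "2,b783": ["1,8a20"],
    "2,dd07": ["1,8a20"],
}

LINEAR_HEADS = find_heads(LINEAR)
BRANCHED_HEADS = find_heads(BRANCHED)
```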

Playlist data model

CF playlist_change
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    1,4ed2...  prnt=0,ROOT     op=...
    2,19ca...  prnt=1,4ed2...  op=...
    3,038f...  prnt=2,19ca     op=...

CF playlist_head
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    3,038f...

Playlist data model

CF playlist_change
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    1,4ed2...  prnt=0,ROOT     op=...
    2,19ca...  prnt=1,4ed2...  op=...
    3,038f...  prnt=2,19ca     op=...

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    1,8a20...  prnt=0,ROOT     op=...
    2,b783...  prnt=1,8a20...  op=...
    2,dd07...  prnt=1,8a20...  op=...

CF playlist_head
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    3,038f...

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    2,b783...
    2,dd07...

Playlist data model

CF playlist_change
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    1,4ed2...  prnt=0,ROOT     op=...
    2,19ca...  prnt=1,4ed2...  op=...
    3,038f...  prnt=2,19ca     op=...

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    1,8a20...  prnt=0,ROOT  op=...
    2,b783...  prnt=1,8a20  op=...
    2,dd07...  prnt=1,8a20  op=...
    3,39ef...  prnt=2,dd07  op=...
    3,5a9c...  prnt=2,b783  op=...
    4,03fc...  prnt=2,39ef, prnt=3,5a9c

CF playlist_head
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    3,038f...

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    4,03fc...

Playlist heads
• playlist_head is a small CF
– Fits in RAM

• 95% of playlist requests read only from playlist_head
– Most playlists are already up-to-date
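The fast path this implies might look like the sketch below: compare the client's revision with the head and touch playlist_change only when they differ. The function and parameter names are hypothetical, and the loader is a stand-in for whatever actually reads playlist_change.

```python
def handle_sync(client_rev, heads, load_changes):
    """Answer a sync request, reading playlist_change only when necessary.

    heads        -- revisions stored in playlist_head for this playlist
    load_changes -- callable(client_rev) that reads playlist_change (expensive)
    """
    if list(heads) == [client_rev]:
        return []  # already up to date: playlist_head was the only read
    return load_changes(client_rev)


# Stand-in loader that records how often playlist_change is touched.
CHANGE_READS = []


def fake_load_changes(client_rev):
    CHANGE_READS.append(client_rev)
    return [("3,038f", "REM(from=2, len=1)")]


UP_TO_DATE = handle_sync("3,038f", ["3,038f"], fake_load_changes)
BEHIND = handle_sync("2,19ca", ["3,038f"], fake_load_changes)
```

Only the second call reaches the loader, which is the behaviour the 95% figure on the slide describes.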

Playlist snapshots
• playlist_change works well when syncing playlists
• Not so well for fetching new playlists
– Snapshot cache

Playlist data model

CF playlist_change
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    1,4ed2...  prnt=0,ROOT     op=...
    2,19ca...  prnt=1,4ed2...  op=...
    3,038f...  prnt=2,19ca     op=...

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    1,8a20...  prnt=0,ROOT     op=...
    2,b783...  prnt=1,8a20...  op=...
    2,dd07...  prnt=1,8a20...  op=...

CF playlist_snapshot
  Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
    cache  version=3,038f...  contents=A,C

  Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
    cache  version=2,b783...  contents=...

Updating playlists
• Validate change
– Locate snapshot
– Client may append to old version

• Update all tables
– playlist_head last
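A sketch of the write ordering, using plain dictionaries in place of the three column families. Everything here is illustrative: the class, the revision id `2,ffff`, and the simplification that the snapshot is overwritten on every append are assumptions. A plausible reason for writing playlist_head last is that a reader following the head then always finds the change it points at.

```python
class PlaylistStore:
    """In-memory stand-in for playlist_change, playlist_snapshot, playlist_head."""

    def __init__(self):
        self.change = {}     # (playlist, revision) -> (parent, op)
        self.snapshot = {}   # playlist -> (version, contents)
        self.head = {}       # playlist -> set of head revisions
        self.write_log = []  # order in which the "tables" were written

    def append(self, playlist, parent, rev, op, contents):
        # Validate: the parent must exist, but it need not be the current
        # head, since a client may append to an old version.
        if parent != "0,ROOT" and (playlist, parent) not in self.change:
            raise ValueError("unknown parent revision")

        self.change[(playlist, rev)] = (parent, op)
        self.write_log.append("playlist_change")

        self.snapshot[playlist] = (rev, contents)
        self.write_log.append("playlist_snapshot")

        # Head last: replace the parent if it was a head, otherwise add a branch.
        heads = self.head.setdefault(playlist, set())
        heads.discard(parent)
        heads.add(rev)
        self.write_log.append("playlist_head")


STORE = PlaylistStore()
STORE.append("p", "0,ROOT", "1,4ed2", "ADD(ix=0, track=A,B,C)", ["A", "B", "C"])
STORE.append("p", "1,4ed2", "2,19ca", "MOV(from=2, to=1, len=1)", ["A", "C", "B"])
HEADS_LINEAR = set(STORE.head["p"])

# A client appending to the old revision 1,4ed2 creates a second head.
STORE.append("p", "1,4ed2", "2,ffff", "REM(from=0, len=1)", ["B", "C"])
HEADS_BRANCHED = set(STORE.head["p"])
```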

Cassandra consistency levels
• Replication factor 3
• All writes using CL_QUORUM
• Reads from playlist_head
– CL_QUORUM

• Reads from playlist_change and playlist_snapshot
– CL_ONE but may fall back to CL_QUORUM
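The arithmetic behind these settings: with replication factor 3 a quorum is 2 replicas, and a quorum write plus a quorum read always overlap in at least one replica, while a CL_ONE read may hit a replica that has not seen the write yet. The sketch below shows that calculation and one possible shape of the fallback. The `read` callable and the trigger for the fallback (the wanted revision is missing from the CL_ONE result) are assumptions; the slides do not say when the fallback happens.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def overlaps(replication_factor, write_replicas, read_replicas):
    """True if every read is guaranteed to see the latest acknowledged write."""
    return write_replicas + read_replicas > replication_factor


def read_with_fallback(read, key, wanted_rev):
    """Try a cheap CL_ONE read first; retry at QUORUM if the revision is missing.

    `read` is a hypothetical callable(key, consistency_level) -> dict of columns.
    """
    row = read(key, "ONE")
    if wanted_rev not in row:
        row = read(key, "QUORUM")
    return row


RF = 3
QUORUM_SIZE = quorum(RF)


def stale_replica_read(key, level):
    """Fake reader: the replica answering CL_ONE has not received 3,038f yet."""
    row = {"1,4ed2": "...", "2,19ca": "..."}
    if level == "QUORUM":
        row["3,038f"] = "..."
    return row


ROW = read_with_fallback(stale_replica_read, "playlist", "3,038f")
```

Because playlist_head is read at QUORUM, the service knows which revision to expect, so a stale CL_ONE read of playlist_change or playlist_snapshot can be detected rather than silently served.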

Lessons learned

Optimizations
• Leveled compaction
– Improved performance a lot
• Compression
– Not as impressive
– CRC checks

Optimizations
• Trusted the Linux page cache to keep playlist_head in RAM
– Didn’t work

• Tried Cassandra row cache
– NO!

• mlock to the rescue

An enterprise ready solution
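bash# while true; do
    # -l locks the matching files into RAM with mlock; -m raises the per-file size limit
    vmtouch -m 10000000000 -l *head* & sleep 10m
    # stop the background vmtouch after 10 minutes; the loop then starts it again
    kill %vmtouch
done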

No moving parts
• Flash disks are awesome
• Reduced size of cluster from 60 to 30 nodes
– Thanks FusionIO!

• IOPS no longer the bottleneck

Tombstone hell
• Noticed requests to playlist_head took several seconds
– Huh?

• Every change causes a value to be deleted in playlist_head
• playlist_head is essentially a queue
– Well-known anti-pattern

Tombstone hell
• We had rows with >500,000 tombstones
• Solution: major compaction
– Relatively fast since playlist_head is in RAM
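A toy model of why a queue-like row gets slow: every change deletes the old head column and writes a new one, so a read has to step over every tombstone that has not been compacted away. This is a simplified simulation, not Cassandra's storage engine; `HeadRow` is a hypothetical class, and real tombstones are only purged once `gc_grace_seconds` has passed.

```python
class HeadRow:
    """One playlist_head row, modelled as a list of [revision, is_tombstone]."""

    def __init__(self):
        self.cells = []

    def advance(self, new_rev):
        """A change arrives: the old head is deleted, the new head is written."""
        if self.cells:
            self.cells[-1][1] = True  # the delete leaves a tombstone behind
        self.cells.append([new_rev, False])

    def read(self):
        """Return (live revisions, cells scanned to find them)."""
        live = [rev for rev, dead in self.cells if not dead]
        return live, len(self.cells)

    def major_compaction(self):
        """Drop tombstones (ignoring gc_grace_seconds in this model)."""
        self.cells = [cell for cell in self.cells if not cell[1]]


ROW = HeadRow()
for n in range(1000):
    ROW.advance(f"rev-{n}")

BEFORE = ROW.read()
ROW.major_compaction()
AFTER = ROW.read()
```

Before compaction the read scans 1000 cells to return a single live head; afterwards it scans one. Scale the loop to the >500,000 tombstones mentioned above and multi-second reads become unsurprising.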

And more...
• Large rows in playlist_change
– Modify version graph

• Reduce number of requests
– Group playlists by owner

Sounds interesting? We’re hiring!

Questions?

More Related Content

What's hot

Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
Benjamin Leonhardi
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
Ryan Blue
 
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)Adam Kawa
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
GetInData
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
DataWorks Summit
 
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
ScyllaDB
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
Alexey Grishchenko
 
Spotify architecture - Pressing play
Spotify architecture - Pressing playSpotify architecture - Pressing play
Spotify architecture - Pressing play
Niklas Gustavsson
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
Jiangjie Qin
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
DataWorks Summit
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
Victor Coustenoble
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
Brendan Gregg
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
StreamNative
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Simplilearn
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
Amazon Web Services
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Apache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal
Apache Spark Based Reliable Data Ingestion in Datalake with Gagan AgrawalApache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal
Apache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal
Databricks
 

What's hot (20)

Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Spotify architecture - Pressing play
Spotify architecture - Pressing playSpotify architecture - Pressing play
Spotify architecture - Pressing play
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Apache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal
Apache Spark Based Reliable Data Ingestion in Datalake with Gagan AgrawalApache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal
Apache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal
 

Similar to Playlists at Spotify - Using Cassandra to store version controlled objects

Playlists at Spotify
Playlists at SpotifyPlaylists at Spotify
Playlists at Spotify
DataStax Academy
 
Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...
Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...
Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...
DataStax
 
Recsys Challenge 2018 - Creamy Fireflies - Artist-driven layering and user’s...
Recsys Challenge 2018 - Creamy Fireflies -  Artist-driven layering and user’s...Recsys Challenge 2018 - Creamy Fireflies -  Artist-driven layering and user’s...
Recsys Challenge 2018 - Creamy Fireflies - Artist-driven layering and user’s...
Emanuele Chioso
 
Playlist Recommendations @ Spotify
Playlist Recommendations @ SpotifyPlaylist Recommendations @ Spotify
Playlist Recommendations @ Spotify
Nikhil Tibrewal
 
Spotify cassandra london
Spotify cassandra londonSpotify cassandra london
Spotify cassandra london
Noa Resare
 
guider: a system-wide performance analyzer
guider: a system-wide performance analyzerguider: a system-wide performance analyzer
guider: a system-wide performance analyzer
Peace Lee
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
Patrick McFadin
 
Recommendation @Deezer
Recommendation @DeezerRecommendation @Deezer
Recommendation @Deezer
recsysfr
 
Tutorial(release)
Tutorial(release)Tutorial(release)
Tutorial(release)Oshin Hung
 
Automatic Discovery of Service Metadata for Systems at Scale
Automatic Discovery of Service Metadata for Systems at ScaleAutomatic Discovery of Service Metadata for Systems at Scale
Automatic Discovery of Service Metadata for Systems at Scale
Martina Iglesias Fernández
 
Last.fm API workshop - Stockholm
Last.fm API workshop - StockholmLast.fm API workshop - Stockholm
Last.fm API workshop - Stockholm
Matthew Ogle
 
Spotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairsSpotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairs
DataStax Academy
 
Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...
Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...
Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...
Compuware
 
Miyagawa
MiyagawaMiyagawa
Miyagawaguru100
 
Last.fm - Lessons from building the World's largest social music platform
Last.fm - Lessons from building the World's largest social music platform Last.fm - Lessons from building the World's largest social music platform
Last.fm - Lessons from building the World's largest social music platform
randomfromtheweb
 
Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...
Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...
Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...
Paul Leclercq
 
The Zeitgeist Movement
The Zeitgeist MovementThe Zeitgeist Movement
The Zeitgeist Movement
guest915c8c5
 

Similar to Playlists at Spotify - Using Cassandra to store version controlled objects (20)

Playlists at Spotify
Playlists at SpotifyPlaylists at Spotify
Playlists at Spotify
 
Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...
Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...
Hecuba2: Cassandra Operations Made Easy (Radovan Zvoncek, Spotify) | C* Summi...
 
Recsys Challenge 2018 - Creamy Fireflies - Artist-driven layering and user’s...
Recsys Challenge 2018 - Creamy Fireflies -  Artist-driven layering and user’s...Recsys Challenge 2018 - Creamy Fireflies -  Artist-driven layering and user’s...
Recsys Challenge 2018 - Creamy Fireflies - Artist-driven layering and user’s...
 
Playlist Recommendations @ Spotify
Playlist Recommendations @ SpotifyPlaylist Recommendations @ Spotify
Playlist Recommendations @ Spotify
 
Spotify cassandra london
Spotify cassandra londonSpotify cassandra london
Spotify cassandra london
 
guider: a system-wide performance analyzer
guider: a system-wide performance analyzerguider: a system-wide performance analyzer
guider: a system-wide performance analyzer
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Recommendation @Deezer
Recommendation @DeezerRecommendation @Deezer
Recommendation @Deezer
 
Tutorial(release)
Tutorial(release)Tutorial(release)
Tutorial(release)
 
Automatic Discovery of Service Metadata for Systems at Scale
Automatic Discovery of Service Metadata for Systems at ScaleAutomatic Discovery of Service Metadata for Systems at Scale
Automatic Discovery of Service Metadata for Systems at Scale
 
Last.fm API workshop - Stockholm
Last.fm API workshop - StockholmLast.fm API workshop - Stockholm
Last.fm API workshop - Stockholm
 
Spotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairsSpotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairs
 
Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...
Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...
Using Compuware Strobe to Save CPU: 4 Real-life Cases from the Files of CPT G...
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Miyagawa
MiyagawaMiyagawa
Miyagawa
 
Last.fm - Lessons from building the World's largest social music platform
Last.fm - Lessons from building the World's largest social music platform Last.fm - Lessons from building the World's largest social music platform
Last.fm - Lessons from building the World's largest social music platform
 
Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...
Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...
Analyze one year of radio station songs aired with Spark SQL, Spotify, and Da...
 
The Zeitgeist Movement
The Zeitgeist MovementThe Zeitgeist Movement
The Zeitgeist Movement
 

Recently uploaded

JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 

Recently uploaded (20)

JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Bits & Pixels using AI for Good.........

Playlists at Spotify - Using Cassandra to store version controlled objects

  • 1. Playlists at Spotify Using Cassandra to store version controlled objects at large scale Jimmy Mårdell <yarin@spotify.com> #CassandraEU October 18, 2013
  • 2. Intro: About me • Jimmy Mårdell • Software Engineer • 3 years at Spotify
  • 3. Intro: About Spotify • 24 million active users – 6 million paying subscribers • 4 000 servers in 4 data centers • Over 1 billion playlists created
  • 4. Intro: Contents • Why version control? • Playlists at Spotify • Cassandra data model • Lessons learned
  • 5. Why version control? What is version control? • “Version control is the management of changes to documents” (Wikipedia) • Stand-alone (most common) – Git, Subversion, etc. • Embedded – Google Docs
  • 6. Why version control? Embedded usage • Collaborative editing • Undo functionality • Performance • Business logic depends on document history
  • 9. Playlists at Spotify: Playlist challenges • More than 1 billion playlists • >40 000 requests/second at peak • Offline mode • Concurrent changes
  • 10. Playlists at Spotify: Playlist client-server • Every playlist is a version controlled object • All playlists are synced on login – Fetch all new changes
  • 11. Playlists at Spotify: Playlist client-server • Local queue of playlist modifications – Clients optimistically accept changes - fast UI • Queue flushed to server when possible – Offline changes – Fault tolerant
  • 12. Playlists at Spotify: Playlist version control. Representation of a playlist in the backend (revision: operation, followed by the resulting track list; newest first):
        3,038f...: REM(from=2, len=1)          A, C
        2,19ca...: MOV(from=2, to=1, len=1)    A, C, B
        1,4ed2...: ADD(ix=0, track=A,B,C)      A, B, C
        0,ROOT
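
The revision chain on slide 12 can be replayed to recover the playlist contents. What follows is a minimal Python sketch, not Spotify's implementation: the `apply_op` and `replay` helpers and the tuple encoding of operations are hypothetical, indices are assumed to be 0-based, and `MOV` is assumed to treat `to` as an index into the list after the moved block has been removed (the single example on the slide does not distinguish this from other readings).

```python
def apply_op(tracks, op):
    """Return a new track list with one operation applied (0-based indices)."""
    kind, args = op
    result = list(tracks)
    if kind == "ADD":
        ix = args["ix"]
        result[ix:ix] = args["track"]
    elif kind == "MOV":
        start, length = args["from"], args["len"]
        moved = result[start:start + length]
        del result[start:start + length]
        # Assumption: `to` indexes the list after the moved block was removed.
        to = args["to"]
        result[to:to] = moved
    elif kind == "REM":
        start, length = args["from"], args["len"]
        del result[start:start + length]
    else:
        raise ValueError(f"unknown operation: {kind}")
    return result


def replay(ops):
    """Apply operations oldest-first, starting from the empty ROOT playlist."""
    tracks = []
    for op in ops:
        tracks = apply_op(tracks, op)
    return tracks


# The three changes from the slide, oldest first.
HISTORY = [
    ("ADD", {"ix": 0, "track": ["A", "B", "C"]}),
    ("MOV", {"from": 2, "to": 1, "len": 1}),
    ("REM", {"from": 2, "len": 1}),
]

print(replay(HISTORY))  # ['A', 'C']
```

Replaying only a prefix of `HISTORY` gives the intermediate states shown next to each revision on the slide.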
  • 13. Playlists at Spotify: Playlist branching • Concurrent changes – Offline (diagram labels: A, B)
  • 14. Playlists at Spotify: Playlist branching • Concurrent changes – Offline • Conflict resolution – Operational Transformation • Clients oblivious of branches (diagram labels: A, B, A’, B’, merge)
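
The slides do not show how the transformed operations A’ and B’ are computed. As a rough illustration of the general idea behind Operational Transformation, here is a Python sketch for the simplest case, two concurrent ADD operations on the same base revision. Everything in it is an assumption made for illustration: the `(ix, tracks)` encoding, the `transform_add` and `apply_add` helpers, and the tie-breaking flag are hypothetical and are not taken from Spotify's system.

```python
def transform_add(op, against, op_wins_ties):
    """Rewrite ADD `op` so it can be applied after ADD `against`.

    Both operations are (ix, tracks) pairs made against the same base list.
    `op_wins_ties` decides which insert ends up first when both target the
    same index; the two sides must use opposite values to converge.
    """
    ix, tracks = op
    other_ix, other_tracks = against
    if other_ix < ix or (other_ix == ix and not op_wins_ties):
        ix += len(other_tracks)  # shift past the tracks inserted by `against`
    return (ix, tracks)


def apply_add(playlist, op):
    """Return a new list with the tracks inserted at the given index."""
    ix, tracks = op
    return playlist[:ix] + list(tracks) + playlist[ix:]


base = ["X", "Y"]
a = (1, ["A"])  # branch A: insert track A at index 1
b = (1, ["B"])  # branch B: insert track B at index 1, concurrently

a_prime = transform_add(a, b, op_wins_ties=True)
b_prime = transform_add(b, a, op_wins_ties=False)

# Either order of application should converge on the same playlist.
via_a = apply_add(apply_add(base, a), b_prime)
via_b = apply_add(apply_add(base, b), a_prime)
print(via_a == via_b, via_a)
```

A full implementation would also need transforms for every pairing of ADD, MOV and REM, which this sketch leaves out.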
  • 15. Cassandra data model
  • 16. Cassandra data model: Cassandra at Spotify • Playlist first system to use Cassandra – Now we use it a lot... • Started with Cassandra 0.7 • Using limited set of Cassandra features – No super columns – No CQL
  • 17. Cassandra data model: Planning a data model • Start with the queries! • Three common playlist queries – SYNC: Get all changes since a particular revision – GET: Get the most recent snapshot – APPEND: Add/move/delete tracks
  • 18. Cassandra data model: Playlist data model
        CF playlist_change
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          1,4ed2...   parent=0,ROOT      op=ADD(ix=0, track=A,B,C)
          2,19ca...   parent=1,4ed2...   op=MOV(from=2, to=1, len=1)
          3,038f...   parent=2,19ca      op=REM(from=2, len=1)
  • 19. Cassandra data model: Playlist data model
        CF playlist_change
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          1,4ed2...   parent=0,ROOT      op=ADD(ix=0, track=A,B,C)
          2,19ca...   parent=1,4ed2...   op=MOV(from=2, to=1, len=1)
          3,038f...   parent=2,19ca      op=REM(from=2, len=1)
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          1,8a20...   prnt=0,ROOT      op=...
          2,b783...   prnt=1,8a20...   op=...
          2,dd07...   prnt=1,8a20...   op=...
          3,39ef...   prnt=2,dd07...   op=...
          3,5a9c...   prnt=2,b783...   op=...
          4,03fc...   prnt=2,39ef... prnt=3,5a9c...
  • 20. Cassandra data model: Playlists in Cassandra • Which revision is the latest? – Changes with no children • Multiple heads possible! – Heads may appear anywhere within the row
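
The rule on slide 20, "heads are changes with no children", can be sketched in a few lines of Python. This is an illustration only: the `find_heads` helper and the dictionary layout are hypothetical, and the revision names are the truncated prefixes shown on the slides, not real identifiers.

```python
def find_heads(changes):
    """Return revisions that no other change names as a parent.

    `changes` maps each revision to the list of its parent revisions,
    mirroring one row of playlist_change.
    """
    referenced = {parent for parents in changes.values() for parent in parents}
    return sorted(rev for rev in changes if rev not in referenced)


# The three changes shown in the spotify:user:yarin row on slide 22.
YARIN_ROW = {
    "1,8a20": ["0,ROOT"],
    "2,b783": ["1,8a20"],
    "2,dd07": ["1,8a20"],
}

print(find_heads(YARIN_ROW))  # ['2,b783', '2,dd07']
```

Finding heads this way means reading every change in the row, which is presumably part of the motivation for the separate playlist_head column family introduced on the following slides.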
  • 21. Cassandra data model: Playlist data model
        CF playlist_change
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          1,4ed2...   prnt=0,ROOT      op=...
          2,19ca...   prnt=1,4ed2...   op=...
          3,038f...   prnt=2,19ca      op=...
        CF playlist_head
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          3,038f...
  • 22. Cassandra data model: Playlist data model
        CF playlist_change
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          1,4ed2...   prnt=0,ROOT      op=...
          2,19ca...   prnt=1,4ed2...   op=...
          3,038f...   prnt=2,19ca      op=...
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          1,8a20...   prnt=0,ROOT      op=...
          2,b783...   prnt=1,8a20...   op=...
          2,dd07...   prnt=1,8a20...   op=...
        CF playlist_head
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          3,038f...
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          2,b783...
          2,dd07...
  • 23. Cassandra data model: Playlist data model
        CF playlist_change
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          1,4ed2...   prnt=0,ROOT      op=...
          2,19ca...   prnt=1,4ed2...   op=...
          3,038f...   prnt=2,19ca      op=...
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          1,8a20.   prnt=0,ROOT   op=...
          2,b783.   prnt=1,8a20   op=...
          2,dd07.   prnt=1,8a20   op=...
          3,39ef.   prnt=2,dd07   op=...
          3,5a9c.   prnt=2,b783   op=...
          4,03fc.   prnt=2,39ef prnt=3,5a9c
        CF playlist_head
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          3,038f...
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          4,03fc...
  • 24. Cassandra data model: Playlist heads • playlist_head is a small CF – Fits in RAM • 95% of playlist requests only read from playlist_head – Most playlists are already up-to-date
  • 25. Cassandra data model: Playlist snapshots • playlist_change works well when syncing playlists • Not so well for fetching new playlists – Snapshot cache
  • 26. Cassandra data model: Playlist data model
        CF playlist_change
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          1,4ed2...   prnt=0,ROOT      op=...
          2,19ca...   prnt=1,4ed2...   op=...
          3,038f...   prnt=2,19ca      op=...
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          1,8a20...   prnt=0,ROOT      op=...
          2,b783...   prnt=1,8a20...   op=...
          2,dd07...   prnt=1,8a20...   op=...
        CF playlist_snapshot
        Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
          cache   version=3,038f...   contents=A,C
        Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
          cache   version=2,b783...   contents=...
  • 27. Cassandra data model: Updating playlists • Validate change – Locate snapshot – Client may append to old version • Update all tables – playlist_head last
  • 28. Cassandra data model: Cassandra consistency levels • Replication factor 3 • All writes using CL_QUORUM • Reads from playlist_head – CL_QUORUM • Reads from playlist_change and playlist_snapshot – CL_ONE but may fall back to CL_QUORUM
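
The arithmetic behind these consistency levels is worth spelling out. In Cassandra a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap with the latest acknowledged write only when the number of replicas written plus the number read exceeds the replication factor. The Python sketch below just evaluates that rule for the values on the slide; the helper names are hypothetical.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum in Cassandra."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must share a replica with every write set."""
    return write_replicas + read_replicas > replication_factor


RF = 3
Q = quorum(RF)

print(Q)                               # 2
print(read_overlaps_write(RF, Q, Q))   # True: QUORUM write + QUORUM read
print(read_overlaps_write(RF, Q, 1))   # False: QUORUM write + ONE read
```

With replication factor 3, a CL_ONE read may reach the one replica that has not yet received a quorum write, which is presumably why reads from playlist_change and playlist_snapshot can fall back to CL_QUORUM while playlist_head is always read at CL_QUORUM.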
  • 30. Lessons learned: Optimizations • Leveled compaction – Improved performance a lot • Compression – Not as impressive – CRC checks
  • 31. Lessons learned: Optimizations • Trusted Linux page cache to ensure playlist_head kept in RAM – Didn’t work • Tried Cassandra row cache – NO! • mlock to the rescue
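  • 32. Lessons learned: An enterprise ready solution
        bash# while true; do
                # -l locks the files' pages in RAM via mlock; -m caps the file size considered
                vmtouch -m 10000000000 -l *head* &
                sleep 10m
                # stop the locker so the next iteration picks up the current set of files
                kill %vmtouch
              done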
  • 33. Lessons learned: No moving parts • Flash disks are awesome • Reduced size of cluster from 60 to 30 nodes – Thanks FusionIO! • IOPS no longer the bottleneck
  • 34. Lessons learned: Tombstone hell • Noticed requests to playlist_head took several seconds – Huh? • Every change causes a value to be deleted in playlist_head • playlist_head is essentially a queue – Well-known anti-pattern
  • 35. Lessons learned: Tombstone hell • We had rows with >500,000 tombstones • Solution: major compaction – Relatively fast since playlist_head is in RAM
  • 36. Lessons learned: And more... • Large rows in playlist_change – Modify version graph • Reduce number of requests – Group playlists by owner • Sounds interesting? We’re hiring!