Key-Value-Stores:
The Key to Scaling?
     Tim Lossen
Who?
• @tlossen
• backend developer
   -   Ruby, Rails, Sinatra ...
• passionate about technology
Problem
Challenge
• backend for facebook game
• expected load:
   -   1 mio. daily active users
   -   20 mio. total users
   -   100 KB data per user
Challenge
• expected peak traffic:
   -   10.000 concurrent users
   -   200.000 requests / minute
• write-heavy workload
Wanted
• scalable database
• with high throughput
   -   especially for writes
Options
• relational database
   -   with sharding
• nosql database
   -   key-value-store
   -   document db
   -   graph db
Shortlist
• Cassandra
• Redis
• Membase
Cassandra
Facts
• written in Java
   -   55.000 lines of code
• Thrift API
   -   clients for Java, Ruby, Python ...
History
• originally developed by Facebook
   -   in production for “Inbox Search”
• later open-sourced
   -   top-level Apache project
Features
• high availability
   -   no single point of failure
• incremental scalability
• eventual consistency
Hash Ring
Architecture
• Dynamo-like hash ring
   -   partitioning + replication
   -   all nodes are equal
• Bigtable data model
   -   column families
   -   supercolumns
“Cassandra aims to run on an
infrastructure of hundreds of nodes.”
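The Dynamo-style ring can be sketched in a few lines. This is an illustrative Python toy, not Cassandra's actual implementation; the node names and the replica count of 3 are made-up assumptions:

```python
import bisect
import hashlib

def token(s):
    # place a key or node on the ring via a stable hash
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    """Toy Dynamo-style ring: a key belongs to the next node
    clockwise, replicas go to the following nodes."""
    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self.ring = sorted((token(n), n) for n in nodes)

    def nodes_for(self, key):
        tokens = [t for t, _ in self.ring]
        start = bisect.bisect(tokens, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replicas)]

ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
ring.nodes_for("user:42")   # three distinct replica nodes for this key
```

Because every node owns a slice of the ring, adding a node only moves the keys between it and its neighbor, which is what makes the scaling incremental.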
Redis
Facts
• written in C
   -   13.000 lines of code
• socket API
   -   redis-cli
   -   client libs for all major languages
Features
• high read & write throughput
   -   50.000 to 100.000 ops / second
• interesting data structures
   -   lists, hashes, (sorted) sets
   -   atomic operations
• strong consistency
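The sorted sets (ZADD / ZREVRANGE) are what make things like leaderboards cheap. A plain-Python toy showing the same semantics — this stands in for Redis, it is not Redis itself:

```python
class ToySortedSet:
    """Plain-Python stand-in for a Redis sorted set (ZADD / ZREVRANGE)."""
    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        self.scores[member] = score          # re-adding updates the score

    def zrevrange(self, start, stop):
        # highest score first; Redis ranges are inclusive on both ends
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        return ranked[start:stop + 1]

board = ToySortedSet()
board.zadd("alice", 1200)
board.zadd("bob", 900)
board.zadd("carol", 1500)
board.zrevrange(0, 1)   # the two highest-scoring members
```

In real Redis each of these operations is atomic on the server, so concurrent game requests never see a half-updated leaderboard.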
Architecture
• in-memory database
   -   append-only log on disk
   -   virtual memory
• single instance
   -   master-slave replication
   -   clustering is on roadmap
“Memory is the new disk,
  disk is the new tape.”
              — Jim Gray
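The append-only-log idea can be sketched like this — a toy store, not Redis; the log format and flush policy are simplified assumptions:

```python
import json
import os
import tempfile

class TinyStore:
    """Toy in-memory store with an append-only log on disk:
    every write is appended, and the log is replayed on restart."""
    def __init__(self, path):
        self.path, self.data = path, {}
        if os.path.exists(path):
            with open(path) as f:
                for line in f:                  # replay log -> rebuild state
                    op = json.loads(line)
                    self.data[op["k"]] = op["v"]
        self.log = open(path, "a")

    def set(self, key, value):
        self.log.write(json.dumps({"k": key, "v": value}) + "\n")
        self.log.flush()    # real Redis lets you tune how often to fsync
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

path = os.path.join(tempfile.mkdtemp(), "store.aof")
db = TinyStore(path)
db.set("user:1", {"coins": 10})
db.log.close()
TinyStore(path).get("user:1")   # state survives a "restart"
```

Reads never touch the disk; the log exists only so the in-memory state can be rebuilt, which is exactly the "memory is the new disk" trade-off.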
Membase
Facts
• written in C and Erlang
• API-compatible to Memcached
   -   same protocol
• client libs for all major languages
History
• developed by NorthScale & Zynga
   -   used in production (Farmville)
• released in June 2010
   -   Apache 2.0 License
Features
• “Memcached with persistence”
   -   extremely fast
   -   throughput scales linearly
• automatic data placement
   -   memory, ssd, disk
• configurable replica count
Architecture
• cluster
   -   all nodes are alike
   -   one elected as “coordinator”
• each node is master for part of the key space
   -   handles all reads & writes
Mapping Scheme
“simple, fast, elastic”
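The mapping scheme can be sketched as a vBucket table. Illustrative Python only — the bucket count, hash function and server names here are assumptions (real clusters use many more buckets, e.g. 1024):

```python
import zlib

NUM_VBUCKETS = 64   # kept small for the demo; real clusters use more

# vbucket -> master server table, maintained by the elected coordinator
# and pushed to all clients (server names are invented)
vbucket_map = {v: "server-%d" % (v % 3) for v in range(NUM_VBUCKETS)}

def vbucket_for(key):
    # clients hash the key to a vbucket and route straight to its
    # master -- no proxy hop in between
    return zlib.crc32(key.encode()) % NUM_VBUCKETS

def master_for(key):
    return vbucket_map[vbucket_for(key)]

master_for("user:42")   # the same key always routes to the same master
```

Rebalancing then means reassigning vBuckets in the table, not rehashing every key, which is what makes the cluster "elastic".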
Solution
Which one
would you pick?
Decision
• Cassandra ?
   -   too big, too complicated
• Membase ?
   -   not yet available (then)
• Redis !
Motivation
• keep operations simple
• use as few machines as possible
   -   ideally, only one
Design
• two machines (+ load balancer)
   -   Redis master handles all reads / writes
   -   Redis slave as hot standby
   -   both machines used as app servers
• dedicated hardware
Data model
• one Redis hash per user
   -   key: facebook id
• store data as serialized JSON
   -   booleans, strings, numbers,
       timestamps ...
Advantages
• turns Redis into “document db”
   -   efficient to swap user data in / out
   -   atomic ops on parts
• easy to dump / restore user data
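The one-hash-per-user model maps directly onto the Redis hash commands (HSET / HGET). A sketch with a plain dict standing in for the live hash — the field names are invented for illustration:

```python
import json
import time

# a plain dict stands in for the Redis hash at key "user:<facebook id>";
# with a live server these would be HSET / HGET calls
user_hash = {}

def save_field(field, value):
    # each part of the user document is one hash field, serialized as
    # JSON, so parts can be read or replaced without loading everything
    user_hash[field] = json.dumps(value)

def load_field(field):
    raw = user_hash.get(field)
    return None if raw is None else json.loads(raw)

save_field("profile", {"name": "Alice", "level": 7})   # invented fields
save_field("last_seen", int(time.time()))
```

Because each field is a separate hash entry, Redis can update one part atomically, and dumping a user is just reading one hash.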
Capacity
• 4 GB memory for 20 mio. integer keys
   -   keys always stay in memory!
• 2 GB memory for 10.000 user hashes
   -   others can be swapped out
• 3.6 mio. ops / minute
   -   sufficient for 200.000 requests / minute
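The capacity figures above hold up as back-of-envelope arithmetic; the per-key and per-hash overheads used here are rough assumptions, not measured values:

```python
# back-of-envelope check of the slide numbers
keys = 20_000_000                 # 20 mio. total users, one key each
bytes_per_key = 200               # assumed Redis per-key overhead
key_memory_gb = keys * bytes_per_key / 2**30              # ~3.7 GB

active_hashes = 10_000            # concurrent users kept in memory
bytes_per_hash = 200 * 1024       # ~100 KB data plus overhead (assumed)
hash_memory_gb = active_hashes * bytes_per_hash / 2**30   # ~1.9 GB

ops_per_min = 60_000 * 60         # ~60.000 ops/sec sustained -> 3.6 mio./min
requests_per_min = 200_000
ops_per_request = ops_per_min // requests_per_min         # 18 ops of headroom
```

Roughly 18 Redis operations of budget per incoming request is comfortable headroom for a write-heavy game backend.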
Status
• game was launched in August
   -   currently still in beta
• expect to reach 1 mio. daily active users in Q1/2011
• will try to stick to 2 or 3 machines
   -   possibly bigger / faster ones
Conclusions
• use the right tool for the job
• keep it simple
   -   avoid sharding, if possible
• don’t scale out too early
   -   but have a viable “plan b”
• use dedicated hardware
Q&A
Links
• cassandra.apache.org
• redis.io
• membase.org
• tim.lossen.de