Key-Value-Stores:
The Key to Scaling?
     Tim Lossen
Who?
• @tlossen
• backend developer
   -   Ruby, Rails, Sinatra ...
• passionate about technology
Problem
Challenge
• backend for facebook game
• expected load:
   -   1 mio. daily active users
   -   20 mio. total users
   -   100 KB data per user
Challenge
• expected peak traffic:
   -   10.000 concurrent users
   -   200.000 requests / minute
• write-heavy workload
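To put the expected load into more familiar units, the figures from the Challenge slides can be converted as below. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^9 KB) and that peak traffic is spread evenly over the minute, neither of which the slides state.

```python
# Figures taken from the Challenge slides.
TOTAL_USERS = 20_000_000            # 20 mio. total users
DATA_PER_USER_KB = 100              # 100 KB data per user
PEAK_REQUESTS_PER_MINUTE = 200_000  # 200.000 requests / minute
PEAK_CONCURRENT_USERS = 10_000      # 10.000 concurrent users

# Assumption: requests are spread evenly over the minute.
requests_per_second = PEAK_REQUESTS_PER_MINUTE / 60
requests_per_user_per_minute = PEAK_REQUESTS_PER_MINUTE / PEAK_CONCURRENT_USERS

# Assumption: decimal units, 1 TB = 1_000_000_000 KB.
total_data_tb = TOTAL_USERS * DATA_PER_USER_KB / 1_000_000_000

print(f"{requests_per_second:.0f} requests / second")
print(f"{requests_per_user_per_minute:.0f} requests / user / minute")
print(f"{total_data_tb:.0f} TB total user data")
```

Under these assumptions the peak is roughly 3,300 requests per second, each concurrent user sends about one request every three seconds, and the full data set is about 2 TB.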
Wanted
• scalable database
• with high throughput
   -   especially for writes
Options
• relational database
   -   with sharding
• nosql database
   -   key-value-store
   -   document db
   -   graph db
Shortlist
• Cassandra
• Redis
• Membase
Cassandra
Facts
• written in Java
   -   55.000 lines of code
• Thrift API
   -   clients for Java, Ruby, Python ...
History
• originally developed by Facebook
   -   in production for “Inbox Search”
• later open-sourced
   -   top-level Apache project
Features
• high availability
   -   no single point of failure
• incremental scalability
• eventual consistency
Architecture
• Dynamo-like hash ring
   -   partitioning + replication
   -   all nodes are equal
Hash Ring
Architecture
• Dynamo-like hash ring
   -   partitioning + replication
   -   all nodes are equal
• Bigtable data model
   -   column families
   -   supercolumns
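The idea of a Dynamo-like hash ring (partitioning plus replication, with all nodes equal) can be illustrated with a small toy. This is a generic consistent-hashing sketch, not Cassandra's actual partitioner: the `HashRing` class, the MD5 token function, the node names, and the `replicas` parameter are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy Dynamo-style ring: each node owns the arc that ends at its token."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        # Place every node on the ring at the hash of its name.
        self.ring = sorted((self._hash(name), name) for name in nodes)
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def nodes_for(self, key):
        """Return the primary node plus the next nodes clockwise (replicas)."""
        # First token >= hash(key); wrap around to index 0 past the last token.
        start = bisect.bisect_left(self.tokens, self._hash(key)) % len(self.ring)
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


ring = HashRing(["node-a", "node-b", "node-c"], replicas=2)
print(ring.nodes_for("user:42"))  # two distinct nodes: primary and one replica
```

Because every node runs the same lookup, no node plays a special role, which is what "all nodes are equal" amounts to in this sketch.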
“Cassandra aims to run
  on an infrastructure
 of hundreds of nodes.”
Redis
Facts
• written in C
   -   13.000 lines of code
• socket API
   -   redis-cli
   -   client libs for all major languages
Features
• high read & write throughput
   -   50.000 to 100.000 ops / second
• interesting data structures
   -   lists, hashes, (sorted) sets
   -   atomic operations
• strong consistency
Architecture
• in-memory database
   -   append-only log on disk
   -   virtual memory
• single instance
   -   master-slave replication
   -   clustering is on roadmap
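The combination "in-memory database" plus "append-only log on disk" can be sketched in a few lines. This is a toy to show the principle only: the `TinyStore` class and its JSON-lines log format are assumptions for illustration and do not reflect how Redis actually writes or replays its log.

```python
import json
import os
import tempfile


class TinyStore:
    """In-memory dict whose writes are also appended to a log file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        # Rebuild the in-memory state by replaying the log, if one exists.
        if os.path.exists(log_path):
            with open(log_path) as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value

    def set(self, key, value):
        # Append to the log first, then apply the write in memory.
        with open(self.log_path, "a") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value

    def get(self, key):
        # Reads never touch the disk.
        return self.data.get(key)


log_path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
store = TinyStore(log_path)
store.set("score", 10)
store.set("score", 25)

restarted = TinyStore(log_path)  # simulates a restart: state comes from the log
print(restarted.get("score"))    # 25
```

All reads are served from memory, and the disk is only written sequentially, which is one way to read the Jim Gray quote that follows.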
“Memory is the new disk,
  disk is the new tape.”
              — Jim Gray
Membase
Facts
• written in C and Erlang
• API-compatible to Memcached
   -   same protocol
• client libs for all major languages
History
• developed by NorthScale & Zynga
   -   used in production (Farmville)
• released in June 2010
   -   Apache 2.0 License
Features
• “Memcached with persistence”
   -   extremely fast
   -   throughput scales linearly
• automatic data placement
   -   memory, ssd, disk
• configurable replica count
Architecture
• cluster
   -   all nodes are alike
   -   one elected as “coordinator”
• each node is master for part of key space
   -   handles all reads & writes
Mapping Scheme
“simple, fast, elastic”
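The mapping slide itself did not survive as text, so the following is only a guess at the general shape of such a scheme: a two-level lookup in which keys hash to a fixed number of buckets and a table assigns each bucket to a master node. The bucket count, the CRC32 hash, and the node names are illustrative assumptions, not Membase defaults.

```python
import zlib

NUM_BUCKETS = 16                                   # illustrative value only
NODES = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical node names

# Bucket -> master node. Rebalancing edits this table; keys are never rehashed.
bucket_map = [NODES[b % len(NODES)] for b in range(NUM_BUCKETS)]


def bucket_for(key):
    """Level 1: a key always hashes to the same bucket."""
    return zlib.crc32(key.encode()) % NUM_BUCKETS


def node_for(key):
    """Level 2: the bucket table decides which node is master for the key."""
    return bucket_map[bucket_for(key)]


print(bucket_for("user:42"), node_for("user:42"))
```

In a layout like this, adding a node only requires reassigning some buckets in the table, which would fit the "elastic" part of the quote.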
Solution
Which one
would you pick?
Decision
• Cassandra ?
   -   too big, too complicated
• Membase ?
   -   not yet available (then)
• Redis !
Motivation
• keep operations simple
• use as few machines as possible
   -   ideally, only one
Design
• two machines (+ load balancer)
   -   Redis master handles all reads / writes
   -   Redis slave as hot standby
   -   both machines used as app servers
• dedicated hardware
Data model
• one Redis hash per user
   -   key: facebook id
• store data as serialized JSON
   -   booleans, strings, numbers, timestamps ...
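A minimal sketch of this data model, assuming each hash field holds one JSON-serialized value. To keep it runnable without a server, `FakeRedis` is an in-memory stand-in that borrows the names of the Redis hash commands HSET, HGET, and HGETALL; the helper functions, field names, and the example id are hypothetical.

```python
import json


class FakeRedis:
    """In-memory stand-in mimicking a few Redis hash commands."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))


def save_field(db, facebook_id, field, value):
    # One hash per user, keyed by facebook id; values stored as JSON text.
    db.hset(facebook_id, field, json.dumps(value))


def load_field(db, facebook_id, field):
    raw = db.hget(facebook_id, field)
    return None if raw is None else json.loads(raw)


def dump_user(db, facebook_id):
    # Read the whole user "document" in one call.
    return {f: json.loads(v) for f, v in db.hgetall(facebook_id).items()}


db = FakeRedis()
save_field(db, 1234567, "name", "alice")
save_field(db, 1234567, "coins", 150)
save_field(db, 1234567, "tutorial_done", True)
print(dump_user(db, 1234567))
```

Updating a single field touches only that part of the user's data, while `dump_user` reads the whole record at once.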
Advantages
• turns Redis into “document db”
   -   efficient to swap user data in / out
   -   atomic ops on parts
• easy to dump / restore user data
Capacity
• 4 GB memory for 20 mio. integer keys
   -   keys always stay in memory!
• 2 GB memory for 10.000 user hashes
   -   others can be swapped out
• 3.6 mio. ops / minute
   -   sufficient for 200.000 requests
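The capacity figures in the deck (4 GB for 20 mio. keys, 2 GB for 10.000 user hashes, 3.6 mio. ops / minute for 200.000 requests) can be broken down per item as follows. The sketch assumes decimal units (1 GB = 10^9 bytes) and reads "200.000 requests" as requests per minute, matching the peak-traffic slide; the derived per-item values are rough estimates, not measurements.

```python
GB = 1_000_000_000  # assumption: decimal gigabytes

keys_memory_bytes = 4 * GB       # 4 GB memory for 20 mio. integer keys
total_keys = 20_000_000
hashes_memory_bytes = 2 * GB     # 2 GB memory for 10.000 user hashes
resident_hashes = 10_000
ops_per_minute = 3_600_000       # 3.6 mio. ops / minute
requests_per_minute = 200_000    # assumed to be per minute

bytes_per_key = keys_memory_bytes / total_keys
kb_per_hash = hashes_memory_bytes / resident_hashes / 1000
ops_per_second = ops_per_minute / 60
ops_per_request = ops_per_minute / requests_per_minute

print(f"{bytes_per_key:.0f} bytes per key")
print(f"{kb_per_hash:.0f} KB per resident user hash")
print(f"{ops_per_second:.0f} ops / second")
print(f"{ops_per_request:.0f} ops available per request")
```

Under these assumptions the budget is about 200 bytes per key and 200 KB per resident user hash, and 60,000 ops per second (inside the 50.000 to 100.000 range quoted for Redis) leaves about 18 Redis operations for each request.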
Status
• game was launched in August
   -   currently still in beta
• expect to reach 1 mio. daily active users
  in Q1/2011
• will try to stick to 2 or 3 machines
   -   possibly bigger / faster ones
Conclusions
• use the right tool for the job
• keep it simple
   -   avoid sharding, if possible
• don’t scale out too early
   -   but have a viable “plan b”
• use dedicated hardware
Q&A
Links
• cassandra.apache.org
• redis.io
• membase.org



• tim.lossen.de
Key-Value-Stores -- The Key to Scaling?

  1. Key-Value-Stores: The Key to Scaling? Tim Lossen
  2. Who? • @tlossen • backend developer - Ruby, Rails, Sinatra ... • passionate about technology
  3. Problem
  4. Challenge • backend for facebook game
  5. Challenge • backend for facebook game • expected load: - 1 mio. daily active users - 20 mio. total users - 100 KB data per user
  6. Challenge • expected peak traffic: - 10.000 concurrent users - 200.000 requests / minute
  7. Challenge • expected peak traffic: - 10.000 concurrent users - 200.000 requests / minute • write-heavy workload
  8. Wanted • scalable database • with high throughput - especially for writes
  9. Options • relational database - with sharding
  10. Options • relational database - with sharding • nosql database - key-value-store - document db - graph db
  11. Options • relational database - with sharding • nosql database - key-value-store - document db - graph db
  12. Options • relational database - with sharding • nosql database - key-value-store - document db - graph db
  13. Options • relational database - with sharding • nosql database - key-value-store - document db - graph db
  14. Shortlist • Cassandra • Redis • Membase
  15. Cassandra
  16. Facts • written in Java - 55.000 lines of code • Thrift API - clients for Java, Ruby, Python ...
  17. History • originally developed by Facebook - in production for “Inbox Search” • later open-sourced - top-level Apache project
  18. Features • high availability - no single point of failure • incremental scalability • eventual consistency
  19. Architecture • Dynamo-like hash ring - partitioning + replication - all nodes are equal
  20. Hash Ring
  21. Architecture • Dynamo-like hash ring - partitioning + replication - all nodes are equal • Bigtable data model - column families - supercolumns
  22. “Cassandra aims to run on an infrastructure of hundreds of nodes.”
  23. Redis
  24. Facts • written in C - 13.000 lines of code • socket API - redis-cli - client libs for all major languages
  25. Features • high read & write throughput - 50.000 to 100.000 ops / second
  26. Features • high read & write throughput - 50.000 to 100.000 ops / second • interesting data structures - lists, hashes, (sorted) sets - atomic operations
  27. Features • high read & write throughput - 50.000 to 100.000 ops / second • interesting data structures - lists, hashes, (sorted) sets - atomic operations • strong consistency
  28. Architecture • in-memory database - append-only log on disk - virtual memory
  29. Architecture • in-memory database - append-only log on disk - virtual memory • single instance - master-slave replication - clustering is on roadmap
  30. “Memory is the new disk, disk is the new tape.” — Jim Gray
  31. Membase
  32. Facts • written in C and Erlang • API-compatible to Memcached - same protocol • client libs for all major languages
  33. History • developed by NorthScale & Zynga - used in production (Farmville) • released in June 2010 - Apache 2.0 License
  34. Features • “Memcached with persistence” - extremely fast - throughput scales linearly
  35. Features • “Memcached with persistence” - extremely fast - throughput scales linearly • automatic data placement - memory, ssd, disk
  36. Features • “Memcached with persistence” - extremely fast - throughput scales linearly • automatic data placement - memory, ssd, disk • configurable replica count
  37. Architecture • cluster - all nodes are alike - one elected as “coordinator”
  38. Architecture • cluster - all nodes are alike - one elected as “coordinator” • each node is master for part of key space - handles all reads & writes
  39. Mapping Scheme
  40. “simple, fast, elastic”
  41. Solution
  42. Which one would you pick?
  43. Decision • Cassandra ?
  44. Decision • Cassandra ? - too big, too complicated
  45. Decision • Cassandra ? - too big, too complicated • Membase ?
  46. Decision • Cassandra ? - too big, too complicated • Membase ? - not yet available (then)
  47. Decision • Cassandra ? - too big, too complicated • Membase ? - not yet available (then) • Redis !
  48. Motivation • keep operations simple • use as few machines as possible - ideally, only one
  49. Design • two machines (+ load balancer) - Redis master handles all reads / writes - Redis slave as hot standby
  50. Design • two machines (+ load balancer) - Redis master handles all reads / writes - Redis slave as hot standby - both machines used as app servers
  51. Design • two machines (+ load balancer) - Redis master handles all reads / writes - Redis slave as hot standby - both machines used as app servers • dedicated hardware
  52. Data model • one Redis hash per user - key: facebook id • store data as serialized JSON - booleans, strings, numbers, timestamps ...
  53. Advantages • turns Redis into “document db” - efficient to swap user data in / out - atomic ops on parts • easy to dump / restore user data
  54. Capacity • 4 GB memory for 20 mio. integer keys - keys always stay in memory!
  55. Capacity • 4 GB memory for 20 mio. integer keys - keys always stay in memory! • 2 GB memory for 10.000 user hashes - others can be swapped out
  56. Capacity • 4 GB memory for 20 mio. integer keys - keys always stay in memory! • 2 GB memory for 10.000 user hashes - others can be swapped out • 3.6 mio. ops / minute - sufficient for 200.000 requests
  57. Status • game was launched in August - currently still in beta
  58. Status • game was launched in August - currently still in beta • expect to reach 1 mio. daily active users in Q1/2011
  59. Status • game was launched in August - currently still in beta • expect to reach 1 mio. daily active users in Q1/2011 • will try to stick to 2 or 3 machines - possibly bigger / faster ones
  60. Conclusions • use the right tool for the job
  61. Conclusions • use the right tool for the job • keep it simple - avoid sharding, if possible
  62. Conclusions • use the right tool for the job • keep it simple - avoid sharding, if possible • don’t scale out too early - but have a viable “plan b”
  63. Conclusions • use the right tool for the job • keep it simple - avoid sharding, if possible • don’t scale out too early - but have a viable “plan b” • use dedicated hardware
  64. Q&A
  65. Links • cassandra.apache.org • redis.io • membase.org • tim.lossen.de