Cassandra vs. Redis
 NoSQL Frankfurt
 2010-09-28
Who?
• Tim Lossen / @tlossen
• Ruby developer
• Berlin
Challenge
Requirements
• backend for facebook game
Requirements
• backend for facebook game
• 1 mio. daily users
• 10 mio. total users
• 100 KB data per user
Requirements
• peak traffic:
 • 10.000 concurrent users
 • 200.000 requests per minute
Requirements
• peak traffic:
 • 10.000 concurrent users
 • 200.000 requests per minute
• write-heavy workload
Sneak Preview
Cassandra
Overview
• written in Java
 • 55.000 lines of code
Overview
• written in Java
 • 55.000 lines of code
• Thrift API
 • clients for Java, Python, Ruby
History
• originally developed by facebook
 • in production for “inbox search”
History
• originally developed by facebook
 • in production for “inbox search”
• later open sourced
 • top-level apache pr...
Features
• high availability
 • no single point of failure
Features
• high availability
 • no single point of failure
• incremental scalability
Features
• high availability
 • no single point of failure
• incremental scalability
• eventual consistency
2006
2007
Architecture
• Dynamo-like ring
 • partitioning + replication
 • all nodes are equal
Architecture
• Dynamo-like ring
 • partitioning + replication
 • all nodes are equal
• Bigtable data model
 • column famil...
Cluster structure
“Cassandra aims to run
on top of an in"astructure
   of hundreds of nodes.”
Redis
Overview
• written in C
 • 13.000 lines of code
Overview
• written in C
 • 13.000 lines of code
• socket API
 • redis-cli
 • client libs for all major languages
Features
• high (write) throughput
 • 50 - 100 K ops / second
Features
• high (write) throughput
 • 50 - 100 K ops / second
• interesting data structures
 • lists, hashes, (sorted) set...
Features
• high (write) throughput
 • 50 - 100 K ops / second
• interesting data structures
 • lists, hashes, (sorted) set...
Architecture
• single instance (not clustered)
 • master-slave replication
Architecture
• single instance (not clustered)
 • master-slave replication
• in-memory database
 • append-only log on disk...
“Memory is the new disk,
  disk is the new tape.”
              ⎯ Jim Gray
Solution
Key Decisions
• keep operations simple
Key Decisions
• keep operations simple
• use as few machines as possible
 • ideally, only one
Architecture
• single Redis master
 • with virtual memory
 • handles all reads / writes
Architecture
• single Redis master
 • with virtual memory
 • handles all reads / writes
• single Redis slave
 • as hot sta...
Throughput
• redis-benchmark
 • 60 K ops / s = 3.6 mio ops / m
Throughput
• redis-benchmark
 • 60 K ops / s = 3.6 mio ops / m
• monitoring (rpm, scout)
 • ca. 10 ops per request
Throughput
• redis-benchmark
 • 60 K ops / s = 3.6 mio ops / m
• monitoring (rpm, scout)
 • ca. 10 ops per request
• 200 K...
Capacity 1
• 100 KB / user (on disk)
• 10.000 concurrent users (peak)
Capacity 1
• 100 KB / user (on disk)
• 10.000 concurrent users (peak)
• 1 GB memory
 • (plus Redis overhead) ✔
Capacity 2
• Redis keeps all keys in memory
• 10 mio. total users
• 20 GB / 100 mio. integer keys
Capacity 2
• Redis keeps all keys in memory
• 10 mio. total users
• 20 GB / 100 mio. integer keys
• 2 GB memory for keys ✔
Data model
• one Redis hash per user
 • key: facebook id
Data model
• one Redis hash per user
 • key: facebook id
• store data as serialized JSON
 • booleans, strings, numbers,
  ...
Advantages
• efficient to swap user data in / out
Advantages
• efficient to swap user data in / out
• turns Redis into “document db”
 • atomic ops on parts
Advantages
• efficient to swap user data in / out
• turns Redis into “document db”
 • atomic ops on parts
• easy to dump / r...
Lessons
“You are not facebook.”
               ⎯ me
Advice
• use the right tool for the job
Advice
• use the right tool for the job
• avoid scaling out / sharding, if
  possible
 • do the numbers!
Advice
• use the right tool for the job
• avoid scaling out / sharding, if
  possible
 • do the numbers!
• keep it simple
Q&A
Links
•cassandra.apache.org
•redis.io


•tim.lossen.de
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Cassandra vs. Redis
Upcoming SlideShare
Loading in...5
×

Cassandra vs. Redis

24,139

Published on

Published in: Technology
4 Comments
45 Likes
Statistics
Notes
No Downloads
Views
Total Views
24,139
On Slideshare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
322
Comments
4
Likes
45
Embeds 0
No embeds

No notes for slide

Cassandra vs. Redis

  1. 1. Cassandra vs. Redis NoSQL Frankfurt 2010-09-28
  2. 2. Who? • Tim Lossen / @tlossen • Ruby developer • Berlin
  3. 3. Challenge
  4. 4. Requirements • backend for facebook game
  5. 5. Requirements • backend for facebook game • 1 mio. daily users • 10 mio. total users • 100 KB data per user
  6. 6. Requirements • peak traffic: • 10.000 concurrent users • 200.000 requests per minute
  7. 7. Requirements • peak traffic: • 10.000 concurrent users • 200.000 requests per minute • write-heavy workload
  8. 8. Sneak Preview
  9. 9. Cassandra
  10. 10. Overview • written in Java • 55.000 lines of code
  11. 11. Overview • written in Java • 55.000 lines of code • Thrift API • clients for Java, Python, Ruby
  12. 12. History • originally developed by facebook • in production for “inbox search”
  13. 13. History • originally developed by facebook • in production for “inbox search” • later open sourced • top-level apache project
  14. 14. Features • high availability • no single point of failure
  15. 15. Features • high availability • no single point of failure • incremental scalability
  16. 16. Features • high availability • no single point of failure • incremental scalability • eventual consistency
  17. 17. 2006
  18. 18. 2007
  19. 19. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal
  20. 20. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal • Bigtable data model • column families
  21. 21. Cluster structure
  22. 22. “Cassandra aims to run on top of an in"astructure of hundreds of nodes.”
  23. 23. Redis
  24. 24. Overview • written in C • 13.000 lines of code
  25. 25. Overview • written in C • 13.000 lines of code • socket API • redis-cli • client libs for all major languages
  26. 26. Features • high (write) throughput • 50 - 100 K ops / second
  27. 27. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations
  28. 28. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations • full consistency
  29. 29. Architecture • single instance (not clustered) • master-slave replication
  30. 30. Architecture • single instance (not clustered) • master-slave replication • in-memory database • append-only log on disk • virtual memory
  31. 31. “Memory is the new disk, disk is the new tape.” ⎯ Jim Gray
  32. 32. Solution
  33. 33. Key Decisions • keep operations simple
  34. 34. Key Decisions • keep operations simple • use as few machines as possible • ideally, only one
  35. 35. Architecture • single Redis master • with virtual memory • handles all reads / writes
  36. 36. Architecture • single Redis master • with virtual memory • handles all reads / writes • single Redis slave • as hot standby (for failover)
  37. 37. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m
  38. 38. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m • monitoring (rpm, scout) • ca. 10 ops per request
  39. 39. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m • monitoring (rpm, scout) • ca. 10 ops per request • 200 K rpm = 2.0 mio ops / m ✔
  40. 40. Capacity 1 • 100 KB / user (on disk) • 10.000 concurrent users (peak)
  41. 41. Capacity 1 • 100 KB / user (on disk) • 10.000 concurrent users (peak) • 1 GB memory • (plus Redis overhead) ✔
  42. 42. Capacity 2 • Redis keeps all keys in memory • 10 mio. total users • 20 GB / 100 mio. integer keys
  43. 43. Capacity 2 • Redis keeps all keys in memory • 10 mio. total users • 20 GB / 100 mio. integer keys • 2 GB memory for keys ✔
  44. 44. Data model • one Redis hash per user • key: facebook id
  45. 45. Data model • one Redis hash per user • key: facebook id • store data as serialized JSON • booleans, strings, numbers, timestamps ...
  46. 46. Advantages • efficient to swap user data in / out
  47. 47. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts
  48. 48. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts • easy to dump / restore user data
  49. 49. Lessons
  50. 50. “You are not facebook.” ⎯ me
  51. 51. Advice • use the right tool for the job
  52. 52. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers!
  53. 53. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers! • keep it simple
  54. 54. Q&A
  55. 55. Links •cassandra.apache.org •redis.io •tim.lossen.de
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×