Cassandra vs. Redis

29,559 views
28,035 views

Published on

Published in: Technology
4 Comments
46 Likes
Statistics
Notes
No Downloads
Views
Total views
29,559
On SlideShare
0
From Embeds
0
Number of Embeds
1,608
Actions
Shares
0
Downloads
346
Comments
4
Likes
46
Embeds 0
No embeds

No notes for slide

Cassandra vs. Redis

  1. Cassandra vs. Redis NoSQL Frankfurt 2010-09-28
  2. Who? • Tim Lossen / @tlossen • Ruby developer • Berlin
  3. Challenge
  4. Requirements • backend for facebook game
  5. Requirements • backend for facebook game • 1 mio. daily users • 10 mio. total users • 100 KB data per user
  6. Requirements • peak traffic: • 10.000 concurrent users • 200.000 requests per minute
  7. Requirements • peak traffic: • 10.000 concurrent users • 200.000 requests per minute • write-heavy workload
  8. Sneak Preview
  9. Cassandra
  10. Overview • written in Java • 55.000 lines of code
  11. Overview • written in Java • 55.000 lines of code • Thrift API • clients for Java, Python, Ruby
  12. History • originally developed by facebook • in production for “inbox search”
  13. History • originally developed by facebook • in production for “inbox search” • later open sourced • top-level apache project
  14. Features • high availability • no single point of failure
  15. Features • high availability • no single point of failure • incremental scalability
  16. Features • high availability • no single point of failure • incremental scalability • eventual consistency
  17. 2006
  18. 2007
  19. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal
  20. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal • Bigtable data model • column families
  21. Cluster structure
  22. “Cassandra aims to run on top of an in"astructure of hundreds of nodes.”
  23. Redis
  24. Overview • written in C • 13.000 lines of code
  25. Overview • written in C • 13.000 lines of code • socket API • redis-cli • client libs for all major languages
  26. Features • high (write) throughput • 50 - 100 K ops / second
  27. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations
  28. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations • full consistency
  29. Architecture • single instance (not clustered) • master-slave replication
  30. Architecture • single instance (not clustered) • master-slave replication • in-memory database • append-only log on disk • virtual memory
  31. “Memory is the new disk, disk is the new tape.” ⎯ Jim Gray
  32. Solution
  33. Key Decisions • keep operations simple
  34. Key Decisions • keep operations simple • use as few machines as possible • ideally, only one
  35. Architecture • single Redis master • with virtual memory • handles all reads / writes
  36. Architecture • single Redis master • with virtual memory • handles all reads / writes • single Redis slave • as hot standby (for failover)
  37. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m
  38. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m • monitoring (rpm, scout) • ca. 10 ops per request
  39. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m • monitoring (rpm, scout) • ca. 10 ops per request • 200 K rpm = 2.0 mio ops / m ✔
  40. Capacity 1 • 100 KB / user (on disk) • 10.000 concurrent users (peak)
  41. Capacity 1 • 100 KB / user (on disk) • 10.000 concurrent users (peak) • 1 GB memory • (plus Redis overhead) ✔
  42. Capacity 2 • Redis keeps all keys in memory • 10 mio. total users • 20 GB / 100 mio. integer keys
  43. Capacity 2 • Redis keeps all keys in memory • 10 mio. total users • 20 GB / 100 mio. integer keys • 2 GB memory for keys ✔
  44. Data model • one Redis hash per user • key: facebook id
  45. Data model • one Redis hash per user • key: facebook id • store data as serialized JSON • booleans, strings, numbers, timestamps ...
  46. Advantages • efficient to swap user data in / out
  47. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts
  48. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts • easy to dump / restore user data
  49. Lessons
  50. “You are not facebook.” ⎯ me
  51. Advice • use the right tool for the job
  52. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers!
  53. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers! • keep it simple
  54. Q&A
  55. Links •cassandra.apache.org •redis.io •tim.lossen.de

×