Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Cassandra vs. Redis

34,462 views

Published on

Published in: Technology

Cassandra vs. Redis

  1. Cassandra vs. Redis NoSQL Frankfurt 2010-09-28
  2. Who? • Tim Lossen / @tlossen • Ruby developer • Berlin
  3. Challenge
  4. Requirements • backend for facebook game
  5. Requirements • backend for facebook game • 1 mio. daily users • 10 mio. total users • 100 KB data per user
  6. Requirements • peak traffic: • 10.000 concurrent users • 200.000 requests per minute
  7. Requirements • peak traffic: • 10.000 concurrent users • 200.000 requests per minute • write-heavy workload
  8. Sneak Preview
  9. Cassandra
  10. Overview • written in Java • 55.000 lines of code
  11. Overview • written in Java • 55.000 lines of code • Thrift API • clients for Java, Python, Ruby
  12. History • originally developed by facebook • in production for “inbox search”
  13. History • originally developed by facebook • in production for “inbox search” • later open sourced • top-level apache project
  14. Features • high availability • no single point of failure
  15. Features • high availability • no single point of failure • incremental scalability
  16. Features • high availability • no single point of failure • incremental scalability • eventual consistency
  17. 2006
  18. 2007
  19. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal
  20. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal • Bigtable data model • column families
  21. Cluster structure
  22. “Cassandra aims to run on top of an in"astructure of hundreds of nodes.”
  23. Redis
  24. Overview • written in C • 13.000 lines of code
  25. Overview • written in C • 13.000 lines of code • socket API • redis-cli • client libs for all major languages
  26. Features • high (write) throughput • 50 - 100 K ops / second
  27. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations
  28. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations • full consistency
  29. Architecture • single instance (not clustered) • master-slave replication
  30. Architecture • single instance (not clustered) • master-slave replication • in-memory database • append-only log on disk • virtual memory
  31. “Memory is the new disk, disk is the new tape.” ⎯ Jim Gray
  32. Solution
  33. Key Decisions • keep operations simple
  34. Key Decisions • keep operations simple • use as few machines as possible • ideally, only one
  35. Architecture • single Redis master • with virtual memory • handles all reads / writes
  36. Architecture • single Redis master • with virtual memory • handles all reads / writes • single Redis slave • as hot standby (for failover)
  37. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m
  38. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m • monitoring (rpm, scout) • ca. 10 ops per request
  39. Throughput • redis-benchmark • 60 K ops / s = 3.6 mio ops / m • monitoring (rpm, scout) • ca. 10 ops per request • 200 K rpm = 2.0 mio ops / m ✔
  40. Capacity 1 • 100 KB / user (on disk) • 10.000 concurrent users (peak)
  41. Capacity 1 • 100 KB / user (on disk) • 10.000 concurrent users (peak) • 1 GB memory • (plus Redis overhead) ✔
  42. Capacity 2 • Redis keeps all keys in memory • 10 mio. total users • 20 GB / 100 mio. integer keys
  43. Capacity 2 • Redis keeps all keys in memory • 10 mio. total users • 20 GB / 100 mio. integer keys • 2 GB memory for keys ✔
  44. Data model • one Redis hash per user • key: facebook id
  45. Data model • one Redis hash per user • key: facebook id • store data as serialized JSON • booleans, strings, numbers, timestamps ...
  46. Advantages • efficient to swap user data in / out
  47. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts
  48. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts • easy to dump / restore user data
  49. Lessons
  50. “You are not facebook.” ⎯ me
  51. Advice • use the right tool for the job
  52. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers!
  53. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers! • keep it simple
  54. Q&A
  55. Links •cassandra.apache.org •redis.io •tim.lossen.de

×