Cassandra vs. Redis

Usage Rights

CC Attribution License

Cassandra vs. Redis Presentation Transcript

  • 1. Cassandra vs. Redis NoSQL Frankfurt 2010-09-28
  • 2. Who? • Tim Lossen / @tlossen • Ruby developer • Berlin
  • 3. Challenge
  • 4. Requirements • backend for Facebook game
  • 5. Requirements • backend for Facebook game • 1 million daily users • 10 million total users • 100 KB data per user
  • 6. Requirements • peak traffic: • 10,000 concurrent users • 200,000 requests per minute
  • 7. Requirements • peak traffic: • 10,000 concurrent users • 200,000 requests per minute • write-heavy workload
  • 8. Sneak Preview
  • 9. Cassandra
  • 10. Overview • written in Java • 55,000 lines of code
  • 11. Overview • written in Java • 55,000 lines of code • Thrift API • clients for Java, Python, Ruby
  • 12. History • originally developed by Facebook • in production for “inbox search”
  • 13. History • originally developed by Facebook • in production for “inbox search” • later open sourced • top-level Apache project
  • 14. Features • high availability • no single point of failure
  • 15. Features • high availability • no single point of failure • incremental scalability
  • 16. Features • high availability • no single point of failure • incremental scalability • eventual consistency
  • 17. 2006
  • 18. 2007
  • 19. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal
  • 20. Architecture • Dynamo-like ring • partitioning + replication • all nodes are equal • Bigtable data model • column families
  • 21. Cluster structure
  • 22. “Cassandra aims to run on top of an infrastructure of hundreds of nodes.”
  • 23. Redis
  • 24. Overview • written in C • 13,000 lines of code
  • 25. Overview • written in C • 13,000 lines of code • socket API • redis-cli • client libs for all major languages
  • 26. Features • high (write) throughput • 50 - 100 K ops / second
  • 27. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations
  • 28. Features • high (write) throughput • 50 - 100 K ops / second • interesting data structures • lists, hashes, (sorted) sets • atomic operations • full consistency
  • 29. Architecture • single instance (not clustered) • master-slave replication
  • 30. Architecture • single instance (not clustered) • master-slave replication • in-memory database • append-only log on disk • virtual memory
  • 31. “Memory is the new disk, disk is the new tape.” ⎯ Jim Gray
  • 32. Solution
  • 33. Key Decisions • keep operations simple
  • 34. Key Decisions • keep operations simple • use as few machines as possible • ideally, only one
  • 35. Architecture • single Redis master • with virtual memory • handles all reads / writes
  • 36. Architecture • single Redis master • with virtual memory • handles all reads / writes • single Redis slave • as hot standby (for failover)
  • 37. Throughput • redis-benchmark • 60 K ops / s = 3.6 million ops / min
  • 38. Throughput • redis-benchmark • 60 K ops / s = 3.6 million ops / min • monitoring (rpm, scout) • ca. 10 ops per request
  • 39. Throughput • redis-benchmark • 60 K ops / s = 3.6 million ops / min • monitoring (rpm, scout) • ca. 10 ops per request • 200 K rpm = 2.0 million ops / min ✔
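
The throughput check on slides 37 to 39 can be reproduced as a short calculation. This is only a sketch of the arithmetic; the constant names are made up for illustration, and the figures are the ones quoted on the slides (benchmark result, peak request rate, roughly 10 Redis operations per request).

```python
# Figures quoted on the slides; the names are illustrative, not from the talk.
BENCHMARK_OPS_PER_SECOND = 60_000    # redis-benchmark result
PEAK_REQUESTS_PER_MINUTE = 200_000   # peak traffic requirement
OPS_PER_REQUEST = 10                 # "ca. 10 ops per request" from monitoring

# What a single Redis instance can serve per minute, according to the benchmark.
capacity_per_minute = BENCHMARK_OPS_PER_SECOND * 60

# What the game needs per minute at peak.
required_per_minute = PEAK_REQUESTS_PER_MINUTE * OPS_PER_REQUEST

# A value above 1.0 means the benchmarked capacity exceeds the peak demand.
headroom = capacity_per_minute / required_per_minute

print(f"capacity: {capacity_per_minute:,} ops/min")
print(f"required: {required_per_minute:,} ops/min")
print(f"headroom: {headroom:.1f}x")
```

A synthetic benchmark is an upper bound rather than a guarantee, so the headroom factor is best read as a rough sanity check.
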
  • 40. Capacity 1 • 100 KB / user (on disk) • 10,000 concurrent users (peak)
  • 41. Capacity 1 • 100 KB / user (on disk) • 10,000 concurrent users (peak) • 1 GB memory • (plus Redis overhead) ✔
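
The working-set estimate behind "Capacity 1" is a single multiplication. The sketch below assumes decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes), which is what makes the slide's round figure come out exactly; the names are illustrative, and the Redis overhead mentioned on the slide is not modeled.

```python
# Decimal units are an assumption; the slides do not say KB vs. KiB.
KB = 1_000
GB = 1_000 ** 3

DATA_PER_USER_BYTES = 100 * KB     # "100 KB / user"
PEAK_CONCURRENT_USERS = 10_000     # "10,000 concurrent users (peak)"

# Memory needed to hold the data of every user who is active at peak.
working_set_bytes = DATA_PER_USER_BYTES * PEAK_CONCURRENT_USERS
working_set_gb = working_set_bytes / GB

print(f"active working set: {working_set_gb:.1f} GB (before Redis overhead)")
```
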
  • 42. Capacity 2 • Redis keeps all keys in memory • 10 million total users • 20 GB / 100 million integer keys
  • 43. Capacity 2 • Redis keeps all keys in memory • 10 million total users • 20 GB / 100 million integer keys • 2 GB memory for keys ✔
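
"Capacity 2" scales a reference figure (20 GB for 100 million integer keys) down to the total user count. The sketch below assumes one key per user, which matches the data model described later in the talk, and decimal gigabytes; the variable names are illustrative.

```python
GB = 1_000 ** 3  # decimal gigabytes, an assumption

REFERENCE_MEMORY_BYTES = 20 * GB   # "20 GB / 100 million integer keys"
REFERENCE_KEYS = 100_000_000
TOTAL_USERS = 10_000_000           # one key per user (assumed)

# Per-key memory cost implied by the reference figure.
bytes_per_key = REFERENCE_MEMORY_BYTES // REFERENCE_KEYS

# Memory needed to keep every user's key resident, even for inactive users.
key_memory_bytes = bytes_per_key * TOTAL_USERS

print(f"{bytes_per_key} bytes per key")
print(f"{key_memory_bytes / GB:.1f} GB for {TOTAL_USERS:,} keys")
```
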
  • 44. Data model • one Redis hash per user • key: Facebook ID
  • 45. Data model • one Redis hash per user • key: Facebook ID • store data as serialized JSON • booleans, strings, numbers, timestamps ...
  • 46. Advantages • efficient to swap user data in / out
  • 47. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts
  • 48. Advantages • efficient to swap user data in / out • turns Redis into “document db” • atomic ops on parts • easy to dump / restore user data
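
The data model and its advantages can be illustrated with a small sketch. To stay runnable without a server, it uses an in-memory stand-in that mimics the Redis hash commands HSET, HGET and HGETALL; a real client would issue those commands instead. The class and function names, the field names (`coins`, `tutorial_done`) and the user ID are all hypothetical, and storing each hash field as its own JSON value is one plausible reading of the slides rather than the confirmed implementation.

```python
import json


class InMemoryHashStore:
    """Stand-in for Redis hashes; mimics HSET, HGET and HGETALL."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))


def set_field(store, facebook_id, field, value):
    # One hash per user, keyed by Facebook ID; each value is serialized JSON.
    store.hset(str(facebook_id), field, json.dumps(value))


def get_field(store, facebook_id, field):
    raw = store.hget(str(facebook_id), field)
    return None if raw is None else json.loads(raw)


def dump_user(store, facebook_id):
    # The whole user fits in one hash, so a dump is a single HGETALL.
    fields = store.hgetall(str(facebook_id))
    decoded = {name: json.loads(raw) for name, raw in fields.items()}
    return json.dumps(decoded, sort_keys=True)


def restore_user(store, facebook_id, dump):
    for name, value in json.loads(dump).items():
        set_field(store, facebook_id, name, value)


# Usage: update parts of one user, then move that user to another store.
store = InMemoryHashStore()
set_field(store, 1234, "coins", 250)
set_field(store, 1234, "tutorial_done", True)

backup = dump_user(store, 1234)
other = InMemoryHashStore()
restore_user(other, 1234, backup)
```

Because each field is written on its own, changing one part of a user does not require rewriting the whole record, which is the sense in which the hash behaves like a small document.
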
  • 49. Lessons
  • 50. “You are not facebook.” ⎯ me
  • 51. Advice • use the right tool for the job
  • 52. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers!
  • 53. Advice • use the right tool for the job • avoid scaling out / sharding, if possible • do the numbers! • keep it simple
  • 54. Q&A
  • 55. Links • cassandra.apache.org • redis.io • tim.lossen.de