StampedeCon 2014: Cassandra in the Real World

Three use cases of Apache Cassandra in real-world implementations and the best practices distilled from such.



Transcript

  • 1. STAMPEDECON 2014 CASSANDRA IN THE REAL WORLD Nate McCall @zznate ! Co-Founder & Sr. Technical Consultant ! Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 2. About The Last Pickle. ! Work with clients to deliver and improve Apache Cassandra-based solutions. ! Based in New Zealand & USA.
  • 3. “…in the Real World?” ! Lots of hype, stats get attention, as do big names
  • 4. “Real World?” ! “…1.1 million client writes per second. Data was automatically replicated across all three zones making a total of 3.3 million writes per second across the cluster.” http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
  • 5. “Real World?” ! “+10 clusters, +100s nodes, 250TB provisioned, 9 billion writes/day, 5 billion reads/day” http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-cassandra-summit-2013
  • 6. “Real World?” ! … • “but I don’t have an ∞ AMZN budget” • “maybe one day I’ll have that much data”
  • 7. “Real World!” ! Most folks needed: real fault tolerance, scale out characteristics
  • 8. “Real World!” ! Most folks have: 3 to 12 nodes with 2-15TB, commodity hardware, small teams
  • 9. Cassandra in the Real World. ! Cassandra at 10k feet Case Studies Common Best Practices
  • 10. Cassandra Architecture (briefly). API's Cluster Aware Cluster Unaware Clients Disk
  • 11. Cassandra Cluster Architecture (briefly). API's Cluster Aware Cluster Unaware Clients Disk API's Cluster Aware Cluster Unaware Disk Node 1 Node 2
  • 12. Dynamo Cluster Architecture (briefly). API's Dynamo Database Clients Disk API's Dynamo Database Disk Node 1 Node 2
  • 13. Cassandra Architecture (briefly). ! API Dynamo Database
  • 14. API Transports. ! Thrift Native Binary
  • 15. Thrift transport. ! Extremely performant for specific workloads Astyanax, disruptor-based HSHA in 2.0
  • 16. API Transports. ! Thrift Native Binary
  • 17. Native Binary Transport. ! Focus of future development Uses Netty, CQL 3 only, asynchronous
  • 18. API Services. ! JMX Thrift CQL 3 !
  • 21. Cassandra Architecture (briefly). ! API Dynamo Database Please see: http://www.slideshare.net/aaronmorton/cassandra-community-webinar-introduction-to-apache-cassandra-12-20353118 http://www.slideshare.net/planetcassandra/c-summit-eu-2013-cassandra-internals http://www.slideshare.net/aaronmorton/cassandra-community-webinar-august-29th-2013-in-case-of-emergency-break-glass
  • 22. Cassandra in the Real World. ! Cassandra at 10k feet Case Studies Common Best Practices
  • 23. Case Studies. Ad Tech Sensor Data Mobile Device Diagnostics
  • 24. AdTech. Latency = $$$
  • 25. AdTech. Large “Hot Data” set: active users, targeting, display count
  • 26. AdTech. Huge Long Tail: who saw what, used for billing, campaign effectiveness over time, all sorts of analytics
  • 27. AdTech: Software. Java: CQL via DataStax Java Driver. Python: Pycassa (Thrift)
  • 28. AdTech: Cluster. Cluster 12 nodes, 2 datacenters, {DC1:R1:3,DC2:R2:3}
  • 29. AdTech: Systems. Physical Hardware commodity 1U 8xSSD, 36GB RAM, 10gigE + 4x1gigE
  • 30. Case Studies. AdTech Sensor Data Mobile Device Diagnostics
  • 31. Sensor Data. Latency != $$$
  • 32. Sensor Data. High Write Throughput: consistent “shape”, immutable data, large sequential reads, high uptime (for writes)
  • 33. Sensor Data: Software. REST application: separate reader service, writes to kafka, ELB to multiple regions
  • 34. Sensor Data: Software. Java: Thrift via Astyanax, read from kafka and batch insertions to optimal size
  • 35. Sensor Data: Cluster. Cluster 9 nodes, 1 availability zone, {RF:3}
  • 36. Sensor Data: Systems. m1.xlarge: 15GB RAM, 2TB RAID0, “high” network performance, tablesnap for backup
  • 37. Case Studies. AdTech Sensor Data Mobile Device Diagnostics
  • 38. Device Diagnostics. Latency = battery
  • 39. Device Diagnostics. Write Bursts: large single payloads, large hot data set
  • 40. Device Diagnostics. Huge long tail but irrelevant after 2 months, external partner API* ! *thar be dragons
  • 41. Device Diagnostics: Software. Java CQL / DataStax Java Driver
  • 42. Device Diagnostics: Software. REST application Payloads to S3, pointer in kafka to payload
  • 43. Device Diagnostics: Cluster. Cluster 12 nodes, 3 availability zones {us-east-1:1}
  • 44. Device Diagnostics: Systems. i2.2xlarge: 61GB, 1.8TB RAID0 SSD, “Enhanced Networking”, dedicated ENI
  • 45. Device Diagnostics: Systems. No Backups. ! !
  • 46. Device Diagnostics: Systems. No Backups. ! “Replay the front end.”
  • 47. Cassandra in the Real World. ! Cassandra at 10k feet Case Studies Common Best Practices
  • 48. Common Best Practices. API's Cluster Aware Cluster Unaware Clients Disk
  • 49. Client Best Practices. Decouple! buffer writes for event based systems, use asynchronous operations
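The decoupling pattern on slide 49 — buffer writes from event-based systems and flush asynchronously — might look like the sketch below. The `sink` callable and the `max_size` threshold are assumptions standing in for an asynchronous driver call or a Kafka producer:

```python
class WriteBuffer:
    """Buffer writes from an event-based producer and flush them in
    batches through a sink callable (a stand-in here for an async
    insert or a Kafka producer)."""

    def __init__(self, sink, max_size=50):
        self.sink = sink          # callable taking a list of rows
        self.max_size = max_size  # flush threshold (illustrative)
        self.pending = []

    def write(self, row):
        self.pending.append(row)
        if len(self.pending) >= self.max_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.sink(self.pending)
            self.pending = []

flushed = []
buf = WriteBuffer(sink=flushed.append, max_size=2)
for event in range(5):
    buf.write(event)
buf.flush()  # drain the remainder on shutdown
# flushed == [[0, 1], [2, 3], [4]]
```

The producer never blocks on the datastore directly, which is exactly the fault-isolation property the sensor-data case study gets from putting Kafka in front of the writers.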
  • 50. Client Best Practices. Use Official Drivers (but there are exceptions)
  • 51. Client Best Practices. CQL3: collections, user-defined types, tooling available
  • 52. Common Best Practices. API's Cluster Aware Cluster Unaware Clients Disk
  • 53. API Best Practices. Understand Replication!
  • 54. API Best Practices. Monitor & Instrument
  • 55. Common Best Practices. API's Cluster Aware Cluster Unaware Clients Disk
  • 56. Cluster Best Practices. Understand Replication! learn all you can about topology options
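The arithmetic behind slide 56's "understand replication" is small but worth internalizing: a QUORUM operation needs a strict majority of replicas, which is standard Cassandra behavior. A quick sketch (the helper name `quorum` is mine):

```python
def quorum(replication_factor):
    """Replicas that must answer for a QUORUM read or write."""
    return replication_factor // 2 + 1

# RF=3 (as in the case-study keyspaces): QUORUM needs 2 replicas,
# so each token range survives one replica being down.
# For a multi-DC layout like {DC1:3, DC2:3}, LOCAL_QUORUM applies
# the same majority rule within the local datacenter only.
levels = {rf: quorum(rf) for rf in (1, 3, 5)}
# levels == {1: 1, 3: 2, 5: 3}
```

Note that RF=1 gives a "quorum" of 1 and therefore no tolerance for node failure — which is why testing failure scenarios explicitly (slide 57) matters.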
  • 57. Cluster Best Practices. Verify Assumptions: test failure scenarios explicitly
  • 58. Common Best Practices. API's Cluster Aware Cluster Unaware Clients Disk
  • 59. Systems Best Practices. Better to have a lot of little commodity hardware*, 32-64GB of RAM (or more) *10gigE is now commodity
  • 60. Systems Best Practices. BUT: do you have staff that can tune kernels? Larger hardware needs tuning: “receive packet steering”
  • 61. Systems Best Practices. EC2 SSD instances if you can, use VPCs, deployment groups and ENIs
  • 62. Common Best Practices. API's Cluster Aware Cluster Unaware Clients Disk
  • 63. Storage Best Practices. Dependent on workload; can mix and match: rotational for commitlog and system
  • 64. Storage Best Practices. You can mix and match: rotational for commitlog and system, SSD for data
  • 65. Storage Best Practices. SSD: consider JBOD; consumer-grade works fine
  • 66. Storage Best Practices. “What about SANs?”
  • 67. Storage Best Practices. “What about SANs?” ! NO. ! (You would be moving a distributed system onto a centralized component)
  • 68. Storage Best Practices. Backups: tablesnap on EC2, rsync (immutable data FTW!)
  • 69. Storage Best Practices. Backups: combine rebuild+replay for best results (Bonus: loading production data to staging is testing your backups!)
  • 70. Thanks. !
  • 71. Nate McCall @zznate ! Co-Founder & Sr. Technical Consultant www.thelastpickle.com
