Cassandra On EC2
Published in: Technology



  • 1. Cassandra On EC2
      Matthew F. Dennis // @mdennis
  • 2. Instance Sizes
      ● m1.xlarge is by far the most common size
      ● m1.large is OK for many use cases
      ● m2.4xlarge in some cases
          ● keep the entire dataset in memory
      ● c1.xlarge / cc1.4xlarge
          ● Smallish but very hot set of data
              – regardless of how much data is on disk
          ● Extremely high request rate
          ● Encrypted node-node communications and high traffic
          ● Usually better off with many m1.xlarge instances because of the extra memory, but not always
  • 3. Configuration
      ● Stripe all ephemeral drives
      ● Data directory and commit log on the same volume
          ● Only applies to EC2 and SSDs, not physical HW
          ● Why?
      ● 6-8 GB heap on m1.xlarge
      ● 3-4 GB heap on m1.large
      ● Phi Convict Threshold? Maybe ...
  • 4. EBS versus Ephemeral
      ● Ephemeral drives are:
          ● Generally faster for C*
          ● More stable (no pauses/freezes; outages?)
          ● Cheaper
          ● Easier to initially configure
      ● Striped EBS?
          ● yeah, about that …
      ● TL;DL: don't use EBS for C* on EC2
  • 5. Multi-Zone
      ● Alternate zones in your token topology
          ● No really, this is important, alternate zones
              – We should probably fix this ...
      ● “complicated, but possible” to add new zones after initial deployment
      ● Never move a *token* to a different region or zone
          ● If you think that is what you want to do, what you really want is to bootstrap a new node at token-1 in the new region/zone and then decommission the old one
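
A minimal sketch of what "alternate zones in your token topology" can look like. It assumes RandomPartitioner (token space 0 to 2**127), evenly spaced tokens, and a node count that is a multiple of the zone count so that the wrap-around neighbours also differ. The function name and the zone names are illustrative, not from the slides.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)


def alternating_layout(node_count, zones):
    """Return (token, zone) pairs with evenly spaced tokens.

    Zones are assigned round-robin in token order, so every node's ring
    neighbours (including the wrap-around pair) sit in a different zone.
    """
    if len(zones) < 2:
        raise ValueError("need at least two zones to alternate")
    if node_count % len(zones) != 0:
        # Otherwise the last and first node on the ring could share a zone.
        raise ValueError("node_count should be a multiple of the zone count")
    layout = []
    for i in range(node_count):
        token = i * RING_SIZE // node_count
        layout.append((token, zones[i % len(zones)]))
    return layout


# Example: six nodes over three (illustrative) zone names.
LAYOUT = alternating_layout(6, ["zone-a", "zone-b", "zone-c"])
```

Each pair in `LAYOUT` is one node: the token it should own and the zone it should be launched in.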
  • 6. Multi-Region C* on EC2
      ● Connectivity is the complicated part
          ● Ec2MultiRegionSnitch is not the entire answer
      ● Don't try to make a “fail over” DC, just go with active-active
          ● If you insist, then do the fail over in your application and configure C* the same as you would active-active
      ● Generally requires a lot more storage
          ● Doesn't matter though because you're using ephemeral drives (right?) and don't want a TB of data on each node anyway
  • 7. Multi-Region Connectivity Options
      ● VPN
      ● Encrypted node-node communication
          ● CPU utilization is often a downside
      ● VPNCubed / VPCPlus
          ● I've never deployed it, heard good things about it though
      ● Amazon VPC
          ● anyone know if a single VPC can span regions yet?
      ● SSH Tunnels
      ● EC2 security groups
      ● IPTables
      ● Encrypted node-node + public IP binding + AWS security groups + IPTables (EIPs may simplify this, never actually tried it)
  • 8. Recovery From Failures
      ● Don't “fix” EC2 nodes, replace them
          ● bootstrap at token-1, remove old token
              – bootstrap can be slow, but will get better
      ● Other than that it's the same in EC2 as not ...
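
The "bootstrap at token-1" step is just arithmetic on the failed node's token. A small sketch, assuming RandomPartitioner's 0 to 2**127 token space; wrapping token 0 around the ring is an assumption of this sketch, not something the slides state.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)


def replacement_token(old_token):
    """Token for the replacement node: one below the node being replaced.

    Token 0 wraps to the top of the ring (an assumption of this sketch).
    """
    if not 0 <= old_token <= RING_SIZE:
        raise ValueError("token outside the RandomPartitioner range")
    return (old_token - 1) % RING_SIZE
```

The result would be the new node's starting token (the `initial_token` setting in `cassandra.yaml`); once the new node has finished bootstrapping, the old token is removed from the ring.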
  • 9. Node Maintenance
      ● “Maintenance” On EC2?
      ● Usually not required (just replace the node)
      ● If it is, just stop C*, CL+HH/repair/RR will fix it
          ● Same as physical HW
      ● Stop Trying To Decom Nodes Just To Replace a Disk !!!
  • 10. Backups
      ● C* snapshots and push to S3
      ● Directory Watcher that pushes new files to S3
          ● SimpleGeo:
          ● Netflix:
      ● Keep a log of all incoming writes
          ● Not specific to S3
          ● Can be coupled with snapshots / S3
          ● Useful for other reasons as well
      ● Compression in transit to S3 (or wherever) can be done on a separate EC2 instance to avoid burning CPU
          ● Usually not worth the extra complexity / cost
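
A rough sketch of the "directory watcher" idea, using simple polling. The `upload` argument is a hypothetical callback supplied by the caller (for example, something that puts the file into an S3 bucket); nothing here talks to S3 itself, and this is not the SimpleGeo or Netflix implementation. A real watcher would also need to skip files that are still being written.

```python
import os
import time


def find_new_files(directory, seen):
    """Return sorted paths of files under `directory` that are not in `seen`."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if path not in seen:
                found.append(path)
    return sorted(found)


def watch(directory, upload, interval_seconds=10.0, max_rounds=None):
    """Poll `directory` and call `upload(path)` once for each new file.

    `upload` is a hypothetical caller-supplied callback. A path is marked as
    seen only after `upload` returns, so an upload that raises is not
    silently skipped. `max_rounds=None` polls forever.
    """
    seen = set()
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for path in find_new_files(directory, seen):
            upload(path)
            seen.add(path)
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            time.sleep(interval_seconds)
    return seen
```

Because SSTables are immutable once written, each file only ever needs to be pushed once, which is what makes this approach workable.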
  • 11. Changing Node Sizes
      ● Start a new instance
      ● rsync data from original node to new node
      ● Shut down C* on original node
      ● rsync again from original node to new node (to pick up changes made since the first pass)
      ● Start C* on new node
      ● Shut down original instance
      ● NB: Assumes same token, region, zone, etc
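
The steps between launching the new instance and shutting down the old one can be sketched as an ordered plan. This only builds the commands and does not run anything. The host names, data directory, and stop/start commands are supplied by the caller and are assumptions of this sketch; `-a` and `--delete` are standard rsync options.

```python
def resize_plan(old_host, new_host, data_dir, stop_cmd, start_cmd):
    """Return an ordered list of (host_to_run_on, argv) steps.

    The same rsync runs twice: once while C* is still up (bulk copy), and
    once after it has stopped (short catch-up copy of whatever changed).
    """
    src = data_dir.rstrip("/") + "/"  # trailing slash: copy directory contents
    rsync = ["rsync", "-a", "--delete", src, "{}:{}".format(new_host, src)]
    return [
        (old_host, list(rsync)),      # first pass, C* still running
        (old_host, list(stop_cmd)),   # shut down C* on the original node
        (old_host, list(rsync)),      # second pass, only the differences
        (new_host, list(start_cmd)),  # start C* on the new node
    ]
```

The second pass is usually much quicker than the first, which keeps the window with C* down on that token short.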
  • 12. Elastic Load Balancers
      ● They're awesome, use them
          ● Could be more awesome (e.g. better integration with Route 53)
          ● What I really want is TCP anycast for ELB across regions (AWS could make it work)
      ● Balance across regions with GeoIP / GeoDNS
          ● Zerigo, TZOHA, Neustar, “homegrown”, etc
          ● Route 53? You wish (though Route 53 itself is run over anycast)
              – “in the future we plan for Route 53 to also give you greater control over … the route your users take to reach an endpoint” --Werner Vogels
      ● Put them in front of your app servers, not your C* instances
      ● Keep your app servers stateless or at least “weakly” stateless (e.g. no sticky sessions required)
  • 13. AMIs versus Scripted Setup
      ● DataStax publishes C* AMIs
      ● Chef Recipes as well
      ● Or roll your own …
      ● Whatever you do, just make sure it's automated and repeatable
      ● *personally* I prefer scripting the setup remotely, but this is … “less than ideal”
      ● PSSH is, in general, awesome
  • 14. WTF?!
      ● Your zone X is not the same as my zone X
          ● Consistent within an EC2 account
          ● Problematic across accounts
          ● Does not apply to regions (i.e. your region X is my region X)
      ● EIPs resolve to private IPs from within AWS
      ● EBS volumes sometimes just “freeze”
          ● AWS: “yeah, that happens sometimes under load”
      ● steal% sometimes 20% or more (1%-3% is “normal”)
          ● This is AWS literally stealing your money
          ● Thankfully not all that common, but watch out for it
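
One way to "watch out for" steal% on a Linux instance is to sample the aggregate `cpu` line of `/proc/stat` twice and compare. A minimal sketch; the function names are illustrative, and the 1%-3% and 20% figures above are the slide's, not thresholds enforced here.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before_line, after_line):
    """Percentage of CPU time stolen between two samples of the 'cpu' line.

    Counter order: user nice system idle iowait irq softirq steal ...
    Only the first eight counters are summed, because guest time is already
    counted inside user/nice.
    """
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    deltas = [a - b for a, b in zip(after, before)]
    if len(deltas) < 8:
        raise ValueError("kernel does not report a steal counter")
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

Calling `read_cpu_line()` twice a few seconds apart and passing both results to `steal_percent` gives the steal percentage over that interval.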
  • 15. Missing AWS Features
      ● ELB over anycast
          ● Probably doable by AWS, but not others ...
      ● GeoDNS from Route53
          ● No really, WTF Doesn't Route53 Do GeoDNS ?!?!
      ● Multi-Region VPC
      ● Local SSDs
  • 16. We're Hiring!
      ● Developers
      ● QA
      ● Community Manager
      ● Sales / SE
      ● Interns
          – Dev
          – Support
          – QA
      ● Smart People Interested In Cassandra
  • 17. Cassandra On EC2
      Q? (yes, I'll post the slides on SlideShare)