Cassandra Day NY 2014: From Proof of Concept to Production


This talk covers how to load test your Cassandra cluster against your application's schema, along with other best practices for gaining confidence in your Cassandra deployment before you run in production.

Published in: Technology

Cassandra Day NY 2014: From Proof of Concept to Production

  1. ©2014 DataStax. T Jake Luciani (@tjake), Apache Cassandra Committer & PMC. Proof-of-Concept to Production
  2. The way we build software
     1. Proof of Concept
     2. ??
     3. Production
     4. Profit!
     Do Nothing! Preparation! Development, Testing, Performance, Operations, Monitoring
  3. ©2013 DataStax Confidential. Do not distribute without consent.
     Cassandra Preparation
     • Going to production with C*, you must validate your assumptions and have a plan for when you:
       • lose nodes, disks, networks
       • have spikes of traffic
       • need to add more nodes
       • upgrade Cassandra, Java, hardware
       • …
     Plan for all the nightmare scenarios. This gives you confidence in your system.
  4. Before we begin
     • Be comfortable on the command line!
     • When something is going wrong you need to be able to get to the problem quickly, ask the right questions, and provide diagnostic information.
       • cassandra: nodetool, cqlsh
       • disk: iostat
       • cpu: top/htop
       • network: iftop
       • java: jstatd, jstack, jmx, visualvm (ok, not command line)
       • cssh (csshx on osx)
  5. Phase 1: Data Modeling
     • You’ve modeled your application in Cassandra
     • You’ve de-normalized based on queries
     • Stop. Stress test time…
     • C* 2.1 native CQL stress tool (works with 2.0)
       • CASSANDRA-6164
       • https://github.com/tjake/cassandra/archive/6164.zip
  6. CQL Stress tool
     • Why? Because you can push your cluster to the limit and see how *your* queries run on *your* hardware
     • cassandra-stress write -schema yaml=my.yaml
     • cassandra-stress read -schema yaml=my.yaml query=simple1
  7. CQL Stress: YAML File + Demo
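The transcript does not reproduce the YAML file shown in the demo. As a rough sketch only, the Python snippet below writes a hypothetical `my.yaml` using the profile layout that later shipped with cassandra-stress in 2.1 (`keyspace_definition`, `table_definition`, `columnspec`, `insert`, `queries`). The CASSANDRA-6164 branch used in the talk may differ. The keyspace, table, and column names are invented for illustration; only the query name `simple1` comes from the slides.

```python
from pathlib import Path
from textwrap import dedent

# Hypothetical profile: the keyspace, table, and column names are made up.
# Only the query name "simple1" appears in the talk.
PROFILE = dedent("""\
    keyspace: stress_demo
    keyspace_definition: |
      CREATE KEYSPACE stress_demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
    table: events
    table_definition: |
      CREATE TABLE events (
        user_id text,
        event_time timeuuid,
        payload text,
        PRIMARY KEY (user_id, event_time)
      );
    columnspec:
      - name: user_id
        size: fixed(16)
        population: uniform(1..100K)
      - name: payload
        size: uniform(50..500)
    insert:
      partitions: fixed(1)
      batchtype: UNLOGGED
    queries:
      simple1:
        cql: SELECT * FROM events WHERE user_id = ? LIMIT 10
    """)


def write_profile(path="my.yaml"):
    """Write the sketch profile to disk and return its path."""
    target = Path(path)
    target.write_text(PROFILE)
    return target
```

The point of the profile is that the table definition and the named query mirror what the application actually runs, so the stress numbers reflect your data model rather than a generic one.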
  8. Drain Dump
  9. Hardware
     • Currently C* isn’t well suited for > 1TB per node
       • Except DSE Hadoop nodes, which can be much larger
     • Ideally 1U or smaller (blades)
       • separate network, power, disk
     • If you have larger machines
       • VMs with disk per VM
       • Containers?
     • On EC2, use I2 instances
  10. Unix level stuff
      • turn off swap
      • turn off cpuspeed
      • switch to the deadline kernel I/O scheduler
      • resize socket buffers
      • install numactl
      • raise limits.conf esp (nofile and
      • stress your disks using something like bonnie++ to get an idea of the raw limits
  11. Deployment
      • Chef/Puppet/Ansible/etc.
      • Simpler rollout and rollback
      • You should release your artifacts to a central location
      • Do this for Cassandra too
        • Makes upgrades easier
  12. Monitoring
      • Stress your system and learn where it breaks down
        • Use that to create your alerts
      • Know your SLAs
        • Define them at each layer of your architecture
      • OpsCenter for all things C*
      • You can also easily integrate C* metrics into other metrics systems
        • http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2
  13. C* Monitoring
      • C*-specific things to monitor
        • pending compactions
        • exception count
        • disk space
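One way to turn those three metrics into alerts is sketched below in Python. This is an illustration only: the `NodeMetrics` type, the threshold values, and the assumption that the exception count is a running total are all hypothetical, and how the numbers are collected (OpsCenter, JMX, a metrics reporter) is left out. Real thresholds should come from stressing your own cluster.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Snapshot of one node; how the values are collected is out of scope."""
    pending_compactions: int
    exception_count: int       # assumed to be a running total
    disk_used_fraction: float  # 0.0 to 1.0


# Hypothetical thresholds; derive real ones from your own stress tests.
MAX_PENDING_COMPACTIONS = 20
MAX_DISK_USED_FRACTION = 0.5


def check_node(current, previous):
    """Return a list of alert messages for one node."""
    alerts = []
    if current.pending_compactions > MAX_PENDING_COMPACTIONS:
        alerts.append("pending compactions: %d" % current.pending_compactions)
    # Alert on any increase in the exception total since the last poll.
    if current.exception_count > previous.exception_count:
        new = current.exception_count - previous.exception_count
        alerts.append("new exceptions: %d" % new)
    if current.disk_used_fraction > MAX_DISK_USED_FRACTION:
        alerts.append("disk usage: %.0f%%" % (current.disk_used_fraction * 100))
    return alerts
```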
  14. Cassandra Ops
      • Understand operational basics like:
        • bootstrapping
        • repair
        • rebuild
        • scrub
  15. Choose your own consistency
      • When things go wrong you are in control
      • Build consistency controls into your application
      • In a pinch you can lower consistency and stay available
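A minimal Python sketch of such a consistency control follows. It is not tied to any driver: `Unavailable` stands in for whatever error your client raises when too few replicas respond, and `run_query` is a hypothetical callable supplied by the application. Which levels are acceptable to fall back to is an application decision; `QUORUM` then `ONE` is only an example.

```python
class Unavailable(Exception):
    """Stand-in for a driver error raised when too few replicas respond."""


# Ordered from strongest to weakest; the choice of levels is an assumption.
FALLBACK_LEVELS = ("QUORUM", "ONE")


def execute_with_fallback(run_query, levels=FALLBACK_LEVELS):
    """Try each consistency level in turn and return (result, level_used).

    run_query is a hypothetical callable that takes a consistency level
    name and either returns a result or raises Unavailable.
    """
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error = None
    for level in levels:
        try:
            return run_query(level), level
        except Unavailable as error:
            last_error = error  # remember the failure and try a weaker level
    raise last_error
```

Returning the level that was actually used lets the caller log or flag results that were read or written at reduced consistency.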
  16. Backups
      • Backups in C* are primarily to avoid human error
      • C* provides lightweight local snapshots
      • Traditional full backup of data in C* is hard to do
        • Your data needs to be de-duped, since each node's files contain data from many replicas
      • If you need a full traditional backup, you are best off doing full machine backups
      • At a minimum, back up system tables (in case you lose the entire box)
  17. Cassandra upgrades
      • Read the release notes! NEWS.txt
      • Read the change log! CHANGES.txt
      • Understand the changes and how they impact your system
      • Do this even if you don’t plan on upgrading.
        • Someone else may have fixed a potential issue for you.
      • Always snapshot your data before upgrading
  18. Canary node
      • When rolling out a new version of C* or your application, roll it out only to a single node and watch it
      • Quickly see if something is terribly wrong
      • Gives you the ability to verify new functionality before full rollout
  19. Pre-Prod Environments
      • Hard to do in large scale systems
      • Requires work like replaying traffic to a second cluster
      • Doesn’t need to be 1:1, but should offer a subset of real data to test with
  20. C* level stuff
      • cassandra.yaml
        • Use stress to size your write and read pools
        • internode_compression: dc
        • lower request timeouts (improves tail latency)
        • set concurrent compactors to 1/4 your cores
        • in 2.1 we have off heap memtable
      • Turn on Authentication
        • Keeps you/apps from accidentally connecting to prod
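The "1/4 your cores" rule of thumb for concurrent compactors can be worked out as below. This is a small sketch under assumptions: the function name is hypothetical, the floor of one compactor is my addition for small machines, and the result is only a starting value for `concurrent_compactors` in cassandra.yaml, to be checked against your own stress runs.

```python
import os


def suggested_concurrent_compactors(cores=None):
    """Return one quarter of the core count, with a floor of one compactor."""
    if cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    return max(1, cores // 4)
```

For example, a 16-core node gives 4, while a 2-core node is rounded up to the floor of 1.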
  21. Thanks! Questions? @tjake