Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra


As we move into the world of Big Data and the Internet of Things, the systems architectures and data models we've relied on for decades are becoming a hindrance. At the core of the problem is the read-modify-write cycle. In this session, Al will talk about how to build systems that don't rely on RMW, with a focus on Cassandra. Finally, for those times when RMW is unavoidable, he will cover how and when to use Cassandra's lightweight transactions and collections.

Published in: Technology

  1. Designing Commodity Storage. @AlTobey, Open Source Mechanic @ DataStax. ©2014 DataStax Confidential. Do not distribute without consent.
  2. What is commodity storage?
     • software-defined storage
     • e.g. Cassandra, S3, GCE Persistent Disks
     • Intel/AMD x86_64 architecture
     Open Standards:
     • PCI-Express
     • Near-line SAS, Enterprise SATA, SATA SSD
     • 1G/10G Ethernet
  3. Definitely NOT this. Designed to solve different problems from a different era.
  4. Not this either. Besides SSDs, most “desktop” gear is to be avoided for production deployment.
  5. Enterprise
  6. Rack & Stack
     • Blades & 1U for high CPU with low storage density
     • 2U for plenty of CPU & storage & air flow
     • 3U-4U for high-latency / high-density storage
     • “racks” don’t have to be literal
     • blade chassis
     • separate network/power is key
  7. Vendors
  8. Choosing Server Components
     • CPU
     • Memory
     • Motherboards
     • Host Bus Adapters
     • Hard Drives
     • Network Interface Cards
  9. CPU Pricing [Bar chart: price in dollars (axis 0 to 2200) for the E5-2620, E5-2630, E5-2650, E5-2670, E5-2687W, and E5-2690. Bar labels, in extracted order: 6 cores 2.6GHz 80W; 6 cores 2.1GHz 80W; 8 cores 2.6GHz 95W; 10 cores 2.5GHz 115W (3.3GHz turbo); 8 cores 3.4GHz 150W; 8 cores 2.9GHz 135W (3.8GHz turbo). L3 cache labels: 15MB, 15MB, 20MB, 20MB, 25MB, 25MB.]
  10. Processors Source:
  11. Memory
      • always get ECC!
      • ~5 single-bit errors in 8 GB RAM per hour (top-end error rate)
      • unexplainable crashes
      • data corruption
      • 8GB DIMMs are still the sweet spot
      • Registered Memory: match to your CPU/motherboard
      • Pretty much all server memory is ECC and Registered
      • Speed: match to fastest rating of CPU/motherboard
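The "~5 single-bit errors per hour" figure on the Memory slide can be reproduced with simple arithmetic. The sketch below is one plausible derivation, not the speaker's stated method: it assumes a worst-case DRAM error rate of 70,000 FIT per Mbit (1 FIT is one failure per billion device-hours), a number that does not appear on the slide. The function name is illustrative.

```python
# Assumption (not on the slide): a top-end DRAM error rate of
# 70,000 FIT per Mbit, where 1 FIT = 1 failure per 10**9 device-hours.
ASSUMED_FIT_PER_MBIT = 70_000
HOURS_PER_FIT_UNIT = 10**9


def errors_per_hour(ram_gib: float, fit_per_mbit: float = ASSUMED_FIT_PER_MBIT) -> float:
    """Expected bit errors per hour for ram_gib GiB of RAM at the given FIT rate."""
    mbit = ram_gib * 1024 * 8  # GiB -> MiB -> Mbit
    return mbit * fit_per_mbit / HOURS_PER_FIT_UNIT


if __name__ == "__main__":
    # 8 GiB at the assumed rate lands close to the slide's "~5 per hour".
    print(f"{errors_per_hour(8):.2f} expected bit errors per hour in 8 GB")
```

Without ECC, each of those events is a candidate for the unexplainable crashes and data corruption the slide warns about.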
  12. Motherboards
      • Largely out of your control
      • Dell / HP / etc. you’re looking at server model, e.g. DL380
      • Supermicro: be very careful when picking your VAR
      • Features to watch for:
        • Socket count (NUMA)
        • IPMI
        • onboard SAS or SATA port speed/count
        • PCIe speed & layout
        • RAM capacity
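Socket count matters because each populated socket usually shows up as its own NUMA node. As a rough sketch, on a Linux host you can count the nodes the kernel exposes under sysfs. The function name is made up for illustration, and the code assumes the standard `/sys/devices/system/node` layout; it returns 0 where that directory is missing.

```python
import os
import re


def count_numa_nodes(sysfs_node_dir: str = "/sys/devices/system/node") -> int:
    """Count NUMA nodes by listing nodeN entries under sysfs (Linux only)."""
    try:
        entries = os.listdir(sysfs_node_dir)
    except OSError:
        return 0  # not Linux, or sysfs is not mounted
    # Entries such as "online" or "possible" are skipped; only node0, node1, ... count.
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


if __name__ == "__main__":
    print(f"NUMA nodes visible to the kernel: {count_numa_nodes()}")
```

A dual-socket server typically reports two nodes; a single-socket box reports one.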
  13. Storage Adapters
      • Serial Attached SCSI
        • Bit Error Rate: 1 in 10^16 bits, or 1 bit in 1,250 TB
        • Supports SATA drives over STP
        • Near-line SAS drives are SATA chassis with SAS boards
        • Always use SAS if you need an expander
        • Check out enclosure services in Linux
      • Serial ATA
        • Bit Error Rate: 1 in 10^15 bits, or 1 bit in 125 TB
        • Avoid expanders
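The bit-error-rate figures convert to data volumes with one division by eight. A minimal sketch of that arithmetic (function names are illustrative): 1 error in 10^16 bits works out to 1,250 decimal terabytes, or about 1,137 TiB, and 1 in 10^15 bits to 125 TB, or about 114 TiB.

```python
def bytes_per_bit_error(exponent: int) -> int:
    """Average bytes transferred per bit error at a rate of 1 error in 10**exponent bits."""
    return 10**exponent // 8


def to_tb(num_bytes: int) -> float:
    """Decimal terabytes (10**12 bytes)."""
    return num_bytes / 10**12


def to_tib(num_bytes: int) -> float:
    """Binary tebibytes (2**40 bytes)."""
    return num_bytes / 2**40


if __name__ == "__main__":
    for name, exponent in (("SAS", 16), ("SATA", 15)):
        n = bytes_per_bit_error(exponent)
        print(f"{name}: 1 bit error per {to_tb(n):,.0f} TB ({to_tib(n):,.1f} TiB)")
```

The tenfold gap between the two interfaces is the point of the slide; the choice of decimal or binary units only moves each figure by about 10%.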
  14. Storage Adapters
      • JBOD
        • cheap
        • OS manages drives
        • drivers usually shipped with OS
        • CPU overhead is negligible
      • HW RAID is sometimes faster, usually comes with cache
        • writethrough vs. writeback
        • writeback + BBU provides interesting performance options
        • driver + utilities management
  15. Parity RAID
  16. RAID
      • JBOD
        • mount every drive with individual filesystems
        • cheap
      • RAID0
        • single drive failure means node rebuild
        • cheap
      • RAID10
        • fast, protects against single disk failure
        • expensive
  17. RAID
      • RAID 5 / 6 (and beyond)
        • parity data protection
        • performance heavily dependent on implementation
        • cheapest option for drive failure protection
      • RAID 50 / 60
        • stripe across multiple RAID[56] volumes
        • mostly useful with large number of drives
        • can provide decent performance esp. on HW RAID
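The "cheap" versus "expensive" labels on the RAID slides come down to how many drives' worth of capacity each layout leaves usable. A minimal sketch, assuming equal-size drives, two-way mirrors for RAID10, and a single parity group for RAID5/6; the function name and the 8 x 4 TB example are hypothetical.

```python
def usable_drives(level: str, n: int) -> int:
    """Drives' worth of usable capacity for n equal-size drives at a RAID level."""
    level = level.lower()
    if level in ("jbod", "raid0"):
        return n  # no redundancy: every drive holds data
    if level == "raid10":
        if n < 4 or n % 2:
            raise ValueError("RAID10 needs an even number of drives, at least 4")
        return n // 2  # half the drives hold mirror copies
    if level == "raid5":
        if n < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return n - 1  # one drive's worth of parity
    if level == "raid6":
        if n < 4:
            raise ValueError("RAID6 needs at least 4 drives")
        return n - 2  # two drives' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    drives, size_tb = 8, 4  # hypothetical chassis: 8 drives of 4 TB each
    for level in ("jbod", "raid0", "raid10", "raid5", "raid6"):
        print(f"{level:>6}: {usable_drives(level, drives) * size_tb} TB usable")
```

With eight drives, RAID10 gives up half the raw capacity while RAID5 gives up one drive, which is why parity RAID is the cheapest way to get drive failure protection.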
  18. Hard Drives
      • SATA HDD
        • there’s only one head carriage
        • seeks kill
        • decent performance on sequential IO
        • bit errors
        • cheap!
  19. Hard Drives
      • SAS HDD
        • there’s only one head carriage
        • seeks kill
        • bit errors
        • expensive!
        • faster RPMs may help a little with seek latency
  20. Hard Drives
      • SATA SSD
        • very low latency seeks
        • slightly lower sequential IO throughput
        • more expensive than SATA HDD
        • vendors might not want to sell them to you!
        • sometimes called “value series” or similar
        • Cassandra runs fine on consumer-grade SSDs
        • make sure your SATA/SAS bus and HBA are up to the task
  21. Hard Drives
      • Enterprise SSD
        • quite expensive
        • vendor supported
        • more reliable
        • often faster as well
  22. Hard Drives
      • PCIe SSD
        • e.g. FusionIO, ioSwitch
        • highest performance potential
        • not as expensive as you think
        • lots of new products entering the market
        • generally not hot-swappable
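Whether the bus and HBA are "up to the task" is a bandwidth budget: add up what the SSDs can deliver and compare it with what the adapter's PCIe link can carry. The sketch below uses theoretical link rates (about 600 MB/s for a 6 Gb/s SATA link and 500 MB/s per PCIe 2.0 lane, both after 8b/10b encoding); the drive count and per-drive throughput are hypothetical, and real-world throughput will be lower.

```python
SATA3_LINK_MB_S = 600  # 6 Gb/s SATA link after 8b/10b encoding
PCIE2_LANE_MB_S = 500  # PCIe 2.0, per lane per direction, after 8b/10b encoding


def hba_is_bottleneck(drives: int, per_drive_mb_s: float,
                      lanes: int, lane_mb_s: float = PCIE2_LANE_MB_S) -> bool:
    """True if the drives together can outrun the HBA's PCIe link."""
    per_drive = min(per_drive_mb_s, SATA3_LINK_MB_S)  # a drive cannot exceed its SATA link
    return drives * per_drive > lanes * lane_mb_s


if __name__ == "__main__":
    # Hypothetical setup: 8 SATA SSDs at 500 MB/s sequential each.
    for lanes in (4, 8):
        verdict = "bottleneck" if hba_is_bottleneck(8, 500, lanes) else "ok"
        print(f"8 SSDs behind a PCIe 2.0 x{lanes} HBA: {verdict}")
```

In this example an x4 adapter caps the array at half its aggregate throughput, while an x8 adapter just keeps up.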
  23. Networking
      • you don’t need 10gig
      • but it’s awesome
      • Broadcom cards are common and commonly buggy
      • Intel cards are expensive but a good bet
      • Consider lesser-known add-in cards, e.g. Myricom
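One place 10gig pays off is bulk data movement, such as streaming a node's data set during a rebuild. A back-of-the-envelope sketch of the wire time at line rate; the 1 TB data size is hypothetical, the function name is illustrative, and real streaming is also limited by disks, CPU, and any configured throttles.

```python
def transfer_hours(data_tb: float, link_gbit_s: float, efficiency: float = 1.0) -> float:
    """Hours to move data_tb decimal terabytes over a link of link_gbit_s Gbit/s."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbit_s * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for gbit in (1, 10):
        print(f"1 TB over {gbit}G Ethernet: {transfer_hours(1, gbit):.2f} hours at line rate")
```

Lowering `efficiency` below 1.0 models protocol overhead or a link shared with client traffic.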
  24. To the Cloud!
      • Amazon, Google, etc. all use similar gear under the VM
      • same constraints apply, but you only get a fraction of the box
      • pass-through PCIe devices for the best performance
      • Avoid EBS in EC2, go with ephemerals
      • GCE PDs may need additional read/write threads
  25. Q & A. @AlTobey, Open Source Mechanic, DataStax. Everybody is hiring, including DataStax!