Your SlideShare is downloading. ×
0
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra

851

Published on

As we move into the world of Big Data and the Internet of Things, the systems architectures and data models we've relied on for decades are becoming a hindrance. At the core of the problem is the …

As we move into the world of Big Data and the Internet of Things, the systems architectures and data models we've relied on for decades are becoming a hindrance. At the core of the problem is the read-modify-write cycle. In this session, Al will talk about how to build systems that don't rely on RMW, with a focus on Cassandra. Finally, for those times when RMW is unavoidable, he will cover how and when to use Cassandra's lightweight transactions and collections.

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
851
On Slideshare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
32
Comments
0
Likes
3
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. ©2014 DataStax Confidential. Do not distribute without consent. @AlTobey Open Source Mechanic @ Datastax Designing Commodity Storage 1
  • 2. What is commodity storage? •software-defined storage •e.g. Cassandra, S3, GCE Persistent Disks •Intel/AMD x86_64 architecture ! Open Standards: •PCI-Express •Near-line SAS, Enterprise SATA, SATA SSD •1g/10g ethernet
  • 3. Definitely NOT this Designed to solve different problems from a different era.
  • 4. Not this either Besides SSDs most “desktop” gear is to be avoided for production deployment.
  • 5. Enterprise
  • 6. Rack & Stack •Blades & 1U for high CPU with low storage density •2U for plenty of CPU & storage & air flow •3U-4U for high-latency / high-density storage •“racks” don’t have to be literal •blade chassis •separate network/power is key
  • 7. Vendors
  • 8. Choosing Server Components •CPU •Memory •Motherboards •Host Bus Adapters •Hard Drives •Network Interface Cards
  • 9. CPU Pricing E5-2620 E5-2630 E5-2650 E5-2670 E5-2687W E5-2690 0 550 1100 1650 2200 6 cores 2.6Ghz 80w 6 cores 2.1Ghz 80w 8 cores 2.6Ghz 95w 10 cores 2.5Ghz 115w (3.3Ghz turbo) 8 cores 3.4Ghz 150w 8 cores 2.9Ghz 135w (3.8Ghz turbo) Dollars 15MB L3 Cache 15MB 20MB 20MB 25MB 25MB
  • 10. Processors Source: http://en.wikipedia.org/wiki/Sandy_Bridge-E
  • 11. Memory •always get ECC! •~5 single bit errors in 8 GB RAM per hour (top-end error rate) •unexplainable crashes •data corruption •8GB DIMMs are still the sweet spot ! •Registered Memory: match to your CPU/motherboard •Pretty much all server memory is ECC and Registered ! •Speed: match to fastest rating of CPU/motherboard
  • 12. Motherboards •Largely out of your control •Dell / HP / etc. you’re looking at server model, e.g. DL380 •Supermicro: be very careful when picking your VAR •Features to watch for: •Socket count (NUMA) •IPMI •onboard SAS or SATA port speed/count •PCIe speed & layout •RAM capacity
  • 13. Storage Adapters •Serial Attached SCSI •Bit Error Rate: 1 in 10^16 bits or 1bit in 1,250TiB •Supports SATA drives over STP •Near-line SAS drives are SATA chassis with SAS boards •Always use SAS if you need an expander •Check out enclosure services in Linux •Serial ATA •Bit Error Rate: 1 in 10^15 or 1 bit in 125 TiB •Avoid expanders
  • 14. Storage Adapters •JBOD •cheap •OS manages drives •drivers usually shipped with OS •CPU overhead is negligible •HW RAID is sometimes faster, usually comes with cache •writethrough v.s. writeback •writeback + BBU provides interesting performance options •driver + utilities management
  • 15. Parity RAID
  • 16. RAID •JBOD •mount every drive with individual filesystems •cheap •RAID0 •single drive failure means node rebuild •cheap •RAID10 •fast, protects against single disk failure •expensive
  • 17. RAID •RAID 5 / 6 (and beyond) •parity data protection •performance heavily dependent on implementation •cheapest option for drive failure protection •RAID 50 / 60 •stripe across multiple RAID[56] volumes •mostly useful with large number of drives •can provide decent performance esp. on HW RAID
  • 18. Hard Drives •SATA HDD •there’s only one head carriage •seeks kill •decent performance on sequential IO •bit errors •cheap!
  • 19. Hard Drives •SAS HDD •there’s only one head carriage •seeks kill •bit errors •expensive! •faster RPMs may help a little with seek latency
  • 20. Hard Drives •SATA SSD •very low latency seeks •slightly lower sequential IO throughput •more expensive than SATA HDD •vendors might not want to sell them to you! •sometimes called “value series” or similar •Cassandra runs fine on consumer-grade SSDs •make sure your SATA/SAS bus and HBA are up to the task
  • 21. Hard Drives •Enterprise SSD •quite expensive •vendor supported •more reliable •often faster as well
  • 22. Hard Drives •PCIe SSD •e.g. FusionIO, ioSwitch •highest performance potential •not as expensive as you think •lots of new products entering the market •generally not hot-swappable
  • 23. Networking •you don’t need 10gig •but it’s awesome •Broadcom cards are common and commonly buggy •Intel cards are expensive but a good bet •Consider lesser-known add-in cards, e.g. Myricom
  • 24. To the Cloud! •Amazon, Google, etc. all use similar gear under the VM •same constraints apply, but you only get a fraction of the box •pass-through PCIe devices for the best performance •Avoid EBS in EC2, go with ephemerals •GCE PD’s may need additional read/write threads
  • 25. @AlTobey Q & A Everybody is hiring, including Datastax! Open Source Mechanic, Datastax

×