Understanding Elastic Block Store Availability and Performance

Depending on your application needs, Elastic Block Store’s volumes can be configured for optimal performance and higher availability.  In this session, we will present the different design characteristics of EBS Standard and Provisioned IOPS volumes, provide technical insights on how to think about EBS performance and availability, and share best practices to achieve higher availability and performance.  

Understanding Elastic Block Store Availability and Performance Presentation Transcript

  • 1. Understanding Amazon EBS Availability and Performance
      Jafar Shameem, Business Development Manager
      Eric Anderson, CTO and Co-Founder, CopperEgg
  • 2. Agenda
      • Overview of Elastic Block Store
      • Some key concepts
      • Performance
      • Availability
  • 3. Storage Options on AWS
      • Block Storage (Elastic Block Store). Use for:
          – Access to raw, unformatted block-level storage
          – Persistent storage
      • Object Storage (S3, Glacier). Use for:
          – Pictures, videos, highly durable media storage
          – Cold storage for long-term archive
  • 4. Amazon Elastic Block Store (EBS)
      Elastic Block Store: persistent storage for EC2
      • High-performance file system: mount EBS volumes as drives and format as required
      • Flexible size: volumes from 1 GB to 1 TB
      • Secure: private to your instances
      • Available: replicated within an Availability Zone
      • Backups: volumes can be snapshotted for point-in-time restore
      • Monitoring: detailed metrics captured via CloudWatch
  • 5. What are some of our customers doing with EBS?
      • Enterprises
          – Enterprise workloads are built on block storage
          – Oracle, SAP, Microsoft applications
          – Convenient, cost-effective, reliable file server
      • Gaming/Social/Mobile/Education
          – Very high performance and consistent IO for NoSQL and relational DBs
      • Marketing/Analytics
          – Fast sequential IO access
  • 6. Key EBS concepts
      • Standard and Provisioned IOPS Volumes
      • Block Size
      • Queue Depth
      • Snapshots
  • 7. Standard and Provisioned IOPS Volume Types
      • Standard Volumes
          – Optimized for: workloads with low or moderate IOPS needs and occasional bursts
          – Volume attributes: up to 1 TB, average 100 IOPS per volume. Best-effort performance. Can be striped together for larger size and higher IOPS.
          – Workloads: file server, log processing, websites, analytics, boot, etc.
      • Provisioned IOPS Volumes
          – Optimized for: transactional workloads requiring consistent IOPS
          – Volume attributes: up to 1 TB, 2,000 IOPS per volume. Consistent IOPS. Can be striped together for larger size and higher IOPS.
          – Workloads: business applications, MongoDB, SQL Server, MySQL, Postgres, Oracle, etc.
  • 8. Block Size
  • 9. Queue Depth
      Maintain a number of pending I/O requests to get the most out of your Provisioned IOPS volume. The volumes must maintain an average queue length of 1 (rounded up to the nearest whole number) for every 200 provisioned IOPS in a minute.
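The queue-depth rule on this slide is simple ceiling arithmetic. A minimal sketch, assuming a hypothetical helper name `required_queue_depth` that takes the provisioned IOPS of one volume:

```shell
# Average queue length to sustain for a Provisioned IOPS volume:
# 1 per 200 provisioned IOPS, rounded up to the nearest whole number.
# (required_queue_depth is an illustrative name, not an AWS tool.)
required_queue_depth() {
  echo $(( ($1 + 199) / 200 ))
}

required_queue_depth 2000   # prints 10
required_queue_depth 500    # prints 3
```

If the observed average queue length stays well below this figure, the volume is unlikely to deliver its provisioned rate.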
  • 10. Snapshots
      • Create snapshots (backups) of any Amazon EBS volume.
      • The volume need not be attached to a running instance in order to take a snapshot.
      • These snapshots can be used to create multiple new Amazon EBS volumes, expand the size of a volume, or move volumes across Availability Zones.
      • The snapshots can be shared with specific AWS accounts or made public.
      • You can use this functionality to increase the size of an existing volume, rapidly replicate development and testing environments, or use Snapshot Copy to copy snapshots to another region for disaster recovery or regional expansion.
  • 11. Performance
      • Architecting for Performance
          – Avoid throughput saturation
          – Striping
      • Achieving Consistent Performance
          – Pre-warm Provisioned IOPS volumes
          – Plan for snapshot
      • Snapshot Performance
  • 12. Architecting for Performance: Use EBS Optimized Instances
  • 13. Architecting for Performance: Avoid Throughput Saturation
      • Example:
          – Cluster Compute instance types have 2 Gb/s of bandwidth to EBS; more than 6 PIOPS volumes at 2,000 IOPS each will saturate the 2 Gb/s network
          – An EBS Optimized m3.2xlarge instance has 1 Gb/s of bandwidth dedicated to EBS; more than 12 PIOPS volumes at 500 IOPS each will saturate the 1 Gb/s network
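Both examples on this slide work out to about 6,000 provisioned IOPS per Gb/s of EBS bandwidth (6 × 2,000 over 2 Gb/s, 12 × 500 over 1 Gb/s). The sketch below takes that ratio as an assumption inferred from the slide, not a documented AWS limit; `IOPS_PER_GBPS` and `max_piops_volumes` are illustrative names.

```shell
# Assumption inferred from the two slide examples, not an AWS-published figure.
IOPS_PER_GBPS=6000

# Usage: max_piops_volumes LINK_GBPS IOPS_PER_VOLUME
# Prints how many volumes fit before the EBS link is expected to saturate.
max_piops_volumes() {
  echo $(( $1 * IOPS_PER_GBPS / $2 ))
}

max_piops_volumes 2 2000   # prints 6
max_piops_volumes 1 500    # prints 12
```

Actual saturation depends on I/O size and access pattern, so benchmark the real workload before sizing an array this way.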
  • 14. Architecting for Performance: Striping
      sudo mdadm --verbose --create /dev/md0 --level=10 --chunk=256 --raid-devices=4 /dev/sdh1 /dev/sdh2 /dev/sdh3 /dev/sdh4
  • 15. Achieving Consistent Performance: Pre-warm Provisioned IOPS volumes
      • There is a 5 to 50 percent reduction in IOPS when you first access the data on a Provisioned IOPS volume.
      • Write to all blocks on new volumes before first use (this overwrites the device, so do it only on an empty volume or array):
          – $ dd if=/dev/zero of=/dev/md0 bs=1M
  • 16. Achieving Consistent Performance: Plan for Snapshot
      To minimize the impact of snapshots on the performance of a master node:
          – Create snapshots from a read replica of your data
          – Plan snapshots during off-peak usage
  • 17. Snapshot Performance
      • To improve snapshot performance
          – Increase the frequency of snapshots (snapshots are incremental, so more frequent snapshots each have fewer changed blocks to copy)
  • 18. Benchmarking performance
      • http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html#benchmark_piops
      • fio (Linux, Windows)
          – For benchmarking I/O performance. (Note that fio has a dependency on libaio-devel.)
      • Oracle ORION (Linux, Windows)
          – For calibrating the I/O performance of storage systems to be used with Oracle databases.
      • SQLIO (Windows)
          – For calibrating the I/O performance of storage systems to be used with Microsoft SQL Server.
  • 19. Testing random 4K reads
      • EBS
          – One volume: ~200 MongoOPS with some variability, <1 MB/s
          – Loaded instance: ~1,000 MongoOPS with some variability, <10 MB/s
      • PIOPS + EBS
          – One volume: 2,000 MongoOPS with <1% variability, 3 MB/s
          – Loaded instance: 20,000 MongoOPS with <1% variability, 60 MB/s
      • SSD
          – Hi1.4xlarge ephemeral: ~64,000 MongoOPS with low variability, ~245 MB/s
  • 20. Testing random 4K reads
      [Chart: series labeled EBS, PIOPS+ and SSD; annotations "Stable" and "PIOPS+"]
  • 21. Availability
      • RAID
      • Snapshots
  • 22. RAID
      • RAID 10: provides increased redundancy
          – Replace an EBS volume without application downtime
          – Increases read throughput
          – However, 50% reduction of provisioned aggregate write performance
          – E.g., MongoDB is optimized around the benefits of RAID 10
      • RAID 0:
          – All EBS volumes are replicated in the same AZ
          – Increased throughput
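The write-performance trade-off on this slide can be put in numbers. A rough sketch, assuming every volume in the array is provisioned at the same IOPS and that RAID 10 halves aggregate write IOPS exactly as the slide states; `raid_write_iops` is an illustrative name.

```shell
# Usage: raid_write_iops LEVEL VOLUMES IOPS_PER_VOLUME
# Prints the aggregate provisioned write IOPS of the array.
raid_write_iops() {
  level=$1 volumes=$2 iops=$3
  case $level in
    0)  echo $(( volumes * iops )) ;;        # striping: all volumes take distinct writes
    10) echo $(( volumes * iops / 2 )) ;;    # mirroring: every write lands on two volumes
    *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
  esac
}

raid_write_iops 0 4 2000    # prints 8000
raid_write_iops 10 4 2000   # prints 4000
```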
  • 23. Snapshots
  • 24. CopperEgg: EBS Use Case
      • How CopperEgg uses EBS
      • EBS vs Provisioned IOPS EBS
      • EBS and RAID
      • Backup/Snapshot best practices
      • Filesystem selection and tuning
      • Monitoring/Migrations/Planning
  • 25. How CopperEgg uses EBS
      • Real-time monitoring (every 5s)
          – System information
          – Processes
          – Synthetic HTTP/TCP/etc.
          – Application metrics
          – Tons more
      • Requirements:
          – Store many terabytes of data
          – Persist the data over long periods of time
          – Backups (use snapshots)
          – High IO: 50-60k+ ops/s per node
      • SSD + Provisioned IOPS EBS
          – Consistent IO behavior (non-spiky)
  • 26. EBS vs Provisioned IOPS EBS
      • Standard EBS
          – Good for low IO volume
          – Bursty workloads may be a good fit: do the math
      • Provisioned IOPS EBS
          – Great for steady IO patterns that need consistency
          – Not always more expensive than standard!
          – Be sure to use the IOPS you provision!
  • 27. EBS and RAID
      • Which RAID? Depends on your use case, but:
      • We use stripes (RAID 0) for most things
          – Good performance; we build our fault tolerance at a different level
      • RAID 10 (stripe of mirrors)
          – Good RAID 0 performance, plus better fault tolerance due to mirrors
          – Twice the cost of RAID 0
      • RAID 0+1 (mirror of stripes)
          – Don’t do this: same performance, worse fault tolerance
      • RAID 5 (stripe with parity)
          – Could be dangerous: software RAID 5 can be bad if you have any write caching enabled
          – Maybe RAID 6 (dual parity) is an option
      • Block size
          – Use an appropriate stripe size for best results
          – We use 64 KB, but you need to test various configs to get the best fit for your application
  • 28. Backup/Snapshot best practices
      • Snapshot regularly
          – At least once per day, more if you can
          – First snapshots take a while, subsequent ones are faster
          – Schedule for when your IO load is lowest to reduce impact
              • We do it at around 9pm CST
      • Use consistent naming for snapshots
          – {hostname}-{raid device}-{device}-{timestamp}
      • Use the API for creation
          – Faster kickoff, more likely to be consistent (script it!)
          – ec2-create-snapshot -d "{hostname}-{raid device}-{device}-{timestamp}" vol-d726382
      • Move older snapshots to S3/Glacier for long-term storage
      • RAID makes this a bit more complex:
          – Make sure you unmount/snapshot/remount your file system, or use fsfreeze to keep snapshots consistent!
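The naming convention above is easy to script. A minimal sketch: `snapshot_description` is a hypothetical helper, and the UTC timestamp format is an assumption, since the slide does not specify one. It strips any leading directory (such as /dev/) from the device names so the result stays readable.

```shell
# Usage: snapshot_description HOSTNAME RAID_DEVICE DEVICE [TIMESTAMP]
# Builds "{hostname}-{raid device}-{device}-{timestamp}".
snapshot_description() {
  host=$1
  raid=${2##*/}                               # /dev/md0  -> md0
  dev=${3##*/}                                # /dev/sdh1 -> sdh1
  stamp=${4:-$(date -u +%Y%m%dT%H%M%SZ)}      # assumed format; defaults to now (UTC)
  echo "$host-$raid-$dev-$stamp"
}

snapshot_description db01 /dev/md0 /dev/sdh1 20130501T210000Z
# prints db01-md0-sdh1-20130501T210000Z
```

The output can be passed as the -d argument of the ec2-create-snapshot command shown on the slide.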
  • 29. Choosing a good file system
      • We like ext3/4, but we love XFS
          – High performance, consistent
          – Robust, with lots of options for tweaking/adjusting as needed
      • Our favorite mount options (your mileage may vary):
          – inode64, noatime, nodiratime, attr2, nobarrier, logbufs=8, logbsize=256k, osyncisdsync, nobootwait, noauto
          – Yields great performance, reduces unnecessary writes, stable
      • We like ZFS a lot too, but we want to see more runtime on Linux first
          – But FreeBSD/ZFS would be a fine choice
      • However: test your workload!
          – File systems behave differently under different workloads
  • 30. EBS/File system performance tuning
      • Tuning file systems:
          – Set the scheduler to 'deadline' (for each disk in the RAID array/EBS):
              • [as root] echo deadline > /sys/block/[disk device]/queue/scheduler
          – Adjust how aggressively the cache is written to disk. Tune these back if you are bursty in write IO:
              • vm.dirty_ratio=30
              • vm.dirty_background_ratio=20
      • Track what you change!
          – Before changing anything, monitor it
          – After you make the change, monitor it
          – Then: KEEP monitoring it; things can change over time in unexpected ways
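To record the scheduler before and after a change, note that the sysfs file lists every available scheduler and marks the active one with square brackets (for example `noop [deadline] cfq`). A small sketch that extracts the active entry; `active_scheduler` is an illustrative name and `xvdf` in the usage note is a placeholder device.

```shell
# Reads the contents of a queue/scheduler file on stdin and prints
# only the active scheduler, i.e. the name inside the square brackets.
active_scheduler() {
  sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

echo "noop [deadline] cfq" | active_scheduler   # prints deadline
```

On a live system this could be run as `active_scheduler < /sys/block/xvdf/queue/scheduler` for each disk in the array.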
  • 31. Monitoring
      • Observing:
          – iostat -xcd -t 1
              • Watch the sum of r/s and w/s: this is your IOPS metric. For PIOPS, you want it close to the provisioned amount. We monitor this using CopperEgg custom metrics, and alert if it goes low or high.
          – grep -A 1 dirty /proc/vmstat
              • If nr_dirty approaches nr_dirty_threshold, you need to tune down the vm.dirty_* settings to flush writes more often.
              • Reference: http://docs.neo4j.org/chunked/stable/linux-performance-guide.html
      • Useful stats to capture:
          – In /proc/fs/xfs/stat
              • xs_trans* -> transactions
              • xs_read/write* -> read/write operation stats
              • xb_* -> buffer stats
      • Ignore SMART: it does not work for EBS
      • Watch the console log
          – Use the AWS API to look for warning signs of EBS issues
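Summing r/s and w/s by hand gets tedious, and the column positions differ between sysstat versions. The sketch below locates both columns from the `Device` header line instead of hard-coding them; `iops_from_iostat` is a hypothetical helper, and the sample line only illustrates the input shape.

```shell
# Reads `iostat -x` style output on stdin and prints "<device> <r/s + w/s>"
# for every device line. Column positions are taken from the header row.
iops_from_iostat() {
  awk '
    NF == 0        { r = 0; w = 0; next }    # blank line ends a device block
    $1 ~ /^Device/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "r/s") r = i
        if ($i == "w/s") w = i
      }
      next
    }
    r && w         { printf "%s %.0f\n", $1, $r + $w }
  '
}

printf 'Device: rrqm/s wrqm/s r/s w/s\nxvdf 0.00 0.00 1500.00 480.00\n\n' | iops_from_iostat
# prints xvdf 1980
```

In practice the input would come from something like `iostat -xcd -t 1 | iops_from_iostat`; the CPU and timestamp lines are skipped because they fall outside a device block.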
  • 32. Migrations and Capacity Planning
      • Using PIOPS?
          – Plan on a data migration path if you need to increase PIOPS
              • You can’t (yet) increase IOPS on the fly
      • Migration steps from an EBS-backed RAID:
          1. Snapshot 1 hr before, then again, and again; each time it takes less time
          2. Stop all services
          3. Unmount the filesystem
          4. Stop the RAID (mdadm --stop /dev/md0)
          5. Take final snapshot
          6. Create new volumes based on the last snapshot
          7. Attach the new volumes; mdadm should detect the array and magically make it work
          8. Mount the filesystem
          9. Restart services
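The local part of the migration (steps 3 to 5, then 7 and 8) can be laid out as a dry run before the maintenance window. This is only a sketch: `print_migration_plan` is a hypothetical helper that prints commands rather than running them, the `final-<volume>` description is an illustrative choice, and creating and attaching the new volumes is left as a comment because the exact commands depend on your tooling.

```shell
# Usage: print_migration_plan MOUNT_POINT RAID_DEVICE VOLUME_ID...
# Prints the commands for review; nothing is executed.
print_migration_plan() {
  mount_point=$1
  raid_dev=$2
  shift 2
  echo "umount $mount_point"                          # step 3
  echo "mdadm --stop $raid_dev"                       # step 4
  for vol in "$@"; do                                 # step 5, one snapshot per member volume
    echo "ec2-create-snapshot -d \"final-$vol\" $vol"
  done
  echo "# create volumes from the final snapshots and attach them (steps 6-7)"
  echo "mdadm --assemble --scan"                      # step 7: let mdadm find the array
  echo "mount $raid_dev $mount_point"                 # step 8
}

print_migration_plan /data /dev/md0 vol-aaa vol-bbb
```

Stopping and restarting services (steps 2 and 9) is application specific and is deliberately left out.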
  • 33. Resources & Questions
      • Jafar Shameem | shameemj@amazon.com | @rafaj
      • http://aws.amazon.com/ebs/
      • Amazon Provisioned IOPS
          – http://copperegg.com/amazon-provisioned-iops-ebs/
      • Benchmarking EBS performance:
          – http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
      • Stratalux: Putting Amazon’s Provisioned IOPS to the test
          – http://www.stratalux.com/2012/08/09/putting-amazon’s-provisioned-iops-to-the-test/
      • MongoDB on AWS:
          – http://docs.mongodb.org/ecosystem/platforms/#amazon-web-services-ec2