
AWS re:Invent 2016: Deep Dive on Amazon Elastic Block Store (STG301)

In this popular session, you will learn about the latest features and use cases for Amazon EBS, including best practices, an overview of newly introduced features, and brand-new re:Invent announcements. In particular, we will cover the expanded portfolio of volume types, including provisioned IOPS, cold storage, and throughput-optimized. This session will help database admins and application architects understand how to balance performance and cost for applications in big data analytics, data warehousing, and transactional and NoSQL databases.

Published in: Technology

  1. 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Rob Alexander, AWS Principal Solutions Architect November 30, 2016 Deep Dive on Amazon Elastic Block Store STG301
  2. 2. Storage “Office Hours”: Meet the People who Build AWS Storage. Who: Lead Software Development Engineers, Architects, and Technical PMs. Where: Storage Booth Walk-up Bar. When: Exhibit hours (Tues 5-7pm, Wed & Thurs 10:30a-6:00p). What: Architecture best practices, code reviews, feature requests.
  3. 3. Storage service options Amazon Elastic Block Store Amazon Elastic File System Amazon S3 Block File Object
  4. 4. AWS block storage offerings EC2 instance store
  5. 5. What is Amazon EC2 instance store? • Local to instance • Non-persistent data store • Data not replicated (by default) • No snapshot support • SSD or HDD (Diagram: EC2 instances on a physical host with an instance store)
  6. 6. AWS block storage offerings: EC2 instance store; EBS SSD-backed volumes (gp2, io1); EBS HDD-backed volumes (st1, sc1)
  7. 7. What is EBS? EBS volume EC2 instance • Block storage as a service • Create, attach volumes through an API • Service accessed over the network
  8. 8. What is EBS? EBS volume EC2 instance !=
  9. 9. What is EBS? EBS volume EC2 instance
  10. 10. What is EBS? EBS volume Availability Zone AWS Region EC2 instance
  11. 11. What is EBS? EBS volume EC2 instance EC2 instance • Volumes persist independent of EC2 • Detach and attach between instances • Volume and instance must be in the same AZ Availability Zone AWS Region
  12. 12. What is EBS? EBS boot volume Availability Zone AWS Region EC2 instance EBS data volume EBS data volume • Volumes attach to one instance at a time • Many volumes can attach to an instance • Separate boot volume from data volumes
  13. 13. What is EBS? EBS volume Availability Zone Availability Zone AWS Region Replica
  14. 14. What is EBS? EBS is designed for: 99.999% service availability and a 0.1% to 0.2% annual failure rate (AFR)
  15. 15. What is an EBS snapshot? EBS volume Availability Zone AWS Region Amazon S3 EBS snapshot Availability Zone Replica
  16. 16. How does an EBS snapshot work? EBS volume • Point-in-time backup of modified volume blocks • Stored in S3, accessed via EBS APIs • Subsequent snapshots are incremental • Deleting snapshot will only remove data exclusive to that snapshot EBS snapshot
  17. 17. What can you do with a snapshot? EBS volume Availability Zone AWS Region EC2 instance EBS snapshot AMI
  18. 18. What can you do with a snapshot? EBS volume Availability Zone AWS Region Amazon S3 EBS snapshot Availability Zone EBS volume Replica Replica
  19. 19. What can you do with a snapshot? EBS volume Availability Zone AWS Region Amazon S3 EBS snapshot EBS volume Availability Zone AWS Region EBS snapshot Replica Replica
  20. 20. What can you do with a snapshot? AWS Region Public datasets on AWS available as EBS snapshots: Availability Zone EBS volume https://aws.amazon.com/public-data-sets/ • Genomic • Census • Global weather • Transportation Replica
  21. 21. What is an EBS-optimized instance? EBS volume Availability Zone AWS Region EBS-optimized EC2 instance
  22. 22. What is an EBS-optimized instance? EBS EC2 instances Internet Databases ~ 125 MB/s S3 Shared c3.2xlarge
  23. 23. What is an EBS-optimized instance? EBS EC2 instances Internet Databases c3.2xlarge S3 ~ 125 MB/s Shared
  24. 24. What is an EBS-optimized instance? More details: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html • Dedicated network bandwidth for EBS I/O • Enabled by default on c4, d2, m4, p2, and x1 instances • Can be enabled at instance launch or on a running instance • Not an option on some 10 Gbps instance types (c3.8xlarge, r3.8xlarge, i2.8xlarge)
  25. 25. What is EBS encryption? Encryption • Attach both encrypted and unencrypted • No volume performance impact • Any current generation instance • Supported by all EBS volume types • Snapshots also encrypted • No extra cost • Boot and data volumes can be encrypted
  26. 26. EBS volume types
  27. 27. EBS volume types Solid state drive Hard disk drive
  28. 28. EBS volume types General Purpose SSD gp2 Provisioned IOPS SSD io1 Throughput Optimized HDD st1 Cold HDD sc1 SSD HDD
  29. 29. Choosing an EBS volume type What is more important to your workload: IOPS or throughput?
  30. 30. Choosing an EBS volume: Don’t know your workload yet?
  31. 31. EBS volume types: I/O Provisioned General Purpose SSD gp2 Throughput: 160 MB/s Latency: Single-digit ms Capacity: 1 GB to 16 TB Baseline: 3 IOPS per GB up to 10,000 Burst: 3,000 IOPS (for volumes up to 1 TB) Great for boot volumes, low-latency applications, and bursty databases
  32. 32. Burst & baseline: General Purpose SSD (gp2). Chart: baseline IOPS vs. volume size (TB), with a baseline of 3 IOPS/GB, burstable to 3,000 IOPS. Labeled points: 100 IOPS minimum, 300 GB volume, ~3,334 GB.
  33. 33. Burst bucket: General Purpose SSD (gp2) Max I/O credit per bucket is 5.4M You can spend up to 3000 IOPS per second Baseline performance = 3 IOPS per GiB or 100 IOPS Always accumulating 3 IOPS per GiB per second gp2
  34. 34. How long can I burst on gp2? Chart: minutes of burst vs. volume size in GB (1 to 950 GB). Labeled points: 43 min, 1 hour, 10 hours.
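The burst-bucket numbers above (5.4M credits, 3,000 IOPS burst, 3 IOPS per GiB baseline with a 100 IOPS floor and 10,000 IOPS ceiling) are enough to estimate burst duration. The following is a minimal sketch, assuming a full bucket and a workload that drives the volume at the full burst rate; the function names are illustrative and not part of any AWS API.

```python
import math

GP2_BUCKET_CREDITS = 5_400_000  # maximum I/O credits per bucket (from the slide)
GP2_BURST_IOPS = 3_000          # burst ceiling
GP2_MIN_BASELINE = 100          # baseline floor
GP2_MAX_BASELINE = 10_000       # baseline ceiling


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS: 3 per GiB, clamped to the 100..10,000 range."""
    return max(GP2_MIN_BASELINE, min(3 * size_gib, GP2_MAX_BASELINE))


def gp2_burst_minutes(size_gib: int) -> float:
    """Minutes a full bucket lasts at 3,000 IOPS.

    Credits refill at the baseline rate while they are being spent,
    so the net drain is (burst - baseline) credits per second.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return math.inf  # baseline already meets or exceeds the burst rate
    return GP2_BUCKET_CREDITS / (GP2_BURST_IOPS - baseline) / 60


if __name__ == "__main__":
    for size in (300, 500, 950):
        print(size, "GiB:", round(gp2_burst_minutes(size)), "minutes")
```

Under these assumptions the three volume sizes work out to roughly 43 minutes, 1 hour, and 10 hours, in line with the labeled points on the chart.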
  35. 35. How do I monitor gp2 burst balance? Metrics: VolumeWriteOps and BurstBalance. 500 GB gp2 volume: 900,000 write IOs over 5 min = 3000 IOPS; 450,000 write IOs over 5 min = 1500 IOPS
  36. 36. Choosing an EBS volume type What is more important to your workload: IOPS or throughput?
  37. 37. Choosing an EBS volume type: IOPS is more important. IOPS ≤ 65,000 or > 65,000? Latency < 1 ms or single-digit ms? Which is more important, cost or performance? Options: i2, gp2, io1
  38. 38. EBS volume types: I/O Provisioned Provisioned IOPS SSD io1 Baseline: 100 to 20,000 IOPS Throughput: 320 MB/s Latency: Single-digit ms Capacity: 4 GB to 16 TB Ideal for critical applications and databases with sustained IOPS
  39. 39. Scaling Provisioned IOPS SSD (io1). Chart: available provisioned IOPS (up to 20,000) vs. volume size (TB). Maximum provisioned IOPS follows a maximum IOPS:GB ratio of 50:1. Labeled point: ~400 GB.
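The 50:1 ratio and the 100 to 20,000 IOPS range can be checked with a few lines of arithmetic. This is a minimal sketch using only the limits stated on the io1 slides; the function name is hypothetical.

```python
IO1_MIN_IOPS = 100          # lowest provisionable IOPS (from the slide)
IO1_MAX_IOPS = 20_000       # highest provisionable IOPS per volume
IO1_MAX_IOPS_PER_GB = 50    # maximum IOPS:GB ratio of 50:1


def io1_max_provisioned_iops(size_gb: int) -> int:
    """Largest IOPS value that can be provisioned for a volume of this size."""
    return min(IO1_MAX_IOPS, IO1_MAX_IOPS_PER_GB * size_gb)


def io1_request_is_valid(size_gb: int, iops: int) -> bool:
    """True when the requested IOPS fits the range and the 50:1 ratio."""
    return IO1_MIN_IOPS <= iops <= io1_max_provisioned_iops(size_gb)


if __name__ == "__main__":
    for size in (4, 100, 400, 16_000):
        print(size, "GB ->", io1_max_provisioned_iops(size), "IOPS max")
```

A 400 GB volume is the smallest size at which the full 20,000 IOPS becomes available, which matches the ~400 GB point on the chart.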
  40. 40. Choosing an EBS volume type: IOPS is more important. IOPS ≤ 65,000 or > 65,000? Latency < 1 ms or single-digit ms? Which is more important, cost or performance? Options: i2, gp2, io1. Throughput?
  41. 41. Choosing an EBS volume type: Throughput is more important. Small, random I/O or large, sequential I/O? Aggregate throughput ≤ 1,250 MB/s or > 1,250 MB/s? Which is more important, cost or performance? IOPS is more important: IOPS ≤ 65,000 or > 65,000? Latency < 1 ms or single-digit ms? Which is more important, cost or performance? Options: i2, gp2, io1, st1, d2
  42. 42. EBS volume types: Throughput Provisioned Throughput Optimized HDD st1 Baseline: 40 MB/s per TB up to 500 MB/s Capacity: 500 GB to 16 TB Burst: 250 MB/s per TB up to 500 MB/s Ideal for large-block, high-throughput sequential workloads
  43. 43. Throughput Optimized HDD (st1): burst and base. Chart: throughput in MB/s vs. volume size in TB (0.5 to 16), with burst and base curves; 320 MB/s marked.
  44. 44. Burst bucket: Throughput Optimized HDD (st1) Max I/O bucket credit is 1 TB of credit per TB in volume You can spend up to 250 MB/s per TB Baseline performance = 40 MB/s per TB Always accumulating 40 MB/s per TB st1
  45. 45. Burst bucket: example 8 TB st1 volume. Up to 8 TB in I/O credit. Always accumulating 320 MB/s. You can spend up to 500 MB/s. Baseline performance = 320 MB/s
  46. 46. Choosing an EBS volume type: Throughput is more important. Small, random I/O or large, sequential I/O? Aggregate throughput ≤ 1,250 MB/s or > 1,250 MB/s? Which is more important, cost or performance? IOPS is more important: IOPS ≤ 65,000 or > 65,000? Latency < 1 ms or single-digit ms? Which is more important, cost or performance? Options: i2, gp2, io1, sc1, st1, d2
  47. 47. EBS volume types: Throughput Provisioned Cold HDD sc1 Baseline: 12 MB/s per TB up to 192 MB/s Capacity: 500 GB to 16 TB Burst: 80 MB/s per TB up to 250 MB/s Ideal for sequential throughput workloads, such as logging and backup
  48. 48. Cold HDD (sc1): burst and base. Chart: throughput in MB/s vs. volume size in TB (0.5 to 16), with burst and base curves; 192 MB/s marked.
  49. 49. Burst bucket: Cold HDD (sc1) Max I/O bucket credit is 1 TB of credit per TB in volume You can spend up to 80 MB/s per TB Baseline performance = 12 MB/s per TB Always accumulating 12 MB/s per TB sc1
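The st1 and sc1 burst buckets follow the same shape with different constants, so both can be modeled with one small helper. The sketch below uses only the per-TB rates and caps from the slides; the `HddProfile` type and function names are hypothetical, and the bucket-drain estimate assumes decimal units (1 TB = 1,000,000 MB) and a workload that bursts continuously from a full bucket.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HddProfile:
    base_mb_s_per_tb: float   # baseline throughput earned per TB
    base_cap_mb_s: float      # baseline ceiling
    burst_mb_s_per_tb: float  # burst throughput allowed per TB
    burst_cap_mb_s: float     # burst ceiling


ST1 = HddProfile(40, 500, 250, 500)   # Throughput Optimized HDD
SC1 = HddProfile(12, 192, 80, 250)    # Cold HDD


def throughput(profile: HddProfile, size_tb: float) -> Tuple[float, float]:
    """Return (baseline MB/s, burst MB/s) for a volume of the given size."""
    base = min(profile.base_mb_s_per_tb * size_tb, profile.base_cap_mb_s)
    burst = min(profile.burst_mb_s_per_tb * size_tb, profile.burst_cap_mb_s)
    return base, burst


def full_bucket_burst_hours(profile: HddProfile, size_tb: float) -> float:
    """Estimate hours of burst from a full bucket (1 TB of credit per TB).

    Assumption: credits refill at the baseline rate while bursting, so the
    net drain is (burst - baseline) MB/s.
    """
    base, burst = throughput(profile, size_tb)
    if burst <= base:
        return math.inf  # baseline already equals the burst rate
    credit_mb = size_tb * 1_000_000
    return credit_mb / (burst - base) / 3600


if __name__ == "__main__":
    print("8 TB st1:", throughput(ST1, 8))
    print("16 TB sc1:", throughput(SC1, 16))
```

For the 8 TB st1 example this gives a 320 MB/s baseline and a 500 MB/s burst, the same figures shown on the example slide.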
  50. 50. Choosing an EBS volume type: Throughput is more important. Small, random I/O or large, sequential I/O? Aggregate throughput ≤ 1,250 MB/s or > 1,250 MB/s? Which is more important, cost or performance? IOPS is more important: IOPS ≤ 65,000 or > 65,000? Latency < 1 ms or single-digit ms? Which is more important, cost or performance? Options: i2, gp2, io1, sc1, st1, d2
  51. 51. I/O Provisioned Volumes: gp2 $0.10 per GB; io1 $0.125 per GB plus $0.065 per PIOPS. Throughput Provisioned Volumes: st1 $0.045 per GB; sc1 $0.025 per GB. Snapshot storage for all volume types is $0.05 per GB per month. * All prices are per month, and from the us-west-2 Region as of April 2016
  52. 52. Hybrid volume use cases c4 gp2 st1 STG205 Case Study: Librato’s Experience Running Cassandra Using Amazon EBS Data files Commit log i2
  53. 53. Hybrid volume use cases gp2 st1 STG311 Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS Hot data 0–7 Days Warm data 8–30 days sc1 Cold data 31–60 days Tiered Elasticsearch data:
  54. 54. Hybrid volume use cases Case Study: Infor: https://aws.amazon.com/solutions/case-studies/infor-ebs/ “We’ve seen much stronger performance for our database backup workloads with the Amazon EBS ST1 volumes, and we’re also saving 75 percent on our monthly backup costs.” Randy Young, Director of Cloud Operations, Infor. Diagram: SQL Server Database on i2; Transaction logs, Full backups, and Partial backups on st1; EBS snapshots
  55. 55. Hybrid volume use cases gp2 st1 Amazon EMR Apache Hadoop Example Frameworks on YARN HDFS sc1 EMR cluster instance • Random, small I/O • Shuffle, spill, and temp operations • Large, sequential I/O • Multiple volumes for more parallelism or
  56. 56. Hadoop with multiple EBS volume types [ { "classification":"yarn-site", "properties":{ "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage":"99.9", "yarn.nodemanager.local-dirs":"/mnt/yarn,/mnt1/yarn" } }, { "classification":"core-site", "properties":{ "fs.s3.buffer.dir":"/mnt/s3,/mnt1/s3" } }, { "classification":"hdfs-site", "properties":{ "dfs.namenode.name.dir":"file:///mnt2/namenode,file:///mnt3/namenode,file:///mnt4/namenode", "dfs.name.dir":"/mnt2/namenode,/mnt3/namenode,/mnt4/namenode", "dfs.data.dir":"/mnt2/hdfs,/mnt3/hdfs,/mnt4/hdfs", "dfs.datanode.data.dir":"file:///mnt2/hdfs,file:///mnt3/hdfs,file:///mnt4/hdfs" } }, { "classification":"mapred-site", "properties":{ "mapred.local.dir":"/mnt/mapred,/mnt1/mapred", "mapreduce.cluster.local.dir":"/mnt/mapred,/mnt1/mapred" } } ] GP2 = mnt, mnt1 ST1 / SC1 = mnt2, mnt3, mnt4
  57. 57. EBS deep dive
  58. 58. EBS deep dive Performance
  59. 59. How do we count I/Os for GP2 and IO1? When possible, we merge sequential I/Os (up to 256 KB in size) ...To minimize I/O charges on IO1 and maximize burst on GP2
  60. 60. How do we count I/Os for GP2 and IO1? Example 1: Random I/Os • 4 random I/Os (i.e., non sequential I/Os) • Each I/O 64 KB Up to 256 KB EC2 instance EBS Counted as 4 I/Os
  61. 61. How do we count I/Os for GP2 and IO1? Example 2: Sequential I/O • 4 sequential I/Os • Each I/O 64 KB Up to 256 KB EC2 instance EBS Counted as 1 I/O
  62. 62. How do we count I/Os for GP2 and IO1? Example 3: Large I/O • 1 I/O • 1024 KB Up to 256 KB EC2 instance EBS Counted as 4 I/Os
  63. 63. How do we count I/Os for ST1 and SC1? • When possible, we merge sequential I/Os (up to 1 MB in size) • Workloads with primarily large, sequential I/Os perform best on ST1 and SC1 • Ex: Big Data/EMR, Hadoop, Kafka, Log Processing, Data Warehouses
  64. 64. How do we count I/Os for ST1 and SC1? Example 1: Random I/Os • 4 random I/Os • Each I/O 64 KB Up to 1024 KB EC2 instance EBS Counted as 4 I/Os or 4 MB/s of burst
  65. 65. How do we count I/Os for ST1 and SC1? Example 2: Sequential I/O • 4 sequential I/Os • Each I/O 1024 KB Up to 1024 KB EC2 instance EBS Counted as 4 I/Os or 4 MB/s of burst
  66. 66. How do we count I/Os for ST1 and SC1? Example 3: Mixed I/O • 2 * 512 KB sequential I/Os • 2 * 64 KB random I/Os • 2 * 128 KB sequential I/Os Up to 1024 KB EC2 instance EBS Counted as 4 I/Os or 4 MB/s of burst (but only ~ 1.4 MB of data transferred)
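The six counting examples above can be reproduced with a simplified model: contiguous requests are merged into runs, and each run is charged one I/O per merge-size chunk (256 KB for GP2 and IO1, 1024 KB for ST1 and SC1). This is a sketch of the rule as the slides describe it, not the actual EBS accounting logic; the request representation `(offset_kb, size_kb)` and the function name are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Request = Tuple[int, int]  # (offset_kb, size_kb), in submission order


def count_ios(requests: List[Request], max_merge_kb: int) -> int:
    """Count I/Os after merging sequential requests up to max_merge_kb."""
    counted = 0
    run_kb = 0              # size of the current contiguous run
    expected_next = None    # offset that would continue the current run
    for offset_kb, size_kb in requests:
        if offset_kb != expected_next and run_kb:
            # The run ended: charge one I/O per max_merge_kb chunk.
            counted += math.ceil(run_kb / max_merge_kb)
            run_kb = 0
        run_kb += size_kb
        expected_next = offset_kb + size_kb
    if run_kb:
        counted += math.ceil(run_kb / max_merge_kb)
    return counted


if __name__ == "__main__":
    random_64k = [(0, 64), (1000, 64), (5000, 64), (9000, 64)]
    sequential_64k = [(0, 64), (64, 64), (128, 64), (192, 64)]
    print(count_ios(random_64k, 256))       # GP2/IO1 example 1
    print(count_ios(sequential_64k, 256))   # GP2/IO1 example 2
    print(count_ios([(0, 1024)], 256))      # GP2/IO1 example 3
```

With a 1024 KB merge size, the mixed ST1/SC1 example (two sequential 512 KB, two random 64 KB, two sequential 128 KB requests) counts as 4 I/Os while moving only 1,408 KB of data, which is why small or random I/O drains the burst bucket so quickly on these volume types.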
  67. 67. Burst balance for ST1 and SC1: 4 TB ST1 volume. Chart: burst balance (%) vs. time in hours, for 1 MB sequential and 16 KB random I/O. 1 MB Sequential: 500 MB/s for 3 hours. 16 KB Random: 8 MB/s for 3 hours
  68. 68. Burst balance for ST1 and SC1: 4 TB ST1 volume. Chart: data transferred in GB, for 1 MB sequential and 16 KB random I/O. 1 MB Sequential: 5.4 TB transferred. 16 KB Random: 87 GB transferred
  69. 69. Verify workload I/O patterns: iostat for Linux, perfmon for Windows. $ iostat -xm Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util xvdf 0.00 0.20 0.00 523.40 0.00 523.00 2046.44 3.99 7.62 1.61 100.00 avgrq-sz: 2046 sectors x 512 bytes/sector = ~1024 KiB
  70. 70. Verify ST1 & SC1 workloads (Amazon CloudWatch console): 128 KiB
  71. 71. Verify ST1 & SC1 workloads (CloudWatch console): Under 64 KiB? Small or random IOPS likely
  72. 72. Verify ST1 & SC1 workloads (CloudWatch console): Stuck around 44 KiB? Upgrade your Linux kernel to at least 3.8
  73. 73. Responses Requests Instance Ring queue userspace process kernel request queue noop deadline cfq scheduler I/O requests in a Linux virtual world I/O Driver Domain Hypervisor
  74. 74. I/O requests in a Linux virtual world: 3.8+ kernel Instance EBS userspace process kernel request queue scheduler noop deadline cfq pre 3.8: 44 KB post 3.8: 128 KB to 1024 KB I/O Driver Domain Hypervisor Up to 32 requests in queue
  75. 75. I/O requests in a Linux virtual world: 4.2+ kernel Instance EBS userspace process kernel request queue per core blk-mq pre 3.8: 44 KB post 3.8: 128 KB to 1024 KB I/O Driver Domain Hypervisor Up to 32 requests in queue
  76. 76. ST1 & SC1: Linux performance tuning Increase maximum request size: • Recommended for ST1, SC1 on a 4.2+ Linux kernel • Memory allocated per device • Default is 32, max for EC2 is 256 • OS boot command line configuration For example with GRUB’s /boot/grub/menu.lst configuration: kernel /boot/vmlinuz-4.4.5-15.26.amzn1.x86_64 root=LABEL=/ console=ttyS0 xen_blkfront.max=256 Verify setting: /sys/module/xen_blkfront/parameters/max
  77. 77. ST1 & SC1: Linux performance tuning Increase read-ahead buffer: • Recommended for high-throughput read workloads • Per device configuration • Default is 128 KiB (256 sectors) for Amazon Linux • Smaller or random I/O will degrade performance For example: $ sudo blockdev --setra 2048 /dev/xvdf http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
  78. 78. IOPS vs. throughput Example: io1 volume with 20,000 PIOPS, by I/O request size: 16 KB at 20,000 IOPS = 320 MB/s; 32 KB at 10,000 IOPS = 320 MB/s; 256 KB at 1,250 IOPS = 320 MB/s; 64 KB at 10,000 IOPS = 640 MB/s (above the 320 MB/s io1 throughput limit)
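The relationship on this slide is throughput = IOPS x I/O size, bounded by the volume's throughput limit. The sketch below assumes the 320 MB/s io1 limit stated earlier in the deck and the slide's decimal arithmetic (1 MB = 1,000 KB); the function name is hypothetical.

```python
from typing import Tuple


def io1_effective(provisioned_iops: int, io_size_kb: int,
                  throughput_limit_mb_s: int = 320) -> Tuple[int, float]:
    """Return (achievable IOPS, resulting MB/s) for a given I/O size.

    Achievable IOPS is the smaller of the provisioned IOPS and the IOPS
    at which the throughput limit is reached for this I/O size.
    """
    iops_at_limit = throughput_limit_mb_s * 1000 // io_size_kb
    iops = min(provisioned_iops, iops_at_limit)
    return iops, iops * io_size_kb / 1000


if __name__ == "__main__":
    for size_kb in (16, 32, 64, 256):
        print(size_kb, "KB ->", io1_effective(20_000, size_kb))
```

Under these assumptions a 20,000 PIOPS volume only sustains its full IOPS at 16 KB or smaller requests; at larger sizes the throughput limit becomes the constraint first.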
  79. 79. Performance: EBS-optimized. c4.large: dedicated to EBS 500 Mbps ~ 62.5 MB/s, 4,000 16K IOPS. c4.2xlarge: dedicated to EBS 1 Gbps ~ 125 MB/s, 8,000 16K IOPS. 2 TB GP2 volume on each: 6,000 IOPS, 160 MB/s max throughput
  80. 80. Performance: throughput workload. m4.16xlarge: 10 Gbps ~ 1,250 MB/s shared with EC2 instances, Internet, Databases, and S3; EBS-optimized 10 Gbps ~ 1,250 MB/s, 65,000 16K IOPS. 8 TB ST1 volume: 320 MB/s base, 500 MB/s burst
  81. 81. Performance: throughput workload. m4.16xlarge: 10 Gbps ~ 1,250 MB/s shared with EC2 instances, Internet, Databases, and S3; EBS-optimized 10 Gbps ~ 1,250 MB/s, 65,000 16K IOPS. RAID0 of 3 * 8 TB ST1 volumes: 960 MB/s base, 1,250 MB/s burst
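The figures on these performance slides follow from one rule: a striped set delivers the sum of its volumes' throughput, but never more than the instance's dedicated EBS bandwidth. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
from typing import Sequence


def striped_throughput_mb_s(volume_mb_s: Sequence[float],
                            instance_limit_mb_s: float) -> float:
    """Aggregate throughput of striped volumes, capped by the instance."""
    return min(sum(volume_mb_s), instance_limit_mb_s)


if __name__ == "__main__":
    # Three 8 TB st1 volumes on an instance with ~1,250 MB/s to EBS.
    print(striped_throughput_mb_s([320, 320, 320], 1250))  # base
    print(striped_throughput_mb_s([500, 500, 500], 1250))  # burst
    # One 2 TB gp2 volume (160 MB/s) on a c4.large (~62.5 MB/s to EBS).
    print(striped_throughput_mb_s([160], 62.5))
```

The same check explains the c4.large case: the volume could deliver 160 MB/s, but the instance's 62.5 MB/s EBS bandwidth is the bottleneck.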
  82. 82. Best practice: RAID When to RAID? • Storage requirement > 16 TB • Throughput requirement > 500 MB/s • IOPS requirement > 20,000 @ 16K
  83. 83. Best practice: RAID EBS volume Availability Zone AWS Region EC2 instance EBS volume RAID0RAID10 Replica Replica
  84. 84. Best practice: RAID Avoid RAID for redundancy • EBS data is already replicated • RAID5/6 loses 20 – 30% of usable I/O to parity • RAID1 halves available EBS bandwidth
  85. 85. EBS deep dive Performance Reliability
  86. 86. What about EC2 instance failure? Availability Zone AWS Region EBS volume EC2 instance Replica
  87. 87. What about EC2 instance failure? Availability Zone AWS Region EBS volume New EC2 instance Replica
  88. 88. EBS enables EC2 auto recovery. Amazon CloudWatch per-instance metric alarm: StatusCheckFailed_System. When alarm triggers? RECOVER instance. Instance retains: instance ID, instance metadata, private IP addresses, Elastic IP addresses, EBS volume attachments. * Supported on C3, C4, M3, M4, P2, R3, T2, and X1 instance types with EBS-only storage
  89. 89. What about EC2 instance termination? Availability Zone EBS volume EC2 instance DeleteOnTermination = True DeleteOnTermination = False AWS Region Replica
  90. 90. Best practice: taking snapshots from Linux Quiesce I/O 1. Database: FLUSH and LOCK tables 2. Filesystem: sync and fsfreeze 3. EBS: snapshot all volumes 4. When CreateSnapshot API returns success, it is safe to resume
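Steps 2 through 4 can be scripted. The sketch below is one possible arrangement, assuming `ec2_client` behaves like a boto3 EC2 client (its `create_snapshot` call returns a dictionary containing `SnapshotId`) and that the script runs with enough privilege to call `fsfreeze`. The function name and the injectable `run` parameter are illustrative choices, and the database flush-and-lock step is left out because it depends on the database in use.

```python
import subprocess
from typing import Callable, List, Sequence


def snapshot_with_freeze(ec2_client, mount_point: str,
                         volume_ids: Sequence[str],
                         run: Callable = subprocess.run) -> List[str]:
    """Quiesce a filesystem, snapshot its volumes, then resume I/O."""
    run(["sync"], check=True)                               # flush dirty pages
    run(["fsfreeze", "--freeze", mount_point], check=True)  # block new writes
    try:
        # Once create_snapshot returns, the point-in-time is captured
        # and it is safe to resume I/O.
        return [
            ec2_client.create_snapshot(
                VolumeId=volume_id,
                Description=f"quiesced snapshot of {volume_id}",
            )["SnapshotId"]
            for volume_id in volume_ids
        ]
    finally:
        # Always unfreeze, even if a snapshot call fails.
        run(["fsfreeze", "--unfreeze", mount_point], check=True)
```

With boto3 installed, a call might look like `snapshot_with_freeze(boto3.client("ec2"), "/data", ["vol-..."])`, where the mount point and volume IDs are placeholders. The `try`/`finally` matters: a frozen filesystem that is never unfrozen will hang every writer on the instance.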
  91. 91. Best practice: taking snapshots from Windows 1. sync equivalent available 2. Use Volume Shadow Copy Service- (VSS) aware utilities for backups 3. EBS: backups on dedicated volume for snapshots
  92. 92. Best practice: taking EBS snapshots from Windows EBS boot volume Windows EC2 instance EBS data volume EBS backup volume Windows Server Backup EBS snapshot
  93. 93. EBS volume initialization New EBS volume? New EBS volume from snapshot? • Attach and it’s ready to go • Initialize for best performance • Random read across volume
  94. 94. Best practice: EBS volume initialization Fio-based example: $ sudo yum install -y fio $ sudo fio --filename=/dev/xvdf --rw=randread --bs=128k --iodepth=32 --ioengine=libaio --direct=1 --name=volume-initialize
  95. 95. Best practice: automate snapshots Key ingredients: AWS Lambda Amazon EC2 Run command Tagging https://aws.amazon.com/ec2/run-command/
  96. 96. Best practice: automate snapshots Lambda scheduled event: daily snapshots 1. Search for instances tagged “Backup” 2. EC2 Run commands to quiesce file system 3. Snapshot attached volumes 4. Tag snapshots with expire date. Example EC2 instance tags: Backup, Retention 30 days
  97. 97. Best practice: automate snapshot expiration Lambda scheduled event: daily expire 1. Search for snapshots tagged to “Expire On” today 2. Delete expired snapshots. Example EBS snapshot tags: Backup, ExpireOn Date
  98. 98. Prototype for EBS Snapshot Custodian: https://github.com/dR0ski/lambda-ebs-snapshot-custodian
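The two expiration steps map onto two EC2 API calls. The following is a minimal sketch, not the prototype referenced in the deck: it assumes `ec2_client` behaves like a boto3 EC2 client, that the tag key is `ExpireOn`, and that the tag value is an ISO date such as `2016-12-30`. Pagination and error handling are omitted for brevity.

```python
import datetime
from typing import List, Optional


def delete_expired_snapshots(ec2_client,
                             today: Optional[datetime.date] = None) -> List[str]:
    """Delete this account's snapshots whose ExpireOn tag equals today."""
    today = today or datetime.date.today()
    # Step 1: search for snapshots tagged to expire today.
    response = ec2_client.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "tag:ExpireOn", "Values": [today.isoformat()]}],
    )
    # Step 2: delete each expired snapshot.
    deleted = []
    for snapshot in response["Snapshots"]:
        ec2_client.delete_snapshot(SnapshotId=snapshot["SnapshotId"])
        deleted.append(snapshot["SnapshotId"])
    return deleted
```

Because this matches only snapshots that expire exactly today, a missed daily run would leave older snapshots behind; a production version would likely compare dates rather than match on equality.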
  99. 99. EBS best practices Performance Reliability Security
  100. 100. Best practice: encryption EBS encryption: data volumes
  101. 101. Best practice: encryption Create a new AWS KMS master key for EBS • Define key rotation policy • Enable AWS CloudTrail auditing • Control who can use key • Control who can administer key
  102. 102. Best practice: encryption EBS encryption: data volumes
  103. 103. How does EBS encryption work? Envelope encryption: EBS master key in KMS; Data key 1 for EBS volume 1, Data key 2 for EBS volume 2, Data key 3 for EBS volume 3
  104. 104. How does EBS encryption work? EBS volume 1 EBS master key AWS KMS Envelope encryption EBS volume 2 EBS volume 3
  105. 105. How does EBS encryption work? EBS volume 1 EBS master key KMS Envelope encryption • Limits exposure risk • Performance • Simplifies key management EBS volume 2 EBS volume 3
  106. 106. Summary Use encryption if you need it Take snapshots, tag snapshots Select the right instance for your workload Select the right volume for your workload
  107. 107. Thank you!
  108. 108. Remember to complete your evaluations!
