Amazon Elastic Block Store (Amazon EBS) provides persistent block-level storage volumes for use with Amazon EC2 instances. In this technical session, we present the differences between the types of Amazon EBS block storage so that you can understand which storage type to use for each of your application deployments. We discuss how to maximize Amazon EBS performance, with a special eye toward low-latency and high-throughput applications. We discuss Amazon EBS encryption and share best practices for Amazon EBS snapshot management. Throughout, we share tips for success.
Learning Objectives:
• Learn about the latest updates to EBS
• Learn about best practices for using EBS
Who Should Attend:
• Application admins, DBAs, database and big data architects
3. What is Amazon EBS?
EBS volumes are used for:
• Boot volumes
• SQL DBs
• NoSQL DBs
• Big Data workloads
• Data warehouses
• Log processing
[Diagram: an EBS volume attached to an EC2 instance]
4. What is Amazon EBS?
[Diagram: an EBS volume and EC2 instances inside an Availability Zone within an AWS region]
• Availability Zone specific
• Persists independently of the EC2 instance
5. What is Amazon EBS?
[Diagram: an EC2 instance with one EBS boot volume and two EBS data volumes, inside an Availability Zone within an AWS region]
6. What is Amazon EBS?
[Diagram: an AWS region with two Availability Zones; an EBS volume and its replica]
7. What is Amazon EBS?
EBS is designed for:
• 99.999% availability
• 0.1% to 0.2% annual failure rate (AFR)
• 20x better than an off-the-shelf disk drive, which has an AFR of about 4%
• Make sure you use a data protection strategy
8. What is an Amazon EBS Snapshot?
[Diagram: an EBS volume in one Availability Zone, with an EBS snapshot of it stored in Amazon S3 within the AWS region]
9. What can you do with a snapshot?
[Diagram: an EBS volume in one Availability Zone, its EBS snapshot in Amazon S3, and a second EBS volume in another Availability Zone of the same AWS region]
10. What can you do with a snapshot?
[Diagram: an EBS volume and its EBS snapshot in Amazon S3 in one AWS region; an EBS snapshot and an EBS volume in a second AWS region]
11. What can you do with a snapshot?
Public data sets available as EBS snapshots:
• Genomic
• Census
• Global weather
• Transportation
https://aws.amazon.com/public-data-sets/
[Diagram: an EBS volume in an Availability Zone within an AWS region]
12. What can you do with a snapshot?
[Diagram: an EBS snapshot, an AMI, an EC2 instance, and an EBS volume in an Availability Zone within an AWS region]
13. What is an EBS-Optimized instance?
[Diagram: an EBS-Optimized EC2 instance and an EBS volume in an Availability Zone within an AWS region]
14. What is an EBS-Optimized instance?
[Diagram: a c3.2xlarge instance connecting to EBS, S3, databases, the Internet, and other EC2 instances; labels: ~125 MB/s, Shared]
15. What is an EBS-Optimized instance?
[Diagram: a c3.2xlarge instance with its connection to EBS shown separately from its connection to S3, databases, the Internet, and other EC2 instances; labels: ~125 MB/s, Shared]
• C4 and newer instance families are EBS-Optimized by default
• http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html
16. What is EC2 instance local store?
[Diagram: an EC2 instance and an EBS volume in an Availability Zone within an AWS region]
EC2 local storage:
• Directly attached
• Data is not persistent
• No snapshot support
• No native encryption
• Useful for stateless applications
17. What is EBS Encryption?
• Attach both encrypted and unencrypted volumes
• No volume performance impact
• Any current generation instance
• Supported by all EBS volume types
• Snapshots also encrypted
• No extra cost
• Boot and data volumes can be encrypted
19. A few definitions…
IOPS: Input/output operations per second (#)
Throughput: Read/write rate to storage (MB/s)
Latency: Delay between request and completion (ms)
Capacity: Volume of data that can be stored (GB)
I/O size: Size of each I/O to disk (KB)
22. EBS Volume Types: General Purpose SSD (gp2)
Throughput: 160 MB/s
Latency: Single-digit ms
Capacity: 1 GB to 16 TB
Baseline: 3 IOPS per GB up to 10,000
Burst: 3,000 IOPS (for volumes up to 1 TB)
Great for boot volumes, low-latency applications, and bursty databases
24. Burst Bucket: General Purpose SSD (gp2)
• Max I/O credit per bucket is 5.4M
• You can spend up to 3,000 I/O credits per second (3,000 IOPS)
• Always accumulating 3 I/O credits per GB per second
• Baseline performance = 3 IOPS per GB, with a minimum of 100 IOPS
25. How long can I burst?
[Chart: minutes of burst (y-axis, 0 to 700) versus volume size in GB (x-axis, 1 to 950), with reference marks at 1 hour and 10 hours]
A 300 GB volume can burst at 3,000 IOPS for 43 minutes.
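The 43-minute figure can be reproduced from the burst-bucket numbers on the previous slide. The sketch below is only an illustration: it assumes the bucket starts full, that the volume is driven at a constant 3,000 IOPS, and that credits keep refilling at the baseline rate while bursting. The function names are made up for this example.

```python
import math

MAX_CREDITS = 5_400_000    # maximum I/O credits in the bucket (from the slides)
BURST_IOPS = 3_000         # maximum rate at which credits can be spent
MIN_BASELINE_IOPS = 100    # floor on gp2 baseline performance
MAX_BASELINE_IOPS = 10_000 # ceiling on gp2 baseline performance


def gp2_baseline_iops(size_gb):
    """Baseline IOPS: 3 per GB, clamped to the 100..10,000 range."""
    return min(MAX_BASELINE_IOPS, max(MIN_BASELINE_IOPS, 3 * size_gb))


def gp2_burst_minutes(size_gb):
    """Minutes a full bucket lasts at a sustained 3,000 IOPS."""
    baseline = gp2_baseline_iops(size_gb)
    if baseline >= BURST_IOPS:
        return math.inf  # baseline already meets the burst rate
    # Credits drain at the burst rate and refill at the baseline rate.
    drain_per_second = BURST_IOPS - baseline
    return MAX_CREDITS / drain_per_second / 60


print(round(gp2_burst_minutes(300)))  # 43
```

For a 300 GB volume the baseline is 900 IOPS, so the bucket drains at 2,100 credits per second and 5.4M credits last about 2,571 seconds.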
26. EBS Volume Types: Provisioned IOPS SSD (io1)
Baseline: 100 to 20,000 IOPS
Throughput: 320 MB/s
Latency: Single-digit ms
Capacity: 4 GB to 16 TB
Ideal for critical applications and databases with sustained IOPS
27. Scaling Provisioned IOPS SSD (io1)
[Chart: available provisioned IOPS (y-axis, up to 20,000) versus volume size in TB (x-axis, 0 to 16); the maximum provisioned IOPS follows a maximum IOPS:GB ratio of 50:1 and reaches 20,000 IOPS at 400 GB]
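The 50:1 ratio and the 20,000 IOPS ceiling are enough to work out how many IOPS a given io1 volume can be provisioned with. The following is a minimal sketch under those two limits only; the function names are hypothetical.

```python
import math

MAX_IOPS_PER_GB = 50          # maximum IOPS:GB ratio from the slide
MAX_IOPS_PER_VOLUME = 20_000  # io1 per-volume ceiling from the slide


def io1_max_provisioned_iops(size_gb):
    """Largest IOPS value that can be provisioned on a volume of this size."""
    return min(MAX_IOPS_PER_VOLUME, MAX_IOPS_PER_GB * size_gb)


def io1_min_size_gb(iops):
    """Smallest volume size (GB) that allows this many provisioned IOPS."""
    if iops > MAX_IOPS_PER_VOLUME:
        raise ValueError("exceeds the per-volume io1 limit")
    return math.ceil(iops / MAX_IOPS_PER_GB)


print(io1_max_provisioned_iops(100))  # 5000
print(io1_min_size_gb(20_000))        # 400
```

This matches the chart: the curve rises at 50 IOPS per GB and flattens at 400 GB.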
28. GP2/IO1: Common use cases
• Cassandra
• https://d0.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf
• MongoDB
• https://aws.amazon.com/blogs/aws/mongodb-on-the-aws-cloud-new-quick-start-reference-deployment/
29. EBS Volume Type: Throughput Optimized HDD (st1)
Baseline: 40 MB/s per TB up to 500 MB/s
Capacity: 500 GB to 16 TB
Burst: 250 MB/s per TB up to 500 MB/s
Ideal for large block, high throughput sequential workloads
30. Throughput Optimized HDD – Burst and Base
[Chart: st1 burst and base throughput in MB/s (y-axis, 0 to 600) versus volume size in TB (x-axis, 0.5 to 16); marked values: 320 MB/s and 500 MB/s]
31. Burst Bucket: Throughput Optimized HDD (st1)
• Max I/O bucket credit is 1 TB of credit per TB in volume
• You can spend up to 250 MB/s per TB
• Always accumulating 40 MB/s per TB
• Baseline performance = 40 MB/s per TB
32. EBS Volume Types: Cold HDD (sc1)
Baseline: 12 MB/s per TB up to 192 MB/s
Capacity: 500 GB to 16 TB
Burst: 80 MB/s per TB up to 250 MB/s
Ideal for sequential throughput workloads such as logging and backup
33. Cold HDD – Burst and Base
[Chart: sc1 burst and base throughput in MB/s (y-axis, 0 to 300) versus volume size in TB (x-axis, 0.5 to 16); marked value: 192 MB/s]
34. Burst Bucket: Cold HDD (sc1)
• Max I/O bucket credit is 1 TB of credit per TB in volume
• You can spend up to 80 MB/s per TB
• Always accumulating 12 MB/s per TB
• Baseline performance = 12 MB/s per TB
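The per-TB rates and caps quoted for st1 and sc1 can be turned into a small lookup. This sketch uses only the figures from the slides (st1: 40 MB/s per TB base and 250 MB/s per TB burst, both capped at 500 MB/s; sc1: 12 MB/s per TB base capped at 192 MB/s, 80 MB/s per TB burst capped at 250 MB/s). The table and function names are illustrative, and current AWS limits may differ.

```python
# Per-TB rates and caps in MB/s, as quoted on the st1 and sc1 slides.
HDD_LIMITS = {
    "st1": {"base_per_tb": 40, "base_cap": 500, "burst_per_tb": 250, "burst_cap": 500},
    "sc1": {"base_per_tb": 12, "base_cap": 192, "burst_per_tb": 80, "burst_cap": 250},
}


def hdd_throughput(volume_type, size_tb):
    """Return (baseline, burst) throughput in MB/s for an st1 or sc1 volume."""
    limits = HDD_LIMITS[volume_type]
    base = min(limits["base_per_tb"] * size_tb, limits["base_cap"])
    burst = min(limits["burst_per_tb"] * size_tb, limits["burst_cap"])
    return base, burst


print(hdd_throughput("st1", 2))   # (80, 500)
print(hdd_throughput("sc1", 16))  # (192, 250)
```

Under these numbers a 2 TB st1 volume already reaches the 500 MB/s burst cap, while an sc1 volume needs the full 16 TB to reach its 192 MB/s baseline.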
35. ST1/SC1: Common use cases
• Localytics created a petabyte-scale data warehouse (Vertica)
• Confluent wrote about deploying Apache Kafka on these volume types
• Splunk wrote about using these volume types for colder data
https://aws.amazon.com/blogs/aws/amazon-ebs-update-new-cold-storage-and-throughput-options/
36. IO Provisioned Volumes vs. Throughput Provisioned Volumes
IO Provisioned Volumes:
• gp2: $0.10 per GB
• io1: $0.125 per GB, plus $0.065 per PIOPS
Throughput Provisioned Volumes:
• st1: $0.045 per GB
• sc1: $0.025 per GB
* All prices are per month and from the us-west-2 region as of September 2016
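A rough monthly storage estimate follows from these prices. The sketch below reads the figures as gp2 $0.10, io1 $0.125 (plus $0.065 per provisioned IOPS), st1 $0.045, and sc1 $0.025 per GB per month, for us-west-2 as of September 2016. It ignores snapshot storage and any other charges, and the function name is hypothetical.

```python
# Per-GB monthly prices (us-west-2, September 2016), as read from the slide.
PRICE_PER_GB = {"gp2": 0.10, "io1": 0.125, "st1": 0.045, "sc1": 0.025}
PRICE_PER_PIOPS = 0.065  # io1 only: price per provisioned IOPS per month


def monthly_cost(volume_type, size_gb, provisioned_iops=0):
    """Estimated monthly cost in USD for a single volume."""
    cost = PRICE_PER_GB[volume_type] * size_gb
    if volume_type == "io1":
        cost += PRICE_PER_PIOPS * provisioned_iops
    return cost


print(f"{monthly_cost('gp2', 100):.2f}")        # 10.00
print(f"{monthly_cost('io1', 100, 1000):.2f}")  # 77.50
```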
38. How is performance measured with EBS?
• GP2 and IO1
• Performance measured in I/Os per second (IOPS)
• Max size of an I/O is 256 KB
• Ideal for Random I/Os, low latencies
• ST1 and SC1
• Performance measured in Megabytes per second (MB/s)
• Max size of an I/O is 1 MB
• Optimized for large sequential I/Os, high throughput
39. How do we count I/Os for GP2 and IO1?
When possible, we merge sequential I/Os (up to 256 KB in size) to minimize I/O charges on IO1 and maximize burst on GP2.
40. How do we count I/Os for GP2 and IO1?
Example 1: Random I/Os
• 4 random I/Os (i.e., non-sequential I/Os)
• Each I/O 64 KB
• I/O size limit: up to 256 KB
• Counted as 4 I/Os
41. How do we count I/Os for GP2 and IO1?
Example 2: Random I/Os
• 4 random I/Os (i.e., non-sequential I/Os)
• Each I/O 256 KB
• I/O size limit: up to 256 KB
• Counted as 4 I/Os
42. How do we count I/Os for GP2 and IO1?
Example 3: Sequential I/O
• 4 sequential I/Os
• Each I/O 64 KB
• I/O size limit: up to 256 KB
• Counted as 1 I/O
43. How do we count I/Os for GP2 and IO1?
Example 4: Sequential I/O
• 4 sequential I/Os
• Each I/O 256 KB
• I/O size limit: up to 256 KB
• Counted as 4 I/Os
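The four examples above follow one rule: consecutive I/Os are merged while they are contiguous and the merged size stays within the limit. The sketch below is a simplified model of that rule, not the actual EBS accounting logic. It assumes each request is described by a hypothetical (offset_kb, size_kb) pair and that no single request exceeds the limit.

```python
def count_ios(requests, max_io_kb=256):
    """Count billable I/Os for a list of (offset_kb, size_kb) requests.

    A request is merged into the previous one only if it starts exactly
    where the previous one ended and the merged size stays within max_io_kb.
    """
    counted = 0
    run_end = None  # offset at which the current merged I/O ends
    run_size = 0    # size of the current merged I/O in KB
    for offset, size in requests:
        if run_end is not None and offset == run_end and run_size + size <= max_io_kb:
            run_size += size  # contiguous and still fits: merge
            run_end += size
        else:
            counted += 1      # start a new counted I/O
            run_size = size
            run_end = offset + size
    return counted


random_64 = [(0, 64), (9000, 64), (300, 64), (70000, 64)]
sequential_64 = [(0, 64), (64, 64), (128, 64), (192, 64)]
sequential_256 = [(0, 256), (256, 256), (512, 256), (768, 256)]

print(count_ios(random_64))       # 4
print(count_ios(sequential_64))   # 1
print(count_ios(sequential_256))  # 4
```

Passing `max_io_kb=1024` models the 1 MB limit quoted for ST1 and SC1, in which case the four sequential 256 KB requests count as one I/O.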
44. How do we count I/Os for ST1 and SC1?
• When possible, we merge sequential I/Os (up to 1 MB in size)
• Workloads with primarily large, sequential I/Os perform best on ST1 and SC1
• Ex: Big Data/MapReduce, Hadoop, Kafka, Log Processing, Data Warehouses
45. How do we count I/Os for ST1 and SC1?
Example 1: Small, Random I/Os
• 4 random I/Os
• Each I/O 64 KB
• I/O size limit: up to 1024 KB
• Counted as 4 I/Os or 4 MB/s of burst
46. How do we count I/Os for ST1 and SC1?
Example 2: Larger Random I/Os
• 4 random I/Os
• Each I/O 256 KB
• I/O size limit: up to 1024 KB
• Counted as 4 I/Os or 4 MB/s
47. How do we count I/Os for ST1 and SC1?
Example 3: Large, Sequential I/Os
• 4 sequential I/Os
• Each I/O 64 KB in size
• I/O size limit: up to 1024 KB
• Counted as 1 I/O or 1 MB/s
• Note: very small I/Os can’t be merged with perfect efficiency
48. How do we count I/Os for ST1 and SC1?
Example 4: Small Sequential I/Os
• 4 sequential I/Os
• Each I/O 256 KB in size
• I/O size limit: up to 1024 KB
• Counted as 1 I/O or 1 MB/s
51. How does EBS encryption work?
• Data encrypted at rest and in flight
• Encryption and decryption of EBS data happen in the memory of the server hosting your EC2 instances
• The data key is NEVER saved to disk in plain text
• Resources encrypted with a Customer Master Key (CMK)
• Two types of CMKs
• Default Key (per account per region)
• Custom Key
52. How does EBS encryption work?
[Diagram: an EBS master key in AWS KMS and data keys 1, 2, and 3, one for each of EBS volumes 1, 2, and 3]
Envelope encryption:
• Limits exposure risk
• Performance
• Simplifies key management
57. Use Multiple Availability Zones for Higher Availability
[Diagram: an AWS region with two Availability Zones, each containing an EBS volume]
• EC2 and EBS operate independently in each Availability Zone
58. Use EC2 Autorecovery
• Instance status check fails? REBOOT
• System status check fails? RECOVER
Instance retains:
• Instance ID
• Instance metadata
• Private IP addresses
• Elastic IP addresses
• EBS volume attachments
• Limited to C3, C4, M3, M4, R3, and T2 instance types with EBS-only storage
• https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html
60. Take Snapshots
Quiesce I/O
1. Database: FLUSH and LOCK tables
2. Filesystem: sync and fsfreeze
3. EBS: snapshot all volumes
• When the CreateSnapshot API returns success, it is safe to resume I/O
• A snapshot of an entire 16 TB volume is designed to take no longer than the time it takes to snapshot a 1 TB volume
61. Automate Snapshot Creation and Retention
AWS Lambda scheduled event: daily snapshots
1. Search for instances tagged “Backup”
2. Use EC2 Run Command to fsfreeze
3. Snapshot all attached volumes
4. Tag snapshots with expire date
Backup retention: 30 days
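The retention half of this workflow reduces to two small pieces of logic: computing the expiry tag when a snapshot is created, and later selecting the snapshots whose expiry date has passed. The sketch below shows only that logic and makes no AWS calls. The tag key "ExpireDate" and the function names are assumptions for illustration; the snapshot dictionaries use "SnapshotId" and "Tags" fields in the shape that the EC2 DescribeSnapshots API returns.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30        # backup retention from the slide
EXPIRE_TAG = "ExpireDate"  # hypothetical tag key, not defined by the slides


def expire_tag(created, retention_days=RETENTION_DAYS):
    """Build the tag to attach to a snapshot created on the given date."""
    expires = created + timedelta(days=retention_days)
    return {"Key": EXPIRE_TAG, "Value": expires.isoformat()}


def expired_snapshot_ids(snapshots, today):
    """Return IDs of snapshots whose expiry tag is on or before today."""
    expired = []
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        expires = tags.get(EXPIRE_TAG)
        if expires is not None and date.fromisoformat(expires) <= today:
            expired.append(snap["SnapshotId"])
    return expired


snapshots = [
    {"SnapshotId": "snap-old", "Tags": [expire_tag(date(2016, 8, 1))]},
    {"SnapshotId": "snap-new", "Tags": [expire_tag(date(2016, 9, 20))]},
    {"SnapshotId": "snap-untagged", "Tags": []},
]
print(expired_snapshot_ids(snapshots, date(2016, 9, 30)))  # ['snap-old']
```

Untagged snapshots are deliberately left alone, so the cleanup step only ever deletes snapshots that the backup job itself created.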
62. Avoid RAIDing for redundancy
• RAID1 halves available EBS bandwidth
• RAID5/6 loses 20 – 30% of usable I/O to parity
• EBS data is already replicated
64. Use RAID 0 for performance
Use RAID 0 when your:
• Storage requirement > 16 TB
• Throughput requirement > 500 MB/s
• IOPS requirement > 20,000 @ 16K
Use a stripe size of 128 KB or 256 KB to maintain a sufficient queue depth when driving high IOPS.
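One way to read these thresholds is as per-volume ceilings: once a requirement exceeds what a single volume offers, the stripe needs enough volumes to cover it. The sketch below applies that reading. It treats 16 TB, 500 MB/s, and 20,000 IOPS as one combined set of limits even though they come from different volume types, and it ignores instance-level bandwidth limits, so the result is a rough lower bound only. The function name is hypothetical.

```python
import math

MAX_VOLUME_TB = 16        # largest single volume
MAX_VOLUME_MBPS = 500     # highest single-volume throughput quoted in the slides
MAX_VOLUME_IOPS = 20_000  # highest single-volume IOPS quoted in the slides


def raid0_volumes_needed(capacity_tb, throughput_mbps, iops):
    """Minimum number of striped volumes that covers all three requirements."""
    return max(
        1,
        math.ceil(capacity_tb / MAX_VOLUME_TB),
        math.ceil(throughput_mbps / MAX_VOLUME_MBPS),
        math.ceil(iops / MAX_VOLUME_IOPS),
    )


print(raid0_volumes_needed(10, 400, 15_000))  # 1: no striping needed
print(raid0_volumes_needed(20, 400, 15_000))  # 2: capacity exceeds 16 TB
print(raid0_volumes_needed(4, 1200, 50_000))  # 3: throughput and IOPS both need 3
```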
66. Understand workload I/O patterns
• iostat for Linux (request size reported in sectors: 256 sectors x 512 bytes/sector = 128 KiB)
• perfmon for Windows
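The sector arithmetic above generalizes to any average request size that a tool reports in 512-byte sectors. A minimal helper, with a hypothetical name, might look like this:

```python
SECTOR_BYTES = 512  # sector size used in the slide's arithmetic


def request_size_kib(avg_request_sectors):
    """Convert an average request size in 512-byte sectors to KiB."""
    return avg_request_sectors * SECTOR_BYTES / 1024


print(request_size_kib(256))  # 128.0
print(request_size_kib(88))   # 44.0
```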
67. Use a Modern Linux Kernel
• I/O size stuck around 44 KiB in the CloudWatch console?
• Upgrade your kernel to at least 3.8
68. For ST1/SC1 increase the read-ahead buffer
• For throughput heavy workloads (e.g. Log processing)
• Configure the read-ahead setting to 1 MB
• This is a runtime, per-volume configuration
• This may degrade performance if you are accessing the disk with small random I/Os
• https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
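On Linux, read-ahead is commonly set with `blockdev --setra`, which takes its value in 512-byte sectors, so 1 MB (taken here as 1 MiB) corresponds to 2048 sectors. The sketch below only builds the command and does not run it. The device name `/dev/xvdf` is a placeholder, and the helper names are made up for this example.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_sectors(readahead_bytes):
    """Convert a read-ahead size in bytes to 512-byte sectors."""
    if readahead_bytes % SECTOR_BYTES:
        raise ValueError("read-ahead must be a multiple of 512 bytes")
    return readahead_bytes // SECTOR_BYTES


def setra_command(device, readahead_bytes=1024 * 1024):
    """Build the blockdev command that sets read-ahead for one device."""
    return ["blockdev", "--setra", str(readahead_sectors(readahead_bytes)), device]


# /dev/xvdf is a placeholder device name.
print(" ".join(setra_command("/dev/xvdf")))  # blockdev --setra 2048 /dev/xvdf
```

Because this is a runtime setting, the command would have to be reapplied after a reboot, for example from a boot script.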
70. Create an AWS KMS master key for EBS
• Control who can use key
• Control who can administer key
• Resources encrypted with a default key cannot be shared
• Enable AWS CloudTrail for auditing
71. Initialize volumes for performance-sensitive apps
New EBS volume?
• Attach and it's ready to go
New EBS volume from snapshot?
• Initialize for best performance
• Random read across volume
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html
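Initialization amounts to reading every block of the restored volume once. The sketch below does this with a plain sequential pass rather than random reads, which is a simplification chosen for brevity. It would need read permission on the block device, and the path shown in the comment is a placeholder. Dedicated tools such as `dd` or `fio` are the usual choice in practice.

```python
def read_all_blocks(path, block_size=1024 * 1024):
    """Read a file or block device end to end and return the bytes read."""
    total = 0
    # buffering=0 gives an unbuffered raw stream so each read hits the device.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(block_size)
            if not chunk:
                break  # end of device
            total += len(chunk)
    return total


# Example (placeholder device name, requires root):
# read_all_blocks("/dev/xvdf")
```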
73. Summary
• Use encryption if you need it
• Take snapshots, tag snapshots
• Select the right instance for your workload
• Select the right volume for your workload