6. What is EBS?
• Network block storage
• Designed for five nines of availability
• Attaches to Amazon EC2 within the same
Availability Zone
• Provides point-in-time snapshots to
Amazon S3
7. More about EBS
• It’s a service!
• It’s independent of EC2
• It has regional and AZ availability goals
– All EBS volumes are designed for 99.999% availability
• Over 1 million volumes are created per day
(average)
• Over 3.3 trillion I/Os per day
10. EBS
• General Purpose (SSD): up to 16 TB, 10,000 IOPS, up to 160 MB/s
• Provisioned IOPS (SSD): up to 16 TB, 20,000 IOPS, up to 320 MB/s
11. A few definitions…
• IOPS: Input/output operations per second (#)
• Throughput: Read/write rate to storage (MiB/s)
• Latency: Delay between request and completion (ms)
• Capacity: Volume of data that can be stored (GiB)
• Block size: Size of each I/O (KiB)
12. EBS volume types
• General Purpose (SSD)
• Provisioned IOPS (SSD)
• Magnetic
When performance matters, use SSD-backed volumes
13. EBS SSD volumes
• Applies to both General Purpose and
Provisioned IOPS
• IOPS measured up to 256 KiB
• Single-digit ms latency
• Designed for 99.999% availability
14. EBS General Purpose volumes (SSD)
• New default volume type for EBS
• Every volume can burst up to 3,000 IOPS
– Larger volumes can burst for longer periods
• 3 IOPS per GB baseline performance, maximum of 10,000 IOPS
• 99% performance consistency
• Up to 160 MB/s throughput
15. Understanding General Purpose (SSD) bursting
• Baseline performance = 3 IOPS per GB
(1) The I/O credit bucket is always accumulating 3 I/O credits per GB per second
(2) Max I/O credit per bucket is 5.4M
(3) You can spend up to 3,000 I/O credits per second (3,000 IOPS)
16. General Purpose (SSD) volumes example
Microsoft Windows 30 GB boot volume:
• Gets initial I/O credit of 5.4M
• Could burst for up to 30 mins @ 3,000 IOPS
• Always accumulating 90 I/O credits per
second
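The credit arithmetic behind this example can be sketched in a few lines. This is a minimal model, assuming the bucket drains at the burst rate while refilling at the baseline rate; the constants come from the slides, while the function names are illustrative and not part of any AWS API.

```python
# Constants taken from the slides.
BURST_IOPS = 3_000          # maximum burst rate
MAX_CREDITS = 5_400_000     # maximum I/O credits per bucket
BASELINE_IOPS_PER_GB = 3    # credits earned per GB per second
MAX_BASELINE_IOPS = 10_000  # General Purpose (SSD) ceiling


def baseline_iops(size_gb: float) -> float:
    """Baseline IOPS (also the credit refill rate) for a volume."""
    return min(size_gb * BASELINE_IOPS_PER_GB, MAX_BASELINE_IOPS)


def burst_seconds(size_gb: float, credits: float = MAX_CREDITS) -> float:
    """Seconds a volume can sustain BURST_IOPS starting from `credits`.

    Assumption: credits drain at the burst rate and refill at the
    baseline rate, so the net drain is (BURST_IOPS - baseline).
    """
    drain = BURST_IOPS - baseline_iops(size_gb)
    if drain <= 0:
        return float("inf")  # baseline already meets the burst rate
    return credits / drain


print(baseline_iops(30))                 # 90 credits per second
print(round(burst_seconds(30) / 60, 1))  # about 30.9 minutes
```

For the 30 GB boot volume this gives roughly 31 minutes at 3,000 IOPS, in line with the "up to 30 mins" figure on the slide.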
17. Instance boot time (m3.medium, US-EAST-1)
Volume type   Boot time   Access time   OS
GP2           3:31        4:33          Windows Server 2012
Magnetic      4:30        7:16          Windows Server 2012
GP2           0:36        0:45          CentOS 6
Magnetic      0:57        1:16          CentOS 6
40% reduction in boot times by using General Purpose SSD
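As a quick check on the measurements above, the per-row reductions can be computed directly from the mm:ss values. This is a small sketch; the helper names are illustrative, and the timings are copied from the slide.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def reduction_pct(gp2: str, magnetic: str) -> float:
    """Percentage reduction of the GP2 time relative to Magnetic."""
    return round(100 * (1 - to_seconds(gp2) / to_seconds(magnetic)), 1)


# (GP2 time, Magnetic time) pairs from the slide.
MEASUREMENTS = {
    "Windows Server 2012 boot": ("3:31", "4:30"),
    "Windows Server 2012 access": ("4:33", "7:16"),
    "CentOS 6 boot": ("0:36", "0:57"),
    "CentOS 6 access": ("0:45", "1:16"),
}

for name, (gp2, magnetic) in MEASUREMENTS.items():
    print(f"{name}: {reduction_pct(gp2, magnetic)}% faster on GP2")
```

On these four measurements the reduction ranges from about 22% (Windows boot) to about 41% (CentOS access).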
19. Database volume
• 1 TB PIOPS volume with 4,000 IOPS = $526.40 per month per volume
• GP2 1 TB volume with 3,000 IOPS = $102.40
• 2 x 500 GB GP2 volumes at 3,000 IOPS, burst to 6,000 IOPS = $102.40
80% cost savings, 50% more peak I/O with General Purpose SSD
20. EBS PIOPS (SSD) volumes
• Best for I/O intensive databases that require highest
consistency
• Throughput up to 320 MB/s
• Provision up to 20,000 IOPS per volume
(supports IOPS:GB ratio of 30)
• Designed for 99.9% performance consistency
21. EBS Magnetic volumes
• Best for cold workloads (rarely accessed data that needs
always-on access)
• IOPS: ~100 IOPS steady-state, with best-effort bursts
• Throughput: variable by workload, best effort to 10s of MB/s
• Latency: Varies, reads typically ~20-40 ms, writes typically
~2-10 ms
22. EBS price performance
                        Magnetic                          General Purpose                               Provisioned IOPS
Use cases               Infrequent data access            Boot volumes; small to med DBs; dev and test  I/O intensive; relational DBs; NoSQL DBs
Storage media           Magnetic disk-backed              SSD-backed                                    SSD-backed
Max IOPS                40–200 IOPS                       10,000 IOPS                                   20,000 IOPS
Latency (random read)   20–40 ms                          1–2 ms                                        1–2 ms
Availability            Designed for 99.999%              Designed for 99.999%                          Designed for 99.999%
Price                   $.05/GB-month + $.05/million I/O  $.10/GB-month                                 $.125/GB-month + $.065/provisioned IOPS
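The price rows above translate into simple monthly-cost formulas. The sketch below uses only the prices listed on this slide, which may not match figures quoted on other slides or current pricing. The function names and the example volume sizes are hypothetical.

```python
# Prices as listed on this slide (USD per month).
MAGNETIC_PER_GB = 0.05
MAGNETIC_PER_MILLION_IO = 0.05
GP2_PER_GB = 0.10
PIOPS_PER_GB = 0.125
PIOPS_PER_PROVISIONED_IOPS = 0.065


def magnetic_monthly(size_gb: float, million_ios: float) -> float:
    """Magnetic: storage plus a charge per million I/O requests."""
    return round(size_gb * MAGNETIC_PER_GB
                 + million_ios * MAGNETIC_PER_MILLION_IO, 2)


def gp2_monthly(size_gb: float) -> float:
    """General Purpose (SSD): storage only, no separate I/O charge."""
    return round(size_gb * GP2_PER_GB, 2)


def piops_monthly(size_gb: float, provisioned_iops: int) -> float:
    """Provisioned IOPS (SSD): storage plus a charge per provisioned IOPS."""
    return round(size_gb * PIOPS_PER_GB
                 + provisioned_iops * PIOPS_PER_PROVISIONED_IOPS, 2)


print(gp2_monthly(1024))         # 102.4
print(piops_monthly(500, 2000))  # 192.5
print(magnetic_monthly(100, 10)) # 5.5
```

With 1 TB taken as 1,024 GB, the General Purpose result reproduces the $102.40 figure quoted for the 1 TB GP2 database volume.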
24. Performance optimization is measured by:
• IOPS: Read/write I/O rate (IOPS)
• Latency: Time between I/O submission and completion (ms)
• Throughput: Read/write transfer rate (MB/s); throughput = IOPS x I/O size
25. Four key components of performance optimization
1. EC2 instance
2. I/O
3. Network link
4. EBS
26. Tools available for performance tuning:
1. EC2 instance: Network bandwidth (Mbps)
2. EBS-optimized instance: EC2 instance option (On/Off)
3. Workload: Block size, read/write ratio, serialization
4. Queue depth: The number of outstanding I/Os
5. RAID: Stripe volumes to maximize performance
6. Pre-warming: Eliminate first-touch penalty
27. 1. EC2 instance
Select the EC2 instance that has the right network, RAM, and CPU resources for your applications
• Compute-optimized – C3/C4
• Memory-optimized – R3
• General-purpose – M3
28. 2. EBS-optimized instance
Most instance families support the EBS-optimized flag
EBS-optimized instances now support up to 4 Gbps
• c4.8xlarge, d2.8xlarge
• Drive 32,000 16K IOPS or 500 MB/s
Other *.8xlarge instances support 10 Gbps network
• Max IOPS per node supported is ~48,000 IOPS @ 16K
30. 3. Workload
I/O size:
• 4 KB to 64 MB
I/O pattern:
• Sequential and random
I/O type:
• Read and write
I/O concurrency:
• Number of concurrent I/O
EBS SSD-backed volumes measure I/O size up to 256 KiB
EBS SSD-backed volumes deliver same performance for read and write
31. EBS IOPS and throughput limits
PIOPS volume: 20,000 IOPS, 320 MB/s throughput
• You can achieve 20,000 IOPS when driving smaller I/O operations
• You can achieve up to 320 MB/s when driving larger I/O operations
32. EBS IOPS and throughput limits
PIOPS volume: 8,000 IOPS, 320 MB/s throughput
• 8,000 x 8 KB = 64 MB/s
• 8,000 x 16 KB = 128 MB/s
• 8,000 x 32 KB = 256 MB/s
• 8,000 x 64 KB = 512 MB/s (exceeds the 320 MB/s throughput limit)
• 5,000 x 64 KB = 320 MB/s
• 1,250 x 256 KB = 320 MB/s
• 16,000 x 8 KB = 128 MB/s (exceeds the 8,000 provisioned IOPS)
33. Block (I/O) size determines whether your
application is IOPS bound or throughput bound
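The interplay between the IOPS limit and the throughput limit can be sketched as a small calculation. This is an illustrative model, assuming 1 MB = 1,000 KB (as the slide arithmetic does) and a hard cap on both limits; the function names are hypothetical.

```python
def effective_rates(provisioned_iops: float,
                    throughput_limit_mb_s: float,
                    io_size_kb: float) -> tuple[float, float]:
    """Return (achievable IOPS, achievable MB/s) for one volume.

    The volume delivers the provisioned IOPS unless that would exceed
    the throughput limit at the given I/O size.
    """
    iops_cap_from_throughput = throughput_limit_mb_s * 1000 / io_size_kb
    iops = min(provisioned_iops, iops_cap_from_throughput)
    return iops, iops * io_size_kb / 1000


def limiting_factor(provisioned_iops: float,
                    throughput_limit_mb_s: float,
                    io_size_kb: float) -> str:
    """Say which limit the workload hits first."""
    cap = throughput_limit_mb_s * 1000 / io_size_kb
    return "IOPS bound" if provisioned_iops <= cap else "throughput bound"


# An 8,000 IOPS volume with a 320 MB/s throughput limit:
for size_kb in (8, 16, 32, 64, 256):
    iops, mb_s = effective_rates(8_000, 320, size_kb)
    print(size_kb, "KB:", iops, "IOPS,", mb_s, "MB/s,",
          limiting_factor(8_000, 320, size_kb))
```

Under these assumptions the 8,000 IOPS volume switches from IOPS bound to throughput bound at an I/O size of 40 KB (320,000 KB/s divided by 8,000 IOPS).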
34. 4. Queue depth
Queue depth is the pending I/O for a volume
[Figure: an I/O operation travelling from EC2 to EBS. Caption: "After it’s gone, it’s gone"]
36. Queue depth vs. random read latency
[Chart: 16 KB random read IOPS and latency across various queue depths. X-axis: queue depth (1 to 32). Y-axes: latency TP90 (ms) and average read IOPS. Labeled data points: latency 0.075, 2.09, and 35.1 ms; IOPS 1,865, 4,152, and 3,851.]
Queue depth between 4 and 8 has the optimal IOPS and latency performance
37. Queue depth vs. random write latency
[Chart: 16 KB random write IOPS and latency across various queue depths. X-axis: queue depth (1 to 32). Y-axes: latency TP90 (ms) and average IOPS. Labeled data points: latency 0.08 and 7.71 ms; IOPS 845 and 4,152.]
The interaction between write latency, queue depth, and IOPS is similar to that for reads
38. Optimal queue depth to achieve lower latency and highest IOPS is typically between 4 and 8; ~1 queue depth per 500 IOPS
EBS-optimized instances provide consistent latency experience
Use SSD volumes with latest-generation EC2 instances
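The "~1 queue depth per 500 IOPS" rule of thumb is easy to turn into a starting value for benchmarking. The sketch below also includes Little's law (average outstanding I/Os = IOPS x average latency), which is not from the slides but gives a cross-check; both function names are illustrative.

```python
import math


def queue_depth_rule_of_thumb(target_iops: float) -> int:
    """Starting queue depth: about 1 outstanding I/O per 500 IOPS."""
    return max(1, math.ceil(target_iops / 500))


def queue_depth_littles_law(iops: float, latency_ms: float) -> float:
    """Average outstanding I/Os implied by Little's law.

    This relation is an addition for comparison, not a figure taken
    from the slides.
    """
    return iops * latency_ms / 1000


print(queue_depth_rule_of_thumb(2_000))    # 4
print(queue_depth_rule_of_thumb(4_000))    # 8
print(queue_depth_littles_law(4_000, 2.0)) # 8.0
```

A 4,000 IOPS volume therefore starts at a queue depth of about 8, which matches the 4 to 8 range recommended above; treat the result as a first guess to refine with measurements.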
39. 5. RAID
Increases performance, or capacity, or both
Over 320 MB/s or 20,000 IOPS, striping needed
Don’t mix volume types
Typically RAID 0 or LVM stripe
Avoid RAID for redundancy
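A rough way to size a stripe set is to divide the target by the per-volume limits quoted above. This is a sketch only: it assumes performance scales linearly across a RAID 0 or LVM stripe and ignores the instance's own bandwidth limit; the function name is hypothetical.

```python
import math

# Per-volume limits for Provisioned IOPS (SSD) from the slides.
MAX_IOPS_PER_VOLUME = 20_000
MAX_MB_S_PER_VOLUME = 320


def volumes_needed(target_iops: float = 0, target_mb_s: float = 0) -> int:
    """Minimum number of striped volumes to reach both targets.

    Assumption: linear scaling across the stripe, which real workloads
    may not achieve.
    """
    by_iops = math.ceil(target_iops / MAX_IOPS_PER_VOLUME)
    by_throughput = math.ceil(target_mb_s / MAX_MB_S_PER_VOLUME)
    return max(1, by_iops, by_throughput)


print(volumes_needed(target_iops=48_000))                 # 3
print(volumes_needed(target_iops=10_000, target_mb_s=500)) # 2
```

The instance still has to be able to carry the combined traffic, so check the result against the EBS-optimized bandwidth of the chosen instance type.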
40. 6. Pre-warming
• Eliminates first-access penalty
• Typically 5%, extreme worst case of 50% performance reduction in IOPS and latency when volumes are used without pre-warming:
– Performance is as provisioned when all the chunks are accessed
• Recommendations before benchmarking:
– For new volumes:
• Linux: dd write
• Windows: NTFS full format
– Takes roughly an hour to pre-warm 1 TB PIOPS/General Purpose (SSD) volumes
• Always check latest documentation
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-prewarm.html
41. Use large block size to speed up your pre-warming
Example: sudo dd if=/dev/zero of=/dev/xvdf conv=notrunc bs=1M
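To get a feel for how long a pre-warming pass takes, divide the volume size by the throughput the write can sustain. This is a back-of-the-envelope sketch, assuming 1 GB = 1,000 MB and that the dd write runs at the volume's maximum throughput (real runs may be slower); the function name is illustrative.

```python
def prewarm_minutes(size_gb: float, throughput_mb_s: float) -> float:
    """Estimated minutes to touch every block of a volume once."""
    return size_gb * 1000 / throughput_mb_s / 60


# 1 TB volume written at the 320 MB/s Provisioned IOPS (SSD) limit:
print(round(prewarm_minutes(1000, 320)))  # 52
```

About 52 minutes for 1 TB at 320 MB/s is in line with the "roughly an hour" guidance; a volume that sustains less throughput takes proportionally longer.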
42. Cheat sheet sample: Storage workloads on AWS
Workload/software     Typical block size                      Random/Seq?  Max EBS @ 500 MB/s instances  Max EBS @ 1 GB/s instances  Max EBS @ 10 GB/s instances
Oracle DB             Configurable: 2 KB–16 KB, default 8 KB  random       ~7,800 IOPS                   ~15,600 IOPS                ~96,000 IOPS
Microsoft SQL Server  8 KB w/ 64 KB extents                   random       ~7,800 IOPS                   ~15,600 IOPS                ~80,000 IOPS
MySQL                 16 KB                                   random       ~4,000 IOPS                   ~7,800 IOPS                 ~48,000 IOPS
PostgreSQL            8 KB                                    random       ~7,800 IOPS                   ~15,600 IOPS                ~96,000 IOPS
MongoDB               4 KB                                    serialized   ~15,600 IOPS                  ~31,000 IOPS                ~96,000 IOPS
Apache Cassandra      4 KB                                    random       ~15,600 IOPS                  ~31,000 IOPS                ~96,000 IOPS
GlusterFS             128 KB                                  sequential   ~500 IOPS                     ~1,000 IOPS                 ~6,000 IOPS
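The figures in the first two bandwidth columns of the cheat sheet can be reproduced by dividing link bandwidth by block size. The sketch below assumes those columns are read as 500 Mbps and 1 Gbps of EBS bandwidth (megabits, not megabytes), which is an interpretation that fits the numbers rather than something the slide states; the 10 GB/s column appears to be capped by other limits and is not modeled. The function name is hypothetical.

```python
def max_iops(link_mbps: float, block_kb: float) -> float:
    """Upper bound on IOPS for a link of `link_mbps` megabits per second.

    Assumes 1 Mbit = 1,000 kbit and 8 bits per byte, with no protocol
    overhead.
    """
    kb_per_second = link_mbps * 1000 / 8
    return kb_per_second / block_kb


for name, block_kb in [("PostgreSQL", 8), ("MySQL", 16),
                       ("MongoDB", 4), ("GlusterFS", 128)]:
    print(name, round(max_iops(500, block_kb)), round(max_iops(1000, block_kb)))
```

For 8 KB blocks this gives about 7,800 and 15,600 IOPS, and for 16 KB blocks about 3,900 and 7,800 IOPS, close to the rounded values in the table.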
47. AWS pricing philosophy
• More AWS usage
• More infrastructure
• Economies of scale
• Lower infrastructure costs
• Reduced prices
• More customers
• Ecosystem, global footprint, new features, new services
• Infrastructure innovation
• 48 price reductions since 2006
59. 5. Use Amazon S3 storage classes
• Reduced redundancy storage class
– 99.99% durability vs. 99.999999999%
– Up to 20% savings
– Everything that is easy to reproduce
– Use Amazon SNS lost object notifications
• Amazon Glacier storage class
– Same 99.999999999% durability
– 3 to 5 hours restore time
– Up to 64% savings
– Archiving, long-term backups, and old data
• Use life-cycle rules
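A life-cycle rule that moves old objects to Glacier can be described as a small configuration document. The sketch below only builds that document as a Python dict; its shape follows what I understand boto3's `put_bucket_lifecycle_configuration` to accept, which is an assumption to verify against the current documentation. The prefix, day count, and helper name are hypothetical, and no AWS call is made.

```python
def glacier_lifecycle_config(prefix: str, days: int) -> dict:
    """Build a life-cycle configuration that archives objects to Glacier.

    Objects under `prefix` transition to the GLACIER storage class
    `days` days after creation.
    """
    rule_name = prefix.strip("/") or "all"
    return {
        "Rules": [
            {
                "ID": f"archive-{rule_name}-after-{days}d",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


config = glacier_lifecycle_config("logs/", 30)
print(config["Rules"][0]["ID"])  # archive-logs-after-30d
```

Keeping the rule as plain data makes it easy to review before applying it to a bucket, which matters because restores from Glacier take hours.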
60. 6. Optimize Amazon DynamoDB capacity units
• Read/write capacity units (CUs) determine most of DynamoDB cost
• By optimizing CUs, you can save a lot of money
• But:
– Need to provision enough capacity to not run into capacity errors
– Need to prepare for peaks
– Need to constantly monitor/adjust
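Estimating how many capacity units a workload needs is the first step in optimizing them. The sketch below uses the standard DynamoDB definitions, which are not stated on the slide: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one write capacity unit covers one write per second of an item up to 1 KB. The function names and example workload are illustrative.

```python
import math


def read_capacity_units(reads_per_sec: float, item_kb: float,
                        strongly_consistent: bool = True) -> int:
    """Read capacity units for a steady read rate."""
    units_per_read = math.ceil(item_kb / 4)  # items are billed in 4 KB steps
    total = reads_per_sec * units_per_read
    if not strongly_consistent:
        total /= 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(writes_per_sec: float, item_kb: float) -> int:
    """Write capacity units for a steady write rate."""
    return math.ceil(writes_per_sec * math.ceil(item_kb))  # 1 KB steps


print(read_capacity_units(100, 6))                             # 200
print(read_capacity_units(100, 6, strongly_consistent=False))  # 100
print(write_capacity_units(50, 2.5))                           # 150
```

These are steady-state figures; the peaks mentioned above still need headroom on top, and item sizes just over a 4 KB or 1 KB boundary double or step up the cost, so trimming item size is one of the cheaper optimizations.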
64. 7. Offload your architecture
• The more you can offload, the less
infrastructure you need to maintain, scale,
and pay for
• Three easy ways to offload:
– Use Amazon CloudFront
– Introduce caching
– Leverage existing Amazon web services