(CMP402) Amazon EC2 Instances Deep Dive

Amazon Web Services
Amazon Web ServicesAmazon Web Services
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
John Phillips, Principal Product Manager, Amazon EC2
October 7, 2015
CMP402
Amazon EC2 Instances Deep Dive
Delivering System Performance
InstancesAPIs
Networking
EC2
EC2
Purchase Options
Amazon Elastic Compute Cloud is Big
Host Server
Hypervisor
Guest 1 Guest 2 Guest n
Amazon EC2 Instances
2006 2008 2010 2012 2014 2016
m1.small
m1.large
m1.xlarge
c1.medium
c1.xlarge
m2.xlarge
m2.4xlarge
m2.2xlarge
cc1.4xlarge
t1.micro
cg1.4xlarge
cc2.8xlarge
m1.medium
hi1.4xlarge
m3.xlarge
m3.2xlarge
hs1.8xlarge
cr1.8xlarge
c3.large
c3.xlarge
c3.2xlarge
c3.4xlarge
c3.8xlarge
g2.2xlarge
i2.xlarge
i2.2xlarge
i2.4xlarge
i2.4xlarge
m3.medium
m3.large
r3.large
r3.xlarge
r3.2xlarge
r3.4xlarge
r3.8xlarge
t2.micro
t2.small
t2.med
c4.large
c4.xlarge
c4.2xlarge
c4.4xlarge
c4.8xlarge
d2.xlarge
d2.2xlarge
d2.4xlarge
d2.8xlarge
g2.8xlarge
t2.large
m4.large
m4.xlarge
m4.2xlarge
m4.4xlarge
m4.10xlarge
Amazon EC2 Instances History
What to Expect from the Session
• Defining system performance and how it is
characterized for different workloads
• How Amazon EC2 instances deliver performance
while providing flexibility and agility
• How to make the most of your EC2 instance experience
through the lens of several instance types
Defining Performance
• Servers are hired to do jobs
• Performance is measured differently depending on the job
Hiring a Server
?
• What performance means
depend on your perspective:
– Response time
– Throughput
– Consistency
Defining Performance: Perspective Matters
Application
System libraries
System calls
Kernel
Devices
Workload
Simple Performance Model for Single Thread
• Using CPU: executing (in user mode)
• Not using CPU: waiting for turn on CPU, waiting for disk or
network I/O, thread locks, memory paging, or for more work.
Performance Factors
Resource Performance factors Key indicators
CPU Sockets, number of cores, clock
frequency, bursting capability
CPU utilization, run queue length
Memory Memory capacity Free memory, anonymous paging,
thread swapping
Network
interface
Max bandwidth, packet rate Receive throughput, transmit throughput
over max bandwidth
Disks Input / output operations per
second, throughput
Wait queue length, device utilization,
device errors
Resource Utilization
• For given performance, how efficiently are resources being used
• Something at 100% utilization can’t accept any more work
• Low utilization can indicate more resource is being purchased
than needed
Example: Web Application
• MediaWiki installed on Apache with 140 pages of content
• Load increased in intervals over time
Example: Web Application
• Memory stats
Example: Web Application
• Disk stats
Example: Web Application
• Network stats
Example: Web Application
• CPU stats
• Picking an instance is tantamount to resource performance tuning
• Give back instances as easily as you can acquire new ones
• Find an ideal instance type and workload combination
Instance Selection = Performance Tuning
Delivering Compute Performance with
Amazon EC2 Instances
CPU Instructions and Protection Levels
Kernel
Application
• CPU has at least two protection levels: ring0 and ring1
• Privileged instructions can’t be executed in user mode to protect
system. Applications leverage system calls to the kernel.
Example: Web application system calls
X86 CPU Virtualization: Prior to Intel VT-x
VMM
Application
Kernel
PV
• Binary translation for privileged instructions
• Para-virtualization (PV)
• PV requires going through the VMM, adding latency
• Applications that are system call bound are most affected
X86 CPU Virtualization: After Intel VT-x
Kernel
Application
VMM
PV-HVM
• Hardware assisted virtualization (HVM)
• PV-HVM uses PV drivers opportunistically for operations that are
slow emulated:
• e.g. network and block I/O
Tip: Use PV-HVM AMIs with EBS
Time Keeping Explained
• Time keeping in an instance is deceptively hard
• gettimeofday(), clock_gettime(), QueryPerformanceCounter()
• The TSC
• CPU counter, accessible from userspace
• Requires calibration, vDSO
• Invariant on Sandy Bridge+ processors
• Xen pvclock; does not support vDSO
• On current generation instances, use TSC as clocksource
Tip: Use TSC as clocksource
CPU Performance and Scheduling
• Hypervisor ensures every guest receives CPU time
• Fixed allocation
• Uncapped vs. capped
• Variable allocation
• Different schedulers can be used depending on the goal
• Fairness
• Response time / deadline
• Shares
Review: C4 Instances
Custom Intel E5-2666 v3 at 2.9 GHz
P-state and C-state controls
Model vCPU Memory (GiB) EBS (Mbps)
c4.large 2 3.75 500
c4.xlarge 4 7.5 750
c4.2xlarge 8 15 1,000
c4.4xlarge 16 30 2,000
c4.8xlarge 36 60 4,000
• By entering deeper idle states, non-idle cores can achieve
up to 300MHz higher clock frequencies
• But… deeper idle states require more time to exit, may not
be appropriate for latency sensitive workloads
What’s new in C4: P-state and C-state control
Tip: P-state control for AVX2
• If an application makes heavy use of AVX2 on all cores, the
processor may attempt to draw more power than it should
• Processor will transparently reduce frequency
• Frequent changes of CPU frequency can slow an application
Review: T2 Instances
• Lowest cost EC2 Instance at $0.013 per hour
• Burstable performance
• Fixed allocation enforced with CPU Credits
Model vCPU CPU Credits
/ Hour
Memory
(GiB)
Storage
t2.micro 1 6 1 EBS Only
t2.small 1 12 2 EBS Only
t2.medium 2 24 4 EBS Only
t2.large 2 36 8 EBS Only
How Credits Work
Baseline Rate
Credit
Balance
• A CPU Credit provides the
performance of a full CPU core for
one minute
• An instance earns CPU credits at
a steady rate
• An instance consumes credits
when active
• Credits expire (leak) after 24 hours
Burst
Rate
Tip: Monitor CPU credit balance
Monitoring CPU Performance in Guest
• Indicators that work is being done
• User time
• System time (kernel mode)
• Wait I/O, threads blocked on disk I/O
• Else, Idle
• What happens if OS is scheduled off the CPU?
Tip: How to interpret Steal Time
• Fixed CPU allocations of CPU can be offered through
CPU caps
• Steal time happens when CPU cap is enforced
• Leverage CloudWatch metrics
Delivering I/O Performance with
Amazon EC2 Instances
I/O and Devices Virtualization
• Scheduling I/O requests between virtual devices and
shared physical hardware
• Split driver model
• Intel VT-d
• Direct pass through and IOMMU for dedicated devices
• Enhanced Networking
Hardware
Split Driver Model
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Split Driver Model
• Each virtual device has two main components
• Communication ring buffer
• An event channel signaling activity in the ring buffer
• Data is transferred through shared pages
• Shared pages requires inter domain permissions, or granting
Review: I2 Instances
16 vCPU: 3.2 TB SSD; 32 vCPU: 6.4 TB SSD
365K random read IOPS for 32 vCPU instance
Model vCPU Memory
(GiB)
Storage Read IOPS Write IOPS
i2.xlarge 4 30.5 1 x 800 SSD 35,000 35,000
i2.2xlarge 8 61 2 x 800 SSD 75,000 75,000
i2.4xlarge 16 122 4 x 800 SSD 175,000 155,000
i2.8xlarge 32 244 8 x 800 SSD 365,000 315,000
Granting in pre-3.8.0 Kernels
• Requires “grant mapping” prior to 3.8.0
• Grant mappings are expensive operations due to TLB flushes
read(fd, buffer,…)
• Grant mappings are setup in a pool once
• Data is copied in and out of the grant pool
read(fd, buffer…)
Granting in 3.8.0+ Kernels, Persistent and Indirect
Copy to
and from
grant pool
Tip: Use 3.8+ kernel
• Amazon Linux 13.09 or later
• Ubuntu 14.04 or later
• RHEL7 or later
• Etc.
Event Handling
• Guest vCPUs are interrupted to process events.
• Pre-2.6.36 kernels: notifications went to a single virtual
hardware interrupt
• Post-2.6.36 kernels: allow instance to tell hypervisor to deliver
notification to a specific vCPU for balancing
• Check "dmesg" for the following text: "Xen HVM callback vector for
event delivery is enabled“
• Also, check version of irqbalance is 1.0.7 or higher
Hardware
Split Driver Model: Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Hardware
Split Driver Model: Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Hardware
Split Driver Model: Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Hardware
Split Driver Model: Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Hardware
Split Driver Model: Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Device Pass Through: Enhanced Networking
• SR-IOV eliminates need for driver domain
• Physical network device exposes virtual function to
instance
• Requires a specialized driver, which means:
• Your instance OS needs to know about it
• EC2 needs to be told your instance can use it
Hardware
After Enhanced Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
NIC
Driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
SR-IOV Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Hardware
After Enhanced Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
NIC
Driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
SR-IOV Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Hardware
After Enhanced Networking
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
NIC
Driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
SR-IOV Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
Tip: Use Enhanced Networking
• Highest packets-per-second
• Lowest variance in latency
• Instance OS must support it
• Look for SR-IOV property of instance or image
Summary
• Find an instance type and workload combination
– Define performance
– Monitor resource utilization
– Make changes
Instance Selection = Performance Tuning
• PV-HVM
• Time keeping: use TSC
• C state and P state controls
• Monitor T2 CPU credits
• Persistent grants for I/O performance
• Event callbacks and IRQ balancing
• Enhanced Networking
Recap: Getting the Most Out of EC2 Instances
Next steps
• Visit the EC2 Instance Documentation
• Come visit us in the Developer Chat to hear more
Thank you!
Remember to complete
your evaluations!
1 of 59

Recommended

Performance Tuning EC2 Instances by
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 InstancesBrendan Gregg
171.6K views81 slides
Apache Spark Core—Deep Dive—Proper Optimization by
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationDatabricks
6.1K views50 slides
Why does my choice of storage matter with cassandra? by
Why does my choice of storage matter with cassandra?Why does my choice of storage matter with cassandra?
Why does my choice of storage matter with cassandra?Johnny Miller
7.3K views32 slides
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc... by
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Henning Jacobs
24.1K views67 slides
Apache Spark Core – Practical Optimization by
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationDatabricks
2.5K views40 slides
Tame the small files problem and optimize data layout for streaming ingestion... by
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
803 views68 slides

More Related Content

What's hot

Bootstrapping state in Apache Flink by
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkDataWorks Summit
1.7K views30 slides
Fine Tuning and Enhancing Performance of Apache Spark Jobs by
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsDatabricks
2.5K views31 slides
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique by
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization TechniqueSquare Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization TechniqueScyllaDB
1.1K views30 slides
YOW2018 Cloud Performance Root Cause Analysis at Netflix by
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
142.4K views101 slides
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva... by
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Monica Beckwith
48.6K views91 slides
Gc and-pagescan-attacks-by-linux by
Gc and-pagescan-attacks-by-linuxGc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linuxCuong Tran
9.1K views10 slides

What's hot(20)

Bootstrapping state in Apache Flink by DataWorks Summit
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
DataWorks Summit1.7K views
Fine Tuning and Enhancing Performance of Apache Spark Jobs by Databricks
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks2.5K views
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique by ScyllaDB
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization TechniqueSquare Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
ScyllaDB1.1K views
YOW2018 Cloud Performance Root Cause Analysis at Netflix by Brendan Gregg
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
Brendan Gregg142.4K views
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva... by Monica Beckwith
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Monica Beckwith48.6K views
Gc and-pagescan-attacks-by-linux by Cuong Tran
Gc and-pagescan-attacks-by-linuxGc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linux
Cuong Tran9.1K views
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ... by Monica Beckwith
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Monica Beckwith7.2K views
Mux loves Clickhouse. By Adam Brown, Mux founder by Altinity Ltd
Mux loves Clickhouse. By Adam Brown, Mux founderMux loves Clickhouse. By Adam Brown, Mux founder
Mux loves Clickhouse. By Adam Brown, Mux founder
Altinity Ltd1.4K views
Linux 4.x Tracing Tools: Using BPF Superpowers by Brendan Gregg
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
Brendan Gregg210.2K views
All about Zookeeper and ClickHouse Keeper.pdf by Altinity Ltd
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
Altinity Ltd3K views
Deep Dive into GPU Support in Apache Spark 3.x by Databricks
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.x
Databricks2.3K views
Optimal Strategies for Large Scale Batch ETL Jobs with Emma Tang by Databricks
Optimal Strategies for Large Scale Batch ETL Jobs with Emma TangOptimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Optimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Databricks2.5K views
IBM Spectrum Scale and Its Use for Content Management by Sandeep Patil
 IBM Spectrum Scale and Its Use for Content Management IBM Spectrum Scale and Its Use for Content Management
IBM Spectrum Scale and Its Use for Content Management
Sandeep Patil1.4K views
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand... by inwin stack
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
inwin stack1K views
Kafka vs Pulsar @KafkaMeetup_20180316 by Nozomi Kurihara
Kafka vs Pulsar @KafkaMeetup_20180316Kafka vs Pulsar @KafkaMeetup_20180316
Kafka vs Pulsar @KafkaMeetup_20180316
Nozomi Kurihara5K views
Linux Performance Analysis: New Tools and Old Secrets by Brendan Gregg
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
Brendan Gregg603.9K views
The Parquet Format and Performance Optimization Opportunities by Databricks
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks8.1K views
Performance Optimizations in Apache Impala by Cloudera, Inc.
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.10.7K views

Viewers also liked

AWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンス by
AWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンスAWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンス
AWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンスAmazon Web Services Japan
55.2K views114 slides
2015/04/01 AWS Blackbelt EC2 by
2015/04/01 AWS Blackbelt EC22015/04/01 AWS Blackbelt EC2
2015/04/01 AWS Blackbelt EC2Amazon Web Services Japan
41.1K views71 slides
Cloud Native Application on DEIS by using 12 factor by
Cloud Native Application on DEIS by using 12 factorCloud Native Application on DEIS by using 12 factor
Cloud Native Application on DEIS by using 12 factorYoshio Terada
18.6K views106 slides
Preparation to Start the Microservice for Java EE developers by
Preparation to Start the Microservice for Java EE developersPreparation to Start the Microservice for Java EE developers
Preparation to Start the Microservice for Java EE developersYoshio Terada
18.1K views91 slides
AWS ベーシックトレーニング-トレーニング資料 by
AWS ベーシックトレーニング-トレーニング資料AWS ベーシックトレーニング-トレーニング資料
AWS ベーシックトレーニング-トレーニング資料Amazon Web Services Japan
46.5K views133 slides
これでAWSマスター!? 初心者向けAWS簡単講座 by
これでAWSマスター!? 初心者向けAWS簡単講座これでAWSマスター!? 初心者向けAWS簡単講座
これでAWSマスター!? 初心者向けAWS簡単講座Serverworks Co.,Ltd.
43.1K views37 slides

Viewers also liked(7)

AWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンス by Amazon Web Services Japan
AWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンスAWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンス
AWS Black Belt Techシリーズ リザーブドインスタンス & スポットインスタンス
Cloud Native Application on DEIS by using 12 factor by Yoshio Terada
Cloud Native Application on DEIS by using 12 factorCloud Native Application on DEIS by using 12 factor
Cloud Native Application on DEIS by using 12 factor
Yoshio Terada18.6K views
Preparation to Start the Microservice for Java EE developers by Yoshio Terada
Preparation to Start the Microservice for Java EE developersPreparation to Start the Microservice for Java EE developers
Preparation to Start the Microservice for Java EE developers
Yoshio Terada18.1K views
これでAWSマスター!? 初心者向けAWS簡単講座 by Serverworks Co.,Ltd.
これでAWSマスター!? 初心者向けAWS簡単講座これでAWSマスター!? 初心者向けAWS簡単講座
これでAWSマスター!? 初心者向けAWS簡単講座
Serverworks Co.,Ltd.43.1K views

Similar to (CMP402) Amazon EC2 Instances Deep Dive

20160503 Amazed by AWS | Tips about Performance on AWS by
20160503 Amazed by AWS | Tips about Performance on AWS20160503 Amazed by AWS | Tips about Performance on AWS
20160503 Amazed by AWS | Tips about Performance on AWSAmazon Web Services Korea
1.8K views150 slides
Deep Dive on Delivering Amazon EC2 Instance Performance by
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
6.4K views63 slides
Deep Dive on Delivering Amazon EC2 Instance Performance by
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
971 views56 slides
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi... by
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Amazon Web Services
10.8K views55 slides
AWS Summit Bogotá Track Avanzado: EC2 avanzado by
AWS Summit Bogotá Track Avanzado: EC2 avanzadoAWS Summit Bogotá Track Avanzado: EC2 avanzado
AWS Summit Bogotá Track Avanzado: EC2 avanzadoAmazon Web Services
678 views38 slides
Deep Dive on Delivering Amazon EC2 Instance Performance by
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
1.7K views56 slides

Similar to (CMP402) Amazon EC2 Instances Deep Dive(20)

Deep Dive on Delivering Amazon EC2 Instance Performance by Amazon Web Services
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
Amazon Web Services6.4K views
Deep Dive on Delivering Amazon EC2 Instance Performance by Amazon Web Services
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi... by Amazon Web Services
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Amazon Web Services10.8K views
Deep Dive on Delivering Amazon EC2 Instance Performance by Amazon Web Services
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
Amazon Web Services1.7K views
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013 by Amazon Web Services
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Amazon Web Services13.2K views
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022 by HostedbyConfluent
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
HostedbyConfluent741 views
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture by Ceph Community
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Community 212 views
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture by Danielle Womboldt
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Danielle Womboldt2.1K views
VMworld 2013: Extreme Performance Series: Monster Virtual Machines by VMworld
VMworld 2013: Extreme Performance Series: Monster Virtual Machines VMworld 2013: Extreme Performance Series: Monster Virtual Machines
VMworld 2013: Extreme Performance Series: Monster Virtual Machines
VMworld2.1K views
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013 by Amazon Web Services
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
Cloud stack overview by howie YU
Cloud stack overviewCloud stack overview
Cloud stack overview
howie YU3.5K views
Advanced performance troubleshooting using esxtop by Alan Renouf
Advanced performance troubleshooting using esxtopAdvanced performance troubleshooting using esxtop
Advanced performance troubleshooting using esxtop
Alan Renouf5.6K views
webinar vmware v-sphere performance management Challenges and Best Practices by Metron
webinar vmware v-sphere performance management Challenges and Best Practiceswebinar vmware v-sphere performance management Challenges and Best Practices
webinar vmware v-sphere performance management Challenges and Best Practices
Metron 2.3K views
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017 by Amazon Web Services
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Amazon Web Services1.4K views

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn... by
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
26.5K views46 slides
Big Data per le Startup: come creare applicazioni Big Data in modalità Server... by
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
5.6K views44 slides
Esegui pod serverless con Amazon EKS e AWS Fargate by
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
4.1K views62 slides
Costruire Applicazioni Moderne con AWS by
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
2.8K views61 slides
Come spendere fino al 90% in meno con i container e le istanze spot by
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
1.8K views21 slides
Open banking as a service by
Open banking as a serviceOpen banking as a service
Open banking as a serviceAmazon Web Services
7.1K views14 slides

More from Amazon Web Services(20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn... by Amazon Web Services
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Amazon Web Services26.5K views
Big Data per le Startup: come creare applicazioni Big Data in modalità Server... by Amazon Web Services
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Amazon Web Services5.6K views
Esegui pod serverless con Amazon EKS e AWS Fargate by Amazon Web Services
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
Amazon Web Services4.1K views
Come spendere fino al 90% in meno con i container e le istanze spot by Amazon Web Services
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
Amazon Web Services1.8K views
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea... by Amazon Web Services
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Amazon Web Services3.3K views
OpsWorks Configuration Management: automatizza la gestione e i deployment del... by Amazon Web Services
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
Amazon Web Services2.6K views
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads by Amazon Web Services
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Amazon Web Services1.7K views
Database Oracle e VMware Cloud on AWS i miti da sfatare by Amazon Web Services
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
Amazon Web Services1.3K views
Crea la tua prima serverless ledger-based app con QLDB e NodeJS by Amazon Web Services
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Amazon Web Services1.9K views
API moderne real-time per applicazioni mobili e web by Amazon Web Services
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
Amazon Web Services1.5K views
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare by Amazon Web Services
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Amazon Web Services1.5K views
AWS_HK_StartupDay_Building Interactive websites while automating for efficien... by Amazon Web Services
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Introduzione a Amazon Elastic Container Service by Amazon Web Services
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services2.7K views

Recently uploaded

STPI OctaNE CoE Brochure.pdf by
STPI OctaNE CoE Brochure.pdfSTPI OctaNE CoE Brochure.pdf
STPI OctaNE CoE Brochure.pdfmadhurjyapb
14 views1 slide
Network Source of Truth and Infrastructure as Code revisited by
Network Source of Truth and Infrastructure as Code revisitedNetwork Source of Truth and Infrastructure as Code revisited
Network Source of Truth and Infrastructure as Code revisitedNetwork Automation Forum
26 views45 slides
Case Study Copenhagen Energy and Business Central.pdf by
Case Study Copenhagen Energy and Business Central.pdfCase Study Copenhagen Energy and Business Central.pdf
Case Study Copenhagen Energy and Business Central.pdfAitana
16 views3 slides
Melek BEN MAHMOUD.pdf by
Melek BEN MAHMOUD.pdfMelek BEN MAHMOUD.pdf
Melek BEN MAHMOUD.pdfMelekBenMahmoud
14 views1 slide
The Research Portal of Catalonia: Growing more (information) & more (services) by
The Research Portal of Catalonia: Growing more (information) & more (services)The Research Portal of Catalonia: Growing more (information) & more (services)
The Research Portal of Catalonia: Growing more (information) & more (services)CSUC - Consorci de Serveis Universitaris de Catalunya
80 views25 slides
AMAZON PRODUCT RESEARCH.pdf by
AMAZON PRODUCT RESEARCH.pdfAMAZON PRODUCT RESEARCH.pdf
AMAZON PRODUCT RESEARCH.pdfJerikkLaureta
26 views13 slides

Recently uploaded(20)

STPI OctaNE CoE Brochure.pdf by madhurjyapb
STPI OctaNE CoE Brochure.pdfSTPI OctaNE CoE Brochure.pdf
STPI OctaNE CoE Brochure.pdf
madhurjyapb14 views
Case Study Copenhagen Energy and Business Central.pdf by Aitana
Case Study Copenhagen Energy and Business Central.pdfCase Study Copenhagen Energy and Business Central.pdf
Case Study Copenhagen Energy and Business Central.pdf
Aitana16 views
AMAZON PRODUCT RESEARCH.pdf by JerikkLaureta
AMAZON PRODUCT RESEARCH.pdfAMAZON PRODUCT RESEARCH.pdf
AMAZON PRODUCT RESEARCH.pdf
JerikkLaureta26 views
Business Analyst Series 2023 - Week 3 Session 5 by DianaGray10
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5
DianaGray10248 views
The details of description: Techniques, tips, and tangents on alternative tex... by BookNet Canada
The details of description: Techniques, tips, and tangents on alternative tex...The details of description: Techniques, tips, and tangents on alternative tex...
The details of description: Techniques, tips, and tangents on alternative tex...
BookNet Canada127 views
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas... by Bernd Ruecker
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
Bernd Ruecker37 views
TouchLog: Finger Micro Gesture Recognition Using Photo-Reflective Sensors by sugiuralab
TouchLog: Finger Micro Gesture Recognition  Using Photo-Reflective SensorsTouchLog: Finger Micro Gesture Recognition  Using Photo-Reflective Sensors
TouchLog: Finger Micro Gesture Recognition Using Photo-Reflective Sensors
sugiuralab19 views
PharoJS - Zürich Smalltalk Group Meetup November 2023 by Noury Bouraqadi
PharoJS - Zürich Smalltalk Group Meetup November 2023PharoJS - Zürich Smalltalk Group Meetup November 2023
PharoJS - Zürich Smalltalk Group Meetup November 2023
Noury Bouraqadi127 views
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ... by Jasper Oosterveld
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
Attacking IoT Devices from a Web Perspective - Linux Day by Simone Onofri
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day
Simone Onofri16 views

(CMP402) Amazon EC2 Instances Deep Dive

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. John Phillips, Principal Product Manager, Amazon EC2 October 7, 2015 CMP402 Amazon EC2 Instances Deep Dive Delivering System Performance
  • 3. Host Server Hypervisor Guest 1 Guest 2 Guest n Amazon EC2 Instances
  • 4. 2006 2008 2010 2012 2014 2016 m1.small m1.large m1.xlarge c1.medium c1.xlarge m2.xlarge m2.4xlarge m2.2xlarge cc1.4xlarge t1.micro cg1.4xlarge cc2.8xlarge m1.medium hi1.4xlarge m3.xlarge m3.2xlarge hs1.8xlarge cr1.8xlarge c3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarge g2.2xlarge i2.xlarge i2.2xlarge i2.4xlarge i2.4xlarge m3.medium m3.large r3.large r3.xlarge r3.2xlarge r3.4xlarge r3.8xlarge t2.micro t2.small t2.med c4.large c4.xlarge c4.2xlarge c4.4xlarge c4.8xlarge d2.xlarge d2.2xlarge d2.4xlarge d2.8xlarge g2.8xlarge t2.large m4.large m4.xlarge m4.2xlarge m4.4xlarge m4.10xlarge Amazon EC2 Instances History
  • 5. What to Expect from the Session • Defining system performance and how it is characterized for different workloads • How Amazon EC2 instances deliver performance while providing flexibility and agility • How to make the most of your EC2 instance experience through the lens of several instance types
  • 7. • Servers are hired to do jobs • Performance is measured differently depending on the job Hiring a Server ?
  • 8. • What performance means depend on your perspective: – Response time – Throughput – Consistency Defining Performance: Perspective Matters Application System libraries System calls Kernel Devices Workload
  • 9. Simple Performance Model for Single Thread • Using CPU: executing (in user mode) • Not using CPU: waiting for turn on CPU, waiting for disk or network I/O, thread locks, memory paging, or for more work.
  • 10. Performance Factors Resource Performance factors Key indicators CPU Sockets, number of cores, clock frequency, bursting capability CPU utilization, run queue length Memory Memory capacity Free memory, anonymous paging, thread swapping Network interface Max bandwidth, packet rate Receive throughput, transmit throughput over max bandwidth Disks Input / output operations per second, throughput Wait queue length, device utilization, device errors
  • 11. Resource Utilization • For given performance, how efficiently are resources being used • Something at 100% utilization can’t accept any more work • Low utilization can indicate more resource is being purchased than needed
  • 12. Example: Web Application • MediaWiki installed on Apache with 140 pages of content • Load increased in intervals over time
  • 17. • Picking an instance is tantamount to resource performance tuning • Give back instances as easily as you can acquire new ones • Find an ideal instance type and workload combination Instance Selection = Performance Tuning
  • 18. Delivering Compute Performance with Amazon EC2 Instances
  • 19. CPU Instructions and Protection Levels Kernel Application • CPU has at least two protection levels: ring0 and ring1 • Privileged instructions can’t be executed in user mode to protect system. Applications leverage system calls to the kernel.
  • 20. Example: Web application system calls
  • 21. X86 CPU Virtualization: Prior to Intel VT-x VMM Application Kernel PV • Binary translation for privileged instructions • Para-virtualization (PV) • PV requires going through the VMM, adding latency • Applications that are system call bound are most affected
  • 22. X86 CPU Virtualization: After Intel VT-x Kernel Application VMM PV-HVM • Hardware assisted virtualization (HVM) • PV-HVM uses PV drivers opportunistically for operations that are slow emulated: • e.g. network and block I/O
  • 23. Tip: Use PV-HVM AMIs with EBS
  • 24. Time Keeping Explained • Time keeping in an instance is deceptively hard • gettimeofday(), clock_gettime(), QueryPerformanceCounter() • The TSC • CPU counter, accessible from userspace • Requires calibration, vDSO • Invariant on Sandy Bridge+ processors • Xen pvclock; does not support vDSO • On current generation instances, use TSC as clocksource
  • 25. Tip: Use TSC as clocksource
  • 26. CPU Performance and Scheduling • Hypervisor ensures every guest receives CPU time • Fixed allocation • Uncapped vs. capped • Variable allocation • Different schedulers can be used depending on the goal • Fairness • Response time / deadline • Shares
  • 27. Review: C4 Instances Custom Intel E5-2666 v3 at 2.9 GHz P-state and C-state controls Model vCPU Memory (GiB) EBS (Mbps) c4.large 2 3.75 500 c4.xlarge 4 7.5 750 c4.2xlarge 8 15 1,000 c4.4xlarge 16 30 2,000 c4.8xlarge 36 60 4,000
  • 28. • By entering deeper idle states, non-idle cores can achieve up to 300MHz higher clock frequencies • But… deeper idle states require more time to exit, may not be appropriate for latency sensitive workloads What’s new in C4: P-state and C-state control
  • 29. Tip: P-state control for AVX2 • If an application makes heavy use of AVX2 on all cores, the processor may attempt to draw more power than it should • Processor will transparently reduce frequency • Frequent changes of CPU frequency can slow an application
  • 30. Review: T2 Instances • Lowest cost EC2 Instance at $0.013 per hour • Burstable performance • Fixed allocation enforced with CPU Credits Model vCPU CPU Credits / Hour Memory (GiB) Storage t2.micro 1 6 1 EBS Only t2.small 1 12 2 EBS Only t2.medium 2 24 4 EBS Only t2.large 2 36 8 EBS Only
  • 31. How Credits Work Baseline Rate Credit Balance • A CPU Credit provides the performance of a full CPU core for one minute • An instance earns CPU credits at a steady rate • An instance consumes credits when active • Credits expire (leak) after 24 hours Burst Rate
  • 32. Tip: Monitor CPU credit balance
  • 33. Monitoring CPU Performance in Guest • Indicators that work is being done • User time • System time (kernel mode) • Wait I/O, threads blocked on disk I/O • Else, Idle • What happens if OS is scheduled off the CPU?
  • 34. Tip: How to interpret Steal Time • Fixed CPU allocations of CPU can be offered through CPU caps • Steal time happens when CPU cap is enforced • Leverage CloudWatch metrics
  • 35. Delivering I/O Performance with Amazon EC2 Instances
  • 36. I/O and Devices Virtualization • Scheduling I/O requests between virtual devices and shared physical hardware • Split driver model • Intel VT-d • Direct pass through and IOMMU for dedicated devices • Enhanced Networking
  • 37. Hardware Split Driver Model Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Network Device Virtual CPU Virtual Memory CPU Scheduling
  • 38. Split Driver Model • Each virtual device has two main components • Communication ring buffer • An event channel signaling activity in the ring buffer • Data is transferred through shared pages • Shared pages requires inter domain permissions, or granting
  • 39. Review: I2 Instances 16 vCPU: 3.2 TB SSD; 32 vCPU: 6.4 TB SSD 365K random read IOPS for 32 vCPU instance Model vCPU Memory (GiB) Storage Read IOPS Write IOPS i2.xlarge 4 30.5 1 x 800 SSD 35,000 35,000 i2.2xlarge 8 61 2 x 800 SSD 75,000 75,000 i2.4xlarge 16 122 4 x 800 SSD 175,000 155,000 i2.8xlarge 32 244 8 x 800 SSD 365,000 315,000
  • 40. Granting in pre-3.8.0 Kernels • Requires “grant mapping” prior to 3.8.0 • Grant mappings are expensive operations due to TLB flushes read(fd, buffer,…)
  • 41. • Grant mappings are setup in a pool once • Data is copied in and out of the grant pool read(fd, buffer…) Granting in 3.8.0+ Kernels, Persistent and Indirect Copy to and from grant pool
  • 42. Tip: Use 3.8+ kernel • Amazon Linux 13.09 or later • Ubuntu 14.04 or later • RHEL7 or later • Etc.
  • 43. Event Handling • Guest vCPUs are interrupted to process events. • Pre-2.6.36 kernels: notifications went to a single virtual hardware interrupt • Post-2.6.36 kernels: allow instance to tell hypervisor to deliver notification to a specific vCPU for balancing • Check "dmesg" for the following text: "Xen HVM callback vector for event delivery is enabled“ • Also, check version of irqbalance is 1.0.7 or higher
  • 44. Hardware Split Driver Model: Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 45. Hardware Split Driver Model: Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 46. Hardware Split Driver Model: Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 47. Hardware Split Driver Model: Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 48. Hardware Split Driver Model: Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 49. Device Pass Through: Enhanced Networking • SR-IOV eliminates need for driver domain • Physical network device exposes virtual function to instance • Requires a specialized driver, which means: • Your instance OS needs to know about it • EC2 needs to be told your instance can use it
  • 50. Hardware After Enhanced Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver NIC Driver Backend driver Device Driver Physical CPU Physical Memory SR-IOV Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 51. Hardware After Enhanced Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver NIC Driver Backend driver Device Driver Physical CPU Physical Memory SR-IOV Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 52. Hardware After Enhanced Networking Driver Domain Guest Domain Guest Domain VMM Frontend driver NIC Driver Backend driver Device Driver Physical CPU Physical Memory SR-IOV Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application
  • 53. Tip: Use Enhanced Networking • Highest packets-per-second • Lowest variance in latency • Instance OS must support it • Look for SR-IOV property of instance or image
  • 55. • Find an instance type and workload combination – Define performance – Monitor resource utilization – Make changes Instance Selection = Performance Tuning
  • 56. • PV-HVM • Time keeping: use TSC • C state and P state controls • Monitor T2 CPU credits • Persistent grants for I/O performance • Event callbacks and IRQ balancing • Enhanced Networking Recap: Getting the Most Out of EC2 Instances
  • 57. Next steps • Visit the EC2 Instance Documentation • Come visit us in the Developer Chat to hear more