Deep Dive on Amazon EC2 Instances
Featuring Performance Optimization Best Practices
By Androski Spicer, Solutions Architect
© 2016 Amazon Web Services, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.
What to Expect from the Session
 Understanding of the factors that go into choosing an EC2
instance
 Defining system performance and how it is characterized for
different workloads
 How Amazon EC2 instances deliver performance while providing
flexibility and agility
 How to make the most of your EC2 instance experience through the
lens of several instance types
Amazon Elastic Compute Cloud is Big
[Diagram: the EC2 API surrounded by instances, networking, and purchase options]
Amazon EC2 Instances
[Diagram: a host server running a hypervisor with Guest 1, Guest 2, ... Guest n]
In the past
 First launched in August 2006
 M1 instance
 “One size fits all”
Amazon EC2 Instances History
2006 2008 2010 2012 2014 2016
m1.small
m1.large
m1.xlarge
c1.medium
c1.xlarge
m2.xlarge
m2.4xlarge
m2.2xlarge
cc1.4xlarge
t1.micro
cg1.4xlarge
cc2.8xlarge
m1.medium
hi1.4xlarge
m3.xlarge
m3.2xlarge
hs1.8xlarge
cr1.8xlarge
c3.large
c3.xlarge
c3.2xlarge
c3.4xlarge
c3.8xlarge
g2.2xlarge
i2.xlarge
i2.2xlarge
i2.4xlarge
i2.8xlarge
m3.medium
m3.large
r3.large
r3.xlarge
r3.2xlarge
r3.4xlarge
r3.8xlarge
t2.micro
t2.small
t2.medium
c4.large
c4.xlarge
c4.2xlarge
c4.4xlarge
c4.8xlarge
d2.xlarge
d2.2xlarge
d2.4xlarge
d2.8xlarge
g2.8xlarge
t2.large
m4.large
m4.xlarge
m4.2xlarge
m4.4xlarge
m4.10xlarge
x1.32xlarge
t2.nano
m4.16xlarge
p2.xlarge
p2.8xlarge
p2.16xlarge
Instance naming: c4.xlarge
c = instance family, 4 = instance generation, xlarge = instance size
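The naming scheme can be split mechanically. A minimal sketch, assuming the 2016-era pattern of lowercase family letters, a generation number, a dot, and a size; the function name parse_instance_type is made up for illustration:

```shell
#!/bin/sh
# Split an instance type such as "c4.xlarge" into family, generation, and size.
# Assumes the form <letters><digits>.<size>; other naming forms are not handled.
parse_instance_type() {
    type="$1"
    prefix="${type%%.*}"   # text before the dot, e.g. "c4"
    size="${type#*.}"      # text after the dot, e.g. "xlarge"
    family=$(printf '%s' "$prefix" | sed 's/[0-9].*$//')
    generation=$(printf '%s' "$prefix" | sed 's/^[a-z]*//')
    echo "$family $generation $size"
}

# Usage: parse_instance_type c4.xlarge
```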
EC2 Instance Families
General purpose: M4
Compute optimized: C3, C4
Storage and I/O optimized: I2, D2
GPU optimized: G2, P2
Memory optimized: R3, X1
What’s a Virtual CPU? (vCPU)
 A vCPU is typically a hyper-threaded physical core*
 On Linux, “A” threads enumerated before “B” threads
 On Windows, threads are interleaved
 Divide vCPU count by 2 to get core count
 Cores by EC2 & RDS DB Instance type:
https://aws.amazon.com/ec2/virtualcores/
* The “t” family is special
Disable Hyper-Threading If You Need To
 Useful for FPU heavy applications
 Use ‘lscpu’ to validate layout
 Hot offline the “B” threads
for i in $(seq 64 127); do
    echo 0 > /sys/devices/system/cpu/cpu${i}/online
done
 Set grub to only initialize the first half of
all threads (64 of the 128 shown by lscpu)
maxcpus=64
[ec2-user@ip-172-31-7-218 ~]$ lscpu
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 4
NUMA node(s): 4
Model name: Intel(R) Xeon(R) CPU
Hypervisor vendor: Xen
Virtualization type: full
NUMA node0 CPU(s): 0-15,64-79
NUMA node1 CPU(s): 16-31,80-95
NUMA node2 CPU(s): 32-47,96-111
NUMA node3 CPU(s): 48-63,112-127
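The 64 to 127 range above only fits a 128-vCPU instance. A minimal sketch that reads the "B" threads from sysfs topology instead, assuming the kernel exposes topology/thread_siblings_list; the function name and the optional directory argument are made up for illustration and dry runs:

```shell
#!/bin/sh
# Print the CPU number of every hyper-thread that is not the first
# sibling of its core (the "B" threads).
# $1 (optional, an assumption for dry runs): CPU sysfs directory.
list_b_threads() {
    dir="${1:-/sys/devices/system/cpu}"
    for path in "$dir"/cpu[0-9]*; do
        # CPUs that are already offline expose no topology; skip them.
        [ -r "$path/topology/thread_siblings_list" ] || continue
        cpu="${path##*/cpu}"
        siblings=$(cat "$path/topology/thread_siblings_list")
        first="${siblings%%[,-]*}"   # list looks like "0,64" or "0-1"
        if [ "$cpu" != "$first" ]; then
            echo "$cpu"
        fi
    done
}

# Usage (as root), taking each "B" thread offline:
#   for cpu in $(list_b_threads); do
#       echo 0 > /sys/devices/system/cpu/cpu${cpu}/online
#   done
```

Output order follows the shell glob, so pipe through sort -n if numeric order matters.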
Instance sizing
c4.8xlarge ≈ 2 × c4.4xlarge ≈ 4 × c4.2xlarge ≈ 8 × c4.xlarge
Resource Allocation
 All resources assigned to you are dedicated to your instance with no
over commitment*
 All vCPUs are dedicated to you
 Memory allocated is assigned only to your instance
 Network resources are partitioned to avoid “noisy neighbors”
 Curious about the number of instances per host? Use “Dedicated
Hosts” as a guide.
*Again, the “T” family is special
“Launching new instances and running tests
in parallel is easy…[when choosing an
instance] there is no substitute for measuring
the performance of your full application.”
- EC2 documentation
Timekeeping Explained
 Timekeeping in an instance is deceptively hard
 gettimeofday(), clock_gettime(), QueryPerformanceCounter()
 The TSC
 CPU counter, accessible from userspace
 Requires calibration, vDSO
 Invariant on Sandy Bridge+ processors
 Xen pvclock; does not support vDSO
 On current generation instances, use TSC as clocksource
Benchmarking - Time Intensive Application
#include <sys/time.h>
#include <time.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    time_t start, end;
    time(&start);
    /* Call gettimeofday() 100 million times; the multiply is filler work. */
    for (int x = 0; x < 100000000; x++) {
        float f = 123456789.0f;
        float g = 123456789.0f;
        float h = f * g;
        (void)h; /* result is unused on purpose */
        struct timeval tv;
        gettimeofday(&tv, NULL);
    }
    time(&end);
    /* time() has one-second resolution, so the result is whole seconds. */
    double dif = difftime(end, start);
    printf("Elasped time is %.2f seconds.\n", dif);
    return 0;
}
Using the Xen Clock Source
[centos@ip-192-168-1-77 testbench]$ strace -c ./test
Elasped time is 12.00 seconds.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.99 3.322956 2 2001862 gettimeofday
0.00 0.000096 6 16 mmap
0.00 0.000050 5 10 mprotect
0.00 0.000038 8 5 open
0.00 0.000026 5 5 fstat
0.00 0.000025 5 5 close
0.00 0.000023 6 4 read
0.00 0.000008 8 1 1 access
0.00 0.000006 6 1 brk
0.00 0.000006 6 1 execve
0.00 0.000005 5 1 arch_prctl
0.00 0.000000 0 1 munmap
------ ----------- ----------- --------- --------- ----------------
100.00 3.323239 2001912 1 total
Using the TSC Clock Source
[centos@ip-192-168-1-77 testbench]$ strace -c ./test
Elasped time is 2.00 seconds.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
32.97 0.000121 7 17 mmap
20.98 0.000077 8 10 mprotect
11.72 0.000043 9 5 open
10.08 0.000037 7 5 close
7.36 0.000027 5 6 fstat
6.81 0.000025 6 4 read
2.72 0.000010 10 1 munmap
2.18 0.000008 8 1 1 access
1.91 0.000007 7 1 execve
1.63 0.000006 6 1 brk
1.63 0.000006 6 1 arch_prctl
0.00 0.000000 0 1 write
------ ----------- ----------- --------- --------- ----------------
100.00 0.000367 53 1 total
Change with:
Tip: Use TSC as clocksource
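The clocksource can be inspected and switched at runtime through sysfs. A minimal sketch, assuming the standard Linux clocksource0 directory; the function name and the optional directory argument are made up so the logic can be dry-run against scratch files:

```shell
#!/bin/sh
# Switch to the TSC clocksource when the kernel lists it as available,
# then print whichever clocksource is active.
# $1 (optional, an assumption for dry runs): clocksource sysfs directory.
use_tsc_clocksource() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    if grep -qw tsc "$dir/available_clocksource"; then
        # Writing the real sysfs file requires root.
        echo tsc > "$dir/current_clocksource"
    fi
    cat "$dir/current_clocksource"
}

# Usage (as root): use_tsc_clocksource
```

The change does not survive a reboot; adding clocksource=tsc to the kernel command line is one way to make it persistent.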
P-state and C-state control
 c4.8xlarge, d2.8xlarge, m4.10xlarge,
m4.16xlarge, p2.16xlarge, x1.16xlarge,
x1.32xlarge
 By entering deeper idle states, non-idle
cores can achieve up to 300MHz higher
clock frequencies
 But… deeper idle states require more
time to exit, may not be appropriate for
latency-sensitive workloads
 Limit c-state by adding
“intel_idle.max_cstate=1” to grub
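After editing grub and rebooting, it is worth confirming that the kernel actually booted with the limit. A minimal sketch that looks for the parameter on the kernel command line; the function name and the optional file argument are made up for illustration:

```shell
#!/bin/sh
# Print the intel_idle.max_cstate value from the kernel command line,
# or "unset" when the parameter is absent.
# $1 (optional, an assumption for dry runs): file holding the command line.
cstate_limit() {
    cmdline_file="${1:-/proc/cmdline}"
    for arg in $(cat "$cmdline_file"); do
        case "$arg" in
            intel_idle.max_cstate=*)
                echo "${arg#*=}"
                return 0
                ;;
        esac
    done
    echo "unset"
}

# Usage: cstate_limit
```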
Tip: P-state control for AVX2
 If an application makes heavy use of AVX2 on all cores, the processor
may attempt to draw more power than it should
 Processor will transparently reduce frequency
 Frequent changes of CPU frequency can slow an application
sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
See also: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
Review: T2 Instances
 Lowest cost EC2 instance at $0.0065 per hour
 Burstable performance
 Fixed allocation enforced with CPU credits
Model      vCPU  Baseline  CPU Credits / Hour  Memory (GiB)  Storage
t2.nano    1     5%        3                   0.5           EBS Only
t2.micro   1     10%       6                   1             EBS Only
t2.small   1     20%       12                  2             EBS Only
t2.medium  2     40%**     24                  4             EBS Only
t2.large   2     60%**     36                  8             EBS Only
General purpose, web serving, developer environments, small databases
How Credits Work
 A CPU credit provides the performance of a
full CPU core for one minute
 An instance earns CPU credits at a steady rate
 An instance consumes credits when active
 Credits expire (leak) after 24 hours
[Figure: credit balance, filled at the baseline rate and drained at the burst rate]
Tip: Monitor CPU credit balance
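The credit figures in the T2 table follow from the rule that one credit is one full core for one minute. A back-of-envelope sketch, assuming the baseline percentage is the aggregate across vCPUs (as the table's credit rates imply) and that a burst runs every vCPU at 100%; the function names are made up for illustration:

```shell
#!/bin/sh
# Credits earned per hour: a 100% baseline would earn 60 credits per hour.
credits_per_hour() {
    baseline_percent="$1"
    echo $(( baseline_percent * 60 / 100 ))
}

# Largest balance that can accrue, since credits expire after 24 hours.
max_credit_balance() {
    echo $(( $(credits_per_hour "$1") * 24 ))
}

# Whole minutes a balance lasts with all vCPUs at 100%.
# Net drain per hour = vcpus * 60 spent minus credits earned.
# Args: balance, vCPU count, baseline percent.
burst_minutes() {
    balance="$1"
    vcpus="$2"
    earned=$(credits_per_hour "$3")
    echo $(( balance * 60 / (vcpus * 60 - earned) ))
}

# Usage: burst_minutes "$(max_credit_balance 10)" 1 10   # t2.micro
```

For the live value, the CPUCreditBalance metric in CloudWatch reports the balance for each instance.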
Review: X1 Instances
 Largest memory instance with 2 TB of DRAM
 Quad socket, Intel E7 processors with 128 vCPUs
Model        vCPU  Memory (GiB)  Local Storage   Network
x1.16xlarge  64    976           1x 1920GB SSD   10Gbps
x1.32xlarge  128   1952          2x 1920GB SSD   20Gbps
In-memory databases, big data processing, HPC workloads
NUMA
 Non-uniform memory access
 Each processor in a multi-CPU system has
local memory that is accessible through a
fast interconnect
 Each processor can also access memory
from other CPUs, but local memory access
is a lot faster than remote memory
 Performance is related to the number of
CPU sockets and how they are connected -
Intel QuickPath Interconnect (QPI)
[Diagram: r3.8xlarge has 2 sockets, each with 16 vCPUs and 122GB of local memory, connected by QPI]
[Diagram: x1.32xlarge has 4 sockets, each with 32 vCPUs and 488GB of local memory, connected to one another by QPI links]
Tip: Kernel Support for NUMA Balancing
 An application will perform best when the threads of its processes are
accessing memory on the same NUMA node.
 NUMA balancing moves tasks closer to the memory they are accessing.
 This is all done automatically by the Linux kernel when automatic NUMA
balancing is active: version 3.8+ of the Linux kernel.
 Windows support for NUMA first appeared in the Enterprise and Data
Center SKUs of Windows Server 2003.
 Set “numa=off” or use numactl to reduce NUMA paging if your
application uses more memory than will fit on a single socket or has
threads that move between sockets
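Whether automatic NUMA balancing is active can be read from procfs. A minimal sketch, assuming the kernel.numa_balancing sysctl file; the function name and the optional file argument are made up for illustration:

```shell
#!/bin/sh
# Report the state of automatic NUMA balancing.
# $1 (optional, an assumption for dry runs): path to the sysctl file.
numa_balancing_state() {
    f="${1:-/proc/sys/kernel/numa_balancing}"
    if [ ! -r "$f" ]; then
        # File is absent on kernels built without NUMA balancing.
        echo "unsupported"
        return 0
    fi
    case "$(cat "$f")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}

# Usage: numa_balancing_state
```

The socket and memory layout itself can be listed with numactl --hardware.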
Operating Systems Impact Performance
 Memory intensive web application
 Created many threads
 Rapidly allocated/deallocated memory
 Comparing performance of RHEL6 vs RHEL7
 Noticed a high amount of “system” time in top
 Found a benchmark tool (ebizzy) with a similar performance profile
 Traced its performance with “perf”
On RHEL6
[ec2-user@ip-172-31-12-150-RHEL6 ebizzy-0.3]$ sudo perf stat ./ebizzy -S 10
12,409 records/s
real 10.00 s
user 7.37 s
sys 341.22 s
Performance counter stats for './ebizzy -S 10':
361458.371052 task-clock (msec) # 35.880 CPUs utilized
10,343 context-switches # 0.029 K/sec
2,582 cpu-migrations # 0.007 K/sec
1,418,204 page-faults # 0.004 M/sec
10.074085097 seconds time elapsed
RHEL6 Flame Graph Output
www.brendangregg.com/flamegraphs.html
On RHEL7
[ec2-user@ip-172-31-7-22-RHEL7 ~]$ sudo perf stat ./ebizzy-0.3/ebizzy -S 10
425,143 records/s
real 10.00 s
user 397.28 s
sys 0.18 s
Performance counter stats for './ebizzy-0.3/ebizzy -S 10':
397515.862535 task-clock (msec) # 39.681 CPUs utilized
25,256 context-switches # 0.064 K/sec
2,201 cpu-migrations # 0.006 K/sec
14,109 page-faults # 0.035 K/sec
10.017856000 seconds time elapsed
Records/s: up from 12,409 on RHEL6!
Page faults: down from 1,418,204 on RHEL6!
RHEL7 Flame Graph Output
Hugepages
 Disable Transparent Hugepages
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
 Use Explicit Huge Pages
$ sudo mkdir /dev/hugetlbfs
$ sudo mount -t hugetlbfs none /dev/hugetlbfs
$ sudo sysctl -w vm.nr_hugepages=10000
$ HUGETLB_MORECORE=yes LD_PRELOAD=libhugetlbfs.so numactl --cpunodebind=0 \
    --membind=0 /path/to/application
See also: https://lwn.net/Articles/375096/
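The redhat_transparent_hugepage path above is specific to RHEL 6; mainline kernels use /sys/kernel/mm/transparent_hugepage. A minimal sketch that reports the active mode from whichever path exists; the function name and the optional directory argument are made up for illustration:

```shell
#!/bin/sh
# Print the active Transparent Hugepage mode (always, madvise, or never).
# The kernel marks the active mode with brackets, e.g. "always madvise [never]".
# $1 (optional, an assumption for dry runs): parent directory of the THP files.
thp_mode() {
    base="${1:-/sys/kernel/mm}"
    for name in transparent_hugepage redhat_transparent_hugepage; do
        f="$base/$name/enabled"
        if [ -r "$f" ]; then
            sed -n 's/.*\[\(.*\)\].*/\1/p' "$f"
            return 0
        fi
    done
    echo "unavailable"
}

# Usage: thp_mode
```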
Split Driver Model
[Diagram: on shared hardware (physical CPU, physical memory, storage device), an application in a guest domain sends I/O through sockets and its frontend driver; the VMM passes it to the backend driver and device driver in the driver domain. The path is numbered in steps 1 to 5.]
Granting in pre-3.8.0 Kernels
 Requires “grant mapping” prior to 3.8.0
 Grant mappings are expensive operations due to TLB flushes
Inter domain I/O for read(fd, buffer,…) between the instance and the I/O domain that owns the SSD:
(1) Grant memory
(2) Write to ring buffer
(3) Signal event
(4) Read ring buffer
(5) Map grants
(6) Read or write grants
(7) Unmap grants
Granting in 3.8.0+ Kernels, Persistent and Indirect
 Grant mappings are set up in a pool one time
 Data is copied in and out of the grant pool
[Diagram: read(fd, buffer…) in the instance copies to and from a grant pool shared with the I/O domain, which accesses the SSD]
Validating Persistent Grants
[ec2-user@ip-172-31-4-129 ~]$ dmesg | egrep -i 'blkfront'
Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated
disks.
blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdd: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvde: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdf: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdg: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdh: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdi: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
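On an instance with many volumes the same check can be scripted. A minimal sketch that reads dmesg output on stdin and prints any block device whose blkfront line does not report persistent grants as enabled, assuming the line format shown above; the function name is made up for illustration:

```shell
#!/bin/sh
# Read dmesg text on stdin and print each blkfront device that does NOT
# report "persistent grants: enabled".
devices_without_persistent_grants() {
    awk '/blkfront: [a-z0-9]+: / {
        line = $0
        sub(/.*blkfront: /, "", line)   # drop any timestamp prefix
        dev = line
        sub(/:.*/, "", dev)             # keep only the device name
        if (line !~ /persistent grants: enabled/) print dev
    }'
}

# Usage: dmesg | devices_without_persistent_grants
```

No output means every reported device has persistent grants enabled.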
2009 – Longer ago than you think
 Avatar was the top movie in the theaters
 Facebook overtook MySpace in active users
 President Obama was sworn into office
 The 2.6.32 Linux kernel was released
Tip: Use 3.10+ kernel
 Amazon Linux 2013.09 or later
 Ubuntu 14.04 or later
 RHEL/CentOS 7 or later
 Etc.
Device Pass Through: Enhanced Networking
 SR-IOV eliminates need for driver domain
 Physical network device exposes virtual function to instance
 Requires a specialized driver, which means:
 Your instance OS needs to know about it
 EC2 needs to be told your instance can use it
After Enhanced Networking
[Diagram: on shared hardware (physical CPU, physical memory, SR-IOV network device), an application in a guest domain sends traffic through sockets and its own NIC driver directly to the SR-IOV network device, bypassing the driver domain. The path is numbered in steps 1 to 3.]
Elastic Network Adapter
 Next Generation of Enhanced
Networking
 Hardware Checksums
 Multi-Queue Support
 Receive Side Steering
 20Gbps in a Placement Group
 New Open Source Amazon Network
Driver
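Inside the instance, the driver bound to the interface shows whether enhanced networking is in use. A minimal sketch that classifies the output of ethtool -i read from stdin, assuming the driver names ena (Elastic Network Adapter), ixgbevf (Intel 82599 virtual function), and vif (Xen paravirtual interface); the function name is made up for illustration:

```shell
#!/bin/sh
# Classify the "driver:" line of `ethtool -i <interface>` output on stdin.
classify_nic_driver() {
    driver=$(awk -F': *' '$1 == "driver" { print $2; exit }')
    case "$driver" in
        ena)     echo "enhanced networking (ENA)" ;;
        ixgbevf) echo "enhanced networking (Intel 82599 VF)" ;;
        "")      echo "unknown" ;;
        *)       echo "no enhanced networking ($driver)" ;;
    esac
}

# Usage: ethtool -i eth0 | classify_nic_driver
```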
Network Performance
 20 Gigabit & 10 Gigabit
 Measured one-way, double that for bi-directional (full duplex)
 High, Moderate, Low – A function of the instance size and EBS
optimization
 Not all created equal – Test with iperf if it’s important!
 Use placement groups when you need high and consistent instance
to instance bandwidth
 All traffic limited to 5 Gb/s when exiting EC2
EBS Performance
 Instance size affects throughput
 Match your volume size and
type to your instance
 Use EBS optimization if EBS
performance is important
Summary: Getting the Most Out of EC2 Instances
 Choose HVM AMIs
 Timekeeping: use TSC
 C state and P state controls
 Monitor T2 CPU credits
 Use a modern Linux OS
 NUMA balancing
 Persistent grants for I/O performance
 Enhanced networking
 Profile your application
Virtualization Themes
 Bare metal performance goal, and in many scenarios already there
 History of eliminating hypervisor intermediation and driver domains
 Hardware assisted virtualization
 Scheduling and granting efficiencies
 Device pass through
Next Steps
 Visit the Amazon EC2 documentation
 Launch an instance and try your app!
Thank You!
Remember to complete
your evaluations!

More Related Content

What's hot

Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentation
Rodrigo Missiaggia
 
Virtualization Architecture & KVM
Virtualization Architecture & KVMVirtualization Architecture & KVM
Virtualization Architecture & KVM
Pradeep Kumar
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
Brendan Gregg
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
Redis Labs
 
Pegasus In Depth (2018/10)
Pegasus In Depth (2018/10)Pegasus In Depth (2018/10)
Pegasus In Depth (2018/10)
涛 吴
 
RedHat OpenStack Platform Overview
RedHat OpenStack Platform OverviewRedHat OpenStack Platform Overview
RedHat OpenStack Platform Overview
indevlab
 
Cgroups, namespaces and beyond: what are containers made from?
Cgroups, namespaces and beyond: what are containers made from?Cgroups, namespaces and beyond: what are containers made from?
Cgroups, namespaces and beyond: what are containers made from?
Docker, Inc.
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
ScyllaDB
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
Amazon Web Services
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
Brendan Gregg
 
Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...
Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...
Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...
Flink Forward
 
Apache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data Framework
Wes McKinney
 
High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...
Pradeep Redddy Raamana
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
JamieAlquiza
 
CI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường Chiến
CI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường ChiếnCI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường Chiến
CI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường Chiến
Vietnam Open Infrastructure User Group
 
RDMA programming design and case studies – for better performance distributed...
RDMA programming design and case studies – for better performance distributed...RDMA programming design and case studies – for better performance distributed...
RDMA programming design and case studies – for better performance distributed...
NTT Software Innovation Center
 
RedisConf18 - Redis at LINE - 25 Billion Messages Per Day
RedisConf18 - Redis at LINE - 25 Billion Messages Per DayRedisConf18 - Redis at LINE - 25 Billion Messages Per Day
RedisConf18 - Redis at LINE - 25 Billion Messages Per Day
Redis Labs
 
Change data capture
Change data captureChange data capture
Change data capture
Ron Barabash
 
KVM tools and enterprise usage
KVM tools and enterprise usageKVM tools and enterprise usage
KVM tools and enterprise usage
vincentvdk
 

What's hot (20)

Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentation
 
Virtualization Architecture & KVM
Virtualization Architecture & KVMVirtualization Architecture & KVM
Virtualization Architecture & KVM
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
 
Pegasus In Depth (2018/10)
Pegasus In Depth (2018/10)Pegasus In Depth (2018/10)
Pegasus In Depth (2018/10)
 
RedHat OpenStack Platform Overview
RedHat OpenStack Platform OverviewRedHat OpenStack Platform Overview
RedHat OpenStack Platform Overview
 
Cgroups, namespaces and beyond: what are containers made from?
Cgroups, namespaces and beyond: what are containers made from?Cgroups, namespaces and beyond: what are containers made from?
Cgroups, namespaces and beyond: what are containers made from?
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
 
Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...
Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...
Virtual Flink Forward 2020: Lessons learned on Apache Flink application avail...
 
Apache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data Framework
 
High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
 
CI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường Chiến
CI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường ChiếnCI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường Chiến
CI/CD trên Cloud OpenStack tại Viettel Networks | Hà Minh Công, Phạm Tường Chiến
 
RDMA programming design and case studies – for better performance distributed...
RDMA programming design and case studies – for better performance distributed...RDMA programming design and case studies – for better performance distributed...
RDMA programming design and case studies – for better performance distributed...
 
RedisConf18 - Redis at LINE - 25 Billion Messages Per Day
RedisConf18 - Redis at LINE - 25 Billion Messages Per DayRedisConf18 - Redis at LINE - 25 Billion Messages Per Day
RedisConf18 - Redis at LINE - 25 Billion Messages Per Day
 
Change data capture
Change data captureChange data capture
Change data capture
 
KVM tools and enterprise usage
KVM tools and enterprise usageKVM tools and enterprise usage
KVM tools and enterprise usage
 

Viewers also liked

Getting Started Best Practices
Getting Started Best PracticesGetting Started Best Practices
Getting Started Best Practices
Amazon Web Services
 
Security Best Practices
Security Best PracticesSecurity Best Practices
Security Best Practices
Amazon Web Services
 
Cost Optimisation
Cost OptimisationCost Optimisation
Cost Optimisation
Amazon Web Services
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
Amazon Web Services
 
Workshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECSWorkshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECS
Amazon Web Services
 
AWSome Day | Tech Track
AWSome Day | Tech TrackAWSome Day | Tech Track
AWSome Day | Tech Track
Amazon Web Services
 
AWSome Day Intro
AWSome Day IntroAWSome Day Intro
AWSome Day Intro
Amazon Web Services
 
Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017
Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017
Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017
Amazon Web Services
 
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
Amazon Web Services
 
Getting Started with Docker on AWS
Getting Started with Docker on AWSGetting Started with Docker on AWS
Getting Started with Docker on AWS
Amazon Web Services
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Amazon Web Services
 
Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWS Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWS Tom Laszewski
 
(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk
(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk
(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk
Amazon Web Services
 
[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編
[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編
[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編Amazon Web Services Japan
 
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
Amazon Web Services
 
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)
Amazon Web Services
 
Operating Your Production API
Operating Your Production APIOperating Your Production API
Operating Your Production API
Amazon Web Services
 
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
Amazon Web Services
 
AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...
AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...
AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...
Amazon Web Services
 
Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...
Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...
Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...
Amazon Web Services
 

Viewers also liked (20)

Getting Started Best Practices
Getting Started Best PracticesGetting Started Best Practices
Getting Started Best Practices
 
Security Best Practices
Security Best PracticesSecurity Best Practices
Security Best Practices
 
Cost Optimisation
Cost OptimisationCost Optimisation
Cost Optimisation
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 
Workshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECSWorkshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECS
 
AWSome Day | Tech Track
AWSome Day | Tech TrackAWSome Day | Tech Track
AWSome Day | Tech Track
 
AWSome Day Intro
AWSome Day IntroAWSome Day Intro
AWSome Day Intro
 
Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017
Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017
Introduction to AWS and Cloud Computing - Module 1 Part 1 - AWSome Day 2017
 
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
 
Getting Started with Docker on AWS
Getting Started with Docker on AWSGetting Started with Docker on AWS
Getting Started with Docker on AWS
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
 
Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWS Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWS
 
(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk
(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk
(DVO312) Sony: Building At-Scale Services with AWS Elastic Beanstalk
 
[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編
[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編
[AWSマイスターシリーズ] Amazon Elastic Compute Cloud HPC編
 
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
 
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)
 
Operating Your Production API
Operating Your Production APIOperating Your Production API
Operating Your Production API
 
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
 
AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...
AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...
AWS re:Invent 2016: AWS Mobile State of the Union - Serverless, New User Expe...
 
Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...
Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...
Enrich Your DevOps Environment: Tools for Accelerating and Integrating Your A...
 

Deep Dive on Amazon EC2

  • 1. Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization Best Practices. By Androski Spicer, Solutions Architect. © 2016 Amazon Web Services, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.
  • 2. What to Expect from the Session
      • An understanding of the factors that go into choosing an EC2 instance
      • Defining system performance and how it is characterized for different workloads
      • How Amazon EC2 instances deliver performance while providing flexibility and agility
      • How to make the most of your EC2 instance experience through the lens of several instance types
  • 3. Amazon Elastic Compute Cloud is Big (diagram labels: API, EC2, Instances, Networking, Purchase options)
  • 4. Amazon EC2 Instances (diagram labels: Host Server, Hypervisor, Guest 1, Guest 2, Guest n)
  • 5. In the past
      • First launched in August 2006
      • M1 instance
      • “One size fits all”
  • 6. Amazon EC2 Instances History (timeline, 2006 to 2016): m1.small, m1.large, m1.xlarge, c1.medium, c1.xlarge, m2.xlarge, m2.4xlarge, m2.2xlarge, cc1.4xlarge, t1.micro, cg1.4xlarge, cc2.8xlarge, m1.medium, hi1.4xlarge, m3.xlarge, m3.2xlarge, hs1.8xlarge, cr1.8xlarge, c3.large, c3.xlarge, c3.2xlarge, c3.4xlarge, c3.8xlarge, g2.2xlarge, i2.xlarge, i2.2xlarge, i2.4xlarge, i2.8xlarge, m3.medium, m3.large, r3.large, r3.xlarge, r3.2xlarge, r3.4xlarge, r3.8xlarge, t2.micro, t2.small, t2.medium, c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge, c4.8xlarge, d2.xlarge, d2.2xlarge, d2.4xlarge, d2.8xlarge, g2.8xlarge, t2.large, m4.large, m4.xlarge, m4.2xlarge, m4.4xlarge, m4.10xlarge, x1.32xlarge, t2.nano, m4.16xlarge, p2.xlarge, p2.8xlarge, p2.16xlarge
  • 8. EC2 Instance Families
      • General purpose: M4
      • Compute optimized: C3, C4
      • Storage and I/O optimized: I2, D2
      • GPU optimized: G2, P2
      • Memory optimized: R3, X1
  • 9. What’s a Virtual CPU? (vCPU)
      • A vCPU is typically one hyper-thread of a physical core*
      • On Linux, “A” threads are enumerated before “B” threads
      • On Windows, threads are interleaved
      • Divide vCPU count by 2 to get core count
      • Cores by EC2 & RDS DB instance type: https://aws.amazon.com/ec2/virtualcores/
      * The “t” family is special
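The divide-by-two rule can be checked on a running Linux instance. The sketch below is a minimal illustration that assumes the kernel exposes the usual sysfs topology files; the function names `count_cores` and `count_vcpus` are hypothetical helpers, not part of any AWS tool. It counts logical CPUs and the distinct sibling groups they belong to.

```shell
#!/bin/sh
# Count physical cores by collecting the distinct sibling groups found under
# a sysfs-style CPU directory. The directory is a parameter so the function
# can be pointed at a copy of the tree; it defaults to the live system.
count_cores() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    for f in "$cpu_dir"/cpu[0-9]*/topology/thread_siblings_list; do
        if [ -r "$f" ]; then
            cat "$f"
        fi
    done | sort -u | wc -l | tr -d ' '
}

# Count logical CPUs (vCPUs): one per cpuN directory.
count_vcpus() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    n=0
    for d in "$cpu_dir"/cpu[0-9]*; do
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

On an instance with Hyper-Threading enabled, `count_vcpus` should report twice the value of `count_cores`; CPUs that have been taken offline may not expose topology files and are then left out of the core count.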
  • 10.
  • 11. Disable Hyper-Threading If You Need To
      • Useful for FPU-heavy applications
      • Use ‘lscpu’ to validate layout
      • Hot offline the “B” threads (run as root):
            for i in `seq 64 127`; do
                echo 0 > /sys/devices/system/cpu/cpu${i}/online
            done
      • Set grub to only initialize the first half of all threads (CPUs 0-63):
            maxcpus=64

        [ec2-user@ip-172-31-7-218 ~]$ lscpu
        CPU(s):                128
        On-line CPU(s) list:   0-127
        Thread(s) per core:    2
        Core(s) per socket:    16
        Socket(s):             4
        NUMA node(s):          4
        Model name:            Intel(R) Xeon(R) CPU
        Hypervisor vendor:     Xen
        Virtualization type:   full
        NUMA node0 CPU(s):     0-15,64-79
        NUMA node1 CPU(s):     16-31,80-95
        NUMA node2 CPU(s):     32-47,96-111
        NUMA node3 CPU(s):     48-63,112-127
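The loop on this slide hard-codes CPUs 64 to 127, which matches the 128-vCPU layout shown by lscpu. As a hedged alternative for other instance sizes, the sketch below derives the “B” threads from the sysfs topology files instead. The helper names `list_b_threads` and `offline_b_threads_plan` are hypothetical, and the script only prints the commands it would run rather than writing to sysfs itself.

```shell
#!/bin/sh
# Print the ids of "B" threads: logical CPUs that are not the first entry in
# their own thread_siblings_list. Lists look like "0,64" or "0-1".
list_b_threads() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    for d in "$cpu_dir"/cpu[0-9]*; do
        f="$d/topology/thread_siblings_list"
        if [ -r "$f" ]; then
            id=${d##*/cpu}
            read -r siblings < "$f"
            first=${siblings%%[,-]*}
            if [ "$id" != "$first" ]; then
                echo "$id"
            fi
        fi
    done | sort -n
}

# Print the commands that would offline each B thread. Nothing is written by
# this function; review the output before applying it as root.
offline_b_threads_plan() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    list_b_threads "$cpu_dir" | while read -r id; do
        echo "echo 0 > $cpu_dir/cpu$id/online"
    done
}
```

After reviewing the plan, it could be applied with something like `offline_b_threads_plan | sudo sh`, then checked with `lscpu`.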
  • 12.
  • 13. Instance sizing: c4.8xlarge ≈ 2 × c4.4xlarge ≈ 4 × c4.2xlarge ≈ 8 × c4.xlarge
  • 14. Resource Allocation
      • All resources assigned to you are dedicated to your instance with no overcommitment*
      • All vCPUs are dedicated to you
      • Memory allocated is assigned only to your instance
      • Network resources are partitioned to avoid “noisy neighbors”
      • Curious about the number of instances per host? Use “Dedicated Hosts” as a guide.
      * Again, the “T” family is special
  • 15. “Launching new instances and running tests in parallel is easy…[when choosing an instance] there is no substitute for measuring the performance of your full application.” - EC2 documentation
  • 16. Timekeeping Explained
      • Timekeeping in an instance is deceptively hard
      • gettimeofday(), clock_gettime(), QueryPerformanceCounter()
      • The TSC
      • CPU counter, accessible from userspace
      • Requires calibration, vDSO
      • Invariant on Sandy Bridge+ processors
      • Xen pvclock; does not support vDSO
      • On current generation instances, use TSC as clocksource
  • 17. Benchmarking - Time Intensive Application
        #include <sys/time.h>
        #include <time.h>
        #include <stdio.h>
        #include <unistd.h>

        int main() {
            time_t start, end;
            time(&start);
            /* Call gettimeofday() 100 million times so the cost of reading the clock dominates. */
            for (int x = 0; x < 100000000; x++) {
                float f;
                float g;
                float h;
                f = 123456789.0f;
                g = 123456789.0f;
                h = f * g;
                struct timeval tv;
                gettimeofday(&tv, NULL);
            }
            time(&end);
            double dif = difftime(end, start);
            printf("Elasped time is %.2lf seconds.\n", dif);
            return 0;
        }
  • 18. Using the Xen Clock Source
        [centos@ip-192-168-1-77 testbench]$ strace -c ./test
        Elasped time is 12.00 seconds.
        % time     seconds  usecs/call     calls    errors syscall
        ------ ----------- ----------- --------- --------- ----------------
         99.99    3.322956           2   2001862           gettimeofday
          0.00    0.000096           6        16           mmap
          0.00    0.000050           5        10           mprotect
          0.00    0.000038           8         5           open
          0.00    0.000026           5         5           fstat
          0.00    0.000025           5         5           close
          0.00    0.000023           6         4           read
          0.00    0.000008           8         1         1 access
          0.00    0.000006           6         1           brk
          0.00    0.000006           6         1           execve
          0.00    0.000005           5         1           arch_prctl
          0.00    0.000000           0         1           munmap
        ------ ----------- ----------- --------- --------- ----------------
        100.00    3.323239               2001912         1 total
  • 19. Using the TSC Clock Source
      [centos@ip-192-168-1-77 testbench]$ strace -c ./test
      Elasped time is 2.00 seconds.
      % time     seconds  usecs/call     calls    errors syscall
      ------ ----------- ----------- --------- --------- ----------------
       32.97    0.000121           7        17           mmap
       20.98    0.000077           8        10           mprotect
       11.72    0.000043           9         5           open
       10.08    0.000037           7         5           close
        7.36    0.000027           5         6           fstat
        6.81    0.000025           6         4           read
        2.72    0.000010          10         1           munmap
        2.18    0.000008           8         1         1 access
        1.91    0.000007           7         1           execve
        1.63    0.000006           6         1           brk
        1.63    0.000006           6         1           arch_prctl
        0.00    0.000000           0         1           write
      ------ ----------- ----------- --------- --------- ----------------
      100.00    0.000367                    53         1 total
  • 20. Tip: Use TSC as clocksource  Change with: [commands shown as an image on the original slide]
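The commands on this slide were an image and did not survive extraction. According to the speaker notes, the first shows the available clock sources, the second shows the current one, and the third switches to TSC. A minimal sketch of those three steps, assuming the standard Linux sysfs clocksource directory; the function names and the `CS_DIR` override are hypothetical and exist only so the steps can be tried against a scratch directory:

```shell
# Standard Linux sysfs location for clock source settings.
# CS_DIR may be overridden (an assumption of this sketch, for dry runs).
CS_DIR="${CS_DIR:-/sys/devices/system/clocksource/clocksource0}"

# 1. List the clock sources the kernel offers on this instance.
available_clocksources() {
    cat "$CS_DIR/available_clocksource"
}

# 2. Show the clock source currently in use.
current_clocksource() {
    cat "$CS_DIR/current_clocksource"
}

# 3. Switch to TSC, but only if the kernel lists it as available.
#    Writing the real sysfs file requires root.
use_tsc() {
    if available_clocksources | grep -qw tsc; then
        echo tsc > "$CS_DIR/current_clocksource"
    else
        echo "tsc is not an available clock source" >&2
        return 1
    fi
}
```

A change made this way lasts only until the next reboot. To keep it, the `clocksource=tsc` kernel parameter can be added to the boot command line.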
  • 21. P-state and C-state control  c4.8xlarge, d2.8xlarge, m4.10xlarge, m4.16xlarge, p2.16xlarge, x1.16xlarge, x1.32xlarge  By entering deeper idle states, non-idle cores can achieve up to 300MHz higher clock frequencies  But… deeper idle states require more time to exit, and may not be appropriate for latency-sensitive workloads  Limit C-states by adding “intel_idle.max_cstate=1” to the kernel command line in grub
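One way to add the C-state limit is to edit the kernel command line in the GRUB defaults file. The sketch below assumes a GRUB 2 layout with a `GRUB_CMDLINE_LINUX="..."` line (as in `/etc/default/grub` on RHEL/CentOS 7) and GNU `sed`; the helper name `add_kernel_arg` is hypothetical. Distributions that use legacy GRUB keep the kernel line elsewhere and need a different edit.

```shell
# Append a kernel argument to GRUB_CMDLINE_LINUX in a GRUB 2 defaults file.
# Usage: add_kernel_arg <file> <arg>
add_kernel_arg() {
    file="$1"
    arg="$2"
    # Do nothing if the argument is already on the command line.
    if grep -q "^GRUB_CMDLINE_LINUX=.*$arg" "$file"; then
        return 0
    fi
    # Insert the argument just before the closing quote of the value.
    sed -i "s/^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"/\1 $arg\"/" "$file"
}
```

Typical use would be `sudo sh -c '. ./helpers.sh; add_kernel_arg /etc/default/grub intel_idle.max_cstate=1'` (the file name `helpers.sh` is a placeholder), followed by regenerating the GRUB configuration, for example with `grub2-mkconfig -o /boot/grub2/grub.cfg` on RHEL/CentOS 7, and a reboot.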
  • 22. Tip: P-state control for AVX2  If an application makes heavy use of AVX2 on all cores, the processor may attempt to draw more power than it should  Processor will transparently reduce frequency  Frequent changes of CPU frequency can slow an application sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo" See also: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
  • 23. Review: T2 Instances  Lowest cost EC2 instance at $0.0065 per hour  Burstable performance  Fixed allocation enforced with CPU credits
      Model      vCPU  Baseline  CPU Credits / Hour  Memory (GiB)  Storage
      t2.nano    1     5%        3                   0.5           EBS Only
      t2.micro   1     10%       6                   1             EBS Only
      t2.small   1     20%       12                  2             EBS Only
      t2.medium  2     40%**     24                  4             EBS Only
      t2.large   2     60%**     36                  8             EBS Only
      General purpose, web serving, developer environments, small databases
  • 24. How Credits Work  A CPU credit provides the performance of a full CPU core for one minute  An instance earns CPU credits at a steady rate  An instance consumes credits when active  Credits expire (leak) after 24 hours Baseline rate Credit balance Burst rate
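The credit rule above can be turned into simple arithmetic. Since one credit is one full core for one minute, an instance that earns N credits per hour can sustain N/60 of a core, which reproduces the Baseline column of the T2 table. The sketch below uses integer shell arithmetic; both function names are hypothetical, and the burst estimate ignores the 24-hour credit expiry.

```shell
# Baseline CPU percentage implied by an hourly credit rate.
# Usage: baseline_percent <credits_per_hour>
baseline_percent() {
    echo $(( $1 * 100 / 60 ))
}

# Whole minutes one vCPU can run at 100% before the balance is empty.
# Burning costs 60 credits per hour while <credits_per_hour> keep arriving,
# so the net drain is (60 - credits_per_hour) credits per hour.
# Usage: burst_minutes <credit_balance> <credits_per_hour>
burst_minutes() {
    echo $(( $1 * 60 / (60 - $2) ))
}
```

For a t2.micro (6 credits per hour), `baseline_percent 6` prints `10`, and `burst_minutes 54 6` prints `60`: a balance of 54 credits lasts an hour at full speed because 6 more credits arrive during that hour.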
  • 25. Tip: Monitor CPU credit balance
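The credit balance is published as the CloudWatch metric `CPUCreditBalance` in the `AWS/EC2` namespace. A minimal sketch of reading it with the AWS CLI follows; the wrapper name `credit_balance` and the instance ID in the usage note are placeholders, and the five-minute period is an arbitrary choice.

```shell
# Fetch the average CPUCreditBalance of an instance over a time window.
# Times are ISO 8601 in UTC, for example 2016-01-01T00:00:00Z.
# Usage: credit_balance <instance-id> <start-time> <end-time>
credit_balance() {
    aws cloudwatch get-metric-statistics \
        --namespace AWS/EC2 \
        --metric-name CPUCreditBalance \
        --dimensions "Name=InstanceId,Value=$1" \
        --start-time "$2" \
        --end-time "$3" \
        --period 300 \
        --statistics Average
}
```

Example: `credit_balance i-0123456789abcdef0 2016-01-01T00:00:00Z 2016-01-01T01:00:00Z`. The same metric can back a CloudWatch alarm, so that a falling balance is noticed before the instance drops to baseline.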
  • 26. Review: X1 Instances  Largest memory instance with 2 TB of DRAM  Quad socket, Intel E7 processors with 128 vCPUs
      Model        vCPU  Memory (GiB)  Local Storage   Network
      x1.16xlarge  64    976           1x 1920GB SSD   10Gbps
      x1.32xlarge  128   1952          2x 1920GB SSD   20Gbps
      In-memory databases, big data processing, HPC workloads
  • 27. NUMA  Non-uniform memory access  Each processor in a multi-CPU system has local memory that is accessible through a fast interconnect  Each processor can also access memory from other CPUs, but local memory access is a lot faster than remote memory  Performance is related to the number of CPU sockets and how they are connected - Intel QuickPath Interconnect (QPI)
  • 28. [Diagram: r3.8xlarge, two sockets linked by QPI, each with 16 vCPUs and 122GB of local memory]
  • 29. [Diagram: x1.32xlarge, four sockets linked by QPI, each with 32 vCPUs and 488GB of local memory]
  • 30. Tip: Kernel Support for NUMA Balancing  An application will perform best when the threads of its processes are accessing memory on the same NUMA node.  NUMA balancing moves tasks closer to the memory they are accessing.  This is all done automatically by the Linux kernel when automatic NUMA balancing is active: version 3.8+ of the Linux kernel.  Windows support for NUMA first appeared in the Enterprise and Data Center SKUs of Windows Server 2003.  Set “numa=off” or use numactl to reduce NUMA paging if your application uses more memory than will fit on a single socket or has threads that move between sockets
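A short sketch of the two checks this tip implies: whether automatic NUMA balancing is on, and how to pin a process to one node with `numactl`. It assumes a kernel that exposes `/proc/sys/kernel/numa_balancing`; the function names and the `NUMA_BALANCING_FILE` override are hypothetical.

```shell
# Kernel switch for automatic NUMA balancing (0 = off, non-zero = on).
NUMA_BALANCING_FILE="${NUMA_BALANCING_FILE:-/proc/sys/kernel/numa_balancing}"

# Report whether automatic NUMA balancing is currently active.
numa_balancing_state() {
    if [ "$(cat "$NUMA_BALANCING_FILE")" = "0" ]; then
        echo disabled
    else
        echo enabled
    fi
}

# Run a command with both its CPUs and its memory restricted to one node.
# Usage: run_on_node <node> <command> [args...]
run_on_node() {
    node="$1"
    shift
    numactl --cpunodebind="$node" --membind="$node" "$@"
}
```

`numactl --hardware` lists the nodes, their CPUs, and their memory sizes, which helps when deciding which node an application fits on.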
  • 31. Operating Systems Impact Performance  Memory intensive web application  Created many threads  Rapidly allocated/deallocated memory  Comparing performance of RHEL6 vs RHEL7  Noticed a high amount of “system” time in top  Found a benchmark tool (ebizzy) with a similar performance profile  Traced its performance with “perf”
  • 32. On RHEL6
      [ec2-user@ip-172-31-12-150-RHEL6 ebizzy-0.3]$ sudo perf stat ./ebizzy -S 10
      12,409 records/s
      real  10.00 s
      user   7.37 s
      sys  341.22 s

      Performance counter stats for './ebizzy -S 10':

          361458.371052 task-clock (msec)    #   35.880 CPUs utilized
                 10,343 context-switches     #    0.029 K/sec
                  2,582 cpu-migrations       #    0.007 K/sec
              1,418,204 page-faults          #    0.004 M/sec

           10.074085097 seconds time elapsed
  • 33. RHEL6 Flame Graph Output www.brendangregg.com/flamegraphs.html
  • 34. On RHEL7
      [ec2-user@ip-172-31-7-22-RHEL7 ~]$ sudo perf stat ./ebizzy-0.3/ebizzy -S 10
      425,143 records/s
      real  10.00 s
      user 397.28 s
      sys    0.18 s

      Performance counter stats for './ebizzy-0.3/ebizzy -S 10':

          397515.862535 task-clock (msec)    #   39.681 CPUs utilized
                 25,256 context-switches     #    0.064 K/sec
                  2,201 cpu-migrations       #    0.006 K/sec
                 14,109 page-faults          #    0.035 K/sec

           10.017856000 seconds time elapsed

      Records/s: up from 12,400 records/s!  Page faults: down from 1,418,204!
  • 36. Hugepages
       Disable Transparent Hugepages
          # echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
          # echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
       Use Explicit Huge Pages
          $ sudo mkdir /dev/hugetlbfs
          $ sudo mount -t hugetlbfs none /dev/hugetlbfs
          $ sudo sysctl -w vm.nr_hugepages=10000
          $ HUGETLB_MORECORE=yes LD_PRELOAD=libhugetlbfs.so numactl --cpunodebind=0 --membind=0 /path/to/application
      See also: https://lwn.net/Articles/375096/
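After changing these settings it helps to confirm what the kernel is actually using. The sketch below reads the active transparent hugepage mode and the size of the explicit huge page pool. Note that the `redhat_transparent_hugepage` path on the slide is specific to RHEL 6; mainline kernels use `/sys/kernel/mm/transparent_hugepage`, which is the default assumed here. The function names and the `THP_FILE` override are hypothetical.

```shell
# Mainline location of the THP switch; on RHEL 6 point THP_FILE at
# /sys/kernel/mm/redhat_transparent_hugepage/enabled instead.
THP_FILE="${THP_FILE:-/sys/kernel/mm/transparent_hugepage/enabled}"

# The file lists all modes with the active one in brackets,
# e.g. "always madvise [never]". Print only the bracketed word.
thp_mode() {
    sed 's/.*\[\([a-z]*\)].*/\1/' "$THP_FILE"
}

# Number of explicit huge pages currently reserved by the kernel.
# Usage: hugepages_total [meminfo-file]
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}
```

If `hugepages_total` reports fewer pages than requested with `vm.nr_hugepages`, the kernel could not find enough contiguous memory; reserving the pages early after boot usually helps.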
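  • 37. Split Driver Model [Diagram: a driver domain and guest domains run on the VMM above the hardware (physical CPU, physical memory, storage device). The VMM provides virtual CPU, virtual memory and CPU scheduling. A guest application's I/O passes through five numbered steps: application, sockets, frontend driver in the guest, backend driver in the driver domain, then the device driver to the hardware]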
  • 38. Granting in pre-3.8.0 Kernels  Requires “grant mapping” prior to 3.8.0  Grant mappings are expensive operations due to TLB flushes  Inter-domain I/O for a read(fd, buffer, …) issued in the instance and served by the I/O domain from SSD: (1) Grant memory (2) Write to ring buffer (3) Signal event (4) Read ring buffer (5) Map grants (6) Read or write grants (7) Unmap grants
  • 39. Granting in 3.8.0+ Kernels, Persistent and Indirect  Grant mappings are set up in a pool one time  Data is copied in and out of the grant pool [Diagram: a read(fd, buffer, …) in the instance copies to and from a grant pool shared with the I/O domain, which accesses the SSD]
  • 40. Validating Persistent Grants
      [ec2-user@ip-172-31-4-129 ~]$ dmesg | egrep -i 'blkfront'
      Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
      blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdd: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvde: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdf: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdg: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdh: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
      blkfront: xvdi: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
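On an instance with many volumes, scanning this output by eye is error-prone. The sketch below filters kernel log lines in the format shown on the slide and prints only the devices that do not report persistent grants as enabled; the function name is hypothetical.

```shell
# Read kernel log lines on stdin and print each blkfront device whose
# line does not contain "persistent grants: enabled".
# Usage: dmesg | devices_without_persistent_grants
devices_without_persistent_grants() {
    grep 'blkfront: ' |
        grep -v 'persistent grants: enabled' |
        sed 's/.*blkfront: \([a-z0-9]*\):.*/\1/'
}
```

Empty output means every attached volume is using persistent grants.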
  • 41. 2009 – Longer ago than you think  Avatar was the top movie in the theaters  Facebook overtook MySpace in active users  President Obama was sworn into office  The 2.6.32 Linux kernel was released Tip: Use 3.10+ kernel  Amazon Linux 13.09 or later  Ubuntu 14.04 or later  RHEL/CentOS 7 or later  Etc.
  • 42. Device Pass Through: Enhanced Networking  SR-IOV eliminates need for driver domain  Physical network device exposes virtual function to instance  Requires a specialized driver, which means:  Your instance OS needs to know about it  EC2 needs to be told your instance can use it
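Both requirements can be checked from the command line. Inside the instance, `ethtool -i` reports which driver is bound to the interface (`ixgbevf` for the Intel virtual function, `ena` for the Elastic Network Adapter). From outside, the AWS CLI reports whether EC2 has been told the instance supports SR-IOV. The wrapper names below are hypothetical, and the instance ID in the usage note is a placeholder.

```shell
# Print the kernel driver bound to a network interface.
# Usage: nic_driver <interface>
nic_driver() {
    ethtool -i "$1" | awk '/^driver:/ { print $2 }'
}

# Ask EC2 whether the instance is flagged for SR-IOV enhanced networking.
# Usage: sriov_attribute <instance-id>
sriov_attribute() {
    aws ec2 describe-instance-attribute \
        --instance-id "$1" \
        --attribute sriovNetSupport
}
```

Example: `nic_driver eth0` and `sriov_attribute i-0123456789abcdef0`. If the attribute is not set, it can be enabled on a stopped instance with `aws ec2 modify-instance-attribute --instance-id <id> --sriov-net-support simple`.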
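  • 43. After Enhanced Networking [Diagram: the driver domain is no longer in the network path. The guest application reaches the SR-IOV network device in three numbered steps: application, sockets, then the NIC driver inside the guest domain talking directly to the hardware (physical CPU, physical memory, SR-IOV network device). The VMM still provides virtual CPU, virtual memory and CPU scheduling]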
  • 44. Elastic Network Adapter  Next Generation of Enhanced Networking  Hardware Checksums  Multi-Queue Support  Receive Side Steering  20Gbps in a Placement Group  New Open Source Amazon Network Driver
  • 45. Network Performance  20 Gigabit & 10 Gigabit  Measured one-way, double that for bi-directional (full duplex)  High, Moderate, Low – A function of the instance size and EBS optimization  Not all created equal – Test with iperf if it’s important!  Use placement groups when you need high and consistent instance to instance bandwidth  All traffic limited to 5 Gb/s when exiting EC2
  • 46. EBS Performance  Instance size affects throughput  Match your volume size and type to your instance  Use EBS optimization if EBS performance is important
  • 47. Summary: Getting the Most Out of EC2 Instances  Choose HVM AMIs  Timekeeping: use TSC  C state and P state controls  Monitor T2 CPU credits  Use a modern Linux OS  NUMA balancing  Persistent grants for I/O performance  Enhanced networking  Profile your application
  • 48. Virtualization Themes  Bare metal performance goal, and in many scenarios already there  History of eliminating hypervisor intermediation and driver domains  Hardware assisted virtualization  Scheduling and granting efficiencies  Device pass through
  • 49. Next Steps  Visit the Amazon EC2 documentation  Launch an instance and try your app!

Editor's Notes

  1. Let’s start at the basics What is an EC2 instance? They are virtual machines Guests On a Hypervisor On physical hardware
  2. Presentation built to be a deep dive Going into the depths of how EC2 works Highlight actionable things Get the most performance out of your instances Talk a bit about how to choose an EC2 instance When it comes to performance Making sure you’re picking the right one is as important as the tuning tips that I’m also going to talk about
  3. EC2 is a big subject Talk about the Purchase Options APIs & SDK’s Networking Talk today about the instances themselves How they operate Features Options when you go to launch Other Topics List Recommended sessions at the end
  4. Let’s start at the basics What is an EC2 instance? They are virtual machines Guests On a Hypervisor On physical hardware
  5. Launched in 2006 “an instance” Didn’t have a name Didn’t get any choices Like the Model T – any color as long as it’s black Eventually gave it a name M1 instance Customers wanted more choice
  6. We’ve been iterating and growing ever since. Not only adding instances, but changing how EC2 works Launched the cc2 in 2011 Placement groups Bandwidth and latency Hardware assisted virtualization Exposes more of the underlying hardware Lets you get even more performance EC2 is always growing and changing based on customer feedback Always check our documentation for the latest as you’re building out How we do things today may be different in the future
  7. Go over how we talk about instances and name them Get on the same page First letter is the family Stands for what it’s suited for or what resources it has C for compute R for Ram I for IOPS Number is the generation Like a version number Last is the instance size T-Shirt size
  8. You’ve got a lot of choices and flexibility when you go to launch It can seem overwhelming Trying to pick the right instance Looking at just the families First find what your application is constrained by If you need memory, start with R3 CPU, go with C4 If balanced, look at general purpose M or T Perspective of your constraint, it’s easy to pick the right family Test to find the right size within that family If you need a little help, check the documentation List of workloads for each family.
  9. When you’re looking at instances, you’ll see something called vCPUs On modern instances not in the T family A hyperthreaded core Hyperthreading is great to increase performance It kinda lets your CPU do two things at once Like waiting on IO Real core count Divide by two Visit link, used for licensing.
  10. To give a visual representation… Output of LSTOPO on m4.10xlarge Linux utility for enumerating hardware Can run on any instance or physical server Shows graphical output of hardware configuration Sockets Memory on each socket L1-3 Cache CPU thread to core mapping Case of m4.10xlarge 40 threads 20 cores
  11. Some applications don’t benefit from hyperthreading Context switching may decrease performance Typically compute heavy apps Financial calculations & engineering simulations These apps usually disable hyperthreading If you’re not sure or don’t typically disable hyperthreading, don’t worry If you do, try to run with it disabled on EC2 and see if it improves performance Easy on Linux, harder on Windows Linux The first set of threads on each core is listed first, and the second or B threads are listed after that Disable the last half, which will be all the B threads Two ways Online Great for no reboot But it may cause instability Disable processors where threads may be running Won’t be persisted after a reboot In grub, Set max cpus to match physical cpu count minus 1 Safer – disabled when booting But makes it harder when you switch size Windows is harder Interleaved Have to use CPU affinity
  12. Same m4.10xlarge with hyperthreading turned off Only one CPU thread per core Compared to the two that you saw earlier
  13. Let’s dig into how instance sizes work We build instances Easy to scale vertically and horizontally Look at the c4 family as an example C4.8xlarge on the left Largest instance size available That single c4.8xlarge Roughly equal to 2 c4.4xlarges That c4.4xlarge has roughly half the Number of vCPUS Amount of ram Available network bandwidth Keeps following down the line 2x c4.4xlarge = 4x c4.2xlarge And so on…
  14. Reason is because of how we partition instances Largest size is typically a full server On the smaller ones you’re running a fraction of it depending on the size Virtualization historically has a bad reputation Usually used to manage over utilization of resources More virtual machines than physical resources We use virtualization for a lot of other reasons Security & Isolation Dedicate specific resources to specific customers vCPUs as an example With exception of T When you’re assigned a vCPU you’re the only customer using it Not sharing with anyone else on the box Same applies to Memory & Network We build with the goal of providing a consistent experience No matter what else is happening
  15. Last thing I want to say about choosing your instance Cheesy to quote documentation Good sentiment Easy to get an app up and running Don’t run synthetic benchmarks Install your app and send some realistic load Examples: Mobile App HPC application BI database Use a real workload to understand how your app will behave
  16. Digging deeper into the OS… On all systems, time keeping is important Used for things like Processing interrupts Getting the time and date Measuring performance Most AMIs on AWS use Xen clock by default Compatible with all instance types TSC is invariant on Sandy Bridge and later Handled by bare metal You’re talking to your processor Not the hypervisor And because of this, calls to it are going to be much quicker
  17. To demonstrate this – simple application It does two things Performs a large number of get time of day calls a bit of math Don’t laugh at my code… I’m a sysadmin, not a developer Quick and dirty to test it out
  18. These are results on Xen clock source Profiled with strace Really great tool to use with any app, yours included Shows the number of system calls made & the time they took Gettimeofday takes the most amount of time with a lot of calls Overall, the test took about 12 seconds to run with Xen clock source
  19. On the same system Switched clocksource to TSC Reran the test Results look a lot different Gettimeofday doesn’t show up Run time reduced to two seconds This is extreme for a simple app I’ve seen apps improve by as much as 40%
  20. It’s an easy change to make on Linux Do it while the system is running First command shows available clock sources Second shows the current clock source Third would change it to TSC On windows, it’s handled automatically If you’re running a recently released EC2 instance Can improve a lot of apps JVM debugging Performance tracing SAP applications
  21. A recent change to the platform added P & C state control with C4, now available on many more First, let’s talk about C states C states control the power savings features of a processor Using c4.8xlarge as an example Base clock speed of 2.9GHz Can turbo up to 3.5GHz on one or two cores Must let other cores idle down Great when you need a few cores to have high frequencies Letting them idle down increases the time it takes for them to respond when you want to actually use them So if you have an application where latency is important You can limit how deep they’ll sleep Setting cstate parameter in grub
  22. You can use P state to set the desired running frequency of the cores For some customers and some workloads consistency is more important than performance Some game servers are a good example Operate in loops Loop needs to complete in the same time, every time You can set the P state to prevent the processor from scaling up and down Operates at the same frequency all the time
  23. Next I want to talk about T2 and why they’re special T2 instances are great general purpose instances Lowest cost instance available on AWS at ~1/2 a cent per hour for t2.nano Great for workloads where CPU demand varies over time Websites Developer environments Small database You start with a baseline level of performance That you can see in the chart above The magic of T2 is that you earn credits when the instance is idle Allows you to burst above the baseline We launched T2 because we saw that most workloads aren’t using 100% of CPU all of the time T2 family is a great way to Still get the performance you need when you need it Don’t pay for it when you don’t
  24. Let’s talk about how credits work You can think of credits in a T2 like a bucket When you boot the instance Start with enough credits for OS & Application When your app is up and running, you’ll use credits when you use CPU A single credit will let you run 100% of one core for one minute When the work dies down and instance becomes idle Earning new credits that will start to fill up the bucket Credits also expire after 24 hours if unused
  25. To monitor those instances Cloudwatch Metrics Two available The one in Orange is the credit usage Spikes when usage is high Shows you how many credits you’re using per minute The Blue is the Balance Keep this above zero if you want more performance than baseline Monitoring your credit balance lets you ensure you’re getting consistent performance on a T2 What you’ll want to hook on if you’re using autoscaling
  26. Recently launched the X1 Biggest instance 2TB of RAM 128 Virtual CPUs Great for apps that need a huge memory footprint Good for: In memory databases big data processing some HPC
  27. When you have that much memory Managing is important On any system with multiple sockets Memory attached to local socket will be faster than remote Concept is called NUMA On Intel, there’s a QPI between sockets It’s the bus that transfer memory from one to another
  28. Look at the r3.8xlarge as an example Two sockets 122GB of ram on each socket Between are 2 QPI links Application on the left reading from the right Will go over the QPI Fast, but not as fast as what’s attached directly
  29. When you go to X1, things are more complex X1 is a 4 socket system Numa is more important Compared to an r3.8xlarge More memory per socket Only one QPI between sockets Memory transfers from one zone to another are going to take longer on X1
  30. So what can we do? If you’ve ever watched top on a linux system shows threads moving from one core to another Process scheduling to make sure work is balanced Around 3.8 kernel, started to use NUMA affinity Will try to keep processes in same NUMA zone Will also try to move memory around to be close to the process The downside is that this can actually slow down performance on some apps Especially true if you have a large memory pool spanning sockets The scheduler will be moving things around when it doesn’t need to be To disable, set NUMA=off in grub will disable memory transfers between zones disable NUMA awareness for process scheduling Alternative is to use numactl to lock processes to a specific zone Only be reading and writing memory that’s local to them
  31. Another thing to keep in mind Operating system and libraries can affect application performance Running a modern Linux kernel is important Run as recent of a distro as you can Recent customer visit Custom Application using a large amount of memory EC2 performance not as good as on premise Their app was very complex and it was hard to get quick results when making changes Found a benchmark tool (ebizzy) with similar behavior to test
  32. Results of ebizzy on RHEL6 Used perf to profile and see what’s happening at a system level Generated 12,000 requests/second Lots of time in system space 1.5 million page faults
  33. Generated flame graphs to see what’s happening Created by Brendan Gregg, check out his site for more information A really good way to understand Paths the code is taking through a system Time spent in specific calls You can see ebizzy on the bottom Making lots of madvise calls End up with a Xen hypercall
  34. Compiled the same app on RHEL7 and tested on same instance type Saw significantly better performance RPS went from 12,000 – 425,000 Page faults went from 1.5 million to only 14,000
  35. What happened? This is where flamegraphs really shine Same exact flame graph Same Code Same run type Only difference is the OS version What the flamegraph showed us Glibc Changed the path memory calls take on RHEL7 Instead of long madvise with trip to hypervisor Single Intel optimized call for memory management Recompile when moving to a different OS, it can make a big difference
  36. Last memory related tip is to Disable transparent huge pages Huge pages are a really big subject with a lot of different options See article It goes into detail about all the different options Transparent is enabled by default on most recent distributions Disabling transparent and using explicit Can help significantly for apps that are accessing a lot of memory
  37. Next, let’s talk about IO We have a few families that are optimized for IO I2 – IOPS – SSD Based D2 – Dense storage - Magnetic Need a modern kernel to get best storage performance Reason is split driver model Application on left doing some disk IO Talks to the front end driver Then back end Then real driver Then hardware Data transfer happens through shared pages Need permissions to be granted and released
  38. Granting had lots of overhead in early kernels Every time it needs to write to disk Talks to VMM Get permission to write to device Fill a buffer with the data Pass to backend Wait for data to be written Remove the grant Really expensive process, lots of buffer flushing Gets worse the more CPUs you have
  39. Persistent grants created to solve this. Permission to write is reused for all transactions between front and back Grants don’t need to be unmapped Translation buffer never flushed Much better performance for IO operations
  40. Validating grants is easy Run dmesg and grep for blkfront This is i2.8xlarge All volumes have persistent grants enabled.
  41. If I haven’t said it enough Using a modern kernel is really important Many customers still use CentOS 6 Just by switching OS’s to 3.10 Seen as much as a 60% improvement 2.6 Kernel in CentOS 6 released in 2009 Long time ago in the cloud computing world Please use a modern Kernel & OS.
  42. Same lines as split driver model Released enhanced networking with C3 Uses Single Root IO Virtualization – SR-IOV Physical device exposed to instance itself Has a few requirements Needs a special driver installed in the OS EC2 needs to be told to expose it that way
  43. Network path is much simpler Packets don’t have to go through the VMM Higher packets per second Decreased jitter – talking to bare metal It’s free on all supported instances Enabled by default in many AMIs Highly recommend it if you’re touching the network
  44. And we’re not done improving the network Still making constant improvements Latest is with a new Network Adapter Launched with the X1 Called Elastic Network Adapter – ENA Built a new Amazon developed Open source Driver Will grow with us as we’re adding new features to the network Built to handle throughputs up to 20 Gigabits/second This + Hardware checksums & RSS make it the fastest network available on AWS today.
  45. Touch briefly on a few points about network performance Attend the Deep dive to learn more Easy to forget that network can be a bottleneck on smaller instance types Customer doing S3 performance testing Not getting good performance Found out all network traffic was going through a T2 NAT Largest instances should get closer to 5Gbps when leaving EC2 and talking to things like S3 When we list 10 & 20 Gigabit bandwidth Instance bandwidth is bi-directional On P2, X1, and M4 20Gb in and out at the same time But you need placement groups and multiple TCP streams
  46. Just like network throughput, EBS throughput is function of size of the instance Larger instance, more EBS traffic EBS optimization by default on newest instances Don’t have to worry about network and EBS competing Look at the EBS documentation Table of every EBS optimized instance Throughput and max IOPS Great place to go to look for specific performance out of EBS
  47. In conclusion Lots of things Getting the most out of it At bare minimum Benchmark your app Use a modern OS Monitor Cloudwatch Use enhanced networking
  48. Goal is to make virtualization as transparent as possible Eliminate any inefficiencies it may cause Goal of bare metal like performance Already there in a lot of ways
  49. So if you have any questions, the EC2 documentation is a great resource and covers even more than I could today. Otherwise, launch an instance and start testing your app. Thank you!