▸ Director of Infrastructure Services at IQ Innovations, LLC.
▸ Have been working in IT for 12+ years in various areas ranging
from desktop support to system administration to management.
▸ AWS Certified Solutions Architect
▸ Have been working heavily in AWS for about 2.5 years.
▸ Email: email@example.com
▸ Twitter: @bpadair
APPLICATION MANAGEMENT IN AWS
▸ Public cloud in general, and AWS in particular, are
changing the way that we think about infrastructure and
the way we manage the applications that run on that
infrastructure.
▸ Less permanence, more ephemeral and temporary.
▸ More purpose built and dedicated resources.
▸ Less “make it fit”
WHAT DO WE MEAN?
▸ What do we mean when we talk about performance?
▸ Getting as much power as possible?
▸ Getting just enough?
▸ What about growth?
▸ Use Trusted Advisor to find (somewhat) obvious
▸ Things like over-utilized instances, excessive security
group rules, and cache-hit ratio can be found here.
▸ Plan for performance to scale, not grow.
▸ Monitor, monitor, monitor.
▸ Need special consideration.
▸ RDS, Dynamo, EC2 instance.
▸ If using EC2, use provisioned IOPS, and RAID-0 volumes.
▸ Do not put databases on EFS file systems.
▸ Replication - yes/no - where?
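The provisioned IOPS and RAID-0 advice can be sketched as simple arithmetic. This is a minimal illustration under idealized assumptions: the `Volume` class and all figures are hypothetical, striping is assumed to scale linearly, and any per-instance throughput cap is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """One provisioned-IOPS block volume (figures are hypothetical)."""
    size_gib: int
    iops: int


def raid0_totals(volumes):
    """Return (usable GiB, ideal aggregate IOPS) for a RAID-0 stripe set.

    RAID-0 stripes data evenly, so usable space and throughput are bounded
    by the smallest and slowest member multiplied by the member count.
    """
    if not volumes:
        raise ValueError("a stripe set needs at least one volume")
    count = len(volumes)
    usable_gib = count * min(v.size_gib for v in volumes)
    ideal_iops = count * min(v.iops for v in volumes)
    return usable_gib, ideal_iops


# Four identical hypothetical volumes striped together.
stripe = [Volume(size_gib=500, iops=4000) for _ in range(4)]
usable, iops = raid0_totals(stripe)
print(f"{usable} GiB usable, up to {iops} IOPS (ideal)")
```

RAID-0 has no redundancy: losing any one volume loses the whole array, which is why the replication question matters for databases run this way.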
CASE STUDY: IQ INNOVATIONS
▸ Two data centers and a public cloud provider.
▸ All CentOS running on ESXi.
▸ MySQL database.
▸ Apache, Tomcat, Grails stack on app servers.
▸ One client's configuration: 8 servers dedicated to MySQL, 14 app servers, 1 NFS server, 2 utility servers
▸ Performance was terrible.
▸ Average app response time: ~600ms
▸ Average end-user response time: ~4s
▸ Constantly running out of memory and restarting
▸ Nowhere to grow
CASE STUDY: IQ INNOVATIONS
▸ Moved to AWS. Eliminated the colocation space and other cloud provider.
▸ Still running MySQL and CentOS.
▸ Databases moved to RDS. Application servers moved to EC2.
▸ Same client configuration: 6 RDS instances for databases, 4 app servers, 1 utility server,
EFS to replace SAN.
▸ Performance improved dramatically:
▸ App response time: ~80-100ms
▸ End-user response time: ~1-2s
▸ No more memory issues.
▸ Cost savings of about 50%.
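To put the before and after response times on one scale, the improvement can be expressed as a percentage reduction. This is a small worked calculation using only the approximate figures quoted above; the helper name `percent_reduction` is illustrative.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Approximate figures from the case study.
app_before_ms, app_after_ms = 600, (80, 100)
user_before_s, user_after_s = 4, (1, 2)

for label, before, (best, worst) in [
    ("App response time", app_before_ms, app_after_ms),
    ("End-user response time", user_before_s, user_after_s),
]:
    low = percent_reduction(before, worst)
    high = percent_reduction(before, best)
    print(f"{label}: roughly {low:.0f}% to {high:.0f}% lower")
```

With these numbers the app response time drops by roughly 83% to 87%, and the end-user response time by roughly 50% to 75%.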
HAVEN’T WE BEEN DOING THIS FOREVER?
▸ Yes, and a lot of existing knowledge still applies.
▸ You still need smart policies.
▸ Your application still needs to protect against common attack vectors.
▸ Some things do change with a move to AWS, however.
▸ You are no longer responsible for physical security.
▸ You are no longer responsible for hypervisor security or patching.
▸ Depending on the service you may not even be responsible for OS
security and patching.
▸ Trusted Advisor. This is a recurring theme.
▸ Bastion hosts
▸ Security groups
▸ COMMON SENSE!
▸ Console access for everyone.
▸ Overly permissive policies.
▸ Lack of two-factor authentication.
▸ Overly/Publicly exposed access keys.
▸ Access key rotation.
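Two of the items above, overly permissive policies and access key rotation, lend themselves to simple automated checks. The sketch below is a coarse illustration, not a full audit: it works on an IAM policy document already loaded as a Python dict and on a plain mapping of key IDs to creation times. The function names, the key IDs, and the 90-day threshold are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def _as_list(value):
    """IAM policy fields may hold a single item or a list of items."""
    if value is None:
        return []
    return [value] if isinstance(value, (str, dict)) else list(value)


def overly_permissive_statements(policy):
    """Return Allow statements that use a bare "*" action or resource."""
    flagged = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        if "*" in actions or "*" in resources:
            flagged.append(stmt)
    return flagged


def keys_due_for_rotation(key_created, max_age_days=90, now=None):
    """Return IDs of keys older than max_age_days (90 is an assumed policy)."""
    now = now or datetime.now(timezone.utc)
    limit = timedelta(days=max_age_days)
    return sorted(k for k, created in key_created.items() if now - created > limit)


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
}
print(len(overly_permissive_statements(policy)), "statement(s) flagged")

# Hypothetical key IDs and creation dates.
created = {
    "KEY_A": datetime(2023, 6, 1, tzinfo=timezone.utc),
    "KEY_B": datetime(2023, 12, 1, tzinfo=timezone.utc),
}
as_of = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(keys_due_for_rotation(created, now=as_of))
```

Flagging every `"Resource": "*"` is deliberately blunt; some actions legitimately require it, so the results are a starting point for review rather than a verdict.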
EASIER AND HARDER SIMULTANEOUSLY
▸ A lot of the work for reliability is done for you.
▸ It is a mistake to put too much trust in this.
▸ The tools are there, but you have to choose to use them.
▸ Architecture matters.
CRITICAL THINGS TO UNDERSTAND
▸ Availability zones
▸ Difference between AZs and Regions and how they should
be used together.
▸ Replication of different services.
▸ Availability SLAs.
▸ S3 storage classes/levels
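The interplay between availability zones and availability SLAs can be illustrated with basic probability. This is a rough sketch that assumes failures are independent, which real systems only approximate; the availability figures are hypothetical and not taken from any published SLA.

```python
from math import prod

HOURS_PER_YEAR = 8760


def serial_availability(components):
    """Availability of a chain in which every component must be up."""
    return prod(components)


def redundant_availability(single, copies):
    """Availability when any one of `copies` independent replicas suffices."""
    return 1.0 - (1.0 - single) ** copies


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical stack: a web tier and a database tier, each 99.9% available.
one_az = serial_availability([0.999, 0.999])
two_az = redundant_availability(one_az, copies=2)

print(f"single AZ: {one_az:.6f} ({downtime_hours_per_year(one_az):.2f} h/year)")
print(f"two AZs:   {two_az:.6f} ({downtime_hours_per_year(two_az):.2f} h/year)")
```

Chaining components lowers availability, while duplicating the whole chain across zones raises it; the second effect only materializes if the architecture actually uses more than one zone.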
CASE STUDY: CONFIDENTIAL COMPANY
▸ Only in one data center due to cost.
▸ Had clients nationwide, but all resources were
▸ Had to have 4 or more hours of downtime for
▸ Many SPoF including storage and network. Redundancy
was attempted but not done well.
CASE STUDY: CONFIDENTIAL COMPANY
▸ AWS Setup:
▸ Multiple VPCs spread across multiple regions to provide redundancy
and be close to customers.
▸ VPC peering to reduce single points of failure.
▸ Multi-AZ RDS instances for databases.
▸ EFS for network-based storage.
▸ Replication of databases across regions.
▸ IaC templates for VPCs to allow for rapid reproduction in other regions.
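The idea of a reusable VPC template can be sketched as a function that emits the same structure for any region, with only the address range changing. The slides do not say which IaC tool was used; the example below assumes CloudFormation-style JSON, and the function name, region list, and CIDR blocks are hypothetical.

```python
import ipaddress
import json
from itertools import islice


def vpc_template(cidr, az_count=2, subnet_prefix=24):
    """Build a CloudFormation-style template: one VPC, one subnet per AZ."""
    network = ipaddress.ip_network(cidr)
    subnets = list(islice(network.subnets(new_prefix=subnet_prefix), az_count))
    if len(subnets) < az_count:
        raise ValueError("CIDR block too small for the requested subnets")
    resources = {
        "Vpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": str(network), "EnableDnsSupport": True},
        }
    }
    for index, subnet in enumerate(subnets):
        resources[f"Subnet{index}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "CidrBlock": str(subnet),
                # An empty Fn::GetAZs argument means "the stack's own region".
                "AvailabilityZone": {"Fn::Select": [str(index), {"Fn::GetAZs": ""}]},
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# Hypothetical regions, each with its own non-overlapping address range.
regions = {"us-east-1": "10.0.0.0/16", "us-west-2": "10.1.0.0/16"}
for region, cidr in regions.items():
    template = vpc_template(cidr)
    print(region, len(json.dumps(template)), "bytes of template")
```

Keeping the CIDR blocks distinct per region matters here because VPC peering does not work between VPCs with overlapping address ranges.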
WHAT IS SCALABILITY
▸ Scalability is about more than simply adding more
resources in response to increased demand.
▸ Scalability needs to include both scaling up and scaling down.
▸ Goal is to maximize user experience while minimizing cost.
▸ Provision with small spikes in mind, but not growth.
▸ Scale to growth.
▸ Schedule scale downs and scale ups.
▸ Auto-scaling is your friend.
▸ Monitor, monitor, monitor. Don’t alert, alert, alert.
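Scaling in both directions, combined with a schedule, can be sketched as two small functions. This is a simplified proportional rule written for illustration; it is not the algorithm AWS Auto Scaling itself uses, and the thresholds, hours, and fleet sizes are hypothetical.

```python
import math


def desired_capacity(current, metric, target, minimum, maximum):
    """Size the fleet so the per-instance metric moves back toward `target`.

    Works in both directions: a metric above target scales out, a metric
    below target scales in, always within [minimum, maximum].
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    wanted = math.ceil(current * metric / target)
    return max(minimum, min(maximum, wanted))


def scheduled_floor(hour, business_hours=range(8, 18), day_min=4, night_min=1):
    """Minimum fleet size by hour of day (a scheduled scale-up and scale-down)."""
    return day_min if hour in business_hours else night_min


# Four instances at 75% CPU against a 50% target: scale out.
print(desired_capacity(4, metric=75, target=50, minimum=1, maximum=10))

# Four instances at 20% CPU at 03:00: scale in, down to the nightly floor or above.
print(desired_capacity(4, metric=20, target=50, minimum=scheduled_floor(3), maximum=10))
```

The schedule sets the floor for predictable daily patterns, while the proportional rule absorbs the small spikes; neither requires manual intervention.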
▸ Reserving too quickly.
▸ Planning for vertical scaling as opposed to horizontal.
▸ Provisioning for growth instead of planning for it.
▸ Manual intervention.
▸ Under-analysis of utilization.
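Reserving too quickly and under-analyzing utilization are linked: a reservation only pays off above a certain level of use. The sketch below shows the break-even arithmetic under the assumption that reserved capacity is billed for every hour of the term whether used or not; the hourly rates are hypothetical and not real AWS prices.

```python
HOURS_PER_YEAR = 8760


def yearly_cost_on_demand(hourly_rate, utilization):
    """Cost of paying on demand for the fraction of the year actually used."""
    return hourly_rate * HOURS_PER_YEAR * utilization


def yearly_cost_reserved(effective_hourly_rate):
    """Cost of a reservation, which accrues for every hour of the year."""
    return effective_hourly_rate * HOURS_PER_YEAR


def break_even_utilization(on_demand_rate, reserved_rate):
    """Fraction of the year an instance must run for reserving to pay off."""
    return reserved_rate / on_demand_rate


# Hypothetical rates in dollars per hour.
on_demand, reserved = 0.10, 0.06
print(f"break-even at {break_even_utilization(on_demand, reserved):.0%} utilization")

for utilization in (0.3, 0.6, 0.9):
    od = yearly_cost_on_demand(on_demand, utilization)
    ri = yearly_cost_reserved(reserved)
    better = "reserved" if ri < od else "on-demand"
    print(f"{utilization:.0%} used: on-demand ${od:.0f}, reserved ${ri:.0f} -> {better}")
```

Measuring real utilization first, then reserving only the steady baseline, avoids paying for capacity that auto-scaling would have released.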