Learn how to optimize your AWS bill by using Amazon EC2 Spot Instances, AWS Savings Plans, and blended On-Demand/Spot pools for your AWS Auto Scaling groups. Also includes some easy wins to help you get started.
So how are we going to do this?
Overview of what drives us to innovate at AWS
Automate cost and capacity management – Savings Plan, Compute Optimizer, EC2 Auto Scaling and Spot Instances
Workload examples – CI/CD, Containerized Web Apps, and Big Data, analytics and AI/ML.
Wrap up with next steps
Every server has four key computing resources – CPU, memory, storage, and network capabilities
Some workloads are more CPU intensive, others more memory intensive,
So we created different SKUs or families – that’s the first letter to the right
As we added new technology to our instances, we realized we wanted to expose these innovations – so we introduced generations, which indicate CPU capabilities, chipsets, and network capabilities
Last one is size – a simple t-shirt scheme – each size keeps the same ratio and chipset, but has twice the CPU, memory, and storage of the previous size, enabling you to scale up your workloads
What does all of this mean?
More choices enables better performance for specific workloads
Faster processors from Intel, processor choice with Graviton (ARM) and AMD, instances for accelerated computing with our partner Nvidia –
Network offerings with up to 100 Gbps performance
Elastic Graphics or Elastic Inference and of course Elastic Block Store for greater performance and storage flexibility.
We will have nearly 300 instances by the end of the year to support virtually every workload and business need.
1/ Previously, you had to reference multiple data sources and test multiple instance types before selecting the best instance type for your workload. You had to repeat this selection process as workloads evolved and new EC2 instance types and features were released.
2/Now you have a single source of truth for the latest instance types, attributes, regional and zonal offerings, and pricing.
3/ You can get started by defining your hardware requirements and reviewing the set of instance types which meet these requirements. You can further compare the hardware attributes, pricing, and availability of each instance type if needed. Then you can select and launch an instance, alias it by creating an SSM parameter, or save it in a launch template to be launched later or referenced in existing automation.
4/ This new experience makes it quicker and easier for you to find and compare different instance types, project costs, and select an instance type that you are confident will deliver the performance you need within budget
Non-production can make up as much as 90% of the capacity of some workloads, and commonly over 50%
It doesn’t need to scale dynamically in response to demand.
10x5 is a common development pattern
Anything running less than 75% of the time is a candidate for scheduling, which can cost-optimize better than RIs
https://aws.amazon.com/premiumsupport/knowledge-center/stop-start-instance-scheduler/
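The 10x5 pattern above translates into a concrete saving you can show on the slide. A minimal sketch of the arithmetic – the hourly rate is invented for illustration, not a quoted AWS price:

```python
# Back-of-the-envelope savings from a 10x5 schedule versus running 24/7.
HOURS_PER_WEEK_ALWAYS_ON = 24 * 7  # 168 hours
HOURS_PER_WEEK_10X5 = 10 * 5       # 50 hours

rate = 0.10  # $/hour, hypothetical On-Demand rate
always_on = rate * HOURS_PER_WEEK_ALWAYS_ON
scheduled = rate * HOURS_PER_WEEK_10X5
savings_pct = 100 * (1 - scheduled / always_on)
print(f"10x5 scheduling saves {savings_pct:.0f}% vs always-on")  # ~70%
```

That ~70% is why a stopped non-production instance can beat even a well-chosen RI for these workloads.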
So why should you use Savings Plans?
First, they’re super easy to use. Customers no longer have to make commitments to specific instance configurations and can easily save money just by committing to a $ spend. Secondly, they provide significant savings, up to 72% off OD, just like RIs. Finally, they provide a ton of flexibility. With a Savings Plan, all you have to do is make a simple commitment to a spend/hour and you will save money on your usage automatically, even as that usage changes from one region to another, from one instance type to another, or even if you move from EC2 to Fargate. All without having to perform exchanges or modifications.
AWS offers two types of Savings Plans - EC2 Instance Savings Plans and Compute Savings Plans
Compute Savings Plans provide the most flexibility and help reduce usage costs by up to 66%, just like Convertible RIs. These plans automatically apply to EC2 instance usage regardless of instance family, size, AZ, region, OS or tenancy, as well as Fargate usage. For example, with Compute Savings Plans, you can switch from C4 to M5 instances, shift a workload from EU (Ireland) to EU (London), or move a workload from EC2 to Fargate at any time and automatically continue to receive discounts.
EC2 Instance Savings Plans provide the lowest prices, in exchange for a commitment to usage of individual instance families in a region (e.g. commit to a consistent level of M5 usage in N. Virginia). This automatically provides you with savings of up to 72% off the On-Demand price of the selected instance family in that region regardless of AZ, size, OS or tenancy. EC2 Instance Savings Plans allow you to change your usage between instances within a family in that region. For example, you can move from c5.xlarge running Windows to c5.2xlarge running Linux, and automatically benefit from the Savings Plans prices.
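To make the commitment mechanics concrete, here is a rough sketch of how an hourly Savings Plans commitment applies to usage. All numbers are invented: real Savings Plans rates vary by instance family, region, and term.

```python
# Hypothetical model: usage is measured in dollars at Savings Plans rates;
# anything beyond the hourly commitment is billed at On-Demand rates instead.
def hourly_bill(commitment, usage_sp_dollars, od_to_sp_ratio):
    covered = min(commitment, usage_sp_dollars)
    overflow_sp = usage_sp_dollars - covered
    # You always pay the full commitment; overflow reverts to On-Demand pricing.
    return commitment + overflow_sp * od_to_sp_ratio

# $1/hour commitment, $1.50/hour of usage at SP rates, On-Demand costing 2x SP:
print(hourly_bill(1.0, 1.5, 2.0))  # 1.0 committed + 0.5 overflow * 2.0 = 2.0
```

The takeaway for the audience: size the commitment near your steady-state floor, and overflow simply falls back to On-Demand.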
Savings Plans is the easiest way to save on compute. Customers can sign up for Savings Plans in a few simple steps using the AWS Cost Explorer. Now let’s take a look at these steps in detail.
Mention the T-family here on how they are burstable.
1/ AWS Compute Optimizer uses machine learning models trained on millions of workloads to help customers optimize their compute resources for cost and performance across all of workloads they run. You can take advantage of the recommendations in Compute Optimizer to reduce costs by up to 25%.
2/ AWS Compute Optimizer delivers instance type and auto scaling groups recommendations, making it even easier for customers to choose the right compute resources for specific workloads.
3/ AWS Compute Optimizer analyzes the configuration, resource utilization, and performance data of a workload to identify dozens of defining characteristics, such as whether the workload is CPU-intensive and whether it exhibits a daily pattern. Compute Optimizer then uses machine learning to process these characteristics to predict how the workload would perform on various hardware platforms, delivering resource recommendations.
4/ AWS Compute Optimizer delivers up to 3 recommended options for each AWS resource analyzed to right size and improve workload performance. Compute Optimizer predicts the expected CPU and memory utilization of your workload on various EC2 instance types. This helps you understand how your workload would perform on the recommended options before implementing the recommendations.
How does this work? Predictive Scaling’s machine learning algorithms leverage data from billions of traffic patterns from Amazon.com to predict future changes.
The pre-trained model then processes the last two weeks of load metrics to forecast the load metric for the next two days
The model also performs regression analysis between the load metric and the scaling metric, schedules hourly scaling actions for the next two days, and then repeats this process every day
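The forecast-then-regress flow above can be sketched in a few lines. This is a toy illustration with invented numbers, not Predictive Scaling's actual model: fit the scaling metric (average CPU) against the load metric (request count), then size capacity so the forecasted load keeps per-instance CPU at a target.

```python
import math

def fit_slope(load, cpu):
    # Least-squares fit through the origin: cpu ≈ slope * load
    return sum(l * c for l, c in zip(load, cpu)) / sum(l * l for l in load)

def capacity_for(forecast_load, slope, target_cpu_per_instance):
    # Total CPU the forecasted load implies, divided across instances.
    total_cpu = slope * forecast_load
    return math.ceil(total_cpu / target_cpu_per_instance)

slope = fit_slope([100, 200, 400], [10.0, 20.0, 40.0])  # slope = 0.1
print(capacity_for(1000, slope, target_cpu_per_instance=50))  # 2 instances
```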
How to purchase EC2
How to optimize compute for savings and scale
Four different ways to purchase compute
On-Demand: Pay-as-you-go, no commitments, best for fluctuating workloads
Reserved Instances: Long-term commitments that offer big savings over On-Demand prices. Best for always-on workloads
Introducing Savings Plans: Just like Reserved Instances, but based on a monetary commitment, and the discount can be used across Fargate and EC2
Spot Instances: Same pay-as-you-go pricing as On-Demand, but at up to 90% off. EC2 can reclaim capacity with a 2-minute warning. Best for stateless or fault-tolerant workloads
All four purchasing options use the same underlying EC2 instances and AWS infrastructure across 22 Regions
[Poll] How many of you use Spot Instances?
Excited to announce
New Spot integrations
Updates to EC2 Auto Scaling that make it easier than ever to incorporate Spot
Customer initiated Start/Stop for EC2 Spot
So, when should you use Spot, On-Demand or RIs?
Picking just one option is the wrong solution.
Use all three to optimize cost and capacity
Leverage the scale of AWS at a fraction of the cost
Simplified pricing model, no more bidding.
Spot is only interrupted when EC2 needs to reclaim the capacity for On-Demand. No need to worry about your bidding strategy. Spot prices gradually adjust based on long-term supply and demand trends.
Spot is a reward for good architecture
Launching Spot workshop for a demo: https://ec2spotworkshops.com/launching_ec2_spot_instances.html
Not only save big, but get results faster
Use Spot across a number of AWS services and third parties. Will share more about these integrations later in the presentation
Two main kinds of workloads:
Time sensitive: Web services, analytics, grid computing, containers
Time insensitive: ML training, Genomics analysis, development, testing, one-time queries
Instance flexible (time-sensitive workloads): Mix instance types with similar capabilities, e.g. number of vCPUs and memory
Time flexible (time-insensitive workloads): Workloads that require specific instance types, but can be flexible on completion times (e.g. batch jobs with no SLA, ML training jobs…)
Region flexible: large-size / very instance-specific workloads, e.g. real-time rendering on a specific g3 instance, can benefit from increased region flexibility
Pay for what you need, but have the option to scale in and out when needed
Specify different percentages of Spot and On-Demand using EC2 Auto Scaling.
RI and Savings Plan instance discounts automatically applied
* New - Capacity Optimized is Spot pool capacity aware, limiting chance of interruption
Example – Specify launching c5.large across us-east-1, us-east-2, and us-west-1. The ASG will launch Spot in the deepest capacity pools
You can also specify allocation based on “Lowest Price” or a “Prioritized List”
This time, we have the exact same ASG represented, but using the capacity-optimized SpotAllocationStrategy. In this case we don’t have SpotInstancePools, as that parameter is specific to lowest-price.
And if we look at the instances, ASG will launch instances on the deepest pools on each AZ, which may not always be the cheapest, but are from the deepest pools at instance launch time and reduce the likelihood of interruptions
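For reference, this is the shape of the MixedInstancesPolicy you would pass to EC2 Auto Scaling's CreateAutoScalingGroup for the capacity-optimized setup just described. The launch template name and instance types are placeholders; note the absence of SpotInstancePools, which only applies to the lowest-price strategy.

```python
# MixedInstancesPolicy with the capacity-optimized Spot allocation strategy.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "my-launch-template",  # placeholder name
            "Version": "$Latest",
        },
        # Instance type overrides the ASG can draw Spot capacity from:
        "Overrides": [
            {"InstanceType": "c5.large"},
            {"InstanceType": "c4.large"},
            {"InstanceType": "m5.large"},
        ],
    },
    "InstancesDistribution": {
        "OnDemandBaseCapacity": 2,
        "OnDemandPercentageAboveBaseCapacity": 25,
        # Launch into the deepest Spot pools to reduce interruptions:
        "SpotAllocationStrategy": "capacity-optimized",
    },
}
```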
So, when should you use Spot, On-Demand or RIs?
Picking just one option is the wrong solution.
Use all three to optimize cost and capacity
Before: build custom logic, leverage multiple APIs
No clean way to leverage Spot Instances, On-Demand and RIs in a single Auto Scaling group.
Complex code to discover capacity, be price aware across different instance types and Availability Zones, and scale capacity in different pools
Create three different Auto Scaling groups – one for c4.xlarge On-Demand, one for m5.large Spot, and another for m4.large Spot
Then: One ASG to scale across c4.xlarge On-Demand instances, m5.large Spot Instances, and m4.large Spot Instances.
Scaling in and out with EC2 Auto Scaling ensures base capacity is fulfilled with On-Demand instances and additional capacity with Spot Instances, or a specified percentage mix of On-Demand and Spot instances
If AZ1 becomes unavailable, Auto Scaling launches instances in AZ2 or AZ3 to compensate – all within a single ASG
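A sketch of how a single group splits its desired capacity given OnDemandBaseCapacity and OnDemandPercentageAboveBaseCapacity. The rounding direction (On-Demand share rounded up) is an assumption of this toy model:

```python
import math

def split_capacity(desired, base, od_pct_above_base):
    # Capacity above the On-Demand base is split by percentage.
    above = max(0, desired - base)
    od_above = math.ceil(above * od_pct_above_base / 100)
    on_demand = min(desired, base) + od_above
    return on_demand, desired - on_demand  # (On-Demand, Spot)

# Desired 10, base of 2 On-Demand, 25% On-Demand above base:
print(split_capacity(10, 2, 25))  # (4, 6) – 4 On-Demand, 6 Spot
```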
Optimizing capacity management and cost optimization became easier
Introducing instance type weights
Configure weights to scale in and out based on previous-generation instances or vCPUs across multiple AZs
Distribute Capacity evenly between availability zones for On-Demand and Spot separately
For On-Demand, “prioritized” is the only option: the ASG uses the first instance type in the list, tries to fill capacity with it, then only moves to the 2nd type, and so on
SpotInstancePools controls how many of the specified overrides to use as Spot pools
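With instance type weights, the group counts capacity in units rather than instance counts. Weighting by vCPU ratio is a common choice; the types and weights below are illustrative only:

```python
# Each instance type contributes its weight toward desired capacity.
weights = {"m5.xlarge": 1, "m5.4xlarge": 4}

def fulfilled_units(running_instance_types):
    # running_instance_types: types of instances currently in the group
    return sum(weights[t] for t in running_instance_types)

# Two 4xlarge plus one xlarge fulfil 9 units of desired capacity:
print(fulfilled_units(["m5.4xlarge", "m5.4xlarge", "m5.xlarge"]))  # 9
```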
Let’s take a look at few real-life scenarios.
See concrete examples to get started with cost and capacity optimization.
With Managed Spot Training, SageMaker manages Spot instances on your behalf, no need to build additional tooling.
Can be used to train machine learning models, using the built-in algorithms with SageMaker, your own custom algorithms, and those available in AWS Marketplace.
Built-in algorithms and frameworks automatically save model checkpoints periodically. Training jobs pause and resume reliably as and when Spot capacity becomes available.
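The knobs Managed Spot Training adds to a training job, shown as plain keyword arguments. The parameter names follow the SageMaker Python SDK Estimator (use_spot_instances, max_run, max_wait, checkpoint_s3_uri); the bucket path and timeout values are placeholders:

```python
# Managed Spot Training configuration sketch (values are placeholders).
spot_training_kwargs = {
    "use_spot_instances": True,
    "max_run": 3600,   # seconds of actual training time allowed
    "max_wait": 7200,  # total wall-clock budget, including waiting for Spot
    "checkpoint_s3_uri": "s3://my-bucket/checkpoints/",  # placeholder bucket
}
# SageMaker requires max_wait >= max_run when Spot instances are used.
assert spot_training_kwargs["max_wait"] >= spot_training_kwargs["max_run"]
```

The checkpoint URI is what lets an interrupted job resume instead of restarting from scratch.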
Available in all regions and on all SageMaker instance types
[Poll]How many of you run your CI/CD pipeline on AWS
[Poll] What build tools are you using today? Jenkins? Bamboo?
Continuous Integration with Jenkins is a perfect use case for cost optimization.
All the worker nodes in the cluster can leverage Spot and provide savings of up to 90%.
Jenkins plug-in will launch Spot instances as worker nodes for the CI server and automatically scale capacity with the load
Simplified reference architecture.
Jenkins Master and agents are running in the VPC. The Jenkins Master is behind an Application Load Balancer
EC2 Jenkins plugin launches Spot instances as Agents for Jenkins CI server
You can specify the scaling limits in your cloud settings of your plug-in.
Jenkins will try to scale EC2 Fleet up or down depending on the state of your nodes
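The scaling decision such a plugin makes can be sketched simply: size the fleet to the build queue, clamped to the min/max you set in the cloud settings. Function names and numbers here are illustrative, not the plugin's actual internals:

```python
def desired_agents(queued_builds, executors_per_agent, min_size, max_size):
    # Agents needed to drain the queue, using ceiling division.
    needed = -(-queued_builds // executors_per_agent)
    # Clamp to the limits configured in the plugin's cloud settings.
    return max(min_size, min(max_size, needed))

# 9 queued builds, 2 executors per agent, fleet capped at 4 agents:
print(desired_agents(9, 2, min_size=1, max_size=4))  # 4
```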
Now moving on to Websites and apps on Containers.
[Poll] How many of you use containers today? How many of you use ECS? EKS or Kubernetes natively on AWS?
Containers are often stateless and fault-tolerant – a no-brainer for using Spot and Auto Scaling Groups
ECS and EKS: Two highly scalable, high-performance container orchestration services.
Run microservices, like a mapping API, or a real time bidding service on containers, on top of EC2 Instances – and have them managed by Fleet or by Auto Scaling.
This is a super easy way to optimize your containers for both price and performance
Architecture of a web app running on containers behind an elastic load balancer
The ELB automatically routes incoming web traffic across a dynamically changing number of instances.
Optimize Auto Scaling Group depending on application demand
Use Spot to address fluctuations - with your base of RIs and a bit of On-Demand
Deploy and manage applications, not infrastructure
With Spot, save up to 70%
Control how you scale based on tasks, vCPUs and memory
VM-level boundary enabling workload isolation and improved security as each task or pod runs on its own kernel.
Lower cost, innovate faster with Spot Instances
Maximize capacity with capacity-optimized EC2 Auto Scaling, and use Savings Plans to lock in deep discounts for steady-state workloads
Use Compute Optimizer for workload optimization
Schedule an Immersion Day for hands-on guidance from an AWS expert
If you’re ready to continue learning, check out our library of free digital courses, including introductory primers on a range of services
You can also take classroom training to get hands on practice and learn directly from an instructor.
Visit the learning library for the full list of courses