What is AWS Auto Scaling? Put simply, Auto Scaling is Amazon’s hosted service for automatically launching and terminating EC2 instances.
It’s really useful because it lets you save money by using AWS resources only when you need them. For example, you can use it to accommodate peak loads without setting up huge amounts of static infrastructure that runs all the time.
So what makes up Amazon’s Auto Scaling service? It’s actually composed of several components that work in concert to let you automatically adjust your environment to meet your needs. The four basic components are Launch Configurations, Auto Scaling Groups, Scaling Policies, and CloudWatch Alarms. There are a few more pieces I’m glossing over, but these are the fundamental building blocks.
The first component is the “Launch Configuration.” Launch Configurations are templates that define the parameters passed to your EC2 instances at launch time. For example, a launch configuration defines what size instance to run, which security groups the instances will run in, and so on. You can even specify things like “spot price” if you want to take advantage of Amazon’s spot instance pricing.
The next component is the “Auto Scaling Group.” An Auto Scaling Group is a collection of EC2 instances that run a launch configuration.
The third basic component is the “Scaling Policy.” Scaling policies are declarative templates that define an action to take on your Auto Scaling Group. For example, “start two instances in the ASG named ‘webservers.’”
The last basic component is the “CloudWatch Alarm.” These are triggers that execute scaling policies based on metrics measured in Amazon’s CloudWatch monitoring service.
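One way to picture how the four pieces relate is a tiny in-memory model. This is only a sketch of the concepts, not AWS’s API: every class, field, and value below (including the AMI ID and the 60% threshold) is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LaunchConfiguration:
    """Template for the parameters an instance is launched with."""
    image_id: str
    instance_type: str
    security_groups: list


@dataclass
class AutoScalingGroup:
    """A named collection of instances that all use one launch configuration."""
    name: str
    launch_configuration: LaunchConfiguration
    instance_count: int = 0


@dataclass
class ScalingPolicy:
    """An action against a group, e.g. +2 means 'start two instances'."""
    group: AutoScalingGroup
    adjustment: int

    def execute(self) -> None:
        # The toy model never drops below zero instances.
        self.group.instance_count = max(0, self.group.instance_count + self.adjustment)


@dataclass
class Alarm:
    """Runs a policy when an observed metric rises above a threshold."""
    threshold: float
    policy: ScalingPolicy

    def observe(self, value: float) -> bool:
        if value > self.threshold:
            self.policy.execute()
            return True
        return False


# Hypothetical values, for illustration only.
config = LaunchConfiguration("ami-12345678", "m1.small", ["web"])
group = AutoScalingGroup("webservers", config, instance_count=1)
alarm = Alarm(threshold=60.0, policy=ScalingPolicy(group, adjustment=2))

alarm.observe(45.0)  # below the threshold: nothing happens
alarm.observe(75.0)  # above the threshold: the policy starts two instances
print(group.instance_count)  # 3
```

The real service adds limits and timing rules on top of this, but the chain is the same: an alarm fires, a policy runs, and the group changes size using its launch configuration.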
So how do these components work together? To illustrate this, I’ll walk through a simple example. I’ll set up a very basic Auto Scaling service in five steps. In this example, my goal is to set up an environment that will launch and terminate servers based on CPU Utilization across my cluster.
To run through this example, I’ll assume the following things have already been done. First, we have an AWS account :) In it, I’ll assume I’m already running a web application on an EBS-backed instance. And finally, I’ll assume we’re going to use the new unified AWS command line tool.
If you haven’t checked out the new AWS command line tool, it’s really nice. It even ships with an excellent set of autocomplete scripts. So entering commands really isn’t that painful. Everything is a Tab key press away. But I digress…
So for the first step, I’ll create an AMI of my instance. Here the instance id of my existing server is i-12345678. Hopefully it’s a web service that has a nice health check for a load balancer to detect when it’s down, and hopefully the web service on it can stop and start gracefully. Once I run this command, it will output an AMI ID we’ll use in later steps.
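As a rough sketch of this step (not the exact command from the talk), the call can be built as an argument list in Python and inspected before running it. The image name `webserver-ami` and the sample response are hypothetical, and the parsing assumes the CLI is left on its JSON output format.

```python
import json
import shlex


def create_image_command(instance_id: str, name: str) -> list:
    """Build the AWS CLI argument list that creates an AMI from an instance."""
    return ["aws", "ec2", "create-image", "--instance-id", instance_id, "--name", name]


def parse_image_id(cli_output: str) -> str:
    """Pull the AMI ID out of the JSON that create-image prints."""
    return json.loads(cli_output)["ImageId"]


command = create_image_command("i-12345678", "webserver-ami")
print(shlex.join(command))

# To really run it:
#   subprocess.run(command, check=True, capture_output=True, text=True)
# Made-up response, for illustration only:
print(parse_image_id('{"ImageId": "ami-0abc1234"}'))
```

Keeping the command as a list rather than one long string avoids quoting problems if a name ever contains spaces.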
The next step is to create a load balancer that will route requests to my web server. This command will create an empty load balancer that forwards HTTP requests on port 80 to, similarly, port 80 on the instances that are attached to it.
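A sketch of what that load balancer call might look like, again built as an argument list in Python. It assumes a Classic ELB created through `aws elb`; the load balancer name and availability zone are placeholders, not values from the talk.

```python
import shlex


def create_load_balancer_command(name: str, zone: str, port: int = 80) -> list:
    """Build the AWS CLI argument list for an HTTP load balancer."""
    # Forward HTTP on the load balancer port to the same port on each instance.
    listener = (
        f"Protocol=HTTP,LoadBalancerPort={port},"
        f"InstanceProtocol=HTTP,InstancePort={port}"
    )
    return [
        "aws", "elb", "create-load-balancer",
        "--load-balancer-name", name,
        "--listeners", listener,
        "--availability-zones", zone,
    ]


# Hypothetical name and zone, for illustration only.
command = create_load_balancer_command("webservers-elb", "us-east-1a")
print(shlex.join(command))
```

No instances are named here. The load balancer starts out empty and the Auto Scaling Group attaches instances to it later.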
Next, we’ll create a launch configuration. This Launch Configuration will start servers using the AMI we defined in step one. I’m also describing the standard AWS parameters such as “instance type,” “key-pair,” and “security-groups” that you need when starting any EC2 instance.
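Here is a rough sketch of the launch configuration call, under the same approach of building the argument list in Python. The configuration name, AMI ID, key pair, security group, and spot price are all placeholders. The optional `spot_price` argument only illustrates where a spot bid would go.

```python
import shlex
from typing import Optional


def create_launch_configuration_command(
    name: str,
    image_id: str,
    instance_type: str,
    key_name: str,
    security_group: str,
    spot_price: Optional[str] = None,
) -> list:
    """Build the AWS CLI argument list that creates a launch configuration."""
    command = [
        "aws", "autoscaling", "create-launch-configuration",
        "--launch-configuration-name", name,
        "--image-id", image_id,
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--security-groups", security_group,
    ]
    if spot_price is not None:
        # Only bid on spot instances when a maximum price is given.
        command += ["--spot-price", spot_price]
    return command


# Hypothetical values, for illustration only.
on_demand = create_launch_configuration_command(
    "webserver-lc", "ami-0abc1234", "m1.small", "my-key", "web"
)
spot = create_launch_configuration_command(
    "webserver-spot-lc", "ami-0abc1234", "m1.small", "my-key", "web", spot_price="0.05"
)
print(shlex.join(on_demand))
print(shlex.join(spot))
```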
Once I have my Launch Configuration set up, I can create my Auto Scaling Group. Here I’m defining important parameters such as the minimum number of instances to run in the group, as well as the maximum number of instances allowed. I’m also setting up this group to attach instances to the ELB I created in step 2. Some other parameters of note: “--default-cooldown” sets the default number of seconds the Auto Scaling Group waits after a scaling action before it allows another one. Specifically, this means that if a policy is executed against this group, the group will wait 120 seconds before allowing another policy to run against it. The “--health-check-type” parameter indicates how Amazon will determine if a node in this group has failed. This can be one of two values, ELB or EC2. Here I’m setting it to “EC2,” which relies on the basic instance status checks Amazon runs internally. If a node fails this check, the Auto Scaling Group will terminate the instance and replace it with another one.
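The parameters described above might come together roughly like this. It is a sketch, not the exact command from the talk: the group, launch configuration, load balancer, and zone names and the minimum and maximum sizes are placeholders. Only the 120 second cooldown and the EC2 health check come from the description.

```python
import shlex


def create_auto_scaling_group_command(
    name: str,
    launch_configuration: str,
    load_balancer: str,
    zone: str,
    min_size: int,
    max_size: int,
    default_cooldown: int = 120,
    health_check_type: str = "EC2",
) -> list:
    """Build the AWS CLI argument list that creates an Auto Scaling Group."""
    if health_check_type not in ("EC2", "ELB"):
        raise ValueError("health_check_type must be 'EC2' or 'ELB'")
    if min_size > max_size:
        raise ValueError("min_size cannot exceed max_size")
    return [
        "aws", "autoscaling", "create-auto-scaling-group",
        "--auto-scaling-group-name", name,
        "--launch-configuration-name", launch_configuration,
        "--min-size", str(min_size),
        "--max-size", str(max_size),
        "--load-balancer-names", load_balancer,
        "--availability-zones", zone,
        "--default-cooldown", str(default_cooldown),
        "--health-check-type", health_check_type,
    ]


# Hypothetical names and sizes, for illustration only.
command = create_auto_scaling_group_command(
    "webservers", "webserver-lc", "webservers-elb", "us-east-1a", min_size=1, max_size=4
)
print(shlex.join(command))
```

Checking the health check type and the size bounds locally catches typos before the request ever reaches AWS.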
Next, we’ll create two policies. The first one increases the number of instances in the ASG by one. Note that we can override the cooldown we specified on the Auto Scaling Group. This way, different scaling policies can block further actions against the group for different amounts of time. Once we run this command, it will output a long ARN (Amazon Resource Name), which is an identifier for this policy. We’ll use this ARN later to hook the policy up to our CloudWatch alarms.
Similarly, this policy says “change the capacity of the ASG by negative one.” Or, more simply, terminate one instance. Again, this will spit out an ARN we will use later.
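Both policies can be sketched with one helper that differs only in the sign of the adjustment. The policy names and the 300 second cooldown are placeholders chosen for illustration, not values from the talk.

```python
import shlex


def put_scaling_policy_command(group: str, policy_name: str, adjustment: int, cooldown: int) -> list:
    """Build the AWS CLI argument list for a change-in-capacity scaling policy."""
    return [
        "aws", "autoscaling", "put-scaling-policy",
        "--auto-scaling-group-name", group,
        "--policy-name", policy_name,
        "--adjustment-type", "ChangeInCapacity",
        # The "=" form keeps a negative number from being read as a flag.
        f"--scaling-adjustment={adjustment}",
        # Per-policy cooldown, overriding the group's default.
        "--cooldown", str(cooldown),
    ]


# Hypothetical names and cooldown, for illustration only.
scale_up = put_scaling_policy_command("webservers", "scale-up", 1, 300)
scale_down = put_scaling_policy_command("webservers", "scale-down", -1, 300)
print(shlex.join(scale_up))
print(shlex.join(scale_down))
```

Each call prints the policy’s ARN when it is actually run, and those two ARNs are what the alarms need.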
The last thing we’ll do is create two CloudWatch alarms. This one will trigger the scale-up policy when the average CPU utilization across the entire ASG goes above 60% for 5 minutes. Note the “--alarm-actions” parameter. This is the giant ARN that was spit out when we created the scale-up policy.
Similarly, this is the alarm that will trigger the scale-down policy. It basically says: run the scale-down policy when the average CPU utilization across all the instances in the ASG is less than or equal to 20%.
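The two alarms can be sketched with one helper as well. The alarm names and the ARN placeholders are hypothetical. Expressing “for 5 minutes” as a single 300 second period is an assumption: five 60 second periods would be another reasonable reading.

```python
import shlex


def put_cpu_alarm_command(
    alarm_name: str, group: str, comparison: str, threshold: int, policy_arn: str
) -> list:
    """Build the AWS CLI argument list for an average-CPU alarm on one ASG."""
    return [
        "aws", "cloudwatch", "put-metric-alarm",
        "--alarm-name", alarm_name,
        "--namespace", "AWS/EC2",
        "--metric-name", "CPUUtilization",
        "--statistic", "Average",
        # Aggregate the metric across every instance in the group.
        "--dimensions", f"Name=AutoScalingGroupName,Value={group}",
        "--period", "300",            # one 5-minute window...
        "--evaluation-periods", "1",  # ...evaluated once
        "--comparison-operator", comparison,
        "--threshold", str(threshold),
        "--alarm-actions", policy_arn,
    ]


# Placeholders stand in for the ARNs printed when the policies were created.
cpu_high = put_cpu_alarm_command(
    "cpu-high", "webservers", "GreaterThanThreshold", 60, "<scale-up policy ARN>"
)
cpu_low = put_cpu_alarm_command(
    "cpu-low", "webservers", "LessThanOrEqualToThreshold", 20, "<scale-down policy ARN>"
)
print(shlex.join(cpu_high))
print(shlex.join(cpu_low))
```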
And bam. That’s all there is to it. Now we have an Auto Scaling Group that will dynamically launch and terminate instances based on CPU load. While the other Auto Scaling pieces don’t have a GUI representation, you can log into the AWS console and look at graphs corresponding to your CloudWatch alarms. This is an example from our production cluster at Mass Relevance. Here you can see red lines that correspond to where the triggers are set. When an alarm triggers, its entire graph turns red, indicating that it is firing. In this state, it is executing any scaling policies associated with it.
Here’s another example of our production cluster’s CloudWatch console. This was a special event that occurred on October 29th. Our Auto Scaling configuration handled this perfectly, meaning we didn’t have to actually do anything to handle this increase in load. Auto Scaling added and removed instances as needed so we could sleep at night without too much worry.
So now I’ll talk about some advanced uses. At MR we don’t rely on pre-built AMIs. Instead, all of our instances are built using Chef. Luckily, AWS gives you the tools to do this, too. In our environment, we use a combination of IAM roles, S3, and User Data scripts to hook up our instances to our Chef server. Specifically, we have an IAM role that allows instances to access a private S3 bucket that hosts the scripts and packages that set up our Chef client. User Data scripts are small shell scripts that EC2 instances execute at boot time. So in our case, we pass a User Data script to our Launch Configuration. This script leverages the IAM role to download our Chef client from S3, set it up, and kick it off.
This is an example of the User Data script we use to do this. All in all, it’s been a very effective setup that has allowed us to scale for all sorts of events. For example, at the Grammys we were pushing a peak load of 80k requests per second. At the time we were running close to 200 web servers, worker servers, and caching servers to handle the load. All of these were launched and managed with Auto Scaling, albeit with lots of pre-warming. After the event was over, we scaled back down to around 10 servers, which is our typical baseline.
So that about wraps it up. Are there any questions?
Auto Scaling on AWS
Software Engineer @ MassRelevance
What is AWS Auto Scaling?
Auto Scaling is Amazon’s hosted service for
automatically launching and terminating EC2
instances.
Why Use Auto Scaling
Save money by only using instances when you
need them.
Scale to accommodate expected and
Replace unhealthy servers with healthy ones.
What Makes Up Auto Scaling
Launch Configurations
Auto Scaling Groups
Scaling Policies
CloudWatch Alarms
Launch Configurations
Templates that describe the parameters passed at
launch-time to your EC2 instances.
Examples: AMI, instance type (c1.xlarge, m1.small,
…), security groups, spot price
Auto Scaling Groups
A set of EC2 instances that run a launch
configuration.
Scaling Policies
A template describing actions to run against an
Auto Scaling Group.
E.g. Start two instances in the ASG named
“webservers”
CloudWatch Alarms
Triggers that can run Scaling Policies based on
CloudWatch metrics (AWS’s built-in monitoring
service).
E.g. Run Scaling Policy “Launch Webservers”
when CPU exceeds a certain threshold for 5
minutes.
A Simple Web Application in Five Steps
Create an AMI
Set up an Elastic Load Balancer (ELB)
Create a Launch Configuration
Create Scaling Policies
Create CloudWatch Alarms
A running web application on an EBS-backed
instance
AWS Command Line tool: