Getting a Rails app, a Java app, Ruby clients, Amazon ECS, Kinesis & Athena all to play nicely together. We'll take a look at how easy it is to get a "personal Heroku" running on AWS with Terraform. We'll look at the example of https://www.ratelim.it
Welcome to ezCater. Since we’re giving you pizza I get to pitch you on working here!
Do you like: Tacos, Ruby, Graphs that go up and to the right & 100% Nice People?
Then you’ll love working at ezCater! Check us out https://www.ezcater.com/company/careers/ Seriously though, great things are happening here and we are growing > 2x year over year. It’s fun.
Check out our new engineering blog https://engineering.ezcater.com/ab-testing-at-ezcater-part-2-tracking-experiments-the-exposure-event
So who am I? Jeff Dwyer http://twitter.com/jdwyah http://blog.jdwyah.com I’ve worked at PatientsLikeMe, HubSpot and now ezCater.
But like lots of you I have side projects too! Like http://forcerank.it The best way to prioritize your Trello backlog as a team!
Like http://whatsize.is The best way to know that 24 Months in Carter’s is 2T in June & January, but 90mm in Hanna Andersson and 12-18 months in H&M.
Like http://ratelim.it The best way to get a distributed rate limiter in the cloud.
So like most of you, I rely heavily on Heroku for all these side projects. All fun and games, because it’s free, and the only cost is that all your app ideas take a full minute or so to boot on the first request.
When I worked at HubSpot I had access to a really powerful distributed rate limiter that could handle millions and millions of individual limits, and would persist the limits / token buckets forever. This was AMAZING because it meant I could use these limits to save tons of money on our UsageTracking bill, but also I could use it for Idempotency enforcement by having “infinite” limits. I used this to great effect to save money in lots of fun places. It’s an amazing hammer.
So I left HubSpot and of course didn’t have access to my favorite tool. I needed to build a new one for myself. So what is rate limiting? Here’s the basic API I wanted to support.
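The API from the slide isn't reproduced in these notes, so this is not ratelim.it's actual interface. As a rough illustration of what a limit check does, here is a minimal token bucket sketch; the `TokenBucket` class and its parameter names are made up for this example.

```python
import time


class TokenBucket:
    """Hypothetical bucket: holds up to `capacity` tokens, refilled over time."""

    def __init__(self, capacity, refill_per_sec, now=None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.updated_at = time.monotonic() if now is None else now

    def try_acquire(self, cost=1, now=None):
        """Spend `cost` tokens and return True if available, else return False."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Top the bucket up for the time that has passed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# An "infinite" limit for idempotency: one token that never refills,
# so only the first caller for a given key gets through.
send_welcome_email_once = TokenBucket(capacity=1, refill_per_sec=0)
```

The never-refilling bucket only works for idempotency if it is persisted forever, which is why the storage choice matters so much.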
Most people think of Redis or Memcached for rate limits and those are very good for many limits, but the “eternal” rate limits I want to support are not well suited to Redis. Dynamo is a better solution for long term storage.
But Redis is still great, and sometimes rate limits are going to get pounded, so we should front Dynamo with Redis.
Fronting Dynamo with Redis means you still need to write to the backing store. The solid way to do this is a Queue and Writer. (You’ll notice we’re up to 2 services now)
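The cache-plus-queue-plus-writer shape can be sketched roughly like this. Plain dicts stand in for Redis and Dynamo and a `queue.Queue` stands in for the real queue; the `WriteBehindCounter` name is invented for illustration and the real services surely differ.

```python
import queue


class WriteBehindCounter:
    """Hypothetical write-behind counter: fast cache in front, durable store behind."""

    def __init__(self):
        self.cache = {}                # stands in for Redis
        self.store = {}                # stands in for Dynamo
        self.pending = queue.Queue()   # stands in for the queue between the two

    def increment(self, key, amount=1):
        """Hot path: update the cache and enqueue the write for later."""
        if key not in self.cache:
            # Cache miss: seed from the backing store so counts survive eviction.
            self.cache[key] = self.store.get(key, 0)
        self.cache[key] += amount
        self.pending.put((key, amount))
        return self.cache[key]

    def drain(self):
        """Writer: apply queued increments to the store; return how many were written."""
        written = 0
        while True:
            try:
                key, amount = self.pending.get_nowait()
            except queue.Empty:
                return written
            self.store[key] = self.store.get(key, 0) + amount
            written += 1
```

In this sketch `drain` is called by hand; in a real deployment the writer would be its own long-running service consuming the queue.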
Of course I want http://ratelim.it to be a little SaaS. I’m not crazy so I’m not going to write a CRUD app in Java, so we have a Ruby on Rails app too.
One tricky thing about Dynamo is that it doesn’t autoscale. But you’re going to have bursty traffic, so you’ll need to have something autoscale for you.
The best option is a long-running Python project that polls your Dynamo tables and scales them when they get close to their provisioned capacity. (So now we’ve got 4 services)
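The scaling decision inside such a poller might look something like the sketch below. This is not the Python project mentioned above; the function name, the thresholds and the step factor are all assumptions chosen for illustration, and the code that reads consumed capacity and updates the table is left out.

```python
import math


def next_capacity(consumed, provisioned, high=0.8, low=0.3, step=1.5, floor=1):
    """Return the capacity to provision next, given recent consumption.

    Scale up when utilization reaches `high`, scale down when it falls to
    `low`, otherwise leave the table alone. All thresholds are hypothetical.
    """
    if provisioned <= 0:
        raise ValueError("provisioned capacity must be positive")
    utilization = consumed / provisioned
    if utilization >= high:
        # Getting close to the limit: grow by the step factor.
        return math.ceil(provisioned * step)
    if utilization <= low:
        # Mostly idle: shrink, but never below the floor.
        return max(floor, math.ceil(provisioned / step))
    return provisioned


print(next_capacity(consumed=90, provisioned=100))  # scale up
print(next_capacity(consumed=50, provisioned=100))  # unchanged
print(next_capacity(consumed=10, provisioned=100))  # scale down
```

A real poller would call something like this on a timer for reads and writes separately, and would need to respect whatever limits the service places on how often capacity can be lowered.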
So how in the heck am I going to run all these things? My Heroku bill is going to start to be problematic and these services are things that I really want to be “cheap to run & scale” since we’re planning to charge 5 millionths of a penny. There’s not room for much overhead.
Amazon ECS says it will do all this….
And their video is cute…
So here’s the basic architecture of the AWS services I want.
I deem this “do-able” even with my not-so-great understanding of AWS.
Oops, forgot Application Load Balancers.
Oops, and IAM Roles, RolePolicies, AutoScalingGroups, SecurityGroups, VPCs, LaunchConfigurations, aws_acm_certificate, ECR repositories…
Here’s ¼ of the end result of how many resources I’ve got.
Let’s pause here and say that before this endeavor I had just about no idea what all this AWS infrastructure really was.
I just wanted to declare what resources I wanted and then let something figure it out. Isn’t there anything that does that? Luckily there is! https://www.terraform.io
So, what does Terraform look like? Well it’s pretty amazingly just-what-you-hoped-it-would-look-like. Here are the ECS Services and Tasks
You can see that each task defines how much memory and cpu it needs
You can see that we’re able to avoid checking secrets into terraform by using template files.
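The Terraform files from the slides aren't included in these notes. To give a feel for the idea, here is a Python stand-in (not Terraform itself) that renders a container definition from a template, so the secret is supplied at render time instead of being committed. The field names follow the ECS container definition format; the container name, sizes and variable names are made up.

```python
import json
from string import Template

# Hypothetical template: the file that gets checked in has placeholders only.
CONTAINER_TEMPLATE = Template("""
[
  {
    "name": "web",
    "image": ${image},
    "cpu": 256,
    "memory": 512,
    "essential": true,
    "environment": [
      {"name": "SECRET_KEY_BASE", "value": ${secret_key_base}}
    ]
  }
]
""")


def render_container_definitions(image, secret_key_base):
    """Fill the placeholders and return the parsed container definitions."""
    rendered = CONTAINER_TEMPLATE.substitute(
        # json.dumps quotes and escapes each value so the result stays valid JSON.
        image=json.dumps(image),
        secret_key_base=json.dumps(secret_key_base),
    )
    return json.loads(rendered)
```

The `cpu` and `memory` values are what the cluster uses to decide where a task fits.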
For a Rails app, you probably have at least 2 tasks: the task for the server, which will be run in a service, and a separate task that runs db:migrate, which you can just run ad-hoc.
Here’s building a docker image
And pushing docker image to ECR
Here’s terraform plan, which tells you what it is going to do.
Then terraform apply which actually tells ECS to swap which task the service is running.
Within ECS you can see that it starts the new task and hits the health check. Once that’s good, it registers the new task in the ALB and starts draining connections to the old task. Then it removes the old task and reaches steady state.
ECS is really easy to set up to pipe logs to CloudWatch.
Running a https://www.datadoghq.com agent in ECS is totally doable and means you get great dashboards about what is happening in your cluster right out of the box.
So why consider ECS to run your Docker images?
http://blog.jdwyah.com/2016/02/ruby-on-rails-on-docker-on-amazon-ecs-w.html and https://github.com/jdwyah/rails-docker-ecs-datadog-traceview-terraform have links to some basic terraform files that will point you in the right direction.
The last hurdle for RateLim.it was figuring out how to actually log & process everyone’s API usage efficiently. Remember we have to serve the request in less than $0.000005, so there’s not a lot of room.
Amazon API Gateway was just going to be too expensive at these margins.
Instead Kinesis Firehose pointed to S3 and queried with https://aws.amazon.com/athena/ has proved to be a really delightful option. If you have a lot of data in S3 that you need to query with irregular ad-hoc queries you’ve got to check out Athena.
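As a rough picture of the data flow, the sketch below writes usage events as newline-delimited JSON (one common shape for objects landing in S3) and runs locally the kind of aggregation you would hand to Athena. The field names, the `usage_events` table and the query are hypothetical, not ratelim.it's real schema.

```python
import json
from collections import Counter

# The sort of ad-hoc query this setup makes cheap (hypothetical table and columns).
EXAMPLE_QUERY = """
SELECT account_id, count(*) AS requests
FROM usage_events
GROUP BY account_id
"""


def to_record(account_id, limit_name, timestamp):
    """Serialize one API call as a single JSON line."""
    event = {"account_id": account_id, "limit_name": limit_name, "ts": timestamp}
    return json.dumps(event) + "\n"


def requests_per_account(blob):
    """Local equivalent of EXAMPLE_QUERY over a blob of JSON lines."""
    counts = Counter()
    for line in blob.splitlines():
        if line.strip():
            counts[json.loads(line)["account_id"]] += 1
    return dict(counts)
```

The appeal is that the request path only has to emit a small record; all the counting happens later, against files in S3.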
So how do I actually deploy?
1) Build a Docker Image and push it to your docker repository
   - docker tag whatsize_ecr:36ba45
2) Create a new ECS “Task” that points to 36ba45
3) Tell the ECS Service to start running 36ba45 instead.
4) Sit back and watch the Magic!
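The steps above can be sketched as a list of commands. This is an assumed workflow rather than the exact script behind the talk: the repository URL and the `image_tag` Terraform variable are hypothetical, logging in to the registry is omitted, and with Terraform steps 2 and 3 both happen during `terraform apply`.

```python
def deploy_commands(local_name, repository_url, git_sha):
    """Return the commands for one deploy, in order, without running them."""
    local_image = f"{local_name}:{git_sha}"
    remote_image = f"{repository_url}:{git_sha}"
    return [
        # 1) Build the image, tag it for the remote repository, push it.
        ["docker", "build", "-t", local_image, "."],
        ["docker", "tag", local_image, remote_image],
        ["docker", "push", remote_image],
        # 2) and 3) Review, then apply, a plan that points the task at the new tag.
        ["terraform", "plan", "-var", f"image_tag={git_sha}"],
        ["terraform", "apply", "-var", f"image_tag={git_sha}"],
    ]


for command in deploy_commands("whatsize_ecr", "registry.example.invalid/whatsize_ecr", "36ba45"):
    print(" ".join(command))
```

Tagging images with the git SHA means every deploy is a new, immutable tag, so rolling back is just applying an older one.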
• Frankly it’s easier than the alternatives
• Runs anything that you can Docker
• Blue/Green Deployments out of the box
• Decent UI
• Deep understanding of autoscaling
• Very cost efficient. Bin Packing.
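To see why bin packing saves money, here is a toy first-fit-decreasing packer. It is not ECS's actual scheduler, and the task names and sizes are invented; it just shows small tasks sharing instances instead of each getting its own.

```python
def pack_tasks(tasks, instance_cpu, instance_memory):
    """Place (name, cpu, memory) tasks onto as few instances as first-fit finds.

    Returns a list of instances, each a list of task names.
    """
    instances = []  # each entry tracks remaining cpu/memory and placed tasks
    # Place the most memory-hungry tasks first.
    for name, cpu, memory in sorted(tasks, key=lambda t: t[2], reverse=True):
        if cpu > instance_cpu or memory > instance_memory:
            raise ValueError(f"task {name} does not fit on any instance")
        for inst in instances:
            if inst["cpu"] >= cpu and inst["memory"] >= memory:
                break  # first instance with room wins
        else:
            # Nothing had room: start a new instance.
            inst = {"cpu": instance_cpu, "memory": instance_memory, "tasks": []}
            instances.append(inst)
        inst["cpu"] -= cpu
        inst["memory"] -= memory
        inst["tasks"].append(name)
    return [inst["tasks"] for inst in instances]


tasks = [("rails", 256, 512), ("java", 512, 1024), ("writer", 256, 256), ("scaler", 128, 128)]
print(pack_tasks(tasks, instance_cpu=1024, instance_memory=2048))
```

With these made-up numbers, four services fit on two instances rather than four.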
– ECS is awesome!! You may really like it.
– Need RateLimits or featureflags? Try https://www.ratelim.it
• Ruby Library https://github.com/jdwyah/ratelimit-ruby
• Java library https://github.com/jdwyah/ratelimit-java
– Subscribe to http://engineering.ezcater.com