AWS Sydney Meetup April 2016 - Paul Wakeford


These are the slides from the presentation on AWS billing and cost control that I gave at the AWS Sydney Meetup on 6th April 2016. The deck is almost wordless, so please see the speaker notes below for the detail.

I'm happy to field any questions via the contact methods in the deck.

  • About me. I have about 20 years of IT experience, architect at Fairfax since 2007. Using AWS since 2012 (Singapore days). I work for Fairfax Media, in the online side of things, so primarily on these properties...
  • But not Domain, they do their own thing. I’m not here to sell you anything. Any products or companies I mention I use as a customer only; I have no financial interest in any of them.
  • About 30 AWS accounts
    800 to 1100 instances
    Still migrating services
    Seven digit yearly AWS bill

    To show our rate of growth, this is a graph of our daily EC2 instance hours, from May 2013 to December 2015. Growth is about 7x.
  • And here is our bill growth. We really realised the extent of our billing issues in early 2015 and started to take steps to limit increases then. You can see we peaked in September and have been dropping back since then, while still seeing growth in usage.
  • Billing. It’s a bit boring though, right? And as techies you may be thinking ‘why bother? Why should I care?’

    Because someone has to pay. You may be a private user or a small business, in which case the money is coming right out of your pocket. That is money that could be reinvested in the company (a bigger office, employing someone to do the bad jobs), or used to go on holiday or just to buy cool toys.

    If you work for Global MegaCorp then you could be a hero - having the reputation as someone who brings in projects under budget is not a bad thing.
    Or you can buy cool toys/tools with the saved money - maybe better monitoring or alerting tools, New Relic etc.

    Because we want to play with cool things. AWS is a cool thing, and ‘management’ has been told many times that ‘cloud is cheaper’. Do you want to go back to on-premises or VMware? Lose all your cool AWS toys?

    Part of architecting solutions is being cost effective - if you do any of the advanced AWS certs - Pro SA or DevOps Pro you will find there are plenty of ‘choose the most cost effective option’ questions.
  • Also this guy is rich enough.
  • You need to be able to slice and dice your bill so you can identify high cost areas. Tagging is vital for this.

    Tags are labels - a key/value pair you can apply to most AWS resources - EC2 instances, EBS volumes, S3 buckets etc. Use tags everywhere. You can have up to ten tags per resource. Tags are case sensitive. We use Project to identify resources for cross charging. You may want to tag front end web servers, app servers, Varnish servers, mobile API servers, etc. Come up with your own tags based on your use cases - the important thing is to have a standard and stick to it. You may need a Tag Policeman. And yes, not everything can be tagged.
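
    A "Tag Policeman" check can be sketched in a few lines of Python. This is only an illustration: the required keys (`Project`, `Role`) and the function name `check_tags` are assumptions, not the standard described in the talk, and the tags are passed in as a plain dictionary rather than fetched from AWS.

```python
from typing import Dict, List

# Hypothetical tagging standard: these key names are illustrative only.
REQUIRED_TAGS = ["Project", "Role"]


def check_tags(tags: Dict[str, str], required: List[str] = REQUIRED_TAGS) -> Dict[str, List[str]]:
    """Report required tag keys that are missing or present with the wrong case."""
    missing, wrong_case = [], []
    # Map each lower-cased key back to the key as it was actually written.
    lowered = {key.lower(): key for key in tags}
    for key in required:
        if key in tags:
            continue  # exact, case-sensitive match: compliant
        if key.lower() in lowered:
            # e.g. 'project' was used instead of 'Project'
            wrong_case.append(lowered[key.lower()])
        else:
            missing.append(key)
    return {"missing": missing, "wrong_case": wrong_case}


if __name__ == "__main__":
    # A resource tagged 'project' (lower case) with no 'Role' tag at all.
    print(check_tags({"project": "cms", "Name": "web-01"}))
```

    Because tag keys are case sensitive, the sketch reports `project` separately from a truly missing key, which is usually a quicker fix.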
  • Tag support in the console is much better now - you can edit tags, find resources that are not tagged correctly, and create groups of tagged resources. And of course the CLI fully supports tagging. There are third party tools to manage tagging too - I’ve contributed to Graffiti Monkey, a tag inheritance tool for EBS volumes and snapshots, which is on GitHub. GorillaStack also have tools to manage tagging.

    Tags are visible in the billing console and also carried through to your detailed billing file.
  • You should enable all Billing Reports in your billing console, and choose which tags to show in the reports. You won’t want them all. Your billing file is then built up during the month (not quite real-time, there is a delay) as a CSV file in your nominated bucket.

    Note that tags are case sensitive hence the extra ‘project’ tags shown.
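
    Once the CSV is landing in the bucket, slicing it by tag is straightforward. The sketch below totals cost per tag value using only the standard library. The sample rows are made up, and the column names `user:Project` and `UnBlendedCost` are assumptions about the report layout; check the header row of your own file before relying on them.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Made-up sample data. The column names are assumptions about the report
# layout, not copied from a real billing file.
SAMPLE = """ProductName,UnBlendedCost,user:Project
Amazon Elastic Compute Cloud,12.50,cms
Amazon Simple Storage Service,3.25,cms
Amazon Elastic Compute Cloud,7.00,
Amazon Elastic Compute Cloud,1.75,mobile-api
"""


def cost_by_tag(csv_file, tag_column="user:Project", cost_column="UnBlendedCost"):
    """Sum the cost column per tag value; rows with an empty tag are grouped as '(untagged)'."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        tag = row.get(tag_column) or "(untagged)"
        totals[tag] += Decimal(row[cost_column])
    return dict(totals)


if __name__ == "__main__":
    for tag, total in sorted(cost_by_tag(io.StringIO(SAMPLE)).items()):
        print(f"{tag}: {total}")
```

    The `(untagged)` bucket is worth watching: it is the part of the bill nobody has claimed yet.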
  • Baseline metrics - as you would for performance, do the same for costs. You need to know what is normal, then measure exceptions and rates of change.
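
    One simple way to turn "know what is normal, then measure exceptions" into code is to compare each day's cost with a trailing average. This is a minimal sketch with made-up numbers; the function name, the 7-day window and the 25% threshold are all assumptions to tune for your own bill.

```python
from statistics import mean


def flag_exceptions(daily_costs, baseline_days=7, threshold=0.25):
    """Return (day_index, cost, relative_change) for days whose cost exceeds
    the trailing `baseline_days` average by more than `threshold`."""
    flagged = []
    for i in range(baseline_days, len(daily_costs)):
        baseline = mean(daily_costs[i - baseline_days:i])
        if baseline == 0:
            continue  # no meaningful baseline to compare against
        change = (daily_costs[i] - baseline) / baseline
        if change > threshold:
            flagged.append((i, daily_costs[i], round(change, 2)))
    return flagged


if __name__ == "__main__":
    # Made-up daily costs: a steady week, then a spike on day index 8.
    costs = [100] * 7 + [105, 160, 110]
    for day, cost, change in flag_exceptions(costs):
        print(f"day {day}: cost {cost} is {change:.0%} above the trailing average")
```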
  • Cost Explorer in AWS billing console. This example is for a CMS project. See tags on right.
    Visible effect of purchasing reserved instances in September onwards, then some expiring in Feb 2016.
    Also note ELB costs now assigned due to better tagging.
    S3 reduction due to better asset management.
  • An in-house tool, sadly not yet open sourced. Shows running costs by tag, name, project etc. Can drill into the instance detail and see where an instance was left on for testing over a weekend, etc.
  • Alerting. Set up billing alerts. You should do this whatever size you are. You don’t want to be another story on reddit where some API keys were lost and used to mine bitcoins.
    These billing alerts are for the whole account only, however you can also set budgets with alerts and they can be set per tag.
  • These are actually CloudWatch alarms which tie into SNS.
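
    To make the CloudWatch and SNS link concrete, the sketch below builds the parameters for an alarm on the `EstimatedCharges` metric in the `AWS/Billing` namespace. It only constructs and prints the dictionary; nothing is sent to AWS. The alarm name, threshold, account ID and SNS topic ARN are hypothetical. If you pass these to the CloudWatch PutMetricAlarm API (for example via boto3), note that billing metrics are published in us-east-1.

```python
import json


def billing_alarm_params(threshold_usd, sns_topic_arn):
    """Build PutMetricAlarm-style parameters for a whole-account billing alarm."""
    return {
        "AlarmName": f"estimated-charges-over-{threshold_usd}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        # The SNS topic that receives the notification (hypothetical ARN below).
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    params = billing_alarm_params(
        500, "arn:aws:sns:us-east-1:123456789012:billing-alerts"
    )
    print(json.dumps(params, indent=2))
```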
  • There are third party tools around to do cost analysis - CloudCheckr, Cloudability, etc. We use CloudCheckr and I have it set to send me daily emails for each account so I can quickly scan the subject lines and look for rates of change. Low tech but it works.
  • Ensure costs are known at every level, from beginning to end, from project owner to engineer.
    Ensure project owners / budget holders know what their costs are up front and ongoing.
    Ideally have a pair of people as cost monitors.
  • When I do a design I include a bill of materials which the project owner has to sign off to say they accept the costs. The project is then assigned a code which is visible in the AWS CSV reports we enabled earlier. That allows us to cross charge the project and compare it to projections.
    Make cost control a KPI for team leaders and engineers. How you do it is up to you but having something in perf reviews, bonus reviews or salary reviews helps keep costs in focus.
    You can also give business owners access to AWS Billing Reports, to third party tools or to an internal dashboard.
    This is something we are still poor at doing.
  • AWS tools - Autoscale scheduling
    Open source - such as CloudCycler/FlyWheel - HTTP://J.MP/AWSCOST
    Third party tools - GorillaStack, ParkMyCloud etc

    Cost section of Trusted Advisor (TA) requires a support plan
    Cost reporting has improved a lot - if you haven’t used it recently take a look.
    Open source tools - Fairfax have written a couple, CloudCycler and Flywheel, [describe].
    Check the link for usage.
  • Graph of instance hours, with the impact of CloudCycler visible from about halfway through.
  • An example from CloudCheckr.
    Go through your account and billing files carefully - find unused ELBs, unattached EIPs, old DynamoDB tables (maybe from EMR use) etc.
  • Design for cost. Use cheaper alternatives. Use containers to drive up utilisation & for on socket provisioning. (pic with options).
  • A Spot/EC2 example. You can register two AS groups behind an ELB. In this case we used t2.small instances for base load and used the CPU Credit CloudWatch metric as the basis for adding capacity to the Spot instance group.
  • Do you even need to run instances?
    We had a stats tracking solution - when a user clicked on a link, that click was sent to an HA nginx setup and then stored in a MongoDB cluster. This could all be replaced with a serverless option at one fifth of the cost of the EC2 solution.
    Another example - developer asked for an EC2 instance to host a solution in production. So that would mean 2 instances. Dug further - publishing microservice - use Docker? Dug further - NodeJS app - use Lambda?
  • Other options..
    Sure, S3 is cheap but not so much if you are storing petabytes of logs. If you have already ingested them into your log analysis engine, why not use a lifecycle rule to migrate the objects to Glacier?
    Use memcached in MySQL
    Use GP2 instead of PIOPS
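
    As an illustration of the Glacier lifecycle rule mentioned above, the sketch below builds a lifecycle configuration in the shape the S3 PutBucketLifecycleConfiguration API accepts. It only constructs and prints the document; nothing is applied to a bucket. The `logs/` prefix, the 30-day delay and the rule ID are assumptions chosen for the example.

```python
import json


def glacier_lifecycle(prefix, days):
    """Build an S3 lifecycle configuration that moves objects under `prefix`
    to Glacier once they are `days` days old."""
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}-to-glacier",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical values: logs already ingested elsewhere, archived after 30 days.
    print(json.dumps(glacier_lifecycle("logs/", 30), indent=2))
```

    Check retrieval times and costs before archiving anything you may need to re-ingest in a hurry.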
  • Run static sites from S3 and CloudFront. For my personal blog I could have used WordPress but I don’t need that functionality, security exposure or cost - so I use a static site generator called Hugo and host the site in S3.
    Using CFN has a positive cost impact as you have the confidence to destroy environments.
  • Use consolidated billing with multiple accounts - one bill, and volume usage discounts, e.g. S3. RIs are active across consolidated accounts.
  • For assets that don’t need low latency consider other (cheaper) regions (e.g. CloudFront from US only for zip files).
  • RIs for base load. Your DBR can be analysed in Redshift and Tableau to produce visualisations like this. If you have enterprise support your TAM should be doing this.
    Also use Trusted Advisor or another tool (CloudCheckr, Cloudability etc).
  • Or DIY - templates written by Evan Crawford of AWS.
  • RI balancer tool - takes an inventory of your RIs and instances (across consolidated accounts) and tries to make them match based on size and AZ.
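
    The matching idea can be sketched with a multiset comparison. This is not the RI balancer tool itself, just a minimal illustration: reservations and running instances are given as made-up lists of (instance type, AZ) pairs, and the function name `ri_coverage` is hypothetical.

```python
from collections import Counter


def ri_coverage(reservations, instances):
    """Compare reserved capacity with running instances, keyed by (type, AZ).

    Returns two dicts: running instances with no matching RI, and RIs with
    no matching running instance.
    """
    reserved = Counter(reservations)
    running = Counter(instances)
    uncovered = running - reserved  # Counter subtraction keeps positive counts only
    unused = reserved - running
    return dict(uncovered), dict(unused)


if __name__ == "__main__":
    # Made-up inventory for illustration.
    reservations = [("m3.large", "ap-southeast-2a")] * 3 + [("c3.xlarge", "ap-southeast-2b")]
    instances = [("m3.large", "ap-southeast-2a")] * 2 + [("m3.large", "ap-southeast-2b")] * 2
    uncovered, unused = ri_coverage(reservations, instances)
    print("Running without an RI:", uncovered)
    print("RIs not being used:", unused)
```

    In this made-up inventory, one unused m3.large reservation sits in one AZ while two uncovered m3.large instances run in another, which is exactly the kind of mismatch a balancer would try to fix.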
  • Unlike physical servers it is very easy for AWS resources to be accidentally hidden at any time.
    This is not a one-off task - it’s an ongoing process and one where you definitely get out what you put in.
    Get everyone involved.
  • The obligatory ‘we are hiring’ part.
  • AWS Sydney Meetup April 2016 - Paul Wakeford

    1. Slim your AWS bill-y ..with these 7 weird tips ..puntastic
    2. Paul Wakeford @paulwakeford
    3. Tip 1: Tag everything
    4. Tip 2: Measure the baseline, monitor, alert on exceptions
    5. Tip 3: Establish a responsibility model
    6. Tip 4: Reduce usage
    7.
    8. CloudCycler impact
    9. Tip 5: Design for cost
    10. Tip 6: Use accounting tricks
    11. Tip 7: Review and restart
    12. Questions? @paulwakeford
    13. CREDITS Special thanks to all the people who made and released these awesome resources for free: ◦ Presentation template by SlidesCarnival ◦ Photographs by Unsplash