Creating a Culture of Cost Management in the Cloud
1. Creating a Culture of Cost Management in the Cloud
Fabio Pedrazzoli Grazioli
2. Expanding our Cost Awareness
● The Cloud is no longer one person’s responsibility
● Everyone across the company should have visibility of the costs
● Data should be shared across the company (Engineering, Finance, Ops, Capacity, Execs)
● For best results: enact policies and evangelize best practices
● Incentivize good behaviour (build self-policing rather than watchdogs)
3. Rolling out a cost management program
● Cost Visibility (emails, alerts, dashboards)
● Cost Allocations (tags, linked accounts)
● Efficient usage (only what is needed)
● Saving on what we use (Reserved and Spot)
● Unit costs (get to the bottom line)
4. Visibility
● Incurring vs watching: close the gap between those who use the cloud and those who “count” the costs
● Give each stakeholder the spending fundamentals (e.g. daily emails, dashboards)
● Let teams see each other’s spending habits, creating a mild social pressure
Create broadly available dashboards and email broadcasts with per-team and company-wide reports
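A per-team report of this kind could be sketched as follows. This is a minimal illustration only: the `(team, cost)` record shape, the team names, and the "unallocated" bucket are assumptions made for the example, not the format of any real billing export.

```python
from collections import defaultdict


def spend_by_team(records):
    """Sum cost per team from (team, cost) records.

    Records with no team (None or "") are grouped under "unallocated"
    so that untagged spend stays visible in the report.
    """
    totals = defaultdict(float)
    for team, cost in records:
        totals[team or "unallocated"] += cost
    # Highest spenders first, so the report leads with the biggest numbers.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def format_report(totals, currency="£"):
    """Render the totals as plain-text lines suitable for a daily email."""
    return "\n".join(
        f"{team:<15}{currency}{cost:>10,.2f}" for team, cost in totals
    )


# Hypothetical daily records: (team, cost in £)
records = [("search", 120.0), ("payments", 310.5), (None, 42.0), ("search", 80.0)]
print(format_report(spend_by_team(records)))
```

Sending the same ranked list to every team is one way to create the mild social pressure described above.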
5. Allocation 1/2
Tags
● Great flexibility, good if there is compliance
● Define taxonomy
● Ideally build tagging into automation rather than chasing instances after deployment (strongly depends on compliance)
● Consider a “tag or terminate” policy, e.g. terminate after 24 hours, or at least send an email warning to teams/devs
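The decision logic behind a “tag or terminate” policy could look roughly like this. It is a sketch under stated assumptions: the required tag names are a hypothetical taxonomy, and the function only decides what to do. Actually sending the email or terminating the instance is left to whatever tooling the team already uses.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical taxonomy: every instance must carry these tag keys.
REQUIRED_TAGS = {"team", "environment"}
# Grace period taken from the 24-hour example above.
GRACE_PERIOD = timedelta(hours=24)


def tag_policy_action(tags, launched_at, now):
    """Return "ok", "warn" or "terminate" for one instance.

    tags        -- dict of tag key -> value currently on the instance
    launched_at -- timezone-aware launch time
    now         -- timezone-aware current time
    """
    missing = REQUIRED_TAGS - set(tags)
    if not missing:
        return "ok"
    if now - launched_at >= GRACE_PERIOD:
        return "terminate"
    # Still inside the grace period: warn the owners instead.
    return "warn"


now = datetime(2015, 3, 2, 12, 0, tzinfo=timezone.utc)
fresh = now - timedelta(hours=3)
stale = now - timedelta(hours=30)

print(tag_policy_action({"team": "search", "environment": "dev"}, stale, now))
print(tag_policy_action({"team": "search"}, fresh, now))
print(tag_policy_action({}, stale, now))
```

A softer variant would return "warn" in every case and leave termination as a manual step.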
6. Allocation 2/2
Linked Accounts
Each team gets an account
● Ultimate solution for the cleanest chargebacks
● Many Fortune 500 companies have hundreds of linked accounts (per team, and sometimes even per environment), all under the payer account
7. Efficiency
Don’t run the cloud as a datacenter
● 168 hours a week
● 108 are nights and weekends (about 64%)
At least for dev/test resources:
● Turn off underutilized instances (and schedule weekly reports)
● Schedule dev/test downtime
● Find old instance families that can be upgraded (e.g. m1 to m{3,4}) (beware changes in storage for this one)
● Get live reports/alerts when a threshold is hit
Consider setting up some kind of scoring system to measure waste, for instance with parameters like: avg hourly node cost (£), avg node uptime (%), CPU usage (%), avg node running life (hrs) etc.
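One possible shape for such a score is sketched below. The formula is an assumption made for illustration (cost of the hours a node is up, weighted by how idle its CPU is), not a method from the original talk. It uses three of the parameters listed above and leaves out running life for brevity.

```python
from dataclasses import dataclass

HOURS_PER_WEEK = 168


@dataclass
class NodeStats:
    name: str
    hourly_cost: float  # avg hourly node cost (£)
    uptime_pct: float   # avg node uptime (%)
    cpu_pct: float      # avg CPU usage (%)


def weekly_waste(node):
    """Estimate weekly spend (£) on idle capacity for one node.

    Assumes idle CPU is a fair proxy for wasted spend, which will not
    hold for memory-bound or I/O-bound workloads.
    """
    hours_up = HOURS_PER_WEEK * node.uptime_pct / 100
    idle_fraction = 1 - node.cpu_pct / 100
    return node.hourly_cost * hours_up * idle_fraction


def rank_by_waste(nodes):
    """Return nodes sorted from most to least estimated waste."""
    return sorted(nodes, key=weekly_waste, reverse=True)


# Hypothetical nodes for illustration.
nodes = [
    NodeStats("batch-worker", hourly_cost=0.40, uptime_pct=35, cpu_pct=60),
    NodeStats("dev-box", hourly_cost=0.10, uptime_pct=100, cpu_pct=10),
]
for node in rank_by_waste(nodes):
    print(f"{node.name:<15}£{weekly_waste(node):.2f} per week")
```

In this example the cheaper node ranks first because it runs all week while almost idle, which is the pattern that scheduled dev/test downtime targets.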
8. Savings
Spot
● Spot instances are great savers but, except in some rare cases, may require a large amount of engineering work.
Reserved
● Reservations do not require any engineering
It’s entirely about “coupons” without touching the infrastructure in any way, shape or form.
● As long as an RI is in use at least 50% of the time, it has historically never worked out more expensive than on-demand, even accounting for the many price cuts applied over the years.
● Do not wait too long to get started with reservations (risk of paralysis by analysis)
● Look at hourly data because that is how AWS applies coupons
● Appoint a person for the RI process (see reservation slides)
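The hourly view and the 50% rule of thumb above can be combined into a simple sizing sketch. It assumes the input is a list of running-instance counts, one per hour, for a single instance type. The 50% threshold is the slide's historical rule of thumb, not a pricing guarantee, and the function names are made up for this example.

```python
def hours_covered_fraction(hourly_counts, n):
    """Fraction of hours in which at least n instances were running.

    This is how often the n-th reservation would actually be used.
    """
    busy_hours = sum(1 for count in hourly_counts if count >= n)
    return busy_hours / len(hourly_counts)


def suggest_reservations(hourly_counts, min_fraction=0.5):
    """Largest number of reservations where each one is used in at
    least min_fraction of the observed hours."""
    if not hourly_counts:
        return 0
    # Coverage never increases as n grows, so the first n that passes
    # when scanning downwards from the peak is the largest one.
    for n in range(max(hourly_counts), 0, -1):
        if hours_covered_fraction(hourly_counts, n) >= min_fraction:
            return n
    return 0


# Hypothetical hourly instance counts for one instance type.
hourly_counts = [2, 2, 5, 5, 5, 8, 2, 2]
print(suggest_reservations(hourly_counts))
print(suggest_reservations(hourly_counts, min_fraction=0.9))
```

Starting with a stricter threshold such as 0.9 gives a small, low-risk first purchase that can be topped up later.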
9. Unit Costs
Unit cost is cost per “X”
Some common biz metrics for unit costs are
● Subscribers
● Pageviews
● Customers
● API calls
● Conversions
Unit Cost = Total Cost / Biz Metric
£ 1,000.00 Spend / 1,000 Customers = £ 1 Unit Cost
One of the advantages is that anyone across the whole company can set goals based on unit costs. For instance: “let’s spend no more than £5 per customer”
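The formula and the goal-setting idea above can be written down in a few lines. This is a sketch only: the function names are made up for the example and the figures are the ones from the slide.

```python
def unit_cost(total_cost, metric_count):
    """Unit Cost = Total Cost / Biz Metric."""
    if metric_count <= 0:
        raise ValueError("the business metric must be positive")
    return total_cost / metric_count


def within_target(total_cost, metric_count, target):
    """True if the unit cost is at or below the agreed target."""
    return unit_cost(total_cost, metric_count) <= target


# Figures from the slide: £1,000 spend across 1,000 customers.
cost_per_customer = unit_cost(1000.0, 1000)
print(f"£{cost_per_customer:.2f} per customer")
print(within_target(1000.0, 1000, target=5.0))
```

The same two functions work for any of the metrics listed above by swapping in subscribers, pageviews or API calls as the denominator.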
10. Getting Started 1/2
Visibility
Give stakeholders a daily view into spending
Allocation
Put together a taxonomy with the finance team and start splitting linked accounts and tagging
Efficiency
Look at underutilized instances (low CPU, bandwidth, storage) and at M1 instances that are candidates for newer generations (remember to consider storage)
Hourly Savings
Make small, uncontroversial RI buys and repeat them periodically
11. Getting started 2/2
Unit Costs
Determine top-line biz metrics and divide costs by them
COE
Put together a Centre Of Excellence that meets regularly or at least has deadlines
12. References
This research is based mainly on the work of JR Storment (Cloudability)
AWS re:Invent 2014 | (ENT207) Creating a Culture of Cost Management in Your Organization
https://www.youtube.com/watch?v=SaOLzxYiZlE
Contacts: http://fabioit.wordpress.com