Optimizing Total Cost of Ownership for the AWS Cloud

Cost is often the conversation starter when customers think about moving to the cloud. AWS helps lower costs for customers through its “pay only for what you use” pricing model, frequent price drops, and pricing model choice to support variable & stable workloads. In this session, you will learn about the financial considerations of owning and operating a traditional data center or managed hosting provider versus utilizing AWS. We will detail our TCO methodology and showcase cost comparisons for some common customer use-cases. We’ll also cover a few AWS cost optimization areas, including Spot and Reserved Instances, EC2 Auto Scaling, and consolidated billing.

Transcript

  • 1. Optimizing Total Cost of Ownership for AWS. Rohit Rahi, Amazon Web Services; Valentino Volonghi, AdRoll. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. Agenda • Total Cost of Ownership • AdRoll • Cost Optimization on AWS
  • 3. Lower costs with AWS. 1: Replace up-front capital expense with low variable cost (“Average of 400 servers replaced per customer”). 2: Economies of scale allow AWS to continually lower costs (42 Price Reductions). 3: Pricing model choice to support variable & stable workloads (On-Demand, Reserved, Spot, Dedicated). 4: Save more money as you grow bigger (Tiered Pricing, Volume Discounts, Custom Pricing). Source: IDC Whitepaper, sponsored by Amazon, “The Business Value of Amazon Web Services Accelerates Over Time.” December 2013
  • 4. Lower costs than on-premises. [Diagram, not to scale: CAPEX and OPEX compared for an On-Premises Traditional Data Center, an On-Premises Virtualized Data Center, and AWS (OPEX*).] Cost savings from running internal IT more efficiently; cost savings from moving to a public cloud provider. AWS Scale: • Multiple new data centers built each year • Volume purchasing, highly automated, supply chain optimization. Utilization fundamentally higher in AWS cloud: • Aggregating non-correlated workloads, scale, spot market. Amazon specific hardware designs: • OEM acquisition of custom servers & net gear • Direct purchasing of disk, memory, & CPU • AWS controlled hypervisor & net protocol layers. *For AWS, OPEX costs include the Reserved Instances one-time low, upfront payment, if Reserved Instances are used.
  • 5. AWS Pricing Philosophy. Cycle: More AWS Usage → More Infrastructure → Economies of Scale → Lower Infrastructure Costs → Reduced Prices → More Customers. Also shown: Ecosystem, Global Footprint, New Features, New Services, Infrastructure Innovation. We pass the savings along to our customers in the form of low prices and continuous reductions (42 price reductions).
  • 6. Analysts have shown AWS reduces costs In early 2012, AWS commissioned IDC to interview 11 organizations that deployed applications on AWS. Since this study was conducted in early 2012, AWS has introduced price reductions nearly 20 times across Amazon EC2 and Amazon S3. IDC estimated what the impact of AWS's fee restructuring would be on the organizations that participated in the 2012 study and determined that the overall fees would drop by 21% lowering the five year TCO from $909,000 to $846,000. Source: IDC Business Value of AWS Accelerates over time IT PRODUCTIVITY INCREASE: 52% 5 YEAR TCO SAVINGS: 72%
  • 7. AWS TCO benefits increase over time… At 36 months of using AWS: $1 investment in AWS returns $3.50 in benefits (~3X). At 60 months of using AWS: $1 investment in AWS returns $8.40 in benefits (~8X). According to IDC, this relationship between length of time using AWS and return is due to customers leveraging the more optimized environment to generate more applications along a learning curve. Source: IDC Business Value of AWS Accelerates over time
  • 8. Comparing TCO is not easy (But We’re Going to Try) ≠
  • 9. Typical cost drivers for on-premises deployments, including overhead costs (illustrative)
    Server Costs: Hardware – Server, Rack Chassis, PDUs, ToR Switches (+Maintenance); Software – OS, Virtualization Licenses (+Maintenance); Overhead Cost – Space, Power, Cooling
    Storage Costs: Hardware – Storage Disks, SAN/FC Switches; Storage Admin costs; Overhead Cost – Space, Power, Cooling
    Network Costs: Network Hardware – LAN Switches, Load Balancer; Bandwidth costs; Network Admin costs; Overhead Cost – Space, Power, Cooling
    IT Labor Costs: Server Admin, Virtualization Admin
    Diagram doesn’t include every cost item. E.g. software costs can include database, management, middle tier software costs. Facilities cost can include costs associated with upgrades, maintenance, building security, taxes etc. IT labor costs can include security admin and application admin costs.
  • 10. AWS services pricing includes overhead costs. [Comparison table against a Hardware Vendor Offering; the per-item ✔/× marks are not recoverable from the transcript.] Items compared: Server; Network Hardware; Software (OS + VMs); DC/Co-lo Floor Space; Power; Cooling; Software Defined Networking; Data Center Personnel; Storage Redundancy; Resource Mgmt./SW Automation
  • 11. TCO Example: 100 VMs On-Premises vs. AWS
    On-premises VMs:
      # of VMs   Avg. vCPU   Avg. vRAM   Optimize by?   Usage
      25         1           2           RAM            5%
      35         4           14          RAM            80%
      30         8           32          RAM            60%
      5          8           68          RAM            40%
      5          16          128         Disk I/O       8%
    AWS instances:
      # of Instances   Instance     vCPU   RAM    Instance Type
      25               m1.small     1      1.7    On Demand
      35               m3.xlarge    4      15     3 Yr. Heavy RI
      30               m3.2xlarge   8      30     3 Yr. Med. RI
      5                m2.4xlarge   8      68.4   3 Yr. Light RI
      5                i2.4xlarge   16     122    3 Yr. Light RI
    On-premises VMs:
      Avg. vCPU   Avg. vRAM   Optimize by?
      1           2           CPU
      4           14          CPU
      8           32          CPU
      8           68          CPU
      16          128         Disk I/O
    AWS instances:
      Instance     vCPU   RAM
      m1.small     1      1.7
      c3.xlarge    4      7
      c3.2xlarge   8      15
      c3.2xlarge   8      15
      i2.4xlarge   16     122
  • 12. TCO Example: Choosing EC2 instances. [Chart of vCPU against RAM; each point is labeled vCPU, RAM, On Demand price. Series: General Purpose Instances, Compute-Optimized Instances, Memory-Optimized Instances, Storage-Optimized Instances.] Point labels: 1, 1.7, $0.060; 1, 3.75, $0.113; 2, 3.75, $0.145; 2, 7.5, $0.225; 2, 17.1, $0.410; 4, 7, $0.300; 4, 15, $0.450; 4, 34.2, $0.820; 8, 15, $0.600; 8, 30, $0.900; 8, 68.4, $1.640; 4, 30.5, $0.853; 8, 61, $1.705; 16, 30, $1.200; 32, 60, $2.400; 32, 244, $3.500; 16, 122, $3.410; 16, 117, $4.600; 32, 244, $6.820. On Demand Prices shown (N. Virginia region), only latest generation instances (M3, C3) shown where applicable, GPU and Micro instances not shown.
  • 13. TCO Example : 100 VMs On-Premises vs. AWS
  • 14. Customer Spotlight: Dow Jones Intl. • From over 40 data centers down to 6 • Planning to migrate 3000 apps by Jan 2015 • Saving $100M over 3 Years. Steps: 1. Evaluate infrastructure costs & architecture 2. Make business case 3. Enable decision to move to the cloud
  • 15. In Your TCO Analysis
    DON’T FORGET: Power/Cooling (compute, storage, shared network); Data Center Administration (procurement, design, build, operate, network, security personnel); Rent/Real Estate (co-lo charges, building depreciation, taxes); Software Licensing/Maintenance (OS, Virtualization, DCIM, Automation..); RAW vs. USABLE storage capacity (Usable = ~50% Raw); Storage Redundancy (RAID penalty, OS penalty); Storage Backup costs (Tape, backup software); Bandwidth, Network Gear & Redundancy (Routers, VPN, WAN..)
    THINK BENEFITS: Additional investment into new initiatives; Reduced procurement time, resources sitting on shelf; Cost of lost customers; Lower down time, increased productivity
  • 16. 50B req/day without breaking the bank
  • 17. Pixel “fires”
  • 18. Pixel “fires” Serve ad?
  • 19. Pixel “fires” Serve ad? Ad served
  • 20. Data must be available all over the world!
  • 21. 7/2011: ~50GB/day; 4/2013: ~5TB/day; 10/2013: ~20TB/day; 03/2014: ~40TB/day
  • 22. What do we do then? • Store 1.5+ PB of compressed data in S3 • Be available worldwide, and replicate all our data in every major area • ~50B requests/day with 100ms latency caps • As close as possible to 100% uptime • 35 engineers
  • 23. AdRoll S3 Usage by the numbers • 1.5+ PB of compressed data stored • 5-10TB of compressed new data each day • ~5B monthly requests in February • 500+ TB of transfer • No engineers spend time on storage
  • 24. AdRoll S3 buckets would be one of the most visited properties on the web, and we don’t even think about it.
  • 25. BackBlaze Storage Pod 4.0 • 180TB cost $9,305 to build, only 1 PSU per box. • X 3 for local redundancy • X 3 for geographic redundancy • X 2 because we need to have room for growth • +50% because we need to have quickly available spare parts in each area • TOTAL: $251,235 <- this is the real cost of 180TB
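The multipliers on this slide can be checked with a few lines of arithmetic. The sketch below only restates the slide's own figures; the constant and function names are illustrative.

```python
# Figures taken from the slide; names are illustrative only.
POD_COST_USD = 9_305        # build cost of one 180 TB Storage Pod
LOCAL_REDUNDANCY = 3        # "x 3 for local redundancy"
GEO_REDUNDANCY = 3          # "x 3 for geographic redundancy"
GROWTH_HEADROOM = 2         # "x 2 because we need room for growth"
SPARE_PARTS_FACTOR = 1.5    # "+50%" for spare parts in each area


def effective_cost(pod_cost):
    """Apply every multiplier from the slide to a single pod's build cost."""
    return (pod_cost * LOCAL_REDUNDANCY * GEO_REDUNDANCY
            * GROWTH_HEADROOM * SPARE_PARTS_FACTOR)


total = effective_cost(POD_COST_USD)
print(f"${total:,.0f}")                                 # $251,235
print(f"{total / POD_COST_USD:.0f}x the build cost")    # 27x the build cost
```

The combined multiplier is 3 × 3 × 2 × 1.5 = 27, which is how a $9,305 build cost becomes $251,235 for the same 180TB of usable data.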
  • 26. And it’s not even all of it… • Now let’s develop the software that manages about 5B monthly requests and transparently fails over… • Let’s add bandwidth cost, electricity and data center costs… • Our AWS total cost is equivalent to about 6 engineers.
  • 27. We’re not a storage company!
  • 28. We happen to have about as much data as a storage company has. But we don’t have engineering dedicated to storage.
  • 29. Our RTB infrastructure • ~500 c3.4xlarge in 4 regions • Close to 100% uptime since launch • <100ms max latency, 0.15% timeouts YTD at 50B requests/day • 500K+ requests/second globally on DynamoDB • Fully replace the global infrastructure in less than 1 hour
  • 30. 2 engineers
  • 31. We have more people installing, configuring and managing our office network and windows laptops
  • 32. Alternative? • Set up 4 data center locations • 2 on-call staff in each location • Provision ~1000 machines for growth and deployment flexibility • Implement Auto Scaling and a common API • Provision 10-20% cold capacity in each location • What if a big customer signs up for AdRoll? • What if we improve our algorithms and we can do with half the machines? • What if we want to upgrade our hardware?
  • 33. We’ve been trading money… For Time
  • 34. Time is a far more scarce resource and you can’t save it.
  • 35. What does it mean in numbers? • Minimum team size is about 20: – 8 on-call – 1 product manager for Auto Scaling and API – 5 engineers to develop and maintain Auto Scaling – 5 engineers to maintain Cassandra installation instead of DDB – 1 engineering manager • 20 people at an average $130K/year… $2.6M/year
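The staffing estimate above is simple enough to verify directly. This is a minimal sketch using only the roles and the average salary quoted on the slide; the dictionary keys are shorthand labels, not job titles from the talk.

```python
# Headcount per role, as listed on the slide.
TEAM = {
    "on-call staff": 8,
    "product manager (Auto Scaling and API)": 1,
    "engineers (Auto Scaling)": 5,
    "engineers (Cassandra instead of DynamoDB)": 5,
    "engineering manager": 1,
}
AVERAGE_SALARY_USD = 130_000  # "an average $130K/year" from the slide

headcount = sum(TEAM.values())
annual_cost = headcount * AVERAGE_SALARY_USD

print(headcount)                 # 20
print(f"${annual_cost:,}/year")  # $2,600,000/year
```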
  • 36. We spent less than that to run our entire infrastructure in EC2 last year. All of these very skilled engineers would not be adding value to our product.
  • 37. If we didn’t use EC2 a competitor would… And then we’d be in trouble
  • 38. Cost Optimization on AWS
  • 39. 1. Choose the right instance types
    Start: Choose an instance that best meets your basic requirements. Start with memory & then choose closest virtual cores. Look for peak IOPS storage requirements.
    Tune: Change instance size up or down based upon monitoring. Use CloudWatch & Trusted Advisor to assess.
    Roll-Out: Run multiple instances in multiple Availability Zones.
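One way to read "start with memory & then choose closest virtual cores" is as a two-level nearest match. The sketch below is an assumption about how that rule could be applied, not the presenters' tool: the catalog holds only the instance types named in the slide 11 example (with RAM assumed to be in GB), and `pick_by_memory` is a hypothetical helper.

```python
# (name, vCPU, RAM) for the instance types that appear in the slide 11 example.
# This is not a full EC2 catalog, and the RAM unit (GB) is an assumption.
CATALOG = [
    ("m1.small", 1, 1.7),
    ("m3.xlarge", 4, 15),
    ("m3.2xlarge", 8, 30),
    ("m2.4xlarge", 8, 68.4),
    ("i2.4xlarge", 16, 122),
    ("c3.xlarge", 4, 7),
    ("c3.2xlarge", 8, 15),
]


def pick_by_memory(vcpu, ram, catalog=CATALOG):
    """Pick the instance with the closest RAM; break ties on closest vCPU."""
    name, _, _ = min(
        catalog,
        key=lambda inst: (abs(inst[2] - ram), abs(inst[1] - vcpu)),
    )
    return name


# The five on-premises VM profiles from the slide 11 example: (vCPU, vRAM).
for vcpu, ram in [(1, 2), (4, 14), (8, 32), (8, 68), (16, 128)]:
    print(vcpu, ram, pick_by_memory(vcpu, ram))
```

With this small catalog the rule selects m1.small, m3.xlarge, m3.2xlarge, m2.4xlarge, and i2.4xlarge, the same instances the example lists. A real sizing pass would also weigh peak IOPS and monitoring data, as the slide notes.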
  • 40. 2. Use Auto Scaling
    Launch Configuration: Describes what Auto Scaling will create when adding Instances. Only one active launch configuration at a time.
      as-create-launch-config --image-id ami-54cf5c3d --instance-type m1.small --key mykey --group webservers --launch-config 101-launch-config
    Auto Scaling Group: Auto Scaling managed grouping of EC2 instances. Automatically scale the number of instances by policy – Min, Max, Desired.
      as-create-auto-scaling-group 101-as-group --availability-zones us-east-1a us-east-1b --launch-configuration 101-launch-config --load-balancers myELB --max-size 5 --min-size 1
    Auto Scaling Policy: Parameters for performing an Auto Scaling action. Scale Up/Down and by how much.
      as-put-scaling-policy 101ScaleUpPolicy --auto-scaling-group 101-as-group --adjustment=1 --type ChangeInCapacity --cooldown 300
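The same three steps can be sketched in Python. This is an illustration rather than part of the talk: it assumes the boto3 Auto Scaling client (whose `create_launch_configuration`, `create_auto_scaling_group`, and `put_scaling_policy` calls take the keyword arguments below), reuses the names and values from the slide's commands, and `apply_auto_scaling` is a hypothetical helper.

```python
# Request parameters mirroring the slide's three CLI commands.
LAUNCH_CONFIG = {
    "LaunchConfigurationName": "101-launch-config",
    "ImageId": "ami-54cf5c3d",
    "InstanceType": "m1.small",
    "KeyName": "mykey",
    "SecurityGroups": ["webservers"],
}
GROUP = {
    "AutoScalingGroupName": "101-as-group",
    "LaunchConfigurationName": "101-launch-config",
    "AvailabilityZones": ["us-east-1a", "us-east-1b"],
    "LoadBalancerNames": ["myELB"],
    "MinSize": 1,
    "MaxSize": 5,
}
SCALE_UP_POLICY = {
    "PolicyName": "101ScaleUpPolicy",
    "AutoScalingGroupName": "101-as-group",
    "AdjustmentType": "ChangeInCapacity",
    "ScalingAdjustment": 1,   # add one instance per scaling action
    "Cooldown": 300,          # seconds to wait between scaling actions
}


def apply_auto_scaling(client):
    """Create the launch configuration, the group, then the scale-up policy."""
    client.create_launch_configuration(**LAUNCH_CONFIG)
    client.create_auto_scaling_group(**GROUP)
    return client.put_scaling_policy(**SCALE_UP_POLICY)
```

Assuming boto3 is installed and credentials are configured, it could be invoked as `apply_auto_scaling(boto3.client("autoscaling", region_name="us-east-1"))`; the region is inferred from the Availability Zones in the slide.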
  • 41. Utilization and Auto Scaling: Granularity. More small instances vs. fewer large instances: 29 m1.large @ $0.240/hr. = $6.96 per hour; 59 m1.small @ $0.06/hr. = $3.54 per hour
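The two fleet costs on this slide can be reproduced as follows. The sketch uses only the instance counts and hourly prices shown; `fleet_cost_per_hour` is an illustrative helper.

```python
def fleet_cost_per_hour(count, hourly_price):
    """Hourly cost of running `count` instances at `hourly_price` each."""
    return count * hourly_price


large = fleet_cost_per_hour(29, 0.240)   # 29 m1.large
small = fleet_cost_per_hour(59, 0.06)    # 59 m1.small

print(f"m1.large fleet: ${large:.2f}/hr")   # m1.large fleet: $6.96/hr
print(f"m1.small fleet: ${small:.2f}/hr")   # m1.small fleet: $3.54/hr
print(f"difference: {1 - small / large:.0%}")
```

In this example the finer-grained fleet costs roughly half as much per hour, because capacity can track demand in smaller steps.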
  • 42. 3. Turn off un-used instances • Dev./test instances • Simple instance start/stop • Tear down/build up altogether • Instances are disposable
  • 43. 4. Use Reserved Instances. Reserved Instances enable you to maximize savings by paying a low, one-time fee for a capacity reservation of 1-3 years in exchange for a significant discount on the hourly rate. [Chart series: On Demand, Light Utilization RI, Medium Utilization RI, Heavy Utilization RI / Spot Instances.] AWS services offering reservations: Amazon EC2 up to 65% savings; Amazon RDS up to 76% savings; Amazon DynamoDB up to 76% savings; Amazon Redshift up to 73% savings; Amazon ElastiCache up to 70% savings
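The trade-off between the one-time fee and the discounted hourly rate comes down to a break-even point. The sketch below illustrates that calculation; the prices in it are hypothetical placeholders, not AWS prices and not figures from the talk.

```python
def break_even_hours(on_demand_hourly, ri_upfront, ri_hourly):
    """Hours of use after which a Reserved Instance becomes cheaper than On Demand."""
    if ri_hourly >= on_demand_hourly:
        raise ValueError("the reserved hourly rate must be below the On Demand rate")
    return ri_upfront / (on_demand_hourly - ri_hourly)


# Hypothetical prices for illustration only.
ON_DEMAND_HOURLY = 0.10   # $/hour
RI_UPFRONT = 300.0        # one-time fee, $
RI_HOURLY = 0.04          # discounted $/hour
HOURS_PER_YEAR = 24 * 365

hours = break_even_hours(ON_DEMAND_HOURLY, RI_UPFRONT, RI_HOURLY)
print(f"break-even after about {hours:,.0f} hours")
print(f"about {hours / HOURS_PER_YEAR:.0%} utilization over one year")
```

With these placeholder numbers the reservation pays off once the instance runs more than roughly 57% of a one-year term; below that, On Demand is cheaper.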
  • 44. Customer Spotlight: Pinterest AWS Set-up • 8 billion objects and 400 terabytes of data (May 2012), 10x growth from August 2011; EC2 instances have grown by 3x in the same time period • 150 EC2 instances (web tier), 90 instances (in-memory caching), 70 master databases • Reserved instances used for standard traffic; On-demand and spot instances used to handle daily elastic load. Most services targeted to run at about 50% on-demand and 50% spot. “Imagine we were running our data center, and we had to go through a process of capacity planning and ordering and racking hardware. It wouldn't have been possible to scale fast enough,” – Ryan Park, Pinterest Operations Engineer. Costs have gone from $54 per hour to $20 per hour. Only 2 weeks of engineering effort was required to achieve this cost savings. Sources: Pinterest AWS Case study; Pinterest Architecture update, 410TB of data, 10X Growth, 12 employees; Return on Agility, Werner Vogels
  • 45. 5. Use Spot Instances • Pricing: up to 92% discount • Elastic: capacity not otherwise available • Tradeoff: potential for interruption. Picking the right Bid Price: tolerance for interruptions, % likelihood of termination
  • 46. Customer Spotlight: Cycle Computing. 1.21 PFLOPS; 264 years of compute in < 18 hours; 16,788 instances in 8 regions. On-Premises: $68 Million; Spot: $33K
  • 47. 6. Leverage Storage Classes. [Diagram labels: On-premises Data Center (Block, File); Gateway Appliance / AWS Storage Gateway; AWS Cloud (Amazon S3, Amazon EBS, Amazon Glacier); Archive, Backup, Disaster Recovery.] Amazon S3 Reduced Redundancy: • 99.99% durability vs. 99.999999999% • Up to 20% savings • Great for everything that is easy to reproduce. Amazon Glacier: • Same durability as S3 • 3 to 5 hours restore time • Up to 89% savings • Great for archiving, long-term backups and old data
  • 48. 7. Offload your architecture. The more you can offload, the less infrastructure you need to maintain, scale, and pay for • Offload popular traffic to Amazon CloudFront and S3 • Introduce Caching. [Three charts of Response Time and Server Load: No CDN; CDN for Static Content; CDN for Static & Dynamic Content.]
  • 49. 8. Use Application Services: Elastic Load Balancing, Amazon Relational Database Service (RDS), Amazon Simple Queue Service (SQS), Amazon Simple Email Service (SES), Amazon Elastic MapReduce, Amazon ElastiCache, Amazon Simple Notification Service (SNS)
  • 50. Leverage Application Services: Elastic Load Balancer vs. EC2 instance + software LB (each shown as DNS in front of web servers in an Availability Zone). On Demand Prices shown (N. Virginia region).
    Elastic Load Balancer: $0.025 per Elastic Load Balancer-hour (or partial hour); $0.008 per GB of data processed by an Elastic Load Balancer. For 100 GB data processed and 1 ELB: $18 (0.025*24*30) + $0.008*100 = $18.80
    EC2 instance + software LB: $0.060 per hour for a separate m1.small for the software load balancer. $0.060*24*30 = $43.20 (m1.small) + software LB cost
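The monthly comparison on this slide works out as follows. The sketch uses the slide's own prices and its 30-day month; the software load balancer's license or support cost is left out because the slide gives no figure for it.

```python
HOURS_PER_MONTH = 24 * 30   # the slide assumes a 30-day month

# Elastic Load Balancer prices from the slide (On Demand, N. Virginia).
ELB_PER_HOUR = 0.025
ELB_PER_GB = 0.008
DATA_PROCESSED_GB = 100

# Self-managed alternative: one m1.small dedicated to a software load balancer.
M1_SMALL_PER_HOUR = 0.060

elb_monthly = ELB_PER_HOUR * HOURS_PER_MONTH + ELB_PER_GB * DATA_PROCESSED_GB
ec2_monthly = M1_SMALL_PER_HOUR * HOURS_PER_MONTH   # excludes software LB cost

print(f"ELB:            ${elb_monthly:.2f}/month")   # ELB:            $18.80/month
print(f"m1.small alone: ${ec2_monthly:.2f}/month")   # m1.small alone: $43.20/month
```

Even before adding the software load balancer's own cost, the instance-based option is more than twice the managed service's price in this example.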
  • 51. 9. Use Consolidated Billing • Receive a single bill for all charges incurred across all linked accounts • Share RI discounts • Combine tiering benefits • View & manage linked accounts • Add additional accounts
  • 52. 10. Leverage AWS tools: AWS Trusted Advisor, AWS EC2 Usage Reports
  • 53. Recap: Choose the right Instance type; Use Auto Scaling; Turn off un-used Instances; Use Reserved Instances; Use Spot Instances; Leverage Storage Classes; Offload your architecture; Use Application Services; Use Consolidated Billing; Leverage AWS Tools – Trusted Advisor, EC2 Usage Reports; Others…
  • 54. Summary
    TCO: • Make reasonable assumptions and leverage industry benchmarks • Know the on-premises hidden costs
    Cost Optimization: • Create cost-aware architectures and leverage best practices • Re-evaluate and revisit your architecture often • Leverage Application Services, CloudWatch • Stay up to date – RI modifications, Trusted Advisor
  • 55. Optimizing Total Cost of Ownership for AWS. Rohit Rahi, Amazon Web Services; Valentino Volonghi, AdRoll. Thank you!