
Cloud Native Cost Optimization


Slightly updated version of AWS Re:Invent 2014 talk.

Published in: Internet
  • First time I've given this talk. Still contains too many cut-and-pasted slides from the talk I did with Jinesh Varia of AWS at Cloud Connect 2013. I also want more data points from more companies to validate the waffly estimates. Mostly trying to establish an outline model at this point. There should be a spreadsheet version of it eventually.

Cloud Native Cost Optimization

  1. Cloud Native Cost Optimization. Adrian Cockcroft, @adrianco, Technology Fellow - Battery Ventures. November 2014
  2. What does everyone want?
  3. Less Time, Less Cost
  4. Faster! See talks by @adrianco: Speed and Scale (QCon New York); Fast Delivery (GOTO Copenhagen)
  5. Cheaper! This brand new talk: How to use Cloud Native architecture to reduce cost without slowing down
  6. How is Cost Measured?
  7. 1 Bottom Up; 2 Product; 3 Top Down
  8. Bottom Up Costs (1): Add up the cost to buy and operate every component
  9. Product Cost (2): Cost of delivering and maintaining each product
  10. Top Down Costs (3): Divide total budget by the number of components
  11. Top Down (3) vs. Bottom Up (1): Will never match! Hidden subsidies vs. hidden costs
  12. Agile Team Product Profit (2): Value minus costs; Time to value; ROI, NPV, MMF; Profit center
  13. What about cloud costs?
  14. Cloud Native Cost Optimization: Optimize for speed first; Turn it off!; Capacity on demand; Consolidate and reserve; Plan for price cuts; FOSS tooling
  15. Do The Impossible Immediately: Netflix Examples • European launch using AWS Ireland – No employees in Ireland, no provisioning delay, everything worked – No need to do detailed capacity planning – Over-provisioned on day 1, shrunk to fit after a few days – Capacity grows as needed for additional country launches • Brazilian proxy experiment – No employees in Brazil, no “meetings with IT” – Deployed instances into two zones in AWS Brazil – Experimented with network proxy optimization – Decided that gain wasn’t enough, shut everything down
  16. The Capacity Planning Problem
  17. Best Case Waste: Product Launch Agility - Rightsized. [Chart: demand, cloud and datacenter capacity ($) across Pre-Launch, Build-out, Testing, Launch, Growth, Growth] Cloud capacity used is maybe half average DC capacity
  18. Failure to Launch: Product Launch - Under-estimated. [Chart phases: Pre-Launch, Build-out, Testing, Launch, Growth, Growth] Mad scramble to add more DC capacity during launch phase outages
  19. Over the Top Losses: Product Launch Agility - Over-estimated. [Chart ($), phases: Pre-Launch, Build-out, Testing, Launch, Growth, Growth] Capacity wasted on failed launch magnifies the losses
  20. Turning off Capacity: Off-peak production; Test environments; Dev out of hours; Dormant data science. When you turn off your cloud resources, you actually stop paying for them
  21. Turn off Test Environments: Snapshot or freeze; Fast restart needed; Persistent storage; 40 of 168 hrs/wk
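The "40 of 168 hrs/wk" figure on slide 21 can be turned into a rough saving estimate. The sketch below assumes a test environment billed purely by the hour and needed only for a 40-hour working week; the function name is hypothetical, and storage kept while instances are stopped is not counted.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def off_hours_saving(hours_needed_per_week: float) -> float:
    """Fraction of an always-on hourly bill avoided by running only when needed."""
    if not 0 <= hours_needed_per_week <= HOURS_PER_WEEK:
        raise ValueError("hours must be between 0 and 168")
    return 1 - hours_needed_per_week / HOURS_PER_WEEK


# A test environment used 40 hours a week avoids roughly three quarters
# of the instance-hours an always-on environment would be billed for.
print(f"{off_hours_saving(40):.0%}")
```

Under these assumptions the instance bill drops by about 76%, before any snapshot or persistent storage charges.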
  22. Seasonal Savings: Optimize during a year, 50% savings. [Chart: weekly CPU load, web servers by week, weeks 1 to 49]
  23. Autoscale the Costs Away: 50%+ cost saving; Scale up/down by 70%+; Move to load-based scaling
  24. Daily Duty Cycle: [Chart: business throughput and instances] Reactive autoscaling saves around 50%; Predictive autoscaling saves around 70%. See Scryer on Netflix Tech Blog
  25. Underutilized and Unused: AWS Support - Trusted Advisor - Your personal cloud assistant
  26. Clean Up the Crud: Other simple optimization tips • Don’t forget to… – Disassociate unused EIPs – Delete unassociated Amazon EBS volumes – Delete older Amazon EBS snapshots – Leverage Amazon S3 Object Expiration. Janitor Monkey cleans up unused resources
  27. Total Cost of Oranges: When comparing TCO, make sure that you are taking all the cost factors into consideration: Place, Power, Pipes, People, Patterns. How much does OpenStack or ESX datacenter automation software and support cost per instance?
  28. When Do You Pay? [Timeline labels: Ages Ago, Now, Next Month, bill] Datacenter Up Front Costs: Lease Building; Install AC etc.; Rack & Stack; Private Cloud SW; Run My Stuff
  29. Reservation Reductions: On Demand: no upfront, $0.070/hr, $1840/36mo. Light Use: $172 upfront, $0.050/hr, $1486/36mo, savings 22%. Medium Use: $286 upfront, $0.022/hr, $864/36mo, savings 53%. Heavy Use: $337 upfront, $0.015/hr, $731/36mo, savings 60%. Prices on Nov 11th, for m3.medium (1 vCPU, 3.75G RAM, SSD) purely to show typical savings
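The 36-month totals on slide 29 can be reproduced from the upfront and hourly prices. The sketch below assumes each instance runs for all 26,280 hours in three years; the plan keys and function names are illustrative, and the prices are the slide's illustrative m3.medium figures.

```python
HOURS_36_MONTHS = 3 * 365 * 24  # 26,280 hours, assuming the instance never stops

# (upfront USD, hourly USD) as listed on slide 29 for m3.medium
PLANS = {
    "on_demand": (0, 0.070),
    "light": (172, 0.050),
    "medium": (286, 0.022),
    "heavy": (337, 0.015),
}


def total_cost(upfront: float, hourly: float, hours: int = HOURS_36_MONTHS) -> float:
    """Upfront fee plus hourly charges over the whole term."""
    return upfront + hourly * hours


def saving_vs_on_demand(plan: str) -> float:
    """Fractional saving of a reserved plan against running on demand."""
    base = total_cost(*PLANS["on_demand"])
    return 1 - total_cost(*PLANS[plan]) / base


for name in PLANS:
    cost = total_cost(*PLANS[name])
    print(f"{name}: ${cost:,.0f} over 36 months, saving {saving_vs_on_demand(name):.0%}")
```

Computed this way, the totals match the slide, and medium and heavy use reproduce the 53% and 60% savings. Light use comes out near 19% rather than the 22% shown, so the slide's light-use figure may rest on a different assumption.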
  30. Blended Benefits: Mix and match reserved types and on-demand. [Chart: instances (0 to 12) by day of month (1 to 28); series: Heavy Utilization Reserved Instances, Light RI, On-Demand]
  31. Consolidated Cost Cutting: Consolidated Billing: single payer for a group of accounts • One bill for multiple accounts • Easy tracking of account charges (e.g., download CSV of cost data) • Volume discounts can be reached faster with combined usage • Reserved Instances are shared across accounts (including RDS Reserved DBs)
  32. Production Priority: Over-reserve the production environment. Production Env. account: 100 reserved; QA/Staging Env. account: 0 reserved; Perf Testing Env. account: 0 reserved; Development Env. account: 0 reserved; Storage account: 0 reserved. [Chart: total capacity]
  33. Actual Reservation Usage: Consolidated billing borrows unused reservations. Production Env. account: 68 used; QA/Staging Env. account: 10 borrowed; Perf Testing Env. account: 6 borrowed; Development Env. account: 12 borrowed; Storage account: 4 borrowed. [Chart: total capacity]
  34. Consolidated Reservations: Burst capacity guarantee; Higher availability with lower cost; Other accounts soak up any extra; Monthly billing roll-up; Capitalize reservation charges! But: fixed location and instance type
  35. Re-sell Reservations: Reserved Instance Marketplace. Buy a smaller term instance; Buy instance with different OS or type; Buy a Reserved Instance in different region; Sell your unused Reserved Instance; Sell unwanted or over-bought capacity; Further reduce costs by optimizing
  36. Use EC2 Spot Instances: Cloud native dynamic autoscaled spot instances. Real world total savings up to 50%
  37. Right Sizing Instances: Fit the instance size to the workload
  38. Technology Refresh: Older m1 and m2 families: • Slower CPUs • Higher response times • Smaller caches (6MB) • Oldest m1.xlarge • 15G/8.5ECU/35c 23ECU/$ • Old m2.xlarge • 17G/6.5ECU/25c 26ECU/$. New m3 family: • Faster CPUs • Lower response times • Larger caches (20MB) • Java perf ratio > ECU • New m3.xlarge • 15G/13ECU/28c 46ECU/$ • 77% better ECU/$ • Deploy fewer instances
  39. Data Science for Free: Follow the customer (run web servers) during the day; Follow the money (run Hadoop clusters) at night. [Chart: number of instances running (0 to 16) by day of week, Mon to Sun; series: auto-scaling servers, Hadoop servers; line: number of reserved instances]
  40. Real Reservation Reductions: Soaking up unused reservations. The number of unused reserved instances is published as a metric. Netflix Data Science ETL workload • Daily business metrics roll-up • Starts after midnight • EMR clusters started using hundreds of instances. Netflix Movie Encoding workload • Long queue of high and low priority encoding jobs • Can soak up 1000’s of additional unused instances
  41. Six Ways to Cut Costs: Building Cost-Aware Cloud Architectures. #1 Business agility by rapid experimentation = profit; #2 Business-driven auto scaling architectures = savings; #3 Mix and match Reserved Instances with On-Demand = savings; #4 Consolidated billing and shared reservations = savings; #5 Always-on instance type optimization = recurring savings; #6 Follow the customer (run web servers) during the day, follow the money (run Hadoop clusters) at night
  42. Expect Prices to Drop: Three years halving every 18mo = maybe 40% overall savings. [Chart: price (0 to 100) by month over three years] Data shown is purely illustrative
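The "maybe 40%" on slide 42 depends on how the price falls between halvings. The sketch below brackets it with two assumed price curves, neither taken from the talk: a smooth monthly decay and a single step down at each 18-month point. Function names are hypothetical, and usage is assumed constant over the three years.

```python
def smooth_price_fraction(months: int = 36, halving_months: int = 18) -> float:
    """Average monthly price as a fraction of the starting price,
    assuming the price decays smoothly and halves every halving_months."""
    prices = [0.5 ** (m / halving_months) for m in range(months)]
    return sum(prices) / months


def stepwise_price_fraction(months: int = 36, halving_months: int = 18) -> float:
    """Same average, assuming the price only drops in one step at each halving point."""
    prices = [0.5 ** (m // halving_months) for m in range(months)]
    return sum(prices) / months


print(f"smooth decay saving:   {1 - smooth_price_fraction():.0%}")
print(f"stepwise decay saving: {1 - stepwise_price_fraction():.0%}")
```

With these assumptions the smooth curve saves about 45% over three years and the stepwise curve 25%, so the slide's estimate sits between the two.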
  43. Combinations
  44. Lift and Shift Compounding: Traditional application using AWS heavy use reservations. Base price is for capacity bought up-front. [Chart, % of base price: Base Price 100; Rightsized 70; Seasonal 70; Daily Scaling 70; Reserved 30; Tech Refresh 30; Price Cuts 25]
  45. Conservative Compounding: Cloud native application, partially optimized, light use reservations. [Chart, % of base price: Base Price 100; Rightsized 70; Seasonal 50; Daily Scaling 35; Reserved 25; Tech Refresh 20; Price Cuts 15]
  46. Aggressive Compounding: Cloud native application, fully optimized, autoscaling, mixed reservation use, costs 4% of base price over three years! [Chart, % of base price: Base Price 100; Rightsized 50; Seasonal 25; Daily Scaling 12; Reserved 8; Tech Refresh 6; Price Cuts 4]
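The compounding charts on slides 44 to 46 multiply savings rather than adding them. The sketch below assumes the bar labels read, from Base Price to Price Cuts, as the percentages listed in `SCENARIOS`. That is an interpretation of the flattened chart text, and the function names are illustrative. It derives the saving each step contributes relative to the cost before it, then multiplies the steps back together.

```python
from math import prod

STEPS = ["Rightsized", "Seasonal", "Daily Scaling", "Reserved", "Tech Refresh", "Price Cuts"]

# Assumed reading of the bar labels, as a percentage of base price.
SCENARIOS = {
    "lift_and_shift": [100, 70, 70, 70, 30, 30, 25],
    "conservative": [100, 70, 50, 35, 25, 20, 15],
    "aggressive": [100, 50, 25, 12, 8, 6, 4],
}


def step_savings(levels):
    """Saving contributed by each step, relative to the cost just before it."""
    return [1 - after / before for before, after in zip(levels, levels[1:])]


def compounded_cost(savings):
    """Remaining cost fraction after applying every saving in sequence."""
    return prod(1 - s for s in savings)


for name, levels in SCENARIOS.items():
    savings = step_savings(levels)
    detail = ", ".join(f"{step} {s:.0%}" for step, s in zip(STEPS, savings))
    print(f"{name}: {detail} -> {compounded_cost(savings):.0%} of base price")
```

On this reading no single step in the aggressive case saves more than about half, yet six steps in sequence leave 4% of the base price.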
  47. Cloud Native Patterns ● Business logic isolation in stateless micro-services ● Immutable code with instant rollback ● Auto-scaled capacity and deployment updates ● Distributed across availability zones and regions ● De-normalized single function NoSQL data stores ● See over 40 NetflixOSS projects at netflix.github.com ● Get “Technical Indigestion” trying to keep up with techblog.netflix.com
  48. Cost Monitoring and Optimization
  49. Final Thoughts: Turn off idle instances; Clean up unused stuff; Optimize for pricing model; Assume prices will go down; Go cloud native to save
  50. Further Reading: See www.battery.com for a list of portfolio investments ● Battery Ventures Blog http://www.battery.com/powered ● Adrian’s Blog http://perfcap.blogspot.com and Twitter @adrianco ● Slideshare http://slideshare.com/adriancockcroft ● Monitorama Opening Keynote Portland OR - May 7th, 2014 - Video available ● GOTO Chicago Opening Keynote May 20th, 2014 - Video available ● QCon New York – Speed and Scale - June 11th, 2014 - Video available ● Structure - Cloud Trends - San Francisco - June 19th, 2014 - Video available ● GOTO Copenhagen/Aarhus – Denmark – Sept 25th, 2014 ● DevOps Enterprise Summit - San Francisco - Oct 21-23rd, 2014 - Videos available ● GOTO Berlin - Germany - Nov 6th, 2014 ● AWS Re:Invent - Las Vegas - Cloud Native Cost Optimization - November 14th, 2014 ● Dockercon Europe - Amsterdam - Microservices - December 4th, 2014
