Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

AWS Summit Sydney 2014 | Why Scale Matters and How the Cloud Really is Different

A behind the scenes look at key aspects of the AWS infrastructure deployments. Some of the true differences between a cloud infrastructure design and conventional enterprise infrastructure deployment and why the cloud fundamentally changes application deployment speed, economics, and provides more and better tools for delivering high reliability applications. Few companies can afford to have a datacenter in every region in which they serve customers or have employees. Even fewer can afford to have multiple datacenter in each region where they have a presence. Even fewer can afford to invest in custom optimized network, server, storage, monitoring, cooling, and power distribution systems and software. We'll look more closely at these systems, how they work, how they are scaled, and the advantages they bring to customers.

  • Login to see the comments

AWS Summit Sydney 2014 | Why Scale Matters and How the Cloud Really is Different

  1. 1. © 2014, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of, Inc. Why Scale Matters And How The Cloud Really Is Different Rodney Haywood Amazon Web Services
  2. 2. Agenda Redefining Scale at AWS AWS Designed Hardware & Infrastructure Multi-AZ Design Point & Why it Works
  3. 3. Perspective on Scaling On average, AWS adds enough new server capacity every day to support Amazon’s global infrastructure when it was a $7B business (2004).
  4. 4. AWS Global Infrastructure 9 regions 25 availability zones 51 edge locations
  5. 5. Scaling Applications
  6. 6. Amazon S3 Growth Q4 2006 Q4 2007 Q4 2008 Q4 2009 Q4 2010 Q4 2011 Q4 2012 Q4 2013 Peak Requests: 2,000,000+ per second Total Number of S3 Objects 2.9 Billion 14 Billion 40 Billion 102 Billion 762 Billion 262 Billion >1.7 Trillion >3 Trillion Peak requests: 1.5M/sec
  7. 7. DynamoDB: Eventual consistency When 11 becomes 10!
  8. 8. DynamoDB: Requests Served/Month
  9. 9. DynamoDB: Consistent Performance at Scale
  10. 10. “AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers.”
  11. 11. Agenda Redefining Scale at AWS AWS Designed Hardware & Infrastructure Multi-AZ Design Point & Why it Works
  12. 12. Pace of Innovation Infrastructure pace of innovation increasing –  Driven by cloud service providers and high-scale internet applications –  Cost of datacenter and H/W infrastructure dominates –  Infrastructure more than just a cost center High focus on innovation –  Driving down cost –  Increasing aggregate reliability –  Reducing resource consumption footprint
  13. 13. AWS Custom Server Designs OEM Server Ecosystem –  Optimized for 10s to 100s of thousands of customers –  Broadly applicable servers can run a variety of workloads Cloud Server Ecosystem –  Optimized for single customer –  Highly specialized servers optimized for specific workload –  Large scale deployments allow hardware specialization –  Move hot s/w kernels to hardware implementations –  Datacenters, servers, networking, storage to designed to integrated spec.
  14. 14. AWS Custom Storage Designs Commercial high-density storage: •  Quanta M4600H 4U Disk Enclosure •  Impressive best in class general purpose design •  We use custom design with still higher density OEM storage & servers must target vast workload diversity High scale supports AWS-specific optimizations –  More space, power, & cost efficient
  15. 15. Networking Equipment •  Relative cost of networking increasing quickly •  Profit margins high •  Ecosystem vertically integrated 8% 3 year server & 10 year infrastructure amortization Monthly Costs
  16. 16. Get the Network Out of the Way Current Networks Over-SubscribedMainframe Model Goes Commodity •  Forces workload placement restrictions •  Goal: Make all points in datacenter equidistant •  Amazon custom routers & protocol stacks
  17. 17. Power Infrastructure Negotiated power purchasing agreements AWS custom high-voltage sub-stations in some regions –  Lower power cost –  Build faster
  18. 18. Carbon Neutral Power Choice Most companies rarely build new datacenters so there are few new power procurement options The entire multi-datacenter US-WEST (Oregon) is 100% carbon neutral One of the largest AWS regions world-wide –  And, by far, the fastest growing
  19. 19. Procurement & Supply Chain Optimization Global demand allows purchasing power at volume Direct component purchasing –  Precise inventory control –  Better pricing –  Optimized designs Supply ChainProcurement Demand-driven supply chain Shorter cycle time drives higher utilization –  Predicting next week easier than 4 to 6 months out Less overbuy & less capacity risk yielding lower costs
  20. 20. Utilization & Economics On premise 30% utilization VERY good &10% to 20% more common Solution: Pool number of heterogeneous services Don’t block the business Don’t over-buy Transfers capital expense to variable expense Apply capital for business investments rather than infrastructure Cost encourages prioritization of work by application developers High scale needed to make a spot market for low priority work Pay as You Go Pay as You Grow Server Utilization Problem Chargeback Models Drive Good Behavior
  21. 21. Amazon Cycle of Innovation 15+ years of operational excellence LowerReduce Prices Innovate Listen to Customers Lower Costs Improve Processes Re-invest in Features 42 AWS price reductions since 2006
  22. 22. AWS Pace of Innovation New  Service  Announcements  &  Updates     9 24 48 61 82 159 280 88 2007 2008 2009 2010 2011 2012 2013 2014
  23. 23. Agenda Redefining Scale at AWS AWS Designed Hardware & Infrastructure Multi-AZ Design Point & Why it Works
  24. 24. Conventional Design: Cross-Region Replication 5th app availability “9” only via multi-datacenter replication Conventional approach: –  Two datacenters in distant locations –  Replicate all data to both datacenters The industry-wide dominant multi-DC availability approach –  Looks rock solid but performs remarkably poorly in practice Acid Test: Are you willing to pull the plug on the primary server? 99.999%
  25. 25. What is wrong with inter-regional replication? Asynchronous replication between datacenters –  Committing to an SSD order 1 to 2 msec –  LA to New York 74 msec round trip On failure, a difficult & high skill decision: –  Fail-over & lose transactions, or –  Don’t fail-over & lose availability I’ve been on these calls in the past –  No win situation –  Very hard to get right
  26. 26. What Else is Wrong with X-Country Replication? Fragile: Active/Passive Doesn’t Work –  Failover to a system that hasn’t been taking operational load –  Passive secondary not recently tested –  Secondary config or S/W version different, incorrect load balancer config, incorrect network ACLs, latent hardware problem, router problem, resource shortage under load –  Can’t test without negative customer impact –  If you don’t test it, it won’t work 2-Way Redundancy Expensive: –  More than ½ capacity reserved to handle failure –  3 datacenters much less expensive but impractical w/o high scale
  27. 27. AWS Multi-Availability Zone Model Choose Region to be close to user, close to data, or meeting jurisdictional requirements Synchronous replication to 2 (or better 3) Availability Zones –  Easy when less than 2 to 3 msec away –  Can failover w/o customer impact ELB over EC2 instances in different AZs Stateless EC2 apps easy For persistent state use –  DynamoDB –  Simple Storage Service –  Mutli-AZ RDS
  28. 28. New Research: Customers Improve Availability by Migrating Apps to AWS 32% reduction in total application downtime 2013 AWS Customer Survey Research Note: Benchmarking availability and reliability in the cloud: Amazon Web Services Nucleus Research, November 2013, Document N168
  29. 29. Is Hosting On-premises Less Expensive? Utilization fundamentally higher in cloud –  Aggregating non-correlated workloads, scale, spot market Amazon specific H/W designs –  ODM acquisition of custom servers & net gear –  Direct purchasing of disk, memory, & CPU –  AWS controlled hypervisor & net protocol layers Deep R&D: Many new data centers built each year Immense scale –  Volume purchasing, highly automated, specialists in all areas Amazon margins are tiny compared to enterprise margins
  30. 30. Summary AWS Economics driven by scale & singular focus –  Economies of scale –  Increased availability through multiple-datacenter deployment –  Steadily declining price Mega-scale advantages available to all customers regardless of size –  Datacenter presence near all customers world-wide –  Multiple datacenters in each region for high availability –  Deeper R&D investment & operational focus in datacenter, server, storage, & networking than any IT organization in the world –  Buying power that rivals the biggest in the world Cloud Model Fundamentally different from the last 30 years –  Even if rebranded as “cloud enabled”, “private cloud”, “cloud-like”
  31. 31. END •  Harshana to replace