Design for Scale / Surge 2010

Christopher Brown's surgecon2010 talk on resilient, scalable systems based on his work on Amazon's EC2 and the Opscode Platform.

Published in: Technology, Business

  1. Design for Scale. Christopher Brown, VP, Engineering ‣ cb@opscode.com ‣ @skeptomai ‣ www.opscode.com. Copyright © 2010 Opscode, Inc - All Rights Reserved
  2-5. Who am I?
      • Amazon EC2
      • Microsoft Edge Computing Network
      • Opscode
  6. Google, Amazon, Microsoft built their own tools
  7. Almost everyone else is here... inexperienced or poorly equipped for the world in which we now operate.
  8-14. The Method (http://www.flickr.com/photos/wonderlane/2090966628/sizes/l/)
      • Bootstrapping
      • Configuration
      • Command & Control
      • Nanite!
  15-16. Got it? Defining the cloud is like this...
  17-21. Origin Myth of EC2
  22-28. Dynamism ...not about excess capacity...
      • Disintermediation
          • Developers can freely experiment
          • This is what you are paying for
      • Isolation
          • Applications safely co-exist
      • Utilization
          • Best use of expensive resources
  29-31. Scale: You are not this BIG
  32-33. You are not that BIG
      • LAMP can scale on generic architecture
      • 2008 - Facebook has over 800 memcached servers, with 28 terabytes of RAM
      • 2010 - GitHub has 16 physical machines, 128 cores, 288 GB RAM
      • Don’t design for A Million Users
      • Ship early, Ship ugly, Ship often!
  34. EC2 Design Principles (http://www.flickr.com/photos/europedistrict/4058066840/)
      • Minimize management footprint
      • Run in VMs just like customers.
      • Forced to analyze what must run in privileged space
      • “Harden everything” means separate network traffic inside the datacenter – customers and management run there
      • True multi-tenancy - Customers run side-by-side
      • Design by Fight Club
      • “You are not a beautiful and unique snowflake”
      • “On a large enough time line, the survival rate for everyone will drop to zero.”
  35. Primitives
      • Simple API, single unit of work
      • think of early Unix tools (MH)
      • Can compose with other APIs
      • Does not define policy / coupling
      • Customers will surprise you
  36. APIs, Mashups
  37. Simplify (http://www.flickr.com/photos/jfseesthings/4293062294/sizes/l/)
      • Move complexity “up the stack”
      • Easier to debug
      • “Simple and Open” wins
      • OAuth, OpenID
      • ATOM, REST
      • Example: EC2 Metadata - HTTP
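The "EC2 Metadata - HTTP" example on slide 37 can be sketched in a few lines. This is a minimal illustration, not code from the talk: it assumes the plain, unauthenticated GET style of the instance metadata service at 169.254.169.254 (newer instances may require a session token first), and the names `metadata_url` and `fetch_metadata` are hypothetical helpers.

```python
import urllib.request

# Link-local address of the EC2 instance metadata service.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(key: str) -> str:
    """Build the URL for one metadata key, e.g. 'instance-id'."""
    return METADATA_BASE + key.lstrip("/")


def fetch_metadata(key: str, opener=urllib.request.urlopen, timeout: float = 2.0) -> str:
    """Fetch one metadata value with a plain HTTP GET.

    `opener` is injectable (an assumption of this sketch) so the function
    can be exercised off-instance without touching the network.
    """
    with opener(metadata_url(key), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, `fetch_metadata("instance-id")` would return the instance's ID. The point of the slide survives in the sketch: no SDK, no credentials handshake, just a URL any tool can read.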
  38-42. Cost: What are you willing to pay for?
      • CapEx versus OpEx
      • The Cloud is not “Cheaper”
      • Do you have money, time, or experience?
  43-45. Power
  46. Nobody ever imagined a band of Orcs would steal a database table. (Charles Stross - Halting State)
  47-48. MTTF & MTTR (http://www.flickr.com/photos/dierken/948171048/sizes/z/)
      Understanding how, when and why things fail is great ... but if your Mean Time to Recover exceeds the time value of your data, your business is DEAD
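To put numbers on the MTTF and MTTR slide, the standard steady-state relation is availability = MTTF / (MTTF + MTTR). The sketch below is an added illustration under that textbook formula; the function names and the example figures are assumptions, not values from the talk.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the service is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours_per_year(avail: float) -> float:
    """Expected hours of downtime in a 365-day year at a given availability."""
    return (1.0 - avail) * 365 * 24


# Halving recovery time buys the same availability as doubling time to failure.
faster_recovery = availability(mttf_hours=1000, mttr_hours=0.5)
rarer_failure = availability(mttf_hours=2000, mttr_hours=1.0)
```

Both variants land at the same availability, which is one way to read the slide: recovery time is usually the cheaper lever, and it is the one that decides whether the data is still worth anything when the service comes back.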
  49. Testing
      • Test with production-like dataset and performance
      • Don’t do “Design by Laptop”
      • A/B Testing
      • API versioning
  50-51. Pull the Plug (http://www.flickr.com/photos/rosipaw/5033284534/sizes/m/in/photostream/)
      • Create test environment
      • Pull the plug
      • Document
      • Pull the plug again!
  52-57. Theo vs Morpheus: Free your mind...
      • Vertical vs Horizontal Scale
      • Availability
      • Reliability
      • 99% vs 99.x% per unit?
      You are not Theo. You’re probably not Morpheus either.
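The "99% vs 99.x% per unit?" question has a simple arithmetic side. The sketch below is an added illustration that assumes units fail independently, which real systems sharing a rack, a network, or a deploy pipeline often violate; `any_of` and `all_of` are hypothetical names.

```python
def any_of(unit_availability: float, replicas: int) -> float:
    """Availability when any one of `replicas` interchangeable units can serve.

    Assumes independent failures (an idealization).
    """
    return 1.0 - (1.0 - unit_availability) ** replicas


def all_of(unit_availability: float, dependencies: int) -> float:
    """Availability when every one of `dependencies` units must be up."""
    return unit_availability ** dependencies


two_cheap_units = any_of(0.99, 2)   # horizontal: two 99% boxes side by side
three_in_series = all_of(0.99, 3)   # a request that needs three 99% services
```

Under the independence assumption, two 99% units in parallel reach about 99.99%, while three 99% services chained in series drop to roughly 97%. Redundancy helps only where the units are truly interchangeable.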
  58. Availability
      • For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response.
      • “Read globally, write locally” with inconsistent cache
      • Service Level Agreements, even (especially?) internally
  59. Think Globally, Act Locally (http://www.flickr.com/photos/28634332@N05/3872137437/sizes/m/in/photostream/)
      • Global but inconsistent aggregate view
      • Local action where data is authoritative
      • Autonomy
      • “Rightsizing” your failure domain
  60. Distributed Systems Design
      • Avoid execution caching
      • “Don’t lie, don’t retry”
      • Embrace failure
      • Don’t block the client
      • Avoid internal policy
      • Ensure the system makes forward progress
  61. Embrace Failure
      • It’s OK to apologize
      • It’s better to completely fail for some users than penalize all of them
      • The Web is all about “Hit Refresh”
  62. Apologize ...to Pat Helland
  63-64. Admission Control (http://www.flickr.com/photos/jayneandd/4450623309/sizes/m/in/photostream/)
      • Distributed Throttling
      • Staged / Pipeline with back pressure
      • Measure scalability at each stage
      • Degraded performance
      • Make progress for admitted requests
      • At odds with “stateless” / session-less
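One way to picture "staged pipeline with back pressure" is a stage whose inbox is bounded: a full inbox refuses new work instead of slowing everyone down, and whatever was admitted still gets finished. This is a single-process sketch added for illustration; the `Stage` class is hypothetical, and a distributed throttle would also need shared state, which is exactly the tension with "stateless" that the slide notes.

```python
import queue


class Stage:
    """One pipeline stage with a bounded inbox.

    A full inbox is the back-pressure signal: the caller is told "no"
    immediately rather than being left to wait.
    """

    def __init__(self, name: str, capacity: int):
        self.name = name
        self.inbox = queue.Queue(maxsize=capacity)

    def admit(self, request) -> bool:
        """Try to accept a request; return False when the stage is saturated."""
        try:
            self.inbox.put_nowait(request)
            return True
        except queue.Full:
            return False

    def drain(self, handler) -> int:
        """Process everything already admitted; return how many were handled."""
        handled = 0
        while True:
            try:
                request = self.inbox.get_nowait()
            except queue.Empty:
                return handled
            handler(request)
            handled += 1


front = Stage("front-end", capacity=2)
results = [front.admit(r) for r in ("a", "b", "c")]  # third request is shed
served = []
front.drain(served.append)
```

Rejecting "c" outright is the "fail completely for some users" choice from slide 61: requests "a" and "b" finish at full speed instead of all three degrading together.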
  65. Make Forward Progress
      • MVCC, vector clocks, & reconciliation
      • Don’t resurrect objects
      • always go forward, never go back
      • "name" is a property of an object, not its unique key
      • Break the link, garbage collect later
      • Model “degraded service” performance
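For readers who have not met vector clocks, the following minimal sketch shows the three operations that matter for reconciliation: advancing a clock, comparing two clocks, and merging them. It is a generic textbook formulation added here, not the EC2 or Opscode implementation; clocks are plain dicts mapping a node name to a counter.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    advanced = dict(clock)
    advanced[node] = advanced.get(node, 0) + 1
    return advanced


def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum: the clock that has seen everything a and b saw."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: dict, b: dict) -> str:
    """Classify a relative to b: 'equal', 'before', 'after', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v1 = increment({}, "node-a")
v2 = increment(v1, "node-a")      # node-a writes again
v3 = increment(v1, "node-b")      # node-b writes without seeing v2
reconciled = merge(v2, v3)
```

A "concurrent" result is the case that needs reconciliation. The merged clock is strictly after both inputs, so the system only ever moves forward, as the slide asks.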
  66. Request Signing
      • Stateless - no session tracking to lose or to purge later
      • X509 - only public information on front-end boxes. More secure against exploit
      • Shared secret - faster, smaller signature but requires secret info close to request front-end
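The shared-secret variant of request signing can be sketched with an HMAC over a canonical form of the request. This is an added illustration only: the canonical string layout (method, path, timestamp, body hash joined by newlines) is an assumption made for the example and is not the AWS or Opscode wire format.

```python
import hashlib
import hmac


def sign(secret: bytes, method: str, path: str, timestamp: str, body: bytes = b"") -> str:
    """Return a hex HMAC-SHA256 signature over a canonical form of the request.

    The canonical layout here is hypothetical; a real protocol specifies its own.
    """
    canonical = "\n".join(
        [method.upper(), path, timestamp, hashlib.sha256(body).hexdigest()]
    ).encode("utf-8")
    return hmac.new(secret, canonical, hashlib.sha256).hexdigest()


def verify(secret: bytes, method: str, path: str, timestamp: str,
           body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)


signature = sign(b"shared-secret", "get", "/nodes", "2010-09-30T12:00:00Z")
```

Every request carries its own proof, so the front end keeps no session table. The cost named on the slide is visible too: `verify` needs the secret itself, whereas an X509 scheme would need only the public key. A real deployment would also reject stale timestamps to limit replay, which this sketch omits.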
  67. Measure, Monitor, Respond
      • Save *everything* *forever*
      • Histograms / Pareto Chart
      • tp99.9, tp99, and tp90
      • ignore tp50, “average”
      • http://en.wikipedia.org/wiki/Control_chart
      • http://www.newrelic.com/
      • http://www.splunk.com/
      • skewness, kurtosis
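Why tp99 and tp99.9 rather than the average: a small tail of slow requests barely moves the median, and the mean describes no request anyone actually experienced. The sketch below uses the nearest-rank percentile definition, which is one of several in common use; the sample latencies are made up for illustration.

```python
import math
import statistics


def tp(samples, percentile: float):
    """Nearest-rank percentile: the smallest sample that at least
    `percentile` percent of all samples do not exceed."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # round() guards against float noise such as 99.9 * n / 100 landing
    # a hair above an integer before the ceiling is taken.
    rank = math.ceil(round(percentile * len(ordered) / 100.0, 9))
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and two very slow ones.
latencies_ms = [10] * 98 + [2000, 5000]
mean_ms = statistics.mean(latencies_ms)
```

For this data the median and tp90 are both 10 ms and the mean is about 80 ms, while tp99 is 2000 ms and tp99.9 is 5000 ms. Only the high percentiles show what the unluckiest users saw.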
  68. Control Chart
      • Day over Day
      • Same Day, Year over Year
      • Confidence Intervals
      “Shewhart stressed that bringing a production process into a state of statistical control, where there is only common-cause variation, and keeping it in control, is necessary to predict future output and to manage a process economically.”
      • http://en.wikipedia.org/wiki/Control_chart
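A control chart reduces to a center line plus upper and lower limits derived from a baseline period, with anything outside the limits flagged for attention. The sketch below is a simplification added for illustration: it uses the plain population standard deviation of the baseline and three-sigma limits, whereas a formal Shewhart individuals chart estimates sigma from moving ranges. The function names and numbers are assumptions.

```python
import statistics


def control_limits(baseline, sigmas: float = 3.0):
    """Return (lower, center, upper) limits computed from a baseline period."""
    center = statistics.mean(baseline)
    spread = statistics.pstdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(baseline, observations, sigmas: float = 3.0):
    """Return (index, value) pairs for observations outside the limits."""
    lower, _, upper = control_limits(baseline, sigmas)
    return [(i, x) for i, x in enumerate(observations) if x < lower or x > upper]


# Hypothetical requests-per-second for the same hour on previous days...
same_hour_previous_days = [100, 102, 98, 101, 99, 100, 103, 97]
# ...and the readings being judged today.
today = [101, 96, 120, 99]
flagged = out_of_control(same_hour_previous_days, today)
```

Choosing the baseline is where "Day over Day" and "Same Day, Year over Year" come in: the limits only mean something when the baseline shares the periodicity of the data being judged.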
  69. Characteristic Curves
  70. Periodicity: SLA, Variance, Troubleshooting
  71. Data Taxonomy
      • Precious
      • Cacheable
      • Expensive
      • Cheap
  72. Consistency
      • Authoritative vs. Consultative
      • is_authorized? vs list group
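One reading of the `is_authorized?` versus "list group" contrast is that a yes/no security decision must come from the source of truth, while a listing is advisory and can tolerate a stale cache. The sketch below illustrates that reading only; the `GroupDirectory` class, its in-memory store, and its never-expiring cache are all hypothetical.

```python
class GroupDirectory:
    """Separates authoritative checks from consultative listings."""

    def __init__(self, store: dict):
        self._store = store   # source of truth: group name -> set of users
        self._cache = {}      # possibly stale copies used for listings

    def is_authorized(self, user: str, group: str) -> bool:
        """Authoritative: always answered from the source of truth."""
        return user in self._store.get(group, set())

    def list_group(self, group: str) -> list:
        """Consultative: a cached, possibly out-of-date answer is acceptable."""
        if group not in self._cache:
            self._cache[group] = sorted(self._store.get(group, set()))
        return self._cache[group]


store = {"admins": {"alice"}}
directory = GroupDirectory(store)
first_listing = directory.list_group("admins")
store["admins"].discard("alice")   # alice's access is revoked
store["admins"].add("bob")
```

After the revocation the listing may still show alice, which is tolerable for a display, but `is_authorized("alice", "admins")` must already say no.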
  73. Performance
      • Call length
      • Cyclomatic Complexity
      • Request ID flow
      • Vertical vs Horizontal Scale
      • tension between unit performance and scalability
  74. Failure Domains
      • EC2 “droplets”
      • EC2 DNS
      • Coordinator zones
  75. Still with me?
  76. Successes
      • Sharable “AMI”s
      • Metadata (Simple and open again)
      • Open API (think Eucalyptus)
      • No API throttling
      • Primitives
      • Pay-as-you-go
      • Free traffic between S3 and EC2
      • Data and Compute together
  77. Failures
      • SOAP makes little girls cry
      • Amazon Web Services, circa 2006 was > 75% REST or Query
      • SOAP well supported by commercial vendors, with their libraries
      • Still *Way* too hard to use.
      • Commodity business. Driving the bottom out of cost causes quality to suffer.
      • API vs UI?, User Experience in general
      • IaaS (Infrastructure as a Service) is insufficient by itself a hangman's noose. EC2, and the other offerings,
  78. Where are we going?
