Copyright © 2010 Opscode, Inc - All Rights Reserved
‣ cb@opscode.com
‣ @skeptomai
‣ www.opscode.com
Christopher Brown VP, Engineering
Design for Scale
Who am I?
• Amazon EC2
• Microsoft Edge Computing Network
• Opscode
Google, Amazon, and Microsoft built their own tools
almost everyone else is here...
... inexperienced or poorly equipped for the world in which we now operate.
The Method
http://www.flickr.com/photos/wonderlane/2090966628/sizes/l/
Bootstrapping
Configuration
Command & Control
Nanite!
Got it?
Defining the cloud is like this...
Origin Myth of EC2
Dynamism
...not about excess capacity...
Dynamism
• Disintermediation
• Developers can freely experiment
• Isolation
• Applications safely co-exist
• Utilization
• Best use of expensive resources
This is what you are paying for
Scale
You are not this BIG
You are not that BIG
• LAMP can scale on generic architecture
• 2008 - Facebook has over 800 memcached servers, with 28 terabytes of RAM
• 2010 - GitHub has 16 physical machines, 128 cores, 288 GB RAM
• Don’t design for A Million Users
• Ship early, Ship ugly, Ship often!
EC2 Design Principles
• Minimize management footprint
• Run in VMs, just like customers
• Forced to analyze what must run in privileged space
• “Harden everything” means separate network traffic inside the datacenter: customers and management run there
• True multi-tenancy - Customers run side-by-side
• Design by Fight Club
• “You are not a beautiful and unique snowflake”
• “On a large enough time line, the survival rate for everyone will drop to zero.”
http://www.flickr.com/photos/europedistrict/4058066840/
Primitives
• Simple API, single unit of work
• think of early Unix tools (MH)
• Can compose with other APIs
• Does not define policy / coupling
• Customers will surprise you
APIs, Mashups
http://www.flickr.com/photos/jfseesthings/4293062294/sizes/l/
Simplify
• Move complexity “up the stack”
• Easier to debug
• “Simple and Open” wins
• OAuth, OpenID
• Atom, REST
• Example: EC2 Metadata - HTTP
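To make the EC2 Metadata example concrete: an instance can read facts about itself with a plain HTTP GET against the link-local address 169.254.169.254. A minimal Python sketch follows. The helper names `metadata_url` and `fetch_metadata` and the injectable `opener` parameter are illustrative assumptions, not part of any AWS library, and instances configured to require session tokens (IMDSv2) need an extra step that is not shown here.

```python
import urllib.request

# Link-local address of the EC2 instance metadata service.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(key):
    """Build the URL for one metadata key, e.g. 'instance-id'."""
    return METADATA_BASE + key.lstrip("/")


def fetch_metadata(key, timeout=2.0, opener=urllib.request.urlopen):
    """GET one metadata value and return it as text.

    `opener` is injectable (an assumption of this sketch) so the function
    can be exercised away from an instance with a fake.
    """
    with opener(metadata_url(key), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, `fetch_metadata("instance-id")` would return the instance ID as a string. No SDK, no credentials, no schema: the whole interface is a URL.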
Cost
What are you willing to pay for?
• CapEx versus OpEx
• The Cloud is not “Cheaper”
• Do you have money, time, or experience?
Power
Nobody ever imagined a band of
Orcs would steal a database table
Charles Stross - Halting State
MTTF & MTTR
Understanding how, when and why things fail is great ... but
If your Mean Time to Recover exceeds the time value of your data, your business is DEAD
http://www.flickr.com/photos/dierken/948171048/sizes/z/
Testing
• Test with a production-like dataset and performance
• Don’t do “Design by Laptop”
• A/B Testing
• API versioning
Pull the Plug
• Create test environment
• Pull the plug
• Document
• Pull the plug again!
http://www.flickr.com/photos/rosipaw/5033284534/sizes/m/in/photostream/
Theo vs Morpheus
You are not Theo. You’re probably not Morpheus either.
Free your mind...
• Vertical vs Horizontal Scale
• Availability
• Reliability
• 99% vs 99.x% per unit?
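The “99% vs 99.x% per unit?” question comes down to arithmetic. Assuming units fail independently (a strong assumption, since real failures are often correlated), n redundant units that are each available with probability a leave at least one unit up with probability 1 - (1 - a)^n, while a chain of n units that must all be up is available with probability a^n. The function names below are illustrative.

```python
def parallel_availability(unit_availability, units):
    """Probability that at least one of `units` independent replicas is up."""
    return 1.0 - (1.0 - unit_availability) ** units


def serial_availability(unit_availability, units):
    """Probability that all of `units` independent dependencies are up."""
    return unit_availability ** units


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "replicas at 99% each:", parallel_availability(0.99, n))
    print("5 chained services at 99.9% each:", serial_availability(0.999, 5))
```

Under these assumptions two 99% units side by side reach roughly 99.99%, and three reach roughly 99.9999%, which is the case for many cheap units over one expensive one. The same arithmetic cuts the other way for dependencies: five chained 99.9% services are only about 99.5% available together.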
Availability
• For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response.
• “Read globally, write locally” with inconsistent cache
• Service Level Agreements, even (especially?) internally
Think Globally, Act Locally
• Global but inconsistent aggregate view
• Local action where data is authoritative
• Autonomy
• “Rightsizing” your failure domain
http://www.flickr.com/photos/28634332@N05/3872137437/sizes/m/in/photostream/
Distributed Systems Design
• Avoid execution caching
• “Don’t lie, don’t retry”
• Embrace failure
• Don’t block the client
• Avoid internal policy
• Ensure the system makes forward progress
Embrace Failure
• It’s OK to apologize
• It’s better to completely fail for some users than penalize all of them
• The Web is all about “Hit Refresh”
Apologize ...to Pat Helland
Admission Control
• Distributed Throttling
• Staged / Pipeline with back pressure
• Measure scalability at each stage
• Degraded performance
• Make progress for admitted requests
• At odds with “stateless” / session-less
http://www.flickr.com/photos/jayneandd/4450623309/sizes/m/in/photostream/
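A staged pipeline with back pressure can be sketched as a chain of bounded queues: each stage admits work only while it has room and rejects immediately otherwise, so requests already admitted keep making progress. The `Stage` class below is a hypothetical single-process illustration, not code from the talk; distributed throttling would need shared state across front ends, which is not shown.

```python
import queue


class Stage:
    """One pipeline stage with a bounded queue (illustrative only)."""

    def __init__(self, name, capacity):
        self.name = name
        self.pending = queue.Queue(maxsize=capacity)
        self.admitted = 0   # per-stage counters, so each stage can be measured
        self.rejected = 0

    def admit(self, request):
        """Admit the request if there is room; otherwise reject at once."""
        try:
            self.pending.put_nowait(request)
        except queue.Full:
            self.rejected += 1  # back pressure: tell the caller "not now"
            return False
        self.admitted += 1
        return True

    def process_one(self, handler):
        """Finish one admitted request, freeing capacity for new work."""
        try:
            request = self.pending.get_nowait()
        except queue.Empty:
            return None
        return handler(request)
```

A caller that gets `False` back can apologize to the user or pass the refusal upstream, rather than queueing without bound and degrading service for everyone.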
Make Forward Progress
• MVCC, vector clocks, & reconciliation
• Don’t resurrect objects
• always go forward, never go back
• “name” is a property of an object, not its unique key
• Break the link, garbage collect later
• Model “degraded service” performance
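As an illustration of the vector-clock bullet: each version of an object carries a map of per-node counters. One version supersedes another only if it is at least as new on every node; otherwise the two are concurrent and need reconciliation. This is a generic textbook sketch with illustrative function names, not Opscode's or Amazon's implementation.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum: the smallest clock that has seen both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Two writers that both start from `{"a": 1}` and update on different nodes produce concurrent clocks. After reconciliation the merged clock is stamped on the result, so the object only ever moves forward.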
Request Signing
• Stateless - no session tracking to lose or to purge later
• X.509 - only public information on front-end boxes. More secure against exploit
• Shared secret - faster, smaller signature but requires secret info close to request front-end
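The shared-secret variant can be sketched with an HMAC over a canonical form of the request. The canonical layout (method, path, timestamp, body hash) and the function names are assumptions made for this example, not the scheme used by any particular API. The X.509 variant would sign with a client private key and verify with the public key instead, and is not shown.

```python
import hashlib
import hmac


def canonical_string(method, path, timestamp, body):
    """Join the signed fields in a fixed order (this layout is an assumption)."""
    body_hash = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), path, timestamp, body_hash])


def sign_request(secret, method, path, timestamp, body=b""):
    """Return a hex HMAC-SHA256 signature over the canonical string."""
    message = canonical_string(method, path, timestamp, body).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret, method, path, timestamp, body, signature):
    """Recompute the signature and compare in constant time."""
    expected = sign_request(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Every request carries its own proof, so the front end keeps no session table. The cost named on the slide is visible in `verify_request`: the front end must hold `secret` to check the signature.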
Measure, Monitor, Respond
• Save *everything* *forever*
• Histograms / Pareto Chart
• tp99.9, tp99, and tp90
• ignore tp50, “average”
• http://en.wikipedia.org/wiki/Control_chart
• http://www.newrelic.com/
• http://www.splunk.com/
• skewness, kurtosis
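For reference, tp99 is the latency that 99% of requests come in at or under. A minimal sketch using the nearest-rank definition follows. Other percentile definitions interpolate between samples and give slightly different numbers, and the function names are illustrative.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def latency_summary(samples):
    """The three tail percentiles named on the slide."""
    return {
        "tp90": percentile(samples, 90),
        "tp99": percentile(samples, 99),
        "tp99.9": percentile(samples, 99.9),
    }
```

The tail is where the unhappy customers are. A mean or median can look healthy while one request in a hundred takes seconds.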
Control Chart
• Day over Day
• Same Day, Year over Year
• Confidence Intervals
“Shewhart stressed that bringing a production process into a state of statistical control, where there is only common-cause variation, and keeping it in control, is necessary to predict future output and to manage a process economically.”
• http://en.wikipedia.org/wiki/Control_chart
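A control chart can be sketched in a few lines: take a baseline period believed to be in control, draw limits at the mean plus or minus three standard deviations, and flag new observations that fall outside. This simplified version uses the plain sample standard deviation, where textbook individuals charts usually estimate sigma from moving ranges. The function names are illustrative.

```python
import statistics


def control_limits(baseline, sigmas=3.0):
    """Lower limit, center line, and upper limit from an in-control baseline."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(baseline, observations, sigmas=3.0):
    """Return the observations that fall outside the control limits."""
    lower, _, upper = control_limits(baseline, sigmas)
    return [x for x in observations if x < lower or x > upper]
```

For day-over-day comparison the baseline could be the same hour on previous days. Points inside the limits are common-cause noise, and points outside are worth a look.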
Characteristic Curves
Periodicity
SLA, Variance, Troubleshooting
Data Taxonomy
• Precious
• Cacheable
• Expensive
• Cheap
Consistency
• Authoritative vs. Consultative
• is_authorized? vs list group
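One reading of this slide, offered as an interpretation rather than the speaker's stated design: an authorization check must be answered from the authoritative source, while a group listing can be served from a possibly stale cache. The `GroupService` class and its in-memory store are hypothetical.

```python
class GroupService:
    """Toy service with an authoritative store and a possibly stale cache."""

    def __init__(self, store):
        self.store = store  # authoritative: group name -> set of members
        self.cache = {}     # consultative: may lag behind the store

    def is_authorized(self, user, group):
        """Authoritative: always answer from the source of truth."""
        return user in self.store.get(group, set())

    def list_group(self, group):
        """Consultative: a cached, possibly stale answer is acceptable."""
        if group not in self.cache:
            self.cache[group] = sorted(self.store.get(group, set()))
        return self.cache[group]
```

After a member is removed from the store, `is_authorized` refuses them at once, while `list_group` may keep showing them until the cache is refreshed. The sketch treats the stale listing as acceptable and the stale authorization as not.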
Performance
• Call length
• Cyclomatic Complexity
• Request ID flow
• Vertical vs Horizontal Scale
• tension between unit performance and scalability
Failure Domains
• EC2 “droplets”
• EC2 DNS
• Coordinator zones
Still with me?
Successes
• Sharable “AMI”s
• Metadata (Simple and open again)
• Open API (think Eucalyptus)
• No API throttling
• Primitives
• Pay-as-you-go
• Free traffic between S3 and EC2
• Data and Compute together
Failures
• SOAP makes little girls cry
• Amazon Web Services, circa 2006, was > 75% REST or Query
• SOAP well supported by commercial vendors, with their libraries
• Still *Way* too hard to use.
• Commodity business. Driving the bottom out of cost causes quality to suffer.
• API vs UI? User Experience in general
• IaaS (Infrastructure as a Service) is insufficient by itself
a hangman's noose. EC2, and the other offerings,
Where are we going?

Design for Scale / Surge 2010