2. In the ideal world . . . we want to be here
[figure: cost vs. work]
3. But in the “real” world . . . we usually find ourselves here
[figure: cost vs. work]
4. Big “jumps” are possible in a relatively short timeframe!
[figures: requests per second (~ 2009 - 2012); joules (~ 2013 - ???)]
RPS/dollar: 4.1x
RPS/joule: 4.3x
RPS/rack: 10.4x
13. We end up with an ever-increasing amount
of our cheap DRAM being used to hide the
terrible latency of our cheap storage.
14. This growing split between the bandwidth and latency of
our storage systems only becomes apparent at large scale.
15 - 17. CPU DRAM LAN Disk
Bandwidth 1.50 1.27 1.39 1.28
Latency 1.17 1.07 1.12 1.09
Annual Bandwidth and Latency Improvements (Patterson, 2004)
* Extracted from leading commodity components over the last 25 years and what is reported is the multiplicative performance increase per year.
➔ CPU is the fastest to change and DRAM is the slowest.
➔ Latency is driven by physical limits whereas bandwidth can be
addressed through parallelism.
➔ Bountiful bandwidth with lagging latency!
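A rough sketch of what those annual rates compound to over a decade, using only the table's figures (the absolute component baselines are not given here, so only the relative gaps are meaningful):

```python
# Compound the table's annual improvement rates over a decade to see
# how far bandwidth pulls ahead of latency for each component.
rates = {
    "CPU":  {"bandwidth": 1.50, "latency": 1.17},
    "DRAM": {"bandwidth": 1.27, "latency": 1.07},
    "LAN":  {"bandwidth": 1.39, "latency": 1.12},
    "Disk": {"bandwidth": 1.28, "latency": 1.09},
}

for name, r in rates.items():
    bw_10yr = r["bandwidth"] ** 10   # bandwidth gain after 10 years
    lat_10yr = r["latency"] ** 10    # latency gain after 10 years
    print(f"{name:>4}: bandwidth {bw_10yr:6.1f}x, "
          f"latency {lat_10yr:4.1f}x, gap {bw_10yr / lat_10yr:5.1f}x")
```

Even for disk, the slowest-moving component, bandwidth outruns latency by roughly 5x per decade, which is exactly the "bountiful bandwidth with lagging latency" point.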
18 - 20. CPU DRAM LAN Disk
Bandwidth 1.50 1.27 1.39 1.28
Capacity -- 1.52 -- 1.48
Annual Bandwidth and Capacity Improvements (Patterson, 2004)
* Extracted from leading commodity components over the last 25 years and what is reported is the multiplicative performance increase per year.
➔ Widening gap between bandwidth and capacity.
➔ Time to read a complete disk with random IO is increasing 22x /
decade, or 36% / year.
➔ Now our applications cannot afford to have a cache miss!
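The 22x-per-decade figure follows from the table itself: disk capacity grows ~1.48x per year while disk latency (and hence random IOPS) improves only ~1.09x per year, so the time to read a full disk randomly grows at the ratio of the two. A quick check:

```python
capacity_growth = 1.48   # disk capacity improvement per year (from the table)
latency_gain = 1.09      # disk latency improvement per year (from the table)

# Time to read the whole disk with random IO ~ capacity / random IOPS,
# so it grows at the ratio of the two annual rates.
yearly_growth = capacity_growth / latency_gain
decade_growth = yearly_growth ** 10

print(f"~{(yearly_growth - 1) * 100:.0f}% / year")   # ~36% / year
print(f"~{decade_growth:.0f}x / decade")             # ~21x / decade
```

The decade figure lands slightly under the slide's 22x only because the 36%-per-year number is itself rounded up before compounding.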
29. Avoid the problem entirely by using more servers with
cheaper, lower-powered processors that more closely
match the capabilities of the memory subsystem.
30 - 36. ➔ Leverages the massive volume economics of the smart device (e.g., cell phone
and tablet) market.
➔ Most workloads are not pushing CPU limits but are IO (disk, network, or
memory) bound, so spending more on a faster CPU will not deliver results.
➔ Price/performance in the device market is far better than in current
generation server CPUs: with far less competition among server processors,
prices tend to be higher and price/performance relatively low.
➔ Server CPU = ~$300 - ~$1000
➔ ARM CPU = ~$15 / Intel Atom S1200 = ~$65
➔ ~25% the processing rate @ ~10% the cost!
➔ Volume of the device ecosystem fuels innovation, so the performance gap
shrinks each generation!
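Taking the slide's rough figures at face value (~25% of the throughput at ~10% of the cost), the price/performance advantage works out to about 2.5x:

```python
# Relative throughput and cost of a device-class CPU versus a server CPU,
# using the slide's rough figures (both normalized to the server CPU = 1.0).
relative_rate = 0.25   # ~25% of the server CPU's processing rate
relative_cost = 0.10   # ~10% of the server CPU's price

advantage = relative_rate / relative_cost
print(f"throughput per dollar: {advantage:.1f}x the server CPU")  # 2.5x
```

The flip side is needing ~4x as many nodes for the same aggregate throughput, which is why the argument leans on the earlier point that most workloads are IO bound anyway.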
37 - 40. ➔ These machines also help with one of the biggest and most certainly the
fastest growing cost of any data center -- power!
➔ Your typical 8-core server uses ~200W idle and above 600W TDP (full tilt
boogie)!
➔ Bringing 30A @ 208V to each rack makes it a 6.2 kW rack (and I know of folks
provisioning 12 - 14 kW racks just to fill them up 50%!)
➔ If you can save a lot on op-ex by spending a little more on cap-ex it’s a great
bargain! (ask your CFO!)
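The 6.2 kW figure is just power-law arithmetic, and turning it into op-ex only needs an electricity price; the $0.10/kWh below is an assumed figure for illustration, not one from the talk:

```python
amps, volts = 30, 208
rack_watts = amps * volts                 # 30A @ 208V = 6240 W
print(f"rack power: {rack_watts / 1000:.1f} kW")   # 6.2 kW

# Annual energy cost for one fully loaded rack.
# $0.10/kWh is an assumed illustrative rate.
price_per_kwh = 0.10
hours_per_year = 24 * 365
annual_cost = rack_watts / 1000 * hours_per_year * price_per_kwh
print(f"annual energy: ~${annual_cost:,.0f} per rack")
```

Real bills run higher still once cooling overhead (PUE) is included, which is the op-ex the lower-powered processors attack.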
41 - 42. ➔ People costs dominate the enterprise players’ data centers, but it is very easy
and cheap to not let them dominate your costs.
➔ The barrier to entry into automation tools (Puppet, Chef, etc.) has never been
lower, and their penetration into existing systems (networking devices, etc.)
has never been higher.
43. Lesson #2
Understand that distributed systems are
fundamentally about dealing with
distance and having more than one thing.
44. Currently, writing distributed applications is usually
nothing like writing non-distributed applications.
45. Despite the non-zero probability of failure within
nearly every aspect of modern computers,
developers of non-distributed applications do not
routinely maintain a concept of failing hardware.
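The point is easy to quantify: a per-machine failure rate a desktop developer can safely ignore becomes a near-certainty across a fleet. The 2% annual rate below is an assumed, illustrative figure:

```python
# Probability that at least one machine fails during the year,
# assuming independent failures at an illustrative 2% annual rate.
p_machine = 0.02

for n in (1, 100, 1000, 10_000):
    p_any = 1 - (1 - p_machine) ** n
    print(f"{n:>6} machines: P(at least one failure) = {p_any:.4f}")
```

At one machine a failure is a rare event; at a few hundred it is the common case, which is why distributed software has to treat failing hardware as routine.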
49. The difference between an entire data center and a single
computer should be only quantitative, not qualitative.
50. Since software development is an entirely
quantitative pursuit, we should be able to conceal the
entire complexity of the Internet within software.
51 - 55. A clear trajectory in the same direction …
➔ Erlang OTP (Ericsson) and GoCircuit (Tumblr).
➔ General-purpose distributed file systems (and protocols)
spanning multiple globally distributed data centers.
➔ Datacenter-scale job schedulers also abound (Google’s
Borg/Omega, Apache Mesos, Airbnb’s Chronos, etc.)
➔ nanomsg scalability protocols (M. Sustrik).
➔ Not only possible but the clear “silent” choice of the
majority!
56 - 58. So how to play “big” when you’re “small”?
➔ You need to understand your technical substrate both
broadly and deeply so you know where to focus all your
resources most effectively.
➔ That understanding will allow you to operate at
economies of scale that free up your most important
resource -- people.
➔ But remember: where we focus our resources is not
necessarily where your resources should be focused, nor
is it where anyone else’s should be.
59 - 60. So how to play “big” when you’re “small”?
➔ Look for areas where a qualitative difference could easily
become merely a quantitative difference.
➔ Quantitative problems are easy to solve through
technology; qualitative problems, however, are largely
intractable through technology alone.