ONE MAN OPS
      Reliability & Scale in AWS while letting you sleep through the night
                                                         Jos Boumans - @jiboumans
http://www.fwallpaper.net/picture_pics-Sleepy-cat.html
Tuesday 26 March 13
RIPE NCC
                      Engineering manager for RIPE Database
                                                              http://www.ripe.net/db
CANONICAL
                    Engineering manager for Ubuntu Server 10.04 & 10.10

http://lukeroberts.deviantart.com/art/Destroy-Ubuntu-93235775          http://www.ubuntu.com/business/server/overview
KRUX
                      VP of Operations & Infrastructure

                                                          http://www.krux.com/
GOOD GUYS OF DATA PRIVACY
SOME OF OUR CUSTOMERS
LOTS OF TRAFFIC
http://www.americapictures.net/buenos-aires-traffic-city-night-argentina.html
AVERAGE REQUESTS* / SEC
[Bar chart; axis from 0 to 10,000 requests per second]
*Twitter: New tweets
 Wikipedia: Articles read
 Krux: New data points
https://twitter.com/tps_watcher
http://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm
MONTHLY UNIQUE USERS
[Bar chart; axis from 0 to 600,000,000 users]
http://techcrunch.com/2012/12/18/twitter-passes-200m-monthly-active-users-a-42-increase-over-9-months/
http://technorati.com/technology/article/wikipedias-nonprofit-parent-raises-20-million/
WE CHOSE 'THE CLOUD'
http://previewnetworks.com/blog/
THERE ARE DOWNSIDES
http://modernsavage.hubpages.com/hub/10-springfield-shopper-headlines
FOCUS ON AWS
                                     http://aws.amazon.com/
APRIL 21, 2011
                                                                                                                    http://aws.amazon.com/message/65648/
http://businessnerds.wordpress.com/2011/05/28/so-far-so-good…-the-review/   http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html
... SOME OUTAGES ...
                 ... SKIPPED FOR BREVITY ...
JUNE 14, 2012
http://www.laczik.org/BMW/repair/E38_wiring_harness/E38_wiring_harness.html   http://blog.pagerduty.com/2012/06/outage-post-mortem-june-14/
JUNE 29, 2012
http://www.fanpop.com/spots/thunderstorm/images/25416163/title/thunderstorms-wallpaper   http://aws.amazon.com/message/67457/
AWS OUTAGE = YOUR OUTAGE
http://it.mario.wikia.com/wiki/Lakitu
THE RULES HAVE CHANGED
                                                        You're not in Kansas anymore

http://entreatmenot.blogspot.com/2011/04/shattered-dreams.html
NETWORK WILL PARTITION
                                                              And it will happen often

http://thevinylvillain.blogspot.com/2010_04_01_archive.html
DISK IO WILL FLUCTUATE
                                                     On a good day, it's mediocre

http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm
IP ADDRESSES WILL CHANGE
                       IP lease is 8 hours
                      DNS TTL is 60 seconds
www.fantom-xp.com
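One way to live with changing IP addresses is to resolve the hostname on every connection attempt instead of holding on to an address for the life of the process. A minimal Python sketch, assuming hypothetical helpers `resolve_fresh` and `connect_fresh` and an injectable resolver (none of these names come from the talk):

```python
import socket

def resolve_fresh(host, port, resolver=socket.getaddrinfo):
    """Resolve host at call time and return a list of (ip, port) pairs.

    Nothing is cached here, so a changed DNS record is picked up on the
    next call (subject to whatever caching the system resolver does).
    """
    infos = resolver(host, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [info[4][:2] for info in infos]

def connect_fresh(host, port, timeout=2.0, resolver=socket.getaddrinfo):
    """Try each freshly resolved address until one accepts the connection."""
    last_error = None
    for ip, resolved_port in resolve_fresh(host, port, resolver):
        try:
            return socket.create_connection((ip, resolved_port), timeout=timeout)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next address
    raise ConnectionError(f"no address for {host} accepted a connection") from last_error
```

Calling `connect_fresh` again after a failure re-resolves the name, so a replaced instance behind the same hostname is found without restarting the client.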
INSTANCES WILL DIE
                                  And it will always be your Database Master

http://room57.deviantart.com/art/Hangman-188353196
HUMANS MAKE MISTAKES
                      Including your humans

EMBRACE FAILURE
                                Hardware will fail. Humans will make errors.
                                   Nature will produce thunderstorms.
http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm
OR, COLLOQUIALLY

ADJUST YOUR STRATEGY
                                                      Don't bring a knife to a gun fight

http://www.flickr.com/photos/statlerhotel/6628770499/sizes/l/in/photostream/
DATA STORES
                                                     Some work better than others

http://gustavhoiland.com/2010/03/10/stacked-boxes/
RDBMS
         CouchDB
                                                                  BigTable Based
       Dynamo Based
                                                                Master / Slave based




                              CAP THEOREM
                      Your choice: sacrifice availability or consistency.
                                      Orange is a lie.
MYSQL / ORACLE VS RDS
                      See: Network partitioning & instances dying

AMAZON REDSHIFT
                                      Great for analytics/reports, bad for OLTP
                                           Unburden your RDS instances
http://www.flitemedia.com/music.php                                               http://aws.amazon.com/redshift
BIGTABLE BASED STORES
                                 HBase, Accumulo, Hypertable
                      Still suffer when network partitioning happens
                                                                       http://www.cloudera.com/cdh4/

DYNAMO BASED STORES
                                                         Cassandra, Riak, DynamoDB

http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html   http://aws.amazon.com/dynamodb/faqs/
GO HOSTED?
                                 CouchDB, MongoDB, Riak, Cassandra, HBase
                                          Your Latency May Vary
http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html
CLIENT SIDE STORAGE
                                          Keep a copy of your users' data locally

http://www.wired.com/gadgetlab/2012/03/badass-gadget-ammo-lunch-box/       http://www.w3.org/2001/tag/2010/09/ClientSideStorage.html
FILE STORES
                                                                EBS vs Instance Store ...
                                                                     ... vs RamFS
http://homedezine.blogspot.com/2011/04/day-my-cat-removed-carpet-photo-studio.html
SIMPLE STORAGE SERVICE
                                                        S3: Arguably AWS' best feature

http://www.iwallpaper.us/gold-star-fo-christmas-wallpaper-140/
TRAFFIC SHAPING
                                                Control every part of the request

http://www.visualphotos.com/image/2x4154765/man_standing_with_traffic_cones_in_shape_of_u-turn
STAY LOCAL IF YOU CAN
                 Going off box exposes you to risks you need to mitigate

http://southshorewoman.com/issue/june-2010/article/local-character
CACHE WHAT YOU CAN
                                  HTTP Responses, DB Queries, User content
                                         Browsers have caches too!
http://theoatmeal.com/blog/charity_money
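For DB queries and similar expensive lookups, a small time-to-live cache is often enough to keep most requests on the box. A minimal Python sketch, assuming a hypothetical `TTLCache` class (not from the talk) and an injectable clock so that expiry can be tested:

```python
import time

class TTLCache:
    """Tiny in-process cache whose entries expire after ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]            # fresh entry: skip the expensive call
        value = compute()            # miss or expired: go to the backend
        self._store[key] = (now + self.ttl, value)
        return value
```

The same idea applies to HTTP responses, where `Cache-Control` headers let browsers and CDNs do the caching for you.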
USE ELASTIC LOAD BALANCERS
                                                They will save you more than once

http://wallpapers5.com/wallpaper/Balance-Green-Tree-Frog/
USE GLOBAL LOAD BALANCING
                      Fail over to the closest data center on region failure

SHOUT OUT: DYN
                      DNS for Bit.ly, Quora, Twitter, Wikia, etc

USE A CDN
                                        Critical items should always be available

http://kadanthuponanimidangal.blogspot.com/2010/12/blog-post_6992.html
MEASURE EVERYTHING
                Find outliers, deviants & trends before they cause trouble

http://www.themoviedb.org/movie/629-the-usual-suspects
GRAPHITE, STATSD & COLLECTD
                       Use StatsD & collectd for application/system metrics
                           Use Graphite to store, aggregate & visualize
                                                                                                                    http://hostedgraphite.com/
http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html   http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/
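StatsD accepts plain-text metrics over UDP, so emitting one from application code takes only a few lines. A minimal Python sketch, assuming hypothetical helper names `format_metric` and `send_metric` and a StatsD daemon on the default port 8125:

```python
import socket

def format_metric(name, value, metric_type):
    """Build one StatsD line, e.g. 'api.latency:12|ms' or 'api.hits:1|c'."""
    return f"{name}:{value}|{metric_type}"

def send_metric(name, value, metric_type, host="localhost", port=8125):
    """Fire-and-forget a metric over UDP; a missing daemon does not block the app."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

For example, `send_metric("checkout.render", 42, "ms")` records a 42 millisecond timing and `send_metric("checkout.errors", 1, "c")` increments a counter. The metric names here are made up for illustration.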
GRAPH EVENTS
         Deployments, outages, CDN reconfigurations, failed builds, etc.
          Anything that's important to the health of your ecosystem
http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
COMPARE WEEK TO WEEK
                          Overlay week to week graphs using timeShift()
                         Quickly identifies trends and deviations from trends
http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-10
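A week-over-week overlay is two targets on one graph: the series itself and the same series passed through `timeShift()`. A minimal Python sketch that builds such a Graphite render URL, assuming a hypothetical helper `week_over_week_url`, a made-up metric name, and a Graphite server at the given base URL:

```python
from urllib.parse import urlencode

def week_over_week_url(base_url, metric, days=7, window="-7d"):
    """Return a Graphite render URL overlaying a metric with its shifted self."""
    params = [
        ("target", metric),                              # current period
        ("target", f'timeShift({metric}, "{days}d")'),   # same series, days earlier
        ("from", window),
        ("format", "png"),
    ]
    return f"{base_url.rstrip('/')}/render?{urlencode(params)}"

url = week_over_week_url("http://graphite.example.com", "stats.timers.http.beacon.mean")
```

Because both lines share one set of axes, a deviation from last week's shape stands out without any threshold being configured.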
FORECASTING
                                 Use Holt-Winters confidence bands
                        Verify that your metrics are within normal tolerance
https://github.com/ripienaar/graphite-graph-dsl/wiki/Creating-Holt-Winters-Forecasts
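Graphite ships Holt-Winters functions that can be layered onto any series as extra targets. A minimal Python sketch that builds those target expressions, assuming a hypothetical helper `holt_winters_targets` and a made-up metric name; `delta` is the width of the band in deviations:

```python
def holt_winters_targets(metric, delta=3):
    """Return Graphite targets: the raw series, its bands, and its aberration."""
    return [
        metric,
        # Upper and lower bounds the series is expected to stay within.
        f"holtWintersConfidenceBands({metric}, {delta})",
        # Non-zero only where the series leaves the bands.
        f"holtWintersAberration({metric}, {delta})",
    ]

targets = holt_winters_targets("stats.requests.rate")
```

Each string can be passed as a `target` parameter to the render API. The aberration series is a convenient input for alerting, since it stays at zero while the metric is within tolerance.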
FIND INDIVIDUAL OUTLIERS
                                                      Absolute numbers mean very little
                                                       Use mean & standard deviation
http://en.wikipedia.org/wiki/File:Black_sheep-1.jpg
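Comparing each host against the mean and standard deviation of its peers can be sketched in a few lines of Python. The helper name `find_outliers`, the host names, and the threshold of two standard deviations are assumptions for illustration:

```python
from statistics import mean, pstdev

def find_outliers(samples, threshold=2.0):
    """Return {host: z_score} for hosts more than threshold std devs from the mean."""
    values = list(samples.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {}  # every host reports the same value: no outliers
    return {
        host: (value - mu) / sigma
        for host, value in samples.items()
        if abs(value - mu) / sigma > threshold
    }

# Nine hosts at 100 ms and one at 200 ms: only the slow host is flagged.
latencies = {f"web{i}": 100 for i in range(1, 10)}
latencies["web10"] = 200
outliers = find_outliers(latencies)
```

The absolute value of 200 ms says little on its own; what matters is that it sits three standard deviations away from the rest of the fleet.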
ALERT ON TRENDS
                                Once you go over a threshold, it's too late
                              Alert on unwanted trends and preemptively fix
http://sub-second.blogspot.com/2012/06/reporting-response-times-percentile.html   http://aphyr.github.com/riemann/
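Alerting on a trend means estimating when a threshold will be crossed rather than waiting for it to happen. A minimal Python sketch using a least-squares fit, assuming hypothetical helpers `slope` and `time_until_threshold` and at least two samples with distinct timestamps:

```python
def slope(points):
    """Least-squares slope of (t, value) points, in value units per time unit."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return num / den

def time_until_threshold(points, threshold):
    """Estimate time until the trend crosses threshold; None if it is not rising."""
    rate = slope(points)
    _, latest = points[-1]
    if latest >= threshold:
        return 0.0
    if rate <= 0:
        return None  # flat or improving: nothing to alert on
    return (threshold - latest) / rate

# Disk usage in percent, sampled hourly (illustrative numbers).
disk = [(0, 50.0), (1, 52.0), (2, 54.0), (3, 56.0)]
hours_left = time_until_threshold(disk, threshold=90.0)
```

An alert that fires when `hours_left` drops below, say, a working day leaves time to fix the problem before it becomes an outage.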
MEASURE WITHOUT RETROFIT
                                          # Note: %D logs microseconds; on Apache 2.4.13+, %{ms}T logs milliseconds
                                          LogFormat "http.beacon:%D|ms" stats
                                          CustomLog "|nc -u localhost 8125" stats
                                                                               http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/
http://absinthemindedhero.blogspot.com/2012/03/victory-nonetheless.html   http://jiboumans.wordpress.com/2013/02/27/realtime-stats-from-varnish/
SHOUT OUT: NEW RELIC
             Java, but also Python, Ruby, .NET, PHP & Node.js support
             In depth profiling of your app for performance & errors.
CONFIGURATION MANAGEMENT
                                                             Unique snowflakes are bad

http://www.torange.us/Plants/Conifers/spruce-needles-in-hoarfrost-424.html
PUPPET VS CHEF
                            Yes.

                                               http://puppetlabs.com/
                                       http://www.opscode.com/chef
INFRASTRUCTURE AS CODE
                                            Use different environments
                                            Measure and report on it
http://americansingercanary.com/green.htm
SHOUT OUT: UBUNTU
                                      Ubuntu + cloud-init + boto = awesome*
                                                                         *I am biased

http://www.123rf.com/photo_4871141_food-pyramid-isolated-on-white.html                  https://github.com/krux/ops-tools
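cloud-init reads a `#cloud-config` document from the instance's user data at first boot. A minimal Python sketch that generates one, assuming a hypothetical helper `build_user_data` and illustrative package and command names; it relies on JSON being accepted as YAML for simple structures like this one, and it is not taken from the krux/ops-tools repository:

```python
import json

def build_user_data(hostname, packages, commands):
    """Return a #cloud-config document for cloud-init to apply at first boot."""
    config = {
        "hostname": hostname,
        "packages": list(packages),   # installed by the distro package manager
        "runcmd": list(commands),     # shell commands run once, late in boot
    }
    # Simple JSON is also valid YAML, so no YAML library is needed here.
    return "#cloud-config\n" + json.dumps(config, indent=2, sort_keys=True) + "\n"

user_data = build_user_data(
    hostname="web-001",
    packages=["puppet"],
    commands=["puppet agent --test"],
)
```

The resulting string would then be passed as the `user_data` argument when launching the instance, for example through boto's `run_instances` call.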

AWS OPSWORKS
                                  Hosted Chef, No extra charge, Ubuntu 12.04 or Amazon Linux
                                                 Still rough around the edges.

http://thebrandbuilder.files.wordpress.com/2011/08/gordon-01.jpg                               http://aws.amazon.com/opsworks/

DEV = PRODUCTION
                          "I dunno, it worked on my laptop"
                                 Instead, use Vagrant
http://vagrantup.com/
ROLL YOUR OWN AMIS
                                                Instantly boot up new deployments
                                                     Reduce Time to Respond
http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html   http://puppetlabs.com/blog/rapid-scaling-with-auto-generated-amis-using-puppet/
CONFIDENT DEPLOYS
                                                   That human error could be yours

http://www.etsy.com/listing/37178125/stormtrooper-regrets-those-were-the
CONTINUOUS INTEGRATION
                         Ours: GitHub + Jenkins + FPM + apt::s3
                      From commit to deployable in one command                         http://github.com/
                                                                                    http://jenkins-ci.org/
                                                                      https://github.com/thekad/apt-s3
                                                             https://github.com/jordansissel/fpm/wiki/
ONE CLICK DEPLOYMENTS
                                        Deployments should not be exciting.
                                      Don't create a checklist; automate & track
                                                                                             https://checkmarkable.com
http://www.thegreenhead.com/2012/07/one-click-butter-cutter.php               https://github.com/jib/aws-analysis-tools/
DARK LAUNCHES
               Exercise the code without impacting the user experience
                                                                          http://www.kissmetrics.com/
http://www.layoutsparks.com/pictures/moon-23                   https://github.com/yahoo/boomerang/
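One way to exercise new code without users noticing is to run it alongside the current code path, serve only the current result, and log any difference or failure. A minimal Python sketch, assuming a hypothetical wrapper `dark_launch` (not from the talk):

```python
import logging

log = logging.getLogger("dark_launch")

def dark_launch(current, candidate, *args, **kwargs):
    """Serve the result of current; exercise candidate without exposing it."""
    result = current(*args, **kwargs)
    try:
        shadow = candidate(*args, **kwargs)
        if shadow != result:
            log.warning("dark launch mismatch: %r != %r", shadow, result)
    except Exception:
        # A failure in the new code must never reach the user.
        log.exception("dark launch candidate failed")
    return result
```

In this form the candidate runs inline and adds its latency to the request. If that matters, the candidate call can be moved to a background worker.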
SHADOW TRAFFIC
                                                    Test new code against live traffic

http://doppelthingers.tumblr.com/post/12839979386/traffic-light-shadow-hangman-and-possibly-his   https://gist.github.com/3125323
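Shadowing can be sketched as a thin layer that answers every request from the primary backend and copies a sample of requests to the new code in the background. The `ShadowMirror` class below is a hypothetical Python illustration, not the approach from the linked gist:

```python
import random
from concurrent.futures import ThreadPoolExecutor

class ShadowMirror:
    """Send every request to primary; copy a sample to shadow in the background."""

    def __init__(self, primary, shadow, sample_rate=0.1, rng=random.random):
        self.primary = primary
        self.shadow = shadow
        self.sample_rate = sample_rate
        self.rng = rng
        self.pool = ThreadPoolExecutor(max_workers=4)

    def _safe_shadow(self, request):
        try:
            self.shadow(request)   # response is deliberately discarded
        except Exception:
            pass                   # shadow failures never affect live traffic

    def handle(self, request):
        """Return the primary's response; optionally mirror the request."""
        if self.rng() < self.sample_rate:
            self.pool.submit(self._safe_shadow, request)
        return self.primary(request)
```

Starting with a low `sample_rate` and raising it gradually lets the new code see production-shaped load while the primary keeps serving users.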
SLEEP TIGHT
                                           Slides at: www.Slideshare.net/jiboumans
                                                 We're hiring: www.krux.com
http://raafay-awan.blogspot.com/2011/08/cats-cutest-of-creatures.html
Devoxx UK: Reliability & Scale in AWS while letting you sleep through the night

  • 1. ONE MAN OPS Reliability & Scale in AWS while letting you sleep through the night Jos Boumans - @jiboumans http://www.fwallpaper.net/picture_pics-Sleepy-cat.html Tuesday 26 March 13
  • 2. RIPE NCC Engineering manager for RIPE Database http://www.ripe.net/db Tuesday 26 March 13
  • 3. CANONICAL Engineering manager for Ubuntu Server 10.04 & 10.10 http://lukeroberts.deviantart.com/art/Destroy-Ubuntu-93235775 http://www.ubuntu.com/business/server/overview Tuesday 26 March 13
  • 4. KRUX VP of Operations & Infrastructure http://www.krux.com/ Tuesday 26 March 13
  • 5. GOOD GUYS OF DATA PRIVACY Tuesday 26 March 13
  • 6. SOME OF OUR CUSTOMERS Tuesday 26 March 13
  • 8. 0 2,500 5,000 7,500 10,000 AVERAGE REQUESTS* / SEC *Twitter: New tweets Wikipedia: Articles read https://twitter.com/tps_watcher Krux: New data points http://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm Tuesday 26 March 13
  • 9. 0 150,000,000 300,000,000 450,000,000 600,000,000 MONTHLY UNIQUE USERS http://techcrunch.com/2012/12/18/twitter-passes-200m-monthly-active-users-a-42-increase-over-9-months/ http://technorati.com/technology/article/wikipedias-nonprofit-parent-raises-20-million/ Tuesday 26 March 13
  • 10. WE CHOSE 'THE CLOUD' http://previewnetworks.com/blog/
  • 12. FOCUS ON AWS http://aws.amazon.com/
  • 13. APRIL 21, 2011 http://aws.amazon.com/message/65648/ http://businessnerds.wordpress.com/2011/05/28/so-far-so-good…-the-review/ http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html
  • 14. ... SOME OUTAGES ... ... SKIPPED FOR BREVITY ...
  • 15. JUNE 14, 2012 http://www.laczik.org/BMW/repair/E38_wiring_harness/E38_wiring_harness.html http://blog.pagerduty.com/2012/06/outage-post-mortem-june-14/
  • 17. AWS OUTAGE = YOUR OUTAGE http://it.mario.wikia.com/wiki/Lakitu
  • 18. THE RULES HAVE CHANGED You're not in Kansas anymore http://entreatmenot.blogspot.com/2011/04/shattered-dreams.html
  • 19. NETWORK WILL PARTITION And it will happen often http://thevinylvillain.blogspot.com/2010_04_01_archive.html
  • 20. DISK IO WILL FLUCTUATE On a good day, it's mediocre http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm
  • 21. IP ADDRESSES WILL CHANGE IP lease is 8 hours DNS TTL is 60 seconds www.fantom-xp.com
  • 22. INSTANCES WILL DIE And it will always be your Database Master http://room57.deviantart.com/art/Hangman-188353196
  • 23. HUMANS MAKE MISTAKES Including your humans
  • 24. EMBRACE FAILURE Hardware will fail. Humans will make errors. Nature will produce thunderstorms. http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm
  • 26. ADJUST YOUR STRATEGY Don't bring a knife to a gun fight http://www.flickr.com/photos/statlerhotel/6628770499/sizes/l/in/photostream/
  • 27. DATA STORES Some work better than others http://gustavhoiland.com/2010/03/10/stacked-boxes/
  • 28. CAP THEOREM Your choice: sacrifice availability or consistency. Orange is a lie. (Diagram labels: RDBMS, CouchDB, BigTable based, Dynamo based, Master / Slave based)
  • 29. MYSQL / ORACLE VS RDS See: Network partitioning & instances dying
  • 30. AMAZON REDSHIFT Great for analytics/reports, bad for OLTP Unburden your RDS instances http://www.flitemedia.com/music.php http://aws.amazon.com/redshift
  • 31. BIGTABLE BASED STORES HBase, Accumulo, Hypertable Still suffer when network partitioning happens http://www.cloudera.com/cdh4/
  • 32. DYNAMO BASED STORES Cassandra, Riak, DynamoDB http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html http://aws.amazon.com/dynamodb/faqs/
  • 33. GO HOSTED? CouchDB, MongoDB, Riak, Cassandra, HBase Your Latency May Vary http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html
  • 34. CLIENT SIDE STORAGE Keep a copy of your users' data locally http://www.wired.com/gadgetlab/2012/03/badass-gadget-ammo-lunch-box/ http://www.w3.org/2001/tag/2010/09/ClientSideStorage.html
  • 35. FILE STORES EBS vs Instance Store ... ... vs RamFS http://homedezine.blogspot.com/2011/04/day-my-cat-removed-carpet-photo-studio.html
  • 36. SIMPLE STORAGE SERVICE S3: Arguably AWS' best feature http://www.iwallpaper.us/gold-star-fo-christmas-wallpaper-140/
  • 37. TRAFFIC SHAPING Control every part of the request http://www.visualphotos.com/image/2x4154765/man_standing_with_traffic_cones_in_shape_of_u-turn
  • 38. STAY LOCAL IF YOU CAN Going off box exposes you to risks you need to mitigate http://southshorewoman.com/issue/june-2010/article/local-character
  • 39. CACHE WHAT YOU CAN HTTP Responses, DB Queries, User content Browsers have caches too! http://theoatmeal.com/blog/charity_money
  • 40. USE ELASTIC LOAD BALANCERS They will save you more than once http://wallpapers5.com/wallpaper/Balance-Green-Tree-Frog/
  • 41. USE GLOBAL LOAD BALANCING Fail over to the closest data center on region failure
  • 42. SHOUT OUT: DYN DNS for Bit.ly, Quora, Twitter, Wikia, etc
  • 43. USE A CDN Critical items should always be available http://kadanthuponanimidangal.blogspot.com/2010/12/blog-post_6992.html
  • 44. MEASURE EVERYTHING Find outliers, deviants & trends before they cause trouble http://www.themoviedb.org/movie/629-the-usual-suspects
  • 45. GRAPHITE, STATSD & COLLECTD Use StatsD & collectd for application/system metrics Use Graphite to store, aggregate & visualize http://hostedgraphite.com/ http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/
  • 46. GRAPH EVENTS Deployments, outages, CDN reconfigurations, failed builds, etc Anything that's important to the health of your ecosystem http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
  • 47. COMPARE WEEK TO WEEK Overlay week to week graphs using timeShift() Quickly identifies trends and deviations from trends http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-10
  • 48. FORECASTING Use Holt-Winters confidence bands Verify that your metrics are within normal tolerance https://github.com/ripienaar/graphite-graph-dsl/wiki/Creating-Holt-Winters-Forecasts
  • 49. FIND INDIVIDUAL OUTLIERS Absolute numbers mean very little Use mean & standard deviation http://en.wikipedia.org/wiki/File:Black_sheep-1.jpg
  • 50. ALERT ON TRENDS Once you go over a threshold, it's too late Alert on unwanted trends and preemptively fix http://sub-second.blogspot.com/2012/06/reporting-response-times-percentile.html http://aphyr.github.com/riemann/
  • 51. MEASURE WITHOUT RETROFIT LogFormat "http.beacon:%D|ms" stats CustomLog "|nc -u localhost 8125" stats http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/ http://absinthemindedhero.blogspot.com/2012/03/victory-nonetheless.html http://jiboumans.wordpress.com/2013/02/27/realtime-stats-from-varnish/
  • 52. SHOUT OUT: NEW RELIC Java, but also Python, Ruby, .NET, PHP & NodeJS support In-depth profiling of your app for performance & errors.
  • 53. CONFIGURATION MANAGEMENT Unique snowflakes are bad http://www.torange.us/Plants/Conifers/spruce-needles-in-hoarfrost-424.html
  • 54. PUPPET VS CHEF Yes. http://puppetlabs.com/ http://www.opscode.com/chef
  • 55. INFRASTRUCTURE AS CODE Use different environments Measure and report on it http://americansingercanary.com/green.htm
  • 56. SHOUT OUT: UBUNTU Ubuntu + cloud-init + boto = awesome* *I am biased http://www.123rf.com/photo_4871141_food-pyramid-isolated-on-white.html https://github.com/krux/ops-tools
  • 57. AWS OPSWORKS Hosted Chef, No extra charge, Ubuntu 12.04 or Amazon Linux Still rough around the edges. http://thebrandbuilder.files.wordpress.com/2011/08/gordon-01.jpg http://aws.amazon.com/opsworks/
  • 58. DEV = PRODUCTION "I dunno, it worked on my laptop" Instead, use Vagrant http://vagrantup.com/
  • 59. ROLL YOUR OWN AMIS Instantly boot up new deployments Reduce Time to Respond http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html http://puppetlabs.com/blog/rapid-scaling-with-auto-generated-amis-using-puppet/
  • 60. CONFIDENT DEPLOYS That human error could be yours http://www.etsy.com/listing/37178125/stormtrooper-regrets-those-were-the
  • 61. CONTINUOUS INTEGRATION Ours: Github + Jenkins + FPM + apt::s3 From commit to deployable in one command http://github.com/ http://jenkins-ci.org/ https://github.com/thekad/apt-s3 https://github.com/jordansissel/fpm/wiki/
  • 62. ONE CLICK DEPLOYMENTS Deployments should not be exciting. Don't create a checklist; automate & track https://checkmarkable.com http://www.thegreenhead.com/2012/07/one-click-butter-cutter.php https://github.com/jib/aws-analysis-tools/
  • 63. DARK LAUNCHES Exercise the code without impacting the user experience http://www.kissmetrics.com/ http://www.layoutsparks.com/pictures/moon-23 https://github.com/yahoo/boomerang/
  • 64. SHADOW TRAFFIC Test new code against live traffic http://doppelthingers.tumblr.com/post/12839979386/traffic-light-shadow-hangman-and-possibly-his https://gist.github.com/3125323
  • 65. SLEEP TIGHT Slides at: www.Slideshare.net/jiboumans We're hiring: www.krux.com http://raafay-awan.blogspot.com/2011/08/cats-cutest-of-creatures.html
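
Slide 49 recommends finding individual outliers with mean and standard deviation rather than absolute numbers. The following is a minimal Python sketch of that idea, not the speaker's own tooling (the talk relies on Graphite for this). The function name `find_outliers`, the host names, the latency values and the threshold of 2 standard deviations are all hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev


def find_outliers(samples, threshold=2.0):
    """Return (name, value) pairs lying more than `threshold` standard
    deviations away from the mean of all values in `samples`.

    `samples` maps a source name (for example a host) to one numeric reading.
    """
    values = list(samples.values())
    if len(values) < 2:
        return []  # not enough data to talk about deviation

    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # every reading is identical, so nothing stands out

    return [
        (name, value)
        for name, value in samples.items()
        if abs(value - mu) > threshold * sigma
    ]


# Hypothetical per-host response times in milliseconds.
latency_ms = {
    "web1": 100,
    "web2": 102,
    "web3": 98,
    "web4": 101,
    "web5": 99,
    "web6": 250,
}

if __name__ == "__main__":
    for host, value in find_outliers(latency_ms):
        print(f"{host} deviates from its peers: {value} ms")
```

With a small fleet a single extreme value also inflates the standard deviation, so a lower threshold or a median-based measure may work better in practice; that choice is left open here.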