SlideShare a Scribd company logo
1 of 17
Cloud Native and Epidemic
Failures
April 2014
Adrian Cockcroft
@adrianco @BatteryVentures
http://www.linkedin.com/in/adriancockcroft
Cloud Native?
Epidemic Failures
Automated Diversity
Cloud Native
Construct a highly agile and highly
available service from ephemeral and
often broken components
Inspiration
Numquam ponenda est pluralitas
sine necessitate
Plurality must never be posited
without necessity
Occam’s Razor
Monoculture
Replicate “the best” as patterns
Reduce interaction complexity
Epidemic single point of failure
Pattern Failures
Infrastructure Pattern Failures
Software Stack Pattern Failures
Application Pattern Failures
Infrastructure Pattern Failures
• Device failures – bad batch of disks, PSUs, etc.
• CPU failures – cache corruption, math errors
• Datacenter failures – power, network, disaster
• Routing failures – DNS, Internet/ISP path
Software Stack Pattern Failures
• Time bombs – Counter wrap, memory leak
• Date bombs - Leap year, leap second, epoch
• Expiration – Certs timing out
• Trust revocation – Certificate Authority fails
• Security exploit – e.g. heartbleed
• Language bugs – compile time
• Runtime bugs – JVM, Linux, Hypervisor
• Network bugs – routers, firewalls, protocols
Application Pattern Failures
• Time bombs – Counter wrap, memory leak
• Date bombs - Leap year, leap second, epoch
• Content bombs – Data dependent failure
• Configuration – wrong/bad syntax
• Versioning – incompatible mixes
• Cascading failures – error handling bugs etc.
• Cascading overload – excessive logging etc.
What to do?
Automated diversity management
Diversified automation
Efficient vs. Antifragile
Specific Ideas
• Automate running a mixture
– Diversity as default for any service stack
– No developer overhead, stay agile, low cost
• Support oldest and newest versions together
– Automate running 50/50 mix CentOS/Ubuntu
– Mix versions of JDK, Tomcat, etc.
• Vendor diversity
– Multiple DNS vendors, cloud regions, costs more
– Multiple cloud vendors? Much higher cost.
Generate Permutations
> epi <-
data.frame(java=gl(2,1,8,c("java6","java7")), linux=gl(2,2,8,c("centos","ubuntu"))
, codeversion=gl(2,4,8,c("v34","v35")))
> epi
java linux codeversion
1 java6 centos v34
2 java7 centos v34
3 java6 ubuntu v34
4 java7 ubuntu v34
5 java6 centos v35
6 java7 centos v35
7 java6 ubuntu v35
8 java7 ubuntu v35
Deployment
• Builds
– Manual to test, automate if it works
– Modify build to generate permutation AMIs
– Modify Asgard to auto-deploy permutations
• Data collection
– Tag each instance with its permutation
– Gather metrics by permutation per instance
– Do R-based Design of Experiments analysis
Analysis
• As a function of permutations
– Error rate
– Response time
– CPU Utilization
• Interactions
– E.g. interaction between linux and java
– Contrasts identify components with issues
– Small changes with high statistical significance
GCS Total API Outage for ~1hr
Takeaway
Watch out for monocultures
A|B Testing – it’s not just for personalization
http://perfcap.blogspot.com
http://slideshare.net/adrianco – Netflix
http://slideshare.net/adriancockcroft - Battery
http://www.linkedin.com/in/adriancockcroft
@adrianco @BatteryVentures

More Related Content

What's hot

Microservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkMicroservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkAdrian Cockcroft
 
Software Architecture Conference - Monitoring Microservices - A Challenge
Software Architecture Conference -  Monitoring Microservices - A ChallengeSoftware Architecture Conference -  Monitoring Microservices - A Challenge
Software Architecture Conference - Monitoring Microservices - A ChallengeAdrian Cockcroft
 
From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!Jules Pierre-Louis
 
Microservices the Good Bad and the Ugly
Microservices the Good Bad and the UglyMicroservices the Good Bad and the Ugly
Microservices the Good Bad and the UglyAdrian Cockcroft
 
Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)Adrian Cockcroft
 
Microxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for MicroservicesMicroxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for MicroservicesAdrian Cockcroft
 
Microservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceMicroservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceAdrian Cockcroft
 
Chaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSChaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSBilal Aybar
 
Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...
Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...
Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...Docker, Inc.
 
Cloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCCCloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCCAdrian Cockcroft
 
Rugged DevOps Will help you build ur cloudz
Rugged DevOps Will help you build ur cloudzRugged DevOps Will help you build ur cloudz
Rugged DevOps Will help you build ur cloudzJames Wickett
 
Staying Secure When Moving to the Cloud - Dave Millier
Staying Secure When Moving to the Cloud - Dave MillierStaying Secure When Moving to the Cloud - Dave Millier
Staying Secure When Moving to the Cloud - Dave MillierTriNimbus
 
The hardest part of microservices: your data
The hardest part of microservices: your dataThe hardest part of microservices: your data
The hardest part of microservices: your dataChristian Posta
 
Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...
Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...
Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...Thoughtworks
 
Successfully Implementing DEV-SEC-OPS in the Cloud
Successfully Implementing DEV-SEC-OPS in the CloudSuccessfully Implementing DEV-SEC-OPS in the Cloud
Successfully Implementing DEV-SEC-OPS in the CloudAmazon Web Services
 
Microservices architecture overview v3
Microservices architecture overview v3Microservices architecture overview v3
Microservices architecture overview v3Dmitry Skaredov
 
Building a Bank out of Microservices (NDC Sydney, August 2016)
Building a Bank out of Microservices (NDC Sydney, August 2016)Building a Bank out of Microservices (NDC Sydney, August 2016)
Building a Bank out of Microservices (NDC Sydney, August 2016)Graham Lea
 
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Rackspace Academy
 

What's hot (20)

Microservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkMicroservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New York
 
Microxchg Microservices
Microxchg MicroservicesMicroxchg Microservices
Microxchg Microservices
 
Software Architecture Conference - Monitoring Microservices - A Challenge
Software Architecture Conference -  Monitoring Microservices - A ChallengeSoftware Architecture Conference -  Monitoring Microservices - A Challenge
Software Architecture Conference - Monitoring Microservices - A Challenge
 
From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!
 
Microservices the Good Bad and the Ugly
Microservices the Good Bad and the UglyMicroservices the Good Bad and the Ugly
Microservices the Good Bad and the Ugly
 
Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)
 
Microxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for MicroservicesMicroxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for Microservices
 
Microservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceMicroservices Workshop - Craft Conference
Microservices Workshop - Craft Conference
 
Chaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSChaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWS
 
Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...
Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...
Dockerizing CS50: From Cluster to Cloud to Appliance to Container by David Ma...
 
Cloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCCCloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCC
 
Rugged DevOps Will help you build ur cloudz
Rugged DevOps Will help you build ur cloudzRugged DevOps Will help you build ur cloudz
Rugged DevOps Will help you build ur cloudz
 
Staying Secure When Moving to the Cloud - Dave Millier
Staying Secure When Moving to the Cloud - Dave MillierStaying Secure When Moving to the Cloud - Dave Millier
Staying Secure When Moving to the Cloud - Dave Millier
 
The hardest part of microservices: your data
The hardest part of microservices: your dataThe hardest part of microservices: your data
The hardest part of microservices: your data
 
Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...
Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...
Real-world Microservices: Lessons from the Front Line - Zhamak Delghani, Thou...
 
Successfully Implementing DEV-SEC-OPS in the Cloud
Successfully Implementing DEV-SEC-OPS in the CloudSuccessfully Implementing DEV-SEC-OPS in the Cloud
Successfully Implementing DEV-SEC-OPS in the Cloud
 
Microservices architecture overview v3
Microservices architecture overview v3Microservices architecture overview v3
Microservices architecture overview v3
 
Building a Bank out of Microservices (NDC Sydney, August 2016)
Building a Bank out of Microservices (NDC Sydney, August 2016)Building a Bank out of Microservices (NDC Sydney, August 2016)
Building a Bank out of Microservices (NDC Sydney, August 2016)
 
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
 
Mini-Training: Netflix Simian Army
Mini-Training: Netflix Simian ArmyMini-Training: Netflix Simian Army
Mini-Training: Netflix Simian Army
 

Viewers also liked

Microservices Workshop All Topics Deck 2016
Microservices Workshop All Topics Deck 2016Microservices Workshop All Topics Deck 2016
Microservices Workshop All Topics Deck 2016Adrian Cockcroft
 
Cloud Trends Nov2015 Structure
Cloud Trends Nov2015 StructureCloud Trends Nov2015 Structure
Cloud Trends Nov2015 StructureAdrian Cockcroft
 
Innovation and Architecture
Innovation and ArchitectureInnovation and Architecture
Innovation and ArchitectureAdrian Cockcroft
 
Gluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A ChallengeGluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A ChallengeAdrian Cockcroft
 
Most people cannot say - even to themselves - what their "Business Model" is
Most people cannot say - even to themselves - what their "Business Model" is Most people cannot say - even to themselves - what their "Business Model" is
Most people cannot say - even to themselves - what their "Business Model" is S K "Bal" Palekar
 
Zenoss Monitroing – zendmd Scripting Guide
Zenoss Monitroing – zendmd Scripting GuideZenoss Monitroing – zendmd Scripting Guide
Zenoss Monitroing – zendmd Scripting GuideVCP Muthukrishna
 
More Than Human - Personal Update Magazine - August 2011
More Than Human - Personal Update Magazine - August 2011More Than Human - Personal Update Magazine - August 2011
More Than Human - Personal Update Magazine - August 2011SpiritualOne
 
Eye Catching Photos
Eye Catching PhotosEye Catching Photos
Eye Catching PhotosYee Seng Gan
 
Technology In Schools What Is Changing
Technology  In  Schools  What  Is  ChangingTechnology  In  Schools  What  Is  Changing
Technology In Schools What Is ChangingYarmouth Schools
 
Nair jure123456
Nair jure123456Nair jure123456
Nair jure123456nairjure
 
Optical flares installation
Optical flares installationOptical flares installation
Optical flares installationgalamo11
 
Dynamic ticket pricing. Squeezing more juice from half time oranges
Dynamic ticket pricing. Squeezing more juice from half time oranges  Dynamic ticket pricing. Squeezing more juice from half time oranges
Dynamic ticket pricing. Squeezing more juice from half time oranges Value Partners
 
Web syahid 1210651273
Web syahid 1210651273Web syahid 1210651273
Web syahid 1210651273Moh Syahid
 
Talent Gradient Construction Based on Competency
Talent Gradient Construction Based on CompetencyTalent Gradient Construction Based on Competency
Talent Gradient Construction Based on CompetencyVincent T. ZHAO
 
Quantum Entanglement - Cryptography and Communication
Quantum Entanglement - Cryptography and CommunicationQuantum Entanglement - Cryptography and Communication
Quantum Entanglement - Cryptography and CommunicationYi-Hsueh Tsai
 

Viewers also liked (20)

Microservices Workshop All Topics Deck 2016
Microservices Workshop All Topics Deck 2016Microservices Workshop All Topics Deck 2016
Microservices Workshop All Topics Deck 2016
 
Cloud Trends Nov2015 Structure
Cloud Trends Nov2015 StructureCloud Trends Nov2015 Structure
Cloud Trends Nov2015 Structure
 
Innovation and Architecture
Innovation and ArchitectureInnovation and Architecture
Innovation and Architecture
 
Speeding Up Innovation
Speeding Up InnovationSpeeding Up Innovation
Speeding Up Innovation
 
Gluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A ChallengeGluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A Challenge
 
Most people cannot say - even to themselves - what their "Business Model" is
Most people cannot say - even to themselves - what their "Business Model" is Most people cannot say - even to themselves - what their "Business Model" is
Most people cannot say - even to themselves - what their "Business Model" is
 
Systems analysis and design (abe)
Systems analysis and design (abe)Systems analysis and design (abe)
Systems analysis and design (abe)
 
Rom - Ruby Object Mapper
Rom - Ruby Object MapperRom - Ruby Object Mapper
Rom - Ruby Object Mapper
 
Zenoss Monitroing – zendmd Scripting Guide
Zenoss Monitroing – zendmd Scripting GuideZenoss Monitroing – zendmd Scripting Guide
Zenoss Monitroing – zendmd Scripting Guide
 
More Than Human - Personal Update Magazine - August 2011
More Than Human - Personal Update Magazine - August 2011More Than Human - Personal Update Magazine - August 2011
More Than Human - Personal Update Magazine - August 2011
 
Eye Catching Photos
Eye Catching PhotosEye Catching Photos
Eye Catching Photos
 
Technology In Schools What Is Changing
Technology  In  Schools  What  Is  ChangingTechnology  In  Schools  What  Is  Changing
Technology In Schools What Is Changing
 
Nair jure123456
Nair jure123456Nair jure123456
Nair jure123456
 
4g5
4g54g5
4g5
 
Optical flares installation
Optical flares installationOptical flares installation
Optical flares installation
 
Dynamic ticket pricing. Squeezing more juice from half time oranges
Dynamic ticket pricing. Squeezing more juice from half time oranges  Dynamic ticket pricing. Squeezing more juice from half time oranges
Dynamic ticket pricing. Squeezing more juice from half time oranges
 
Web syahid 1210651273
Web syahid 1210651273Web syahid 1210651273
Web syahid 1210651273
 
Talent Gradient Construction Based on Competency
Talent Gradient Construction Based on CompetencyTalent Gradient Construction Based on Competency
Talent Gradient Construction Based on Competency
 
Mig gig first draft
Mig gig first draftMig gig first draft
Mig gig first draft
 
Quantum Entanglement - Cryptography and Communication
Quantum Entanglement - Cryptography and CommunicationQuantum Entanglement - Cryptography and Communication
Quantum Entanglement - Cryptography and Communication
 

Similar to Epidemic Failures

Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13Dave Gardner
 
Azug - successfully breeding rabits
Azug - successfully breeding rabitsAzug - successfully breeding rabits
Azug - successfully breeding rabitsYves Goeleven
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesAdrian Cockcroft
 
Cloud Native Camel Riding
Cloud Native Camel RidingCloud Native Camel Riding
Cloud Native Camel RidingChristian Posta
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13Dave Gardner
 
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUsDavid Klee
 
D. Andreadis, Red Hat: Concepts and technical overview of Quarkus
D. Andreadis, Red Hat: Concepts and technical overview of QuarkusD. Andreadis, Red Hat: Concepts and technical overview of Quarkus
D. Andreadis, Red Hat: Concepts and technical overview of QuarkusUni Systems S.M.S.A.
 
Architecting for failure - Why are distributed systems hard?
Architecting for failure - Why are distributed systems hard?Architecting for failure - Why are distributed systems hard?
Architecting for failure - Why are distributed systems hard?Markus Eisele
 
Fuse integration-services
Fuse integration-servicesFuse integration-services
Fuse integration-servicesChristian Posta
 
Tokyo azure meetup #12 service fabric internals
Tokyo azure meetup #12   service fabric internalsTokyo azure meetup #12   service fabric internals
Tokyo azure meetup #12 service fabric internalsTokyo Azure Meetup
 
Private Cloud with Open Stack, Docker
Private Cloud with Open Stack, DockerPrivate Cloud with Open Stack, Docker
Private Cloud with Open Stack, DockerDavinder Kohli
 
EVE Microservices Platform
EVE Microservices PlatformEVE Microservices Platform
EVE Microservices PlatformAlaa Qutaish
 
SQL Server Clustering for Dummies
SQL Server Clustering for DummiesSQL Server Clustering for Dummies
SQL Server Clustering for DummiesMark Broadbent
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...Adrian Cockcroft
 
PowerPoint Presentation
PowerPoint PresentationPowerPoint Presentation
PowerPoint Presentationlalitjangra9
 
Performance of Microservice Frameworks on different JVMs
Performance of Microservice Frameworks on different JVMsPerformance of Microservice Frameworks on different JVMs
Performance of Microservice Frameworks on different JVMsMaarten Smeets
 
Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...
Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...
Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...Avere Systems
 

Similar to Epidemic Failures (20)

Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13
 
Azug - successfully breeding rabits
Azug - successfully breeding rabitsAzug - successfully breeding rabits
Azug - successfully breeding rabits
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
 
Cloud Native Camel Riding
Cloud Native Camel RidingCloud Native Camel Riding
Cloud Native Camel Riding
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13
 
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
 
D. Andreadis, Red Hat: Concepts and technical overview of Quarkus
D. Andreadis, Red Hat: Concepts and technical overview of QuarkusD. Andreadis, Red Hat: Concepts and technical overview of Quarkus
D. Andreadis, Red Hat: Concepts and technical overview of Quarkus
 
Dystopia as a Service
Dystopia as a ServiceDystopia as a Service
Dystopia as a Service
 
Architecting for failure - Why are distributed systems hard?
Architecting for failure - Why are distributed systems hard?Architecting for failure - Why are distributed systems hard?
Architecting for failure - Why are distributed systems hard?
 
Fuse integration-services
Fuse integration-servicesFuse integration-services
Fuse integration-services
 
Tokyo azure meetup #12 service fabric internals
Tokyo azure meetup #12   service fabric internalsTokyo azure meetup #12   service fabric internals
Tokyo azure meetup #12 service fabric internals
 
Private Cloud with Open Stack, Docker
Private Cloud with Open Stack, DockerPrivate Cloud with Open Stack, Docker
Private Cloud with Open Stack, Docker
 
33rd degree
33rd degree33rd degree
33rd degree
 
EVE Microservices Platform
EVE Microservices PlatformEVE Microservices Platform
EVE Microservices Platform
 
SQL Server Clustering for Dummies
SQL Server Clustering for DummiesSQL Server Clustering for Dummies
SQL Server Clustering for Dummies
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
 
PowerPoint Presentation
PowerPoint PresentationPowerPoint Presentation
PowerPoint Presentation
 
Performance of Microservice Frameworks on different JVMs
Performance of Microservice Frameworks on different JVMsPerformance of Microservice Frameworks on different JVMs
Performance of Microservice Frameworks on different JVMs
 
Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...
Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...
Share on LinkedIn Share on Twitter Share on Facebook Share on Google+ Share b...
 
Migrating to Public Cloud
Migrating to Public CloudMigrating to Public Cloud
Migrating to Public Cloud
 

More from Adrian Cockcroft

Gophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential GoroutinesGophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential GoroutinesAdrian Cockcroft
 
Microservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCONMicroservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCONAdrian Cockcroft
 
Evolution of Microservices - Craft Conference
Evolution of Microservices - Craft ConferenceEvolution of Microservices - Craft Conference
Evolution of Microservices - Craft ConferenceAdrian Cockcroft
 
What's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoWhat's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoAdrian Cockcroft
 
Dockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper SaferDockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper SaferAdrian Cockcroft
 
Dockercon State of the Art in Microservices
Dockercon State of the Art in MicroservicesDockercon State of the Art in Microservices
Dockercon State of the Art in MicroservicesAdrian Cockcroft
 
Cloud Native Cost Optimization
Cloud Native Cost OptimizationCloud Native Cost Optimization
Cloud Native Cost OptimizationAdrian Cockcroft
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Adrian Cockcroft
 
Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Adrian Cockcroft
 

More from Adrian Cockcroft (10)

Gophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential GoroutinesGophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential Goroutines
 
Microservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCONMicroservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCON
 
Evolution of Microservices - Craft Conference
Evolution of Microservices - Craft ConferenceEvolution of Microservices - Craft Conference
Evolution of Microservices - Craft Conference
 
What's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoWhat's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at Cisco
 
In Search of Segmentation
In Search of SegmentationIn Search of Segmentation
In Search of Segmentation
 
Dockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper SaferDockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper Safer
 
Dockercon State of the Art in Microservices
Dockercon State of the Art in MicroservicesDockercon State of the Art in Microservices
Dockercon State of the Art in Microservices
 
Cloud Native Cost Optimization
Cloud Native Cost OptimizationCloud Native Cost Optimization
Cloud Native Cost Optimization
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
 
Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1
 

Recently uploaded

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 

Recently uploaded (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Epidemic Failures

  • 1. Cloud Native and Epidemic Failures April 2014 Adrian Cockcroft @adrianco @BatteryVentures http://www.linkedin.com/in/adriancockcroft
  • 3. Cloud Native Construct a highly agile and highly available service from ephemeral and often broken components
  • 5. Numquam ponenda est pluralitas sine necessitate Plurality must never be posited without necessity Occam’s Razor
  • 6. Monoculture Replicate “the best” as patterns Reduce interaction complexity Epidemic single point of failure
  • 7. Pattern Failures Infrastructure Pattern Failures Software Stack Pattern Failures Application Pattern Failures
  • 8. Infrastructure Pattern Failures • Device failures – bad batch of disks, PSUs, etc. • CPU failures – cache corruption, math errors • Datacenter failures – power, network, disaster • Routing failures – DNS, Internet/ISP path
  • 9. Software Stack Pattern Failures • Time bombs – Counter wrap, memory leak • Date bombs - Leap year, leap second, epoch • Expiration – Certs timing out • Trust revocation – Certificate Authority fails • Security exploit – e.g. heartbleed • Language bugs – compile time • Runtime bugs – JVM, Linux, Hypervisor • Network bugs – routers, firewalls, protocols
  • 10. Application Pattern Failures • Time bombs – Counter wrap, memory leak • Date bombs - Leap year, leap second, epoch • Content bombs – Data dependent failure • Configuration – wrong/bad syntax • Versioning – incompatible mixes • Cascading failures – error handling bugs etc. • Cascading overload – excessive logging etc.
  • 11. What to do? Automated diversity management Diversified automation Efficient vs. Antifragile
  • 12. Specific Ideas • Automate running a mixture – Diversity as default for any service stack – No developer overhead, stay agile, low cost • Support oldest and newest versions together – Automate running 50/50 mix CentOS/Ubuntu – Mix versions of JDK, Tomcat, etc. • Vendor diversity – Multiple DNS vendors, cloud regions, costs more – Multiple cloud vendors? Much higher cost.
  • 13. Generate Permutations > epi <- data.frame(java=gl(2,1,8,c("java6","java7")), linux=gl(2,2,8,c("centos","ubuntu")) , codeversion=gl(2,4,8,c("v34","v35"))) > epi java linux codeversion 1 java6 centos v34 2 java7 centos v34 3 java6 ubuntu v34 4 java7 ubuntu v34 5 java6 centos v35 6 java7 centos v35 7 java6 ubuntu v35 8 java7 ubuntu v35
  • 14. Deployment • Builds – Manual to test, automate if it works – Modify build to generate permutation AMIs – Modify Asgard to auto-deploy permutations • Data collection – Tag each instance with its permutation – Gather metrics by permutation per instance – Do R-based Design of Experiments analysis
  • 15. Analysis • As a function of permutations – Error rate – Response time – CPU Utilization • Interactions – E.g. interaction between linux and java – Contrasts identify components with issues – Small changes with high statistical significance
  • 16. GCS Total API Outage for ~1hr
  • 17. Takeaway Watch out for monocultures A|B Testing – it’s not just for personalization http://perfcap.blogspot.com http://slideshare.net/adrianco – Netflix http://slideshare.net/adriancockcroft - Battery http://www.linkedin.com/in/adriancockcroft @adrianco @BatteryVentures