SlideShare a Scribd company logo
Chaos Testing for Docker Containers
Who am I?
‣Alexei Ledenev (@alexeiled)
‣Chief of Research @codefresh.io
‣Open Source Projects
‣github.com/alexei-led/pumba
‣github.com/codefresh-io/microci
‣#docker #k8s #aws #gcloud
Complex Systems
"Sooner or later, any complex system will fail, and software systems are no exception.
Failure can occur anytime and almost anywhere. So you should never get too comfortable."
Last Year Outages
• IBM Cloud, January 26

• GitLab, January 31

• AWS, February 28

• Microsoft Azure, March 16

• ...

• Visit http://outage.report/
What can we do
to achieve better Quality?
More testing? Better monitoring?
Functional Testing
Performance Testing
Integration Testing
Penetration Testing
Acceptance Testing Log Analytics
Monitoring Alerts
Failure Predictions
Building distributed software today is easier than ever
CAP Theorem
“Of three properties of
shared-data systems
(Consistency, Availability
and tolerance to network
Partitions) only two can be
achieved at any given
moment in time.”
Eric Brewer
Chaos Engineering
• Embrace the failure!
• Defines an empirical approach to resilience testing of distributed software systems 

• Chaos Experiment

- define a "normal/steady" state of the system (e.g. by monitoring a set of system and business
metrics)

- pseudo-randomly inject faults (e.g. by terminating VMs, killing containers or changing network
behavior)

- try to discover system weaknesses by deviation from expected or steady-state behavior 

The harder it is to disrupt the steady state, the more confidence we have in the behavior of the system.
 
http://principlesofchaos.org/
https://github.com/Netflix/SimianArmy
Google :// Chaos Monkey for DockerWarthog
What is Pumba(a)?
1. Pumbaa is a well-known supporting character
(warthog) from Disney’s animated film The Lion King

2.  In Swahili, pumbaa means “to be foolish, silly, weak-
minded, careless, negligent”
3. It's also an open source Chaos Testing tool for Docker
containers 

1. https://github.com/gaia-adm/pumba

2. Linux, Windows, MacOS, Docker
What Pumba can do?
• Pumba disturbs Docker runtime environment, injecting different failures 

• The "victim" container can be specified, providing name/s or regex

• Radom selection is also supported (with `--random` flag)

• It's possible to define a repeatable time interval and duration parameters
to better control the Chaos

• Pumba can disturb either single Docker host, Swarm cluster, and
Kubernetes cluster
Pumba Docker Chaos Commands
1. stop running Docker container

2. kill (send termination or other signal) to the main process within a
Docker container

3. remove "victim" containers, with their links and volumes

4. pause all processes within a "victim" Docker container for a
specified time
demo time ...
Examples
# stop random container once in a 10 minutes
$ pumba --random --interval 10m kill --signal SIGSTOP
# every 15 minutes kill `mysql` container and
# every hour remove containers starting with "cf"
$ pumba --interval 15m kill --signal SIGTERM mysql &
$ pumba --interval 1h rm re2:^cf &
# every 5 min randomly kill "worker1" or "worker2" containers
# and every 3 minutes pause "queue" container for 15s
$ pumba --random --interval 5m kill --signal SIGKILL worker1 worker2 &
$ pumba --interval 3m pause --duration 15s queue &
Pumba Network Chaos Commands
1. Pumba can emulate network failures at container level (filter by IP too)

2. delay egress traffic for the specified containers

3. add packet-loss based on different probability loss models (2-3-4 state
Markov, Gilbert, Simple Gilbert and Bernoulli)

4. rate limit egress traffic for the specified containers
# add 3 seconds delay for all outgoing packets
# on (default) network device of Docker container for 5 minutes
$ pumba netem --duration 5m delay --time 3000 mydb
# add a delay of 3000ms ± 30ms,
# with the next random element depending 20% on the last one,
# for all outgoing packets on device of all Docker container,
# with name start with for 10 minutes
$ pumba netem --duration 5m --interface eth1 delay 
--time 3000 --jitter 30 --correlation 20 re2:^hp
# add a delay of 3000ms ± 40ms, where variation in delay
# is described by normal distribution,
# for all outgoing packets on main network device of randomly
# chosen Docker container
# from the specified list, for 5 minutes
$ pumba --random netem --duration 5m delay --time 3000 
--jitter 40 --distribution normal 
container1 container2 container3
Pumba Netem under the hood
• The Linux kernel offers a native framework for routing, bridging, firewalling, address
translation and much else.

• Before a packet leaves the output interface, it passes through Linux Traffic Control (tc). This
component is a powerful tool for scheduling, shaping, classifying and prioritizing traffic.

• The basic component of Linux Traffic Control is the queuing discipline (qdisc).  The
simplest implementation of a qdisc is first in first out (FIFO). There are others too.

• The network emulation (netem) project adds queuing disciplines that emulate wide area
network properties such as latency, jitter, loss, duplication, corruption and reordering.
demo time ...
pumba netem loss: https://asciinema.org/a/82430
pumba netem delay: https://asciinema.org/a/82428
Chaos Engineering for Docker

More Related Content

What's hot

Task migration using CRIU
Task migration using CRIUTask migration using CRIU
Task migration using CRIU
Rohit Jnagal
 
An Introduction to Chaos Engineering
An Introduction to Chaos EngineeringAn Introduction to Chaos Engineering
An Introduction to Chaos Engineering
Gremlin
 
YOW2018 - Events and Commands: Developing Asynchronous Microservices
YOW2018 - Events and Commands: Developing Asynchronous MicroservicesYOW2018 - Events and Commands: Developing Asynchronous Microservices
YOW2018 - Events and Commands: Developing Asynchronous Microservices
Chris Richardson
 
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...
Amazon Web Services
 
How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...
How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...
How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...
Lucas Jellema
 
Overview of Site Reliability Engineering (SRE) & best practices
Overview of Site Reliability Engineering (SRE) & best practicesOverview of Site Reliability Engineering (SRE) & best practices
Overview of Site Reliability Engineering (SRE) & best practices
Ashutosh Agarwal
 
(DVO401) Deep Dive into Blue/Green Deployments on AWS
(DVO401) Deep Dive into Blue/Green Deployments on AWS(DVO401) Deep Dive into Blue/Green Deployments on AWS
(DVO401) Deep Dive into Blue/Green Deployments on AWS
Amazon Web Services
 
"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin
"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin
"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin
Katherine Golovinova
 
Chaos Engineering
Chaos EngineeringChaos Engineering
Chaos Engineering
Amazon Web Services
 
Resilience Testing
Resilience Testing Resilience Testing
Resilience Testing
Ran Levy
 
Chaos engineering
Chaos engineering Chaos engineering
Chaos engineering
Alberto Acerbis
 
Introduction to Chaos Engineering
Introduction to Chaos EngineeringIntroduction to Chaos Engineering
Introduction to Chaos Engineering
Raymond Adrian (Rad) Butalid
 
Infrastructure as Code for Azure: ARM or Terraform?
Infrastructure as Code for Azure: ARM or Terraform?Infrastructure as Code for Azure: ARM or Terraform?
Infrastructure as Code for Azure: ARM or Terraform?
Katherine Golovinova
 
Kubernetes Probes (Liveness, Readyness, Startup) Introduction
Kubernetes Probes (Liveness, Readyness, Startup) IntroductionKubernetes Probes (Liveness, Readyness, Startup) Introduction
Kubernetes Probes (Liveness, Readyness, Startup) Introduction
AkhmadZakiAlsafi
 
CI/CD on AWS Deploy Everything All the Time
CI/CD on AWS Deploy Everything All the TimeCI/CD on AWS Deploy Everything All the Time
CI/CD on AWS Deploy Everything All the Time
Amazon Web Services
 
Chaos Engineering with Gremlin Platform
Chaos Engineering with Gremlin PlatformChaos Engineering with Gremlin Platform
Chaos Engineering with Gremlin Platform
Anshul Patel
 
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Amazon Web Services
 
AWS CDK introduction
AWS CDK introductionAWS CDK introduction
AWS CDK introduction
leo lapworth
 
Autoscaling with Kubernetes
Autoscaling with KubernetesAutoscaling with Kubernetes
Autoscaling with Kubernetes
Johannes Würbach
 
chaos-engineering-Knolx
chaos-engineering-Knolxchaos-engineering-Knolx
chaos-engineering-Knolx
Knoldus Inc.
 

What's hot (20)

Task migration using CRIU
Task migration using CRIUTask migration using CRIU
Task migration using CRIU
 
An Introduction to Chaos Engineering
An Introduction to Chaos EngineeringAn Introduction to Chaos Engineering
An Introduction to Chaos Engineering
 
YOW2018 - Events and Commands: Developing Asynchronous Microservices
YOW2018 - Events and Commands: Developing Asynchronous MicroservicesYOW2018 - Events and Commands: Developing Asynchronous Microservices
YOW2018 - Events and Commands: Developing Asynchronous Microservices
 
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - A...
 
How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...
How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...
How and Why GraalVM is quickly becoming relevant for developers (ACEs@home - ...
 
Overview of Site Reliability Engineering (SRE) & best practices
Overview of Site Reliability Engineering (SRE) & best practicesOverview of Site Reliability Engineering (SRE) & best practices
Overview of Site Reliability Engineering (SRE) & best practices
 
(DVO401) Deep Dive into Blue/Green Deployments on AWS
(DVO401) Deep Dive into Blue/Green Deployments on AWS(DVO401) Deep Dive into Blue/Green Deployments on AWS
(DVO401) Deep Dive into Blue/Green Deployments on AWS
 
"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin
"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin
"Certified Kubernetes Administrator Exam – how it was" by Andrii Fedenishin
 
Chaos Engineering
Chaos EngineeringChaos Engineering
Chaos Engineering
 
Resilience Testing
Resilience Testing Resilience Testing
Resilience Testing
 
Chaos engineering
Chaos engineering Chaos engineering
Chaos engineering
 
Introduction to Chaos Engineering
Introduction to Chaos EngineeringIntroduction to Chaos Engineering
Introduction to Chaos Engineering
 
Infrastructure as Code for Azure: ARM or Terraform?
Infrastructure as Code for Azure: ARM or Terraform?Infrastructure as Code for Azure: ARM or Terraform?
Infrastructure as Code for Azure: ARM or Terraform?
 
Kubernetes Probes (Liveness, Readyness, Startup) Introduction
Kubernetes Probes (Liveness, Readyness, Startup) IntroductionKubernetes Probes (Liveness, Readyness, Startup) Introduction
Kubernetes Probes (Liveness, Readyness, Startup) Introduction
 
CI/CD on AWS Deploy Everything All the Time
CI/CD on AWS Deploy Everything All the TimeCI/CD on AWS Deploy Everything All the Time
CI/CD on AWS Deploy Everything All the Time
 
Chaos Engineering with Gremlin Platform
Chaos Engineering with Gremlin PlatformChaos Engineering with Gremlin Platform
Chaos Engineering with Gremlin Platform
 
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
 
AWS CDK introduction
AWS CDK introductionAWS CDK introduction
AWS CDK introduction
 
Autoscaling with Kubernetes
Autoscaling with KubernetesAutoscaling with Kubernetes
Autoscaling with Kubernetes
 
chaos-engineering-Knolx
chaos-engineering-Knolxchaos-engineering-Knolx
chaos-engineering-Knolx
 

Similar to Chaos Engineering for Docker

DevoxxFR 2016 - 3 degrees of MoM
DevoxxFR 2016 - 3 degrees of MoMDevoxxFR 2016 - 3 degrees of MoM
DevoxxFR 2016 - 3 degrees of MoM
Guillaume Arnaud
 
Container Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, Netflix
Docker, Inc.
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
Brendan Gregg
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
Federico Michele Facca
 
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Zabbix
 
What's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea Luzzardi
What's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea LuzzardiWhat's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea Luzzardi
What's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea Luzzardi
Mike Goelzer
 
What's New in Docker 1.12 by Mike Goelzer and Andrea Luzzardi
What's New in Docker 1.12 by Mike Goelzer and Andrea LuzzardiWhat's New in Docker 1.12 by Mike Goelzer and Andrea Luzzardi
What's New in Docker 1.12 by Mike Goelzer and Andrea Luzzardi
Docker, Inc.
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward
 
Docker 1.11 Presentation
Docker 1.11 PresentationDocker 1.11 Presentation
Docker 1.11 Presentation
Sreenivas Makam
 
Demystfying container-networking
Demystfying container-networkingDemystfying container-networking
Demystfying container-networking
Balasundaram Natarajan
 
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityTokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Phil Estes
 
Unraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudUnraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production Cloud
Salman Baset
 
Kubernetes - Starting with 1.2
Kubernetes  - Starting with 1.2Kubernetes  - Starting with 1.2
Kubernetes - Starting with 1.2
William Stewart
 
Parallelizing CI using Docker Swarm-Mode
Parallelizing CI using Docker Swarm-ModeParallelizing CI using Docker Swarm-Mode
Parallelizing CI using Docker Swarm-Mode
Akihiro Suda
 
AsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) Hypervisor
AsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) HypervisorAsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) Hypervisor
AsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) Hypervisor
Dave Voutila
 
FPC for the Masses (SANSFire Edition)
FPC for the Masses (SANSFire Edition)FPC for the Masses (SANSFire Edition)
FPC for the Masses (SANSFire Edition)
Xavier Mertens
 
JDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
JDO 2019: Tips and Tricks from Docker Captain - Łukasz LachJDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
JDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
PROIDEA
 
Docker advance1
Docker advance1Docker advance1
Docker advance1
Gourav Varma
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Ricardo Amaro
 

Similar to Chaos Engineering for Docker (20)

DevoxxFR 2016 - 3 degrees of MoM
DevoxxFR 2016 - 3 degrees of MoMDevoxxFR 2016 - 3 degrees of MoM
DevoxxFR 2016 - 3 degrees of MoM
 
Container Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, Netflix
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
 
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
 
What's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea Luzzardi
What's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea LuzzardiWhat's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea Luzzardi
What's New in Docker 1.12 (June 20, 2016) by Mike Goelzer & Andrea Luzzardi
 
What's New in Docker 1.12 by Mike Goelzer and Andrea Luzzardi
What's New in Docker 1.12 by Mike Goelzer and Andrea LuzzardiWhat's New in Docker 1.12 by Mike Goelzer and Andrea Luzzardi
What's New in Docker 1.12 by Mike Goelzer and Andrea Luzzardi
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
 
Docker 1.11 Presentation
Docker 1.11 PresentationDocker 1.11 Presentation
Docker 1.11 Presentation
 
Demystfying container-networking
Demystfying container-networkingDemystfying container-networking
Demystfying container-networking
 
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityTokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker Security
 
Unraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudUnraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production Cloud
 
Kubernetes - Starting with 1.2
Kubernetes  - Starting with 1.2Kubernetes  - Starting with 1.2
Kubernetes - Starting with 1.2
 
Parallelizing CI using Docker Swarm-Mode
Parallelizing CI using Docker Swarm-ModeParallelizing CI using Docker Swarm-Mode
Parallelizing CI using Docker Swarm-Mode
 
AsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) Hypervisor
AsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) HypervisorAsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) Hypervisor
AsiaBSDCon2023 - Hardening Emulated Devices in OpenBSD’s vmd(8) Hypervisor
 
FPC for the Masses (SANSFire Edition)
FPC for the Masses (SANSFire Edition)FPC for the Masses (SANSFire Edition)
FPC for the Masses (SANSFire Edition)
 
JDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
JDO 2019: Tips and Tricks from Docker Captain - Łukasz LachJDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
JDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
 
Docker advance1
Docker advance1Docker advance1
Docker advance1
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant
 

Recently uploaded

Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
ShamsuddeenMuhammadA
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
Google
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 

Recently uploaded (20)

Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 

Chaos Engineering for Docker

  • 1. Chaos Testing for Docker Containers
  • 2. Who am I? ‣Alexei Ledenev (@alexeiled) ‣Chief of Research @codefresh.io ‣Open Source Projects ‣github.com/alexei-led/pumba ‣github.com/codefresh-io/microci ‣#docker #k8s #aws #gcloud
  • 3. Complex Systems "Sooner or later, any complex system will fail, and software systems are no exception. Failure can occur anytime and almost anywhere. So you should never get too comfortable."
  • 4. Last Year Outages • IBM Cloud, January 26 • GitLab, January 31 • AWS, February 28 • Microsoft Azure, March 16 • ... • Visit http://outage.report/
  • 5.
  • 6. What can we do to achieve better Quality? More testing? Better monitoring? Functional Testing Performance Testing Integration Testing Penetration Testing Acceptance Testing Log Analytics Monitoring Alerts Failure Predictions
  • 7. Building distributed software today is easier than ever
  • 8. CAP Theorem “Of three properties of shared-data systems (Consistency, Availability and tolerance to network Partitions) only two can be achieved at any given moment in time.” Eric Brewer
  • 9. Chaos Engineering • Embrace the failure! • Defines an empirical approach to resilience testing of distributed software systems • Chaos Experiment - define a "normal/steady" state of the system (e.g. by monitoring a set of system and business metrics) - pseudo-randomly inject faults (e.g. by terminating VMs, killing containers or changing network behavior) - try to discover system weaknesses by deviation from expected or steady-state behavior The harder it is to disrupt the steady state, the more confidence we have in the behavior of the system.   http://principlesofchaos.org/
  • 11. Google :// Chaos Monkey for DockerWarthog
  • 12. What is Pumba(a)? 1. Pumbaa is a well-known supporting character (warthog) from Disney’s animated film The Lion King 2.  In Swahili, pumbaa means “to be foolish, silly, weak- minded, careless, negligent” 3. It's also an open source Chaos Testing tool for Docker containers 1. https://github.com/gaia-adm/pumba 2. Linux, Windows, MacOS, Docker
  • 13. What Pumba can do? • Pumba disturbs Docker runtime environment, injecting different failures • The "victim" container can be specified, providing name/s or regex • Radom selection is also supported (with `--random` flag) • It's possible to define a repeatable time interval and duration parameters to better control the Chaos • Pumba can disturb either single Docker host, Swarm cluster, and Kubernetes cluster
  • 14. Pumba Docker Chaos Commands 1. stop running Docker container 2. kill (send termination or other signal) to the main process within a Docker container 3. remove "victim" containers, with their links and volumes 4. pause all processes within a "victim" Docker container for a specified time
  • 16. Examples # stop random container once in a 10 minutes $ pumba --random --interval 10m kill --signal SIGSTOP # every 15 minutes kill `mysql` container and # every hour remove containers starting with "cf" $ pumba --interval 15m kill --signal SIGTERM mysql & $ pumba --interval 1h rm re2:^cf & # every 5 min randomly kill "worker1" or "worker2" containers # and every 3 minutes pause "queue" container for 15s $ pumba --random --interval 5m kill --signal SIGKILL worker1 worker2 & $ pumba --interval 3m pause --duration 15s queue &
  • 17. Pumba Network Chaos Commands 1. Pumba can emulate network failures at container level (filter by IP too) 2. delay egress traffic for the specified containers 3. add packet-loss based on different probability loss models (2-3-4 state Markov, Gilbert, Simple Gilbert and Bernoulli) 4. rate limit egress traffic for the specified containers
  • 18. # add 3 seconds delay for all outgoing packets # on (default) network device of Docker container for 5 minutes $ pumba netem --duration 5m delay --time 3000 mydb # add a delay of 3000ms ± 30ms, # with the next random element depending 20% on the last one, # for all outgoing packets on device of all Docker container, # with name start with for 10 minutes $ pumba netem --duration 5m --interface eth1 delay --time 3000 --jitter 30 --correlation 20 re2:^hp # add a delay of 3000ms ± 40ms, where variation in delay # is described by normal distribution, # for all outgoing packets on main network device of randomly # chosen Docker container # from the specified list, for 5 minutes $ pumba --random netem --duration 5m delay --time 3000 --jitter 40 --distribution normal container1 container2 container3
  • 19. Pumba Netem under the hood • The Linux kernel offers a native framework for routing, bridging, firewalling, address translation and much else. • Before a packet leaves the output interface, it passes through Linux Traffic Control (tc). This component is a powerful tool for scheduling, shaping, classifying and prioritizing traffic. • The basic component of Linux Traffic Control is the queuing discipline (qdisc).  The simplest implementation of a qdisc is first in first out (FIFO). There are others too. • The network emulation (netem) project adds queuing disciplines that emulate wide area network properties such as latency, jitter, loss, duplication, corruption and reordering.
  • 20. demo time ... pumba netem loss: https://asciinema.org/a/82430 pumba netem delay: https://asciinema.org/a/82428