Dock’em
Distributed Systems Testing with
NetEm and Docker
Linux.conf.au 2015
Raghavendra Prabhu
 raghavendra.d.prabhu@gmail.com
Percona  raghavendra.prabhu@percona.com
 randomsurfer  wnohang.net  rdprabhu  ronin13
Split Brain?
Our Cluster
Introduction
Seed quotes..
“ ’The network is reliable’ - a fallacy of distributed systems. ”
“ A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. ” - Leslie Lamport
“ Never attribute to malice that which is adequately explained by stupidity. ” - Hanlon’s Razor
“ Never attribute to Byzantine failure that which can be explained by ill node(s). ” - Me
Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 4 / 41
20,000-foot view
Introduction
Actors
▶ Database - WSREP/PXC
▶ Plugin - Galera
▶ Traffic control
♦ Traffic Control - tc
♦ NetEm
Introduction
Actors
▶ Containers - Docker
▶ Load
♦ Generators - Sysbench, RQG
▶ Network
♦ Dnsmasq
♦ nsenter
Introduction
Actors
▶ Jenkins
♦ Build flow and CI
▶ Storage
♦ Why
Distributed Systems Testing
A Kobayashi Maru!
Details
But why
▶ The ‘P’ in CAP
▶ WAN scalability
▶ Real Reason - fun!
▶ Tolerance to latency variance
Details
But why
▶ Failures in warehouses.
▶ Not quorum, but consensus.
▶ Real world networks and synchronous replication
- Delay
- Partition
Galera
Details
Galera
▶ Data-centric approach
▶ EVS
▶ Causality and Synchronous
▶ Latency
One can bring the whole down
Details
Tests
▶ Chaos testing
▶ Flow control with sysbench
▶ Network Loss
▶ Future
There is no higher menace than distributed systems testing
Details
Tests: Chaos testing
▶ Nodes killed at random around sysbench
▶ Less than half of nodes are chosen
▶ docker inspect && SIGKILL
▶ Configurable sleep && retry
♦ Snapshot/Incremental State Transfer
▶ docker restart && repeat
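A minimal sketch of the selection and kill steps listed above, not the code from the pxc-docker repository. The function names are hypothetical and the commands are only built as argument lists, not executed. It assumes node names are Docker container names and that the host PID comes from `docker inspect`.

```python
import random


def pick_victims(nodes, rng=random):
    """Return a random, non-empty selection of fewer than half of the nodes."""
    max_victims = (len(nodes) - 1) // 2  # largest count strictly below half
    if max_victims < 1:
        return []  # too few nodes to kill any while keeping a majority
    return rng.sample(list(nodes), rng.randint(1, max_victims))


def inspect_pid_command(container):
    """docker inspect call that prints the host PID of the container."""
    return ["docker", "inspect", "--format", "{{.State.Pid}}", container]


def kill_and_restart_commands(container, pid):
    """SIGKILL the node first; the restart runs after the configured sleep."""
    return [["kill", "-KILL", str(pid)], ["docker", "restart", container]]
```

Keeping the victim count strictly below half means the surviving nodes still hold a majority, so the cluster should stay in a primary component while the killed nodes rejoin through state transfer.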
Details
Tests: Network Loss
▶ Loss nodes
▶ Detach/Keep qdisc
▶ Reconciliation
▶ Sanity checks
▶ Formation of PC || time to recover
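The "formation of PC || time to recover" check could be sketched as a polling loop like the one below. This is an illustration under assumptions: `get_status` is a hypothetical callable that returns the node's `wsrep_cluster_status` value (for example by querying `SHOW STATUS`), and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_primary(get_status, timeout=60.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until the node reports a primary component.

    Returns the seconds it took, or None if the timeout expired first.
    """
    start = clock()
    while True:
        if get_status() == "Primary":
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

The clock and sleep functions are parameters only so the loop can be exercised without real waiting.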
Containers!
Details
Docker
▶ Why not virtualize
♦ Occam
♦ Namespaces
▶ Simplicity
♦ Network
♦ One application per node
Details
Docker
▶ Portability
- Others see the same qualitative behavior that I do.
▶ Reproducibility
- Makes it deterministic
▶ Configurable and CI
- Byproducts
Details
Docker
▶ QEMU vis-à-vis Docker
▶ Scalability
♦ Performance
♦ Feature
▶ Abstraction of channels
Details
Container Networking
▶ Linking didn’t help
▶ Dnsmasq to rescue!
♦ Hosts file and volumes
♦ SIGHUP and refresh
▶ Potential issues
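One way the hosts-file approach might look, as a sketch only. It assumes dnsmasq was started with an additional hosts file (its `--addn-hosts` option) that lives on a shared volume, and relies on dnsmasq re-reading hosts files when it receives SIGHUP. The function names, path, and PID handling are hypothetical.

```python
import os
import signal


def render_hosts(containers):
    """Render a name -> IP mapping as hosts-file lines, sorted by name."""
    lines = [f"{ip}\t{name}" for name, ip in sorted(containers.items())]
    return "\n".join(lines) + "\n"


def refresh(hosts_path, containers, dnsmasq_pid):
    """Rewrite the hosts file in place, then ask dnsmasq to re-read it."""
    with open(hosts_path, "w") as handle:
        handle.write(render_hosts(containers))
    os.kill(dnsmasq_pid, signal.SIGHUP)
```

The file is rewritten in place rather than replaced, on the assumption that a bind-mounted single file would otherwise keep pointing at the old inode.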
Details
Noise
▶ Initial setup
- Bridge
- Egress only
- IFB
▶ Present state
▶ NetEm
- tc qdisc buckets
- packet loss, delay, corruption, duplication, reordering
- nsenter
▶ Future
- Docker exec
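A sketch of how a NetEm qdisc might be attached inside a container's network namespace with nsenter. The helper names, the `eth0` device, and the example values are assumptions; the commands are built as argument lists and not run here. Only delay, loss, corruption and duplication are covered, since reordering additionally needs a delay to act on.

```python
def netem_add(pid, dev="eth0", delay_ms=None, jitter_ms=None,
              loss_pct=None, corrupt_pct=None, duplicate_pct=None):
    """Build: nsenter into the PID's net namespace, add a root netem qdisc."""
    cmd = ["nsenter", "--target", str(pid), "--net",
           "tc", "qdisc", "add", "dev", dev, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            cmd.append(f"{jitter_ms}ms")  # jitter follows the base delay
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct}%"]
    if corrupt_pct is not None:
        cmd += ["corrupt", f"{corrupt_pct}%"]
    if duplicate_pct is not None:
        cmd += ["duplicate", f"{duplicate_pct}%"]
    return cmd


def netem_del(pid, dev="eth0"):
    """Build the command that detaches the root qdisc again."""
    return ["nsenter", "--target", str(pid), "--net",
            "tc", "qdisc", "del", "dev", dev, "root"]
```

For example, `netem_add(1234, delay_ms=100, jitter_ms=10, loss_pct=5)` yields the arguments for `tc qdisc add dev eth0 root netem delay 100ms 10ms loss 5%` run inside the namespace of PID 1234.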
Details
Other noises
▶ Aim
▶ Fsync
- libeatmydata
- Variance
▶ Correlation with network
▶ How with Docker
- LD_PRELOAD
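A sketch of passing LD_PRELOAD to a container so that libeatmydata turns fsync into a no-op for selected nodes. The library path is an assumption (it differs between distributions), the image must already contain the library, and the function name is hypothetical.

```python
# Assumed location of the library inside the image; adjust per distribution.
DEFAULT_LIB = "/usr/lib/x86_64-linux-gnu/libeatmydata.so"


def run_args(image, name, eat_my_data=False, lib=DEFAULT_LIB):
    """Build docker run arguments, preloading libeatmydata when requested."""
    args = ["docker", "run", "-d", "--name", name]
    if eat_my_data:
        args += ["-e", f"LD_PRELOAD={lib}"]
    return args + [image]
```

Enabling the preload on only some nodes is one way to get different fsync cost across the cluster.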
The Fix
Details
Eviction
▶ STONITH
▶ Permanent eviction
▶ ’N’ strikes & out!
Details
Eviction
▶ Aim
▶ Quorum required
- Why? So that nodes do not shoot each other
- Non-PC nodes also.
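A toy model of "N strikes and out" with a quorum requirement. This is not Galera's auto-eviction code; the class and its rules are hypothetical and only illustrate why a majority of reporters is needed before a node collects a strike.

```python
from collections import Counter


class EvictionTracker:
    """Evict a node once a majority has reported it in max_strikes rounds."""

    def __init__(self, cluster_size, max_strikes):
        self.quorum = cluster_size // 2 + 1
        self.max_strikes = max_strikes
        self.strikes = Counter()
        self.evicted = set()

    def record_round(self, reports):
        """reports maps each reporter to the set of nodes it saw as delayed."""
        votes = Counter()
        for reporter, suspects in reports.items():
            if reporter in self.evicted:
                continue  # evicted nodes no longer get a vote
            for suspect in suspects:
                if suspect != reporter:
                    votes[suspect] += 1
        for node, count in votes.items():
            if count >= self.quorum and node not in self.evicted:
                self.strikes[node] += 1
                if self.strikes[node] >= self.max_strikes:
                    self.evicted.add(node)
        return set(self.evicted)
```

Because a strike needs a majority of reporters, two nodes that merely disagree with each other cannot evict one another.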
Details
Coredumps with Docker
▶ Breakdown of abstraction
▶ Lack of isolation
▶ What was done
- Volumes
- core_pattern & sysctl
- suid and ulimit
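A sketch of the pieces listed above, under assumptions: the `/cores` directory, the core file name pattern, and the function names are illustrative, and the `--ulimit` flag needs a Docker version that supports it. `kernel.core_pattern` and `fs.suid_dumpable` are set on the host, which is where the lack of isolation shows: the pattern applies to every container on that host.

```python
def core_sysctls(core_dir="/cores"):
    """Host-side sysctl commands: where cores go, and allow setuid dumps."""
    return [
        ["sysctl", "-w", f"kernel.core_pattern={core_dir}/core.%e.%p"],
        ["sysctl", "-w", "fs.suid_dumpable=2"],
    ]


def docker_run_args(image, name, host_dir, core_dir="/cores"):
    """docker run with unlimited core size and a volume to collect cores."""
    return [
        "docker", "run", "-d", "--name", name,
        "--ulimit", "core=-1",            # -1 means unlimited
        "-v", f"{host_dir}:{core_dir}",   # cores land on the host volume
        image,
    ]
```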
Details
WAN Segments
▶ How they work
▶ Random allocation
▶ Joiner starvation
▶ Simulates data center
▶ Donor selection
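Random segment allocation could be sketched as below. The Galera provider option `gmcast.segment` is real; the helper names, and the idea of passing the result through `wsrep_provider_options` at node start, are assumptions about how a test harness might wire it up.

```python
import random


def allocate_segments(nodes, num_segments, rng=random):
    """Assign each node a random segment id in [0, num_segments)."""
    if not 1 <= num_segments <= 256:
        raise ValueError("gmcast.segment takes values from 0 to 255")
    return {node: rng.randrange(num_segments) for node in nodes}


def provider_options(segment):
    """Provider option string that places a node in the given segment."""
    return f"gmcast.segment={segment}"
```

Nodes that share a segment id behave as if they were in one data center, so a handful of segments over a dozen containers gives a rough multi-site layout.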
Epilogue
The code
▶ Github: https://github.com/percona/pxc-docker
▶ Jenkins: http://jenkins.percona.com/job/PXC-5.6-netem/
▶ Contributions/testing welcome!
Epilogue
Code: todo
▶ Docker automated builds
▶ Orchestration
▶ Docker
♦ Injection
♦ Signal proxying
Future work
Epilogue
Future work
▶ Fault injection
♦ Memory
- Poisoned memory
♦ Disk
- libeatmydata
- Opposite
- ENOSPC
Epilogue
Fault injection
▶ CPU
- NUMA?
- Hotplug
▶ More network
- corruption, duplication, reordering, rate-limit
- Better distribution
- Other shaping
Epilogue
Further Reading
▶ Byzantine fault tolerance
- Reaching agreement in presence of faults
▶ The Network is Reliable
▶ NetEm
▶ Latency: The New Web Performance Bottleneck
▶ Galera Cluster Documentation
▶ Auto eviction code
▶ Don’t Settle for Eventual Consistency
▶ Extended Virtual Synchrony
Epilogue
About
▶ /me: Raghavendra Prabhu, Product Lead, Percona XtraDB Cluster, Percona.
▶ Slides will be at slideshare.net/slidunder.
▶ About.me: raghavendra.prabhu
▶ Keybase.io: rdprabhu
▶ Presentation under CC BY-SA 4.0
Epilogue
Image Credits
▶ http://galeracluster.com/documentation-webpages/
▶ https://en.wikipedia.org/wiki/Network_theory
▶ https://www.facebook.com/sciencedump/photos/a.296290153732762.90161.111815475513565/985102638184840/?type=1
▶ http://www.thebarrow.org/Neurological_Services/Epilepsy/204354
▶ https://flic.kr/p/9J6GNu
▶ https://secure.flickr.com/photos/brewbooks/7780990192
▶ https://www.flickr.com/photos/kwerfeldein/2649294869
▶ https://www.flickr.com/photos/gcwest/281385801
▶ https://www.flickr.com/photos/bob_in_thailand/9782777742/
▶ http://schauerte.me/data.html
▶ http://ok-panic.net/art/jeff/dennis.jpg
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

Dock'em: Distributed Systems Testing with NetEm and Docker

  • 1. Dock’em: Distributed Systems Testing with NetEm and Docker. Linux.conf.au 2015. Raghavendra Prabhu (raghavendra.d.prabhu@gmail.com), Percona (raghavendra.prabhu@percona.com). randomsurfer, wnohang.net, rdprabhu, ronin13
  • 4. Introduction Seed quotes.. “ ’Network is reliable’ - a fallacy of the distributed system. ” “ A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. ” - Leslie Lamport “ Never attribute to malice that which is adequately explained by stupidity. ” - Hanlon’s Razor “ Never attribute to Byzantine failure which can be explained by an ill node(s) ” - Me Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 4 / 41
  • 9. Introduction Actors ▶ Database - WSREP/PXC ▶ Plugin - Galera ▶ Traffic control ♦ Traffic Control - tc ♦ NetEm
  • 12. Introduction Actors ▶ Containers - Docker ▶ Load ♦ Generators - Sysbench, RQG ▶ Network ♦ Dnsmasq ♦ nsenter
  • 14. Introduction Actors ▶ Jenkins ♦ Build flow and CI ▶ Storage ♦ Why
  • 15. Distributed Systems Testing A Kobayashi Maru!
  • 16. Details But why ▶ The ‘P’ in CAP ▶ WAN scalability ▶ Real Reason - fun! ▶ Tolerance to latency variance
  • 20. Details But why ▶ Failures in warehouses. ▶ Not quorum, but consensus. ▶ Real world networks and synchronous replication - Delay - Partition
  • 22. Details Galera ▶ Data-centric approach ▶ EVS ▶ Causality and Synchronous ▶ Latency
  • 26. One can bring the whole down
  • 27. Details Tests ▶ Chaos testing ▶ Flow control with sysbench ▶ Network Loss ▶ Future
  • 28. There is no higher menace than distributed systems testing
  • 29. Details Tests: Chaos testing ▶ Nodes killed at random around sysbench ▶ Less than half of nodes are chosen ▶ docker inspect && SIGKILL ▶ Configurable sleep && retry ♦ Snapshot/Incremental State Transfer ▶ docker restart && repeat
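
The chaos-testing loop on this slide can be sketched roughly as follows. This is a minimal illustration, not the code from the pxc-docker repository: the function names `pick_victims` and `chaos_round_commands` and the node names are hypothetical, and the sketch only builds the `docker` command lines rather than running them. It assumes "less than half" means a strict minority, so that a majority of nodes always survives.

```python
import random


def pick_victims(nodes, rng=random):
    """Return a random strict minority of nodes, so a majority always survives."""
    max_victims = (len(nodes) - 1) // 2  # largest count that is still < half
    if max_victims < 1:
        return []
    return rng.sample(list(nodes), rng.randint(1, max_victims))


def chaos_round_commands(victims):
    """List the docker commands for one chaos round (hypothetical helper)."""
    commands = []
    for name in victims:
        # Look up the container's init PID on the host; the caller would then
        # send SIGKILL to it, e.g. os.kill(pid, signal.SIGKILL).
        commands.append(["docker", "inspect", "--format", "{{.State.Pid}}", name])
    # A configurable sleep would go here, so the surviving nodes keep taking
    # load before the killed nodes rejoin through a state transfer.
    for name in victims:
        commands.append(["docker", "restart", name])
    return commands
```

With five nodes, `pick_victims` returns one or two names; with two nodes it returns an empty list, since killing either one would already be half the cluster.
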
  • 30. Details Tests: Network Loss ▶ Loss nodes ▶ Detach/Keep qdisc ▶ Reconciliation ▶ Sanity checks ▶ Formation of PC || time to recover
  • 32. Details Docker ▶ Why not virtualize ♦ Occam ♦ Namespaces ▶ Simplicity ♦ Network ♦ One application per node
  • 33. Details Docker ▶ Portability - See same qualitative behavior that I do. ▶ Reproducibility - Makes it deterministic ▶ Configurable and CI - Byproducts
  • 34. Details Docker ▶ QEMU vis-à-vis Docker ▶ Scalability ♦ Performance ♦ Feature ▶ Abstraction of channels
  • 35. Details Container Networking ▶ Linking didn’t help ▶ Dnsmasq to rescue! ♦ Hosts file and volumes ♦ SIGHUP and refresh ▶ Potential issues
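
The "hosts file plus SIGHUP" approach on this slide could look something like the sketch below. It relies on dnsmasq re-reading its hosts files when it receives SIGHUP. Everything else is an assumption for illustration: the helper names `render_hosts` and `refresh_dnsmasq`, the node names, and the addresses are hypothetical, and the hosts file is assumed to live on a volume that the dnsmasq container reads (for example via its `--addn-hosts` option).

```python
import os
import signal


def render_hosts(nodes):
    """Render a hosts-file body from a {name: ip} mapping, sorted by name."""
    return "".join(f"{ip}\t{name}\n" for name, ip in sorted(nodes.items()))


def refresh_dnsmasq(hosts_path, nodes, dnsmasq_pid):
    """Rewrite the shared hosts file, then ask dnsmasq to reload it."""
    with open(hosts_path, "w") as fh:
        fh.write(render_hosts(nodes))
    # dnsmasq clears its cache and re-reads hosts files on SIGHUP.
    os.kill(dnsmasq_pid, signal.SIGHUP)
```

Calling `refresh_dnsmasq` each time a container starts or restarts would keep name resolution current without relinking containers.
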
  • 36. Details Noise ▶ Initial setup - Bridge - Egress only - IFB ▶ Present state ▶ NetEm - tc qdisc buckets - packet loss, delay, corruption, duplication, reordering - nsenter ▶ Future - Docker exec
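
As a rough illustration of the "NetEm through nsenter" setup, the sketch below builds the command that attaches a netem qdisc inside a container's network namespace. The helper names, the `eth0` device, and the example values are assumptions, not taken from the talk; `pid` is the container's init PID as reported by `docker inspect`. Only delay and loss are shown, though netem also supports corruption, duplication, and reordering.

```python
def netem_add_cmd(pid, dev="eth0", delay_ms=None, jitter_ms=None, loss_pct=None):
    """Build an nsenter + tc command that adds a netem qdisc in a container's netns."""
    cmd = ["nsenter", "-t", str(pid), "-n",
           "tc", "qdisc", "add", "dev", dev, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            cmd.append(f"{jitter_ms}ms")  # variation around the base delay
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct}%"]
    return cmd


def netem_del_cmd(pid, dev="eth0"):
    """Build the command that removes the root qdisc again."""
    return ["nsenter", "-t", str(pid), "-n",
            "tc", "qdisc", "del", "dev", dev, "root"]
```

The resulting lists can be passed to `subprocess.run` on the host as root; because the qdisc is attached inside the namespace, it shapes only that container's egress traffic.
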
  • 37. Details Other noises ▶ Aim ▶ Fsync - libeatmydata - Variance ▶ Correlation with network ▶ How with Docker - LD_PRELOAD
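
One way to get fsync variance across nodes is to preload libeatmydata in only some containers through the `LD_PRELOAD` environment variable. The sketch below builds such a `docker run` command. The helper name, the image and container names, and the library path are assumptions: the path shown is where Debian-style packages typically install the library and may differ inside a given image.

```python
# Assumed install path of libeatmydata inside the image; adjust as needed.
DEFAULT_LIB = "/usr/lib/x86_64-linux-gnu/libeatmydata.so"


def docker_run_cmd(image, name, eat_my_data=False, lib=DEFAULT_LIB):
    """Build a docker run command, optionally preloading libeatmydata."""
    cmd = ["docker", "run", "-d", "--name", name]
    if eat_my_data:
        # With the library preloaded, fsync() and friends become no-ops,
        # so this node's disk syncs are much cheaper than its peers'.
        cmd += ["-e", f"LD_PRELOAD={lib}"]
    cmd.append(image)
    return cmd
```

Starting some nodes with `eat_my_data=True` and others without gives the cluster uneven commit latency, which can then be combined with network noise.
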
  • 39. Details Eviction ▶ STONITH ▶ Permanent eviction ▶ ’N’ strikes & out!
  • 40. Details Eviction ▶ Aim ▶ Quorum required - Why? - Not shoot each other - Non-PC nodes also.
  • 42. Details Coredumps with Docker ▶ Breakdown of abstraction ▶ Lack of isolation ▶ What was done - Volumes - core_pattern & sysctl - suid and ulimit
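
The pieces listed under "What was done" might fit together as in the sketch below. This is an assumption-laden illustration, not the talk's actual setup: the helper name and directory paths are hypothetical, `kernel.core_pattern` is treated as a host-wide setting (the "lack of isolation" on the slide), the pattern is assumed to resolve inside the crashing container's filesystem, and the `--ulimit` flag requires a Docker version that supports it.

```python
def core_dump_setup(host_dir="/tmp/cores", container_dir="/cores"):
    """Return (host sysctl commands, extra docker run flags) for collecting cores."""
    # %e expands to the executable name and %p to the PID of the dumped process.
    core_pattern = f"{container_dir}/core.%e.%p"
    host_steps = [
        # Host-wide setting, shared by every container on the machine.
        ["sysctl", "-w", f"kernel.core_pattern={core_pattern}"],
        # Allow dumps from processes that changed credentials (e.g. mysqld
        # started as root and dropping privileges).
        ["sysctl", "-w", "fs.suid_dumpable=2"],
    ]
    docker_flags = [
        "-v", f"{host_dir}:{container_dir}",  # volume so cores land on the host
        "--ulimit", "core=-1",                # unlimited core file size
    ]
    return host_steps, docker_flags
```

The host steps run once as root; the flags are appended to each `docker run` so that a crashing node leaves its core file in the shared directory.
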
  • 43. Details WAN Segments ▶ How they work ▶ Random allocation ▶ Joiner starvation ▶ Simulates data center ▶ Donor selection
  • 44. Epilogue The code ▶ GitHub: https://github.com/percona/pxc-docker ▶ Jenkins: http://jenkins.percona.com/job/PXC-5.6-netem/ ▶ Contributions/testing welcome!
  • 45. Epilogue Code: todo ▶ Docker automated builds ▶ Orchestration ▶ Docker ♦ Injection ♦ Signal proxying
  • 47. Epilogue Future work ▶ Fault injection ♦ Memory - Poisoned memory ♦ Disk - libeatmydata - Opposite - ENOSPC
  • 48. Epilogue Fault injection ▶ CPU - NUMA? - Hotplug ▶ More network - corruption, duplication, reordering, rate-limit - Better distribution - Other shaping
  • 49. Epilogue Further Reading ▶ Byzantine fault tolerance - Reaching agreement in presence of faults ▶ The Network is Reliable ▶ NetEm ▶ Latency: The New Web Performance Bottleneck ▶ Galera Cluster Documentation ▶ Auto eviction code ▶ Don’t Settle for Eventual Consistency ▶ Extended Virtual Synchrony
  • 50. Epilogue About ▶ /me: Raghavendra Prabhu, Product Lead, Percona XtraDB Cluster, Percona. ▶ Slides will be at slideshare.net/slidunder. ▶ About.me: raghavendra.prabhu ▶ Keybase.io: rdprabhu ▶ Presentation under CC BY-SA 4.0
  • 51. Epilogue Image Credits ▶ http://galeracluster.com/documentation-webpages/ ▶ https://en.wikipedia.org/wiki/Network_theory ▶ https://www.facebook.com/sciencedump/photos/a.296290153732762.90161.111815475513565/985102638184840/?type=1 ▶ http://www.thebarrow.org/Neurological_Services/Epilepsy/204354 ▶ https://flic.kr/p/9J6GNu ▶ https://secure.flickr.com/photos/brewbooks/7780990192 ▶ https://www.flickr.com/photos/kwerfeldein/2649294869 ▶ https://www.flickr.com/photos/gcwest/281385801 ▶ https://www.flickr.com/photos/bob_in_thailand/9782777742/ ▶ http://schauerte.me/data.html ▶ http://ok-panic.net/art/jeff/dennis.jpg