Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Dock'em: Distributed Systems Testing with NetEm and Docker

1,108 views

Published on

This talk is about distributed systems testing of Galera with NetEm and Docker!

Video of the talk: https://www.youtube.com/watch?v=YBuuvhSO38s&list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2&index=1

Playlist: https://www.youtube.com/playlist?list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2

Published in: Engineering
  • Be the first to comment

Dock'em: Distributed Systems Testing with NetEm and Docker

  1. 1. Dock’em Distributed Systems Testing with NetEm and Docker Linux.conf.au 2015 Raghavendra Prabhu  raghavendra.d.prabhu@gmail.com Percona  raghavendra.prabhu@percona.com  randomsurfer  wnohang.net  rdprabhu  ronin13
  2. 2. Split Brain?
  3. 3. Our Cluster
  4. 4. Introduction Seed quotes.. “ ’Network is reliable’ - a fallacy of the distributed system. ” “ A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. ” - Leslie Lamport “ Never attribute to malice that which is adequately explained by stupidity. ” - Hanlon’s Razor “ Never attribute to Byzantine failure which can be explained by an ill node(s) ” - Me Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 4 / 41
  5. 5. Introduction Seed quotes.. “ ’Network is reliable’ - a fallacy of the distributed system. ” “ A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. ” - Leslie Lamport “ Never attribute to malice that which is adequately explained by stupidity. ” - Hanlon’s Razor “ Never attribute to Byzantine failure which can be explained by an ill node(s) ” - Me Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 4 / 41
  6. 6. Introduction Seed quotes.. “ ’Network is reliable’ - a fallacy of the distributed system. ” “ A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. ” - Leslie Lamport “ Never attribute to malice that which is adequately explained by stupidity. ” - Hanlon’s Razor “ Never attribute to Byzantine failure which can be explained by an ill node(s) ” - Me Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 4 / 41
  7. 7. Introduction Seed quotes.. “ ’Network is reliable’ - a fallacy of the distributed system. ” “ A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. ” - Leslie Lamport “ Never attribute to malice that which is adequately explained by stupidity. ” - Hanlon’s Razor “ Never attribute to Byzantine failure which can be explained by an ill node(s) ” - Me Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 4 / 41
  8. 8. 20000 feet view
  9. 9. Introduction Actors ▶ Database - WSREP/PXC ▶ Plugin - Galera ▶ Traffic control ♦ Traffic Control - tc ♦ NetEm Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 6 / 41
  10. 10. Introduction Actors ▶ Database - WSREP/PXC ▶ Plugin - Galera ▶ Traffic control ♦ Traffic Control - tc ♦ NetEm Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 6 / 41
  11. 11. Introduction Actors ▶ Database - WSREP/PXC ▶ Plugin - Galera ▶ Traffic control ♦ Traffic Control - tc ♦ NetEm Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 6 / 41
  12. 12. Introduction Actors ▶ Containers - Docker ▶ Load ♦ Generators - Sysbench, RQG ▶ Network ♦ Dnsmasq ♦ nsenter Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 7 / 41
  13. 13. Introduction Actors ▶ Containers - Docker ▶ Load ♦ Generators - Sysbench, RQG ▶ Network ♦ Dnsmasq ♦ nsenter Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 7 / 41
  14. 14. Introduction Actors ▶ Jenkins ♦ Build flow and CI ▶ Storage ♦ Why Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 8 / 41
  15. 15. Distributed Systems Testing A Kobayashi Maru!
  16. 16. Details But why ▶ The ‘P’ in CAP ▶ WAN scalability ▶ Real Reason - fun! ▶ Tolerance to latency variance Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 10 / 41
  17. 17. Details But why ▶ The ‘P’ in CAP ▶ WAN scalability ▶ Real Reason - fun! ▶ Tolerance to latency variance Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 10 / 41
  18. 18. Details But why ▶ The ‘P’ in CAP ▶ WAN scalability ▶ Real Reason - fun! ▶ Tolerance to latency variance Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 10 / 41
  19. 19. Details But why ▶ The ‘P’ in CAP ▶ WAN scalability ▶ Real Reason - fun! ▶ Tolerance to latency variance Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 10 / 41
  20. 20. Details But why ▶ Failures in warehouses. ▶ Not quorum, but consensus. ▶ Real world networks and synchronous replication - Delay - Partition Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 11 / 41
  21. 21. Galera
  22. 22. Details Galera ▶ Data-centric approach ▶ EVS ▶ Causality and Synchronous ▶ Latency Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 13 / 41
  23. 23. One can bring the whole down
  24. 24. Details Tests ▶ Chaos testing ▶ Flow control with sysbench ▶ Network Loss ▶ Future Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 18 / 41
  25. 25. There is no higher menace than distributed systems testing
  26. 26. Details Tests: Chaos testing ▶ Nodes killed at random around sysbench ▶ Less than half of nodes are chosen ▶ docker inspect && SIGKILL ▶ Configurable sleep && retry ♦ Snapshot/Incremental State Transfer ▶ docker restart && repeat Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 20 / 41
  27. 27. Details Tests: Network Loss ▶ Loss nodes ▶ Detach/Keep qdisc ▶ Reconciliation ▶ Sanity checks ▶ Formation of PC || time to recover Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 21 / 41
  28. 28. Containers!
  29. 29. Details Docker ▶ Why not virtualize ♦ Occam ♦ Namespaces ▶ Simplicity ♦ Network ♦ One application per node Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 23 / 41
  30. 30. Details Docker ▶ Portability - See same qualitative behavior that I do. ▶ Reproducibility - Makes it determinstic ▶ Configurable and CI - Byproducts Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 24 / 41
  31. 31. Details Docker ▶ QEMU vis-à-vis Docker ▶ Scalability ♦ Performance ♦ Feature ▶ Abstraction of channels Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 25 / 41
  32. 32. Details Container Networking ▶ Linking didn’t help ▶ Dnsmasq to rescue! ♦ Hosts file and volumes ♦ SIGHUP and refresh ▶ Potential issues Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 26 / 41
  33. 33. Details Noise ▶ Initial setup - Bridge - Egress only - IFB ▶ Present state ▶ NetEm - tc qdisc buckets - packet loss, delay, corruption, duplication, reordering - nsenter ▶ Future - Docker exec Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 27 / 41
  34. 34. Details Other noises ▶ Aim ▶ Fsync - libeatmydata - Variance ▶ Correlation with network ▶ How with Docker - LD_PRELOAD Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 28 / 41
  35. 35. The Fix
  36. 36. Details Eviction ▶ STONITH ▶ Permanent eviction ▶ ’N’ strikes & out! Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 30 / 41
  37. 37. Details Eviction ▶ Aim ▶ Quorum required - Why? - Not shoot each other - Non-PC nodes also. Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 31 / 41
  38. 38. Details Eviction ▶ Aim ▶ Quorum required - Why? - Not shoot each other - Non-PC nodes also. Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 31 / 41
  39. 39. Details Coredumps with Docker ▶ Breakdown of abstraction ▶ Lack of isolation ▶ What was done - Volumes - core_pattern & sysctl - suid and ulimit Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 32 / 41
  40. 40. Details WAN Segments ▶ How they work ▶ Random allocation ▶ Joiner starvation ▶ Simulates data center ▶ Donor selection Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 33 / 41
  41. 41. Epilogue The code ▶ Github: https://github.com/percona/pxc-docker ▶ Jenkins: http://jenkins.percona.com/job/PXC-5.6-netem/ ▶ Contributions/testing welcome! Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 34 / 41
  42. 42. Epilogue Code: todo ▶ Docker automated builds ▶ Orchestration ▶ Docker ♦ Injection ♦ Signal proxying Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 35 / 41
  43. 43. Future work
  44. 44. Epilogue Future work ▶ Fault injection ♦ Memory - Poisoned memory ♦ Disk - libeatmydata - Opposite - ENOSPC Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 37 / 41
  45. 45. Epilogue Fault injection ▶ CPU - NUMA? - Hotplug ▶ More network - corruption, duplication, reordering, rate-limit - Better distribution - Other shaping Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 38 / 41
  46. 46. Epilogue Further Reading ▶ Byzantine fault tolerance - Reaching agreement in presence of faults ▶ The Network is Reliable ▶ NetEm ▶ Latency: The New Web Performance Bottleneck ▶ Galera Cluster Documentation ▶ Auto eviction code ▶ Don’t Settle for Eventual Consistency ▶ Extended Virtual Synchrony Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 39 / 41
  47. 47. Epilogue About ▶ /me: Raghavendra Prabhu, Product Lead, Percona XtraDB Cluster, Percona. ▶ Slides will be at slideshare.net/slidunder. ▶ About.me: raghavendra.prabhu ▶ Keybase.io: rdprabhu ▶ Presentation under CC BY-SA 4.0 Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 40 / 41
  48. 48. Epilogue Image Credits ▶ http://galeracluster.com/documentation-webpages/ ▶ https://en.wikipedia.org/wiki/Network_theory ▶ https://www.facebook.com/sciencedump/photos/a.296290153732762.90161. 111815475513565/985102638184840/?type=1 ▶ http://www.thebarrow.org/Neurological_Services/Epilepsy/204354 ▶ https://flic.kr/p/9J6GNu ▶ https://secure.flickr.com/photos/brewbooks/7780990192 ▶ https://www.flickr.com/photos/kwerfeldein/2649294869 ▶ https://www.flickr.com/photos/gcwest/281385801 ▶ https://www.flickr.com/photos/bob_in_thailand/9782777742/ ▶ http://schauerte.me/data.html ▶ http://ok-panic.net/art/jeff/dennis.jpg Raghavendra Prabhu (Percona) Dock’em 13 January, 2015 41 / 41

×