Testing microservices is challenging. Dividing a system into components naturally creates inter-service dependencies, and each service has its own performance and fault-tolerance characteristics that need to be validated during development, the QA process, and continually in production. Join Daniel Bryant to learn about the theory, techniques and practices needed to overcome this challenge.
– Introduction to the challenges of testing distributed microservice systems
– Learn how to isolate tests within a complex microservice ecosystem
– Introduction to consumer-driven contract testing
– Explore how API simulation can be used for testing work undertaken during DevOps, legacy system and high-volume load testing
– Implementing fault-injection testing to validate nonfunctional requirements in development and QA
– An introduction and discussion of the need for continually validating microservice systems running in production, both through observability and chaos engineering
3. Was Rube Goldberg the first microservice architect?
09/10/2018 @danielbryantuk
4. tl;dr
• A lot of microservice testing attempts I see are at the fringes of a spectrum from “YOLO” to seeking absolute correctness
• I believe the key tradeoffs should be around pre-prod vs post-prod tests
• Contract testing, API simulation and chaos experimentation can be useful techniques for microservice testing
5. @danielbryantuk
• Independent Technical Consultant, Product Architect at Datawire
• Architecture, DevOps, Java, microservices, cloud, containers
• Continuous Delivery (CI/CD) advocate
• Leading change through technology and teams
bit.ly/2jWDSF7
oreil.ly/2E63nCR
8. The test pyramid (is just a model)
• This model was created before the rise in popularity of microservices…
• …but after David Parnas’ modularity
• Applies at system and service level
• Probably needs updating…
martinfowler.com/bliki/TestPyramid.html
9. New testing strategies for microservices
https://medium.com/@copyconstruct/testing-microservices-the-sane-way-9bb31d158c16 http://distributed-systems-observability-ebook.humio.com/
12. I’m not suggesting that you avoid unit tests
https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf https://www.youtube.com/watch?v=ZMbqbXxRthE
13. I’m not suggesting that you avoid unit tests
• 77% of production failures can be reproduced by a unit test
• Testing error handling code could have prevented 58% of catastrophic failures
• 35% of catastrophic failures were caused by trivial error-handling mistakes:
  • Empty error handler, or contains FIXME
  • Error handler aborts system
https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf
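The point about error handlers can be illustrated with a minimal Python sketch. Everything named here (`PaymentGatewayError`, `charge_order`, the `gateway` object) is hypothetical and only stands in for a service call with a failure path. The tests force the dependency to fail and check that the handler neither swallows the error silently nor aborts on errors it does not understand.

```python
import unittest
from unittest import mock


class PaymentGatewayError(Exception):
    """Hypothetical error raised by a downstream dependency."""


def charge_order(gateway, order_id, amount):
    """Charge an order and report a failure instead of hiding it."""
    try:
        receipt = gateway.charge(order_id, amount)
    except PaymentGatewayError as exc:
        # Handle only the specific, expected error: the handler is not
        # empty and it does not abort the whole process.
        return {"order_id": order_id, "status": "payment_failed", "reason": str(exc)}
    return {"order_id": order_id, "status": "paid", "receipt": receipt}


class ChargeOrderErrorHandlingTest(unittest.TestCase):
    def test_gateway_failure_is_reported(self):
        gateway = mock.Mock()
        gateway.charge.side_effect = PaymentGatewayError("timeout")
        result = charge_order(gateway, "o-1", 10)
        self.assertEqual(result["status"], "payment_failed")
        self.assertEqual(result["reason"], "timeout")

    def test_unexpected_errors_are_not_swallowed(self):
        gateway = mock.Mock()
        gateway.charge.side_effect = RuntimeError("bug")
        with self.assertRaises(RuntimeError):
            charge_order(gateway, "o-1", 10)
```

Saved as a module, the tests can be run with `python -m unittest`. The second test is the one that tends to be missing: it pins down that the handler is not overly general.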
14. Integration/component tests
If your component/integration tests look too complicated, they probably are
Coupling and cohesion apply to everything!
18. General strategies
• Test outside-in
• Acceptance tests for system and services
• “LUFD” the context and TDD the API
• Virtualise dependencies
• Test contracts of unstable APIs
• Invest in monitoring, synthetic transactions and chaos engineering (in this order)
https://itnext.io/microservice-testing-coupling-and-cohesion-all-the-way-down-a9f100cda523
19. Let’s look at some techniques in more depth
21. So, where do contracts fit into this…
martinfowler.com/bliki/TestPyramid.html
[Figure: test pyramid annotated with the labels “Contract Tests”, “Focused on system” and “Focused on service/function”]
22. API contracts
• APIs are service contracts
• Many are producer-driven
• It’s possible to design outside-in:
• Consumer-Driven Contracts
• martinfowler.com/articles/consumerDrivenContracts.html
24. CDC workflow
1. Consumer writes a contract that defines an interaction with the API.
   1. For HTTP RPC this is simply a request with acceptable params and the expected response
   2. Often the contract can be autogenerated from a test
2. Consumer issues a pull request to the producer containing the contract
3. Producer runs the SUT (via pipeline) and tests whether the contract is valid
   1. If yes, then simply accept the pull request
   2. If no, then modify the SUT to meet the contract (this often involves inter-team communication), and then accept the pull request
4. Producer deploys (via pipeline), and consumer deploys (via pipeline)
   1. Take care with regard to backwards compatibility
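Steps 1 and 3 of this workflow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the API of Pact or any other contract-testing tool: the contract format, `producer_handler` and `verify_contract` are all hypothetical, and a real pipeline would run the check against the deployed SUT rather than an in-process function.

```python
# Hypothetical contract written by the consumer (step 1): one request and
# the parts of the response that the consumer actually relies on.
CONTRACT = {
    "request": {"method": "GET", "path": "/users/42"},
    "response": {"status": 200, "body_fields": {"id": int, "name": str}},
}


def producer_handler(method, path):
    """Stand-in for the producer's SUT; returns (status, body)."""
    if method == "GET" and path.startswith("/users/"):
        user_id = int(path.rsplit("/", 1)[1])
        return 200, {"id": user_id, "name": "Ada", "email": "ada@example.com"}
    return 404, {}


def verify_contract(contract, handler):
    """Return a list of violations; an empty list means the contract holds (step 3)."""
    request, expected = contract["request"], contract["response"]
    status, body = handler(request["method"], request["path"])
    violations = []
    if status != expected["status"]:
        violations.append(f"status {status} != {expected['status']}")
    for field, field_type in expected["body_fields"].items():
        if field not in body:
            violations.append(f"missing field {field!r}")
        elif not isinstance(body[field], field_type):
            violations.append(f"field {field!r} is not {field_type.__name__}")
    return violations


if __name__ == "__main__":
    print(verify_contract(CONTRACT, producer_handler))
```

The check deliberately ignores extra fields such as `email`, so the producer can add to a response without breaking consumers. Removing or retyping a field that a contract names is what fails the pipeline.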
26. CDC for messaging
• What about messaging?
• Message schemas are an API
• Pact supports AMQP contracts
• www.infoq.com/presentations/contracts-streaming-microservices
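Treating a message schema as an API suggests checking schema changes the same way as endpoint changes. The Python sketch below uses a deliberately simplified notion of compatibility (a field may be added, but not removed or retyped). The schema format and the `breaking_changes` helper are assumptions for illustration and do not reflect how Pact or a schema registry represents or checks schemas.

```python
def breaking_changes(old_schema, new_schema):
    """List the fields that consumers of old_schema rely on but new_schema breaks.

    Schemas are plain dicts mapping field name to a type name (hypothetical format).
    """
    problems = []
    for field, field_type in old_schema.items():
        if field not in new_schema:
            problems.append(f"removed field {field!r}")
        elif new_schema[field] != field_type:
            problems.append(
                f"retyped field {field!r}: {field_type} -> {new_schema[field]}"
            )
    return problems


ORDER_CREATED_V1 = {"order_id": "string", "total": "int"}
ORDER_CREATED_V2 = {"order_id": "string", "total": "int", "currency": "string"}
ORDER_CREATED_V3 = {"order_id": "string", "total": "string"}

if __name__ == "__main__":
    print(breaking_changes(ORDER_CREATED_V1, ORDER_CREATED_V2))
    print(breaking_changes(ORDER_CREATED_V1, ORDER_CREATED_V3))
```

Run in the producer's pipeline, a non-empty result would block the change until the consuming teams have been consulted.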
27. CDC for messaging
www.infoq.com/presentations/contracts-streaming-microservices
docs.confluent.io/current/schema-registry/docs/maven-plugin.html
28. Contract testing musings
• Great in low trust or poor communication organisations
• Act as a cue for a conversation
• Can be used to implement TDD for the API
• Resource intensive to create and maintain
37. API simulation musings
• Great when a dependency is “expensive” to access or tricky to mock
• Useful when failure modes of dependency are hard to recreate
• Simulations can be fragile and/or complicated
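A small Python sketch of the second point, recreating a failure mode that is hard to trigger on the real dependency. The names (`SimulatedStockApi`, `SimulatedTimeout`, `stock_with_retry`) are hypothetical, and the simulation is an in-process object rather than an HTTP-level tool, which keeps the example short but is an assumption about how the simulation is wired in.

```python
class SimulatedTimeout(Exception):
    """Hypothetical stand-in for a network timeout from the real dependency."""


class SimulatedStockApi:
    """Replays a scripted sequence of outcomes instead of calling the real API."""

    def __init__(self, script):
        self._script = list(script)
        self.calls = 0

    def get_stock(self, sku):
        self.calls += 1
        # Once the script is exhausted, fall back to a default response.
        outcome = self._script.pop(0) if self._script else {"sku": sku, "stock": 0}
        if isinstance(outcome, Exception):
            raise outcome
        return outcome


def stock_with_retry(api, sku, attempts=3):
    """Consumer code under test: retry on timeout, give up after `attempts` calls."""
    for attempt in range(attempts):
        try:
            return api.get_stock(sku)
        except SimulatedTimeout:
            if attempt == attempts - 1:
                raise


if __name__ == "__main__":
    # Two timeouts followed by a success: a sequence that is awkward to
    # reproduce on demand against a real service.
    api = SimulatedStockApi(
        [SimulatedTimeout(), SimulatedTimeout(), {"sku": "a", "stock": 5}]
    )
    print(stock_with_retry(api, "a"), api.calls)
```

The script makes the failure sequence explicit and repeatable. It also shows the fragility noted above: the simulation only stays useful while its scripted responses match what the real dependency returns.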
48. Chaos engineering prerequisites
Tammy Butow’s three prerequisites:
1. High severity incident management
2. Monitoring
3. Measure the impact of downtime
https://www.infoq.com/news/2018/03/resilient-systems-chaos-engineer
50. Chaos engineering musings
• Great for codifying/asserting system quality attributes
• Can prompt team to think about monitoring and DR/BC
• Can cause a lot of damage if approached casually
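The shape of a cautious chaos experiment can be sketched in Python against a toy in-memory system. All of it is hypothetical (`Replica`, `call_with_failover`, `run_experiment`, the 0.99 threshold); real experiments target running infrastructure and rely on real monitoring. The sketch only shows the order of operations: confirm steady state, inject one small fault, check the hypothesis, and always roll back.

```python
import random


class Replica:
    """Toy service instance that can be switched off to simulate a fault."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def call_with_failover(replicas, request):
    """Try replicas in order until one succeeds."""
    for replica in replicas:
        try:
            return replica.handle(request)
        except ConnectionError:
            continue
    raise ConnectionError("all replicas are down")


def success_rate(replicas, requests=100):
    """Fraction of requests that succeed; stands in for a monitoring query."""
    ok = 0
    for i in range(requests):
        try:
            call_with_failover(replicas, i)
            ok += 1
        except ConnectionError:
            pass
    return ok / requests


def run_experiment(replicas, threshold=0.99, rng=random):
    """Check steady state, inject one fault, re-check, then always roll back."""
    if success_rate(replicas) < threshold:
        return "aborted: system not in steady state"
    victim = rng.choice(replicas)
    victim.healthy = False  # Fault injection with a blast radius of one replica.
    try:
        held = success_rate(replicas) >= threshold
    finally:
        victim.healthy = True  # Roll back even if the check itself fails.
    return "hypothesis held" if held else "weakness found"


if __name__ == "__main__":
    print(run_experiment([Replica("a"), Replica("b")]))
    print(run_experiment([Replica("solo")]))
```

The steady-state check before injection and the unconditional rollback are the parts that guard against the damage a casual approach can cause.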
52. Conclusion
• Try to avoid microservice testing strategies that are solely YOLO or that seek absolute correctness
• Balance pre-prod tests (generally technology facing and supporting the team) vs post-prod tests (generally business facing and critiquing the product)
• Contract testing, API simulation and chaos experimentation can be useful techniques for microservice testing
We present the result of a comprehensive study investigating 198 randomly selected, user-reported failures that occurred on Cassandra, HBase, Hadoop Distributed File System (HDFS), Hadoop MapReduce, and Redis, with the goal of understanding how one or multiple faults eventually evolve into a user-visible failure. We found that from a testing point of view, almost all failures require only 3 or fewer nodes to reproduce, which is good news considering that these services typically run on a very large number of nodes. H