Building Resilient Distributed Systems by Using Caching Command and Rollback-Replay

At Grace Hopper Conference 2016, Tanuja Phadke discusses the problem with resiliency in distributed systems.

Published in: Technology

  1. GRACE HOPPER CELEBRATION 2016 | #GHC16 | PRESENTED BY THE ANITA BORG INSTITUTE AND THE ASSOCIATION FOR COMPUTING MACHINERY
     Building Resilient Distributed Systems by Using Caching Command and Rollback-Replay
     Tanuja Phadke, tanuja_phadke@intuit.com
  2. The problem with resiliency in distributed systems: single-node system
     [Diagram labels: Node, Web container, caching, Database1, Database2]
       • All components reside on the same machine.
       • It is not too hard to ensure atomicity: either all operations occur or none do.
  3. The problem with resiliency in distributed systems: multiple nodes
     [Diagram labels: node, node1, node2]
       • Components are spread out across nodes.
       • Maintaining atomicity and resiliency is a challenge.
       • So we strive for eventual consistency.
       • A change will eventually be propagated to all copies of the data.
  4. Intuit case study: Login Service
     Intuit makes financial software. Many of its products use the Login Service for login and for securely fetching users' bank accounts and transactions.
  5. Requirements for the Login Service
       • Fast response times
       • Resilience
       • Fault tolerance
       • Consistency
  6. 4-step solution we used to solve the problem
       1. Decouple design
            a. Implement single responsibility principle (SRP)
            b. Use the command pattern
       2. Use circuit breaker framework
       3. Use reactor to recover
       4. Use caching (record)
  7. 1. Decouple design
       • Individual components can be developed independently.
       • Components can be plugged into a bigger solution (plug and play).
  8. 1a. Implement single responsibility principle
       • A module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
       • Separation of concerns
       • Each module or method does only one task.
  9. 1b. Use command pattern
     A behavioral design pattern in which an object is used to encapsulate all information needed to perform an action or trigger an event at a later time.
     [Diagram labels: Client, Invoker, <<interface>> Command with execute() and recover(), Concrete Command A, Concrete Command B; relationships: creates, uses, implements]
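The command interface on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Login Service code: only `Command`, `execute()`, `recover()`, and `Invoker` come from the slide, while `UpdateEmailCommand`, its fields, and the dictionary standing in for a service are hypothetical.

```python
from abc import ABC, abstractmethod


class Command(ABC):
    """Mirrors the slide's Command interface: execute() and recover()."""

    @abstractmethod
    def execute(self) -> None: ...

    @abstractmethod
    def recover(self) -> None: ...


class UpdateEmailCommand(Command):
    """Hypothetical concrete command: carries everything needed to act later."""

    def __init__(self, store: dict, user_id: str, new_email: str) -> None:
        self.store = store          # stand-in for a remote service
        self.user_id = user_id
        self.new_email = new_email
        self.previous = None        # captured in execute() so recover() can roll back

    def execute(self) -> None:
        self.previous = self.store.get(self.user_id)
        self.store[self.user_id] = self.new_email

    def recover(self) -> None:
        # Rollback: restore the value seen before execute() ran.
        if self.previous is None:
            self.store.pop(self.user_id, None)
        else:
            self.store[self.user_id] = self.previous


class Invoker:
    """Runs a command and asks it to recover if execution raises."""

    def run(self, command: Command) -> bool:
        try:
            command.execute()
            return True
        except Exception:
            command.recover()
            return False
```

Because each command carries its own data and its own recovery logic, the invoker needs no knowledge of what a particular command does.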
  10. Benefits of the command pattern
       • Each command knows how to execute itself.
       • Each command knows how to react to failures:
            • Rollback
            • Retry
            • Something else
  11. Traditional model with services
     [Diagram labels: Orchestration Handler (GET, PUT); Service A and Service B, each with create, update, delete, get]
  12. Introduce commands
     [Diagram labels: Orchestration Handler (GET, PUT); a Command (create, update, ...) for each service; Service A and Service B, each with create, update, delete, get]
  13. 2. Use circuit breaker
       • A circuit breaker detects failures and encapsulates the logic for reacting to them (during maintenance, temporary external system failures, or unexpected system difficulties).
       • The circuit breaker pattern is a stability pattern that can be applied in a RESTful architecture.
       • Several open-source implementations are available; Hystrix, developed by Netflix, is a popular one.
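To make the idea concrete, here is a minimal, hand-rolled circuit breaker in Python. It is an illustrative sketch only and does not reproduce the Hystrix API; the class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all assumptions made for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is short-circuited without reaching the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, fallback=None, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of waiting on a struggling service.
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: half-open, so one trial call is let through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            if fallback is not None:
                return fallback()
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers get the fallback (or an error) immediately, which is what allows a service to fail fast rather than pile up slow requests.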
  14. Example of circuit breaker
  15. 3. Use reactor
       • Gets invoked in case of failure.
       • We can specify the behavior:
            • Rollback
            • Retry
            • Trigger a back-up
            • Fallback
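One possible shape for such a reactor is sketched below in Python. The slides do not show the reactor's interface, so `Reactor`, `on_failure`, `run_with_reactor`, and the ordering (retry first, then rollback, then fallback) are assumptions chosen for illustration.

```python
from typing import Callable, Optional


class Reactor:
    """Invoked when an action fails; the caller specifies the behavior."""

    def __init__(self, retries: int = 0,
                 rollback: Optional[Callable[[], None]] = None,
                 fallback: Optional[Callable[[], object]] = None) -> None:
        self.retries = retries
        self.rollback = rollback
        self.fallback = fallback

    def on_failure(self, action: Callable[[], object], error: Exception) -> object:
        # 1. Retry the failed action a bounded number of times.
        for _ in range(self.retries):
            try:
                return action()
            except Exception as exc:
                error = exc
        # 2. Retries exhausted: undo partial work if a rollback was supplied.
        if self.rollback is not None:
            self.rollback()
        # 3. Return a fallback value if one exists, otherwise surface the last error.
        if self.fallback is not None:
            return self.fallback()
        raise error


def run_with_reactor(action: Callable[[], object], reactor: Reactor) -> object:
    """Run the action once; hand control to the reactor only if it raises."""
    try:
        return action()
    except Exception as exc:
        return reactor.on_failure(action, exc)
```

Keeping the failure policy in one object means the same action can be run with different reactions (retry only, rollback only, or both) without changing the action itself.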
  16. Use circuit breaker
     [Diagram labels: Orchestration Handler (GET, PUT); Command and Circuit breaker for each service; Fallback, Short circuit, Log error, Error response; Service A and Service B, each with create, update, delete, get]
  17. 4. Use caching
       • Use a cache to save the commands so that they can be used for recovery.
       • Some popular open-source solutions:
            • Hazelcast
            • Memcached
            • Redis
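The recording step might look like the Python sketch below. An in-memory dictionary stands in for Hazelcast, Memcached, or Redis, and none of those libraries' client APIs are used; `CommandRecord`, `CommandCache`, the `dirty` flag handling, and the listener callback signature are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CommandRecord:
    name: str
    payload: dict
    dirty: bool = False  # set once the overall request is known to have failed


@dataclass
class CommandCache:
    """In-memory stand-in for a distributed cache."""
    entries: Dict[str, List[CommandRecord]] = field(default_factory=dict)
    listeners: List[Callable[[str, List[CommandRecord]], None]] = field(default_factory=list)

    def record(self, request_id: str, name: str, payload: dict) -> None:
        # Save each executed command under the request it belongs to.
        self.entries.setdefault(request_id, []).append(CommandRecord(name, payload))

    def mark_dirty(self, request_id: str) -> None:
        # Flag the recorded commands and notify listeners (the reactor's hook).
        records = self.entries.get(request_id, [])
        for rec in records:
            rec.dirty = True
        for listener in self.listeners:
            listener(request_id, records)

    def clear(self, request_id: str) -> None:
        # Nothing left to recover once the request is finished.
        self.entries.pop(request_id, None)
```

A listener registered on the cache receives the recorded commands for a failed request, which gives it the information needed to roll them back.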
  18. 4. Use caching
     [Diagram labels: Orchestration Handler (GET, PUT); Fallback, Short circuit, Log error, Error response; Cache Client, Cache, Cache Listener [Reactor]; Service A and Service B, each with create, update, delete, get]
  19. Full resilient picture
     [Diagram labels: Orchestration Handler (GET, PUT); Service A and Service B, each with create, update, delete, get]
  20. Service A fails
     [Diagram labels: Orchestration Handler (GET, PUT); Fallback, Short circuit, Log error, Error response; Service A and Service B, each with create, update, delete, get]
  21. Service A is successful and Service B fails
     [Diagram labels: Orchestration Handler (GET, PUT); Fallback, Short circuit, Log error, Error response; Cache Client, Cache, Cache Listener [Reactor]; Reactor (recover); Caching: dirty; Service A and Service B, each with create, update, delete, get]
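The partial-failure case (A succeeds, then B fails) can be sketched as follows in Python. This is an assumed reading of the diagram, not the presenter's implementation: `Step`, `orchestrate`, and the plain dictionary used as the cache are hypothetical, and the rollback runs inline rather than through a cache listener.

```python
class Step:
    """Hypothetical command wrapper: a forward action plus its compensating undo."""

    def __init__(self, name, execute, recover):
        self.name = name
        self.execute = execute
        self.recover = recover


def orchestrate(steps, cache, request_id):
    """Run steps in order; on failure, mark the entry dirty and roll back."""
    cache[request_id] = {"done": [], "dirty": False}
    entry = cache[request_id]
    for step in steps:
        try:
            step.execute()
        except Exception:
            entry["dirty"] = True
            # Reactor role: undo the completed steps, most recent first.
            for finished in reversed(entry["done"]):
                finished.recover()
            entry["done"].clear()
            return False
        # Record the successful command so it can be recovered later if needed.
        entry["done"].append(step)
    return True
```

When every step succeeds, the entry is never marked dirty and no recovery runs; when a later step fails, the earlier steps are undone so the services are not left half-updated.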
  22. Service A and Service B both succeed
     [Diagram labels: Orchestration Handler (GET, PUT); Fallback, Short circuit, Log error, Error response; Cache Client, Cache, Cache Listener [Reactor]; Caching (record); Not dirty; Service A and Service B, each with create, update, delete, get]
  23. 4-step solution we used to solve the problem
       1. Decouple design
            a. Implement single responsibility principle (SRP)
            b. Use the command pattern
       2. Use caching (record)
       3. Use circuit breaker framework
       4. Use reactor to recover
  24. Our story: How did we benefit?
     Over 100 user update requests were failing.
       • Users got slow responses.
       • The failures resulted in high CPU utilization and cascading failures.
     After we implemented this solution, we failed fast and could adhere to the SLAs.
  25. For more info ...
       • Retry pattern: https://msdn.microsoft.com/en-us/library/dn589788.aspx
       • Command handling: http://www.axonframework.org/docs/2.0/command-handling.html
  26. Thank you
     Feedback? Rate and review the session on our mobile app. Download at http://bit.ly/ghc16app or search GHC 16 in the app store.
