Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

No Microservice is an Island

328 views

Published on

You don't deploy a single microservice. The journey to microservice architecture involves more than how code is written or applications are packaged. It's about creating an interconnected ecosystem that keeps things running. Infrastructure and tools have only grown in importance as microservices have emerged as a dominant architecture pattern. Deploying, scaling, and monitoring are more important for microservices than they ever were before. Attendees will leave this session knowing the basic infrastructure and tooling needs for microservices to be successful.

Published in: Software
  • Be the first to comment

  • Be the first to like this

No Microservice is an Island

  1. 1. No Microservice Is An Island Michele Titolo Lead Software Engineer @ Capital One
  2. 2. Microservices Are A Distributed System
  3. 3. Something WILL PROBABLY GO WRONG
  4. 4. How Do You Figure Out What’s Wrong?
  5. 5. How Do You Deploy A Fix?
  6. 6. New Challenges
  7. 7. Microservices Are More Than Application Size
  8. 8. Failure Points++
  9. 9. Microservices Need An Ecosystem
  10. 10. Apps Must Contribute
  11. 11. Infrastructure Must Contribute
  12. 12. Otherwise You’ll Play Catch Up
  13. 13. Three Biggest Challenges
  14. 14. Deployment
  15. 15. Scaling
  16. 16. Debugging
  17. 17. Create A Foundation
  18. 18. Deploying Microservices
  19. 19. Characteristics
  20. 20. Small Releases
  21. 21. Frequent Releases
  22. 22. Consistent Releases
  23. 23. Limit Special Snowflakes
  24. 24. Top-Notch Tooling
  25. 25. Automate Every Aspect
  26. 26. 🙅
  27. 27.
  28. 28. Strategies
  29. 29. Staged Deployments
  30. 30. Canary
  31. 31. R O U T E R
  32. 32. Blue/Green Red/Black
  33. 33. R O U T E R
  34. 34. R O U T E R
  35. 35. R O U T E R
  36. 36. Automatic Rollbacks
  37. 37. How Do You Know A Deployment Is Successful?
  38. 38. What Features Or Tools Need To Exist?
  39. 39. App
  40. 40. - Tests Pass - Health Check - Dependency/Secret Injection
  41. 41. Robust Tests
  42. 42. Unit And Integration
  43. 43. Frequent Deployments ➡ Frequent Testing
  44. 44. Standardized Health Check
  45. 45. Consume Dependencies And Secrets
  46. 46. Dependencies i.e. URL to database
  47. 47. Secrets i.e. username/password to DB
  48. 48. - Tests Pass - Health Check - Dependency/Secret Injection
  49. 49. System
  50. 50. Aggregate Health Checks
  51. 51. Show System State
  52. 52. Successful Deployment?
  53. 53. Define “Successful”
  54. 54. Proactive
  55. 55. Report
  56. 56. Bots
  57. 57. Recap
  58. 58. Once a Deployment Is Finished, Everything’s Done?
  59. 59. Distributed Systems 
 Are Constantly Changing
  60. 60. Scaling Microservices
  61. 61. App Has More Or Less Load
  62. 62. Health Tracked
  63. 63. - RAM - CPU - Latency - Requests per second
  64. 64. Custom Metrics
  65. 65. How Do Microservices Scale?
  66. 66. Automation
  67. 67. Smart System
  68. 68. Trigger Scaling
  69. 69. Scale Up Scale Down
  70. 70. Scale Up Scale Down
  71. 71. Deploy New Instances
  72. 72. Monitor Health Checks
  73. 73. Send Traffic To New Instances
  74. 74. Scale Up Scale Down
  75. 75. Instances Aren’t Being Used
  76. 76. Stop Routing Traffic
  77. 77. Gracefully Shutdown
  78. 78. Disconnect
  79. 79. Report Progress
  80. 80. Ultimately Get Removed
  81. 81. Routing Traffic?
  82. 82. Infrastructure++
  83. 83. Load Balancer Service Discovery
  84. 84. Load Balancer Service Discovery
  85. 85. All Cloud Providers Have One Built-In
  86. 86. Attach To Scaling Group
  87. 87. New Instances Registered
  88. 88. Conversely, Instances Going Away Are Removed
  89. 89. Load Balancer Service Discovery
  90. 90. Routes To Healthy Instances
  91. 91. Apps Register
  92. 92. Route Via Convention
  93. 93. A B Service Discovery GET /b/foo GET /foo
  94. 94. A 🤢 Service Discovery GET /b/foo 🚫
  95. 95. Knows Health, Can Trigger Scaling Automation
  96. 96. Scaling Only Solves So Many Problems
  97. 97. Debugging Microservices
  98. 98. First You Need To Know There’s A Problem
  99. 99. How?
  100. 100. Alerts
  101. 101. What Qualifies As A Problem?
  102. 102. Varies Per App
  103. 103. RAM, CPU, Latency
  104. 104. Less Obvious, Too
  105. 105. Key Performance Indicator
  106. 106. Alerts Are For Known Issues
  107. 107. Dashboards
  108. 108. How Do You Debug An Issue?
  109. 109. First: Check Problematic App
  110. 110. Pretend Ssh Doesn’t Exist
  111. 111. Next Best Thing: Logs
  112. 112. Log Collection And Aggregation
  113. 113. Configurable Log Levels
  114. 114. Inspect Logs
  115. 115. Non-Trivial Issue ➡ Isolate
  116. 116. Avoid Cascading Failures
  117. 117. How Do You Isolate A Service?
  118. 118. Identify Isolate
  119. 119. Identify Isolate
  120. 120. Q: What Parts Of The System Depend On Another?
  121. 121. Request Tracing
  122. 122. Unique Request Ids
  123. 123. Best As Overlay
  124. 124. Opentracing
  125. 125. Add Header Manually
  126. 126. Identify Isolate
  127. 127. Circuit Breaking
  128. 128. Let App Recover
  129. 129. App Or System
  130. 130. Fail Gracefully
  131. 131. Deploy A Fix
  132. 132. Slowly Start Sending Traffic
  133. 133. Can Be Done Automatically
  134. 134. Built-In Load Balancers Service Discovery
  135. 135. Otherwise Update Manually
  136. 136. Everything Returns To Normal
  137. 137. What If The Problem Isn’t One Of Your Apps?
  138. 138. Do You Have An External Dependency?
  139. 139. No Access To Logs
  140. 140. No Access To Source
  141. 141. You Can’t Deploy A Fix
  142. 142. Fewer Tools
  143. 143. Damage Control
  144. 144. How Limit Impact?
  145. 145. Status Page
  146. 146. Monitor The Status Page
  147. 147. If Status Page Shows No Issue, Raise It
  148. 148. Keeping Your System Healthy
  149. 149. What’s Inside Your Control?
  150. 150. Can’t See Logs
  151. 151. Request Tracing
  152. 152. Circuit Breaking
  153. 153. Degrade Gracefully
  154. 154. Sometimes Infrastructure Fails
  155. 155. Ex: S3 Outage In 2017
  156. 156. Circuit Breaking
  157. 157. To Recap
  158. 158. Debugging Internal Apps
  159. 159. Log, Trace, Circuit Break
  160. 160. Debugging External Apps
  161. 161. Trace And Circuit Break
  162. 162. One Thing All Of These Have In Common…
  163. 163. Metrics
  164. 164. Monitoring
  165. 165. Gain Insights Into Distributed System
  166. 166. Not Free
  167. 167. Apps Updated
  168. 168. Health Checks
  169. 169. Infrastructure Updated
  170. 170. Monitor Health Checks
  171. 171. Gather Information
  172. 172. Create Confidence That System Is Healthy
  173. 173. Creating An Ecosystem For Microservices Is Doable
  174. 174. Requires More Metrics
  175. 175. Requires Smarter Infrastructure
  176. 176. Infrastructure won’t evolve on its own; people need to change their behavior and fundamentally think of what it takes to run an application a different way. Nova, Kris and Garrison, Justin (2017). Cloud Native Infrastructure. O’Reilley.
  177. 177. Build Software To Manage Software
  178. 178. Build Software To Support A Healthy Ecosystem
  179. 179. Thank You! @micheletitolo
  180. 180. • https://unsplash.com/photos/-lp8sTmF9HA • https://unsplash.com/photos/Rfflri94rs8 • https://unsplash.com/photos/ B87VRCWmpqU • https://unsplash.com/photos/aLDuLRjZknE • https://unsplash.com/photos/xal6tVZRCGc • https://unsplash.com/photos/ VRAxb_gNjQQ Photo Credits

×