Miniature Guide to Operational Features - EdinDevOps - SkeltonThatcher

Treating operational aspects of software as 'non-functional requirements' and 'an Ops problem' rather than a core part of the software product leads to poor live service and unexplained errors in Production.

However, many Product Managers understandably feel uneasy about taking on the (necessary) responsibility for prioritising operational features alongside user-visible and API features.

This session aims to bring Scrum Masters and Product Owners up to speed on operational features, empowering them to make effective prioritisation choices about all kinds of product features, whether user-visible or operational.

Published in: Software

  1. The Miniature Guide to Operational Features; Edinburgh DevOps Meetup – 15th September 2015; Rob Thatcher & Matthew Skelton
  2. “Operational Features”: how to develop and test; prioritisation techniques
  3. availability is the best feature
  4. Operational Features
  5. “the properties of a system which make it work well in Production”
  6. Not PIMP MY RIDE; MORE Greasy Mechanic
  7. Not PIMP MY RIDE; MORE Greasy Mechanic
  8. Terminology
  9. what happened to NFRs? (non-functional requirements)
  10. Non-Functional; Functional
  11. language impact
  12. non-starter; non compos mentis; non-compete
  13. nonsense!
  14. holistic product view
  15. How did we get to this?
  16. admission: IT folk have been guilty of making operational features quite scary & mysterious
  17. long lists of requirements; crazy test plans
  18. poor explanation of needs; failure to engage stakeholders; gold-plating
  19. de-mystify operational features
  20. better approach: pragmatic and effective; rapid, safe, valuable
  21. “the properties of a system which make it work well in Production”
  22. Why value Operational Features?
  23. downtime: $$$; reputation ($$)
  24. non-linear increase in complexity and problems
  25. Internet of Things
  26. we can no longer deal manually with the scale/volume of potential problems
  27. agility and response to incidents
  28. remote car hacking: security as an operational feature
  29. HA + DR + Backup + Metrics + Diagnostics + …
  30. think: “when it fails, how will we recover?”; it will fail
  31. How do we develop and test Operational Features?
  32. defined features; testable and measurable
  33. ahead lie the ‘ilities’...
  34. 1. What 2. How to test
  35. Operational Hooks
  36. Deployment Pipeline
  37. Configurability
  38. re-read config (SIGHUP); text files in version control; inject settings – no ‘black boxes’
  39. toggle features via config; “Postcode lookup unavailable” → better UX
  40. Deployability
  41. immutable artefacts; concurrent releases (SxS); symlinks
  42. rapid; scriptable; simple failure modes
  43. Maintainability
  44. holding page as MVP!
  45. live system component diagrams
  46. modularity; ability to upgrade; version numbering (SemVer?)
  47. Testability
  48. every component has a /health endpoint
  49. stubbed/mocked/faked endpoints; test things individually
  50. Recoverability
  51. asynchronous service start; expect services to be erroring; logs are not wiped (rotated: okay); avoid flooding logs
  52. no nasty zombies after failures; MTTR more important than MTBF* (* for most kinds of F)
  53. Performance
  54. run key 'hotspot' areas early; use a deployment pipeline ‘critical path’
  55. early pipeline tests act as a barometer for later performance problems
  56. derive transit time metrics
  57. Monitorability
  58. stream of metrics; transaction tracing
  59. BasketItemAdded; grep BasketItem
  60. logging for insights
  61. Resilience
  62. Saboteur for network failure testing; deployment pipeline
  63. assume missing or failing; Chaos Monkey; don’t crash on HTTP 503
  64. Scalability
  65. concurrent workers; queues and bottlenecks; throttling is your friend
  66. Security and ‘securability’
  67. securability by practice; SSL certs & HEARTBLEED
  68. Gauntlt; deployment pipeline
  69. # nmap-simple.attack
      Feature: simple nmap attack to check for open ports

        Background:
          Given "nmap" is installed
          And the following profile:
            | name     | value       |
            | hostname | example.com |

        Scenario: Check standard web ports
          When I launch an "nmap" attack with:
            """
            nmap -F <hostname>
            """
          Then the output should match /80.tcp\s+open/
          Then the output should not match:
            """
            25\/tcp\s+open
            """
  70. Availability
  71. “available but unusable”; synthetic transactions
  72. special HTTP header: trigger additional metrics/reporting
  73. How the organisation affects Operational Features
  74. Budgets
  75. bonuses: story points delivered, tickets closed
  76. Capex vs Opex; tax breaks
  77. avoiding the Capex/Opex evil
  78. Developers seen as more valuable than Ops people; 3x hiring bonus for Devs (!)
  79. improved awareness in product teams
  80. share ownership and decision making
  81. features end-user operational end-user
  82. single product backlog
  83. Product Owner on call for incidents
  84. tricky! high degree of maturity; honesty about the product
  85. Product Owner and Tech Lead are both on the hook for outages
  86. AVOID Product Owner for ‘user features’ and Tech Lead for ‘operational features’
  87. How to evaluate Operational Features vs User Features
  88. treat Ops team folk as another user persona
  89. alternatives to User Stories?
  90. NOT: "as a logging subsystem, I want..."
  91. Metrics
  92. Live: downtime, A/B for operational aspects (speed); Pre-live: time spent re-deploying
  93. Metrics for better conversations
  94. metric-ify your delivery and test infrastructure; 99.99% uptime, but 20 redeployments every time
  95. Heuristics for operational features: 30% of total product budget; 30% of dev team time
  96. holistic product view
  97. MVP: ‘service unavailable’ page
  98. test early for operational features using a deployment pipeline
  99. single product backlog: (user) features + (operational) features
  100. availability is the best feature
  101. Books! operabilitybook.com operationalfeatures.com
  102. thank you http://skeltonthatcher.com/ enquiries@skeltonthatcher.com @SkeltonThatcher +44 (0)20 8242 4103
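
Several of the slides describe operational features concretely enough to sketch in code: toggling a feature via config with a friendly "Postcode lookup unavailable" message, assuming dependencies may be missing or failing, and giving every component a /health endpoint. The Python below is a minimal illustration under stated assumptions, not the speakers' implementation. The config layout, the function names (`load_config`, `lookup_postcode`, `health`) and the shape of the health report are all hypothetical.

```python
import json

# Hypothetical settings, as they might appear in a text file kept in version control.
CONFIG_TEXT = '{"features": {"postcode_lookup": false}}'


def load_config(text):
    """Parse settings from text. Calling this again (for example from a
    SIGHUP handler) is one way to re-read config without a restart."""
    return json.loads(text)


def lookup_postcode(postcode, config, backend):
    """Return the backend's answer, or a friendly message when the feature
    is toggled off or the backend is failing."""
    if not config["features"].get("postcode_lookup", False):
        return "Postcode lookup unavailable"
    try:
        return backend(postcode)
    except Exception:
        # Assume dependencies can be missing or failing: degrade, don't crash.
        return "Postcode lookup unavailable"


def health(config, checks):
    """Build the body a /health endpoint might return (assumed shape):
    one status per dependency plus the current feature toggles."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failing"
        except Exception:
            results[name] = "failing"
    overall = "ok" if all(v == "ok" for v in results.values()) else "degraded"
    return {"status": overall, "checks": results, "features": config["features"]}
```

A caller would pass in its real lookup function as `backend` and a dictionary of zero-argument probes as `checks`. Keeping both as plain parameters means each piece can be stubbed and tested individually, in the spirit of the Testability slides.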
