Software operability and run book collaboration London Feb 2014

Published in: Technology

  1. 1. #unidevops Software Operability, Run Book Collaboration, and DevOps Matthew Skelton 27th February 2014 DevOps Summit, London, UK www.devopssummit.com @matthewpskelton softwareoperability.com
  2. 2. • Software Operability • Run Book Collaboration • Making Operability Work • Questions #unidevops Agenda
  3. 3. • Software systems since 1998 • Continuous Delivery specialist, DevOps enthusiast, Operability nut • London Continuous Delivery meetup group - londoncd.org.uk • Experience DevOps workshops • PIPELINE Conference #unidevops Background
  4. 4. #unidevops Software Operability
  5. 5. • Definitions • Examples • Why focus on operability? • How DevOps can help #unidevops Software Operability
  6. 6. #unidevops Operability?
  7. 7. • Cognates: – Opera – Operate – Operational – Inter-operability #unidevops Etymology of Operability?
  8. 8. #unidevops
  9. 9. • Operability: the properties of a system which make it work well in Production #unidevops Software Operability
  10. 10. Since 1929, Mallorca, Spain #unidevops Operable Systems
  11. 11. • David Copeland (@davetron5000): “How your software runs in production is all that matters. The most amazing abstractions, cleanest code, or beautiful algorithms are meaningless if your code doesn’t run well on production.” • http://www.naildrivin5.com/blog/2013/06/16/production-is-all-that-matters.html #unidevops Software Operability
  12. 12. • Deploy • Monitor • Diagnose • Debug • Query • Control • Inspect • Clear • ... #unidevops Operational Criteria
  13. 13. #unidevops “Non-Functional”
  14. 14. • Hooks (internal APIs) for: – Logging – Monitoring – Diagnostics – Health checks – Data clear-down – Service / daemon / container control #unidevops Shaped by Operability
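A minimal sketch of one hook from slide 14, a health check, using only the Python standard library. The check functions, the `/health` path, and the response shape are illustrative assumptions, not something the talk specifies.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple


def check_database() -> bool:
    """Hypothetical check: replace with a real connectivity probe."""
    return True


def check_disk_space() -> bool:
    """Hypothetical check: replace with a real free-space probe."""
    return True


# Assumed registry of named checks; the names are illustrative only.
CHECKS: Dict[str, Callable[[], bool]] = {
    "database": check_database,
    "disk_space": check_disk_space,
}


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every check and return an HTTP status code plus a JSON-ready body."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as exc:  # a crashing check is reported, not raised
            results[name] = f"error: {exc}"
    healthy = all(value == "ok" for value in results.values())
    body = {"status": "ok" if healthy else "degraded", "checks": results}
    return (200 if healthy else 503), body


class HealthHandler(BaseHTTPRequestHandler):
    """Serves the aggregated check results on GET /health."""

    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        code, body = run_health_checks(CHECKS)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Start the endpoint; call this from the service's own entry point."""
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```

Returning 503 when any check fails lets a load balancer or monitoring tool act on the status code alone, while the JSON body gives Ops staff the per-check detail for diagnosis.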
  15. 15. #unidevops Ops Folk are Users Too!
  16. 16. #unidevops
  17. 17. • Deploy more rapidly, frequently • High cost of Production outage • Systems now more complicated #unidevops Why focus on Operability?
  18. 18. #unidevops Outages are Embarrassing!
  19. 19. #unidevops Operational considerations
  20. 20. #unidevops Operational considerations
  21. 21. #unidevops Operational considerations
  22. 22. • DevOps is one way to address poor operability • Improved collaboration and communication between Dev teams and Ops teams • Example: Run Book Collaboration #unidevops How DevOps can help
  23. 23. #unidevops Run Book Collaboration
  24. 24. • Feedback loops and learning • What is a run book? • How can run book collaboration help operability? #unidevops Run Book Collaboration
  25. 25. Gene Kim: http://itrevolution.com/the-three-ways-principles-underpinning-devops/ #unidevops Feedback Loops
  26. 26. #unidevops Run Book
  27. 27. #unidevops Templates
  28. 28. #unidevops Example
      • 1 Table of Contents
      • 2 System Overview
        – 2.1 Service Overview
        – 2.2 Contributing Applications, Daemons, and Windows Services
        – 2.3 Hours of Operation
        – 2.4 Execution Design
        – 2.5 Infrastructure and Network Design
        – 2.6 Resilience, Fault Tolerance and High-Availability
        – 2.7 Throttling and Partial Shutdown
        – 2.8 Required Resources
        – 2.9 Expected Traffic and Load
          – 2.9.1 Hot or Peak Periods
          – 2.9.2 Warm Periods
          – 2.9.3 Cool or Quiet Periods
        – 2.10 Environmental Differences
        – 2.11 Tools
      • 3 Security and Access Control
      • 4 System Configuration
        – 4.1 Configuration Management
      • 5 System Backup and Restore
        – 5.1 Backup Requirements
          – 5.1.1 Special Files
        – 5.2 Backup Procedures
        – 5.3 Restore Procedures
      • 6 Monitoring and Alerting
        – 6.1 Error Messages
        – 6.2 Events
        – 6.3 Health Checks
        – 6.4 Other Messages
      • 7 Operational Tasks
        – 7.1 Deployment
        – 7.2 Batch Processing
        – 7.3 Power Procedures
        – 7.4 Routine Checks
          – 7.4.1 System Rebuilds
        – 7.5 Troubleshooting
      • 8 Maintenance Tasks
        – 8.1 Maintenance Procedures
          – 8.1.1 Patching
            – 8.1.1.1 Normal Cycle
            – 8.1.1.2 Zero-Day Vulnerabilities
          – 8.1.2 GMT/BST time changes
          – 8.1.3 Cleardown Activities
            – 8.1.3.1 Log Rotation
        – 8.2 Testing
          – 8.2.1 Technical Testing
          – 8.2.2 Post-Deployment
      • 9 Failure and Recovery Procedures
        – 9.1 Failover
        – 9.2 Recovery
        – 9.3 Troubleshooting Failover and Recovery
      • 10 Contact Details
  29. 29. #unidevops Example
      • 1 Table of Contents
      • 2 System Overview
        – 2.1 Service Overview
        – 2.2 Contributing Applications, Daemons, and Windows Services
        – 2.3 Hours of Operation
        – 2.4 Execution Design
        – 2.5 Infrastructure and Network Design
        – 2.6 Resilience, Fault Tolerance and High-Availability
        – 2.7 Throttling and Partial Shutdown
        – 2.8 Required Resources
        – 2.9 Expected Traffic and Load
      • 3 Security and Access Control
      • 4 System Configuration
      • 5 System Backup and Restore
      • 6 Monitoring and Alerting
      • 7 Operational Tasks
      • 8 Maintenance Tasks
      • 9 Failure and Recovery Procedures
      • 10 Contact Details
  30. 30. #unidevops Example 2.1 Service Overview 2.2 Contributing Applications, Daemons, and Windows Services 2.3 Hours of Operation 2.4 Execution Design 2.5 Infrastructure and Network Design 2.6 Resilience, Fault Tolerance and High-Availability 2.7 Throttling and Partial Shutdown 2.8 Required Resources 2.9 Expected Traffic and Load
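A rough sketch of how a team might use the template's top-level headings as a checklist during a run book session. The headings come from slide 29; the `missing_sections` helper and its line-matching rule are assumptions for illustration, and the output is meant as a prompt for a Dev/Ops conversation rather than a pass/fail gate.

```python
from typing import List, Sequence

# Top-level headings taken from the run book template on slide 29.
TOP_LEVEL_SECTIONS = [
    "System Overview",
    "Security and Access Control",
    "System Configuration",
    "System Backup and Restore",
    "Monitoring and Alerting",
    "Operational Tasks",
    "Maintenance Tasks",
    "Failure and Recovery Procedures",
    "Contact Details",
]


def missing_sections(
    run_book_text: str, required: Sequence[str] = TOP_LEVEL_SECTIONS
) -> List[str]:
    """Return the required headings that no line of the run book ends with.

    Assumed matching rule: a heading counts as present when some line,
    ignoring case and surrounding whitespace, ends with the heading text
    (so "2 System Overview" matches "System Overview").
    """
    lines = [line.strip().lower() for line in run_book_text.splitlines()]
    return [
        heading
        for heading in required
        if not any(line.endswith(heading.lower()) for line in lines)
    ]


# Hypothetical draft run book with only a few sections started.
draft = """1 Table of Contents
2 System Overview
6 Monitoring and Alerting
10 Contact Details
"""
for heading in missing_sections(draft):
    print(f"Not yet discussed: {heading}")
```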
  31. 31. #unidevops It’s Not Documentation
  32. 32. #unidevops Focus on Collaboration
  33. 33. • Better understanding • Better cross-team working • Reduction in operational problems • Fewer outages • Reduced long-term cost-of-ownership #unidevops Outcomes
  34. 34. • Focus on the collaboration • Run book is a means, not an end • Throw it away when complete (?) • Aim to automate more over time • See http://runbookcollab.info/ #unidevops Run Book as Collaboration
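One way to read "aim to automate more over time" is to turn a manual run book step into a small script. The sketch below does this for a clear-down activity such as the template's log rotation entry; the function name, the `*.log` pattern, and the age threshold are all assumptions for illustration, not part of the talk.

```python
import time
from pathlib import Path
from typing import List


def clear_down_old_logs(
    log_dir: Path, max_age_days: float, dry_run: bool = True
) -> List[Path]:
    """Find (and optionally delete) *.log files older than max_age_days.

    With dry_run=True nothing is deleted; the list of candidates is only
    returned, which mirrors an operator reviewing the step before running it.
    """
    cutoff = time.time() - max_age_days * 86400  # seconds in a day
    old_files = sorted(
        path
        for path in log_dir.glob("*.log")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
    if not dry_run:
        for path in old_files:
            path.unlink()
    return old_files
```

Defaulting to a dry run keeps the script safe to try during a run book session; the deleting mode can be enabled once Dev and Ops agree on the retention period.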
  35. 35. #unidevops Making Operability Work
  36. 36. • NFRs vs Operational Features • Budget changes • Organisation changes • Responsibility changes • Avoid on-call anti-patterns #unidevops Making Operability Work
  37. 37. #unidevops “Non-Functional”
  38. 38. Features #unidevops Operational Features
  39. 39. • Single product backlog – End-user + Operational features – New features + bugs • Product Owner on call – Accountable for operational failures – Seriously! #unidevops Taking Operability Seriously
  40. 40. #unidevops
  41. 41. • “What is your budget code?” • Capex vs. Opex? • Remove budget barriers to regular, effective communication #unidevops Budget changes
  42. 42. #unidevops Niek Bartholomeus (@niekbartho) - http://niek.bartholomeus.be/ https://speakerdeck.com/niekbartho/self-organization-vs-global-optimization-a-comparison-between-traditional-and-modern-organizations
  43. 43. • “I’ll need to ask my manager first” • Lack of autonomy • Remove reporting barriers to regular, effective communication • More at http://bit.ly/DevOpsTopologies #unidevops Organisation changes
  44. 44. #unidevops “I just want to write code”
  45. 45. #unidevops Mysterious Coding Tricks
  46. 46. #unidevops On-call for Responsibility
  47. 47. • Too much overtime pay • Too little overtime pay • Rota team too small • No training in incident response • No team ownership of product • No team autonomy for changes #unidevops On-call Anti-Patterns
  48. 48. • Team members want to help make things better • Empowered to fix problems • Reduce the times they are woken up #unidevops On call - Goal
  49. 49. • Operational Features, not “NFRs” • Sustainable collaboration • Sensible, fair on-call rotas • Over-compensate in time off • Avoid burn-out #unidevops The operability of operability
  50. 50. #unidevops Recapitulation
  51. 51. Making software systems work well in Production #unidevops Software Operability
  52. 52. Shared focus on operability throughout the delivery cycle #unidevops Run Book Collaboration
  53. 53. Use DevOps team patterns for sustainable operability #unidevops Making Operability Operable
  54. 54. #unidevops What’s Next?
  55. 55. • Patterns for Performance and Operability – Ford, Gileadi, Purba, Moerman • http://whoownsmyoperability.com/ – Recommended reading lists #unidevops Further Reading
  56. 56. • Release It! – Michael Nygard (@mnygard) • http://www.michaelnygard.com/ #unidevops Further Reading
  57. 57. • Software Operability – How to make software work well in Production – Due late 2014 • Sign up at OperabilityBook.com • Discount code for DevOps Summit attendees #unidevops Operability Book
  58. 58. • A hands-on workshop for DevOps culture • Forthcoming dates: – London: 28th February 2014 • http://experiencedevops.org/ #unidevops Experience DevOps
  59. 59. • Continuous Delivery • ‘Unconference’ format • Tuesday 8th April 2014 • London, UK • http://pipelineconf.info/ • @PipelineConf #unidevops PIPELINE
  60. 60. #unidevops Matthew Skelton @matthewpskelton Questions & Discussion softwareoperability.com operabilitybook.com bit.ly/DevOpsTopologies
  61. 61. Acknowledgements #unidevops http://www.blinkenlights.nl/images/blinkenlights-big.jpeg http://www.danatronics.com/sdb_apps.html http://riverbankoftruth.com/wp-content/uploads/2013/07/embarrassedchimp22.jpg http://www.thinkgeek.com/edm/20040709.html http://indianaohindiana.com/wp-content/uploads/2013/10/Tome.jpg http://www.guavaworks.com/companyblog/guava-doesnt-do-cookie-cutter.html http://www.carpages.co.uk/ford/ford-sandsculptures-05-09-11.asp http://www.thisismoney.co.uk/money/experts/article-2324270/Take-smaller-pension-pots-taxfree-leave-final-salary-untouched.html http://paranoidnews.org/wp-content/uploads/2010/10/Alien-Hunt-AlarmClock.jpg http://particulations.blogspot.co.uk/2010/08/headingley-hole.html http://marvel.wikia.com/Stephen_Strange_(Earth-616) http://pianofortekeys.files.wordpress.com/2013/04/ariadnne_wideweb__470x3300.jpg
  62. 62. #unidevops Further Slides
  63. 63. #unidevops The Phoenix Project
  64. 64. #unidevops Continuous Delivery
