Software operability and run book collaboration - DevOps Summit, Bangalore
 

Making software work well in production (through good software operability) is one of the goals of DevOps. Collaboration between Dev and Ops on the 'run book' or operation manual is one way to open up communication channels between Dev and Ops, leading to improved software operability.
This is the slide deck I used at DevOps Summit, Bangalore, on 18th December 2013.


    Presentation Transcript

    • Software Operability, Run Book Collaboration, and DevOps
      Matthew Skelton (@matthewpskelton), 18th December 2013
      DevOps Summit, Bangalore, India (www.devops-summit.org)
      softwareoperability.com
      #unidevops
    • Agenda
      • Software Operability
      • Run Book Collaboration
      • Making Operability Work
      • Questions
    • Background
      • Software systems since 1998
      • Software build & deployment specialist & DevOps enthusiast
      • London Continuous Delivery meetup group - londoncd.org.uk
      • Experience DevOps workshops
    • Software Operability
    • Software Operability
      • Definitions
      • Examples
      • Why focus on operability?
      • How DevOps can help
    • Operability?
    • Etymology of Operability?
      • Cognates:
        – Opera
        – Operate
        – Operational
        – Inter-operability
    • Software Operability
      • Operability: the properties of a system which make it work well in Production
    • Operable Systems
      • Since 1929, Mallorca, Spain
    • Software Operability
      • David Copeland (@davetron5000): “How your software runs in production is all that matters. The most amazing abstractions, cleanest code, or beautiful algorithms are meaningless if your code doesn’t run well on production.”
      • http://www.naildrivin5.com/blog/2013/06/16/production-is-all-that-matters.html
    • Operational Criteria
      • Deploy
      • Monitor
      • Diagnose
      • Debug
      • Query
      • Control
      • Inspect
      • Clear
      • ...
    • “Non-Functional”
    • Shaped by Operability
      • Hooks (internal APIs) for:
        – Logging
        – Monitoring
        – Diagnostics
        – Health checks
        – Data clear-down
        – Service / daemon / container control
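
One way to picture the "hooks" on this slide is a small in-process registry of health checks that monitoring can query. The following is a minimal sketch, not something taken from the talk: the class `OperationalHooks`, its methods, the logger name, and the two stand-in probes are all hypothetical names chosen for illustration.

```python
import logging
import time
from typing import Callable, Dict

logger = logging.getLogger("example-service")  # hypothetical service name


class OperationalHooks:
    """Registry of named health checks (all names here are illustrative)."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[], bool]] = {}
        self._started = time.monotonic()

    def register_check(self, name: str, check: Callable[[], bool]) -> None:
        """Add a probe; it should return True when the dependency is usable."""
        self._checks[name] = check

    def health(self) -> dict:
        """Run every probe and summarise the results for a monitoring system."""
        results = {}
        for name, check in self._checks.items():
            try:
                results[name] = bool(check())
            except Exception:
                # A failing probe is logged and reported, never re-raised.
                logger.exception("health check %r raised", name)
                results[name] = False
        return {
            "healthy": all(results.values()),
            "checks": results,
            "uptime_seconds": round(time.monotonic() - self._started, 3),
        }


hooks = OperationalHooks()
hooks.register_check("database", lambda: True)     # stand-in for a real probe
hooks.register_check("queue", lambda: 1 / 0 == 0)  # simulates a broken dependency
status = hooks.health()
```

Here `status["healthy"]` is false because the simulated queue probe raises. The same registry could sit behind an HTTP endpoint or a command-line tool, depending on what the Ops team prefers to query.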
    • Ops Folk are Users Too!
    • Why focus on Operability?
      • Deploy more rapidly, frequently
      • High cost of Production outage
      • Systems now more complicated
    • Outages are Embarrassing!
    • Operational considerations
    • Operational considerations
    • Operational considerations
    • How DevOps can help
      • DevOps is one way to address poor operability
      • Improved collaboration and communication between Dev teams and Ops teams
      • Example: Run Book Collaboration
    • Run Book Collaboration
    • Run Book Collaboration
      • Feedback loops and learning
      • What is a run book?
      • How can run book collaboration help operability?
    • Feedback Loops
      • Gene Kim: http://itrevolution.com/the-three-ways-principles-underpinning-devops/
    • Run Book
    • Templates
    • Example
      • 1 Table of Contents
      • 2 System Overview
        – 2.1 Service Overview
        – 2.2 Contributing Applications, Daemons, and Windows Services
        – 2.3 Hours of Operation
        – 2.4 Execution Design
        – 2.5 Infrastructure and Network Design
        – 2.6 Resilience, Fault Tolerance and High-Availability
        – 2.7 Throttling and Partial Shutdown
        – 2.8 Required Resources
        – 2.9 Expected Traffic and Load
          – 2.9.1 Hot or Peak Periods
          – 2.9.2 Warm Periods
          – 2.9.3 Cool or Quiet Periods
        – 2.10 Environmental Differences
        – 2.11 Tools
      • 3 Security and Access Control
      • 4 System Configuration
        – 4.1 Configuration Management
      • 5 System Backup and Restore
        – 5.1 Backup Requirements
          – 5.1.1 Special Files
        – 5.2 Backup Procedures
        – 5.3 Restore Procedures
      • 6 Monitoring and Alerting
        – 6.1 Error Messages
        – 6.2 Events
        – 6.3 Health Checks
        – 6.4 Other Messages
      • 7 Operational Tasks
        – 7.1 Deployment
        – 7.2 Batch Processing
        – 7.3 Power Procedures
        – 7.4 Routine Checks
          – 7.4.1 System Rebuilds
        – 7.5 Troubleshooting
      • 8 Maintenance Tasks
        – 8.1 Maintenance Procedures
          – 8.1.1 Patching
            – 8.1.1.1 Normal Cycle
            – 8.1.1.2 Zero-Day Vulnerabilities
          – 8.1.2 GMT/BST time changes
          – 8.1.3 Cleardown Activities
            – 8.1.3.1 Log Rotation
        – 8.2 Testing
          – 8.2.1 Technical Testing
          – 8.2.2 Post-Deployment
      • 9 Failure and Recovery Procedures
        – 9.1 Failover
        – 9.2 Recovery
        – 9.3 Troubleshooting Failover and Recovery
      • 10 Contact Details
    • Example
      • 1 Table of Contents
      • 2 System Overview
        – 2.1 Service Overview
        – 2.2 Contributing Applications, Daemons, and Windows Services
        – 2.3 Hours of Operation
        – 2.4 Execution Design
        – 2.5 Infrastructure and Network Design
        – 2.6 Resilience, Fault Tolerance and High-Availability
        – 2.7 Throttling and Partial Shutdown
        – 2.8 Required Resources
        – 2.9 Expected Traffic and Load
      • 3 Security and Access Control
      • 4 System Configuration
      • 5 System Backup and Restore
      • 6 Monitoring and Alerting
      • 7 Operational Tasks
      • 8 Maintenance Tasks
      • 9 Failure and Recovery Procedures
      • 10 Contact Details
    • Example
      • 2.1 Service Overview
      • 2.2 Contributing Applications, Daemons, and Windows Services
      • 2.3 Hours of Operation
      • 2.4 Execution Design
      • 2.5 Infrastructure and Network Design
      • 2.6 Resilience, Fault Tolerance and High-Availability
      • 2.7 Throttling and Partial Shutdown
      • 2.8 Required Resources
      • 2.9 Expected Traffic and Load
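
The template headings above can double as a checklist during Dev and Ops review sessions. As a rough sketch, assuming a draft run book is available as a plain list of heading strings, the function below reports which top-level template sections have not been started yet. The section titles come from the example slides; the names `REQUIRED_SECTIONS` and `missing_sections`, and the decision to skip "Table of Contents", are assumptions made for illustration.

```python
from typing import Iterable, List

# Top-level sections from the example template ("Table of Contents" omitted).
REQUIRED_SECTIONS: List[str] = [
    "System Overview",
    "Security and Access Control",
    "System Configuration",
    "System Backup and Restore",
    "Monitoring and Alerting",
    "Operational Tasks",
    "Maintenance Tasks",
    "Failure and Recovery Procedures",
    "Contact Details",
]


def missing_sections(draft_headings: Iterable[str]) -> List[str]:
    """Return template sections absent from the draft, in template order."""
    # Compare case-insensitively and ignore surrounding whitespace.
    present = {heading.strip().lower() for heading in draft_headings}
    return [s for s in REQUIRED_SECTIONS if s.lower() not in present]


# Hypothetical draft with only three sections written so far.
draft = ["System Overview", "Monitoring and Alerting", " contact details "]
gaps = missing_sections(draft)
```

For this draft, `gaps` lists the six remaining sections, starting with "Security and Access Control". The list of gaps is a prompt for the next conversation between teams rather than a measure of documentation completeness.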
    • It’s Not Documentation
    • Focus on Collaboration
    • Outcomes
      • Better understanding
      • Better cross-team working
      • Reduction in operational problems
      • Fewer outages
      • Reduced long-term cost-of-ownership
    • Run Book as Collaboration
      • Focus on the collaboration
      • Run book is a means, not an end
      • Throw it away when complete (?)
      • Aim to automate more over time
      • See http://runbookcollab.info/
    • Making Operability Work
    • Making Operability Work
      • NFRs vs Operational Features
      • Budget changes
      • Organisation changes
      • Responsibility changes
      • Avoid on-call anti-patterns
    • “Non-Functional”
    • Operational Features
      • Features
    • Taking Operability Seriously
      • Single product backlog
        – End-user + Operational features
        – New features + bugs
      • Product Owner on call
        – Accountable for operational failures
        – Seriously!
    • Budget changes
      • “What is your budget code?”
      • Capex vs. Opex?
      • Remove budget barriers to regular, effective communication
    • Niek Bartholomeus (@niekbartho) - http://niek.bartholomeus.be/
      • https://speakerdeck.com/niekbartho/self-organization-vs-global-optimization-a-comparison-betweentraditional-and-modern-organizations
    • Organisation changes
      • “I’ll need to ask my manager first”
      • Lack of autonomy
      • Remove reporting barriers to regular, effective communication
      • More at http://bit.ly/DevOpsTopologies
    • “I just want to write code”
    • Mysterious Coding Tricks
    • On-call for Responsibility
    • On-call Anti-Patterns
      • Too much overtime pay
      • Too little overtime pay
      • Rota team too small
      • No training in incident response
      • No team ownership of product
      • No team autonomy for changes
    • On call - Goal
      • Team members want to help make things better
      • Empowered to fix problems
      • Reduce the times they are woken up
    • The operability of operability
      • Operational Features, not “NFRs”
      • Sustainable collaboration
      • Sensible, fair on-call rotas
      • Over-compensate in time off
      • Avoid burn-out
    • Recapitulation
    • Software Operability
      • Making software systems work well in Production
    • Run Book Collaboration
      • Shared focus on operability throughout the delivery cycle
    • Making Operability Operable
      • Use DevOps team patterns for sustainable operability
    • What’s Next?
    • Further Reading
      • Patterns for Performance and Operability
        – Ford, Gileadi, Purba, Moerman
      • http://whoownsmyoperability.com/
        – Recommended reading lists
    • Operability Book
      • Software Operability
        – How to make software work well in Production
        – Due early 2014
      • Sign up at OperabilityBook.com
      • Discount code for DevOps Summit attendees
    • Experience DevOps
      • A hands-on workshop for DevOps culture
      • Forthcoming dates:
        – Bangalore: 19th December 2013
        – London: February 2014 (tbc)
      • http://experiencedevops.org/
    • PIPELINE Conference
      • Continuous Delivery
      • Tuesday 8th April 2014
      • London, UK
      • http://pipelineconf.info/
      • @PipelineConf
    • Questions & Discussion
      • Matthew Skelton @matthewpskelton
      • softwareoperability.com
      • operabilitybook.com
      • bit.ly/DevOpsTopologies
    • Acknowledgements
      • http://www.blinkenlights.nl/images/blinkenlights-big.jpeg
      • http://www.danatronics.com/sdb_apps.html
      • http://riverbankoftruth.com/wpcontent/uploads/2013/07/embarrassedchimp22.jpg
      • http://www.thinkgeek.com/edm/20040709.html
      • http://indianaohindiana.com/wpcontent/uploads/2013/10/Tome.jpg
      • http://www.guavaworks.com/companyblog/guava-doesnt-do-cookie-cutter.html
      • http://www.carpages.co.uk/ford/ford-sandsculptures-05-09-11.asp
      • http://www.thisismoney.co.uk/money/experts/article-2324270/Take-smaller-pension-pots-taxfree-leave-final-salary-untouched.html
      • http://paranoidnews.org/wpcontent/uploads/2010/10/Alien-Hunt-AlarmClock.jpg
      • http://particulations.blogspot.co.uk/2010/08/headingley-hole.html
      • http://marvel.wikia.com/Stephen_Strange_(Earth-616)
      • http://pianofortekeys.files.wordpress.com/2013/04/ariadnne_wideweb__470x3300.jpg
    • Further Slides
    • The Phoenix Project
    • Continuous Delivery