DevOps and Infrastructure Immunology v0.4: Presentation Transcript
1. DEVOPS IS SUSTAINABLE OPS & INFRASTRUCTURE IMMUNOLOGY V0.4
Prepared and presented by Julie Tsai, Dec. 17, 2012
2. History
- Concepts borrowed heavily (or stolen) from the classic papers “Bootstrapping an Infrastructure” by Steve Traugott and Joel Huddleston and Mark Burgess’s “Computer Immunology”, and from Promise Theory
- Personal experience: syncing scripts, predicting change, better communication
3. What does this fix?
- How do I keep X (files, permissions, services) from changing unpredictably?
- When did a change happen? Is it related to the downtime incident we had? Or to unpredictable deployments?
- Who, or what group, made that change?
- The system is growing (or has grown) arms and legs in unpredictable, astonishing directions, making it difficult or impossible to reproduce, or even to make minor changes: deployments are the equivalent of leveling the whole house to change one light bulb.
- Critical parts of the infrastructure reside in people's heads: bad for scaling the company, bad for individual development. Put the real estate to better use.
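The first two questions on this slide (what changed, and when) can be illustrated with a small drift check. This is a minimal sketch, not anything shown in the talk: the function names `snapshot` and `drift` are hypothetical, and it assumes that comparing a SHA-256 content hash plus permission bits against a recorded baseline is an acceptable definition of "changed".

```python
import hashlib
import os
import stat
import tempfile


def snapshot(paths):
    """Record a content hash and the permission bits for each path."""
    state = {}
    for path in paths:
        with open(path, "rb") as handle:
            digest = hashlib.sha256(handle.read()).hexdigest()
        mode = stat.S_IMODE(os.stat(path).st_mode)
        state[path] = (digest, mode)
    return state


def drift(baseline, current):
    """Return baseline paths whose hash or mode differs, or is absent, in current."""
    return sorted(p for p in baseline if current.get(p) != baseline[p])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conf = os.path.join(tmp, "app.conf")  # hypothetical config file
        with open(conf, "w") as handle:
            handle.write("port=80\n")
        baseline = snapshot([conf])
        with open(conf, "w") as handle:
            handle.write("port=8080\n")  # simulated unplanned change
        print(drift(baseline, snapshot([conf])))
```

Storing each baseline alongside a timestamp and the identity of whoever committed it would be one way to also answer the "when" and "who" questions; that part is left out of the sketch.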
4. Centralized, Automated Standards
- Sounds intuitive, but…
- Obvious examples in the SA world: LDAP, DNS, logservers, data consistency, NFS fileservers
- Same principle as programmers’ DRY (Don't Repeat Yourself)
5. What does this look like?
1) Version-Controlled Published Configurations
2) Master Fileserver Repository
3) Automated Propagation and Maintenance
   The heart of where much of today’s DevOps work exists: this is where tools like cfengine, puppet, and chef literally “level-up” the way your infrastructure is managed. See links on last slide for more information.
4) Monitoring the Infrastructure
5) Self-Healing
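Step 3, automated propagation and maintenance, rests on the idea that applying a published configuration repeatedly is safe: the tool changes something only when the live state differs from the desired state. The following is a minimal sketch of that idea for a single file. The function name `ensure_file` is hypothetical, and this is not how cfengine, puppet, or chef are implemented.

```python
import os
import tempfile


def ensure_file(path, content):
    """Converge one file to the desired content.

    Returns True if a repair was made and False if the file already
    matched, so running it again after a repair changes nothing.
    """
    try:
        with open(path, "r") as handle:
            current = handle.read()
    except FileNotFoundError:
        current = None  # a missing file counts as drift
    if current == content:
        return False
    with open(path, "w") as handle:
        handle.write(content)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        motd = os.path.join(tmp, "motd")  # hypothetical managed file
        print(ensure_file(motd, "managed by automation\n"))  # first run repairs
        print(ensure_file(motd, "managed by automation\n"))  # second run is a no-op
```

Returning whether a repair happened gives the caller something to log, which ties into the change visibility discussed later in the deck.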
6. Version-Controlled Published Configurations
- Git, svn, perforce, cvs: SCM of choice
- Promise Theory: connected but independent agents cannot wrest guarantees from each other; they can only truly obligate themselves. But this can be leveraged to coordinate.
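The Promise Theory point can be made concrete with a toy model. This is a loose illustration of the idea as stated on the slide, not a formal rendering of Promise Theory: the `Agent` class, `usable_for`, and the promise name `serves-http` are all hypothetical. Each agent registers promises only about its own behavior, and coordination happens by observing which promises are made and kept.

```python
class Agent:
    """An independent agent that can only make promises about itself."""

    def __init__(self, name):
        self.name = name
        self._promises = {}  # promise body -> callable reporting if it is kept

    def promise(self, body, check):
        """Voluntarily promise `body`; `check` reports whether it is currently kept."""
        self._promises[body] = check

    def keeps(self, body):
        """Let an observer assess a promise. Nothing here lets an observer impose one."""
        check = self._promises.get(body)
        return check is not None and bool(check())


def usable_for(agents, body):
    """Coordinate by selecting the agents that have promised `body` and keep it."""
    return [agent.name for agent in agents if agent.keeps(body)]


if __name__ == "__main__":
    web1, web2, db1 = Agent("web1"), Agent("web2"), Agent("db1")
    web1.promise("serves-http", lambda: True)
    web2.promise("serves-http", lambda: False)  # promised, but currently not kept
    # db1 never promised to serve HTTP, so no one can rely on it for that.
    print(usable_for([web1, web2, db1], "serves-http"))
```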
9. Monitoring & Self-Healing
- What’s the current state?
- What’s the post-change state?
- Event-driven hooks from monitoring back to the automation tool create self-healing (e.g. Nagios, Empirix, or your monitoring tool of choice)
- End-to-end change visibility: intended changes, logged changes, monitoring events
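The hook from monitoring back to automation can be sketched as a small event dispatcher. This is an illustration under stated assumptions, not the interface of Nagios or any other monitoring tool: the `SelfHealer` class, the event fields `service` and `state`, and the `CRITICAL` state name are all hypothetical. Every event is logged, whether or not it triggers a repair, to reflect the end-to-end visibility point.

```python
from datetime import datetime, timezone


class SelfHealer:
    """Route monitoring events to registered repair actions and log each one."""

    def __init__(self):
        self.remediations = {}  # service name -> zero-argument repair callable
        self.log = []           # one record per event received

    def register(self, service, action):
        self.remediations[service] = action

    def handle(self, event):
        service, state = event["service"], event["state"]
        action = self.remediations.get(service)
        if state == "CRITICAL" and action is not None:
            action()  # e.g. re-run the automation tool for this service
            outcome = "remediated"
        else:
            outcome = "ignored"
        self.log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "service": service,
            "state": state,
            "outcome": outcome,
        })
        return outcome


if __name__ == "__main__":
    healer = SelfHealer()
    healer.register("httpd", lambda: print("re-applying httpd configuration"))
    print(healer.handle({"service": "httpd", "state": "CRITICAL"}))
    print(healer.handle({"service": "httpd", "state": "OK"}))
```

In a real setup the repair action would most likely invoke the configuration tool rather than restart the service directly, so that the fix goes through the same version-controlled path as any other change.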
10. What do we gain?
A lot:
- Known configs/profiles assured to reflect live system state
- Auditable, easy-to-administer security configurations
- Predictable change and rollback
- Large-scale updates that are seamless, uniform, and logged
- Agile compliance!
- Uptime!
- More free time to devote to higher-level activities!
11. Good Reading
- Classic “Bootstrapping an Infrastructure,” LISA ’98: http://www.infrastructures.org/papers/bootstrap/bootstrap.html
- Self-Healing Networks: http://onlamp.com/pub/a/onlamp/2006/05/25/self-healing-networks.html?page=1
- Relative origins of cfengine, puppet, chef: http://verticalsysadmin.com/blog/uncategorized/relative-origins-of-cfengine-chef-and-puppet
- Promises of DevOps: http://cfengine.com/markburgess/blog_devops.html
- Promise Theory: http://en.wikipedia.org/wiki/Promise_theory