A tale of Disaster Recovery (Cfengine everyday, practices and tools)

After a brief presentation of configuration management (CM) basics, we start with an ill-fated tale from the recent past about disaster recovery (also known as a case study, if you must): how our CM saved us, how it didn't, and what could have been done better. This could lead to a discussion about best practices.

We use Cfengine 3, and will introduce the software and review its main differences from other open source CM tools before explaining why we like this choice. But Cfengine is not everything: what enables us to manage our configuration completely are the practices and tools we've built around it.

Published in: Technology

Transcript

  • 1. FOSDEM 2011 @ Brussels, Belgium. A tale of disaster recovery: Cfengine everyday, practices and tools. Nicolas Charles <nch@normation.com>, Jonathan Clarke <jcl@normation.com>
  • 2. About the speakers: Nicolas Charles, Jonathan Clarke. Cfengine contributor; OpenLDAP committer; Cfengine "Community Champion" (C3); Scala developer; sysadmin. But we get on pretty well! (mostly...)
  • 3. Agenda: 1) Configuration Management 101; 2) Our choice of tool; 3) A tale of disaster recovery; 4) Introducing Cfengine 3; 5) Why we love Cfengine 3
  • 4. A bit about Configuration Management...
  • 5. Configuration management: what is it? Configuration Management is a field of management that focuses on establishing and maintaining consistency of a system (..) throughout its life. Software configuration management is the task of tracking and controlling changes in the software. Sources: http://en.wikipedia.org/wiki/Configuration_management and http://en.wikipedia.org/wiki/Software_configuration_management
  • 6. Configuration management: why is it useful? Control changes; reproduce over time and across nodes; audit and keep history data; repair automatically.
  • 7. Configuration Management Tools: what we chose, and why
  • 8. Our choice: back in mid 2009, we needed a configuration management tool. Criteria: open source; multi-platform agent (including Windows); resilient; non-disruptive.
  • 9. Our choice, candidates: Cfengine 3, Puppet, Chef
  • 10. Our choice, candidates: Cfengine 3. More on this choice later...
  • 11. Disaster Recovery: an ill-fated tale from the recent past (CASE STUDY)
  • 12. Before the disaster... Our company's IT infrastructure. Small company, small requirements: web site, email, Git repository, Redmine... Small company, small budget: all on one hosted server.
  • 13. Asking for trouble? Just one hosted server! Critical services! No, a "safe" configuration: redundant hardware, 3-disk RAID-5 array; all services automatically installed and set up using Configuration Management; backups: daily (several off-site locations); several VMs to separate services.
  • 14. A critical failure: 2 hard drives fail simultaneously → RAID-5 array is down → Almost all services fail immediately → "The end of the world as we know it" → Need to rebuild everything NOW
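
Why two simultaneous failures are fatal on a 3-disk RAID-5 array can be shown with a little parity arithmetic. The following is a minimal Python sketch, not how any real RAID controller is implemented: it assumes one stripe made of two data blocks plus one XOR parity block, and the block contents and the helper name `xor_blocks` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 3-disk array: two data blocks plus one parity block.
data_a = b"mail"
data_b = b"wiki"
parity = xor_blocks([data_a, data_b])

# One disk lost: the missing block is the XOR of the two survivors.
recovered_b = xor_blocks([data_a, parity])

# Two disks lost: only one block survives (for example the parity alone),
# and a single block does not carry enough information to rebuild the
# other two. This is the situation described on the slide.
```

With a single surviving block per stripe there is nothing left to XOR against, so the array cannot be rebuilt and everything has to come from backups.
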
  • 15. Recovering. Step 1: Panic! Step 2: Get a new server. Step 3: Reinstall base OS + virtualization. Step 4: Restore VM configuration... whoops. Step 4: Re-create the VMs manually. Step 5: Reinstall each OS in each VM...
  • 16. Recovering. Step 6: Install Configuration Management. Step 7: Sit back and watch all the services coming back online as if by magic! Step 8: Huh, where's my data? Step 9: Manually restore backups. Step 10: Make a list of missing data...
  • 17. Lessons learned: 1) Hard disks fail reliably. 2) Restoring virtualization setups: backing up the config files would have helped; need CM tools to describe the desired state! (Cfengine Nova does this). 3) Configuration Management should tie in to our backup system. 4) Backups were lacking some files: always test!
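
Lesson 4 ("always test!") can be made routine by comparing what the configuration says should exist with what the backup actually contains. This is only a sketch under stated assumptions: the function name `missing_from_backup` and the example paths are hypothetical, and a real check would read both lists from the CM tool and the backup system rather than from literals.

```python
def missing_from_backup(expected_paths, backed_up_paths):
    """Return, sorted, the expected paths that the backup does not contain."""
    return sorted(set(expected_paths) - set(backed_up_paths))


# Hypothetical example: paths the services need vs. paths found in a backup.
expected = ["/etc/redmine/database.yml", "/srv/git", "/var/mail"]
backed_up = ["/srv/git", "/var/mail"]
gaps = missing_from_backup(expected, backed_up)
```

Running such a comparison after every backup would produce the "list of missing data" before a disaster instead of during step 10 of the recovery.
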
  • 18. Wishlist and discussion. Integrating Configuration Management tools and backup systems is a crucial step for CM to be efficient for disaster recovery: what do others do? Provisioning VMs and their resources (disks, network) should be automated too: cloud providers are one solution; what about "plain" virtualization?
  • 19. A bit about Cfengine 3... Sources: across the Internet   
  • 20. Cfengine: History. Source: http://verticalsysadmin.com/blog/uncategorized/relative-origins-of-cfengine-chef-and-puppet
  • 21. Cfengine 3: Intro. Configuration management software, written in C. Two versions: Community (GPL v3) and Nova (closed source): Community + extra features. Backed by Cfengine AS, a Norway-based company founded in 2009.
  • 22. Cfengine 3: Features. According to the KU Leuven comparative study of configuration management systems: very mature; cross-platform (*BSD, AIX, HP-UX, Linux, Mac OS X, Solaris, Windows); strongly distributed; based on state description and convergence; very high scalability (> 10000 nodes); very small footprint. Source: http://distrinet.cs.kuleuven.be/software/sysconfigtools/overview
  • 23. Cfengine 3: Components. Cf-agent: runs on all managed hosts; applies configuration (this is the heart); can connect to cf-serverd to get policies / files. Cf-serverd: distributes policies and files; must be run on policy server(s); usually run on all hosts to enable remote runs. Cf-monitord: collects statistics on all nodes.
  • 24. Cfengine 3: Promises. Configuration rules are called promises: a "promise" to be in the desired state. The Cfengine agent handles the steps to get there: convergence. Promise theory is based on research done at the University of Oslo.
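
The idea of promises and convergence can be illustrated outside Cfengine. The sketch below is plain Python, not Cfengine policy syntax, and the names `LinePromise` and `converge` are hypothetical: each promise describes a desired state, is checked first, and is repaired only if it is not already kept, so a second run changes nothing.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LinePromise:
    """Hypothetical promise: a given line must be present in a list of lines."""
    line: str

    def is_kept(self, lines: List[str]) -> bool:
        return self.line in lines

    def repair(self, lines: List[str]) -> List[str]:
        # Return a new state with the promised line appended.
        return lines + [self.line]


def converge(lines: List[str], promises: List[LinePromise]) -> Tuple[List[str], int]:
    """Check every promise once; repair the ones not kept and count repairs."""
    repairs = 0
    for promise in promises:
        if not promise.is_kept(lines):
            lines = promise.repair(lines)
            repairs += 1
    return lines, repairs


promises = [LinePromise("PermitRootLogin no"), LinePromise("Port 22")]
first_state, first_repairs = converge(["Port 22"], promises)
second_state, second_repairs = converge(first_state, promises)
```

The first run repairs one promise; the second run finds every promise kept and makes no changes, which is the property that makes repeated agent runs safe.
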
  • 25. Cfengine 3: Usage examples. Large companies (Facebook, AMD, …); critical systems: Joint Australian Tsunami Warning Centre; personal computers; mobile devices: Nokia N900; underwater devices: army submarines; small and medium companies...
  • 26. Why we love Cfengine 3... Sources: our experience and opinions
  • 27. Memory usage: daemon consumption on managed hosts
  • 28. Multi-platform. Define a configuration for all operating systems: Windows, Linux. Make it "transparent" (forget about the complexity). Existing standard library handling the differences between each OS and distribution.
  • 29. File editing. Only change what you need to: you like your distribution's defaults? You have various different systems already set up and just need to change something? Search for lines and replace/delete/add them. Only change one field in a file: /etc/passwd for example...
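
Changing a single field while leaving the rest of a file untouched can be sketched as follows. This is a plain Python illustration working on a string in /etc/passwd format, not Cfengine's own file editing mechanism; the function name `set_field` and the sample accounts are hypothetical.

```python
def set_field(text, key, field_index, value, sep=":"):
    """Return text with one field changed on lines whose first field is key.

    All other lines and fields are passed through unchanged.
    """
    out = []
    for line in text.splitlines():
        fields = line.split(sep)
        if fields[0] == key and field_index < len(fields):
            fields[field_index] = value
        out.append(sep.join(fields))
    return "\n".join(out)


# Sample content in passwd format (made up for this example).
passwd = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "alice:x:1000:1000:Alice:/home/alice:/bin/sh"
)
# Field 6 is the login shell; only alice's shell is changed.
updated = set_field(passwd, "alice", 6, "/bin/bash")
```

Because untouched lines are copied as they are, the distribution's defaults survive and the edit can be repeated without further effect.
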
  • 30. Complex tasks. Powerful class system to trigger promises: based on the node itself; based on time; based on whatever you might imagine. Complex workflows can be created.
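
The class mechanism can be pictured as a set of labels describing the node and the current time, with each promise applied only when all the labels it requires are present. The sketch below is plain Python rather than Cfengine syntax; the class names, the promise names, and the helpers `discover_classes` and `select_promises` are assumptions made for illustration.

```python
from datetime import datetime

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def discover_classes(os_name, hostname, now):
    """Build the set of classes for a node from its properties and the time."""
    return {
        os_name.lower(),
        hostname.lower(),
        DAYS[now.weekday()],
        "hr%02d" % now.hour,
    }


def select_promises(promises, classes):
    """Keep the promises whose required classes are all defined."""
    return [name for name, required in promises if required <= classes]


# Hypothetical promises, each guarded by the classes it requires.
promises = [
    ("rotate_logs", {"linux", "hr03"}),
    ("install_iis", {"windows"}),
    ("weekend_report", {"saturday"}),
]
classes = discover_classes("Linux", "web1", datetime(2011, 2, 5, 3, 0))
selected = select_promises(promises, classes)
```

Adding a new trigger is then a matter of defining another class, which is how more complex workflows can be composed.
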
  • 31. Thank you! FOSDEM 2011, Configuration Management room, and those brave enough to wake up early
