Ops in the Cloud - David Nalley

  • Ops in the Cloud, PuppetConf. David Nalley [email_address]
  • # whoami
    • Horrible at slide aesthetics
    • Recovering sysadmin
    • Community Manager for CloudStack
    • Contributor to a number of F/LOSS projects
      • Fedora
      • Zenoss
      • A number of others to lesser degrees
  • What this isn't
    • Telling you how to be the next {Netflix, Facebook, Zynga, Google}
    • A presentation about the cloud (hopefully you know what that is)
    • A detailed checklist of what to do to get it right
  • The problem domain
    • Provisioning of machines and services isn't centralized, and perhaps not even human-driven.
    • Scaling means that machines come and go, and come and go, and come and go.
    • You may or may not have the same level of access that you are used to.
    • Things happen MUCH faster.
  • Foundational Elements of Cloud Ops
    • Automated Provisioning
    • Config Management
    • Monitoring
    • Orchestration
  • Automated Provisioning
    • Cloud has typically meant golden masters. FAIL
    • No access to the typical things you'd likely have access to, even with virtualization.
    • AMI, VHD, OVA, OVF, QCOW2, VMDK, oh my!
    • Have disk image, how to start? Multiple clouds?
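
The "which image format for which cloud" question above can be sketched as a simple lookup. This is a minimal illustration, not a build pipeline: the target names, the mapping, and the `pick_image` helper are all hypothetical, and real platforms often accept more than one format.

```python
# Hypothetical mapping from deployment target to the image format built for it.
# Real clouds and hypervisors frequently accept several formats; this is a sketch.
IMAGE_FORMAT_BY_TARGET = {
    "ec2": "ami",
    "kvm": "qcow2",
    "vmware": "vmdk",
    "xenserver": "vhd",
}


def pick_image(target, available):
    """Return the image artifact built for `target`.

    `available` maps a format name (e.g. "qcow2") to an artifact path or id.
    Raises ValueError for an unknown target and LookupError when no image
    in the required format has been built.
    """
    fmt = IMAGE_FORMAT_BY_TARGET.get(target)
    if fmt is None:
        raise ValueError(f"unknown target: {target}")
    if fmt not in available:
        raise LookupError(f"no {fmt} image built for {target}")
    return available[fmt]
```

Failing loudly when a format is missing keeps a multi-cloud deployment from silently booting the wrong image.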
  • Config Management
    • Machines come and go whimsically – no time to edit nodes.pp or manually handle certs
      • How do I ensure that it gets the correct manifests applied?
      • Is autosigning really secure?
    • Lots of potential for cruft to build up if done wrong.
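
One middle ground between signing everything and signing by hand is a policy check. Puppet's policy-based autosigning runs an executable with the certname as its only argument (and the CSR on stdin) and signs when it exits 0. The sketch below shows only the decision logic. The naming convention, the `example.internal` domain, and the idea that the provisioning layer records expected instance IDs are assumptions for illustration. A certname alone can be guessed, so a real policy would likely also inspect the CSR.

```python
import re

# Assumed naming convention (hypothetical): <role>-<instance id>.example.internal
CERTNAME_PATTERN = re.compile(r"^(web|app|db)-(i-[0-9a-f]{8})\.example\.internal$")


def should_sign(certname, expected_instance_ids):
    """Sign only if the certname fits the convention AND the instance id
    is one the provisioning layer says it launched."""
    match = CERTNAME_PATTERN.match(certname)
    if match is None:
        return False
    return match.group(2) in expected_instance_ids


def main(argv, expected_instance_ids):
    """Return the exit status an autosign policy executable would use:
    0 means sign the request, non-zero means leave it unsigned."""
    if len(argv) != 2:
        return 1
    return 0 if should_sign(argv[1], expected_instance_ids) else 1
```

Wired up as a script, `main(sys.argv, ...)` would be passed to `sys.exit`, with the expected IDs loaded from wherever the provisioning system records them.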
  • Monitoring
    • Scaling means you might not care about single instances.
      • If a single instance goes down, and no one notices, should you care?
      • At what point do you alert?
    • What do you do with all of the data? (That machine only stayed up for 30 minutes or 30 hours.)
    • Did the machine die, or did something magically decommission it? Or is it 'Cloud HA'?
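
Two of the questions above lend themselves to a small sketch: alerting on the health of the pool rather than on one instance, and separating machines that were decommissioned on purpose from machines that simply vanished. The function names, the 75% default threshold, and the set-based inputs are assumptions for illustration, not a recommendation.

```python
def should_alert(healthy, desired, min_fraction=0.75):
    """Alert when the healthy share of the pool drops below `min_fraction`.

    A single lost instance in a large pool stays quiet; losing a large
    share of the pool does not. An empty desired pool never alerts.
    """
    if desired <= 0:
        return False
    return healthy / desired < min_fraction


def classify_missing(known, reporting, decommissioned):
    """Split instances that stopped reporting into expected and unexpected.

    known:          set of instance ids the monitoring system has seen
    reporting:      set of instance ids still sending data
    decommissioned: set of instance ids the scaling layer removed on purpose
    """
    missing = known - reporting
    return {
        "expected": missing & decommissioned,
        "unexpected": missing - decommissioned,
    }
```

Only the "unexpected" set is a candidate for paging someone. The "expected" set is a cue to archive or expire that machine's data.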
  • Orchestration
    • Config Management gets you to a state – but that is only a small slice.
    • Scheduling batch jobs, app deployment.
    • How do you know how many appservers there are at any given time that you should be orchestrating?
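
The "how many appservers right now" question comes down to asking a live inventory instead of a static host list. A minimal sketch follows, assuming the inventory has already been fetched from the cloud's API into simple records. The `Instance` type, its fields, and the `"running"` state string are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    role: str   # e.g. "appserver", "db" (hypothetical role labels)
    state: str  # e.g. "running", "stopped" (hypothetical state strings)


def running_with_role(inventory, role):
    """Return the sorted names of running instances that carry `role`.

    The result is computed at call time, so each orchestration run sees
    the pool as it is at that moment rather than a cached list.
    """
    return sorted(
        inst.name
        for inst in inventory
        if inst.role == role and inst.state == "running"
    )
```

An orchestration step would call this immediately before acting, then fan the job out to exactly the names returned.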