Monitoring and metrics in the cloud
Published in Technology
  • 1. Metrics and Monitoring in the cloud. David Lutz, @dlutzy
  • 2. The objective of metrics is to make pretty graphs…
  • 3. The objective of metrics is to make pretty graphs... in order to understand the performance and capacity of your systems and how they vary over time.
  • 4. The objective of monitoring is to… make the Operations-guy-on-call’s life hell.
  • 5. The objective of monitoring is to check that the system is working as expected and take action if some component isn’t.
  • 6. “Those who cannot remember the past are condemned to repeat it” - George Santayana. So here’s a case study…
  • 7. A long time ago in a data centre far, far away…
  • 8. Complete system includes humans to run it! Human Factors Engineering. http://en.wikipedia.org/wiki/Human_factors 2 x Linux Engineers, 1 x Network Engineer, 1 x Do Anything Guy, 1 x Developer
  • 9. No Monitoring or Metrics. Black Box. Completely blind.
  • 10. MRTG, NetSaint
  • 11. Large Development team, External Consultants, ITIL Process people. 5 x Linux Engineers, 1 x Network Engineer, 2 x Database Administrators, and part of an Infrastructure team that included Virtualization specialists, Storage specialists, Hardware specialists
  • 12. WTF happened? It grew… Virtualization / Cloud Cloud / Virtualization
  • 13. Approximately 400 servers. Still using Nagios and Cacti. 15 minutes to add a server manually. 1 hour or more to add a new check.
  • 14. And Ganglia. And external SaaS tools: New Relic. Gomez. Omniture.
  • 15. #monitoringsucks
  • 16. Getting it right
  • 17. Getting it wrong
  • 18. What’s different about the cloud?
      • Servers come and go
      • Sometimes automatically with auto-scaling
      • Topologies and Architectures change rapidly
      • Driven from Configuration Management Systems
  • 19. The problems with Nagios
      • Clunky UI.
      • Monolithic design.
      • Hard to scale.
      • Hard to add nodes dynamically.
  • 20. #doingitright
  • 21. #doingitwrong
  • 22. Sensu… Is it the Nagios killer?
  • 23. sensu-server, sensu-client, sensu-api, sensu-dashboard
  • 24.
      • JSON everywhere
      • Can re-use Nagios checks
      • Messaging oriented architecture
      • Designed to be driven from Config Management tools
      • Supports dynamic topologies
  • 25. ?
  • 26. ?
  • 27. David Lutz, 99designs, @dlutzy, meetup.com/Infrastructure-Coders/
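
Slide 24 says that Nagios checks can be re-used and that results travel as JSON. The sketch below illustrates why that re-use is straightforward: a Nagios-style plugin is just a command that prints one line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function name `run_check` and the JSON field names are assumptions made for illustration and are not Sensu's actual result schema; the "plugin" is a stand-in one-liner rather than a real check script.

```python
import json
import subprocess
import sys
import time

# Nagios plugin exit-code convention.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(name, command):
    """Run a Nagios-style plugin and wrap its result in a JSON-ready dict.

    The dict layout is hypothetical, chosen only to illustrate the idea.
    """
    proc = subprocess.run(command, capture_output=True, text=True)
    # Any exit code outside the convention is treated as UNKNOWN.
    status = proc.returncode if proc.returncode in STATUS_NAMES else 3
    return {
        "name": name,
        "command": command,
        "status": status,
        "status_name": STATUS_NAMES[status],
        "output": proc.stdout.strip(),
        "executed": int(time.time()),
    }


if __name__ == "__main__":
    # Stand-in for a real plugin: prints a message and exits with WARNING.
    fake_plugin = [
        sys.executable,
        "-c",
        "import sys; print('DISK WARNING - 85% used'); sys.exit(1)",
    ]
    print(json.dumps(run_check("check_disk", fake_plugin), indent=2))
```

Because the wrapper only depends on the exit code and standard output, any existing plugin that follows the convention could be substituted for the stand-in command without changes.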