Your SlideShare is downloading. ×
Have you been stalking your servers?
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Have you been stalking your servers?

573

Published on

Presentation for DrupalCon Prague 2013 …

Presentation for DrupalCon Prague 2013
https://prague2013.drupal.org/session/have-you-been-stalking-your-servers

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
573
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Have you been stalking your servers?
  • 2. Have you been stalking your servers? Marji Cermak Sysadmin & DevOps Engineer at Morpht marji@morpht.com @cermakm
  • 3. The rule of 3 things picture: http://www.flickr.com/photos/helenaperezgarcia/5692392667/
  • 4. The rule of 3 things 1. What is monitoring and why do you want to monitor 2. Some monitoring tools available for you 3. It is easy to start with monitoring.
  • 5. Part 1 What is monitoring and why do you want to monitor
  • 6. photo: http://www.flickr.com/photos/tiagopadua/7903366470/
  • 7. Monitoring Monitoring is an intermittent (regular or irregular) series of observations in time, carried out to show the extent of compliance with a formulated standard or degree of deviation from an expected norm. J. M. Hellawell (1991), modified by A. Brown (2000), http://jncc.defra.gov.uk/page-2268 nature conservation area
  • 8. Why you need to monitor ● to know about the bad news before your customers (or your boss)
  • 9. Why you need to monitor ● to know about the bad news before your customers (or your boss) ● to scale up your server in advance
  • 10. Why you need to monitor ● to know about the bad news before your customers (or your boss) ● to scale up your server in advance ● to tune up your app
  • 11. Why you need to monitor (cont.) ● to prove your uptime of 99.999 :)
  • 12. The fun of the nines Source: http://en.wikipedia.org/wiki/High_availability Nines: http://en.wikipedia.org/wiki/List_of_unusual_units_of_measurement#Nines
  • 13. Why you need to monitor (cont.) ● to prove your uptime of 99.999 :) ● to minimise downtime (expensive)
  • 14. Why you need to monitor (cont.) ● to prove your uptime of 99.999 :) ● to minimise downtime (expensive) ● to capture customer information
  • 15. Why you need to monitor (cont.) ● to have data / metrics to diagnose
  • 16. Diagnosing your collected data watch out for: ● trends
  • 17. Diagnosing your collected data watch out for: ● trends ● spikes
  • 18. Diagnosing your collected data watch out for: ● trends ● spikes ● irregularities
  • 19. Diagnosing your collected data watch out for: ● trends ● spikes ● irregularities ● thresholds
  • 20. Areas to monitor ● network photo: http://www.flickr.com/photos/misja_klimov/2120956405/
  • 21. Areas to monitor ● network ● server photo: http://www.flickr.com/photos/johnjack/3666997634/
  • 22. Areas to monitor ● network ● server ● services photo: http://www.flickr.com/photos/agustingodet/3691794089/
  • 23. Areas to monitor ● network ● server ● services photo: http://www.flickr.com/photos/agustingodet/3691792393/
  • 24. Areas to monitor ● network ● server ● services ● applications photo: http://www.flickr.com/photos/cheerfulstoic/942211994/
  • 25. Areas to monitor ● network ● server ● services ● applications ● users photo: http://www.flickr.com/photos/jimmysmith/99528596/
  • 26. Drupal Areas to monitor? ● network ● server ● services ● applications ● users
  • 27. Drupal Areas to monitor ● network ● server ● services ● applications ● users
  • 28. Drupal Areas to monitor ● network ● server ● services ● applications ● users
  • 29. Drupal Areas to monitor ● network ● server ● services ○ webserver ○ database ● applications ● users
  • 30. Drupal Areas to monitor ● network ● server ● services ○ webserver ○ database ● applications - your Drupal site(s) ● users
  • 31. Drupal Areas to monitor ● network ● server ● services ○ webserver ○ database ● applications - your Drupal site(s) ● users
  • 32. Part 2 Some monitoring tools available for you
  • 33. Meet Nagios, Munin and others ● Nagios ● Munin ● APC dashboard ● related Drupal modules
  • 34. Nagios /ˈnɑːɡiːoʊs/ ● system, network and infrastructure monitoring software application ● monitors and alerts
  • 35. Nagios /ˈnɑːɡiːoʊs/ Provides monitoring of: ● network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH), ● host resources (processor load, disk usage, system logs), ● anything else like probes (temperature, alarms, etc). Many plugins available.
  • 36. Nagios /ˈnɑːɡiːoʊs/ Name and Pronunciation: ● NetSaint -> "Nagios Ain't Gonna Insist On Sainthood" ● Agios' a transliteration of the Greek word άγιος (saint)
  • 37. Nagios /ˈnɑːɡiːoʊs/ ● alerts by email/pager/IM... ● alerts to different contacts ● notification escalation ● service / host dependencies ● soft / hard states
  • 38. Nagios /ˈnɑːɡiːoʊs/
  • 39. Nagios Addons NRPE (Nagios Remote Plugin Executor) - executes plugins on remote Linux/Unix hosts image source: http://nagios.sourceforge.net/docs/3_0/addons.html
  • 40. Nagios Addons NSCA - sends passive checks from remote Linux/Unix hosts to Nagios image source: http://nagios.sourceforge.net/docs/3_0/addons.html
  • 41. Drupal and Nagios
  • 42. Munin ● network/system monitoring application ● outputs graphs through a web interface ● many plugins
  • 43. Munin ● master / node architecture ● connects to all nodes at regular intervals ● it uses the RRDtool (round robin database tool, handles time-series data)
  • 44. Munin Example
  • 45. Drupal and Munin
  • 46. Drupal and Munin
  • 47. ● they complement each other ● nagios normally alerts on one “service” ● munin can be used to correlate different things Nagios & Munin
  • 48. APC - what is it? The Alternative PHP Cache (APC) is a free and open opcode cache for PHP.
  • 49. APC - what is it? The Alternative PHP Cache (APC) is a free and open opcode cache for PHP. Its goal is to provide a free, open, and robust framework for caching and optimising PHP intermediate code. Inside your webserver (not a webcache)
  • 50. Monitoring APC Memory Usage, Hit & Misses
  • 51. Monitoring APC Fragmentation
  • 52. Monitoring APC memory usage
  • 53. Monitoring APC files in cache
  • 54. Other monitoring tools ● Collectd ● Graphite ● Shinken ● Sensu ● NewRelic ● Pingdom
  • 55. Part 3 It is easy to start with monitoring.
  • 56. How to install these tools? Munin sudo apt-get install munin munin-node Nagios sudo apt-get install nagios3 APC dashboard php.apc script from php-apc package
  • 57. How to configure these? ● It is a bit fiddly ● There are many guides targeting beginners ● You don’t want to do it again and again
  • 58. puppet – a quick way to start system for automating system administration tasks
  • 59. puppet – a quick way to start ● a declarative language for expressing system configuration,
  • 60. puppet – a quick way to start ● a declarative language for expressing system configuration, ● a client and server for distributing it
  • 61. puppet – a quick way to start ● a declarative language for expressing system configuration, ● a client and server for distributing it ● and a library for realising the configuration.
  • 62. puppet – a quick way to start }
  • 63. puppet – a quick way to start 1. clone the stalk-your-box repo 2. run puppet apply on the code 3. monitor!
  • 64. A quick way to start $ git clone git://github.com/morpht/stalk-your-box.git /tmp/stalk-your-box Cloning into '/tmp/stalk-your-box'... remote: Counting objects: 23, done. remote: Compressing objects: 100% (19/19), done. remote: Total 23 (delta 1), reused 23 (delta 1) Receiving objects: 100% (23/23), 11.35 KiB, done. Resolving deltas: 100% (1/1), done.
  • 65. A quick way to start $ cd /tmp/stalk-your-box/ $ sudo puppet apply --modulepath=modules manifest.pp notice: /Stage[main]/Nagios::Server/Package[nagios3]/ensure: ensure changed 'purged' to 'present' notice: /Stage[main]/Nagios::Server/File[/etc/nagios3/htpasswd.users]/ensure: created notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: Adding password for user nagiosadmin notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: executed successfully notice: /Stage[main]/Munin::Node/Package[libcache-cache-perl]/ensure: ensure changed 'purged' to 'present' notice: /Stage[main]/Munin::Node/Package[munin-node]/ensure: ensure changed 'purged' to 'present' notice: /Stage[main]/Munin::Node/File[munin-node.conf]/content: content changed '{md5} e486786f866d7d7e025dea401c300e7b' to '{md5}dbf97a87a8da86ef68155815ecae3c1c' notice: /Stage[main]/Munin::Server/Service[apache2]: Triggered 'refresh' from 1 events notice: Finished catalog run in 44.26 seconds
  • 66. What this gives you
  • 67. What this gives you
  • 68. What this gives you
  • 69. Manifest.pp
  • 70. Manifest.pp
  • 71. Manifest.pp
  • 72. Summary It is easy to start with monitoring.
  • 73. The fun part - what’s wrong? What’s wrong here?
  • 74. The fun part - what’s wrong?
  • 75. Questions Here is the get started monitoring repo: https://github.com/morpht/stalk-your-box Marji Cermak Sysadmin & DevOps Engineer at Morpht marji@morpht.com @cermakm
  • 76. Resources Rule of Three: en.wikipedia.org/wiki/Rule_of_three_(writing) Nagios: http://www.nagios.org/ Munin: http://munin-monitoring.org/ Nagios module: https://drupal.org/project/nagios Munin module: https://drupal.org/project/munin Munin plugins (experimental): https://drupal.org/sandbox/murrayw/2084281 Sensu: http://sensuapp.org MySQLTuner: http://MySQLTuner.pl
  • 77. THANK YOU! WHAT DID YOU THINK? Locate this session at the DrupalCon Prague website: http://prague2013.drupal.org/schedule Click the “Take the survey” link

×