Open Source
Monitoring Tools
Save the server, Save the world


Jeff Smith
jeffreyksmithjr@gmail.com
@theperipatetic1
Problem
Many heterogeneous servers
Can't let them fail
Servers can't talk or fix themselves...yet
WWBD?
What would Barbie do?
"Systems administration is hard.
Let's go shopping!"
Why Open Source?
● Free!
  ○ Free as in beer
  ○ Free as in speech
● Classic reasons*
  ○ Market Share
  ○ Reliability
  ○ Performance
  ○ Scalability
  ○ Security
  ○ Total Cost of Ownership

* from David A. Wheeler
Monitoring Master:
Ganglia




Good enough for Wikipedia
Alert and Oriented:
Nagios




(Complex) alerts
(Complex) administration
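
To make the alerting idea concrete: a Nagios check is conventionally a small program whose exit code reports state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) alongside a one-line status message. The sketch below assumes a metric where higher is worse; the function name `check_value` and the thresholds are hypothetical, chosen only for illustration.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_value(value, warn, crit):
    """Return (exit_code, status_line) for a metric where higher is worse.

    Hypothetical helper: warn and crit are illustrative thresholds.
    """
    if value is None:
        return UNKNOWN, "UNKNOWN - no data"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value={value}"
    if value >= warn:
        return WARNING, f"WARNING - value={value}"
    return OK, f"OK - value={value}"


# Example with made-up numbers; a real plugin would measure something
# and pass the exit code to sys.exit().
code, line = check_value(7.5, warn=5.0, crit=10.0)
print(line)
```

A real check would end by exiting with `code`, which is how the monitoring server decides whether to alert.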
Somebird to watch over me:
Munin




Comprehensible
Plug-and-play
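
One way to see why Munin feels plug-and-play: a plugin is just an executable that prints graph metadata when called with `config` and prints `field.value` lines otherwise. The sketch below mimics that shape; the function name `plugin_output` and the load figure are hypothetical, and a real plugin would read its arguments from the command line and take a live measurement.

```python
def plugin_output(args, load):
    """Return the lines a Munin-style plugin would print.

    Hypothetical helper: args stands in for the command-line arguments,
    load for a measured value.
    """
    if args[:1] == ["config"]:
        # Metadata describing the graph and its single field.
        return [
            "graph_title Load average",
            "graph_vlabel load",
            "load.label load",
        ]
    # Normal run: report the current value of the field.
    return [f"load.value {load:.2f}"]


# Illustrative calls with a made-up measurement.
print("\n".join(plugin_output(["config"], 0.42)))
print("\n".join(plugin_output([], 0.42)))
```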
A daemon in the sack:
Collectd




Do one thing, and do it well.
A back end that doesn't get bigger:
RRD
● Used nearly everywhere
● Few competitors historically
● De facto standard
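
Why the back end "doesn't get bigger": a round-robin database keeps a fixed number of slots and overwrites the oldest sample once it is full. The sketch below illustrates only that idea with an in-memory ring buffer; the class `RoundRobinArchive` is hypothetical and is not RRDtool's API or file format.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-capacity store: once full, each new sample evicts the oldest.

    Hypothetical illustration of the round-robin idea only.
    """

    def __init__(self, capacity):
        # deque with maxlen drops the oldest entry automatically.
        self.samples = deque(maxlen=capacity)

    def update(self, timestamp, value):
        self.samples.append((timestamp, value))

    def fetch(self):
        return list(self.samples)


# Five updates into three slots: only the newest three survive.
rra = RoundRobinArchive(capacity=3)
for t, v in enumerate([0.5, 0.7, 0.9, 1.1, 1.3]):
    rra.update(t, v)
print(rra.fetch())  # [(2, 0.9), (3, 1.1), (4, 1.3)]
```

Storage stays constant no matter how long the server runs, at the cost of losing the oldest data.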
Just here for the scenery:
Graphing Tools
● RRDtool used to power (nearly) everything
● Gnuplot
● Innovation on the horizon
  ○ Graphite, Flot, etc.
Spoiled for choice:
Selection Criteria
● All about architecture
● What is YOUR use case?
● Beware the splitters
Don't stop me now:
Learning More
● RTFM
● RRTFM
● Books?
  ○ Ganglia: soon to be 1
  ○ Nagios: 5
  ○ Munin: 1...in German!
  ○ Collectd: 0
Leading Lights
● Tobi Oetiker
● Universities
  ○ Berkeley
● Large websites
  ○ Etsy
  ○ StumbleUpon
Whoopty do, what does it all mean?
Servers fail
Be a PreCop
Real-time bug detection
"Measure Anything, Measure Everything"
  - Etsy
Be a hero
Save the server, Save the world
Cyberstalk Me

Jeff Smith
jeffreyksmithjr@gmail.com
@theperipatetic1
