Advanced troubleshooting Linux performance

By Naor Weissmann

strace
strace is a powerful debugging utility for Linux
and some other Unix-like systems to monitor
the system calls used by a program and all the
signals it receives.
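
As a rough illustration of what that monitoring yields, the sketch below counts the system calls and signals found in a saved strace log. It assumes the log was written with something like `strace -f -o trace.log <command>`; the helper name `count_syscalls` and the sample lines are hypothetical and were not captured from a real run.

```python
import re
from collections import Counter

# Optional PID prefix (present when logging with -f -o), then "name(".
SYSCALL_RE = re.compile(r"^(?:\d+\s+)?(\w+)\(")
# Signal deliveries are logged as "--- SIGNAME {...} ---".
SIGNAL_RE = re.compile(r"^(?:\d+\s+)?--- (SIG\w+)")


def count_syscalls(lines):
    """Return (syscall counts, signal counts) for an iterable of strace log lines."""
    syscalls, signals = Counter(), Counter()
    for line in lines:
        sig = SIGNAL_RE.match(line)
        if sig:
            signals[sig.group(1)] += 1
            continue
        call = SYSCALL_RE.match(line)
        if call:
            syscalls[call.group(1)] += 1
    return syscalls, signals


# Illustrative sample only (assumption), shaped like typical strace output.
SAMPLE = """\
openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3
read(3, "127.0.0.1 localhost\\n", 4096) = 20
read(3, "", 4096) = 0
close(3) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED} ---
+++ exited with 0 +++
"""

syscalls, signals = count_syscalls(SAMPLE.splitlines())
print(syscalls.most_common(3))
print(dict(signals))
```

For a quick summary without any scripting, `strace -c` prints a per-syscall count and time table when the traced program exits.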

ltrace
ltrace intercepts and records the dynamic library
calls made by the executed process
and the signals received by that
process.

When you need visualised trending.
Let's be honest: the raw data from sar is just
not good enough for analysis, especially when
you need to present your findings to others.
● isag
● ksar
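
To show why the raw text is awkward to work with, here is a minimal sketch that turns `sar -u` style output into rows a plotting tool could consume. The function name `parse_sar_cpu` and the sample text are assumptions made for illustration; real sar output varies with version and locale, so treat this as a starting point only.

```python
def parse_sar_cpu(text):
    """Parse 'sar -u' style text into a list of dicts, one per sample line."""
    rows, names = [], None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "Average:":
            continue  # skip blank lines and the trailing summary
        if "%idle" in tokens:
            # Header line: remember the column names from "CPU" onward.
            names = tokens[tokens.index("CPU"):]
            continue
        if names is None or len(tokens) <= len(names):
            continue  # banner line or anything too short to be a sample
        values = tokens[-len(names):]
        # Whatever precedes the columns is the timestamp ("12:10:01 AM").
        row = {"time": " ".join(tokens[:-len(names)]), "CPU": values[0]}
        try:
            for name, value in zip(names[1:], values[1:]):
                row[name] = float(value)
        except ValueError:
            continue  # not a numeric sample line
        rows.append(row)
    return rows


# Illustrative sample only (assumption), not taken from a real host.
SAMPLE = """\
Linux 3.10.0 (myserver)  01/01/2014  _x86_64_  (2 CPU)

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      1.20      0.00      0.50      0.10      0.00     98.20
12:20:01 AM     all      5.00      0.00      1.00      0.40      0.00     93.60
Average:        all      3.10      0.00      0.75      0.25      0.00     95.90
"""

rows = parse_sar_cpu(SAMPLE)
busiest = min(rows, key=lambda r: r["%idle"])
print(busiest["time"], busiest["%idle"])
```

Tools such as isag and ksar do this parsing for you and add the graphs on top.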

isag
● Basic GUI
● Has some security concerns
● Not included on RH / CentOS

ksar
● New and powerful
● Built in Java

Monitoring across the stack
It is now common practice to split the roles of
the stack between machines on all levels.
This practice has become even more common since
the appearance and acceptance of
virtualization.
To monitor and troubleshoot your application you
need one place to monitor and relate
everything.

Then you need complex correlations
Munin, the mother of all visualisations

Munin
● Client - server architecture
● Tons of ready to go plugins
● Easily deployed
● Extendable with custom plugins
● Custom graph aggregation
● Uses RRD as a database
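
A custom plugin is just an executable: called with the argument `config` it describes the graph, and called without arguments it prints the current values. The sketch below is a minimal, hypothetical load-average plugin; the field name `load` and the graph labels are choices made for this example, not part of any shipped plugin.

```python
import os
import sys


def config_lines():
    """Lines printed when the plugin is called with the 'config' argument."""
    return [
        "graph_title Load average (1 min)",
        "graph_vlabel load",
        "graph_category system",
        "load.label load",
    ]


def value_lines(load):
    """Lines printed on a normal run: one '<field>.value <number>' per field."""
    return ["load.value %.2f" % load]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        lines = config_lines()
    else:
        # os.getloadavg() returns the 1, 5 and 15 minute load averages.
        lines = value_lines(os.getloadavg()[0])
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

On a node, such a script would typically be made executable and linked into the munin-node plugin directory; the exact path depends on the distribution.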

Frameworks
● script-based frameworks such as watchdog
● full frameworks such as Sensu
● structured
● easier in deployment
● supported
http://www.sensuapp.org
https://github.com/sebastien/monitoring

Monitoring (aka watchdog)
● monitoring and data-collection daemon
● lightweight
● written in Python
Good for:
● being notified when incidents happen
● taking automatic actions
● collecting statistics for further processing

example-service-monitoring.py
#!/usr/bin/env python
from monitoring import *

# Query Google once per second; run the fail actions when the check fails.
Monitor(
    Service(
        name = "google-search-latency",
        monitor = (
            HTTP(
                GET="http://www.google.ca/search?q=monitoring",
                freq=Time.s(1),
                timeout=Time.ms(80),
                fail=[
                    # The message matches the 80 ms timeout above.
                    Print("Google search query took more than 80ms")
                ]
            )
        )
    )
).run()

example-system-health.py
from monitoring import *

Monitor(
    Service(
        name = "system-health",
        monitor = (
            # Log memory, disk and CPU usage every second.
            SystemInfo(freq=Time.s(1),
                success = (
                    LogResult("myserver.system.mem=", extract=lambda r,_:r["memoryUsage"]),
                    # Report the fullest disk.
                    LogResult("myserver.system.disk=", extract=lambda r,_:reduce(max,r["diskUsage"].values())),
                    LogResult("myserver.system.cpu=", extract=lambda r,_:r["cpuUsage"]),
                )
            ),
            # Log the eth0 traffic delta, converted from bytes to megabytes.
            Delta(
                Bandwidth("eth0", freq=Time.s(1)),
                extract = lambda v:v["total"]["bytes"]/1000.0/1000.0,
                success = [LogResult("myserver.system.eth0.sent=")]
            ),
            # Every 60 seconds, check CPU, disk and memory against a 0.90 threshold.
            SystemHealth(
                cpu=0.90, disk=0.90, mem=0.90,
                freq=Time.s(60),
                fail=[Log(path="monitoring-system-failures.log")]
            ),
        )
    )
).run()

Sensu
● lightweight
● written in Ruby
● Can re-use Nagios plugins
Sensu describes itself as a “monitoring router”.
Basically, it is a framework that
connects “check” scripts run across many
nodes with “handler” scripts run on one or
more Sensu servers.
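
A “check” can be any executable that follows the Nagios plugin convention: print one status line and exit with 0 (OK), 1 (WARNING) or 2 (CRITICAL). The sketch below is a hypothetical disk check; the name `CheckDisk` and the 80% / 90% thresholds are assumptions chosen for the example.

```python
import shutil

# Exit codes shared with Nagios plugins.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(used_fraction, warn=0.80, crit=0.90):
    """Map a usage fraction (0.0 to 1.0) to an exit code and a status line."""
    percent = used_fraction * 100
    if used_fraction >= crit:
        return CRITICAL, "CheckDisk CRITICAL: %.0f%% used" % percent
    if used_fraction >= warn:
        return WARNING, "CheckDisk WARNING: %.0f%% used" % percent
    return OK, "CheckDisk OK: %.0f%% used" % percent


def check_disk(path="/"):
    """Measure how full the filesystem holding `path` is and evaluate it."""
    usage = shutil.disk_usage(path)
    return evaluate(usage.used / usage.total)


if __name__ == "__main__":
    status, message = check_disk("/")
    print(message)
    # A deployed check would end with sys.exit(status) so that the
    # monitoring agent can read the result from the exit code.
```

Keeping the threshold logic in `evaluate` separate from the measurement makes the check easy to test without touching a real disk.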

Librato
Create beautiful dashboards from different data
sources.

logstash
logstash is a tool for managing your logs. It
helps you take logs and other event data from
your systems and move it into a central place.
logstash is open source and completely free.
http://logstash.net/

logstash sample: eventlog
input {
  eventlog {
    type => 'Win32-EventLog'
    logfile => 'System'
  }
}

logstash sample: date
For example, syslog events usually have
timestamps like this:
"Apr 17 09:32:01"
The date format "MMM dd HH:mm:ss" parses this.
The match option can list several formats to try:
match => [ "logdate", "MMM dd YYY HH:mm:ss",
"MMM d YYY HH:mm:ss", "ISO8601" ]
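
The same idea, trying a list of formats until one fits, can be sketched in Python. This is an analogy only and not how logstash is implemented; the function name `parse_logdate` and the `default_year` parameter are hypothetical. Classic syslog timestamps carry no year, so the caller has to supply one.

```python
from datetime import datetime


def parse_logdate(text, default_year):
    """Try each known timestamp format in turn and return the first match."""
    candidates = [
        # Syslog style, e.g. "Apr 17 09:32:01": prepend the missing year.
        ("%d %s" % (default_year, text), "%Y %b %d %H:%M:%S"),
        # ISO 8601 style without timezone, e.g. "2014-04-17T09:32:01".
        (text, "%Y-%m-%dT%H:%M:%S"),
    ]
    for value, fmt in candidates:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # this format did not fit, try the next one
    raise ValueError("unrecognised timestamp: %r" % text)


print(parse_logdate("Apr 17 09:32:01", 2014))
print(parse_logdate("2014-04-17T09:32:01", 2014))
```

Note that `%b` depends on the process locale; the sketch assumes English month abbreviations.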

logstash sample: output to redis
output {
  stdout { debug => true debug_format => "json" }
  redis { host => "127.0.0.1" data_type => "list" key => "logstash" }
}

graylog
Graylog2 is an open source log management
solution that stores your logs in Elasticsearch.
It consists of a server written in Java that
accepts your syslog messages via TCP, UDP
or AMQP and stores them in the database. The
second part is a web interface that allows you
to manage the log messages from your web
browser.
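
To make the syslog-over-UDP input concrete, here is a minimal sketch that builds a bare BSD-style syslog datagram and sends it. The priority value is facility * 8 + severity; the timestamp and hostname fields are left out for brevity, on the assumption that the receiver fills them in. The host name `graylog.example.com` and port 514 are placeholders, not values from the slides.

```python
import socket

# Facility and severity numbers from the syslog protocol.
FACILITY_USER = 1
SEVERITY_INFO = 6


def syslog_packet(facility, severity, tag, message):
    """Build a minimal syslog datagram: '<PRI>tag: message'."""
    priority = facility * 8 + severity
    return ("<%d>%s: %s" % (priority, tag, message)).encode("utf-8")


def send_syslog(packet, host, port=514):
    """Send one datagram over UDP; there is no delivery confirmation."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(packet, (host, port))
    finally:
        sock.close()


packet = syslog_packet(FACILITY_USER, SEVERITY_INFO, "myapp", "service started")
print(packet)
# To actually send it (hypothetical host):
# send_syslog(packet, "graylog.example.com")
```

In application code, the standard library's `logging.handlers.SysLogHandler` covers the same ground with less effort.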

logstash + graylog = rocking conf
Any amount of logs
+
searchable
+
alertable
=
usable
