Advanced troubleshooting linux performance

Small presentation about monitoring stack performance (for example on Linux) to see its behavior and troubleshoot operational issues.

Advanced troubleshooting linux performance Presentation Transcript

  • 1. Advanced troubleshooting linux performance By Naor Weissmann
  • 2. strace strace is a powerful debugging utility for Linux and some other Unix-like systems to monitor the system calls used by a program and all the signals it receives.
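A raw strace log gets long quickly, so a first step is often to summarise which system calls dominate. The following is a minimal sketch, assuming the trace was saved as text (for example with `strace -o trace.txt <command>`); the helper name `count_syscalls` and the sample lines are illustrative and were not captured from a real run.

```python
import re
from collections import Counter

# Start of a typical strace line: an optional PID column (present when
# following child processes), then the syscall name followed by "(".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")


def count_syscalls(lines):
    """Count how often each system call appears in strace text output."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal ("--- ...") and exit ("+++ ...") lines are skipped
            counts[match.group(1)] += 1
    return counts


# Illustrative sample only, written by hand in strace's usual layout.
SAMPLE = """\
openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
read(3, "127.0.0.1 localhost\\n", 4096) = 20
read(3, "", 4096) = 0
close(3) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED} ---
+++ exited with 0 +++
""".splitlines()

if __name__ == "__main__":
    for name, n in count_syscalls(SAMPLE).most_common():
        print(name, n)
```

strace can produce a similar per-call summary itself with its `-c` option; parsing the log by hand is mainly useful when the counts need to be combined with other data.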
  • 3. ltrace It intercepts and records the dynamic library calls which are called by the executed process and the signals which are received by that process.
  • 4. When you need visualised trending. Let's be honest: the raw data from sar is just not good enough for analysis, especially when you need to present your findings to others. ● isag ● ksar
  • 5. isag ● Basic GUI ● Has some security concerns ● Not included on RH / CentOS
  • 6. ksar ● new and powerful ● built in Java
  • 7. Monitoring across the stack It is now common practice to split stack roles between machines at every level, and even more so since virtualization appeared and gained acceptance. To monitor and troubleshoot your application you need one place to monitor and relate everything.
  • 8. When you need complex correlations Munin, the mother of all visualisations
  • 9. Munin ● Client - server architecture ● Tons of ready to go plugins ● Easily deployed ● Extendable with custom plugins ● Custom graph aggregation ● Uses RRD as a database
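To give a feel for what a custom plugin involves, here is a minimal sketch of one in Python. It relies on the usual Munin plugin convention: called with the argument `config` the plugin prints graph metadata, and called without arguments it prints the current values. The field name `load`, the graph title and the helper `munin_output` are choices made for this example.

```python
#!/usr/bin/env python
import os
import sys


def munin_output(args, load):
    """Return the lines a Munin plugin prints for the given arguments."""
    if args and args[0] == "config":
        # Metadata describing the graph and its single data series.
        return [
            "graph_title Load average (1 min)",
            "graph_vlabel load",
            "graph_category system",
            "load.label load",
        ]
    # Normal run: one "<field>.value <number>" line per data series.
    return ["load.value %.2f" % load]


if __name__ == "__main__":
    one_minute = os.getloadavg()[0]  # 1-minute load average (Unix only)
    print("\n".join(munin_output(sys.argv[1:], one_minute)))
```

Keeping the formatting in a small function separates it from the measurement, so the output can be checked without a running Munin node.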
  • 10. Frameworks ● script-based frameworks such as watchdog ● full frameworks such as sensu Frameworks are: ● structured ● easier to deploy ● supported http://www.sensuapp.org https://github.com/sebastien/monitoring
  • 11. Monitoring (aka watchdog) ● monitoring and data-collection daemon ● lightweight ● written in Python Good for: ● being notified when incidents happen ● taking automatic actions ● collecting statistics for further processing
  • 12. example-service-monitoring.py
        #!/usr/bin/env python
        from monitoring import *

        # Query Google once per second; a response slower than 80 ms counts as a failure.
        Monitor(
            Service(
                name    = "google-search-latency",
                monitor = (
                    HTTP(
                        GET="http://www.google.ca/search?q=monitoring",
                        freq=Time.s(1),
                        timeout=Time.ms(80),
                        fail=[
                            Print("Google search query took more than 80ms")
                        ]
                    )
                )
            )
        ).run()
  • 13. example-system-health.py
        from functools import reduce  # built in on Python 2; this import is needed on Python 3
        from monitoring import *

        Monitor(
            Service(
                name    = "system-health",
                monitor = (
                    # Every second, log memory, highest disk and CPU usage.
                    SystemInfo(freq=Time.s(1),
                        success = (
                            LogResult("myserver.system.mem=",  extract=lambda r,_:r["memoryUsage"]),
                            LogResult("myserver.system.disk=", extract=lambda r,_:reduce(max,r["diskUsage"].values())),
                            LogResult("myserver.system.cpu=",  extract=lambda r,_:r["cpuUsage"]),
                        )
                    ),
                    # Log eth0 traffic, with bytes converted to megabytes.
                    Delta(
                        Bandwidth("eth0", freq=Time.s(1)),
                        extract = lambda v:v["total"]["bytes"]/1000.0/1000.0,
                        success = [LogResult("myserver.system.eth0.sent=")]
                    ),
                    # Every minute, check CPU, disk and memory against 90% thresholds
                    # and log failures to a file.
                    SystemHealth(
                        cpu=0.90, disk=0.90, mem=0.90,
                        freq=Time.s(60),
                        fail=[Log(path="monitoring-system-failures.log")]
                    ),
                )
            )
        ).run()
  • 14. Sensu ● lightweight ● written in Python / Ruby ● can re-use Nagios plugins Describes itself as a “monitoring router”: basically a framework that connects “check” scripts run across many nodes with “handler” scripts run on one or more Sensu servers.
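Because Sensu can re-use Nagios plugins, a "check" can be as small as a script that prints one status line and reports its result through the Nagios exit-code convention (0 OK, 1 WARNING, 2 CRITICAL). The sketch below follows that convention; the function names and the 80% / 90% thresholds are assumptions made for this example, not values from the slides.

```python
import shutil

# Nagios-style exit codes, which Sensu checks also use.
OK, WARNING, CRITICAL = 0, 1, 2


def check_disk(used_fraction, warn=0.80, crit=0.90):
    """Map a disk usage fraction (0.0 to 1.0) to an exit code and a status line."""
    percent = used_fraction * 100
    if used_fraction >= crit:
        return CRITICAL, "CRITICAL - disk %.0f%% used" % percent
    if used_fraction >= warn:
        return WARNING, "WARNING - disk %.0f%% used" % percent
    return OK, "OK - disk %.0f%% used" % percent


def main(path="/"):
    """Check the filesystem holding `path`; print the status, return the exit code."""
    usage = shutil.disk_usage(path)
    code, message = check_disk(usage.used / usage.total)
    print(message)
    return code

# As a standalone check script, finish with:
#     import sys; sys.exit(main())
```

The thresholds are plain keyword arguments so the same function can serve several filesystems with different limits.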
  • 15. Librato Create beautiful dashboards from different data sources.
  • 16. logstash logstash is a tool for managing your logs. It helps you take logs and other event data from your systems and move it into a central place. logstash is open source and completely free. http://logstash.net/
  • 17. logstash sample: eventlog
        input {
          eventlog {
            type    => 'Win32-EventLog'
            logfile => 'System'
          }
        }
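  • 18. logstash sample: date For example, syslog events usually have timestamps like this: "Apr 17 09:32:01", which the format "MMM dd HH:mm:ss" parses. A match can list several formats, which are tried in order:
        match => [ "logdate", "MMM dd YYY HH:mm:ss", "MMM d YYY HH:mm:ss", "ISO8601" ]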
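The idea behind listing several formats is "try each one until a timestamp parses". The following Python sketch mirrors that idea with the standard library; it is an analogue for illustration, not logstash code, and the `strptime` format is an assumed translation of the patterns on the slide.

```python
from datetime import datetime

# Assumed strptime counterpart of "MMM dd YYY HH:mm:ss". Python's %d also
# accepts a single-digit day, so one entry covers both "dd" and "d".
FORMATS = ["%b %d %Y %H:%M:%S"]


def parse_logdate(text):
    """Try each known format in order, then fall back to ISO 8601."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not fit, try the next one
    # Raises ValueError if the text is not ISO 8601 either.
    return datetime.fromisoformat(text)


if __name__ == "__main__":
    print(parse_logdate("Apr 17 2014 09:32:01"))
    print(parse_logdate("2014-04-17T09:32:01"))
```

Ordering matters in such a list: the most common format should come first so that most lines are parsed on the first attempt.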
  • 19. logstash sample: output to redis
        output {
          stdout { debug => true debug_format => "json" }
          redis  { host => "127.0.0.1" data_type => "list" key => "logstash" }
        }
  • 20. graylog Graylog2 is an open source log management solution that stores your logs in ElasticSearch. It consists of a server written in Java that accepts your syslog messages via TCP, UDP or AMQP and stores them in the database. The second part is a web interface that allows you to manage the log messages from your web browser.
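Since the server accepts syslog over UDP, an application can ship its own messages with nothing but the Python standard library. This is a minimal sketch; the helper name `make_syslog_logger` is made up for the example, and the host and port in the usage comment are placeholders to be replaced with wherever your Graylog2 syslog input actually listens.

```python
import logging
import logging.handlers


def make_syslog_logger(host, port, name="app"):
    """Return a logger that sends each record as a syslog message over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP by default when given a (host, port) address.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger

# Usage (hypothetical host and port):
#     log = make_syslog_logger("graylog.example.com", 514)
#     log.warning("disk almost full")
```

UDP delivery is fire and forget, so messages can be lost without any error; TCP or AMQP are the alternatives the slide mentions when delivery matters.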
  • 21. graylog
  • 22. logstash + graylog = rocking conf Any log amount + searchable + alertable = usable
  • 23. logstash + graylog