
Capacity Planning and Performance Monitoring with Free Tools

  • "We reject kings, presidents, and voting; we believe in rough consensus and running code." -- David Clark, IAB chair, 1992
  • I added the SNMP-Informant and it started appearing automagically.
  • 920804400 is noon on 7 March 1999, expressed as a Unix timestamp. DS is a data source: speed is stored as a COUNTER, collected every 300 seconds (the default step), and 600 is the heartbeat, the maximum time to wait for an update before the data is treated as unknown. U:U means the minimum and maximum are unknown. RRA is a round robin archive: 0.5 is the xfiles factor, the fraction of unknown values in a consolidation interval above which the consolidated value is itself unknown; 1:24 stores every interval as-is (no averaging) and keeps 24 rows (2 hours' worth); 6:10 averages every 6 values and keeps 10 rows.
  • Start at noon, end at 13:00, average RRA called SPEED using 2 pixel thickness and color red #FF0000
  • Transcript of "Capacity Planning and Performance Monitoring with Free Tools"

    1. 1. Performance Management with Free and Bundled Tools <ul><ul><li>Adrian Cockcroft </li></ul></ul><ul><ul><li>Netflix Inc. </li></ul></ul><ul><ul><li>[email_address] </li></ul></ul><ul><ul><li>(Co-authored with Mario Jauvin </li></ul></ul><ul><ul><li>MFJ Associates </li></ul></ul><ul><ul><li>mario@mfjassociates.net) </li></ul></ul><ul><ul><li>2 June 2010 </li></ul></ul>
    2. 2. Agenda <ul><li>Overview of Capacity Planning Requirements and Data Sources </li></ul><ul><li>Performance Data Collection </li></ul><ul><li>Free Network Monitoring Tools </li></ul><ul><li>Free System Monitoring Tools </li></ul><ul><li>Free Load Generation and Modelling Tools </li></ul><ul><li>Licences and References </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    3. 3. What are we talking about? <ul><li>Network monitoring with WireShark, MRTG, BigSister, Cacti, Nagios, OpenNMS, Zenoss, Openxtra, ntop </li></ul><ul><li>Database Tier monitoring with SEtoolkit, Orca, XEtoolkit </li></ul><ul><li>Application Tier monitoring with Orca, Cacti, BigSister, Ganglia, XEtoolkit </li></ul><ul><li>QA Load generation with Grinder or SLAMD, modelling with PDQ and R </li></ul><ul><li>FREE!! </li></ul>
    4. 4. Capacity Planning Requirements and Data Sources June 2, 2010 Adrian Cockcroft and Mario Jauvin
    5. 5. Definitions <ul><li>Capacity </li></ul><ul><ul><li>Resource utilization and headroom </li></ul></ul><ul><li>Planning </li></ul><ul><ul><li>Predicting future needs by analyzing historical data and modeling future scenarios </li></ul></ul><ul><li>Performance Monitoring </li></ul><ul><ul><li>Collecting and reporting on performance data </li></ul></ul><ul><li>Free Tools </li></ul><ul><ul><li>Bundled with the OS or available for no $$$ </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    6. 6. Capacity Planning Requirements <ul><li>We care about CPU, Memory, Network and Disk resources, and Application response times </li></ul><ul><li>We need to know how much of each resource we are using now, and will use in the future </li></ul><ul><li>We need to know how much headroom we have to handle higher loads </li></ul><ul><li>We want to understand how headroom varies, and how it relates to application response times and throughput </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    7. 7. CPU Capacity Measurements <ul><li>CPU Capacity is defined by CPU type and clock rate, or a benchmark rating like SPECrateInt2000 </li></ul><ul><li>CPU utilization is defined as busy time divided by elapsed time for each CPU </li></ul><ul><li>CPU load average measures the average number of jobs running and ready to run </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
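The utilization definition on this slide can be sketched in a few lines. This is only an illustration, assuming two samples of cumulative busy and idle time for one CPU (for example, values a collector might derive from the kernel's per-CPU counters); `CpuSample` and `utilization` are hypothetical names, not part of any tool in this talk.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """Cumulative CPU times for one CPU, in any consistent unit (hypothetical type)."""
    busy: float  # user + system time accumulated so far
    idle: float  # idle time accumulated so far


def utilization(prev: CpuSample, curr: CpuSample) -> float:
    """Busy time divided by elapsed time between two samples of the same CPU."""
    busy = curr.busy - prev.busy
    elapsed = busy + (curr.idle - prev.idle)
    if elapsed <= 0:
        raise ValueError("samples must be taken at different times")
    return busy / elapsed
```

Working from differences between cumulative samples keeps the result independent of how long the system has been up.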
    8. 8. CPU Measurement Issues <ul><li>Biased sample CPU measurements </li></ul><ul><li>Microstate measurements are accurate, but are platform and tool specific </li></ul><ul><li>Hyperthreading non-linearities </li></ul><ul><li>Platform specific details, e.g. are interrupts included in system time? </li></ul><ul><ul><li>http://perfcap.blogspot.com/2005/10/how-busy-is-your-cpu-really.html </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    9. 9. Memory Capacity Measurements <ul><li>Physical Memory Capacity Utilization and Limits </li></ul><ul><ul><li>Kernel memory </li></ul></ul><ul><ul><li>Shared Memory segment </li></ul></ul><ul><ul><li>Executable code, stack and heap </li></ul></ul><ul><ul><li>File system cache usage </li></ul></ul><ul><ul><li>Unused free memory </li></ul></ul><ul><li>Virtual Memory Capacity - Swap Space </li></ul><ul><li>Memory Throughput </li></ul><ul><ul><li>Page in and page out rates </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    10. 10. Network Capacity Measurements <ul><li>Network Interface Throughput </li></ul><ul><ul><li>Byte and packet rates input and output </li></ul></ul><ul><li>TCP Protocol Specific Throughput </li></ul><ul><ul><li>TCP connection count and connection rates </li></ul></ul><ul><ul><li>TCP byte rates input and output </li></ul></ul><ul><li>NFS/SMB Protocol Specific Throughput </li></ul><ul><ul><li>Byte rates read and write </li></ul></ul><ul><ul><li>NFS/SMB service response times </li></ul></ul><ul><li>HTTP Protocol Specific Throughput </li></ul><ul><ul><li>HTTP operation rates </li></ul></ul><ul><ul><li>Get and put payload byte rates and size distribution </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
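Byte and packet rates are normally derived from cumulative interface counters. The sketch below assumes 32-bit counters that wrap to zero (as the MIB-II interface octet counters do) and at most one wrap between samples; `counter_rate` is a hypothetical helper name.

```python
def counter_rate(prev: int, curr: int, seconds: float, bits: int = 32) -> float:
    """Per-second rate from two readings of a cumulative counter.

    Assumes the counter wrapped at most once between the two readings.
    """
    if seconds <= 0:
        raise ValueError("interval must be positive")
    delta = curr - prev
    if delta < 0:
        # The counter wrapped past 2**bits and restarted from zero.
        delta += 1 << bits
    return delta / seconds
```

On fast interfaces a 32-bit byte counter can wrap more than once within a long sampling interval, in which case this single-wrap assumption undercounts; shorter intervals or wider counters avoid that.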
    11. 11. Disk Capacity Measurements <ul><li>Detailed metrics vary by platform </li></ul><ul><li>Easy for the simple disk cases </li></ul><ul><li>Hard for cached RAID subsystems </li></ul><ul><li>Almost impossible for shared disk subsystems and SANs </li></ul><ul><ul><li>Another system or volume can be sharing a backend spindle; when it gets busy, your own volume can saturate even though you did not change your own workload </li></ul></ul>
    12. 12. Capacity Planning Challenges <ul><li>Constantly changing infrastructure </li></ul><ul><li>Limited attention span from staff </li></ul><ul><li>Horizontally scaled commodity systems </li></ul><ul><li>Per node software licencing costs too much </li></ul><ul><li>Too many tools, too many agents per node </li></ul><ul><li>Too much data, not enough analysis </li></ul><ul><li>Non-linear and non-intuitive scalability </li></ul><ul><li>Lack of tools and metrics for virtualized resources </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    13. 13. Observability <ul><li>Four different viewpoints </li></ul><ul><ul><li>Management </li></ul></ul><ul><ul><li>Engineering </li></ul></ul><ul><ul><li>QA Testing </li></ul></ul><ul><ul><li>Operations </li></ul></ul><ul><li>Each needs very different information </li></ul><ul><li>Ideal would be different views of the same performance database </li></ul><ul><li>Reality is a mess of disjoint tools </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    14. 14. Management Viewpoint <ul><li>Daily summary of status and problems </li></ul><ul><li>Business oriented metrics </li></ul><ul><li>Future scenario planning </li></ul><ul><li>Marketing and management input </li></ul><ul><li>Concise report with dashboard style status indicators </li></ul><ul><li>Free tools: R, Spreadsheet and Web based displays, no good summarization tools </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    15. 15. Engineering Viewpoint <ul><li>Large volumes of detailed data at several different time scales </li></ul><ul><li>Input to tuning, reconfiguring and future product development </li></ul><ul><li>Low level problem diagnosis </li></ul><ul><li>Detailed reports with drill down and correlation analysis </li></ul><ul><li>Free tools: XE/SE Toolkit, Orca, Ganglia, Cacti, R </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    16. 16. QA Test Viewpoint <ul><li>Workload specification tools </li></ul><ul><li>Load generation frameworks </li></ul><ul><li>Testing for functionality and performance </li></ul><ul><li>Regression tools to compare releases </li></ul><ul><li>Modelling difference between test configuration and production configuration </li></ul><ul><li>Free Tools: The Grinder, SLAMD, R, PDQ </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    17. 17. Operations Viewpoint <ul><li>Immediate timeframe </li></ul><ul><li>Real time display, updated in seconds </li></ul><ul><li>Alert based monitoring </li></ul><ul><li>High level problem diagnosis </li></ul><ul><li>Simple high level graphs and views </li></ul><ul><li>Free tools: BigSister, Nagios, OpenNMS, MRTG, Cacti, Ganglia, WireShark, ntop </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    18. 18. Measurement Data Interfaces <ul><li>Several generic raw access methods </li></ul><ul><ul><li>Read the kernel directly (not a good idea) </li></ul></ul><ul><ul><li>Structured system data (Solaris kstat, Linux /proc) </li></ul></ul><ul><ul><li>Process data </li></ul></ul><ul><ul><li>Network data </li></ul></ul><ul><ul><li>Accounting data </li></ul></ul><ul><ul><li>Application data </li></ul></ul><ul><li>Command based data interfaces </li></ul><ul><ul><li>Scrape data from vmstat, iostat, netstat, sar, ps </li></ul></ul><ul><ul><li>Higher overhead, lower resolution, missing metrics </li></ul></ul><ul><li>Data available is platform specific either way </li></ul><ul><li>Much more detail on this topic in the Solaris/Linux Performance Measurement and Tuning Class </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    19. 19. Process based data <ul><li>Used by ps, top, proctool and debuggers, pea.se, Solaris proc tools </li></ul><ul><li>Solaris and Linux both have /proc/pid/metric hierarchy </li></ul><ul><li>Linux also includes system information in /proc rather than kstat </li></ul><ul><li>Advantages </li></ul><ul><ul><li>The recommended and supported process access API </li></ul></ul><ul><ul><li>Metric data structures reasonably stable over releases </li></ul></ul><ul><ul><li>Consistent data using locking </li></ul></ul><ul><ul><li>Solaris microstate data provides accurate process state timers </li></ul></ul><ul><li>Disadvantages </li></ul><ul><ul><li>High overhead for open/read/close for every process </li></ul></ul><ul><ul><li>Linux reports data as ascii text, Solaris as binary structures </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
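As an illustration of the ASCII text format on Linux, the sketch below parses one line in the layout of `/proc/<pid>/stat`. The field positions follow the Linux proc(5) manual page; the function name and the returned keys are hypothetical. The command name is wrapped in parentheses and may itself contain spaces, so the line is split around the last closing parenthesis rather than on whitespace alone.

```python
def parse_proc_stat(line: str) -> dict:
    """Extract a few fields from a Linux /proc/<pid>/stat line."""
    lpar = line.index("(")
    rpar = line.rindex(")")  # last ')' because the command name may contain ')'
    rest = line[rpar + 1:].split()
    # rest[0] is field 3 (state), so field N is rest[N - 3].
    return {
        "pid": int(line[:lpar]),
        "comm": line[lpar + 1:rpar],
        "state": rest[0],
        "ppid": int(rest[1]),
        "utime_ticks": int(rest[11]),  # field 14: user time in clock ticks
        "stime_ticks": int(rest[12]),  # field 15: system time in clock ticks
    }
```

Reading one such file per process on every sample is the open/read/close overhead the slide lists as a disadvantage.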
    20. 20. Kernel Trace - TNF, Dtrace, ktrace <ul><li>Solaris, Linux, Windows and other Unixes have similar features </li></ul><ul><ul><li>Solaris has TNF probes and prex command to control them </li></ul></ul><ul><ul><li>User level probe library for hires tracepoints allows instrumentation of multithreaded applications </li></ul></ul><ul><ul><li>Kernel level probes allow disk I/O and scheduler tracing </li></ul></ul><ul><li>Advantages </li></ul><ul><ul><li>Low overhead, microsecond resolution </li></ul></ul><ul><ul><li>I/O trace capability is extremely useful </li></ul></ul><ul><li>Disadvantages </li></ul><ul><ul><li>Too much data to process with simple tracing capabilities </li></ul></ul><ul><ul><li>Trace buffer can overflow or cause locking issues </li></ul></ul><ul><li>Solaris 10 Dtrace is a quite different beast! Much more flexible </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    21. 21. DTrace – Dynamic Tracing <ul><li>One of the most exciting new features in Solaris 10, rave reviews </li></ul><ul><li>New book: &quot;Solaris Performance and Tools&quot; by Richard McDougall </li></ul><ul><li>Advantages </li></ul><ul><ul><li>No overhead when it is not in use </li></ul></ul><ul><ul><li>Low overhead probes can be put anywhere/everywhere </li></ul></ul><ul><ul><li>Trace data is correlated and filtered at source, so you get exactly the data you want; very sophisticated data providers included </li></ul></ul><ul><ul><li>Bundled, supported, designed to be safe for production systems </li></ul></ul><ul><li>Disadvantages </li></ul><ul><ul><li>Solaris specific, but being ported to BSD/Linux </li></ul></ul><ul><ul><li>No high level tools support yet </li></ul></ul><ul><ul><li>Yet another scripting language to learn – somewhat similar to “awk” </li></ul></ul>
    22. 22. Accounting Records <ul><li>Standard Unix System V Accounting - acct </li></ul><ul><ul><li>Tiny, incomplete (no process id!) low resolution, no overhead! </li></ul></ul><ul><li>Solaris Extended System and Network Accounting - exacct </li></ul><ul><ul><li>Flexible, Overly complex, Detailed data, Interval support </li></ul></ul><ul><ul><li>No overhead! 100% capture ratio for infrequent samples! </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    23. 23. Extracct for Solaris <ul><li>I wrote the extracct tool to get extended accounting data out in a useful form </li></ul><ul><li>See http://perfcap.blogspot.com/search?q=accounting for details and get code from http://www.orcaware.com/orca/pub/extracct </li></ul><ul><li>Pre-compiled code for Solaris SPARC and x86, Solaris 8 to 10 </li></ul><ul><ul><li>Useful data is logged in regular columns for easy import </li></ul></ul><ul><ul><li>Includes low overhead network accounting config file for TCP flows </li></ul></ul><ul><ul><li>Interval accounting option to force all processes to cut records </li></ul></ul><ul><ul><li>Automatic log filename generation and clean switching </li></ul></ul><ul><ul><li>Designed to run directly as a cron job, useful today </li></ul></ul><ul><li>More work needed to interface output to SE toolkit and Orca </li></ul>
    24. 24. Example Extracct Output <ul><li># ./extracct </li></ul><ul><li>Usage: extracct [-vwr] [ file | -a dir ] </li></ul><ul><ul><li>-v: verbose </li></ul></ul><ul><ul><li>-w: wracct all processes first </li></ul></ul><ul><ul><li>-r: rotate logs </li></ul></ul><ul><ul><li>-a dir: use acctadm.conf to get input logs, and write output files to dir </li></ul></ul>
        <ul><li>The usual way to run the command will be from cron as shown </li></ul>
        <ul><li>0 * * * * /opt/exdump/extracct -war /var/tmp/exacct > /dev/null 2>&1 </li></ul>
        <ul><li>2 * * * * /bin/find /var/adm/exacct -ctime +7 -exec rm {} \; </li></ul>
        <ul><li>This also shows how to clean up old log files. I only delete the binary files in this example, and I created /var/tmp/exacct to hold the text files. The process data in the text file looks like this: </li></ul>
        <ul><li>timestamp locltime duration procid ppid uid usr sys majf rwKB vcxK icxK sigK sycK arMB mrMB command </li></ul>
        <ul><li>1114734370 17:26:10 0.0027 16527 16526 0 0.000 0.002 0 0.53 0.00 0.00 0.00 0.1 0.7 28.9 acctadm </li></ul>
        <ul><li>1114734370 17:26:10 0.0045 16526 16525 0 0.000 0.001 0 0.00 0.00 0.00 0.00 0.1 1.1 28.9 sh </li></ul>
        <ul><li>1114734370 17:26:10 0.0114 16525 8020 0 0.001 0.005 0 1.71 0.00 0.00 0.00 0.3 1.0 28.9 exdump </li></ul>
        <ul><li>1109786959 10:09:19 -1.0000 1 0 0 4.311 3.066 96 47504.69 49.85 0.18 0.34 456.2 0.9 1.0 init </li></ul>
        <ul><li>1109786959 10:09:19 -1.0000 2 0 0 0.000 0.000 0 0.00 0.00 0.00 0.00 0.0 0.0 0.0 pageout </li></ul>
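Because the text output uses regular columns, importing it is straightforward. A minimal sketch follows, assuming whitespace-separated records in the column order of the header shown on this slide; `parse_record` is a hypothetical helper and only a few columns are converted to numbers.

```python
COLUMNS = (
    "timestamp locltime duration procid ppid uid usr sys majf "
    "rwKB vcxK icxK sigK sycK arMB mrMB command"
).split()


def parse_record(line: str) -> dict:
    """Split one extracct text record into a dict keyed by column name."""
    # Limit the split so a command containing spaces stays in the last field.
    fields = line.split(None, len(COLUMNS) - 1)
    if len(fields) != len(COLUMNS):
        raise ValueError("unexpected number of columns")
    record = dict(zip(COLUMNS, fields))
    for name in ("timestamp", "procid", "ppid", "uid"):
        record[name] = int(record[name])
    for name in ("duration", "usr", "sys"):
        record[name] = float(record[name])
    return record
```

Records parsed this way can be loaded into a spreadsheet or R for the kind of analysis discussed earlier in the talk.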
    25. 25. Configuration information <ul><li>Configuration data comes from too many sources! </li></ul><ul><ul><li>Solaris device tree displayed by prtconf and prtdiag </li></ul></ul><ul><ul><li>Linux configuration data can be found in /proc </li></ul></ul><ul><ul><li>Solaris 8 adds dynamic configuration notification device picld </li></ul></ul><ul><ul><li>SunVTS component test system has vtsprobe to get config </li></ul></ul><ul><ul><li>SCSI device info using iostat -E in Solaris </li></ul></ul><ul><ul><li>Logical volume data from vxprint and metastat </li></ul></ul><ul><ul><li>HW RAID info from device specific tools </li></ul></ul><ul><ul><li>Critical storage config info must be accessed over ethernet… </li></ul></ul><ul><li>It is very hard to combine all this data </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    26. 26. Free Network Monitoring Tools June 2, 2010 Adrian Cockcroft and Mario Jauvin
    27. 27. SNMP <ul><li>Simple Network Management Protocol </li></ul><ul><li>UDP-based protocol using port 161 </li></ul><ul><li>Client/server like </li></ul><ul><ul><li>Client is called a management application entity </li></ul></ul><ul><ul><li>Server is called an agent entity </li></ul></ul><ul><li>Agent entity is designed to be implemented on network hardware: routers, switches, etc. </li></ul>
    28. 28. SNMP – MIBs <ul><li>Management Information Base </li></ul><ul><li>Defines the structure and semantics of the information that can be reported on </li></ul><ul><li>Most commonly used is MIB-II, which defines a set of standard networking attributes </li></ul><ul><ul><li>Interface tables </li></ul></ul><ul><ul><li>System level information </li></ul></ul><ul><ul><li>Routing tables </li></ul></ul><ul><li>Specified using ASN.1 (Abstract Syntax Notation One) </li></ul>
    29. 29. SNMP – commands <ul><li>Called PDU (protocol data units) </li></ul><ul><li>GET </li></ul><ul><li>GETNEXT </li></ul><ul><li>GETBULK </li></ul><ul><li>SET </li></ul><ul><li>Encoded using BER (basic encoding rules) </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
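To make the BER encoding concrete, here is a small sketch that encodes an object identifier the way it appears inside a PDU: tag byte 0x06, a length byte, then the arcs in base 128 with the first two arcs combined. The function names are hypothetical, and only short-form lengths (bodies under 128 bytes) are handled.

```python
def _base128(value: int) -> bytes:
    """Encode a non-negative integer in base 128, high bit set on all but the last byte."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def encode_oid(dotted: str) -> bytes:
    """BER-encode a dotted OID string such as '1.3.6.1.2.1.1.1.0'."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2 or arcs[0] > 2 or (arcs[0] < 2 and arcs[1] > 39):
        raise ValueError("invalid OID")
    # The first two arcs share one encoded value: 40 * first + second.
    body = _base128(40 * arcs[0] + arcs[1])
    for arc in arcs[2:]:
        body += _base128(arc)
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(body)]) + body
```

For example, `encode_oid("1.3.6.1.2.1.1.1.0").hex()` gives `06082b06010201010100`, the encoded form of the MIB-II system description object.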
    30. 30. Versions <ul><li>Version 1, original version done in May 1991 </li></ul><ul><li>Version 2, around 1993. Failed because the IETF credo of “rough consensus and running code” could not be met on securing SNMP </li></ul><ul><li>Turned into V2c for community string security (like V1) </li></ul><ul><li>Version 3, added security and complexity in 1998 </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    31. 31. SNMP tools <ul><li>Too numerous to name all but… </li></ul><ul><li>OpenNMS </li></ul><ul><li>Nagios </li></ul><ul><li>Cacti </li></ul><ul><li>MRTG </li></ul><ul><li>Net-snmp </li></ul><ul><ul><li>See www.snmplink.org </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    32. 32. SNMP tools <ul><li>snmpwalk – will report all data in a specified MIB </li></ul><ul><li>getIf – will report data about interfaces and includes a built-in MIB browser </li></ul><ul><li>snmptable – will report tabular data from MIB tables </li></ul>
    33. 33. OpenNMS <ul><li>Well…. it’s not that portable </li></ul><ul><ul><li>95% java is not 100% java </li></ul></ul><ul><ul><li>Requires about 20-30 different platform specific packages (PostgreSQL, Perl, RRD tool, Tomcat 4 etc…) </li></ul></ul><ul><ul><li>Difficult to install </li></ul></ul><ul><ul><li>Easy auto discovery </li></ul></ul><ul><ul><li>Web-based interface </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    34. 34. OpenNMS <ul><li>Main screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    35. 35. OpenNMS <ul><li>Node screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    36. 36. Nagios <ul><li>Easy to build/compile (on Solaris 10) </li></ul><ul><li>Easy to install </li></ul><ul><li>Quick response from CGI </li></ul><ul><li>Configuration is manual and a pain </li></ul><ul><ul><li>13 configuration files with all kinds of interrelated entries </li></ul></ul><ul><ul><li>Tedious and error prone </li></ul></ul><ul><li>Requires plugins to do anything </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    37. 37. Nagios <ul><li>Main screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    38. 38. Nagios <ul><li>Host detail screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    39. 39. June 2, 2010 Adrian Cockcroft and Mario Jauvin
    40. 40. ntop <ul><li>Similar to the familiar UNIX top tool for processes, but for network traffic </li></ul><ul><li>Provides a huge selection of real-time data </li></ul><ul><li>Can be found at http://www.openxtra.co.uk/ </li></ul>
    41. 41. ntop – Active Sessions June 2, 2010 Adrian Cockcroft and Mario Jauvin
    42. 42. ntop Hosts June 2, 2010 Adrian Cockcroft and Mario Jauvin
    43. 43. ntop Network Load June 2, 2010 Adrian Cockcroft and Mario Jauvin
    44. 44. ntop_Network_Thruput June 2, 2010 Adrian Cockcroft and Mario Jauvin
    45. 45. ntop Port Dist June 2, 2010 Adrian Cockcroft and Mario Jauvin
    46. 46. ntop_Protocol_Dist June 2, 2010 Adrian Cockcroft and Mario Jauvin
    47. 47. ntop Protocols June 2, 2010 Adrian Cockcroft and Mario Jauvin
    48. 48. Zenoss <ul><li>Open source monitoring and management of IT infrastructure </li></ul><ul><li>Zenoss core is free </li></ul><ul><li>Other editions are for a fee </li></ul><ul><li>Get it from http://www.zenoss.com/download/ </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    49. 49. zenoss Architecture June 2, 2010 Adrian Cockcroft and Mario Jauvin
    50. 50. zenoss Dash Config June 2, 2010 Adrian Cockcroft and Mario Jauvin
    51. 51. zenoss Google June 2, 2010 Adrian Cockcroft and Mario Jauvin
    52. 52. zenoss Google Alerts June 2, 2010 Adrian Cockcroft and Mario Jauvin
    53. 53. Zenoss Graphs June 2, 2010 Adrian Cockcroft and Mario Jauvin
    54. 54. zenoss Topology June 2, 2010 Adrian Cockcroft and Mario Jauvin
    55. 55. MRTG <ul><li>Really simple to install and configure </li></ul><ul><li>Requires manual config file creation </li></ul><ul><li>Only for MIB-II interface plotting out of the box </li></ul><ul><li>Graphing is not flexible (axes, time ranges, etc.) </li></ul>
    56. 56. MRTG <ul><li>Interface screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    57. 57. MRTG <ul><li>Other CPU screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    58. 58. RRD tool <ul><li>Software to store, retrieve and graph numerical time series data </li></ul><ul><li>Uses a round robin algorithm </li></ul><ul><li>Data files are a fixed size </li></ul><ul><ul><li>Don’t grow </li></ul></ul><ul><ul><li>Don’t require maintenance </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
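The fixed-size, round robin behaviour described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not RRDtool's file format or API; the class name `RoundRobinArchive` and its methods are hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Hypothetical fixed-size archive: once full, each new value
    overwrites the oldest one, so storage never grows."""

    def __init__(self, rows):
        # deque with maxlen silently discards the oldest entry when full
        self.values = deque(maxlen=rows)

    def update(self, value):
        self.values.append(value)

    def fetch(self):
        return list(self.values)


archive = RoundRobinArchive(rows=3)
for sample in [10, 20, 30, 40, 50]:
    archive.update(sample)

# Only the three most recent samples are retained.
print(archive.fetch())
```

Because the number of rows is chosen up front, the archive needs no pruning or maintenance, which is the property the slide highlights.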
    59. 59. RRD tool <ul><li>Compiles on most platforms </li></ul><ul><li>Used by many SNMP based tools </li></ul><ul><ul><li>OpenNMS </li></ul></ul><ul><ul><li>Cacti </li></ul></ul><ul><ul><li>BigSister </li></ul></ul><ul><ul><li>WeatherMap4RRD </li></ul></ul><ul><ul><li>MailGraph </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    60. 60. RRD tool <ul><li>14all CGI script that plots data similar to MRTG </li></ul><ul><li>Configurable to collect data at different intervals (unlike MRTG) </li></ul><ul><li>Flexible and variable in what data can be collected </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    61. 61. RRD tool <ul><li>Sample screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    62. 62. RRD tool <ul><li>Screen shot </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    63. 63. RRD tool <ul><li>Create an RRD database </li></ul><ul><li>rrdtool create test.rrd </li></ul><ul><li>--start 920804400 </li></ul><ul><li>DS:speed:COUNTER:600:U:U </li></ul><ul><li>RRA:AVERAGE:0.5:1:24 </li></ul><ul><li>RRA:AVERAGE:0.5:6:10 </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
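To make the two RRA definitions on this slide concrete, the following Python sketch works out how much history each archive holds. It assumes the command uses rrdtool's default step of 300 seconds, since no `--step` is given; the helper name `rra_span_seconds` is hypothetical.

```python
DEFAULT_STEP = 300  # seconds; assumed, because the slide's command gives no --step


def rra_span_seconds(rra, step=DEFAULT_STEP):
    """Return the time span covered by an RRA of the form
    RRA:<CF>:<xff>:<steps per row>:<rows>."""
    _, _cf, _xff, steps, rows = rra.split(":")
    return int(steps) * int(rows) * step


for rra in ["RRA:AVERAGE:0.5:1:24", "RRA:AVERAGE:0.5:6:10"]:
    hours = rra_span_seconds(rra) / 3600
    print(f"{rra} covers {hours:g} hours")
```

Under that assumption the first archive keeps 24 five-minute averages (2 hours) and the second keeps 10 thirty-minute averages (5 hours).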
    64. 64. RRD tool <ul><li>Create a graph </li></ul><ul><li>rrdtool graph speed.png </li></ul><ul><li>--start 920804400 --end 920808000 </li></ul><ul><li>DEF:myspeed=test.rrd:speed:AVERAGE </li></ul><ul><li>LINE2:myspeed#FF0000 </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
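The `speed` data source plotted here is declared as a COUNTER, which means the stored value is a rate of change rather than the raw counter. A minimal Python sketch of that conversion follows; the sample readings are made up for illustration, and the sketch ignores the counter-wrap handling and step-boundary interpolation that rrdtool itself performs.

```python
def counter_rates(samples):
    """Convert (timestamp, counter) readings into (timestamp, rate) pairs,
    where rate is the counter increase per second since the previous reading."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((t1, (c1 - c0) / (t1 - t0)))
    return rates


# Hypothetical readings taken 300 seconds apart.
samples = [(920804700, 12345), (920805000, 12357), (920805300, 12363)]
for timestamp, rate in counter_rates(samples):
    print(timestamp, round(rate, 4))
```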
    65. 65. Free Performance Data Collection and Rules Toolkits June 2, 2010 Adrian Cockcroft and Mario Jauvin
    66. 66. SE toolkit Example Tools <ul><li>A free performance toolkit for rapidly creating custom data sources </li></ul><ul><li>Makes all the very extensive Solaris metrics easily available </li></ul><ul><li>Very system specific and not enough metrics exist to port to Linux </li></ul><ul><li>Written by Rich Pettit with contributions from Adrian Cockcroft </li></ul><ul><li>Get SE3.4 from http://sourceforge.net/projects/setoolkit/ </li></ul><ul><li>Open source with support for SPARC & x86 Solaris 8, 9, 10 </li></ul><ul><li>Function: Example SE Programs </li></ul><ul><li>Rule Monitors: cpg.se monlog.se mon_cm.se live_test.se percollator.se zoom.se virtual_adrian.se virtual_adrian_lite.se </li></ul><ul><li>Disk Monitors: siostat.se xio.se xiostat.se iomonitor.se iost.se xit.se disks.se </li></ul><ul><li>CPU Monitors: cpu_meter.se vmmonitor.se mpvmstat.se </li></ul><ul><li>Process Monitors: msacct.se pea.se ps-ax.se ps-p.se pwatch.se pw.se </li></ul><ul><li>Network Monitors: net.se tcp_monitor.se netmonitor.se netstatx.se nfsmonitor.se nx.se </li></ul><ul><li>Clones: iostat.se uname.se vmstat.se nfsstat-m.se perfmeter.se xload.se </li></ul><ul><li>Data browsers: aw.se infotool.se multi_meter.se </li></ul><ul><li>Contributed Code: anasa dfstats kview systune watch orcallator.se </li></ul><ul><li>Test Programs: syslog.se cpus.se pure_test.se collisions.se uptime.se dumpkstats.se net_example nproc.se kvmname.se </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    67. 67. SE language features <ul><li>SE is a 64-bit interpreted dialect of C </li></ul><ul><ul><li>Not a new language to learn from scratch! </li></ul></ul><ul><ul><li>Standard C /usr/ccs/bin/cpp used at runtime to preprocess SE scripts </li></ul></ul><ul><ul><li>Main omissions - pointer types and goto </li></ul></ul><ul><ul><li>Main additions - classes and “string” type </li></ul></ul><ul><ul><li>Powerful ways to handle dynamically allocated data </li></ul></ul><ul><ul><li>Built-in fast balanced tree routines for storing key indexed data </li></ul></ul><ul><li>Dynamic linking to all existing C libraries </li></ul><ul><ul><li>Built-in classes access kernel data </li></ul></ul><ul><ul><li>Supplied class code hides details, provides the data you want </li></ul></ul><ul><li>Example scripts improve on basic utilities e.g. siostat.se, nx.se, pea.se </li></ul><ul><li>Example rule based monitors e.g. virtual_adrian.se, orcallator.se </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    68. 68. Creating Rules <ul><li>Based on real experiences of all the things that go wrong </li></ul><ul><li>Capture an approximation to intuition </li></ul><ul><li>Test and calibrate rules on as many systems as possible </li></ul><ul><li>Easy?? </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    69. 69. Configuring Rules <ul><li>Thresholds should be configured </li></ul><ul><li>Very application dependent </li></ul><ul><li>Capture the operating envelope </li></ul><ul><ul><li>Measure the underlying values </li></ul></ul><ul><ul><li>Measure peaks in normal operation </li></ul></ul><ul><ul><li>Note values during problems </li></ul></ul><ul><ul><li>Set thresholds to capture the difference </li></ul></ul><ul><li>This applies to any tool </li></ul><ul><ul><li>SE Toolkit, Cacti, Ganglia, Nagios, OpenNMS etc. </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
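The "capture the operating envelope" steps above can be sketched as a small Python helper. This is one possible reading of the slide, not how any of the named tools sets thresholds: it places the threshold midway between the highest value seen in normal operation and the lowest value seen during problems. The function name and the sample numbers are hypothetical.

```python
def suggest_threshold(normal_samples, problem_samples):
    """Pick a threshold between the normal peak and the lowest problem value.
    Returns None when the two ranges overlap, because this metric alone
    cannot then separate normal operation from problems."""
    normal_peak = max(normal_samples)
    problem_floor = min(problem_samples)
    if problem_floor <= normal_peak:
        return None
    return (normal_peak + problem_floor) / 2


# Hypothetical CPU utilization percentages.
normal = [40, 55, 62]    # measured during normal operation
problems = [90, 95]      # noted while users were reporting slowness
print(suggest_threshold(normal, problems))
```

A `None` result is a useful signal in its own right: it suggests measuring a different underlying value for that application.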
    70. 70. Rules as Objects <ul><li>Define only the input and output information </li></ul><ul><li>Hide implementation details </li></ul><ul><li>Make high level rule objects trivial to use and reuse </li></ul><ul><li>SE Toolkit does it in three lines of code: </li></ul><ul><ul><li>#include <rules file> </li></ul></ul><ul><ul><li>Declare rule object as a typed variable </li></ul></ul><ul><ul><li>Read and use or print object status </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
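The same pattern (include, declare, read) can be illustrated in Python. This is not SE Toolkit code: the class `ThresholdRule`, its state names and its thresholds are hypothetical stand-ins that show how a rule object can expose only inputs and a status while hiding how the status is derived.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """Hypothetical rule object: callers supply a value and read a state."""
    name: str
    amber: float
    red: float
    state: str = "white"  # illustrative state meaning "no data yet"

    def update(self, value):
        # Implementation detail hidden from the caller.
        if value >= self.red:
            self.state = "red"
        elif value >= self.amber:
            self.state = "amber"
        else:
            self.state = "green"
        return self.state


# Usage mirrors the three steps on the slide: import, declare, read.
cpu_rule = ThresholdRule(name="cpu_busy_percent", amber=70, red=90)
cpu_rule.update(82)
print(cpu_rule.name, cpu_rule.state)
```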
    71. 71. &quot;virtual adrian&quot; rules summary <ul><li>Disk Rule for all disks at once </li></ul><ul><ul><li>Looks for slow disks and unbalanced usage </li></ul></ul><ul><li>Network Rule for all networks at once </li></ul><ul><ul><li>Looks for slow nets and unbalanced usage </li></ul></ul><ul><li>Swap Rule - Looks for lack of available swap space </li></ul><ul><li>RAM Rule - Looks for short page residence times </li></ul><ul><li>CPU Power Rule </li></ul><ul><ul><li>Scales on MP systems </li></ul></ul><ul><ul><li>Looks for long run queue delays </li></ul></ul><ul><li>Mutex Rule - Looks for kernel lock contention and high sys CPU time </li></ul><ul><li>TCP Rule </li></ul><ul><ul><li>Looks for listen queue problems </li></ul></ul><ul><ul><li>Reports on connection attempt failures </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    72. 72. XE Toolkit - www.xetoolkit.com <ul><li>Complete re-write of SE Toolkit by Rich Pettit </li></ul><ul><ul><li>Extensible Java collector, customize with jar files </li></ul></ul><ul><ul><li>Release 1.2 available April 2008 </li></ul></ul><ul><ul><li>Multi-platform support Solaris, Linux/x86, Windows, BSD, OSX, HP-UX, AIX, Linux/s390, Linux/Power </li></ul></ul><ul><li>Licensing </li></ul><ul><ul><li>Free GPL version for standard use and shared derivations </li></ul></ul><ul><ul><li>Open source, hosted at http://sourceforge.net/projects/xe-toolkit/ </li></ul></ul><ul><ul><li>Commercial support available if needed </li></ul></ul><ul><ul><li>Commercial product license for custom in-house derivations </li></ul></ul><ul><li>Addresses all the issues people had with SE toolkit! </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    73. 73. Captive Metrics / XE Toolkit Architecture June 2, 2010 Adrian Cockcroft and Mario Jauvin
    74. 74. Free System Monitoring Tools June 2, 2010 Adrian Cockcroft and Mario Jauvin
    75. 75. Collated Performance Data - Orca <ul><li>Problems with time sync when collecting data from multiple tools </li></ul><ul><ul><li>No timestamp at all for vmstat, netstat, df... </li></ul></ul><ul><ul><li>No timestamp by default for iostat and ps... </li></ul></ul><ul><ul><li>No way to collect realtime stats from an http logfile </li></ul></ul><ul><li>Use SE Toolkit to generate one timestamped row containing all the data </li></ul><ul><ul><li>First version of percollator.se written by Adrian Cockcroft in 1996 </li></ul></ul><ul><ul><li>Extended orcallator.se written by Blair Zajac a few years later </li></ul></ul><ul><ul><li>Graphs generated by orca batch job feeding rrdtool based web pages </li></ul></ul><ul><ul><li>Active community developing tool at http://www.orcaware.com </li></ul></ul><ul><ul><li>Extended to collect much more data, including process workloads </li></ul></ul><ul><ul><li>Basic data collection ported to Linux, HP-UX and Windows </li></ul></ul><ul><li>Orca is basically MRTG for System metrics rather than Network </li></ul><ul><li>See http://www.orcaware.com/orca/docs/Orca_Understanding_Performance_Data.ppt </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
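The idea of one timestamped row containing all the data can be sketched in Python as follows. This is not the percollator.se or orcallator.se implementation; the collector names and the values they return are hypothetical placeholders for real metric sources.

```python
import time


def collate_row(collectors, now=None):
    """Read every collector once and stamp the whole row with a single time,
    so values from different sources line up when graphed later."""
    timestamp = int(time.time() if now is None else now)
    row = {"timestamp": timestamp}
    for name, read in collectors.items():
        row[name] = read()
    return row


def format_row(row):
    """Render the row as one space-separated log line."""
    return " ".join(str(value) for value in row.values())


# Hypothetical collectors; real ones would read kernel or application counters.
collectors = {
    "load_1min": lambda: 0.42,
    "tcp_connections": lambda: 12,
}
print(format_row(collate_row(collectors, now=1000)))
```

Sampling everything in one pass avoids the time-sync problem listed first on the slide, where vmstat, iostat and logfile data each carry different or missing timestamps.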
    76. 76. Orca data collections <ul><li>Collected using “procallator” reading info from /proc on Linux </li></ul><ul><li>[Uptime] [Average # Processes in Run Queue (Load Average)] [CPU Usage] [New Process Spawn Rate] [Number of System & Running Processes] [Context Switches & Interrupts Rate] [Interface Input Bits Per Second] [Interface Output Bits Per Second] [Interface Input Packets Per Second] [Interface Output Packets Per Second] [Interface Input Errors Per Second] [Interface Output Errors Per Second] [Interface Input Dropped Per Second] [Interface Output Dropped Per Second] [Interface Output Collisions] [Interface Output Carrier Losses] [TCP Current Connections] [IP Statistics] [TCP Statistics] [ICMP Statistics] [UDP Statistics] [Disk System Wide Reads/Writes Per Second] [Disk System Wide Transfer Rate] [Disk Reads/Writes Per Second] [Disk Transfer Rate] [Disk Space Percent Usage] [Physical Memory Usage] [Swap Usage] [Page Ins & Outs Rate] [Swap Ins & Outs Rate] </li></ul><ul><li>Orca on Solaris collects many more metrics than shown above </li></ul><ul><li>Strength of Orca is lots of detailed metrics with low overhead for collection </li></ul><ul><li>Easily customized to add more system metrics or application metrics </li></ul><ul><li>Orca can already track HTTP traffic and parse log files </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
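As a small illustration of collecting a metric by reading /proc on Linux, the Python sketch below parses text in the format of `/proc/loadavg`. It is not Orca's collector; the function name and the sample line are made up, and the parser works on a string so it can be tried on any platform.

```python
def parse_loadavg(text):
    """Parse a /proc/loadavg style line:
    <1min> <5min> <15min> <running>/<total> <last pid>"""
    one, five, fifteen, procs, last_pid = text.split()
    running, total = procs.split("/")
    return {
        "load_1min": float(one),
        "load_5min": float(five),
        "load_15min": float(fifteen),
        "procs_running": int(running),
        "procs_total": int(total),
        "last_pid": int(last_pid),
    }


sample = "0.20 0.18 0.12 1/80 11206\n"  # hypothetical contents of /proc/loadavg
print(parse_loadavg(sample))

# On a Linux host the same function could be fed the live file:
# with open("/proc/loadavg") as f:
#     print(parse_loadavg(f.read()))
```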
    77. 77. June 2, 2010 Adrian Cockcroft and Mario Jauvin All metrics are stored in “round robin database” format using RRDtool to generate displays over different time spans. Web page is a simple collection of plots with drill down by metric or by time. Suitable for monitoring a relatively small number of systems in great detail, e.g. backend database servers.
    78. 78. Cacti – www.cacti.net <ul><li>Web based user interface based on RRDtool </li></ul><ul><li>More sophisticated GUI than Orca or MRTG </li></ul><ul><li>Less sophisticated system metric collection, but more coverage of networking </li></ul><ul><li>Better management of groups of systems and devices than Orca, useful for tens to hundreds of nodes </li></ul><ul><li>Access control and personalization for users </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    81. 81. Ganglia – www.ganglia.info <ul><li>Web based RRDtool GUI somewhat similar to Cacti </li></ul><ul><li>Better management of clusters of systems and devices than Cacti, useful for hundreds to thousands of nodes in a hierarchy of clusters </li></ul><ul><li>Provides many summary statistic plots at cluster level and collects detailed configuration data </li></ul><ul><li>XML based data representation </li></ul><ul><li>Uses low overhead network protocol </li></ul><ul><li>In common use at hundreds of large HPC Grid sites, less visibly in use at some large commercial sites </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    85. 85. BigBrother and BigSister <ul><li>Network and system dashboard alert monitor </li></ul><ul><li>Widely used at internet sites </li></ul><ul><li>BigBrother is at http://www.bb4.com </li></ul><ul><li>BigSister is at http://bigsister.graeff.com </li></ul><ul><li>BigSister seems to have more features, alert logging, better portability and more efficient data collection. Compatible update to BB4. </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    89. 89. Free QA Test and Modelling Tools June 2, 2010 Adrian Cockcroft and Mario Jauvin
    90. 90. QA Test Requirements <ul><li>Generate test workload </li></ul><ul><ul><li>SLAMD, Grinder </li></ul></ul><ul><li>Collect performance metrics </li></ul><ul><ul><li>Any of the tools already mentioned </li></ul></ul><ul><li>Report regression against baseline </li></ul><ul><li>Predict capacity needed for production system </li></ul><ul><ul><li>Use spreadsheets for simple linear prediction </li></ul></ul><ul><ul><li>Use modelling tools such as PDQ for queuing models </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
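The "simple linear prediction" normally done in a spreadsheet can be sketched in Python as a least-squares line through measured load and utilization points, extrapolated to a target utilization. The function name and the QA measurements below are hypothetical, and the sketch assumes utilization really is linear in load over the range of interest.

```python
def load_at_target_utilization(loads, utils, target_util):
    """Fit utilization = intercept + slope * load by least squares,
    then solve for the load that would reach target_util."""
    n = len(loads)
    mean_x = sum(loads) / n
    mean_y = sum(utils) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(loads, utils))
    sxx = sum((x - mean_x) ** 2 for x in loads)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (target_util - intercept) / slope


# Hypothetical QA results: requests per second vs CPU utilization percent.
loads = [100, 200, 300]
utils = [12, 22, 32]
print(round(load_at_target_utilization(loads, utils, target_util=70)))
```

Linear extrapolation says nothing about response time near saturation, which is where the queueing models mentioned on the slide come in.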
    91. 91. Grinder 3 - Powerful New Features <ul><li>100% Pure Java - works on any hardware platform and any operating system that supports J2SE 1.3 and above. </li></ul><ul><li>Java and Jython based load testing framework </li></ul><ul><ul><li>Web Browsers: simulate web browsers using HTTP, and HTTPS. </li></ul></ul><ul><ul><li>Web Services: test interfaces using SOAP and XML-RPC. </li></ul></ul><ul><ul><li>Database: test databases using JDBC. </li></ul></ul><ul><ul><li>Middleware: RPC and MOM based systems using IIOP, RMI/IIOP, RMI/JRMP, and JMS. </li></ul></ul><ul><ul><li>Other Internet protocols: POP3, SMTP, FTP, and LDAP. </li></ul></ul><ul><li>See http://grinder.sourceforge.net/g3/features.html </li></ul><ul><li>J2EE Performance Testing with BEA WebLogic Server by Peter Zadrozny, Philip Aston and Ted Osborne, originally published by Expert Press and now by APress uses Grinder 2 throughout. </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    92. 92. SLAMD <ul><li>Load generation framework, written in Java </li></ul><ul><li>Originally built to test LDAP servers by Sun </li></ul><ul><li>Extended to be very generic and published as open source. Actively being developed. </li></ul><ul><li>Sophisticated functions and user interface </li></ul><ul><li>See http://www.slamd.com </li></ul><ul><li>Latest Release 2.0 has better usability focus </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    96. 96. PDQ Modelling Tool <ul><li>Dr Neil Gunther’s toolkit at http://www.perfdynamics.com </li></ul><ul><li>Library used from C or Perl provides MVA queueing models </li></ul><ul><li>Use to calibrate in QA and predict in production </li></ul><ul><li>PDQ modelling tool details: </li></ul><ul><ul><li>The Practical Performance Analyst - Dr. Neil Gunther - McGraw-Hill, 1998 ISBN 0-07-912946-3 </li></ul></ul><ul><ul><li>Analyzing Computer System Performance with Perl::PDQ - Dr. Neil Gunther - Springer, 2004 ISBN 3-540-20865-8 </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
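For readers unfamiliar with MVA (mean value analysis), the Python sketch below implements the textbook exact MVA recursion for a single-class closed queueing network. It does not use or imitate the PDQ library's API; the function name, the service demands and the think time are hypothetical.

```python
def mva(service_demands, think_time, n_users):
    """Exact MVA for a closed network of queueing centres (n_users >= 1).
    service_demands: seconds of service per request at each resource.
    Returns (throughput in requests/second, response time in seconds)."""
    queue = [0.0] * len(service_demands)
    throughput = response = 0.0
    for n in range(1, n_users + 1):
        # An arriving request sees the queue left by n - 1 other users.
        residence = [d * (1 + q) for d, q in zip(service_demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        # Little's law gives the new mean queue length at each resource.
        queue = [throughput * r for r in residence]
    return throughput, response


# Hypothetical demands calibrated in QA: 50 ms CPU and 30 ms disk per request.
demands = [0.05, 0.03]
for users in (1, 10, 50):
    x, r = mva(demands, think_time=1.0, n_users=users)
    print(users, round(x, 2), round(r, 3))
```

Throughput can never exceed the reciprocal of the largest service demand, so the model shows the bottleneck resource directly as user count grows.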
    97. 97. References and Conclusion June 2, 2010 Adrian Cockcroft and Mario Jauvin
    98. 98. Licences for Free Tools <ul><li>Open Source Initiative </li></ul><ul><ul><li>“ OSI Approved licences” </li></ul></ul><ul><ul><li>http://opensource.org/licenses/category </li></ul></ul><ul><li>Comparisons of Common Licences </li></ul><ul><ul><li>http://zooko.com/license_quick_ref.html </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    99. 99. Web Pages and Books <ul><li>Adrian’s Performance and other topics blog </li></ul><ul><ul><li>http://perfcap.blogspot.com </li></ul></ul><ul><li>MFJ Associates performance tools link page </li></ul><ul><ul><li>http://www.mfjassociates.net/perf_links.html </li></ul></ul><ul><li>More free tools compiled by John Sellens </li></ul><ul><ul><li>http://www.generalconcepts.com/resources/monitoring/ </li></ul></ul><ul><li>More tools compiled by Openxtra </li></ul><ul><ul><li>http://www.openxtra.co.uk/resource-center/open_source_network_monitor_tools.php </li></ul></ul><ul><li>SE toolkit info: Sun Performance and Tuning - Java and the Internet - Adrian Cockcroft and Richard Pettit - Sun Press/Prentice Hall, 2nd Edition, 1998 ISBN 0-13-095249-4 </li></ul><ul><li>Solaris 8 and Linux: System Performance Tuning 2nd Edition – Gian-Paolo Musumeci, O’Reilly 2002 ISBN: 0-596-00284-X </li></ul><ul><li>Solaris Internals http://www.solarisinternals.com </li></ul><ul><ul><li>Richard McDougall and James Mauro - new 2nd edition and new performance book by Richard McDougall and Brendan Gregg </li></ul></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    100. 100. Concluding Remarks <ul><li>Many large installations depend on free tools </li></ul><ul><li>A full suite of functionality is available </li></ul><ul><li>Several tools are needed to cover the bases </li></ul><ul><li>Tradeoff between function and ease of use </li></ul><ul><li>Support may be available, but typically Google is the best support tool </li></ul><ul><li>Functionality is increasing…. </li></ul>June 2, 2010 Adrian Cockcroft and Mario Jauvin
    101. 101. Questions? [email_address] [email_address] June 2, 2010 Adrian Cockcroft and Mario Jauvin