SlideShare a Scribd company logo
1 of 44
Visualization and alerting with 
RRD, Nagios XI, and PNP 
Rob Seiwert 
rob@vcaglobal.com
Introduction 
• IT Director for Video Corporation of America for the last 18 
years. 
• Used Big Brother since Beta until 2011. 
• Used MRTG pre RRD and RRD from Beta. 
• Moved to Nagios Core in 2011. 
• Moved to Nagios XI in 2012.
Introduction & Agenda 
• A Real World Problem 
• Trending and Prediction using RRD 
• Creating Cool PNP Templates 
• The problem with Cool Visualizations 
• LSL and what does it all mean 
• A Proactive Solution
Real World Problem 
• Run away processes consuming SAN storage 
• Virtual Storage complicated and masked the issue 
• Wanted earlier notice of pending issues 
• Needed to work with my history
Looking for Something Better 
This doesn’t look so bad does it?
A little better
Trending Options in RRD 
• Holt-Winters Forecasting (RRA) 
• Trend 
• Predict 
• Least Squares Line
LSL Trending using RRDTool 
http://tiskanto.blogspot.com/2011/12/trend-predictions-with-rrd-tool-not-so.html 
http://blog.hermione.de/?p=67 
http://hints.jeb.be/2009/12/04/trend-prediction-with-rrdtool/
Getting it into Nagios / PNP 
Custom Templates are stored at two places in the file system. 
• share/templates.dist - for templates included in the PNP package 
• share/templates - for custom made templates which are not changed 
during updates 
If the graph for the service “http” on host “localhost” should be shown, PNP will 
look for the XML file perfdata/localhost/http.xml and read its contents. The XML 
files are created automatically by PNP and contain information about the particular 
host and service. The header contains information about the plugin and the 
performance data. The XML tag <TEMPLATE> identifies which PNP template will be 
used for this graph.
Finding the Template 
PNP will append .php to the template value found in the xml and look for a 
template with the name check_http.php in the following sequence: 
1. templates/check_http.php 
2. templates.dist/check_http.php 
3. templates/default.php 
4. templates.dist/default.php 
The template default.php takes an exceptional position as it is used every 
time no other applicable template is found.
XML Contents 
Useful information from the PNP XML files is available within your template. 
<NAGIOS> 
<DATASOURCE> 
<TEMPLATE>check_http</TEMPLATE> 
<DS>1</DS> 
<NAME>time</NAME> 
<UNIT>s</UNIT> 
<ACT>0.006721</ACT> 
<WARN>1.000000</WARN> 
<CRIT>2.000000</CRIT> 
<MIN>0.000000</MIN> 
<MAX></MAX> 
</DATASOURCE>
PNP Templates 
Building the RRD options 
$opt[1] = "--vertical-label "Disk Usage (%)" --height 200 -l 0 -u 100 --upper-limit 100 --title " $hostname / $servicedesc" "; 
$opt[1] .= " --start now-7d --end now+7d --right-axis 1:0 ";
Building Graph Definition 
Define the RRD data to test 
$def[1] .= "DEF:used1=$rrdfile:$DS[1]:AVERAGE "; 
# used2 = data used to calc 1 day trend 
$def[1] .= "DEF:used2=$rrdfile:$DS[1]:AVERAGE:start=now-1d "; 
# used2 = data used to calc 2 day trend 
$def[1] .= "DEF:used3=$rrdfile:$DS[1]:AVERAGE:start=now-2d "; 
# used3 = data used to calc 1 week trend 
$def[1] .= "DEF:used4=$rrdfile:$DS[1]:AVERAGE:start=now-7d ";
Whats a DEF? 
DEF 
DEF:<vname>=<rrdfile>:<ds-name>:<CF>[:step=<step>][:start=<time>][:end=<time>] 
This command fetches data from an RRD file 
CDEF 
CDEF:vname=RPN expression 
This command creates a new set of data points (in memory only, not in the RRD file) out of one or more other data series. 
VDEF 
VDEF:vname=RPN expression 
This command returns a value and/or a time according to the RPN statements used.
What’s a RPN Expression 
RPN = Reverse Polish Notation 
In reverse Polish notation the operators follow their operands; for instance, to add 3 and 4, one 
would write "3 4 +" rather than "3 + 4". If there are multiple operations, the operator is given 
immediately after its second operand; so the expression written "3 − 4 + 5" in conventional 
notation would be written "3 4 − 5 +" in RPN: 4 is first subtracted from 3, then 5 added to it. An 
advantage of RPN is that it obviates the need for parentheses that are required by infix. 
http://en.wikipedia.org/wiki/Reverse_Polish_notation
Working in Percent 
I decided to visualize this as a percentage. Could just as well been Bytes. 
The plugin author in this case had decided to put the max space in as a second data source probably due to the 
virtual and dynamic nature of the Storage Pool. 
$def[1] .= "DEF:cap=$rrdfile:$DS[2]:MAX "; 
$def[1] .= "CDEF:pused1=used1,100,*,cap,/ "; 
$def[1] .= "CDEF:pused2=used2,100,*,cap,/ "; 
$def[1] .= "CDEF:pused3=used3,100,*,cap,/ "; 
$def[1] .= "CDEF:pused4=used4,100,*,cap,/ "; 
This uses gets the max capacity from the second data source in the Perf data then creates calculated datasets as a 
percent for each range using reverse polish notation.
Getting the LSL Trend 
$def[1] .= "VDEF:D2=pused2,LSLSLOPE "; # this is the slope (m) 
$def[1] .= "VDEF:H2=pused2,LSLINT "; # this is the y intercept 
$def[1] .= "VDEF:C2=pused2,LSLCORREL "; # this is the Correlation Coeffient 
$def[1] .= "CDEF:trend2=pused2,POP,D2,COUNT,*,H2,+ "; # new cdef with trend 
$def[1] .= "CDEF:trend2limit=trend2,0,100,LIMIT "; # trim the trend to run from 0 to 100 
$def[1] .= "CDEF:danger2=trend2,90,100,LIMIT "; # find the danger zone 
$def[1] .= "VDEF:min2=danger2,FIRST "; # datapoint that crosses 90% 
$def[1] .= "VDEF:max2=danger2,LAST "; # returns the datapoint that crosses 100% 
The forth line creates the trend line. Each cdef must have a time reference so we push a dataset into the equation and pop it off. 
This just gives us a valid cdef. Next push the slope and and count (aka x) and multiply them. Next we add in the y intercept. 
Basically line 4 builds a data set using the equation: y = mx + b 
Repeat for all other data sets you want to trend. At this point we have all we need. Time to make the chart
Adding the Lines 
$def[1] .= "AREA:pused1#BDF400AA:"Disk Usage % " "; 
$def[1] .= "LINE1:pused1#000000 "; 
$def[1] .= "LINE1:trend4limit#B70094:"Trend last Week ":dashes "; 
$def[1] .= "LINE1:trend3limit#0077FF:"Trend last 2 Days ":dashes "; 
$def[1] .= "LINE1:trend2limit#FF9900:"Trend last 24 Hoursn":dashes ";
Adding the Danger Zone 
$def[1] .= "AREA:danger4#B7009477 "; 
$def[1] .= "AREA:danger3#0077FF77 "; 
$def[1] .= "AREA:danger2#FF990077 "; 
$def[1] .= "LINE2:danger4#B70094 "; 
$def[1] .= "LINE2:danger3#0077FF "; 
$def[1] .= "LINE2:danger2#FF9900 ";
Adding the Danger Zone Markers 
Cool alternate to warning and critical lines. 
This creates the two pink bars from 90% to 100% 
$def[1] .="LINE1:90 AREA:5#FF000022::STACK AREA:5#FF000044::STACK ";
Adding the Danger Dates 
$def[1] .= "COMMENT:"Reach 90% " "; 
$def[1] .= "GPRINT:min4:"%x":strftime "; 
$def[1] .= "GPRINT:min3:"%21x":strftime "; 
$def[1] .= "GPRINT:min2:"%21xn":strftime "; 
$def[1] .= "COMMENT:"Reach 100%" "; 
$def[1] .= "GPRINT:max4:"%x":strftime "; 
$def[1] .= "GPRINT:max3:"%21x":strftime "; 
$def[1] .= "GPRINT:max2:"%21xn":strftime ";
End Result
Second Chart 
This second chart is simpler but follows the same layout. The difference is that it takes the 
PNP time range and extends the chart by the same amount giving you a dynamic trend. In 
this example, it’s the standard 24 hour period and the template adds another 24 hours 
into the future.
Strftime and Unknown Dates 
http://www.wolfgangsvault.com/santana/poster-art/poster/BG209.html
The Problem with Pretty Charts 
Had my chart that predicted when my SAN would hit 100%. As 
disk usage increased and decreased you could see the trending 
lines jump. 
The problem with cool visualizations is that they have to be 
watched. 
It seemed to me that it should be possible to build an alert using 
the data available in RRD.
LSL – Least Squares Line 
LSLSLOPE, LSLINT, LSLCORREL 
Return the parameters for a Least Squares Line (y = mx +b) 
which approximate the provided dataset. 
LSLSLOPE is the slope (m) of the line related to the COUNT position of the data. 
LSLINT is the y-intercept (b), which happens also to be the first data point on the graph. 
LSLCORREL is the Correlation Coefficient 
(also know as Pearson's Product Moment Correlation Coefficient). 
It will range from 0 to +/-1 and represents the quality of fit for the approximation. 
$def[1] .= "VDEF:D2=pused2,LSLSLOPE "; # this is the slope (m) 
$def[1] .= "VDEF:H2=pused2,LSLINT "; # this is the y intercept 
$def[1] .= "VDEF:C2=pused2,LSLCORREL "; # this is the Correlation Coeffient
What is the Slope 
• Rise over Run 
• Rate of Change 
• describes both the direction and the steepness of the line. 
• denoted by the letter m in y=mx+b 
• For RRD is the rate of change for each step in the dataset
What is the Y-Intercept 
The point at which a curve or function crosses the y-axis 
Basically the Starting point for Y when X=0
What is the Correlation Coefficient 
• The linear correlation between the data set and the resultant 
slope. 
• represents the quality of fit for the approximation 
• Returns a value between +1 and −1 inclusive 
• 1 is total positive correlation 
• 0 is no correlation 
• −1 is total negative correlation.
What is the Correlation Coefficient
What is the Correlation Coefficient
Solving for X 
Knowing the maximum capacity for y (100%) and y=mx+b after 
calculating the LSLSlope(m) and the LSLINT(b) we can solve for x. 
x=(y-b)/m 
Question is what is X really?
What is X Really? 
X is the number of steps to reach the upper limit 
Knowing the step size is critical. Be aware setting step on the DEF 
or the options doesn’t guarantee that is the step used. 
X * stepsize = number of seconds to reach 100% from the 
beginning of the trend
Building the check 
result=$(rrdtool graph --width 4000 --step $step  
DEF:used1=/usr/local/nagios/share/perfdata/iSCSIgroup/Disk_Usage.rrd:1:AVERAGE:start=now-48h:step=$step  
DEF:used2=/usr/local/nagios/share/perfdata/iSCSIgroup/Disk_Usage.rrd:1:AVERAGE:start=now-48h:step=$step  
DEF:cap=/usr/local/nagios/share/perfdata/iSCSIgroup/Disk_Usage.rrd:2:MAX  
VDEF:SLOPE2=used2,LSLSLOPE  
VDEF:H2=used2,LSLINT  
VDEF:C2=used2,LSLCORREL  
CDEF:c=used2,COUNT,EXC,POP  
CDEF:t=used2,TIME,EXC,POP  
VDEF:tmin=t,MINIMUM  
VDEF:tmax=t,MAXIMUM  
VDEF:cmax=c,MAXIMUM  
CDEF:s=used2,POP,tmax,tmin,-,cmax,1,-,/  
PRINT:SLOPE2:'%lf'  
PRINT:H2:'%lf'  
PRINT:C2:'%4.2lf'  
PRINT:SLOPE2:'%4.2lf%s'  
PRINT:cap:MAX:'%lf'  
PRINT:s:MAX:'%16.0lf'  
PRINT:tmin:'%16.0lf'  
PRINT:tmax:'%16.0lf')
Building the Check 
if [ "X$result" != "X" ] ; then 
# split results to $1 $2 etc 
set $result 
else 
# no results must be error 
exit 
fi 
timerange="Data Analyized Starting 
$(date -d @$8) - Ending $(date -d @$9)" 
# $2 = Slope = m 
# $3 = y-intercept = b 
# $4 = Corralation Coefficient 
# $5 = Slope, SI Scaled 
# $6 = Max Capacity 
# $7 = Step 
# $8 = Start Time (sec since epoch) 
# $9 = End Time (secs since epoch)
Building the Check 
status=0 
buf="Reach 100% on $(date -d @$reach100in) Slope $5 per $7 seconds Correlation Coefficient $4" 
timetill=$(($reach100in-$now)) 
if [ $timetill -lt $(($day*30)) ] ; then 
status=1 
buf="WARNING: $buf" 
fi 
if [ $timetill -lt $(($day*2)) ] ; then 
status=2 
buf="CRITICAL: $buf" 
fi 
echo $buf 
echo $timerange 
echo "|'Rate of Change per Min'=$2" 
exit $status
Resulting Check
“Performance Data”
Other Lessons Learned 
• RRD Step Size is not always what you think 
• Data Resolution will affect your results 
• Follow the Perf Data API 
• Use Bytes without UOM and let RRD Scale 
• Default Template differs between PNP and XI 
• Use Check_pnp_rrd.pl
Follow the PerfData API 
Nagios PerfData API 
https://nagios-plugins.org/doc/guidelines.html#AEN200 
'label'=value[UOM];[warn];[crit];[min];[max] 
• Recommend setting all fields available 
• In my real world example the plugin author had decided to list volume and snapshot reserves in the 
warn and crit fields and put the max value into a second dataset. Follow the standard!
SI: Le Système International d'Unités 
The International System of Units 
the modern form of the metric system and is the world's most widely used system of 
measurement. It comprises a coherent system of units of measurement built around 
seven base units, 
http://en.wikipedia.org/wiki/Second
SI Scaling 
http://en.wikipedia.org/wiki/Metric_prefix 
Metric prefixes in everyday use 
Text Symbol Factor 
tera T 1000000000000 
giga G 1000000000 
mega M 1000000 
kilo k 1000 
hecto h 100 
deca da 10 
(none) (none) 1 
deci d 0.1 
centi c 0.01 
milli m 0.001 
micro μ 0.000001 
nano n 0.000000001 
pico p 0.000000000001
Questions? 
Any questions? 
Thanks!
The End 
Robert C. Seiwert 
rob@vcaglobal.com

More Related Content

What's hot

More than syntax
More than syntaxMore than syntax
More than syntaxWooga
 
Small pieces loosely joined
Small pieces loosely joinedSmall pieces loosely joined
Small pieces loosely joinedennui2342
 
Programa Sumar y Multiplicar
Programa Sumar y MultiplicarPrograma Sumar y Multiplicar
Programa Sumar y MultiplicarAnais Rodriguez
 
The why and how of moving to php 5.4/5.5
The why and how of moving to php 5.4/5.5The why and how of moving to php 5.4/5.5
The why and how of moving to php 5.4/5.5Wim Godden
 
CLI, the other SAPI phpnw11
CLI, the other SAPI phpnw11CLI, the other SAPI phpnw11
CLI, the other SAPI phpnw11Combell NV
 
Cli the other SAPI confoo11
Cli the other SAPI confoo11Cli the other SAPI confoo11
Cli the other SAPI confoo11Combell NV
 
Redis & ZeroMQ: How to scale your application
Redis & ZeroMQ: How to scale your applicationRedis & ZeroMQ: How to scale your application
Redis & ZeroMQ: How to scale your applicationrjsmelo
 
[xp2013] Narrow Down What to Test
[xp2013] Narrow Down What to Test[xp2013] Narrow Down What to Test
[xp2013] Narrow Down What to TestZsolt Fabok
 
Practical File of C Language
Practical File of C LanguagePractical File of C Language
Practical File of C LanguageRAJWANT KAUR
 
Program to find the avg of two numbers
Program to find the avg of two numbersProgram to find the avg of two numbers
Program to find the avg of two numbersSwarup Boro
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeWim Godden
 
The Ring programming language version 1.6 book - Part 9 of 189
The Ring programming language version 1.6 book - Part 9 of 189The Ring programming language version 1.6 book - Part 9 of 189
The Ring programming language version 1.6 book - Part 9 of 189Mahmoud Samir Fayed
 
Php 7 errors messages
Php 7 errors messagesPhp 7 errors messages
Php 7 errors messagesDamien Seguy
 
Smolder @Silex
Smolder @SilexSmolder @Silex
Smolder @SilexJeen Lee
 
Node.js API 서버 성능 개선기
Node.js API 서버 성능 개선기Node.js API 서버 성능 개선기
Node.js API 서버 성능 개선기JeongHun Byeon
 
Streaming Way to Webscale: How We Scale Bitly via Streaming
Streaming Way to Webscale: How We Scale Bitly via StreamingStreaming Way to Webscale: How We Scale Bitly via Streaming
Streaming Way to Webscale: How We Scale Bitly via StreamingAll Things Open
 
Cli the other sapi pbc11
Cli the other sapi pbc11Cli the other sapi pbc11
Cli the other sapi pbc11Combell NV
 

What's hot (20)

More than syntax
More than syntaxMore than syntax
More than syntax
 
Small pieces loosely joined
Small pieces loosely joinedSmall pieces loosely joined
Small pieces loosely joined
 
Programa Sumar y Multiplicar
Programa Sumar y MultiplicarPrograma Sumar y Multiplicar
Programa Sumar y Multiplicar
 
The why and how of moving to php 5.4/5.5
The why and how of moving to php 5.4/5.5The why and how of moving to php 5.4/5.5
The why and how of moving to php 5.4/5.5
 
CLI, the other SAPI phpnw11
CLI, the other SAPI phpnw11CLI, the other SAPI phpnw11
CLI, the other SAPI phpnw11
 
Cli the other SAPI confoo11
Cli the other SAPI confoo11Cli the other SAPI confoo11
Cli the other SAPI confoo11
 
C Programming Exam problems & Solution by sazzad hossain
C Programming Exam problems & Solution by sazzad hossainC Programming Exam problems & Solution by sazzad hossain
C Programming Exam problems & Solution by sazzad hossain
 
Redis & ZeroMQ: How to scale your application
Redis & ZeroMQ: How to scale your applicationRedis & ZeroMQ: How to scale your application
Redis & ZeroMQ: How to scale your application
 
[xp2013] Narrow Down What to Test
[xp2013] Narrow Down What to Test[xp2013] Narrow Down What to Test
[xp2013] Narrow Down What to Test
 
Practical File of C Language
Practical File of C LanguagePractical File of C Language
Practical File of C Language
 
Program to find the avg of two numbers
Program to find the avg of two numbersProgram to find the avg of two numbers
Program to find the avg of two numbers
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code
 
The Ring programming language version 1.6 book - Part 9 of 189
The Ring programming language version 1.6 book - Part 9 of 189The Ring programming language version 1.6 book - Part 9 of 189
The Ring programming language version 1.6 book - Part 9 of 189
 
errors in php 7
errors in php 7errors in php 7
errors in php 7
 
Php 7 errors messages
Php 7 errors messagesPhp 7 errors messages
Php 7 errors messages
 
Smolder @Silex
Smolder @SilexSmolder @Silex
Smolder @Silex
 
C Programming
C ProgrammingC Programming
C Programming
 
Node.js API 서버 성능 개선기
Node.js API 서버 성능 개선기Node.js API 서버 성능 개선기
Node.js API 서버 성능 개선기
 
Streaming Way to Webscale: How We Scale Bitly via Streaming
Streaming Way to Webscale: How We Scale Bitly via StreamingStreaming Way to Webscale: How We Scale Bitly via Streaming
Streaming Way to Webscale: How We Scale Bitly via Streaming
 
Cli the other sapi pbc11
Cli the other sapi pbc11Cli the other sapi pbc11
Cli the other sapi pbc11
 

Similar to Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios

Slicing of Object-Oriented Programs
Slicing of Object-Oriented ProgramsSlicing of Object-Oriented Programs
Slicing of Object-Oriented ProgramsPraveen Penumathsa
 
Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit
 
Slicing of Object-Oriented Programs
Slicing of Object-Oriented ProgramsSlicing of Object-Oriented Programs
Slicing of Object-Oriented ProgramsPraveen Penumathsa
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Yao Yao
 
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionStar Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionFranck Pachot
 
Compiler Construction | Lecture 10 | Data-Flow Analysis
Compiler Construction | Lecture 10 | Data-Flow AnalysisCompiler Construction | Lecture 10 | Data-Flow Analysis
Compiler Construction | Lecture 10 | Data-Flow AnalysisEelco Visser
 
My Postdoctoral Research
My Postdoctoral ResearchMy Postdoctoral Research
My Postdoctoral ResearchPo-Ting Wu
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph ProcessingVasia Kalavri
 
Spark ml streaming
Spark ml streamingSpark ml streaming
Spark ml streamingAdam Doyle
 
Problem solving using computers - Unit 1 - Study material
Problem solving using computers - Unit 1 - Study materialProblem solving using computers - Unit 1 - Study material
Problem solving using computers - Unit 1 - Study materialTo Sum It Up
 
Performance measures
Performance measuresPerformance measures
Performance measuresDivya Tiwari
 
Regression Analysis on Flights data
Regression Analysis on Flights dataRegression Analysis on Flights data
Regression Analysis on Flights dataMansi Verma
 
Types of c operators ppt
Types of c operators pptTypes of c operators ppt
Types of c operators pptViraj Shah
 

Similar to Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios (20)

Rseminarp
RseminarpRseminarp
Rseminarp
 
Slicing of Object-Oriented Programs
Slicing of Object-Oriented ProgramsSlicing of Object-Oriented Programs
Slicing of Object-Oriented Programs
 
Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van Hovell
 
Slicing of Object-Oriented Programs
Slicing of Object-Oriented ProgramsSlicing of Object-Oriented Programs
Slicing of Object-Oriented Programs
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
 
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionStar Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
 
C important questions
C important questionsC important questions
C important questions
 
Compiler Construction | Lecture 10 | Data-Flow Analysis
Compiler Construction | Lecture 10 | Data-Flow AnalysisCompiler Construction | Lecture 10 | Data-Flow Analysis
Compiler Construction | Lecture 10 | Data-Flow Analysis
 
My Postdoctoral Research
My Postdoctoral ResearchMy Postdoctoral Research
My Postdoctoral Research
 
Oct.22nd.Presentation.Final
Oct.22nd.Presentation.FinalOct.22nd.Presentation.Final
Oct.22nd.Presentation.Final
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph Processing
 
Spark ml streaming
Spark ml streamingSpark ml streaming
Spark ml streaming
 
C code examples
C code examplesC code examples
C code examples
 
Problem solving using computers - Unit 1 - Study material
Problem solving using computers - Unit 1 - Study materialProblem solving using computers - Unit 1 - Study material
Problem solving using computers - Unit 1 - Study material
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Operators1.pptx
Operators1.pptxOperators1.pptx
Operators1.pptx
 
Performance measures
Performance measuresPerformance measures
Performance measures
 
Regression Analysis on Flights data
Regression Analysis on Flights dataRegression Analysis on Flights data
Regression Analysis on Flights data
 
CppTutorial.ppt
CppTutorial.pptCppTutorial.ppt
CppTutorial.ppt
 
Types of c operators ppt
Types of c operators pptTypes of c operators ppt
Types of c operators ppt
 

More from Nagios

Nagios XI Best Practices
Nagios XI Best PracticesNagios XI Best Practices
Nagios XI Best PracticesNagios
 
Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewNagios
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The HoodNagios
 
Sean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsSean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsNagios
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionNagios
 
Janice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsJanice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsNagios
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceNagios
 
Mike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksMike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksNagios
 
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationMike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationNagios
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Nagios
 
Matt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosMatt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosNagios
 
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Nagios
 
Eric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosEric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosNagios
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Nagios
 
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Nagios
 
Nagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios
 
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNagios
 
Nagios Log Server - Features
Nagios Log Server - FeaturesNagios Log Server - Features
Nagios Log Server - FeaturesNagios
 
Nagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios
 
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios
 

More from Nagios (20)

Nagios XI Best Practices
Nagios XI Best PracticesNagios XI Best Practices
Nagios XI Best Practices
 
Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture Overview
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The Hood
 
Sean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsSean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient Notifications
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
 
Janice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsJanice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios Plugins
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical Experience
 
Mike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksMike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service Checks
 
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationMike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
 
Matt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosMatt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With Nagios
 
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
 
Eric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosEric Loyd - Fractal Nagios
Eric Loyd - Fractal Nagios
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
 
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
 
Nagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson Opening
 
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
 
Nagios Log Server - Features
Nagios Log Server - FeaturesNagios Log Server - Features
Nagios Log Server - Features
 
Nagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios Network Analyzer - Features
Nagios Network Analyzer - Features
 
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
 

Recently uploaded

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 

Recently uploaded (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 

Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios

  • 1. Visualization and alerting with RRD, Nagios XI, and PNP Rob Seiwert rob@vcaglobal.com
  • 2. Introduction • IT Director for Video Corporation of America for the last 18 years. • Used Big Brother since Beta until 2011. • Used MRTG pre RRD and RRD from Beta. • Moved to Nagios Core in 2011. • Moved to Nagios XI in 2012.
  • 3. Introduction & Agenda • A Real World Problem • Trending and Prediction using RRD • Creating Cool PNP Templates • The problem with Cool Visualizations • LSL and what does it all mean • A Proactive Solution
  • 4. Real World Problem • Run away processes consuming SAN storage • Virtual Storage complicated and masked the issue • Wanted earlier notice of pending issues • Needed to work with my history
  • 5. Looking for Something Better This doesn’t look so bad does it?
  • 7. Trending Options in RRD • Holt-Winters Forecasting (RRA) • Trend • Predict • Least Squares Line
  • 8. LSL Trending using RRDTool http://tiskanto.blogspot.com/2011/12/trend-predictions-with-rrd-tool-not-so.html http://blog.hermione.de/?p=67 http://hints.jeb.be/2009/12/04/trend-prediction-with-rrdtool/
  • 9. Getting it into Nagios / PNP Custom Templates are stored at two places in the file system. • share/templates.dist - for templates included in the PNP package • share/templates - for custom made templates which are not changed during updates If the graph for the service “http” on host “localhost” should be shown, PNP will look for the XML file perfdata/localhost/http.xml and read its contents. The XML files are created automatically by PNP and contain information about the particular host and service. The header contains information about the plugin and the performance data. The XML tag <TEMPLATE> identifies which PNP template will be used for this graph.
  • 10. Finding the Template PNP will append .php to the template value found in the xml and look for a template with the name check_http.php in the following sequence: 1. templates/check_http.php 2. templates.dist/check_http.php 3. templates/default.php 4. templates.dist/default.php The template default.php takes an exceptional position as it is used every time no other applicable template is found.
  • 11. XML Contents Useful information from the PNP XML files is available within your template. <NAGIOS> <DATASOURCE> <TEMPLATE>check_http</TEMPLATE> <DS>1</DS> <NAME>time</NAME> <UNIT>s</UNIT> <ACT>0.006721</ACT> <WARN>1.000000</WARN> <CRIT>2.000000</CRIT> <MIN>0.000000</MIN> <MAX></MAX> </DATASOURCE>
  • 12. PNP Templates Building the RRD options $opt[1] = "--vertical-label "Disk Usage (%)" --height 200 -l 0 -u 100 --upper-limit 100 --title " $hostname / $servicedesc" "; $opt[1] .= " --start now-7d --end now+7d --right-axis 1:0 ";
  • 13. Building Graph Definition Define the RRD data to test $def[1] .= "DEF:used1=$rrdfile:$DS[1]:AVERAGE "; # used2 = data used to calc 1 day trend $def[1] .= "DEF:used2=$rrdfile:$DS[1]:AVERAGE:start=now-1d "; # used2 = data used to calc 2 day trend $def[1] .= "DEF:used3=$rrdfile:$DS[1]:AVERAGE:start=now-2d "; # used3 = data used to calc 1 week trend $def[1] .= "DEF:used4=$rrdfile:$DS[1]:AVERAGE:start=now-7d ";
  • 14. Whats a DEF? DEF DEF:<vname>=<rrdfile>:<ds-name>:<CF>[:step=<step>][:start=<time>][:end=<time>] This command fetches data from an RRD file CDEF CDEF:vname=RPN expression This command creates a new set of data points (in memory only, not in the RRD file) out of one or more other data series. VDEF VDEF:vname=RPN expression This command returns a value and/or a time according to the RPN statements used.
  • 15. What’s a RPN Expression RPN = Reverse Polish Notation In reverse Polish notation the operators follow their operands; for instance, to add 3 and 4, one would write "3 4 +" rather than "3 + 4". If there are multiple operations, the operator is given immediately after its second operand; so the expression written "3 − 4 + 5" in conventional notation would be written "3 4 − 5 +" in RPN: 4 is first subtracted from 3, then 5 added to it. An advantage of RPN is that it obviates the need for parentheses that are required by infix. http://en.wikipedia.org/wiki/Reverse_Polish_notation
  • 16. Working in Percent I decided to visualize this as a percentage. Could just as well been Bytes. The plugin author in this case had decided to put the max space in as a second data source probably due to the virtual and dynamic nature of the Storage Pool. $def[1] .= "DEF:cap=$rrdfile:$DS[2]:MAX "; $def[1] .= "CDEF:pused1=used1,100,*,cap,/ "; $def[1] .= "CDEF:pused2=used2,100,*,cap,/ "; $def[1] .= "CDEF:pused3=used3,100,*,cap,/ "; $def[1] .= "CDEF:pused4=used4,100,*,cap,/ "; This uses gets the max capacity from the second data source in the Perf data then creates calculated datasets as a percent for each range using reverse polish notation.
  • 17. Getting the LSL Trend $def[1] .= "VDEF:D2=pused2,LSLSLOPE "; # this is the slope (m) $def[1] .= "VDEF:H2=pused2,LSLINT "; # this is the y intercept $def[1] .= "VDEF:C2=pused2,LSLCORREL "; # this is the Correlation Coeffient $def[1] .= "CDEF:trend2=pused2,POP,D2,COUNT,*,H2,+ "; # new cdef with trend $def[1] .= "CDEF:trend2limit=trend2,0,100,LIMIT "; # trim the trend to run from 0 to 100 $def[1] .= "CDEF:danger2=trend2,90,100,LIMIT "; # find the danger zone $def[1] .= "VDEF:min2=danger2,FIRST "; # datapoint that crosses 90% $def[1] .= "VDEF:max2=danger2,LAST "; # returns the datapoint that crosses 100% The forth line creates the trend line. Each cdef must have a time reference so we push a dataset into the equation and pop it off. This just gives us a valid cdef. Next push the slope and and count (aka x) and multiply them. Next we add in the y intercept. Basically line 4 builds a data set using the equation: y = mx + b Repeat for all other data sets you want to trend. At this point we have all we need. Time to make the chart
  • 18. Adding the Lines $def[1] .= "AREA:pused1#BDF400AA:"Disk Usage % " "; $def[1] .= "LINE1:pused1#000000 "; $def[1] .= "LINE1:trend4limit#B70094:"Trend last Week ":dashes "; $def[1] .= "LINE1:trend3limit#0077FF:"Trend last 2 Days ":dashes "; $def[1] .= "LINE1:trend2limit#FF9900:"Trend last 24 Hoursn":dashes ";
  • 19. Adding the Danger Zone $def[1] .= "AREA:danger4#B7009477 "; $def[1] .= "AREA:danger3#0077FF77 "; $def[1] .= "AREA:danger2#FF990077 "; $def[1] .= "LINE2:danger4#B70094 "; $def[1] .= "LINE2:danger3#0077FF "; $def[1] .= "LINE2:danger2#FF9900 ";
  • 20. Adding the Danger Zone Markers Cool alternate to warning and critical lines. This creates the two pink bars from 90% to 100% $def[1] .="LINE1:90 AREA:5#FF000022::STACK AREA:5#FF000044::STACK ";
  • 21. Adding the Danger Dates $def[1] .= "COMMENT:"Reach 90% " "; $def[1] .= "GPRINT:min4:"%x":strftime "; $def[1] .= "GPRINT:min3:"%21x":strftime "; $def[1] .= "GPRINT:min2:"%21xn":strftime "; $def[1] .= "COMMENT:"Reach 100%" "; $def[1] .= "GPRINT:max4:"%x":strftime "; $def[1] .= "GPRINT:max3:"%21x":strftime "; $def[1] .= "GPRINT:max2:"%21xn":strftime ";
  • 23. Second Chart This second chart is simpler but follows the same layout. The difference is that it takes the PNP time range and extends the chart by the same amount giving you a dynamic trend. In this example, it’s the standard 24 hour period and the template adds another 24 hours into the future.
  • 24. Strftime and Unknown Dates http://www.wolfgangsvault.com/santana/poster-art/poster/BG209.html
  • 25. The Problem with Pretty Charts Had my chart that predicted when my SAN would hit 100%. As disk usage increased and decreased you could see the trending lines jump. The problem with cool visualizations is that they have to be watched. It seemed to me that it should be possible to build an alert using the data available in RRD.
  • 26. LSL – Least Squares Line LSLSLOPE, LSLINT, LSLCORREL Return the parameters for a Least Squares Line (y = mx +b) which approximate the provided dataset. LSLSLOPE is the slope (m) of the line related to the COUNT position of the data. LSLINT is the y-intercept (b), which happens also to be the first data point on the graph. LSLCORREL is the Correlation Coefficient (also know as Pearson's Product Moment Correlation Coefficient). It will range from 0 to +/-1 and represents the quality of fit for the approximation. $def[1] .= "VDEF:D2=pused2,LSLSLOPE "; # this is the slope (m) $def[1] .= "VDEF:H2=pused2,LSLINT "; # this is the y intercept $def[1] .= "VDEF:C2=pused2,LSLCORREL "; # this is the Correlation Coeffient
  • 27. What is the Slope • Rise over Run • Rate of Change • describes both the direction and the steepness of the line. • denoted by the letter m in y=mx+b • For RRD is the rate of change for each step in the dataset
  • 28. What is the Y-Intercept The point at which a curve or function crosses the y-axis Basically the Starting point for Y when X=0
  • 29. What is the Correlation Coefficient • The linear correlation between the data set and the resultant slope. • represents the quality of fit for the approximation • Returns a value between +1 and −1 inclusive • 1 is total positive correlation • 0 is no correlation • −1 is total negative correlation.
  • 30. What is the Correlation Coefficient
  • 31. What is the Correlation Coefficient
  • 32. Solving for X Knowing the maximum capacity for y (100%) and y=mx+b after calculating the LSLSlope(m) and the LSLINT(b) we can solve for x. x=(y-b)/m Question is what is X really?
  • 33. What is X Really? X is the number of steps to reach the upper limit Knowing the step size is critical. Be aware setting step on the DEF or the options doesn’t guarantee that is the step used. X * stepsize = number of seconds to reach 100% from the beginning of the trend
  • 34. Building the check result=$(rrdtool graph --width 4000 --step $step DEF:used1=/usr/local/nagios/share/perfdata/iSCSIgroup/Disk_Usage.rrd:1:AVERAGE:start=now-48h:step=$step DEF:used2=/usr/local/nagios/share/perfdata/iSCSIgroup/Disk_Usage.rrd:1:AVERAGE:start=now-48h:step=$step DEF:cap=/usr/local/nagios/share/perfdata/iSCSIgroup/Disk_Usage.rrd:2:MAX VDEF:SLOPE2=used2,LSLSLOPE VDEF:H2=used2,LSLINT VDEF:C2=used2,LSLCORREL CDEF:c=used2,COUNT,EXC,POP CDEF:t=used2,TIME,EXC,POP VDEF:tmin=t,MINIMUM VDEF:tmax=t,MAXIMUM VDEF:cmax=c,MAXIMUM CDEF:s=used2,POP,tmax,tmin,-,cmax,1,-,/ PRINT:SLOPE2:'%lf' PRINT:H2:'%lf' PRINT:C2:'%4.2lf' PRINT:SLOPE2:'%4.2lf%s' PRINT:cap:MAX:'%lf' PRINT:s:MAX:'%16.0lf' PRINT:tmin:'%16.0lf' PRINT:tmax:'%16.0lf')
  • 35. Building the Check if [ "X$result" != "X" ] ; then # split results to $1 $2 etc set $result else # no results must be error exit fi timerange="Data Analyized Starting $(date -d @$8) - Ending $(date -d @$9)" # $2 = Slope = m # $3 = y-intercept = b # $4 = Corralation Coefficient # $5 = Slope, SI Scaled # $6 = Max Capacity # $7 = Step # $8 = Start Time (sec since epoch) # $9 = End Time (secs since epoch)
  • 36. Building the Check status=0 buf="Reach 100% on $(date -d @$reach100in) Slope $5 per $7 seconds Correlation Coefficient $4" timetill=$(($reach100in-$now)) if [ $timetill -lt $(($day*30)) ] ; then status=1 buf="WARNING: $buf" fi if [ $timetill -lt $(($day*2)) ] ; then status=2 buf="CRITICAL: $buf" fi echo $buf echo $timerange echo "|'Rate of Change per Min'=$2" exit $status
  • 39. Other Lessons Learned • RRD Step Size is not always what you think • Data Resolution will affect your results • Follow the Perf Data API • Use Bytes without UOM and let RRD Scale • Default Template differs between PNP and XI • Use Check_pnp_rrd.pl
  • 40. Follow the PerfData API Nagios PerfData API https://nagios-plugins.org/doc/guidelines.html#AEN200 'label'=value[UOM];[warn];[crit];[min];[max] • Recommend setting all fields available • In my real world example the plugin author had decided to list volume and snapshot reserves in the warn and crit fields and put the max value into a second dataset. Follow the standard!
  • 41. SI: Le Système International d'Unités The International System of Units the modern form of the metric system and is the world's most widely used system of measurement. It comprises a coherent system of units of measurement built around seven base units, http://en.wikipedia.org/wiki/Second
  • 42. SI Scaling http://en.wikipedia.org/wiki/Metric_prefix Metric prefixes in everyday use Text Symbol Factor tera T 1000000000000 giga G 1000000000 mega M 1000000 kilo k 1000 hecto h 100 deca da 10 (none) (none) 1 deci d 0.1 centi c 0.01 milli m 0.001 micro μ 0.000001 nano n 0.000000001 pico p 0.000000000001
  • 44. The End Robert C. Seiwert rob@vcaglobal.com

Editor's Notes

  1. http://docs.pnp4nagios.org/pnp-0.6/tpl
  2. http://en.wikipedia.org/wiki/Least_squares
  3. http://mathworld.wolfram.com/y-Intercept.html
  4. http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
  5. http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
  6. http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient