95th percentile calculation with RRDtool

I’ve managed to come up with some useful scripts that use RRDtool to pull data out of round-robin archives (RRAs) and to compute the 95th percentile.  RRDtool is used as the database backend for multiple monitoring tools such as Cacti, Torrus, Zenoss, and MRTG (as an option).

In working this out, it helped to have my own Cacti deployment handy so that I could compare syntax against Cacti’s own RRDtool queries.  Cacti is an excellent tool because it provides a nice frontend for setting up the SNMP polls to devices; the data points returned by those polls are stored in the RRAs in RRDtool.  It then uses RRDtool queries to extract, process, and graph the data.  It does this without adding a layer of obfuscation on top of RRDtool and is extremely transparent: its graphical web interface lets you see exactly what commands it’s running.

Extracting Data Points

Using the rrdtool xport command, you can pull data points from an RRA into XML format like so:

rrdtool xport DEF:xx="11_22_33_44_traffic_in_225.rrd":traffic_in:AVERAGE DEF:yy="11_22_33_44_traffic_in_225.rrd":traffic_out:AVERAGE CDEF:aa=xx,8,* CDEF:bb=yy,8,* CDEF:sum=aa,bb,+ XPORT:aa:"in bits" XPORT:bb:"out bits" XPORT:sum:"sum"

The output of this will look like the following:

<?xml version="1.0" encoding="ISO-8859-1"?>

<xport>
   <meta>
       <start>1268688300</start>
       <step>300</step>
       <end>1268774700</end>
       <rows>289</rows>
       <columns>3</columns>
       <legend>
          <entry>in bits</entry>
          <entry>out bits</entry>
          <entry>sum</entry>
       </legend>
       </meta>
       <data>
          <row><t>1268688300</t><v>2.7999872862e+07</v><v>7.5003438338e+06</v><v>3.5500216695e+07</v></row>
          <row><t>1268688600</t><v>2.8997785260e+07</v><v>7.5029700077e+06</v><v>3.6500755268e+07</v></row>
          <row><t>1268688900</t><v>2.8835230581e+07</v><v>7.6799495891e+06</v><v>3.6515180170e+07</v></row>

What I want to do now is extract just the raw data set from this query and present it in a clean form.  This data can then be processed further; for example, it can be output in CSV format for processing somewhere else (which might be useful if you are working on billing calculation methods).

To do this I only need two other tools, grep and the GNU version of awk, gawk:

rrdtool xport DEF:xx="11_22_33_44_traffic_in_225.rrd":traffic_in:AVERAGE DEF:yy="11_22_33_44_traffic_in_225.rrd":traffic_out:AVERAGE CDEF:aa=xx,8,* CDEF:bb=yy,8,* CDEF:sum=aa,bb,+ XPORT:aa:"in bits" XPORT:bb:"out bits" XPORT:sum:"sum" |grep -v NaN |grep -F -e '<row>' |gawk -F '<[^>]*>' '{printf "%s %10.3f %10.3f %10.3f\n", strftime("%Y.%m.%d-%T",$3)" ", $5" ", $7" ", $9}'

The actual data points are contained in the lines enclosed by the <row> </row> tags.  I first use grep to filter out entries with NaN values, which are data points that are empty for one reason or another (usually because the SNMP poll to the device did not return a result).  I then use grep again to select only the lines containing <row>, which filters out all the XML header info at the top.

After grepping, the output will look like:

          <row><t>1268693700</t><v>3.3158293270e+07</v><v>7.5986690338e+06</v><v>4.0756962303e+07</v></row>
          <row><t>1268694000</t><v>3.5909368728e+07</v><v>7.7049765378e+06</v><v>4.3614345266e+07</v></row>
          <row><t>1268694300</t><v>3.4919549475e+07</v><v>8.4181615151e+06</v><v>4.3337710990e+07</v></row>

Next, I use gawk and specify the field delimiter as the angle-bracketed XML tags using the regex <[^>]*>.  I then print the fields in question: in this case fields 3, 5, 7, and 9.  The other fields are empty strings or whitespace between adjacent tags.  If you are using a different rrdtool query with different CDEFs or XPORT statements, the field numbers may be different.
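To see why the data lands in fields 3, 5, 7, and 9, it can help to print every field of a single row next to its index.  The following is a small sketch using plain awk on one sample row copied from the grep output above; show_fields is just an illustrative helper name, not part of RRDtool.

```shell
# Print each field of an xport <row> line with its index,
# splitting on XML tags exactly as the gawk command does.
# show_fields is a hypothetical helper name used for illustration.
show_fields() {
    awk -F '<[^>]*>' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

printf '%s\n' '<row><t>1268693700</t><v>3.3158293270e+07</v><v>7.5986690338e+06</v><v>4.0756962303e+07</v></row>' | show_fields
```

Every tag counts as one delimiter, so two adjacent tags such as <row><t> leave an empty field between them.  That is why the timestamp and the three values sit in the odd-numbered fields.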

Finally, I use gawk’s strftime function to convert the first column of data, which is the epoch date (the number of seconds since 1 January 1970), to a more readable date.  Note that I chose a date format which still sorts correctly in chronological order (this would not be the case if, for example, the format was Jan 1, 1970 12:00).

The output will now look like:

2010.03.15-16:00:00  35909368.728 7704976.538 43614345.266
2010.03.15-16:05:00  34919549.475 8418161.515 43337710.990
2010.03.15-16:10:00  35540540.978 8270836.766 43811377.744

With the preceding output it is now trivial to use the cut command or some more awk to extract fields, or use the sort command to sort by certain columns.  For example:

sort -t ' ' -rn -k 3,3

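will sort numerically in descending order by the third space-delimited field.  Because -t ' ' treats every single space as a separator, the two spaces after the date produce an empty second field, so field 3 here is the "in bits" value.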
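As mentioned earlier, the cleaned output can also be turned into CSV for processing elsewhere.  The sketch below is one possible way to do it, using the three sample lines shown above; to_csv is an illustrative helper name.  It sorts by the sum column using sort’s default whitespace field splitting (so the sum is field 4) rather than -t ' ', then joins the columns with commas.

```shell
# Convert whitespace-separated columns to comma-separated columns.
# Assigning $1 to itself makes awk rebuild the line using OFS.
# to_csv is a hypothetical helper name used for illustration.
to_csv() {
    awk 'BEGIN { OFS = "," } { $1 = $1; print }'
}

# Sort by the sum (4th whitespace-delimited column), largest first, then emit CSV.
cat <<'EOF' | sort -k 4,4nr | to_csv
2010.03.15-16:00:00  35909368.728 7704976.538 43614345.266
2010.03.15-16:05:00  34919549.475 8418161.515 43337710.990
2010.03.15-16:10:00  35540540.978 8270836.766 43811377.744
EOF
```

With this sample input the 16:10:00 line comes first, since it has the largest sum.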

95th Percentile Calculation

Here I am going to use rrdtool graph to output the 95th percentile calculation.  I won’t go into detail about how to define the necessary CDEF and VDEF statements (the RRDtool documentation covers these in detail); the following is the command I used:

rrdtool graph /dev/null -f ''  DEF:xx="11_22_33_44_traffic_in_225.rrd":traffic_in:AVERAGE DEF:yy="11_22_33_44_traffic_in_225.rrd":traffic_out:AVERAGE CDEF:aa=xx,8,* CDEF:bb=yy,8,* CDEF:totsum=aa,bb,+ VDEF:inper95=aa,95,PERCENT VDEF:outper95=bb,95,PERCENT VDEF:sumper95=totsum,95,PERCENT PRINT:inper95:"95th percentile bits in: "%lf PRINT:outper95:"95th percentile bits out: "%lf PRINT:sumper95:"95th percentile bits sum: "%lf

The output will look like:

95th percentile bits in: 46280432.074401
95th percentile bits out: 10138325.618306
95th percentile bits sum: 55849477.154300

Even though this uses the rrdtool graph command, the image itself is discarded by writing it to /dev/null, so only the PRINT lines appear on the console.  It is necessary to use the rrdtool graph command because you cannot specify a VDEF with rrdtool xport.
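As a rough cross-check, a percentile can also be computed directly from a column of exported values with sort and awk.  The sketch below uses the simple nearest-rank definition, which is an assumption on my part and not necessarily what RRDtool’s PERCENT does internally (in particular, unknown values may be handled differently), so small differences from the VDEF result are to be expected.  The percentile function name is illustrative, and it assumes an integer percentage.

```shell
# percentile P: read one number per line on stdin and print the
# nearest-rank Pth percentile (P is assumed to be an integer, 1..100).
# percentile is a hypothetical helper name, not an RRDtool command.
percentile() {
    sort -n | awk -v p="$1" '
        { v[NR] = $1 }                       # values arrive in ascending order
        END {
            if (NR == 0) exit 1              # no input, nothing to report
            rank = int((p * NR + 99) / 100)  # ceiling of p*NR/100 for integer p
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

# Demonstration on the integers 1..100.
awk 'BEGIN { for (i = 1; i <= 100; i++) print i }' | percentile 95
```

To apply it to the cleaned output from the previous section, you could select one column first, for example by piping through awk '{print $4}' before percentile 95.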

