Monitoring Cisco IP SLAs: rrdtool

by gorthx

Last week I set up an SLA to monitor jitter and looked at the stats available on the command line. This week, I’ll graph those stats with rrdtool.

As I mentioned, this is quick-and-dirty monitoring, just meant to give me an idea of what I’m working with. It isn’t necessarily something I’d deploy to production, especially since there are products available that already do this (which I hope to have time to review in the near future).

These OIDs from CISCO-RTTMON-MIB look promising:

rttMonCtrlAdminTag.10 = STRING: "backwater-office-VoiceTest"
rttMonCtrlAdminRttType.10 = INTEGER: jitter(9)
...
rttMonLatestJitterOperMOS.10 = Gauge32: 434
rttMonLatestJitterOperAvgJitter.10 = Gauge32: 1
rttMonLatestJitterOperAvgSDJ.10 = Gauge32: 1
rttMonLatestJitterOperAvgDSJ.10 = Gauge32: 1
...
rttMonLatestJitterOperNumOfRTT.10 = Gauge32: 996
rttMonLatestJitterOperRTTSum.10 = Gauge32: 40079
rttMonLatestJitterOperRTTSum2.10 = Gauge32: 1613583
rttMonLatestJitterOperRTTMin.10 = Gauge32: 39
rttMonLatestJitterOperRTTMax.10 = Gauge32: 48

Notes:
– the SLA ID is the iid (the instance identifier at the end of each OID, 10 in this case)
– there are other configuration-related OIDs available, but I’m not going to bother with them at this time.
– there are also OIDs that will tell me how many packets were lost from probes, etc. (similar to the output of ‘sh ip sla statistics’), but I’m not going to bother with those either.
– From reading the MIB, I gather that NumOfRTT is a gauge because it’s a per-operation ‘bucket’: it counts the RTTs that were successfully measured in the most recent operation, and RTTSum is the sum of those same RTTs. I’m not 100% clear on all the details yet, but (RTTSum/NumOfRTT) gives me a reasonable value, so I’m going with it.
– yes, we have to do some math to get the AvgRTT here.
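
The arithmetic from the last two notes can be sketched in shell, using the sample values from the output above. The function names `avg_rtt` and `mos_score` are made up for illustration; the divide-by-100 for MOS matches the CDEF in the graph commands further down.

```shell
#!/bin/sh
# avg_rtt RTTSum NumOfRTT -> average RTT in ms, two decimal places.
# Prints U (rrdtool's "unknown") if no RTTs were measured, to avoid
# dividing by zero.
avg_rtt() {
    awk -v sum="$1" -v n="$2" \
        'BEGIN { if (n > 0) printf "%.2f\n", sum / n; else print "U" }'
}

# mos_score RawMOS -> MOS on the usual scale (the router reports it x100).
mos_score() {
    awk -v raw="$1" 'BEGIN { printf "%.2f\n", raw / 100 }'
}

avg_rtt 40079 996    # prints 40.24
mos_score 434        # prints 4.34
```

An average of about 40 ms sits between the reported RTTMin (39) and RTTMax (48), which is why the result looks reasonable.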

Next steps:
1. Create an appropriate RRD (see below)
2. Set up a quick perl script to do the polling, calculate the AvgRTT, and update the rrd.
3. Graph it!
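
A minimal sketch of step 2, written in shell rather than perl, and not the script that produced the graphs in this post. It assumes net-snmp's `snmpget` is installed with CISCO-RTTMON-MIB loaded; the host name, community string, and the function names `get_oid`, `build_update`, and `poll_once` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical settings: adjust for your own router and SLA.
HOST="router.example.com"
COMMUNITY="public"
SLA_ID=10
RRD="PROBE_10_backwater-office-VoiceTest.rrd"

# Fetch one value from the latest-jitter-operation OIDs listed above.
# -Oqv asks snmpget to print the bare value only.
get_oid() {
    snmpget -v2c -c "$COMMUNITY" -Oqv "$HOST" \
        "CISCO-RTTMON-MIB::rttMonLatestJitterOper$1.$SLA_ID"
}

# build_update MOS AvgJitter AvgSDJ AvgDSJ NumOfRTT RTTSum RTTMin RTTMax
# Prints the value string for "rrdtool update", in the same order as the
# DS definitions in the RRD, computing AvgRTT = RTTSum / NumOfRTT.
# MOS is stored raw (x100); the graph commands divide it by 100.
build_update() {
    awk -v mos="$1" -v j="$2" -v sdj="$3" -v dsj="$4" \
        -v n="$5" -v sum="$6" -v min="$7" -v max="$8" 'BEGIN {
        avg = (n > 0) ? sprintf("%.2f", sum / n) : "U"
        printf "N:%s:%s:%s:%s:%s:%s:%s\n", mos, j, sdj, dsj, avg, min, max
    }'
}

# Poll the router once and write one sample to the RRD.
poll_once() {
    values=$(build_update \
        "$(get_oid MOS)" "$(get_oid AvgJitter)" \
        "$(get_oid AvgSDJ)" "$(get_oid AvgDSJ)" \
        "$(get_oid NumOfRTT)" "$(get_oid RTTSum)" \
        "$(get_oid RTTMin)" "$(get_oid RTTMax)")
    rrdtool update "$RRD" "$values"
}
```

Calling `poll_once` from cron every 5 minutes would match rrdtool's default 300-second step. With the sample values above, `build_update 434 1 1 1 996 40079 39 48` prints `N:434:1:1:1:40.24:39:48`.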

After about a week, this is what we have:

[Graph of RTT]
[Graph of jitter + MOS]

Looks like Something Happened [tm] on Tuesday. Next installment: configuring syslog messages about “events” like this.


Here is the command I used to create my RRD:

rrdtool create PROBE_10_backwater-office-VoiceTest.rrd \
--start -4d \
DS:MOS:GAUGE:900:0:U \
DS:AvgJitter:GAUGE:900:0:U \
DS:AvgSDJ:GAUGE:900:0:U \
DS:AvgDSJ:GAUGE:900:0:U \
DS:AvgRTT:GAUGE:900:0:U \
DS:RTTMin:GAUGE:900:0:U \
DS:RTTMax:GAUGE:900:0:U \
RRA:AVERAGE:0.5:1:2304 \
RRA:AVERAGE:0.5:6:1536 \
RRA:AVERAGE:0.5:24:2268 \
RRA:AVERAGE:0.5:288:1890 \
RRA:MAX:0.5:1:2304 \
RRA:MAX:0.5:6:1536 \
RRA:MAX:0.5:24:2268 \
RRA:MAX:0.5:288:1890 \
RRA:MIN:0.5:1:2304 \
RRA:MIN:0.5:6:1536 \
RRA:MIN:0.5:24:2268 \
RRA:MIN:0.5:288:1890
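
To see how much history each RRA keeps: the create command doesn't pass --step, so rrdtool falls back to its default of 300 seconds, and each RRA then spans (steps per row × rows × 300) seconds. A quick shell check of that arithmetic (the function name `rra_days` is just for illustration):

```shell
#!/bin/sh
STEP=300   # rrdtool's default step in seconds, since no --step is given

# rra_days STEPS ROWS -> days of history held by an RRA that consolidates
# STEPS primary data points per row and keeps ROWS rows.
rra_days() {
    echo $(( $1 * $2 * STEP / 86400 ))
}

for rra in 1:2304 6:1536 24:2268 288:1890; do
    steps=${rra%%:*}
    rows=${rra##*:}
    echo "RRA $rra keeps $(rra_days "$steps" "$rows") days"
done
```

That works out to 8 days of 5-minute data, 32 days of 30-minute data, 189 days of 2-hour data, and 1890 days of daily data. The 900 in each DS line is the heartbeat: if more than 15 minutes pass between updates, that data source is recorded as unknown.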

…and to create the graphs:

# jitter
rrdtool graph PROBE_10_backwater-office-VoiceTest_jitter.png \
--imgformat PNG \
--start end-1w \
--end now \
--width 600 \
--height 200 \
--title "jitter data for PROBE - SLA 10 - backwater-office-VoiceTest" \
--vertical-label "value" \
--lower-limit 0 \
--alt-autoscale \
DEF:AvgJitter=PROBE_10_backwater-office-VoiceTest.rrd:AvgJitter:AVERAGE \
DEF:MinJitter=PROBE_10_backwater-office-VoiceTest.rrd:AvgJitter:MIN \
DEF:MaxJitter=PROBE_10_backwater-office-VoiceTest.rrd:AvgJitter:MAX \
DEF:AvgSDJ=PROBE_10_backwater-office-VoiceTest.rrd:AvgSDJ:AVERAGE \
DEF:MinSDJ=PROBE_10_backwater-office-VoiceTest.rrd:AvgSDJ:MIN \
DEF:MaxSDJ=PROBE_10_backwater-office-VoiceTest.rrd:AvgSDJ:MAX \
DEF:AvgDSJ=PROBE_10_backwater-office-VoiceTest.rrd:AvgDSJ:AVERAGE \
DEF:MinDSJ=PROBE_10_backwater-office-VoiceTest.rrd:AvgDSJ:MIN \
DEF:MaxDSJ=PROBE_10_backwater-office-VoiceTest.rrd:AvgDSJ:MAX \
DEF:RawMOS=PROBE_10_backwater-office-VoiceTest.rrd:MOS:AVERAGE \
DEF:RawMinMOS=PROBE_10_backwater-office-VoiceTest.rrd:MOS:MIN \
DEF:RawMaxMOS=PROBE_10_backwater-office-VoiceTest.rrd:MOS:MAX \
CDEF:MOS=RawMOS,100,/ \
CDEF:MinMOS=RawMinMOS,100,/ \
CDEF:MaxMOS=RawMaxMOS,100,/ \
LINE2:AvgJitter#800000:"Jitter            :" \
GPRINT:MinJitter:MIN:"Min %1.2lf" \
GPRINT:AvgJitter:AVERAGE:"Avg %1.2lf" \
GPRINT:MaxJitter:MAX:"Max %1.2lf\l" \
LINE2:AvgSDJ#4E387E:"Jitter, src to dst:" \
GPRINT:MinSDJ:MIN:"Min %1.2lf" \
GPRINT:AvgSDJ:AVERAGE:"Avg %1.2lf" \
GPRINT:MaxSDJ:MAX:"Max %1.2lf\l" \
LINE2:AvgDSJ#677d37:"Jitter, dst to src:" \
GPRINT:MinDSJ:MIN:"Min %1.2lf" \
GPRINT:AvgDSJ:AVERAGE:"Avg %1.2lf" \
GPRINT:MaxDSJ:MAX:"Max %1.2lf\l" \
LINE1:MOS#307D7E:"MOS               :" \
GPRINT:MinMOS:MIN:"Min %1.2lf" \
GPRINT:MOS:AVERAGE:"Avg %1.2lf" \
GPRINT:MaxMOS:MAX:"Max %1.2lf\l" \
COMMENT:"Graph created at `date "+%Y-%b-%d %H:%M"`"

# rtt
rrdtool graph PROBE_10_backwater-office-VoiceTest_rtt.png \
--imgformat PNG \
--start end-1w \
--end now \
--width 600 \
--height 200 \
--title "rtt data for PROBE - SLA 10 - backwater-office-VoiceTest" \
--vertical-label "ms" \
--lower-limit 0 \
--alt-autoscale \
DEF:AvgRTT=PROBE_10_backwater-office-VoiceTest.rrd:AvgRTT:AVERAGE \
DEF:RTTMin=PROBE_10_backwater-office-VoiceTest.rrd:RTTMin:MIN \
DEF:RTTMax=PROBE_10_backwater-office-VoiceTest.rrd:RTTMax:MAX \
LINE2:AvgRTT#0000AA:"RTT:" \
GPRINT:RTTMin:MIN:"Min %1.2lf" \
GPRINT:AvgRTT:AVERAGE:"Avg %1.2lf" \
GPRINT:RTTMax:MAX:"Max %1.2lf\l" \
COMMENT:"Graph created at `date "+%Y-%b-%d %H:%M"`"


3 Responses to “Monitoring Cisco IP SLAs: rrdtool”

  1. Could you please share perl script to populate data into rrd?

    • I can’t, sorry – it makes use of some proprietary code of my employer’s. You could whip something up pretty quickly with SNMP.pm and the OIDs above, though.
