Archive for ‘system monitoring’

13 June, 2014

Postgres Monitoring Wishlist

by gorthx

Since “what should I monitor in my database” has come up in conversation several times lately, I thought I’d put this here where I (theoretically) won’t lose it. I’ll save for later the discussion of where to get this info and which tools give me which stats :)

Bare minimum:
server CPU, memory, I/O, network usage, and all “slow” queries logged.

More extensive:
System stats:
CPU usage, per-proc if available
Memory usage, including swap
disk usage (in terms of space – pay special attention to database partitions)
disk I/O
disk busy
network stats, including errors (if you have a Cisco network & are friends with the network team, netflow data is cool to have)

If I could have everything I wanted: everything from vmstat and iostat extended data
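
For the disk-space item, a minimal sketch of a threshold check might look like the following. It assumes POSIX-format `df -P` output on stdin (use% in column 5, mount point in column 6); the function name `disk_over` is made up for illustration, and mount points containing spaces would need extra handling.

```shell
# disk_over LIMIT: read `df -P` output on stdin and print every mount point
# whose use% is at or above LIMIT. Hypothetical helper, not part of any tool.
disk_over() {
  awk -v limit="$1" 'NR > 1 {          # NR > 1 skips the df header row
    pct = $5; sub(/%/, "", pct)        # "91%" -> "91"
    if (pct + 0 >= limit + 0) print $6, pct "%"
  }'
}
```

Typical use would be `df -P | disk_over 80`, which lists any partition at 80% or more, database partitions included.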

Pg stats:
number of connections
transactions
idle transactions
commits vs rollbacks
locks
checkpoint frequency
database size
table size (plus bloat, if we can find a good query for it)
index size (same)

If I could have everything I wanted:
everything from pg_stat_database, pg_stat_sys_*, pg_statio_*, and pg_stat_activity
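
As a rough illustration, several of the Pg items above (connections, commits vs rollbacks, database size, idle transactions) can be pulled from pg_stat_database and pg_stat_activity in one query. This is only a sketch: the helper name `pg_wishlist_sql` is hypothetical, and the `state` column it relies on assumes PostgreSQL 9.2 or later.

```shell
# pg_wishlist_sql: print a query covering a few wishlist items.
# Hypothetical helper; it only emits SQL and does not connect to anything.
pg_wishlist_sql() {
  cat <<'SQL'
SELECT d.datname,
       d.numbackends               AS connections,
       d.xact_commit               AS commits,
       d.xact_rollback             AS rollbacks,
       pg_database_size(d.datname) AS size_bytes,
       (SELECT count(*)
          FROM pg_stat_activity a
         WHERE a.datname = d.datname
           AND a.state = 'idle in transaction') AS idle_in_xact
  FROM pg_stat_database d
 WHERE d.datname IS NOT NULL
 ORDER BY d.datname;
SQL
}
```

One way to run it is `pg_wishlist_sql | psql -X -At -F, -d postgres`, which prints one comma-separated row per database. The counters are cumulative, so a monitoring system would graph the difference between successive polls.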

Activity logs configured as outlined here.

Then there’s a whole class of things that fall under “How long does it take to…”: do a backup, restore a backup, etc.
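
A minimal sketch for the “how long does it take to…” class: wrap the job in a small timer and log the result. The name `timed` is made up for illustration, and the resolution is whole seconds because it relies on `date +%s`.

```shell
# timed LABEL COMMAND [ARGS...]: run COMMAND, then report the elapsed
# wall-clock seconds on stderr and pass the command's exit status through.
timed() {
  timed_label=$1; shift
  timed_start=$(date +%s)
  "$@"
  timed_status=$?
  timed_end=$(date +%s)
  echo "$timed_label took $((timed_end - timed_start))s (exit $timed_status)" >&2
  return "$timed_status"
}
```

For example, `timed nightly-backup pg_dump -Fc -f /tmp/mydb.dump mydb` would time a dump; the database name and path here are placeholders.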

22 March, 2013

Monitoring tools: nmon

by gorthx

Next up in my occasional monitoring tools review series: another oldie-but-goodie, readily available tool, nmon.

What: nmon
What it monitors: system stats
Where to get it: it’s probably pre-installed on your system. If not, get it from SourceForge.
Why you’d want (or not) to use it: Pretty much the same reasons you’d want to use sar, as I discussed previously.

I’ve (casually) used the interactive interface, and until a few weeks ago, thought that was all there was to this tool. Not so. There’s an option (-f) you can use to save the collected data to a file, in “spreadsheet format”. You can also specify an interval in seconds (-s) and a number of polls to take (-c):

nmon -f -s 60 -c 60
= poll once a minute for an hour.

nmon will create a file for you, with a default name of [server]-timestamp.nmon, or you can specify your own filename with -F.

To generate graphs, there are two Excel spreadsheets you can download from the wiki. I tried the nmon Analyzer Spreadsheet (the newer of the two). The docs recommend “keep the number of snapshots to around 300”. I agree. The graphs look a lot nicer with fewer data points in them. However, Excel graphs just aren’t as pretty as rrdtool graphs.
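
To stay near that 300-snapshot guideline for a capture of a given length, the interval is just the duration divided by the snapshot count. A small sketch of the arithmetic (the function name `nmon_plan` is hypothetical; it only prints a command line and does not run nmon):

```shell
# nmon_plan HOURS [SNAPSHOTS]: print an nmon command that spreads SNAPSHOTS
# polls (default 300) evenly across HOURS hours.
nmon_plan() {
  hours=$1
  snapshots=${2:-300}
  interval=$(( hours * 3600 / snapshots ))   # seconds between polls, rounded down
  [ "$interval" -ge 1 ] || interval=1        # never go below one second
  echo "nmon -f -s $interval -c $snapshots"
}
```

`nmon_plan 24` suggests a poll every 288 seconds for a full day, and `nmon_plan 1 60` reproduces the once-a-minute-for-an-hour example above.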

There’s an nmon2rrd tool, but it was compiled for AIX so I didn’t try it out.
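
If all you want is a quick look at a capture without Excel, the .nmon file is plain comma-separated text and can be picked apart with awk. The sketch below assumes the usual layout as I understand it: `ZZZZ,Tnnnn,HH:MM:SS,DATE` lines carry the timestamp for each snapshot, and `CPU_ALL,Tnnnn,User%,Sys%,Wait%,Idle%,...` lines carry the overall CPU figures. Check a few lines of your own file before trusting it. The function name is made up.

```shell
# nmon_cpu_busy FILE: print "HH:MM:SS user%+sys%" for each snapshot.
# The file is read twice: first to collect timestamps, then to print CPU rows,
# so it works whether or not the file has been sorted.
nmon_cpu_busy() {
  awk -F, '
    NR == FNR { if ($1 == "ZZZZ") ts[$2] = $3; next }   # pass 1: timestamps
    $1 == "CPU_ALL" && $2 ~ /^T[0-9]+$/ {               # pass 2: skip header row
      printf "%s %.1f\n", ts[$2], $3 + $4
    }
  ' "$1" "$1"
}
```

The output is two columns, which is enough to feed gnuplot or a quick `sort -k2 -n | tail` to find the busiest minutes.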

Of the two, if I’m looking for on-the-spot visualization of system performance, nmon wins it. For storage and later review of the data, I’d go with sar + sar2rrd.pl over nmon + the Excel spreadsheet. The graphs are prettier and easier to read with sar.

15 March, 2013

Monitoring Tools: sar

by gorthx

What: sar
What it monitors: pretty much every system stat you can imagine (and some you haven’t)
Where to get it: it’s probably pre-installed on your system; if not, try the sysstat package (the same one that includes iostat)
Why you’d want to use it:

  • you need an answer fast, but maybe don’t have access to the “enterprise” monitoring (or there isn’t any…[1])
  • you’re doing system testing and want a command-line tool that’s easy to configure and run in discrete timeframes.

Why you wouldn’t want to use it:

  • you want data you can easily throw into a graphing or analysis program; the data produced by sar isn’t readily machine-readable
  • you’re looking for a near-real-time long-term monitoring solution. In that case, just go ahead and set up munin or collectd.
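
If you do need sar’s CPU report in a form a graphing tool will accept, a little awk gets you most of the way. This is a sketch under assumptions: it expects the text output of `sar -u` on stdin, with an `all` row per interval and `%idle` as the last column, and it copes with both 12-hour (`12:10:01 AM`) and 24-hour timestamps. The function name `sar_cpu_csv` is hypothetical.

```shell
# sar_cpu_csv: read `sar -u` text on stdin, print "time,busy%" per interval,
# where busy% is 100 minus the %idle column.
sar_cpu_csv() {
  awk '
    $1 == "Average:" { next }                 # drop the summary row
    {
      for (i = 2; i < NF; i++) {
        if ($i == "all") {                    # data row for all CPUs
          t = $1
          for (j = 2; j < i; j++) t = t " " $j   # keep AM/PM if present
          printf "%s,%.2f\n", t, 100 - $NF
          break
        }
      }
    }'
}
```

Usage would be something like `sar -u | sar_cpu_csv > cpu.csv`. Header and banner lines have no `all` field, so they fall through without output.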

Because it’s lightweight and so readily available, it’s a good tool to have in your toolbox. Plus, it’ll tell you things like fan speed and temperature, and I’m just a sucker for environmental monitoring [2].
