There’s that saying about “the first time’s an accident, the second’s a coincidence, the third is a pattern”. It’s probably because I’ve been studying Postgres’s autovacuum feature so much lately and these things stand out to me now, but I’ve noticed a really intriguing pattern (n>5) over the past month or so: folks with their autovacuum_naptime set too dang high.
The autovacuum_naptime GUC is the amount of time the daemon waits between checking for tables that need vacuuming and analyzing (or have xids that need freezing). If you set this to, for example, one day1, each table in your database gets checked only once a day. It’s certainly possible that this will work ok for your use case. What tends to happen, though, is that you end up with several tables that need maintenance, but the long naptime doesn’t allow that to happen in a timely fashion. And of course, the longer you go between vaccing/analyzing/freezing them, the longer it takes to get that job done and soon you’re in a downward spiral, and maybe your database ends up taking a dirt nap, at least from the viewpoint of your users who are all saying “OMG, why is this so sloooooow?” Then you conclude “autovacuum sucks” and disable it and try to stay on top of it with cron jobs etc like back in the Bad Old Days When We Didn’t Have Nice Things.
So, while the autovac config params are all interdependent to some extent2, this is kind of a big one and IME you don’t want to increase the naptime beyond the already-conservative default of 1 minute.
1 – Yes, really. Not trying to embarrass anyone here! Rest assured you are not alone.
2 – I was going to say that individual GUCs don’t operate in a vacuum, but figured that might be taking things Just Too Far.