DBA: Night Watch

Many systems have a pattern of constant accumulation of data over time. Moreover, most of them never change again – that is, they are written in the mode append-only

These are not only various kinds of logs and equipment metrics, but also such seemingly unrelated things as correspondence between users or comments on news.

About a year ago, I already wrote about the model for organizing the partitioning of such data and the cascading changes in the database structure caused by this. And today, using the example of our service for analyzing the logs of PostgreSQL servers, we will analyze the features of servicing the databases organized like this, and how a competent approach (and a little night work) can reduce infrastructure costs at times

Something we are heavily loading the disk …

The entire base of our service is “sawn” into daily sections and reorganized for maximum throughput. You can read more about this in the articles “Writing in PostgreSQL on the sublight: 1 host, 1 day, 1TB” and “Save a penny on large volumes in PostgreSQL”.

At the same time, we try to write to the database in an extremely balanced way (see “Teleportation of tons of data in PostgreSQL”), but all the same, our “wishes” exceed the capabilities of the disk subsystem – it is 100% busy regularly and the access queues are small, but there are always:

“Chukchi is not a reader, Chukchi is a writer!” Or not?

No matter how amazing it is when write at speeds up to 200MB / s:

… and you read no less.

Stop – reading? .. But we don’t have tons of ineffective queries that read so much! But more than half of all disk operations are reads!

How so? Who is this villain who reads so much? And why is it here up to 18 maintenance processes in the middle of the day? These include different VACUUM, ANALYZE, CREATE INDEX, etc.

How and what we monitor in PostgreSQL is worth reading in the article “Monitoring the PostgreSQL database – who is to blame and what to do”.

“These are the times! – thought Stirlitz”

Well, let’s use the autoVACUUM / autoANALYZE analysis functionality:

It turns out that we run during the day a bunch of autovacuum (to prevent wraparound) on sections of some other days – and they all “miss” past the data cache, and climb into the disk! That is, we write so many transactions to the database that it regularly begins to want them to “clean up” the old sections, and we have them up to 150GB… Hmm.

These are the two, Mueller thought, and threw the second one.

So, what does autoANALYZE do with us? ..

A bunch of times during the day are done ANALYZE-runs on append-only signs! There is not much sense in this either, since all records here are obviously unique and have PK.

Night Watch and more

So, let’s start with the tail – get rid of autoanalyze. To do this, reconfigure the corresponding server parameters:

ALTER SYSTEM SET autovacuum_analyze_scale_factor = 1;
-- ждем для следующего сканирования, пока в табличку не запишут еще столько же (x2) записей

ALTER SYSTEM SET autovacuum_analyze_threshold = 100000;
-- ... но не меньше 100K

Now let’s take a closer look at the “old” sections. They become outdated exactly when we stop writing in them – around midnight, just when the load is minimal.

We have two types of them:

  • relatively small with support INSERT … ON CONFLICT UPDATE with all sorts of counters

  • mega-big append-only “facts”


Due to the specifics of MVCC, the first type of sections turns out to be “very full of holes” by the end of the day. This slows down reading and results in more busyness.

To “collapse” a section without blocking access to it, we developed daily night cron scriptstarting at 00:15, which successively applies to all such tables of the “previous day” pg_repack, “collapses” them and physically reorders the records according to the most used index:

Module pgrepack Is an extension to PostgreSQL that allows you to eliminate voids in tables and indexes and can optionally restore the physical order of clustered indexes. Unlike CLUSTER and VACUUM FULL, it performs these operations on the fly, bypassing exclusive table locks during processing. In addition, pgrepack is effective, demonstrating performance comparable to direct use CLUSTER


For all other (append-only) tables – do it forcibly VACUUM FREEZE, removing the desire from the base to do it “suddenly” during the working day:

VACUUM FREEZE rawdata_20190419;
VACUUM FREEZE rawplan_20190419;


Disk read capacity decreased by 20 times, the disk load (disk busy) decreased to ~ 60%:

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *