Instant postgres snapshots on tablespace and btrfs


Yet Another Postges on BTRFS

For work, it can be useful to have several copies of one real base for experiments, fixtures, or just test applications. The base is growing and the time of copying through expanding the dump or using a template also increases to tedious amounts. To solve this case, options for using a file system with CoW support have already been described – btrfs. Found on the Internet such instructions, they come down to the fact that a snapshot of the entire server is made. And for the second “copied” to work, you need to regenerate pid and change the port to prevent conflicts. This method is quite universal in relation to the configuration of databases on the server, but it seems to have a limit for an indefinite number of parallel server snapshots.

In this article, I propose my own implementation of snapshots on one instance of the postgres server and one database, for an arbitrary number of copies.

Instruction linux only, about support for CoW file systems on Windows is not in the know.

I. Preparation

1. We will need a special section for Btrfs, where the databases and snapshots will be stored.

  • This can be a standalone device (hardwired or virtual device manager)

  • This can be a blank partition on the current device (configure partitions via cfdisk)

  • It can even be a file-based loop device (See losetup)

# Пусть устройство называется
/dev/sdb

2. Let’s format the device in Btrfs.

sudo apt-get update && apt-get install -y btrfs-progs
sudo mkfs.btrfs -L btrfs_disk /dev/sdb

3. Mount the device to the system.

# Для примера возьмем каталог /mnt/btrfs
sudo mkdir -p /mnt/btrfs && sudo mount /dev/sdb /mnt/btrfs

– (Optional) You can make a separate directory just for tablespace.

sudo mkdir /mnt/btrfs/pg

4. Install postgres server if not installed. (Instructions can be found at office website)

II. To create the initial base $ORIGIN (the one we will snapshot):

1. Create btrfs subkey

sudo btrfs subvolume create /mnt/btrfs/pg/$ORIGIN

2. Postgres needs permissions on this directory

sudo chown postgres:postgres /mnt/btrfs/pg/$ORIGIN

3. Now let’s create a tablespace in this directory. Tablespace can be named the same as a directory.

psql -c "CREATE TABLESPACE $ORIGIN LOCATION '/mnt/btrfs/pg/$ORIGIN'"

4. Let’s create an empty database in tablespace.

psql -c "CREATE DATABASE $ORIGIN TABLESPACE $ORIGIN;"

6. Now the database can be filled with data – from a dump or from your application.

pgbench -i -s 10 $ORIGIN
# Будет отображено время затраченное на генерацию
# У pgbench есть и другие параметры, scale только влияет на кол-во записей и 
# размер базы. (У меня -s 10 дало базу 92Мб)

III. To snapshot the $ORIGIN database to the $SNAPSHOT database, you will need:

1. Create a simple directory in pg_btrfs for the “skeleton” of the $SNAPSHOT base. (While this will be a dummy database, in order to find out what OID the server will assign)

sudo mkdir /mnt/btrfs/pg/$SNAPSHOT

2. Let’s not forget to give the rights to this directory to the postgres user.

sudo chown postgres:postgres /mnt/btrfs/pg/$SNAPHOT

3. Let’s create a tablespace in this directory.

psql -c "CREATE TABLESPACE $SNAPSHOT LOCATION '/mnt/btrfs/pg/$SNAPSHOT'"

4. Let’s create an empty database in this tablespace.

psql -c "CREATE DATABASE $SNAPSHOT TABLESPACE $SNAPSHOT;"

5. HINT! Now you need to remember the OID, which will be the name of the directory inside the created directory structure after creating the database. It is important to note that this approach allows you to snapshot only one database in the tablespace.

 /mnt/btrfs/pg/
     - $SNAPSHOT/
       - PG_11_201809051  # версия postgres (может отличаться)
         - 99174299 # - DB OID. Запишем ее в переменную
new_oid=$(sudo ls -1 /mnt/btrfs/pg/$SNAPSHOT/PG_*/ | grep -P [0-9]+)

6. In the same way, copy the DB OID $ORIGIN

origin_oid=$(sudo ls -1 /mnt/btrfs/pg/$ORIGIN/PG_*/ | grep -P [0-9]+)

7. Now let’s delete the normal directory that contains the tablespace for the snapshot. See footnote [1].

sudo rm -rf /mnt/btrfs/pg/$SNAPSHOT/

8. Finally, make an honest snapshot of $ORIGIN via btrfs subvolume

sudo btrfs subvolume snapshot /mnt/btrfs/pg/$ORIGIN/ /mnt/btrfs/pg/$SNAPSHOT

9. In the directory /mnt/btrfs/pg/$SNAPSHOT there will be a Copy-on-Write copy of the $ORIGIN subvolume with the OID as in the original database. Therefore, we will rename it so that it looks like the $SNAPSHOT database from point 4 for the server.

cd /mnt/btrfs/pg/$SNAPSHOT/PG_*/
sudo mv $origin_oid $new_oid
cd -

10. Now you can connect to the $SNAPSHOT database and check that the same data is there as in $ORIGIN.

IV. When you need to delete the $SNAPSHOT database (also true for $ORIGIN).

0. It is useful to complete all activity with the database before deleting.

PGQUERY=$(cat<< EOM
   SELECT pg_terminate_backend(pid)
   FROM pg_stat_activity
   WHERE pid <> pg_backend_pid()
      AND datname="$SNAPSHOT";
    EOM
    )
psql -c "$PGQUERY"

1. Remove the $SNAPSHOT database

psql -c "DROP DATABASE IF EXISTS $SNAPSHOT;"

2. Remove tablespace $SNAPSHOT

psql -c "DROP TABLESPACE IF EXISTS $SNAPSHOT;

3. Remove subvolume btrfs.

btrfs subvolume delete /mnt/btrfs/pg/$SNAPSHOT

V. If you need to restore the state from $ORIGING to $SNAPSHOT:

1. Repeat steps to remove $SNAPSHOT

2. Repeat steps to create $SNAPSHOT from $ORIGIN

[1] Steps II.1-6 are only needed to create the database OID in the tablespace. Removing the directory where the tablespace is registered shouldn’t break postgres (but I don’t think it’s a good idea to query while taking the snapshot). Meta information about bases and tablespaces continues to be stored in the postgres system base, and we do not influence it. Perhaps there is another way to create a database OID and replace it with the $ORIGIN snapshot, if you know of one, please share.


PS This method helped me a lot at work, where I had to check user errors in the system every day on real data, i.e. production copies. The database occupied more than 100GB, and deploying the dump took about an hour. In the old fashioned way, after deployment, the test database was littered with patches or migrations from different developers, and the data in it began to conflict. Then you need to either have a reserve of deployed copies of the database, or wait until the new one is deleted and restored.

In the case of Btrfs, you need to wait once for the deployment of the original database, and therefore create snapshots in a couple of seconds, and when you also need to quickly roll back to the original database.

All these steps have been put into 3 bash scripts that match the last 3 sections and run through jenkins. During the year of operation, no problems were noticed on the side of btrfs or postgres.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *