PSQLBuddy – PostgreSQL backup and restore
Introduction
What problems were solved
Backing up individual databases on the database server
Uploading each database's backup separately to S3 storage
Cleaning old backups out of storage
Restoring a database backup into a dedicated archive database via a bot
An interface for managing recovery from backups
History of PSQLBuddy
Once upon a time, back when I was still using virtual machines, I wrote a plain shell script that performed more or less the same tasks. A few years later, when the need arose to change servers again, I wanted to rewrite it all in Python with proper libraries, so that it would run neatly in a small, cozy virtual environment, and to add the ability for users to restore databases into an archive without my involvement.
Initially, I implemented two utilities: one for creating backups, and the other for deploying a backup to an archive. In the end I merged them into one utility that can be run with the --backup and --restore arguments. More about them in detail below.
Cons of pg_dump, pg_dumpall, pg_restore
When using pg_dump to back up each database regularly, you need to create separate cron tasks. This approach requires significant management effort, especially if the system has many databases. Each task requires specifying multiple attributes, such as output formats, connection options, and compression levels, which increases the likelihood of errors and the complexity of administration.
The problem gets worse with pg_restore, which requires cleaning the database before restoring. This process can be time-consuming and risky, especially when working in a production environment. In addition, for a successful recovery you must know exactly which attributes were used when the backup was made.
Pros of PSQLBuddy
PSQLBuddy is a more efficient and convenient approach to backing up PostgreSQL databases. A simple config defines the target storage, the databases to back up, and the number of backups to retain. PSQLBuddy uploads the data to S3 storage and deletes the local files after the operation completes successfully. It also provides mechanisms to clean up the S3 storage according to predefined retention rules.
How does this work
The PSQLBuddy utility is installed on a server running PostgreSQL.
Depending on the required functionality, PSQLBuddy is launched either in --backup mode (on a schedule) or in --restore mode (a long-running process with a bot).
Run in --backup mode
All databases listed in the config are processed sequentially.
The first step is to create a name for the backup. The name consists of three parts and follows this template:
<DB NAME>-<year_month_day>-<BACKUP TYPE>.dump
Real example: volleyball_db-2024_09_19-DAILY.dump
While the first two parts are more or less self-explanatory, the backup type requires some explanation. There are four types in total: YEARLY, MONTHLY, WEEKLY, DAILY.
YEARLY – created only on the first of January
MONTHLY – first day of the month
WEEKLY – first day of the week
DAILY – every day
It's worth noting that only one type of backup is created per day: if today is the first day of the month and also the first day of the week, the MONTHLY type is still the one created.
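For illustration, here is a minimal Python sketch of how this selection logic and naming template might look (an assumption about the implementation, not the repository's actual code; Monday is assumed to be the first day of the week):

from datetime import date

def backup_type(today: date) -> str:
    # Only one type per day: YEARLY takes precedence over MONTHLY,
    # MONTHLY over WEEKLY, and WEEKLY over DAILY.
    if today.month == 1 and today.day == 1:
        return 'YEARLY'
    if today.day == 1:
        return 'MONTHLY'
    if today.weekday() == 0:  # Monday assumed to be the first day of the week
        return 'WEEKLY'
    return 'DAILY'

def backup_name(db: str, today: date) -> str:
    # <DB NAME>-<year_month_day>-<BACKUP TYPE>.dump
    return f"{db}-{today.strftime('%Y_%m_%d')}-{backup_type(today)}.dump"

print(backup_name('volleyball_db', date(2024, 9, 19)))  # volleyball_db-2024_09_19-DAILY.dump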
Then the backup is created using the pg_dump command, launched via subprocess.
pg_dump -U postgres -F c -d <DB NAME> -f <TEMP FOLDER>/<BACKUP NAME>
After creation, the backup is uploaded to S3 storage and deleted from the temp folder, and the backup of the next database in the list begins.
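As a rough sketch (with illustrative names, not the repository's actual code), this step might look like this in Python:

from pathlib import Path
import subprocess
import boto3

def dump_and_upload(db: str, backup_name: str, temp_path: Path, bucket: str) -> None:
    dump_file = temp_path / backup_name
    # Create the dump in PostgreSQL's custom format (-F c)
    subprocess.run(
        ['pg_dump', '-U', 'postgres', '-F', 'c', '-d', db, '-f', str(dump_file)],
        check=True,
    )
    # Upload to S3 (credentials/endpoint come from the config), then
    # remove the local temp file on success
    boto3.client('s3').upload_file(str(dump_file), bucket, backup_name)
    dump_file.unlink()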
When all backups have been made and uploaded, PSQLBuddy requests the list of backups stored in the S3 bucket and breaks them down by database and backup type. It then goes through each database and each backup type that exists for it. If there are 8 backups of the DAILY type for the volleyball_db database, and the config says that 7 DAILY backups should be kept for this database, the extra (oldest) backup is deleted.
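The rotation idea can be sketched like this (assuming names follow the template above; the actual implementation in the repository may differ):

from collections import defaultdict
import boto3

def rotate(bucket: str, limits: dict) -> None:
    # limits maps (db, backup_type) to the number of copies to keep
    s3 = boto3.client('s3')
    grouped = defaultdict(list)
    for obj in s3.list_objects_v2(Bucket=bucket).get('Contents', []):  # pagination omitted for brevity
        key = obj['Key']  # e.g. volleyball_db-2024_09_19-DAILY.dump
        db, _, btype = key.rsplit('-', 2)
        grouped[(db, btype.removesuffix('.dump'))].append(key)
    for (db, btype), keys in grouped.items():
        keep = limits.get((db, btype))
        if keep is None:
            continue  # no retention rule for this database/type: keep everything
        # The date is embedded in the name, so a lexicographic sort is chronological
        for key in sorted(keys)[:max(0, len(keys) - keep)]:
            s3.delete_object(Bucket=bucket, Key=key)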
Run in --restore mode
A Telegram bot is launched, which responds only to the users specified in the config. Below is an illustration of how this bot operates.
The /start command queries the S3 storage and obtains the list of backup copies. The database names are extracted from this list, and the reply message asks you to select one of them. The input is validated, confirmation to restore the backup is requested, and then the following happens sequentially:
downloading a copy from storage
deleting the archive database
creating the archive database
restoring the downloaded backup into the archive database with the following command:
pg_restore -U postgres -d archive <TEMP FOLDER>/<BACKUP NAME>
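A rough Python sketch of that sequence (illustrative only; the bot's real code lives in the repository):

import subprocess

def restore_to_archive(backup_path: str) -> None:
    # Recreate the archive database from scratch, then restore into it
    subprocess.run(['dropdb', '-U', 'postgres', '--if-exists', 'archive'], check=True)
    subprocess.run(['createdb', '-U', 'postgres', 'archive'], check=True)
    subprocess.run(['pg_restore', '-U', 'postgres', '-d', 'archive', backup_path], check=True)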
Installation and launch
Preliminary preparation
This assumes that you already have Python installed. You can check this with the command python --version
We will run PSQLBuddy as the postgres user, which is present by default on a system with PostgreSQL installed. The home directory of the postgres user is typically /var/lib/postgresql/, so from here on we assume everything lives in /var/lib/postgresql/PSQLBuddy
Go to the target folder, clone the repository, and enter the cloned folder:
cd /var/lib/postgresql/
git clone https://github.com/dmitrymp3/PSQLBuddy.git
cd PSQLBuddy/
Create a directory for temporary files (aka temp_path in the config):
mkdir temp
Now create the virtual environment and activate it:
python -m venv venv
source venv/bin/activate
After this, (venv) should appear at the beginning of the prompt.
If the previous step succeeded and we can see we are inside the venv environment, we can install the dependencies the program needs:
pip install -r requirements.txt
Filling out the config
Rename the sample config and open it for editing with vi, nano, or any other editor:
mv conf/config.sample conf/config.py
nano conf/config.py
Here we pay attention to the following settings:
databases – an instance of the AllDatabases() class. It must be initialized in the CommonConfig class (this is already done in the sample config; just add your databases). After initialization, the instance has a method for setting default retention values:
databases.set_default_freq({
'DAILY' : 4,
'WEEKLY' : 4,
'MONTHLY' : 4,
'YEARLY' : 999,
})
And then you can add databases as follows:
databases.add_database('volleyball_db')
databases.add_database('basketball_db', {'WEEKLY': 10})
databases.add_database('super_db', {
'DAILY' : 44,
'WEEKLY' : 22,
'MONTHLY' : 11,
'YEARLY' : 1,
})
...
You can list the databases on your server as the postgres user (su postgres) with the command:
psql --list
boto_config – configured by default to work with S3 storage at timeweb, but it can be configured for any other S3 storage. More about this in the chapter Organization of S3 storage. Approximate time spent: 5 minutes.
The minimum set of fields is aws_access_key_id (login), aws_secret_access_key (password), and s3_bucket (bucket name).
bot_token – your bot’s token, which you will use to manage the restoration of backups.
How to get a bot token? Message @BotFather on Telegram. The /newbot command starts the creation of a new bot; as a result you will receive a token to put in the config. Approximate time spent: 2 minutes.
tg_admins – the IDs of the only users the bot will respond to.
You can find out your Telegram ID from this bot, for example, or any similar one.
temp_path – the directory where temporary files (downloaded and created backups) will be stored. If you do not want to touch the main disk, you can change it, but be careful: the postgres user must have read and write permissions on this folder.
Additional setup
Make postgres the owner of all the files:
chown -R postgres:postgres /var/lib/postgresql/PSQLBuddy
Switch to the postgres user:
su postgres
Launch
Run in --backup mode
In this mode the program runs once and exits, so we need to add a cron task (under the postgres user).
Editing crontab
crontab -e
Add a line to the end
0 4 * * * cd /var/lib/postgresql/PSQLBuddy/ && venv/bin/python3 main.py --backup
What this means: every day at 4:00, change into the program folder and run the program with the --backup argument using the interpreter from the virtual environment.
Run in --restore mode
In restore mode, the program must be running constantly, and the best way to ensure this is to run it as a service. You can use the provided PSQLBuddy.service file for this. First, check what's inside the file: the WorkingDirectory and ExecStart paths must be correct. By default they point to /var/lib/postgresql/PSQLBuddy.
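For reference, the unit file should look roughly like this (a sketch assuming the default paths and the postgres user; check the file shipped in the repository before relying on it):

[Unit]
Description=PSQLBuddy restore bot
After=network.target

[Service]
User=postgres
WorkingDirectory=/var/lib/postgresql/PSQLBuddy
ExecStart=/var/lib/postgresql/PSQLBuddy/venv/bin/python3 main.py --restore
Restart=on-failure

[Install]
WantedBy=multi-user.target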
Then we symlink this file into the folder /etc/systemd/system/ and refresh the list of services.
We create the symlink and work with systemctl as root or any other administrative user, since by default the postgres user does not have enough rights to work with systemd and systemctl.
So, if you are still the postgres user, type exit in the console, and proceed:
ln -s /var/lib/postgresql/PSQLBuddy/PSQLBuddy.service /etc/systemd/system/PSQLBuddy.service
systemctl daemon-reload
Then we enable the service and start it:
systemctl enable PSQLBuddy.service
systemctl start PSQLBuddy.service
We check that the launch succeeded with the command:
systemctl status PSQLBuddy.service
If something doesn’t work out with launching the bot as a service, you can always run the script manually:
cd /var/lib/postgresql/PSQLBuddy && venv/bin/python3 main.py --restore
Other materials
Organization of S3 storage
Timeweb.cloud
I have personally been using timeweb for many years, so I built the setup for it first and will describe that.
If you are not yet registered with timeweb, you can do so using my referral link. I will be very grateful 🙂
But if you register directly, the magic won't disappear: everything will work exactly the same, it's just that TimeWeb won't treat me to coffee 🙂
After registration, create a new S3 storage. I think you will manage creating a new bucket just fine, and you will see something like this:
From here we are interested in three things: the S3 Access Key, the S3 Secret Access Key, and the bucket name (339e534d-my_test_bucket in the screenshot above). They map directly to the three fields in the config (aka boto_config).
Other S3 storages
The full set of settings is described in the BotoConfig class in conf/config_classes.py. You can either set them directly there (as default values) or override them in config.py.
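For example, switching to another provider usually comes down to overriding the endpoint. A hypothetical override in config.py might look like this (only the three fields named earlier are documented; endpoint_url is an assumed field name, so check BotoConfig for the real ones):

# Hypothetical sketch: field names beyond the documented three are assumptions
boto_config.aws_access_key_id = 'YOUR_ACCESS_KEY'
boto_config.aws_secret_access_key = 'YOUR_SECRET_KEY'
boto_config.s3_bucket = 'your-bucket-name'
boto_config.endpoint_url = 'https://s3.example.com'  # assumed field for a custom S3 endpoint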
Bot lexicon
At the time of release, I was too lazy to edit the bot’s phrases, and in general, it seems to me that “Vitya” is a good collective image for an accountant 🙂
In any case, all the phrases are hardcoded in the file bot/bot_init.py, and you don't need much knowledge to edit them. Customize to your heart's content.
In the future, however, it is planned to improve the usability of the bot and, accordingly, correct the vocabulary to be more neutral and office-appropriate.
Logging
By default, logs go to /var/log/syslog. In main.py you can uncomment the line filename="common.log", after which logs will be written to that file instead.
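For reference, a minimal sketch of such a logging setup in Python (the exact call in main.py may differ):

import logging

logging.basicConfig(
    # filename="common.log",  # uncomment so logs go to this file
    level=logging.INFO,
    format='%(asctime)s %(levelname)s %(message)s',
)
logging.info('backup started')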
You can see what is happening in the logs using tail:
tail -n 200 /var/log/syslog | grep python
tail -n 200 /var/log/syslog | grep cron
Conclusion
Results of deploying the utility
Having installed a copy of the program for my own database, I now get separate backups every day, rotate them, and keep them in cheap storage. As of the end of 2024, I pay 1,200 rubles for 500 GB, which is far more reasonable than buying an additional VDS disk: there, 500 GB would cost 5,000 rubles a month, and I saw no option to attach more than 500 GB.
My accountant, in turn, can easily restore the database as of a given date and see what was there (for example, if a shortage was discovered in the warehouse and there is no trace of it in the database).
Plans for further improvement
[ ] Prevent a second run against the database if one process is already running
[ ] Add backup frequency options (e.g., back up this database only on Mondays)
[ ] Sometimes dropping the archive database fails if it is in use; the bot's error message is uninformative and could be improved, or a forced disconnect could be added
[ ] Add a way to choose the name of the archive database
[ ] Structure the logs: one file for backup copies, another for bot work and rotation
Epilogue
If you star the repo, I'll be happy.
If you register in timeweb using my referral, I will be happy and jump to the ceiling.
If you have any problems, message me on Telegram. I'll try to figure it out and fix it.