Raising a PostgreSQL cluster in Docker and Testcontainers
Such a solution, in addition to fault tolerance itself, allows you to partially solve the performance problem by redistributing data reading from the primary host to replicas. All write requests will still go to the primary host.
Terminology
A small digression regarding terminology:
primary = master = master host
slave = standby = secondary = replica
failover – an automatic process of turning a backup host (replica) into a primary one (in the event of an emergency). The replica becomes the new primary and starts servicing the write request.
switchover is a manual process of transferring primary from one host to another in a cluster.
More terminology Here.
Out of the box, PostgreSQL provides mechanism to promote (promote) a replica to primary in case of failure, but does not provide a complete solution for building an HA cluster (High Availability). There are a number of such solutions on the market from different vendors (you may come across the term HA utilities). I recommend watching video report Alexey Lesovsky to broaden his horizons in this area.
Probably the most popular solution at the moment is Patroni, but in this article I will use repmgr from EnterpriseDB. The point is that for repmgr there are already ready images from bitnamiwhich are updated regularly.
Fault tolerance in a PostgreSQL cluster is achieved through WAL and streaming replication. You can read more about this on the Selectel blog. It is important to understand that any changes you make to primaryfirst fall into WALand then broadcast over the network to a replica and play there.
Lyrical digression
Therefore, the execution of any DML command wrapped in a block BEGIN/ROLLBACK
is not safe: it only protects against data changes visible to an external observer, but can potentially slow down or even crash your database cluster.
For example, there is a large table of 100 GB; you need to test the migration, which will update about half of the rows in this table. What can go wrong? The CPU or disk space (MVCC and WAL) may run out, the network may become clogged. During this time, all other requests coming to the cluster will be idle and probably timed out. Outside, the situation will look like a denial of service.
Before executing any command on the database, think about what it will lead to.
Cluster via docker-compose
First, let’s bring up a PostgreSQL cluster in Docker via a compose file. Source codes, as usual, available on GitHub.
My version of the compose file is based on such from bitnami. Additionally, I added monitoring via postgres_exporter. To run, run the command from the directory with the compose file:
docker-compose --project-name="habr-pg-ha-14" up -d
Keep in mind that the cluster will take some time to come up: faster on Linux, longer on macOS/Windows. Depending on the power of your computer, it may take up to two minutes to start the cluster.
version: "3.9"
services:
pg-1:
container_name: postgres_1
image: docker.io/bitnami/postgresql-repmgr:14.9.0
ports:
- "6432:5432"
volumes:
- pg_1_data:/bitnami/postgresql
- ./create_extensions.sql:/docker-entrypoint-initdb.d/create_extensions.sql:ro
environment:
- POSTGRESQL_POSTGRES_PASSWORD=adminpgpwd4habr
- POSTGRESQL_USERNAME=habrpguser
- POSTGRESQL_PASSWORD=pgpwd4habr
- POSTGRESQL_DATABASE=habrdb
- REPMGR_PASSWORD=repmgrpassword
- REPMGR_PRIMARY_HOST=pg-1
- REPMGR_PRIMARY_PORT=5432
- REPMGR_PARTNER_NODES=pg-1,pg-2:5432
- REPMGR_NODE_NAME=pg-1
- REPMGR_NODE_NETWORK_NAME=pg-1
- REPMGR_PORT_NUMBER=5432
- REPMGR_CONNECT_TIMEOUT=1
- REPMGR_RECONNECT_ATTEMPTS=2
- REPMGR_RECONNECT_INTERVAL=1
- REPMGR_MASTER_RESPONSE_TIMEOUT=5
restart: unless-stopped
networks:
- postgres-ha
pg-2:
container_name: postgres_2
image: docker.io/bitnami/postgresql-repmgr:14.9.0
ports:
- "6433:5432"
volumes:
- pg_2_data:/bitnami/postgresql
- ./create_extensions.sql:/docker-entrypoint-initdb.d/create_extensions.sql:ro
environment:
- POSTGRESQL_POSTGRES_PASSWORD=adminpgpwd4habr
- POSTGRESQL_USERNAME=habrpguser
- POSTGRESQL_PASSWORD=pgpwd4habr
- POSTGRESQL_DATABASE=habrdb
- REPMGR_PASSWORD=repmgrpassword
- REPMGR_PRIMARY_HOST=pg-1
- REPMGR_PRIMARY_PORT=5432
- REPMGR_PARTNER_NODES=pg-1,pg-2:5432
- REPMGR_NODE_NAME=pg-2
- REPMGR_NODE_NETWORK_NAME=pg-2
- REPMGR_PORT_NUMBER=5432
- REPMGR_CONNECT_TIMEOUT=1
- REPMGR_RECONNECT_ATTEMPTS=2
- REPMGR_RECONNECT_INTERVAL=1
- REPMGR_MASTER_RESPONSE_TIMEOUT=5
restart: unless-stopped
networks:
- postgres-ha
pg_exporter-1:
container_name: pg_exporter_1
image: prometheuscommunity/postgres-exporter:v0.11.1
command: --log.level=debug
environment:
DATA_SOURCE_URI: "pg-1:5432/habrdb?sslmode=disable"
DATA_SOURCE_USER: habrpguser
DATA_SOURCE_PASS: pgpwd4habr
PG_EXPORTER_EXTEND_QUERY_PATH: "/etc/postgres_exporter/queries.yaml"
volumes:
- ./queries.yaml:/etc/postgres_exporter/queries.yaml:ro
ports:
- "9187:9187"
networks:
- postgres-ha
restart: unless-stopped
depends_on:
- pg-1
pg_exporter-2:
container_name: pg_exporter_2
image: prometheuscommunity/postgres-exporter:v0.11.1
command: --log.level=debug
environment:
DATA_SOURCE_URI: "pg-2:5432/habrdb?sslmode=disable"
DATA_SOURCE_USER: habrpguser
DATA_SOURCE_PASS: pgpwd4habr
PG_EXPORTER_EXTEND_QUERY_PATH: "/etc/postgres_exporter/queries.yaml"
volumes:
- ./queries.yaml:/etc/postgres_exporter/queries.yaml:ro
ports:
- "9188:9187"
networks:
- postgres-ha
restart: unless-stopped
depends_on:
- pg-2
networks:
postgres-ha:
driver: bridge
volumes:
pg_1_data:
pg_2_data:
Initiating auto failover
One of the functionality repmgr is switchover – manual controlled change of the master host in the cluster. Unfortunately, in images from bitnami this operation is not supported (the old repository had issue to this, but the repository has now been deleted).
Only available for testing failover mode. How to initiate it? Just extinguish the primary container:
docker stop postgres_1
The logs on the replica will contain entries like:
LOG: database system was not properly shut down; automatic recovery in progress
…
LOG: database system is ready to accept connections
After that, the former replica will become primary and will be ready to serve write requests.
Now you can bring the first host back into operation:
docker start postgres_1
It will come up, see that there is a new master, and continue to function as a replica:
INFO ==> This node was acting as a primary before restart!
INFO ==> Current master is 'pg-2:5432'. Cloning/rewinding it and acting as a standby node...
...
LOG: database system is ready to accept read-only connections
Who is primary now?
The question arises: “How to understand in what role a particular host functions?” A simple query will help answer it:
select case when pg_is_in_recovery() then 'secondary' else 'primary' end as host_status;
It is more convenient to execute it from the container console without explicitly logging into psql:
psql -c "select case when pg_is_in_recovery() then 'secondary' else 'primary' end as host_status;" "dbname=habrdb user=habrpguser password=pgpwd4habr"
And it’s even more convenient to do it from the command line:
docker exec postgres_1 psql -c "select case when pg_is_in_recovery() then 'secondary' else 'primary' end as host_status;" "dbname=habrdb user=habrpguser password=pgpwd4habr"
Cluster in Testcontainers
Automated testing and Continuous Integration (CI) are two practices that, in my opinion, have greatly changed software development. When you need to test the interaction of an application with any store or message broker, it is convenient and useful to use Testcontainers. In library pg-index-health, which I support and develop together with the open source community, has a function for collecting statistics on all cluster hosts. To fully test it, we needed to raise the PostgreSQL cluster in end-to-end tests.
Ready-made module for the cluster in test containers didn’t exist, so we made a little wrapper PostgresBitnamiRepmgrContainer over standard JdbcDatabaseContainer and packed it all up PostgreSqlClusterWrapper. Many thanks to Alexey @Evreke Antipin for the work done!
PostgreSqlClusterWrapper can raise a two-node cluster, as well as “extinguish” the first host, thereby causing auto failover. Jar-nick available in Maven Central.
The main difficulty was the correct initialization of the cluster: raise the first host, then the second; wait until the second host joins the cluster and starts streaming WAL With primary. For this purpose, we used LogMessageWaitStrategy
And Awaitility
.
For what PostgreSqlClusterWrapper can be useful to you? Mainly, this is an opportunity to check the behavior of your application in case of an abnormal situation on the database: whether it can continue to work correctly when changing the master.
Tuning JDBC driver parameters
JDBC driver supports client balancing and can itself reconnect to a new primary host if several database hosts are specified in the connection URL at once.
In general, the connection string looks like this:
jdbc:postgresql://localhost:32769,localhost:32770/test_db?connectTimeout=1&hostRecheckSeconds=2&socketTimeout=600&targetServerType=primary
Here we see two DB instances on different ports: localhost:32769
And localhost:32770
but connection parameters are much more important and interesting.
targetServerType=primary
indicates that we are only interested in the database master host. If you want to organize reading from a replica, then the option is suitable for you targetServerType=secondary
or targetServerType=preferSecondary
.
Parameter connectTimeout
determines how long we are willing to wait before establishing a connection with the database server. The default value is 10 seconds. Decreasing this value allows you to more aggressively iterate over the hosts in the list and try to connect to them.
Option hostRecheckSeconds
affects the frequency of checking who we have now is the master and who is the replica.
And finally socketTimeout
acts as a global limiter for the execution time of all queries to the database. Setting this parameter is optional; moreover, the specific value must be carefully selected to suit your needs.
Of course, these are far from all the parameters that the JDBC driver supports, but, in my opinion, this is the necessary minimum.
In industrial applications, a database connection pool is usually used. Each connection pool implementation has its own set of parameters that allow you to fine-tune the connections in the pool. You can read more about this in the article. Database timeouts. For HikariCPat least it makes sense to set connectionTimeout
And validationTimeout
:
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.annotation.Nonnull;
import lombok.SneakyThrows;
import lombok.experimental.UtilityClass;
import org.postgresql.Driver;
@UtilityClass
public class HikariDataSourceProvider {
@Nonnull
@SneakyThrows
public HikariDataSource getDataSource(@Nonnull final String jdbcUrl,
@Nonnull final String username,
@Nonnull final String password) {
final HikariConfig hikariConfig = new HikariConfig();
hikariConfig.setMaximumPoolSize(3);
hikariConfig.setConnectionTimeout(250L);
hikariConfig.setMaxLifetime(30_000L);
hikariConfig.setValidationTimeout(250L);
hikariConfig.setJdbcUrl(jdbcUrl);
hikariConfig.setUsername(username);
hikariConfig.setPassword(password);
hikariConfig.setDriverClassName(Driver.class.getCanonicalName());
return new HikariDataSource(hikariConfig);
}
}
I have prepared a couple of applications demonstrating the use PostgreSqlClusterWrapperand hosted them on GitHub: console And spring-boot. Let them be the starting point for your experiments!