Part 4 or Commitfest 2024-01

Spring is already in full swing, so let's look back at the hottest news of the past winter: the January commitfest. And let's start right away with the headliners.

Incremental backup

Logical replication: maintaining subscription state when upgrading the subscriber server

Dynamic shared memory registry

EXPLAIN (memory): memory used for query planning

pg_stat_checkpointer: monitoring restart points on replicas

Parallel creation of BRIN indexes

Queries with IS [NOT] NULL conditions on NOT NULL columns

SET search_path optimization

GROUP BY optimization

Planner support functions for range types

PL/pgSQL: arrays for %TYPE and %ROWTYPE types

jsonpath: new data conversion methods

COPY ... FROM: ignoring format conversion errors

to_timestamp: TZ and OF format masks

GENERATED AS IDENTITY in partitioned tables

ALTER COLUMN…SET EXPRESSION

Incremental backup
commit: 174c4805, dc212340

To quickly restore to a point in time in the past, it is advisable to make frequent backups so that replaying WAL segments to the desired point takes less time. However, it is often very expensive to perform full backups of large databases: copies take up a lot of space, take a long time to create, and load the system.

Version 17 comes to the rescue with incremental backups, which store only the changes made relative to another backup.

Here's how it works.

We start walsummarizer, a process that scans the WAL and collects information about changed pages. To do this, enable the summarize_wal parameter. Conveniently, there is no need to restart the server: re-reading the configuration is enough. Just as important, the process can be enabled not only on the primary server, but also on a replica:

ALTER SYSTEM SET summarize_wal = on;
SELECT pg_reload_conf();

The new process appears in pg_stat_activity. Now information about changed pages will be saved in a compact form in the pg_wal/summaries directory.
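To check that the process is indeed running, a query along these lines should do (a sketch; it assumes the summarizer shows up in pg_stat_activity with backend_type 'walsummarizer'):

SELECT pid, backend_type
FROM pg_stat_activity
WHERE backend_type = 'walsummarizer';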

First, create a full backup.

$ pg_basebackup -c fast -D backups/full

Let's change something, for example, make a copy of the table.

CREATE TABLE tickets_copy AS SELECT * FROM tickets;

Now we create an incremental copy relative to the previous full copy. To do this, in the -i parameter we specify the manifest file of the full copy.

$ pg_basebackup -c fast -D backups/increment1 \
    -i backups/full/backup_manifest

A few more changes.

DELETE FROM tickets_copy;

Now let's create a second incremental copy, which will be based on the previous incremental copy, and not on the full one. To do this, specify the incremental copy manifest file in the -i parameter.

$ pg_basebackup -c fast -D backups/increment2 \
    -i backups/increment1/backup_manifest

To restore from an incremental copy, you first need to reconstruct the full copy using the pg_combinebackup utility. If we want to recover from a second incremental copy, then when running pg_combinebackup we need to specify both incremental copies and a full backup copy, as well as the location where to put the reconstructed full copy.

$ pg_combinebackup backups/full backups/increment1 backups/increment2 \
    -o backups/full_combined

By the way, about copy sizes. As you might expect, incremental copies are significantly smaller than full ones.

$ du -h -s backups/*
2,7G    backups/full
2,7G    backups/full_combined
411M    backups/increment1
25M     backups/increment2

From the reconstructed full copy, we start a server on a free port:

$ pg_ctl start -D backups/full_combined \
    -o '-p 5400' \
    -l backups/full_combined/logfile

Let's make sure that the changes saved in the second incremental copy are restored.

$ psql -p 5400 -c 'SELECT count(*) FROM tickets_copy'
 count
-------
     0
(1 row)

Let's summarize. To create and work with incremental backups, the server architecture provides the following components:

  • Information about modified pages is collected by the walsummarizer process, controlled by the summarize_wal parameter.
  • The pg_walsummary utility and the SQL functions pg_available_wal_summaries, pg_wal_summary_contents and pg_get_wal_summarizer_state are intended for diagnosing and monitoring the walsummarizer process and the contents of the pg_wal/summaries directory (see the sketch after this list).
  • The UPLOAD_MANIFEST command has been added to the replication protocol, and the BASE_BACKUP command now has an INCREMENTAL parameter.
  • To create an incremental backup, the pg_basebackup utility received the -i (--incremental) option. With it, pg_basebackup creates an incremental copy using the new replication protocol capabilities.
  • The pg_combinebackup utility reconstructs a full backup from an incremental copy and all the copies it depends on.
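For example, the collected summaries and the current state of the walsummarizer process can be inspected with the SQL functions mentioned above. A small sketch (the output columns are not shown here):

SELECT * FROM pg_available_wal_summaries() ORDER BY end_lsn DESC LIMIT 3;

SELECT * FROM pg_get_wal_summarizer_state();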

see also

Incremental Backup: What To Copy? (Robert Haas)

Incremental Backups: Evergreen and Other Use Cases (Robert Haas)

Waiting for Postgres 17: Incremental base backups (Lukas Fittl)

Waiting for PostgreSQL 17 – Add support for incremental backup. (Hubert 'depesz' Lubaczewski)

Logical replication: maintaining subscription state when upgrading the subscriber server
commit: 9a17be1e

The previous article discussed migrating replication slots on the publishing server during an upgrade to a new major version. The work on server upgrades in the context of logical replication continues: now, when the subscriber server is upgraded, the state of its subscriptions is fully preserved, so it can keep receiving changes from the publishing server without resynchronizing the data.

Dynamic shared memory registry
commit: 8b2bcf3f, abb0b4fc

To use shared memory, modules and libraries usually need to be loaded via shared_preload_libraries, which requires a server restart. Processes can request dynamic shared memory (via the DSM API), but for other processes to use it, they must know where to find it. This is exactly the problem the patch solves: it maintains a registry of allocated dynamic shared memory (first commit).

The first extension to use the new interface was pg_prewarm (second commit). The autoprewarm_start_worker and autoprewarm_dump_now functions can now obtain the shared memory they need even if the pg_prewarm library was not loaded at server start.
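A minimal sketch of what this means in practice, assuming the pg_prewarm extension is installed but the library is not listed in shared_preload_libraries:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- the worker and the dump function now get their shared memory
-- through the DSM registry instead of requiring preloading
SELECT autoprewarm_start_worker();
SELECT autoprewarm_dump_now();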

EXPLAIN (memory): memory used for query planning
commit: 5de890e3

Memory is needed not only to execute a query, but also to plan it. Exactly how much memory was allocated and used to build the plan can be seen with the new memory option of the EXPLAIN command:

EXPLAIN (memory, costs off)
SELECT * FROM tickets;
                   QUERY PLAN                    
-------------------------------------------------
 Seq Scan on tickets
 Planning:
   Memory: used=7920 bytes  allocated=8192 bytes
(3 rows)
EXPLAIN (memory, costs off)
SELECT * FROM tickets a, tickets b;
                    QUERY PLAN                     
---------------------------------------------------
 Nested Loop
   ->  Seq Scan on tickets a
   ->  Materialize
         ->  Seq Scan on tickets b
 Planning:
   Memory: used=19392 bytes  allocated=65536 bytes
(6 rows)

The more tables are involved in the query, the more memory is needed to build the plan. This is especially true for queries with partitioned tables.

pg_stat_checkpointer: monitoring restart points on replicas
commit: 12915a58

Columns for monitoring restartpoints have been added to the pg_stat_checkpointer view, which is new in version 17: restartpoints_timed, restartpoints_req and restartpoints_done. The last column helps you understand how many restartpoints were actually completed. The point is that restartpoints on a replica cannot be performed more often than checkpoints on the primary server. So until the replica receives the WAL record of a completed checkpoint, the requested restartpoints cannot complete.
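On a replica, the new columns can be queried directly; a minimal sketch:

SELECT restartpoints_timed, restartpoints_req, restartpoints_done
FROM pg_stat_checkpointer;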

Parallel creation of BRIN indexes
commit: b4375717

BRIN indexes can be created in parallel by multiple processes. Previously, this only worked for B-tree indexes.
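Parallel index builds are governed by the usual settings for maintenance operations. A small sketch on the demo database (table and column names are only an illustration):

-- allow parallel workers for maintenance commands
SET max_parallel_maintenance_workers = 4;

CREATE INDEX tickets_book_ref_brin ON tickets USING brin (book_ref);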

Queries with IS [NOT] NULL conditions on NOT NULL columns
commit: b262ad44

In version 16, a check that the ticket_no column is not NULL is performed for every row of the tickets table:

16=# EXPLAIN (costs off)
SELECT * FROM tickets WHERE ticket_no IS NOT NULL;
            QUERY PLAN             
-----------------------------------
 Seq Scan on tickets
   Filter: (ticket_no IS NOT NULL)
(2 rows)

However, the ticket_no column has a NOT NULL constraint, so it cannot be NULL, and the per-row check is wasted effort. An IS NULL condition is even worse: the entire table has to be scanned (albeit via the index) just to make sure there is nothing to return:

16=# EXPLAIN (costs off)
SELECT * FROM tickets WHERE ticket_no IS NULL;
                QUERY PLAN                
------------------------------------------
 Index Scan using tickets_pkey on tickets
   Index Cond: (ticket_no IS NULL)
(2 rows)

Of course, there is no point in writing such conditions into a query. But nothing forbids it.

In addition, there are situations where, to optimize min/max aggregates, the planner itself can add an IS NOT NULL condition at the query rewriting stage, and this in turn can lead to a poor index choice. Details are in the mailing list discussion.

In version 17, to build better plans for IS [NOT] NULL conditions, the planner checks whether the column has a NOT NULL constraint:

17=# EXPLAIN (costs off)
SELECT * FROM bookings WHERE book_date IS NOT NULL;
      QUERY PLAN      
----------------------
 Seq Scan on bookings
(1 row)
17=# EXPLAIN (costs off)
SELECT * FROM tickets WHERE ticket_no IS NULL;
        QUERY PLAN        
--------------------------
 Result
   One-Time Filter: false
(2 rows)

Note that it has to be a NOT NULL constraint specifically. A CHECK (column_name IS NOT NULL) constraint performs the same check, but the planner will not take it into account.
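To illustrate the difference, here is a hypothetical pair of tables: the planner can rely on the constraint of the first, but not of the second:

-- the declared NOT NULL constraint is visible to the planner
CREATE TABLE t_declared (id int NOT NULL);

-- an equivalent CHECK constraint is not used for this optimization
CREATE TABLE t_check (id int CHECK (id IS NOT NULL));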

SET search_path optimization
commit: ad57c2a7, a86c61c9

Setting the search_path parameter in a function definition makes working with database objects safer. But it is not free.

Let's take simple measurements of a function's execution time in psql. First without setting the parameter:

16=# CREATE FUNCTION f() RETURNS int AS 'SELECT 0' LANGUAGE SQL;
16=# \timing on
16=# DO 'BEGIN PERFORM f() FROM generate_series(1, 10_000_000); END';
DO
Time: 1909,958 ms (00:01,910)

And now with setting the parameter:

16=# ALTER FUNCTION f SET search_path = a,b,c;
16=# DO 'BEGIN PERFORM f() FROM generate_series(1, 10_000_000); END';
DO
Time: 12594,300 ms (00:12,594)

In version 17, setting search_path received several optimizations: a hash table of previously used values is kept, and the new value is compared with the current one. This speeds up the previous example:

17=# DO 'BEGIN PERFORM f() FROM generate_series(1, 10_000_000); END';
DO
Time: 9501,717 ms (00:09,502)

GROUP BY optimization
commit: 0452b461

This optimization was included in version 15; its description can be found in the review of the March 2022 commitfest. But just before the PostgreSQL 15 release it was rolled back. This is the second attempt.

In short, in queries where multiple columns are listed in GROUP BY, the planner may swap columns in search of the optimal plan, for example to use an index or incremental sort. This will not affect the query result, but performance may increase.
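A hypothetical illustration on the demo database: the primary key of ticket_flights is (ticket_no, flight_id), and even though GROUP BY lists the columns in the opposite order, in version 17 the planner may reorder them so that the index can be used:

EXPLAIN (costs off)
SELECT ticket_no, flight_id, count(*)
FROM ticket_flights
GROUP BY flight_id, ticket_no;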

see also

GROUP BY reordering (Adrien Nayrat)

Planner support functions for range types
commit: 075df6b2

Let's say we want to find out which flights operated in the first seven days of August 2017:

16=# EXPLAIN (costs off)
SELECT *
FROM flights
WHERE actual_departure <@ tstzrange('2017-08-01', '2017-08-08', '[)');
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on flights
   Filter: (actual_departure <@ '["2017-08-01 00:00:00+03","2017-08-08 00:00:00+03")'::tstzrange)
(2 rows)

Note that there is an index on the actual_departure column, but it cannot be used with the <@ operator.

In version 17, planner support functions were added for the range operators @> and <@: they rewrite the condition into a comparison of the value with both boundaries of the range.

The plan of the same query in version 17:

17=# EXPLAIN (costs off)
SELECT *
FROM flights
WHERE actual_departure <@ tstzrange('2017-08-01', '2017-08-08', '[)');
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using flights_actual_departure_idx on flights
   Index Cond: ((actual_departure >= '2017-08-01 00:00:00+03'::timestamp with time zone) AND (actual_departure < '2017-08-08 00:00:00+03'::timestamp with time zone))
(2 rows)

Rewritten in this form, the condition makes perfect use of the index.

PL/pgSQL: arrays for %TYPE and %ROWTYPE types
commit: 5e8674dc

In PL/pgSQL, you can now declare arrays using the %TYPE and %ROWTYPE type inheritance constructs.

DO $$
DECLARE
    seats_list seats.seat_no%TYPE[] :=
        (SELECT array_agg(seat_no) FROM seats
          WHERE aircraft_code = '733' AND fare_conditions = 'Business');
BEGIN
    RAISE NOTICE '%', seats_list;
END;
$$ LANGUAGE plpgsql;
NOTICE:  {1A,1C,1D,1F,2A,2C,2D,2F,3A,3C,3D,3F}
DO

Similarly, you can create an array of a composite type table_name%ROWTYPE[].
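A sketch along the lines of the previous example, assuming the same demo database:

DO $$
DECLARE
    -- an array of whole rows of the seats table
    business_seats seats%ROWTYPE[] :=
        (SELECT array_agg(s) FROM seats s
          WHERE aircraft_code = '733' AND fare_conditions = 'Business');
BEGIN
    RAISE NOTICE '% business seats', array_length(business_seats, 1);
END;
$$ LANGUAGE plpgsql;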

see also

Waiting for PostgreSQL 17 – In plpgsql, allow %TYPE and %ROWTYPE to be followed by array decoration. (Hubert 'depesz' Lubaczewski)

jsonpath: new data conversion methods
commit: 66ea94e8

In accordance with the SQL standard, support for new data conversion methods has been added to the jsonpath language: .bigint(), .boolean(), .date(), .decimal(), .integer(), .number(), .string(), .time(), .time_tz(), .timestamp(), .timestamp_tz().

Example of using some methods:

\set json '{"key1": 42, "key2": "t", "key3": "Hello, World!"}'

SELECT jsonb_path_query(:'json', '$.key1.integer()'),
       jsonb_path_query(:'json', '$.key2.boolean()'),
       jsonb_path_query(:'json', '$.key3.string()');
 jsonb_path_query | jsonb_path_query | jsonb_path_query
------------------+------------------+------------------
 42               | true             | "Hello, World!"

COPY ... FROM: ignoring format conversion errors
commit: 9e2d8701, b725b7ee, 72943960

A failure to load any row with the COPY ... FROM command causes the entire transaction to be rolled back. It would be nice to be able to load all the correct rows and then deal with the erroneous ones separately.

A first step in this direction has been taken. The COPY ... FROM command has learned to ignore errors caused by incorrectly formatted values of individual columns. It builds on the “soft” input error handling functions that appeared in version 16.

Example table:

CREATE TABLE t (id int PRIMARY KEY);

The COPY command has a new parameter, on_error. The default value, stop, corresponds to the usual behavior and stops at the first error. The second, and so far the only other, value, ignore, means that format conversion errors will be ignored:

COPY t FROM STDIN (on_error 'ignore');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1
>> два
>> 3
>> \.
NOTICE:  1 row was skipped due to data type incompatibility
COPY 2

Of the three rows, two were loaded. One row could not be loaded, as the notice indicates. During a long load, you can monitor the number of skipped rows in the tuples_skipped column of the pg_stat_progress_copy view (third commit).
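From another session, the progress of a long-running COPY can be checked roughly like this (a sketch):

SELECT relid::regclass, tuples_processed, tuples_skipped
FROM pg_stat_progress_copy;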

How can you find out which input rows were not loaded? It is expected that in the future the on_error parameter will get the values file and table, which will let you specify the name of a file or table, respectively, where the rejected rows will be collected.

Please note that only format conversion errors are ignored. Any other row loading error will still stop and cancel the entire load. In the following example, the first row violates the primary key constraint, which prevents the load from continuing:

COPY t FROM STDIN (on_error 'ignore');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 3
>> 4
>> \.
ERROR:  duplicate key value violates unique constraint "t_pkey"
DETAIL:  Key (id)=(3) already exists.
CONTEXT:  COPY t, line 1
SELECT * FROM t;
 id
----
  1
  3
(2 rows)

see also

Waiting for PostgreSQL 17 – Add new COPY option SAVE_ERROR_TO / Rename COPY option from SAVE_ERROR_TO to ON_ERROR (Hubert 'depesz' Lubaczewski)

to_timestamp: TZ and OF format masks
commit: 8ba6fdf9

The to_char function has long understood the TZ (time zone abbreviation) and OF (time zone offset from UTC) format masks.

Support for these format masks has now been added to the inverse function, to_timestamp:

SELECT to_timestamp('2024-02-18 23:50:00MSK', 'YYYY-MM-DD HH24:MI:SSTZ') tz,
       to_timestamp('2024-02-18 23:50:00+03', 'YYYY-MM-DD HH24:MI:SSOF') of;
           tz           |           of           
------------------------+------------------------
 2024-02-18 23:50:00+03 | 2024-02-18 23:50:00+03

GENERATED AS IDENTITY in partitioned tables
commit: 69958631

Identity columns declared as GENERATED AS IDENTITY are fully supported in partitioned tables.

When you create a new partition or attach an existing table as a partition (ATTACH PARTITION), the identity column is tied to a single sequence defined for the partitioned table. New values are drawn from this sequence both when inserting into an individual partition and when inserting into the partitioned table itself. When a partition is detached (DETACH PARTITION), the column of the now independent table is disconnected from the sequence.
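A minimal sketch with illustrative names: both the partitioned table and its partitions draw identity values from the same sequence.

CREATE TABLE events (
    id bigint GENERATED ALWAYS AS IDENTITY,
    created_at date NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

-- both inserts get their id from the same sequence
INSERT INTO events (created_at) VALUES ('2024-02-18');
INSERT INTO events_2024 (created_at) VALUES ('2024-03-01');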

see also

Waiting for PostgreSQL 17 – Support identity columns in partitioned tables (Hubert 'depesz' Lubaczewski)

ALTER COLUMN…SET EXPRESSION
commit: 5d06e99a

It is now possible to change the expression of generated table columns. Previously, an expression could only be deleted.

As an example, let's distribute the table rows across three nodes in the node column:

CREATE TABLE t (
    id int PRIMARY KEY,
    node int GENERATED ALWAYS AS (mod(id, 3)) STORED
);

WITH ins AS (
    INSERT INTO t SELECT x FROM generate_series(1,100) AS x
    RETURNING *
)
SELECT node, count(*) FROM ins GROUP BY 1 ORDER BY 1;
 node | count
------+-------
    0 |    33
    1 |    34
    2 |    33
(3 rows)

When changing the number of nodes, you can replace the expression with one command:

ALTER TABLE t ALTER COLUMN node SET EXPRESSION AS (mod(id, 4));

It is important that the command not only sets a new expression, but also completely rewrites the table, updating the existing values:

SELECT node, count(*) FROM t GROUP BY 1 ORDER BY 1;
 node | count
------+-------
    0 |    25
    1 |    25
    2 |    25
    3 |    25
(4 rows)

see also

Waiting for PostgreSQL 17 – ALTER TABLE command to change generation expression (Hubert 'depesz' Lubaczewski)

That's all for now. Next up: news from the March commitfest, the last one for PostgreSQL 17.
