Part 4 or Commitfest 2022-01

The first three commitfests of version 15 (

2021-07

,

2021-09

,

2021-11

) did not spoil us with major changes. The fourth, January commitfest was no exception. But still there is something to talk about.


psql: case completion of keywords and object names

commit:

02b8048b

,

020258fb

In almost every commitfest, there are patches that improve tab completion for individual commands. Starting from the 15th version, I stopped mentioning them.

However, this patch is worth highlighting. It’s not about autocompletion for individual commands. This is a general improvement in the handling of object names/keywords, including in different cases.

For myself, I noted that pressing the tab finally saves the case:

vacuum ana[tab] -> vacuum analyze

before it was:

vacuum ana[tab] -> vacuum ANALYZE

psql: getenv command and user home directory definition

commit:

33d3ead

and

376ce3e4

The setenv command to set an environment variable in psql has been around for a long time, but there was no reverse command to get the value.

The first commit adds the getenv command to psql:

getenv home_saved HOME
echo :home_saved
/home/pluzanov

In fact, the same thing can be achieved like this:

set home_saved `echo $HOME`

But this option does not work on all operating systems.

But the second commit is for UNIX systems only. It changes the way the user’s home directory is determined. Instead of getpwuid(getuid())->pw_dir, the $HOME variable is now checked, which can be changed if desired:

setenv HOME /tmp
cd
! pwd
/tmp

Let’s restore the original value:

setenv HOME :home_saved
cd
! pwd
/home/pluzanov

psql: dl+ to display LOB privileges

commit:

328dfbda

The essence of the patch is in the title. But who needs large objects and access rights to them? This patch would not have been included in the review, if not for one circumstance. This is my patch 🙂

Why do I need large objects? Actually, not really. But looking at the documentation here, it was embarrassing for them. Why does the only empty cell in the table refer to the psql command for working with large objects? Corrected.

But the most surprising thing is that knowledge of the C language at the level of the first chapter from the book of Kernighan and Ritchie turned out to be quite enough. So literally everyone can contribute to the reasonable, good, eternal. Do not be shy!

postgresql: continuous integration (CI)
commit: 93d97349

Continuing the theme of development. Support for Cirrus CI continuous integration has been added to the server source code. This allows for large and complex projects to clone the postgresql repository on github, make changes and see how the project is built and tested on different platforms. Currently supported: FreeBSD, Linux, macOS and Windows.

It is convenient because to check the functionality on different platforms, the patch does not need to be sent to -hackers/commitfest, where it is built in: http://cfbot.cputube.org/

Details in src/tools/ci/README.

pg_log_backend_memory_contexts shows memory allocation for server background processes
commit: 790fbda9

The function that appeared in the 14th version pg_log_backend_memory_contexts Writes information about the memory used by the specified server process to the server log.

Now you can see the memory allocation for server background processes (bgwriter, wal writer, archiver, checkpointer…), except for logger and stats collector.

Unification of VACUUM VERBOSE output and log_autovacuum_min_duration
commit: 49c9d9fc

VACUUM VERBOSE commands have included useful information in the detailed cleanup report, which previously only appeared in the server log when triggered log_autovacuum_min_duration.

From the important point, now instead of a separate entry for each pass through the indexes (if there are several of them, urgently reconfigure autocleaning!) One accumulating entry is displayed. In addition, information has been added on the average read / write speed from disk, the use of the buffer cache and WAL.

Speed ​​up processing of strings in UTF-8
commit: 911588a3

According to tests author, checking strings in UTF-8 encoding is many times faster. This change will definitely help speed up the loading of large amounts of text data with the COPY FROM command.

postgres_fdw: setting application_name for session on 3rd party server
commit: 449ab635, 6e0cb3de

For demonstration in the postgres database, a third-party server is configured for the demo database in the same cluster. Let’s query the foreign table and look at the list of processes:

SELECT count(*) FROM bookings;

SELECT datname, pid, usename, application_name
FROM pg_stat_activity
WHERE datname IN ('demo','postgres');
 datname  |  pid   | usename  | application_name
----------+--------+----------+------------------
 postgres | 103897 | postgres | psql
 demo     | 103898 | postgres | postgres_fdw

By default postgres_fdw sets the value of the option

application_name

for remote sessions in “postgres_fdw”. But if several remote sessions are connected at once, then it is not very easy to figure out who came from where.

New extension parameter postgres_fdw.application_name allows you to set the connection context.

c
SET postgres_fdw.application_name="fdw: %u@%d(%p)";
SELECT count(*) FROM bookings;

Now let’s look at the list of processes:

SELECT datname, pid, usename, application_name
FROM pg_stat_activity
WHERE datname IN ('demo','postgres');
 datname  |  pid   | usename  |        application_name        
----------+--------+----------+--------------------------------
 demo     | 104073 | postgres | fdw: postgres@postgres(104009)
 postgres | 104009 | postgres | psql

A description of special sequences can be found in

documentation

. This example uses the local username (%u), the local database name (%d), and the process number (%p), which makes it easy to find the session running the request.

Logical replication without superuser
commit: a2ab9c06, 96a6f11c

The logical replication worker runs with the privileges of the owner of the subscription, which is no longer required to be the superuser.

The setup is done in two steps. Superuser rights are still required to create a subscription. But then they can be taken away. The main thing is that the owner of the subscription has write access to the tables where the changes are applied.

For example, we will replicate the test table from the postgres database to the db database located in the same cluster.

с postgres postgres
CREATE TABLE test (id int);
CREATE ROLE alice LOGIN SUPERUSER;
CREATE DATABASE db OWNER alice;

Create a publication and a slot:

CREATE PUBLICATION test_pub FOR TABLE test;
SELECT pg_create_logical_replication_slot('testslot','pgoutput');

On the subscription side, while alice is the superuser:

c db alice
CREATE TABLE test (id int);
CREATE SUBSCRIPTION test_sub
    CONNECTION 'user=postgres dbname=postgres'
    PUBLICATION test_pub
    WITH (create_slot = false, slot_name = testslot);

We take away the rights of the superuser and check the operation of replication:

с postgres postgres
ALTER ROLE alice NOSUPERUSER;
INSERT INTO test VALUES(1);

c db alice
SELECT * FROM test;
 id
----
  1
(1 row)

While there is a limitation, which may be removed in future versions. If you create row protection policies (RLS) for the test table on the subscription side, then logical replication will stop working, even if the policies do not prevent alice from changing the table. Either return the superuser attribute for alice, or set the bypassrls attribute.

Server log in JSON format
commit: dc686681

The log_destination parameter has a new value – jsonlog. This means that the server log now supports the JSON format. It is more difficult to view the log in this format with your eyes than in stderr. However, for developers of log analysis tools, this is good news. The recording format is, of course, in documentation.

Description by Depesz.

Default settings: log_checkpoints and log_autovacuum_min_duration
commit: 64da07c4

When something goes wrong with system performance, it’s good to have more information to analyze.

To this end, starting from the 15th version, the default values ​​of two parameters are changed:

SELECT name, boot_val, unit
FROM pg_settings
WHERE name IN ('log_checkpoints','log_autovacuum_min_duration');
            name             | boot_val | unit
-----------------------------+----------+------
 log_autovacuum_min_duration | 600000   | ms
 log_checkpoints             | on       |

This means that information about the execution of checkpoints and long (> 10 min) starts of automatic cleaning will be recorded in the server log.

But it is not necessary to wait for the transition to the 15th version. If the options are not enabled now, you should enable them.

Composite foreign keys with ON DELETE SET NULL
commit: d6f96ed9

When creating foreign keys, you can specify what happens when the parent record is deleted. For example, reset the foreign key values ​​of all referring records to NULL. However, in situations where the primary/foreign key is composite, unexpected problems can arise.

In the following example, the employee table has a composite primary key that includes the company code in addition to the employee ID. In addition, there is also a foreign key – the code of the employee’s manager, again a composite one.

CREATE TABLE employees (
    company_code text,
    employee_code text,
    manager_code text,
    PRIMARY KEY (company_code, employee_code),
    CONSTRAINT employees_manager_fkey FOREIGN KEY (company_code, manager_code)
            REFERENCES employees (company_code, employee_code)
            ON DELETE SET NULL
);

INSERT INTO employees VALUES
    ('Головной офис', 'Директор', NULL),
    ('Филиал', 'Директор', NULL),
    ('Филиал', 'Зам.директора', 'Директор');
SELECT * FROM employees;
 company_code  | employee_code | manager_code
---------------+---------------+--------------
 Головной офис | Директор      |
 Филиал        | Директор      |
 Филиал        | Зам.директора | Директор
(3 rows)

Suppose we decide to delete the line with the branch director, and thanks to the ON DELETE SET NULL option, we hope that his subordinates will remain without a head for the time being:

DELETE FROM employees
    WHERE company_code="Филиал" AND employee_code="Директор";
ERROR:  null value in column "company_code" of relation "employees" violates not-null constraint
DETAIL:  Failing row contains (null, Зам.директора, null).

But in a composite foreign key, the company_id column is also part of the primary key, so you can’t reset it to NULL.

In version 15, defining a foreign key with ON DELETE SET NULL|DEFAULT, you can additionally specify a list of columns to which this instruction will apply:

ALTER TABLE employees
    DROP CONSTRAINT employees_manager_fkey,   
    ADD CONSTRAINT employees_manager_fkey
            FOREIGN KEY (company_code, manager_code)
            REFERENCES employees (company_code, employee_code)
             ON DELETE SET NULL (manager_code);

Those. we want only the manager code to be reset to NULL, and let the company code remain unchanged.

DELETE FROM employees
    WHERE company_code="Филиал" AND employee_code="Директор";
SELECT * FROM employees;
 company_code  | employee_code | manager_code
---------------+---------------+--------------
 Головной офис | Директор      |
 Филиал        | Зам.директора |
(2 rows)

This approach can be useful for applications that are used by different clients and where a clear separation of data is required. For these purposes, you can safely add a certain tenant_id to each table as part of the primary key without fear of getting the problem described above.

UNIQUE and NULL
commit: 94aa7cc5

The unique constraint allows NULL values. Starting with the 15th version, this statement requires clarification.

By creating a unique constraint, you can specify how to treat NULL values:

  • NULLS DISTINCT – NULL values ​​are considered distinct, this is the default behavior.
  • NULLS NOT DISTINCT – NULL values ​​are considered the same.

Let’s look at the new behavior:

CREATE TABLE test (
    c1 int,
    c2 int,
    UNIQUE NULLS NOT DISTINCT (c1,c2)
);

A clarification has appeared in the description of the UNIQUE constraint:

d test
               Table "bookings.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
 c2     | integer |           |          |
Indexes:
    "test_c1_c2_key" UNIQUE CONSTRAINT, btree (c1, c2) NULLS NOT DISTINCT

Records are inserted into the table as long as there are no duplicate NULL values.

INSERT INTO test VALUES
    (1, NULL),
    (NULL, 1),
    (NULL, NULL);

However, re-adding any of these strings will not work, they will not pass the uniqueness check:

INSERT INTO test VALUES (1, NULL);
ERROR:  duplicate key value violates unique constraint "test_c1_c2_key"
DETAIL:  Key (c1, c2)=(1, null) already exists.
INSERT INTO test VALUES (NULL, 1);
ERROR:  duplicate key value violates unique constraint "test_c1_c2_key"
DETAIL:  Key (c1, c2)=(null, 1) already exists.
INSERT INTO test VALUES (NULL, NULL);
ERROR:  duplicate key value violates unique constraint "test_c1_c2_key"
DETAIL:  Key (c1, c2)=(null, null) already exists.

A similar hint when creating unique indexes:

CREATE UNIQUE INDEX index_name ON table_name (column_list) NULLS NOT DISTINCT;

Formally, this is a patch from the March commitfest, but it happened in early February, so let the January one complete.


That’s all for now. Ahead is the final March commitfest, after which we can sum up the results of the entire release cycle of the 15th version.

Similar Posts

Leave a Reply