Postgresso 29

We continue to bring you some of the most exciting news from PostgreSQL.

Conference PGConf.Online 2021

It starts already on March 1st and will end on the 3rd. It is described in detail in the article. Ivan Panchenko, deputy. Gender Postgres Professional.

This conference (which is not instead of, but apart from the offline, warm-tube, it is expected at the end of spring) will have a record number of foreign guests – which was clearly facilitated by the online format. Including this time will participate and Simon Riggs (Simon Riggs). Reports in 3 streams from 10 am to 6 pm. And master classes…

Articles

PostgreSQL 14: Part 4 or the “January offensive” (Commitfest 2021-01)

Another must read Pavel Luzanov… Major changes after the first three relatively modest commitfests (July, September, November).

The “seed questions” suggested by Paul:

Can ranges contain gaps in values?
Why do we need index notation for the json type?
Can an index grow less than a table with frequent updates? And not to grow at all?
How long sessions were idle in idle_in_transaction?
How to build an ER diagram for system catalog tables?

Deep PostgreSQL Thoughts: The Linux Assassin

The word deep is already scary: isn’t it about AI? But no. Joe Conway (Joe Conway, Crunchy Data) really digs deep. Not even Postgres, not even its own extension plr… This time the theme is the Spooky Killer from the bowels of Linux – Oom killer…

Joe starts with a story: the first discussions in the Postgres community and the first patches in 2003 – how to make the killer work in Postgres terms. Joe goes on to explain the host-level relationship between killer and Postgres (oom_score and oom_score_adj) and at the level CGroup, explains why it is so important not to let the hitman come.

And then Joe Conway moves on to the specific problems of OOM Killer in Kubernetes – this, apparently, was the main reason for writing this article. It turns out that there, in K8s, the killer’s destructive activity can begin even with a relatively safe memory state. Swap is disabled by default in K8s. Up to this point, the author did not touch on the topic of swap, which, in fact, is very much discussed among postgresist practitioners.

Joe links to in-depth article Chris Down (Chris Down) In defense of swap: common misconceptions, and there is also a Russian translation (not by an auto-translator): In defense of a swap: common misconceptions… Postgres is out of the question, but it may interest postgresists too.

He also refers to the article The weird interactions of cgroups and linux page cache in hypervisor environments on the StorPool company blog, where the team consists mainly of Bulgarian names.

Joe Conway then smoothly transitions to the development and efforts of Crunchy Data in the PostgreSQL – Kubernetes – Linux kernel triangle.

??
Shark chews google cable (The Guardian ??)

Things I Wished More Developers Knew About Databases

This article is not (only) about Postgres. Sometimes it is useful to take another look at different DBMSs from a bird’s eye view. Here’s an impressive list of topics to keep in mind for app developers. The article Joanna Dogan (Jaana dogan) was not too lazy to expand and develop them. Sometimes in an unexpected direction: at point # 1, for example, we find out that Google cables just now bitten by sharks… A lot of SQL examples, schemas are the PostgreSQL vs. MySQL.

If the network is available 99.999% of the time, you are very lucky;
ACID is understood differently;
each DBMS has its own ability to maintain consistency and isolation;
optimistic locks can help when it is not possible to hold exclusive locks;
there are anomalies other than dirty reads and data loss;
my DBMS, in what order I want to execute transactions, in this order I execute;
sharding at the application level does not mean sharding outside the DBMS;
AUTOINCREMENT can bring unpleasant surprises;
stale data can be useful and help to avoid blocking;
discrepancies due to hours;
latency can mean different things;
it is necessary to evaluate performance not by average indicators, but by critical operations / transactions;
nested transactions are unsafe;
transactions must not maintain application state;
the scheduler will help you learn a lot about the database;
non-stop migrations are difficult but possible;
significant database growth increases unpredictability.

Troubleshooting Performance Issues Due to Disk and RAM

Hamid Akhtar (Hamid Akhtar, HighGo, China) wrote a simple but useful memo for those who want to quickly narrow down their suspects when looking for hardware problems. Starting with the very obvious top, free and df, it turns to utilities for analyzing disk, processor and memory performance, and offers useful sets of options:
iostat (information about both the disk and the processor), for example. iostat -dmx sda 1
sar (System Activity Report, part of the sysstat package), e.g. sar -f /var/log/sa/sa03 -b -s 02:00:00 -e 02:30:00 -r -S
dstate.g. dstat -cdngy

And here is a script for analyzing memory:

#!/bin/bash
grep -A3 "MemTotal" /proc/meminfo  
grep "Swap" /proc/meminfo
grep -A1 "Dirty|Active" /proc/meminfo

…

Starting with Pg – where is the config?

Despatch (Hubert Lubaszewski) in a short note reminds how you can find configuration files if they are in a non-standard location. The methods he suggests to use are not sensational, but it can be useful, say, a convenient set of options.
For example, like this:
ps -fxao pid,command | grep -E 'post(gres|master)'

– the output will be path. And from here:
sudo grep -E '(hba|ident).conf' <путь к postgresql.conf>

Or now we dance from pid:
sudo cat /proc/<подставляем pid>/environ | tr '' 'n' | grep ^PG | sort

Or:
sudo lsof -p <подставляем pid> -a -d cwd

– we get the data catalog and information about it.
If such advice is not needed, you can reflect on the topic “I would have done differently.” Let’s say simply using find, eg.

Aggregates in the database

Kirill Borovikov aka kilor completed a mini-series of articles about units:

Why, how, and is it worth it?

– how to make reports fast, how to implement them and what “rakes” are encountered along the way;

Efficient processing of the “facts” stream

– how to process a LARGE (kilor caps) number of records and not block anyone especially, including the “incoming stream” of data. For example, it can be the recalculation of balances and the maintenance of cumulative sales by goods during their constant shipments, or the aggregation of balances and turnovers by accounts, with massive changes in transactions.

Multidimensional superaggregates

– hierarchical aggregates in several simultaneous sections;

Proxy tables

– how you can reduce the total delays for inserting many changes into the aggregate tables through the use of intermediate tables and external processing.

The clouds

Babelfish: the Elephant in the Room?

Russian translation of the title of this article, which appeared on the website of the Spanish-speaking community foundation FUNDACIÓN POSTGRESQL would sound like this: “Babylonian fish Or did I not notice the elephant? “We have already mentioned that the idea of the project is over-ambitious: Bablefish is PostgreSQL, compatible with SQL Server so much that applications written for it (including T-SQL and the TDS protocol) will immediately work without knowing that they are working with PostgreSQL.

The author of the article – Alvaro Hernandez (Álvaro Hernández Tortosa, OnGres) – starts with the market situation to further present the Hamlet question that the authors of the Babylonian Fish had to ask: fork or not fork?

Babelfish cannot work as an extension yet without tweaking the PostgreSQL core. Alvaro reminds that on January 25th, an honored and authoritative person in the community – Jan Wyck (Jan Wieck) – offered discuss the extensibility of the PostgreSQL protocol: make such hooks that will allow the SQL Server protocol to be implemented as an extension without changes to the kernel. But this is not a quick process. At the same time, we decided to discuss compatibility with MySQL. But what should AWS do with Bablefish if the community ignores this path, or if the integration goes haywire? Most likely, Alvaro believes, AWS will develop Bablefish as a fork (as has already happened with Aurora), no matter how much they would like to do without the fork. And if you still have to, then AWS can do it.

Alvaro then draws on the Innovator’s Dilemma. And it asks another interesting question: do we (ie the community) want Babelfish to become “MariaDB of PostgreSQL”?

A person

Another PG-person of the week is Alexander Sosnaliving in a small town on the Lower Rhine and free from work in credativ Teaching IT Security at the University of Lower Rhine. He is working on a rather unusual extension: pg_snakeoil… This is an antivirus specifically for PostgreSQL: it looks for viruses in the data so as not to interfere with the operation of the database, which is by no means typical for conventional antiviruses. As Alexander notes, viruses are not always hunted for because of their harmfulness, sometimes only because regulatory documents require it.

Releases

PostgreSQL 13.2

PostgreSQL 13.2, 12.6, 11.11, 10.16, 9.6.21, 9.5.25 released (latest release of branch 9.5). These releases have overcome two security issues:
in PostgreSQL 13 it could behaving SELCT rights of one column, get all the table columns using a sophisticated query;
the second issue was for versions 11, 12, and 13. If a user has UPDATE rights on a partitioned table, but does not have SELECT rights on a certain column, he could get the column data from error messages.
Also fixed more than 80 bugs…

pg_probackup 2.4.9

The flag appeared --force for incremental mode. It is now possible to overwrite the contents in the directory specified by PGDATA if system-identifier in the target instance and the copies are NOT the same (the error message came before).

pgAdmin 4 v. 5.0

In version 5.0, among other things, there was support for logical replication; support for publications and subscriptions in Schema Diff.

Apache AGE 0.3.0

Apache AGE is an extension that adds graph database functionality to PostgreSQL. The goal of the project is a single repository for relational and graph data models so that users can use both standard SQL and query languages for graph databases openCypher and GQL…