Between this review and the previous one, beta3 and the first release candidate were released.
But this is almost a detective story: PostgreSQL: CVE-2020-21469 is not a security vulnerability.
Analysis of atypical CVEs, created August 22, showed that about 150 reports were added that day (1, 2, 3, 4, 5, 6) under identifiers issued in 2020-2022. The identifiers cover a large number of open source projects and, at first glance, also interpret ordinary errors not related to the processing of external data as dangerous vulnerabilities.
As usual, 15.4, 14.9, 13.12, 12.16, 11.21 were updated synchronously. And then it became known that support for PostgreSQL 11 would end on November 9th.
Bugs fixed. In particular, the performance problem when running COPY in parallel on the same table has been resolved. The list of fixes can be found on the page current issues (open items).
You can already see the preliminary PostgreSQL16 Release Notes.
The general availability version of PostgreSQL 16 is scheduled for release September 14th.
PostgreSQL 17: Part 1 or Commitfest 2023-07
Another review article Pavel Luzanova. Previous articles about version 16, linked to commitfests: 2022-07, 2022-09, 2022-11, 2023-01, 2023-03. Here it is English version (as practice shows, it is in demand).
There have been some important changes since April that are worth noting. Let’s start with losses. The following developments have been cancelled:
“Don’t compare: the living is incomparable”
But still everyone compares everything with everything else, it’s human nature.
The Bytebase blog has an interesting comparison: things have already come to serverless databases on different platforms. Author of the article – Tianzhou (Tianzhou), he is a team lead at Google Cloud SQL, supported PostgreSQL and MySQL at Google, headed the Database/DevTools/Collaboration department at Ant Group.
He undertook to compare these new databases according to the following set of criteria:
compatibility (with MySQL and PostgreSQL respectively),
joint development opportunities (developer workflow)
ease of operation (operability),
compliance with safety standards (compliance)
In a couple of sentences he describes these bases as follows:
PlanetScale and Neon are similar to each other, and each is similar to MongoDB and Snowflake, respectively:
Like MongoDB, PlanetScale uses a shared-nothing architecture, provides a user-friendly database platform, is obsessively developer-centric, and has impressive success stories to tell.
Like Snowflake, Neon brings a new shared-storage approach to a stagnant OLTP architecture, uses the Postgres dialect, and is also very developer-friendly.
Both PlanetScale and Neon could be the next MongoDB/Snowflake in the world of modern RDBMSs as DBaaS. What we’ve been waiting for a long time.
You can find more comparisons on the company’s website:
And here is a much more traditional subgenre (and this is Tianzhou again):
They have a slightly different set of criteria. Let’s refrain from quoting – read.
Or don’t read, just skim through the article Differences between MySQL and PostgreSQL. Choosing which is better, PostgreSQL or MySQL Kirill Kosolapov, General Director Amvera. There are, to put it mildly, a lot of overlaps with the Tianzhou article. But he doesn’t have a point Popularity in Russia. But Kirill doesn’t have JSON and CTE.
From Friday to Geo
Filling to the Brim: The Impact of Table Column Order on PostgresQL Database Sizes
Original – On Rocks and Sand | Optimizing Postgres Column Order belongs Sean Thomas (Shaun Thomas). The same one that started PG Fridays – this is how we began to translate PG Phridays. Next incarnation Friday – PGSQL Phriday Ryan Booze (Rayan Booz). She has succeeded in media. Now we already have PGSQL Phriday #012: What Excites You About PostgreSQL 16.
But let’s get to the point. The articles are interesting not only in themselves. Sometimes valuable information comes from a comment. Shona added this to the translation of this article: Igor Sukhorukov:
Utility postgres_dba hides such queries behind its interface, analyzes all tables in PostgreSQL and provides recommendations on the order of columns and an estimate of space savings in %. I have successfully used this approach in openstreetmap_h3 projectwhere the database volume is already half a terabyte and will only increase over time.
To this the author of the translation replied: Thank you. It was worth writing an article just for this comment..
Igor begins his own article with the following phrase: in publications on Habré, most often the most interesting things are in the comments. Here is this article, where he shares his experience of creating interfaces “on the knee” for testing prototypes.
We know Igor Sukhorukov from his speeches at PGConf.Russia, for example: How to fit the whole world into a regular laptop: PostgreSQL and OpenStreetMap at a recent PGConf.Russia 2023. And the other day this article appeared:
A way to get into the guts of the operating system, Docker from PostgreSQL using SQL
This can now be done using postgres_osquery. This tool works as an interface to an open source tool osquery, which itself is a SQL interface to the operating system. Igor composed it in Python storage
run_osquery, which receives a query string for osquery as input, and outputs JSON, replacing empty lines with null. Function results
run_osquery you can save to tables using pg_cron on a schedule – the author advises – or run on each node of the CitusDB cluster using
run_command_on_all_nodes().Or build your own SIEM (Security information and event management) solution for database hosts using only PostgreSQL.
Some high-profile articles
Dan Slimmon (Dan Slimmon) shared his personal experience. The database with a monolithic SaaS application grew and grew and ceased to fit into the hardware in terms of performance. What to do? Well, whatever. Switch to write-based sharding or microservices. It’s a good thing. But, says Dan, this is a dramatic complication that will lead to tears later. We must squeeze out of the situation what we can squeeze out. And they squeezed it out – the maximum processor load dropped from 90% to 30%. We did without shards and microservices.
But not everyone agrees with this. A fair amount has spread discussion on Hacker News. By the end, everyone forgot about Postgres and were already arguing about philosophical topics of optimization in general.
Claire Giordano (Claire Giordano, Microsoft Community Hub) gives a bird’s-eye view of, as she puts it, Microsoft’s improvements for 2023. The developers participated in kernel patches, finalized Citus And pg_cron. Their decisions are overgrown with muscles: many will be interested in appearing on this list Patroni, PgBouncer, pgcopydb. Here is a convenient rubric list:
Since Microsoft’s activities traditionally contribute to the release of adrenaline in the open source community, semi-conspiratorial discussions in the spirit of oh, this love of open source is not good. We even remembered the article from 2019 Anton Semin How Microsoft fought Open Source and why it had to fall in love with Linux on Skillbox Media and all the way back to 2007 Why Google’s monopoly is better than Microsoft’s Vitaliy with the almost sinister nickname @Tonatos.
Explanatory article Brian Turiff (Bryan Turriff, EDB). It contains the basics, and the history of the issue, and – as promised – about the contribution of EDB (2ndQuadrant), and advantages, and disadvantages, and pieces of code, and about EDB Postgres Distributed (PGD), of course, and about what’s new in PostgreSQL 16. I don’t know the author’s name. He is the chief marketing officer for the company’s products.
Here’s another news from them:
The company open sourced its tool for deploying and configuring highly available (HA) clusters. It’s called Trusted Postgres Architect (TPA). Details perhaps next time.
In the article Egor Romanov, Chase Granberry (Chase Granberry) And Stanislav Muzhik (Stanislav Muzhyk) talks about their puller. It’s built on Elixir (language on top of Erlang, creator – Jose Valim (Jose Valim) participated in this development of Supabase). The pooler was created with multi-tenancy in mind. The scalability line graphs are really impressive. So far, the version has not even been released yet 1.x, but only 0.9.0.
This release doesn’t make any major changes, but it at least demonstrates that project alive and well. Supports him Dave Kramer (Dave Cramer), and in the repository there is a note with a link to the founding father Joe Conway (Joe/Joseph E Conway): forked from jconway/plr. Joe, it looks like the project is already doesn’t practice.
The version takes into account changes in PostgreSQL 16. There are assemblies for Windows with R versions 4.1.3 and 4.2.3. All changes Here.
Hydra – columnar Postgres for analysts. The developers claim that everything is so well parallelized that queries that would take weeks to execute will give an answer in minutes. And that you can safely query billions of rows. The vacuum is also optimized for speakers. Vector storage. Following the fashion, they also optimized the similarity search.
Before this, we tested a version with the modest name 0.3.0 alpha for 4 months. They promise that version 1.0 GA (generally available) will appear very soon. Hydra is built on top of Citus Columnar, which in turn is a modernized version of the expansion cstore_fdw. Repository Here.
Little by little it is turning into our regular column. Perhaps influenced by the article Jonathan Katz Vectors are the new JSON in PostgreSQL.
Francesco Tizieux (Francesco Tisiot) on the blog DEV Community offers to make sure that, armed with pgvector, you can easily and quickly deal with a well-known (in certain AI circles) meme: teach the network to distinguish a Chihuahua dog (more identically: Chihuahua) from a cookie with raisins instead of eyes and nose. You can try Python notebooks like this: here laptopHere dataset for training, here free PostgreSQL on Aiven.
Jonathan Katz (Jonathan Katz) admits that he did not work on HNSW – the new 0.5.0 feature, only tested it (but worked on pgvector). Shares information about the new algorithm and begins testing performance/recall (completeness – translated, for example, here). For comparison, Jonathan took pg_embedding. In the last issue we wrote about the expansion pg_embedding, which was developed by Neon. Now they have committed it to pgvector.
Great article Christopher Winslet (Christopher Winslett, Crunchy Data). It contains beautiful and clear diagrams and various aspects of working with this algorithm.
And then, finally, he appeared pgvector 0.5.0
In this note about its release, they report on what Jonathan prepared us for in advance. But it also talks about the unit
sum and functions
l1_distance. Here Here new.
After the official release, Jonathan continued to explain and test:
And not only him. On the Supabase blog its employee Egor Romanov (Egor Romanov) also continued testing ivfflat and HNSW:
And everywhere continues to add pgvector to their products. pgvector was also added to DBLab Engine, more about them below. They (postgres.ai) explain this by following an obvious trend, citing Google Cloud:
From the message Sandhya Ghai (Sandhya Ghai) and Narisimkhana Balls (Bala Narasimhan) follows not only that Cloud SQL for PostgreSQL And AlloyDB for PostgreSQL now support pgvectorbut also that they know how to work with pretrained models through Vertex AIwhich is itself a vector database (not relational) and Matching Engine ANN service (ANN=Approximate Nearest Neighbor).
Database Lab => DBLab 3.4
This unique database cloning system is not at all alien to our Postgres Professional Education department:
To automatically check assignments, we used a system developed by our colleague Ilya Bashtanov based on Database Lab Engine. It maintains a pool of thin PostgreSQL clones and runs the entire test suite on them in parallel. Actually we are going to use this system to create our own SQL
blackjacksimulator, but that’s a completely different story.
– This is Yegor Rogov telling about the tasks of the third stage of the IT Planet Olympiad on PostgreSQL.
The name has changed, yes. But, rather, cosmetically: there was Database Lab Engine, now DBLab Engine. But the insides have changed a lot: the largest number of changes for the entire existence – says Nikolay Samokhvalov (Nikolay Samokhvalov), founder and head of Postgres.ai. Here Here they are all listed.
There is a paid installer for the paid version – DBLab Engine SE. There are changes to configuration options. Fixed restarting clone containers. Now you can, for example, run
pg_upgrade -k inside some clone container and immediately start testing the new major version of Postgres in isolation.
Education, basics, science, useful entertainment
On August 11, the final conference of the Postgres Professional Summer School was held in Novosibirsk on the basis of Novosibirsk State University. 15 finalists presented 12 group and individual projects that focused on testing tools BiHAstreaming replication and multimasterexpansion development pgsphere.
The training will take place from September 16 to December 23 in a distance format. The program is built on the basis textbook of the same name team of authors from St. Petersburg State University under the leadership Boris Asenovich Novikov. The course is taught by the technical director of Billing Center LLC, a group of companies of the Gorod system – Davydov Evgeniy Stanislavovichwhose experience working with databases is more than 30 years.
David Jan (David Zhang) from the Canadian branch of the Chinese High Go often appears in our reviews. This time he teaches
write Hello World calculate a+b in different ways:
Authors – E. V. Ivanova, M. L. Tsymbler. Let us recall that in issue Postgresso 8-9 (45-46) there was a small chapter about Michael: Interesting author. His work on parallel databases is based on PostgreSQL and can be found on such resources as academia.edu (account required). Here is a presentation about Taming Elephants: Embed Parallelism into PostgreSQL. Here is his page with courses.
Greg Sabino Mullaney (Greg Sabino Mullane) from Crunchy Data continues to entertain inquisitive postgresists. All this is again within the framework Advent of Codewhich we talked a little about in Postgresso issue No. 6 (55).
Among the results: reports were presented by experts from Postgres Professional, Sibedge, Lukoil and other companies. And further: Postgres Professional employees made a presentation in the Lectures on Grass section.
We remind you about PGConf.SPb 2023
PGConf will be held for the first time on September 25 in St. Petersburg. The conference will be called PGConf.SPb 2023. Place – Hotel CorinthiaNevsky Prospekt, 57.
And before that it will take place Strike 2023 – IT conference in Ulyanovsk
And so Who will speak at the Strike.
Incomplete works of Lorenz Albe
Changing the data type in a large table is always a headache. ALTER TABLE receives an Exclusive lock, and the table becomes unavailable for reading and writing. Lorenz Albe (Laurenz Albe, Cybertec) explains how to reduce these blockages to a minimum.
As always, there is a funny picture above the article. But that’s not the main thing. He’s digging deep again. In this case, he proposes to distinguish between scalar queries and tabular ones, and examines the types of subqueries from the point of view of performance problems and their resolution.
And in this article, Lorenz dives into memory allocation mechanisms in C:
He talks about the concept memory contexts as a means (an attempt) to protect Postgres from memory leaks, which often occur in C; explains the architecture of contexts and reminds about representation pg_backend_memory_contexts and functions pg_log_backend_memory_contexts()which provide an opportunity to look into these structures.
The advice “use hots so that the cluster does not rot” does not exhaust the content of this article by Lorenz. He explains the mechanisms of swelling of indexes and tables that were subjected to clustering, explains the relationship between the effectiveness of HOT Updates and fillfactor, then launches pgbench and demonstrates quantitative dependencies.
That’s all for now.