Postgresso №8 (57)

PostgreSQL releases

Between this review and the previous one, beta3 and the first release candidate were released.

IN PostgreSQL 16 Beta 3 2 vulnerabilities are closed, they concern extensions And MERGE. IN message There are important notes for those who use BRIN indexes for queries involving NULL values.

But this is almost a detective story: PostgreSQL: CVE-2020-21469 is not a security vulnerability.

Here explain:

Analysis of atypical CVEs, created August 22, showed that about 150 reports were added that day (1, 2, 3, 4, 5, 6) under identifiers issued in 2020-2022. The identifiers cover a large number of open source projects and, at first glance, also interpret ordinary errors not related to the processing of external data as dangerous vulnerabilities.

As usual, 15.4, 14.9, 13.12, 12.16, 11.21 were updated synchronously. And then it became known that support for PostgreSQL 11 would end on November 9th.

PostgreSQL: PostgreSQL 16 RC1

Bugs fixed. In particular, the performance problem when running COPY in parallel on the same table has been resolved. The list of fixes can be found on the page current issues (open items).

You can already see the preliminary PostgreSQL16 Release Notes.

The general availability version of PostgreSQL 16 is scheduled for release September 14th.

PostgreSQL 17: Part 1 or Commitfest 2023-07

Another review article Pavel Luzanova. Previous articles about version 16, linked to commitfests: 2022-07, 2022-09, 2022-11, 2023-01, 2023-03. Here it is English version (as practice shows, it is in demand).

There have been some important changes since April that are worth noting. Let’s start with losses. The following developments have been cancelled:

Release 16. Acknowledgments

Preliminary list of investors PG 16 from Bruce. There are many familiar and familiar-sounding names.

“Don’t compare: the living is incomparable”

But still everyone compares everything with everything else, it’s human nature.

PlanetScale vs. Neon: the Continued Saga between MySQL and PostgreSQL

The Bytebase blog has an interesting comparison: things have already come to serverless databases on different platforms. Author of the article – Tianzhou (Tianzhou), he is a team lead at Google Cloud SQL, supported PostgreSQL and MySQL at Google, headed the Database/DevTools/Collaboration department at Ant Group.

He undertook to compare these new databases according to the following set of criteria:

In a couple of sentences he describes these bases as follows:

PlanetScale and Neon are similar to each other, and each is similar to MongoDB and Snowflake, respectively:

  • Like MongoDB, PlanetScale uses a shared-nothing architecture, provides a user-friendly database platform, is obsessively developer-centric, and has impressive success stories to tell.

  • Like Snowflake, Neon brings a new shared-storage approach to a stagnant OLTP architecture, uses the Postgres dialect, and is also very developer-friendly.

Both PlanetScale and Neon could be the next MongoDB/Snowflake in the world of modern RDBMSs as DBaaS. What we’ve been waiting for a long time.

You can find more comparisons on the company’s website:

And here is a much more traditional subgenre (and this is Tianzhou again):

Postgres vs. MySQL: a Complete Comparison in 2023

They have a slightly different set of criteria. Let’s refrain from quoting – read.

Or don’t read, just skim through the article Differences between MySQL and PostgreSQL. Choosing which is better, PostgreSQL or MySQL Kirill Kosolapov, General Director Amvera. There are, to put it mildly, a lot of overlaps with the Tianzhou article. But he doesn’t have a point Popularity in Russia. But Kirill doesn’t have JSON and CTE.

From Friday to Geo

Filling to the Brim: The Impact of Table Column Order on PostgresQL Database Sizes

Original – On Rocks and Sand | Optimizing Postgres Column Order belongs Sean Thomas (Shaun Thomas). The same one that started PG Fridays – this is how we began to translate PG Phridays. Next incarnation Friday – PGSQL Phriday Ryan Booze (Rayan Booz). She has succeeded in media. Now we already have PGSQL Phriday #012: What Excites You About PostgreSQL 16.

However, it is possible that Thomas himself once saw glimpses of Philosophical Fridaysor Physique Phriday, or something like that. And I was inspired.

But let’s get to the point. The articles are interesting not only in themselves. Sometimes valuable information comes from a comment. Shona added this to the translation of this article: Igor Sukhorukov:

Utility postgres_dba hides such queries behind its interface, analyzes all tables in PostgreSQL and provides recommendations on the order of columns and an estimate of space savings in %. I have successfully used this approach in openstreetmap_h3 projectwhere the database volume is already half a terabyte and will only increase over time.

To this the author of the translation replied: Thank you. It was worth writing an article just for this comment..

Igor begins his own article with the following phrase: in publications on Habré, most often the most interesting things are in the comments. Here is this article, where he shares his experience of creating interfaces “on the knee” for testing prototypes.

We know Igor Sukhorukov from his speeches at PGConf.Russia, for example: How to fit the whole world into a regular laptop: PostgreSQL and OpenStreetMap at a recent PGConf.Russia 2023. And the other day this article appeared:

A way to get into the guts of the operating system, Docker from PostgreSQL using SQL

This can now be done using postgres_osquery. This tool works as an interface to an open source tool osquery, which itself is a SQL interface to the operating system. Igor composed it in Python storage run_osquery, which receives a query string for osquery as input, and outputs JSON, replacing empty lines with null. Function results run_osquery you can save to tables using pg_cron on a schedule – the author advises – or run on each node of the CitusDB cluster using run_command_on_all_nodes().Or build your own SIEM (Security information and event management) solution for database hosts using only PostgreSQL.

Some high-profile articles

Squeeze the hell out of the system you have

Dan Slimmon (Dan Slimmon) shared his personal experience. The database with a monolithic SaaS application grew and grew and ceased to fit into the hardware in terms of performance. What to do? Well, whatever. Switch to write-based sharding or microservices. It’s a good thing. But, says Dan, this is a dramatic complication that will lead to tears later. We must squeeze out of the situation what we can squeeze out. And they squeezed it out – the maximum processor load dropped from 90% to 30%. We did without shards and microservices.

But not everyone agrees with this. A fair amount has spread discussion on Hacker News. By the end, everyone forgot about Postgres and were already arguing about philosophical topics of optimization in general.

What’s new with Postgres at Microsoft (August 2023)

Claire Giordano (Claire Giordano, Microsoft Community Hub) gives a bird’s-eye view of, as she puts it, Microsoft’s improvements for 2023. The developers participated in kernel patches, finalized Citus And pg_cron. Their decisions are overgrown with muscles: many will be interested in appearing on this list Patroni, PgBouncer, pgcopydb. Here is a convenient rubric list:

Since Microsoft’s activities traditionally contribute to the release of adrenaline in the open source community, semi-conspiratorial discussions in the spirit of oh, this love of open source is not good. We even remembered the article from 2019 Anton Semin How Microsoft fought Open Source and why it had to fall in love with Linux on Skillbox Media and all the way back to 2007 Why Google’s monopoly is better than Microsoft’s Vitaliy with the almost sinister nickname @Tonatos.

PostgreSQL Logical Replication: Advantages, EDB’s Contributions and PG 16 Enhancements

Explanatory article Brian Turiff (Bryan Turriff, EDB). It contains the basics, and the history of the issue, and – as promised – about the contribution of EDB (2ndQuadrant), and advantages, and disadvantages, and pieces of code, and about EDB Postgres Distributed (PGD), of course, and about what’s new in PostgreSQL 16. I don’t know the author’s name. He is the chief marketing officer for the company’s products.

Here’s another news from them:

EDB Open Sources Powerful Tool to Automate and Manage Postgres Deployments

The company open sourced its tool for deploying and configuring highly available (HA) clusters. It’s called Trusted Postgres Architect (TPA). Details perhaps next time.

Some releases

Supavisor: Scaling Postgres to 1 Million Connections

In the article Egor Romanov, Chase Granberry (Chase Granberry) And Stanislav Muzhik (Stanislav Muzhyk) talks about their puller. It’s built on Elixir (language on top of Erlang, creator – Jose Valim (Jose Valim) participated in this development of Supabase). The pooler was created with multi-tenancy in mind. The scalability line graphs are really impressive. So far, the version has not even been released yet 1.x, but only 0.9.0.

PostgreSQL: PL/R 8.4.6

This release doesn’t make any major changes, but it at least demonstrates that project alive and well. Supports him Dave Kramer (Dave Cramer), and in the repository there is a note with a link to the founding father Joe Conway (Joe/Joseph E Conway): forked from jconway/plr. Joe, it looks like the project is already doesn’t practice.

The version takes into account changes in PostgreSQL 16. There are assemblies for Windows with R versions 4.1.3 and 4.2.3. All changes Here.

Hydra 1.0 beta

Hydra – columnar Postgres for analysts. The developers claim that everything is so well parallelized that queries that would take weeks to execute will give an answer in minutes. And that you can safely query billions of rows. The vacuum is also optimized for speakers. Vector storage. Following the fashion, they also optimized the similarity search.

Before this, we tested a version with the modest name 0.3.0 alpha for 4 months. They promise that version 1.0 GA (generally available) will appear very soon. Hydra is built on top of Citus Columnar, which in turn is a modernized version of the expansion cstore_fdw. Repository Here.

pgvector

Little by little it is turning into our regular column. Perhaps influenced by the article Jonathan Katz Vectors are the new JSON in PostgreSQL.

Machine Learning challenge: Chihuahua vs Muffin with PostgreSQL and pgvector

Francesco Tizieux (Francesco Tisiot) on the blog DEV Community offers to make sure that, armed with pgvector, you can easily and quickly deal with a well-known (in certain AI circles) meme: teach the network to distinguish a Chihuahua dog (more identically: Chihuahua) from a cookie with raisins instead of eyes and nose. You can try Python notebooks like this: here laptopHere dataset for training, here free PostgreSQL on Aiven.

An early look at HNSW performance with pgvector

Jonathan Katz (Jonathan Katz) admits that he did not work on HNSW – the new 0.5.0 feature, only tested it (but worked on pgvector). Shares information about the new algorithm and begins testing performance/recall (completeness – translated, for example, here). For comparison, Jonathan took pg_embedding. In the last issue we wrote about the expansion pg_embedding, which was developed by Neon. Now they have committed it to pgvector.

HNSW Indexes with Postgres and pgvector

Great article Christopher Winslet (Christopher Winslett, Crunchy Data). It contains beautiful and clear diagrams and various aspects of working with this algorithm.

And then, finally, he appeared pgvector 0.5.0

In this note about its release, they report on what Jonathan prepared us for in advance. But it also talks about the unit sum and functions l1_distance. Here Here new.

After the official release, Jonathan continued to explain and test:

pgvector 0.5.0 Feature Highlights and HOWTOs

And not only him. On the Supabase blog its employee Egor Romanov (Egor Romanov) also continued testing ivfflat and HNSW:

pgvector v0.5.0: Faster semantic search with HNSW indexes

And everywhere continues to add pgvector to their products. pgvector was also added to DBLab Engine, more about them below. They (postgres.ai) explain this by following an obvious trend, citing Google Cloud:

Unlock the power of gen AI with the pgvector PostgreSQL extension

From the message Sandhya Ghai (Sandhya Ghai) and Narisimkhana Balls (Bala Narasimhan) follows not only that Cloud SQL for PostgreSQL And AlloyDB for PostgreSQL now support pgvectorbut also that they know how to work with pretrained models through Vertex AIwhich is itself a vector database (not relational) and Matching Engine ANN service (ANN=Approximate Nearest Neighbor).

Database Lab => DBLab 3.4

DBLab 3.4: new name, SE installer, and lots of improvements

This unique database cloning system is not at all alien to our Postgres Professional Education department:

To automatically check assignments, we used a system developed by our colleague Ilya Bashtanov based on Database Lab Engine. It maintains a pool of thin PostgreSQL clones and runs the entire test suite on them in parallel. Actually we are going to use this system to create our own SQLblackjacksimulator, but that’s a completely different story.

– This is Yegor Rogov telling about the tasks of the third stage of the IT Planet Olympiad on PostgreSQL.

The name has changed, yes. But, rather, cosmetically: there was Database Lab Engine, now DBLab Engine. But the insides have changed a lot: the largest number of changes for the entire existence – says Nikolay Samokhvalov (Nikolay Samokhvalov), founder and head of Postgres.ai. Here Here they are all listed.

There is a paid installer for the paid version – DBLab Engine SE. There are changes to configuration options. Fixed restarting clone containers. Now you can, for example, run pg_upgrade -k inside some clone container and immediately start testing the new major version of Postgres in isolation.

Education, basics, science, useful entertainment

Final conference of the Postgres Pro Summer School in Novosibirsk: Postgres Professional Company

On August 11, the final conference of the Postgres Professional Summer School was held in Novosibirsk on the basis of Novosibirsk State University. 15 finalists presented 12 group and individual projects that focused on testing tools BiHAstreaming replication and multimasterexpansion development pgsphere.

Registration is open for the free course “Basics of Database Technologies” from Postgres Professional

The training will take place from September 16 to December 23 in a distance format. The program is built on the basis textbook of the same name team of authors from St. Petersburg State University under the leadership Boris Asenovich Novikov. The course is taught by the technical director of Billing Center LLC, a group of companies of the Gorod system – Davydov Evgeniy Stanislavovichwhose experience working with databases is more than 30 years.

Customizing SQL Functions in PostgreSQL: Exploring Various Approaches

David Jan (David Zhang) from the Canadian branch of the Chinese High Go often appears in our reviews. This time he teaches write Hello World calculate a+b in different ways:

Introduction of the matrix profile concept into a relational DBMS for time series mining

Authors – E. V. Ivanova, M. L. Tsymbler. Let us recall that in issue Postgresso 8-9 (45-46) there was a small chapter about Michael: Interesting author. His work on parallel databases is based on PostgreSQL and can be found on such resources as academia.edu (account required). Here is a presentation about Taming Elephants: Embed Parallelism into PostgreSQL. Here is his page with courses.

Fun with PostgreSQL Puzzles: Recursive Functions with Animations

Greg Sabino Mullaney (Greg Sabino Mullane) from Crunchy Data continues to entertain inquisitive postgresists. All this is again within the framework Advent of Codewhich we talked a little about in Postgresso issue No. 6 (55).

The toolkit, of course, is not limited to recursive functions. Greg also uses CTE, regexp_split_to_array And GREATEST/LEAST.

Conferences

Results of PGConf.Siberia 2023 in Tomsk

Among the results: reports were presented by experts from Postgres Professional, Sibedge, Lukoil and other companies. And further: Postgres Professional employees made a presentation in the Lectures on Grass section.

We remind you about PGConf.SPb 2023

PGConf will be held for the first time on September 25 in St. Petersburg. The conference will be called PGConf.SPb 2023. Place – Hotel CorinthiaNevsky Prospekt, 57.

And before that it will take place Strike 2023 – IT conference in Ulyanovsk

And so Who will speak at the Strike.

Incomplete works of Lorenz Albe

Online Data Type Change in PostgreSQL

Changing the data type in a large table is always a headache. ALTER TABLE receives an Exclusive lock, and the table becomes unavailable for reading and writing. Lorenz Albe (Laurenz Albe, Cybertec) explains how to reduce these blockages to a minimum.

Subqueries and performance in PostgreSQL

As always, there is a funny picture above the article. But that’s not the main thing. He’s digging deep again. In this case, he proposes to distinguish between scalar queries and tabular ones, and examines the types of subqueries from the point of view of performance problems and their resolution.

And in this article, Lorenz dives into memory allocation mechanisms in C:

Memory Context: How Postgres Allocates Memory

He talks about the concept memory contexts as a means (an attempt) to protect Postgres from memory leaks, which often occur in C; explains the architecture of contexts and reminds about representation pg_backend_memory_contexts and functions pg_log_backend_memory_contexts()which provide an opportunity to look into these structures.

Use HOT, so CLUSTER won’t rot in PostgreSQL

The advice “use hots so that the cluster does not rot” does not exhaust the content of this article by Lorenz. He explains the mechanisms of swelling of indexes and tables that were subjected to clustering, explains the relationship between the effectiveness of HOT Updates and fillfactor, then launches pgbench and demonstrates quantitative dependencies.


That’s all for now.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *