postgresso №7 (56)

Conferences, Battle of Ibiza

PGConf.SPb 2023

PGConf will be held for the first time on September 25 in St. Petersburg. The conference will be called PGConf.SPb 2023. Location – Corinthia HotelNevsky prospect., 57.

Registration is open and applications are being accepted. You can apply in your personal account. The number of topics is not limited.

PGConf.Siberia 2023

Pass on site City of IT September 9th in Tomsk. The program is still being formed, but the general outline is known: in addition to the PostgreSQL technical conference itself, there will be:

  • plenary session with a discussion of the region’s IT business,

  • thematic expert sections,

  • business section (regional IT business),

  • Workshops for team leaders and managers,

  • City IT Students (sections for young IT entrepreneurs),

  • certification from Postgres Professional for PostgreSQL Administration,

  • (Last but not least?) after party (positioned as the most carbon monoxide party of the Tomsk it-community).

You can register for the event now. Participation in the conference is available to all attendees IT CITY.

I can add, I can’t resist: the wooden quarter of Tomsk is a wonder of the world, I was very impressed. In general, the city is pretty.

PerfConf – load testing conference

This conference is held for the 9th time by Performance Lab. The program has not yet been announced, it is known that there will be 3 venues – online, the main stage and the business lounge, and what will be reported Alexander Ivanov (Yandex), Tatiana Spiridonova (Ozon) what Andrey Zubkov will talk about pg_profile/PWRA Pavel Tolmachev about how the PostgreSQL optimizer works on large queries (both Postgres Professional). The conference will be held on September 13 in Moscow.

PGConf NYC 2023

It will take place in October. Recently published complete programit contains an introductory word Andy Pavlo (Andy Pavlo), performances Bruce Momjan (Bruce Momjian) Ryan Booza (Ryan Booz) Claire Giordano (Claire Giordano) and others.

www.pgday.uk 2023

This is the first PJD in the UK. Now it will be held every year in London. The nearest one is September 12th, but program already known. They will perform, for example, Magnus Hagander (Magnus Hagander, Redpill Linpro) with What’s new in PostgreSQL 16 and Tom Vondra (EDB) with BRIN improvements and new opclasses. Registration Here.

PASS Data Summit 2023 Sessions

This is an MS-leaning event, so there are a lot of SQL Server, Power BI and Azure, but there are also postgres talks. For example:

There are also well-known names in the huge list, but these topics seemed curious to us.

PGMeetup: 1C on Postgres Pro

Meetup (webinar) was held on July 26 and aroused considerable interest. Videos and slides can be viewed Here. There are two reports with a total duration of one and a half hours. First:

Konstantin SumkinCIO Softline Outsourcing:

  • Analysis of the Postgres configuration for the tasks being solved. Implementation of various Postgres Pro builds for 1C databases;

  • Analysis of requests and 1C code during the transition and implementation of Postgres Pro from various DBMS;

  • Performance analysis of systems based on Postgres Pro DBMS using 1C: Corporate tool package 8;

  • Possibilities of the 1C platform, starting from version 23, for data migration without uploading to *.dt;

  • Plan-fact analysis of 1C database transitions from various DBMS to Postgres Pro.

Second – Andrey ZabelinSenior Technical Adviser Postgres Professional:

  • Postgres Pro features for 1C platform

  • Data compression

  • Backup and restore

  • Fault tolerance and disaster tolerance

  • 1C performance on Postgres Pro

And a lot of questions and answers.

And:

PGIbiza 2023: Postgres and Ecosystem

But it’s not easy for her. It would seem that, in Ibiza there is everything to work and relax sequentially or simultaneously (as far as I remember, there are even reports on the beach). But an old conflict resurfaced Alvaro Hernandez (Álvaro Hernández Tortosa) and the elders (Core Team) of PostgreSQL. Since there is a risk of misinterpreting a lot, let’s recall the background. Alvaro is not a raider trying to take over the PostgreSQL trademarks. He is revolutionary.

The story begins in 2021. In Postgresso 37 we wrote:

In a paragraph Trademark Dispute talking about settled conflict with Alvaro Hernandez (Álvaro Hernández Tortosa). Trademarks are trademarks, but Alvaro actually criticized the team on several issues at once, he especially did not like the opacity of decisions.

And in Postgresso 34 there is a lot about Alvaro’s revolutionary nature: he was concerned about gender and racial balance (they say that the percentage of white CIS-gender men in the leadership is too high, quotas are needed). The majority was against at least the last points on the principle of “don’t wake famouslywhile it is quiet” (read the comments under Alvaro’s appeal).

Those who wish can form their own opinion by reading the claims of the parties. Chronologically in 2023 it is:

There is a feeling that this outbreak will (or has already) come to naught without overshadowing this non-standard gathering in Ibiza.

Contributors

List Contributors – not money, but lines of code in PostgreSQL updated. New arrivals announced:

Title Major Contributors got:

  • Julien Rouhaud (VMWare) – well known to us from PGDay.Russia conferences;

  • Stacey Haysler (PGX);

  • Steve Singer (on his own).

Title Contributorsalso a lot of friends:

  • Ajin Cherian (not sure about pronunciation – Ajin Cherian, Fujitsu);

  • Alexander Kukushkin (Alexander Kukushkin, MS Citus);

  • Alexander Lakhin (Alexander Lakhin, now ??);

  • Dmitry Dolgov (Dmitry Dolgov, Zalando??);

  • Hou Zhijie (not sure – Hou Zhijie, Fujitsu);

  • Ilya Kosmodemyansky (Ilya Kosmodemiansky, Data Egret);

  • Melanie Plageman (Melanie Plageman, VMWare);

  • Michael Bank (Michael Banck, credit);

  • Michael Brewer (Michael Brewer, Franklin College of Arts and Sciences)

  • Paul Jungwirth (Paul Jungwirth, Illuminated Computing);

  • Peter Smith (Peter Smith, Fujitsu);

  • Vigneshwaran See (Vigneshwaran C, Fujitsu).

Fujitsu’s activity is impressive, isn’t it?

Money: Ligand Pro and Neon

The developer of the main Russian DBMS is building AI for half a billion, which speeds up the creation of drugs at times

CNews unveiled not only plans to create a startup called LigandPro (Ligand Pro LLC), but also the fact that 100 million rubles have already been invested in 2023. The platform will be built on deep machine learning and implemented in Python using libraries PyTorch And JAX.

Founders: Skoltech Dmitry Ivankov (general director, biologist, 18%), Ivan Oseledets (AI Specialist, 15%), Alexander Shapeev (applied mathematician, 12%), Marina Pak (0%) on one side, and Oleg Bartunov, Ivan Panchenko, Fedor Sigaev, Anton Sushkevich And Andrey Flute (all 10% each) from Postgres Professional.

Kommersant also wrote about this:

Artificial intelligence that generates substances unknown to man

Deputy CEO of PostgresProfessional Ivan Panchenko commented on this project: Ligand Pro is a socially significant domestic project that will have a positive impact on the image of the Russian scientific community on the world stage, give a new round to the development of the pharmaceutical industry and increase the availability of medicines for consumers. We are happy to help in this matter.

Neon raises $46 Million to advance serverless PostgreSQL database for the AI ​​era

We often wrote about the Neon startup, but we missed one significant news: in the first round, Neon had already raised about the same amount. All investments together already amount to $104 million. New portion received thanks to Menlo Venturesand participate in the financing Founders Fund, General Catalyst, GGV Capital, Khosla Ventures, Snowflake Ventures And Databricks Ventures.

In the news on VentureBeat There are also interesting technical details. Neon is counting on success in the clouds for developers, because they really need serverless databases. But another important direction of promotion is the bases for storing embedding vectors. Neon has its own development – extension pg_embedding: unlike the already popular extension pgvector, neon gives an efficient vector search and uses more modern algorithms – for example, the HNSW index. Gendir – Nikita Shamgunov (Nikita Shamgunov) – assures that he has nothing against pgvector, and the common enemy (ok, competitor) is Pinecone – Pinecone – and other modern bases specialized for vectors.

By the way, here’s a note Rauf Chebri (Raouf Chebri) and Daniel Price (Daniel Price) short and not all clear [??]although there is a visual diagram illustrating this rather peculiar architecture:

Introducing Same-Region Read Replicas to Serverless Postgres

It is clear that replicas in the Neon sense are not exactly replicas – data is not replicated in the way it is usually done in Postgres. They are just instances. Their job is to calculate requests, send and receive data, and the data itself is stored in storage managed by the Pageserver, and Savekeepers monitor consistency.

Education

PostgreSQL 14 Internals

English speaking PostgreSQL 14 Internals now available for print on demand at lulu.com. Links to book page. UPD: Yegor says that 38 books have already been ordered in less than a month.

1st and 2nd places are still ours.

  1. POSTGRES: The First Experience Author: Pavel Luzanov, Egor Rogov, Igor Levshin (translated by Liudmila Mantrova)

  2. PostgreSQL 14 Internals. Egor Rogov

  3. PostgreSQL – Architecture and notions advances. Guillaume Lelarge, Julien Rouhaud

The English course “Introduction to PostgreSQL” has been updated to the 15th version of the DBMS

Two day course Introduction to PostgreSQL updated to version 15 DBMS. The free program was developed by the Postgres Professional education department and includes introductory information on PostgreSQL. Get acquainted with courses in Russian you can on the site in the section Education.

At the same time:

10 Postgres tips for beginners

This is not just advice for beginners. These are the tips that Nikolay Samokhvalov (Postgres.ai, DBLab, postgres.fm And YouTube version). And this advice is not quite beginners. In a recent Friday discussion with Michael Christophides (Michael Christofides, pgMustard) They were bombarded with questions, so Nikolai decided to develop the topic.

In the preface, he philosophizes: PostgreSQL is not just another database. It is more than a base, it is a system with features that can change the way you work with data.

For each item there is an explanation of several paragraphs, where you can almost always find something non-trivial. And perhaps the most useful thing is not the short tips themselves, but the links under each to the corresponding Friday discussion. Here are the 10 points:

  1. Understand early on: tuples are physical versions of strings.

  2. EXPLAIN ANALYZE – always with BUFFERS!

  3. UI other than pgAdmin.

  4. Logging: set up ASAP.

  5. Observe what is happening (observability), using the pg_stat_statements extensions and others.

  6. Take advantage of database cloning (branching) during development using DBLab.

  7. Make sure checksums are enabled.

  8. Set autovacuum for fast performance.

  9. Adjust the configuration for query optimization.

  10. Index support: required experience.

Retro

The Import Substitution We Lost: Soviet Programming Languages ​​and Their Creators – Part 1 (1950s – 1960s)

This is not propaganda, but an interested presentation D.M. Smirnov from EAE-Consult is not known to everyone. In the 50s, this is a “programming program”, it is also PP-1, APL (Soviet, address programming language), Alexey Lyapunov, Ekaterina Yushchenko And Vladimir Korolyuk; in the 60s – Refal Valentina Turchina.

Does it have anything to do with databases? Indirect. But the same author has part II, which for some reason appeared before this part I:

Import substitution that we lost (part 2): APCS over telegraph cable, neural networks and RDBMS in the 60s:

In 1961, the first Soviet relational database management system was implemented on the Kyiv computer – “Autodirector“. The system made it possible to save data in the form of tables that were recorded on an electronic magnetic medium. The conceptual “father” of “Kyiv” was Sergey Alekseevich Lebedev.

Old JSON and “Json of the future” (i.e. pgvector)

Basics of JSON – could be attributed to the Education section, this is a tutorial with educational interactive examples, everything in the browser. Actually there is more JSONB than JSON. The topic is covered quite broadly – there are operators -> ->> and @@, and modern notation (like for arrays), and about btree and gin indexes. And the non-obvious effects of type conversion are explained. I really liked it, well done Crunchy Data.

JSON vs JSONB in ​​PostgreSQL

Inhabitant of Verona Francesco Tizieu (Francesco Tisiot, Aiven) decided to compare these types. What to compare? By almost all criteria, jsonb is more convenient. This also follows from the summary table at the end. Francesco has series of articles by JSON.

pgvector: Fewer dimensions are better

We have already written a benchmark for pgvector. Authors from Supabase write – Greg Richardson (Greg Richardson) Oliver Rice (Oliver Rice) Egor Romanov (Egor Romanov). And half a month earlier, Yegor wrote on this topic – pgvector 0.4.0 – With Pavel Borisov (Pavel Borisov, also at Supabase).

Yes, I forgot to explain: JSON of the future This from here.

Timescale joined the race:

PostgreSQL as a Vector Database: Create, Store, and Query OpenAI Embeddings With pgvector

They even announce Timescale Vector: The fastest ANN search for vector embeddings on PostgreSQL.

Clouders include pgvector in their packages. In sequence:

(NO)VACUUM

PostgreSQL: No More VACUUM, No More Bloat

This is about OrioleDB writes Alexander Korotkov. What’s more, this boisterously marketing article has the following subtitle: The Engine of the Future. But there are some details in rough strokes: to solve the problems of swelling, the “engine of the future” implemented undo-logs both at the line level and at the block level. In addition, there is an automatic merging of pages.

An interesting and “critical” part of the article, where Alexander describes the horrors of the vacuum (The Terrifying Tale of VACUUM). Alexander mentions failures due to vacuum and an article Rick Branson 10 things I Hate In Postgres (alternative link).

We wrote about it in Postgresso 31 – there is a whole section self-criticism. This issue was also discussed at unforgettable ruPostgres.Tuesdays Nikolay Samokhvalov And Ilya Kosmodemyansky. Then the guest was Andrey Zubkovand Tuesday is called – s02e15: Ten problems with PostgreSQL. Request monitoring, pg_profile.

The article is very famous: even earlier, in Postgreso 20, we wrote about this article, but then we highlighted another gloomy conclusion: process scalability is getting better and better with each major release. But in the end there is a hard performance limit due to the architecture of Postgres compared to the same MySQL with its one thread per connection – a topic that is now being actively discussed (and we wrote about this in the last issue).

Vacuum is such a sore subject that pganalyze even provides a special tool that suggests optimal settings at the level of individual tables:

Introducing the New pganalyze VACUUM Advisor for Postgres

A Robert Haas (Robert Haas) in the article Updating the Cost Limit On The Fly tells how now, after patch Melanie Plageman (Melanie Plageman) Autovacuum can be affected by cost settings.

Migration

Migrating to PostgreSQL, testing OLAP cubes and understanding T-SQL code validation: three reports from YuMoney mitap

Video recordings and main thoughts – based on the April High SQL online meetup on working with data in DWH on Microsoft SQL Server.

  • Stanislav Flusov (from Monopoly Online) shared his company’s three years of experience migrating applications from MS SQL to PostgreSQL.

  • Artyom Korshunov (YMoney) talked about BI tools, compared two TSQL parsers (ANTLR and DacFx) and showed how to write a simple validator.

  • Lena Sukhikh (YMoney) told how her team tested OLAP cubes and what came of it

My Oracle to PostgreSQL Migration: The 7 Tools That Made It Possible

Here they are, these seven, about which he writes Ouded Waylin (Oded Valin, EverSQL):

  1. Ora2Pg

  2. AWS Database Migration Service (DMS)

  3. pgLoader

  4. Foreign Data Wrappers (FDW)

  5. pg_dump, pg_restore and additional built-in PG features

  6. EverSQL, for Post Migration Tuning

  7. Npgsql

For each item there is Behind And Against.

In addition, the troubles of migration are discussed:

  1. Differences in SQL Syntax and Functionality

  2. Transaction Behavior

  3. Case Sensitivity

  4. Sequences and Auto-Incrementing

  5. 5. Data Types

  6. NULLs and empty strings

  7. Date and Time Types

Ora2Pg 24.0

Ora2pg is no longer quite Ora: this version officially adds the ability to migrate and SQL Server. You can download from github Gilles Darol (Gilles Darold).

PostGIS and cyclists

PostGIS, ArcGIS Enterprise and the Tour de France route

Florian Nadler (Florian Nadler, Cybertec) uses PostgreSQL 14.5 + PostGIS 3.2, Ubuntu 22.04.02 and ESRI ArcGIS Pro 3.1.2 to plot the race route. And also ArcGIS Pro ST_Geometry Libraries (PostgreSQL) with myesri.com. Well, the main source – Tour de France 2023: GPX (cyclingstage.com). The author is very concerned about the stages, where there are large and steep ascents.

Florian, by the way, also published an article on air travel and MobilityDB: Analyzing historical flight data with MobilityDB. We are not indifferent to this topic: demo base our Department of Education is also aviation. And MobilityDB was discussed more than once in reports at PGConf.Russia – Mobility Data Management with MobilityDB (2021)For example.

Today’s latest version of PostGIS 3.4.0rc1.

Pyatnetsa #11 Buza and ProtoPyatnetsy Sean

Ryan Booz well done: does not let you forget about your idea. According to him PGSQL Phridays calendar the queue was Tomasz Gintovt (Tomasz Gintowt, Real-Time Data Team) and he suggested a very interesting and relevant topic: #011 – partitioning vs sharding in PostgreSQL.

So many people responded. I won’t repeat it, here are the links:

Ryan is great, but every time I somehow even feel a little sorry for Shaun Thomas (now Postgres High Availability Performance Architect, EDB). After all, his Fridays were also good. For example PG Phriday: Around the World in Two Billion Transactions. Or:

The Do’s and Don’ts of Postgres High Availability Part 1: Great Expectations; Part 2: Architecture Baseline; Part 3: Tools Rules.

And here it is in 2022:

PG Phriday: Do’s and Don’ts of Postgres High Availability Q&A And PG Phriday: Defining High Availability in a Postgres World.

And this is in 2021:

High Availability, PostgreSQL, and more: The Return of PG Phriday – the revival of his Pyatnets within the EDB.

That’s all for today.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *