What's new in the planner

PostgreSQL 16 brings many improvements to the query planner and allows many SQL queries to be executed faster than in previous versions of PostgreSQL.

If you look at PG16 release notes, you'll see some of these scheduler improvements. But due to the volume of changes made in each release of PostgreSQL, it is impossible to provide sufficient detail about each change.

In this post you will get an in-depth understanding of 10 improvements made to the PostgreSQL 16 query planner. For each of the improvements, there will be comparisons of the PG15 and PG16 scheduler output, as well as examples of what has changed, as a standalone test that you can try for yourself.

1. Allow incremental sorting in more cases, including DISTINCT (David Rowley)

Have the planner consider Incremental Sort for DISTINCT

Incremental sorts were first added in PostgreSQL 13. They reduce the effort required to produce sorted results. How? Using the knowledge that a given result set is already sorted on 1 or more leading columns, and sorting only on the remaining columns.

For example, if the column has a btree index a and we need strings ordered by a Andbthen we can use the btree index (which provides pre-sorted results by column a) and sort the scanned rows only when the value changes a. Thanks to the quicksort algorithm, sorting many small groups is more efficient than sorting one large group.

The PostgreSQL 16 query scheduler now provides incremental sorting for SELECT DISTINCT requests. Up to PG16 when selecting sort method for queries SELECT DISTINCT the scheduler only considered performing a full sort (which is more expensive than incremental sorting).

-- Setup
CREATE TABLE distinct_test (a INT, b INT);
INSERT INTO distinct_test
SELECT x,1 FROM generate_series(1,1000000)x;
CREATE INDEX on distinct_test(a);
VACUUM ANALYZE distinct_test;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT DISTINCT a,b FROM distinct_test;

PG15 EXPLAIN output

                          QUERY PLAN
---------------------------------------------------------------
 HashAggregate (actual rows=1000000 loops=1)
   Group Key: a, b
   Batches: 81  Memory Usage: 11153kB  Disk Usage: 31288kB
   ->  Seq Scan on distinct_test (actual rows=1000000 loops=1)
 Planning Time: 0.065 ms
 Execution Time: 414.226 ms
(6 rows)

PG16 EXPLAIN output

                          QUERY PLAN
------------------------------------------------------------------
 Unique (actual rows=1000000 loops=1)
   ->  Incremental Sort (actual rows=1000000 loops=1)
         Sort Key: a, b
         Presorted Key: a
         Full-sort Groups: 31250  Sort Method: quicksort  Average Memory: 26kB  Peak Memory: 26kB
         ->  Index Scan using distinct_test_a_idx on distinct_test (actual rows=1000000 loops=1)
 Planning Time: 0.108 ms
 Execution Time: 263.167 ms
(8 rows)

In the output of PostgreSQL 16 EXPLAIN it is clear that the scheduler decided to use the index distinct_test_a_idx for column aand then executed Incremental Sortto sort all equal values a By b. This indicates Presorted Key: a. Since the above statements INSERT added only one value b for each value aeach group of tuples sorted using incremental sort contains only one row.

Conclusion EXPLAIN for PostgreSQL 16 shows that Peak Memory For Incremental Sort was only 26kB, while the hashing method used by PostgreSQL 15 required a lot of memory, so much so that 31288kB had to be loaded onto disk. Query execution in PostgreSQL 16 is 63% faster.

2. Added the ability for aggregates that have ORDER BY or DISTINCT to use pre-sorted data (David Rowley)

Improve performance of ORDER BY / DISTINCT aggregates

In PostgreSQL 15 and earlier, aggregate functions containing ORDER BY or DISTINCTled to the fact that the executor always performed sorting inside Aggregate. Because sorting was always performed, the scheduler never attempted to generate a plan to provide pre-sorted input to concatenate the rows in order.

The PostgreSQL 16 scheduler now attempts to generate a plan that passes rows to Aggregate in the right order. And the performer is now smart enough to realize this and refuse to perform the sort when the rows are already pre-sorted in the correct order.

-- Setup
CREATE TABLE aggtest (a INT, b text);
INSERT INTO aggtest SELECT a,md5((b%100)::text) FROM generate_series(1,10) a, generate_series(1,100000)b;
CREATE INDEX ON aggtest(a,b);
VACUUM FREEZE ANALYZE aggtest;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, BUFFERS)
SELECT a,COUNT(DISTINCT b) FROM aggtest GROUP BY a;

PG15 EXPLAIN output

                          QUERY PLAN
---------------------------------------------------------------
 GroupAggregate (actual rows=10 loops=1)
   Group Key: a
   Buffers: shared hit=892, temp read=4540 written=4560
   ->  Index Only Scan using aggtest_a_b_idx on aggtest (actual rows=1000000 loops=1)
         Heap Fetches: 0
         Buffers: shared hit=892
 Planning Time: 0.122 ms
 Execution Time: 302.693 ms
(8 rows)

PG16 EXPLAIN output

                          QUERY PLAN
---------------------------------------------------------------
 GroupAggregate (actual rows=10 loops=1)
   Group Key: a
   Buffers: shared hit=892
   ->  Index Only Scan using aggtest_a_b_idx on aggtest (actual rows=1000000 loops=1)
         Heap Fetches: 0
         Buffers: shared hit=892
 Planning Time: 0.061 ms
 Execution Time: 115.534 ms
(8 rows)

Besides PostgreSQL 16 running queries twice as fast as PG15, the only indication of this change is EXPLAIN ANALYZE is temp read=4540 written=4560which is not present in PostgreSQL 16. In PG15 this is because implicit sorting is moved to disk.

3. Allow memoize in UNION ALL (Richard Guo)

Enable use of Memoize atop an Append that came from UNION ALL

Plan nodes Memoize were first introduced in PostgreSQL 14. Memoize acts as a cache layer between the parameterized Nested Loop and the inside of a nested loop. When the same value needs to be looked up multiple times, Memoize can provide a nice performance boost because it can skip execution of its subnode if the desired rows have already been requested and are in the cache.

PostgreSQL 16 query planner will now take into account usage Memoizewhen request UNION ALL appears on the inside of the parameterized Nested Loop.

-- Setup
CREATE TABLE t1 (a INT PRIMARY KEY);
CREATE TABLE t2 (a INT PRIMARY KEY);
CREATE TABLE lookup (a INT);

INSERT INTO t1 SELECT x FROM generate_Series(1,10000) x;
INSERT INTO t2 SELECT x FROM generate_Series(1,10000) x;
INSERT INTO lookup SELECT x%10+1 FROM generate_Series(1,1000000)x;

ANALYZE t1,t2,lookup;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t2) t
INNER JOIN lookup l ON l.a = t.a;

PG15 EXPLAIN output

                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop (actual rows=2000000 loops=1)
   ->  Seq Scan on lookup l (actual rows=1000000 loops=1)
   ->  Append (actual rows=2 loops=1000000)
         ->  Index Only Scan using t1_pkey on t1 (actual rows=1 loops=1000000)
               Index Cond: (a = l.a)
               Heap Fetches: 1000000
         ->  Index Only Scan using t2_pkey on t2 (actual rows=1 loops=1000000)
               Index Cond: (a = l.a)
               Heap Fetches: 1000000
 Planning Time: 0.223 ms
 Execution Time: 1926.151 ms
(11 rows)

PG16 EXPLAIN output

                                   QUERY PLAN
---------------------------------------------------------------------------------
 Nested Loop (actual rows=2000000 loops=1)
   ->  Seq Scan on lookup l (actual rows=1000000 loops=1)
   ->  Memoize (actual rows=2 loops=1000000)
         Cache Key: l.a
         Cache Mode: logical
         Hits: 999990  Misses: 10  Evictions: 0  Overflows: 0  Memory Usage: 2kB
         ->  Append (actual rows=2 loops=10)
               ->  Index Only Scan using t1_pkey on t1 (actual rows=1 loops=10)
                     Index Cond: (a = l.a)
                     Heap Fetches: 10
               ->  Index Only Scan using t2_pkey on t2 (actual rows=1 loops=10)
                     Index Cond: (a = l.a)
                     Heap Fetches: 10
 Planning Time: 0.229 ms
 Execution Time: 282.120 ms
(15 rows)

In PostgreSQL 16 EXPLAIN you can see that Memoize placed on top Appendwhich led to a reduction in the number loops V Append from 1 million in PG15 to 10 in PG16. Every time when Memoize goes into cache, no need to execute Append to retrieve records. This leads to Query execution in PostgreSQL 16 is approximately 6 times faster.

4. Allow anti-join with non-zero input as internal relation (Richard Guo)

Support “Right Anti Join” plan shapes

By doing Hash Join For INNER JOIN PostgreSQL prefers to create a hash table based on the smaller of the two tables. Smaller hash tables are better because they require less effort to create. Smaller tables are also better to use because they are more cache-friendly for the CPU, and the CPU is less likely to stall waiting for data to arrive from main memory.

Up to PostgreSQL 16, in Anti Join table mentioned in NOT EXISTS, was always placed in the inner part of the union. This meant that there was no flexibility to hash the smaller of the two tables, possibly resulting in the need to create a hash table for the larger table.

The PostgreSQL 16 query planner may choose to hash the smaller of the two tables. This can now be done since PostgreSQL 16 supports Right Anti Join.

-- Setup
CREATE TABLE small(a int);
CREATE TABLE large(a int);
INSERT INTO small
SELECT a FROM generate_series(1,100) a;
INSERT INTO large
SELECT a FROM generate_series(1,1000000) a;
VACUUM ANALYZE small,large;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM small s
WHERE NOT EXISTS(SELECT 1 FROM large l WHERE s.a = l.a);

PG15 EXPLAIN output

                          QUERY PLAN
---------------------------------------------------------------
 Hash Anti Join (actual rows=0 loops=1)
   Hash Cond: (s.a = l.a)
   ->  Seq Scan on small s (actual rows=100 loops=1)
   ->  Hash (actual rows=1000000 loops=1)
         Buckets: 262144  Batches: 8  Memory Usage: 6446kB
         ->  Seq Scan on large l (actual rows=1000000 loops=1)
 Planning Time: 0.103 ms
 Execution Time: 139.023 ms
(8 rows)

PG16 EXPLAIN output

                        QUERY PLAN
-----------------------------------------------------------
 Hash Right Anti Join (actual rows=0 loops=1)
   Hash Cond: (l.a = s.a)
   ->  Seq Scan on large l (actual rows=1000000 loops=1)
   ->  Hash (actual rows=100 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 12kB
         ->  Seq Scan on small s (actual rows=100 loops=1)
 Planning Time: 0.094 ms
 Execution Time: 77.076 ms
(8 rows)

Due to the fact that the PG16 scheduler decided to use Hash Right Anti Join, Memory Usage PostgreSQL 16 is much smaller than PostgreSQL 15, and Execution Time almost halved.

5. Allow parallelization of FULL and RIGHT OUTER hash joins (Melanie Plageman, Thomas Munro)

Parallel Hash Full Join

In PostgreSQL 11 appeared Parallel Hash Join. This allows multiple parallel processors in a parallel query to help create a single hash table. In versions prior to 11, each handler created its own identical hash table, which resulted in additional memory overhead.

Improved in PostgreSQL 16 Parallel Hash Joinnow it supports FULL And RIGHT types of connections. This allows you to execute queries in parallel that have FULL OUTER JOINand also execute plans in parallel Right Joins.

-- Setup
CREATE TABLE odd (a INT);
CREATE TABLE even (a INT);
INSERT INTO odd
SELECT a FROM generate_series(1,1000000,2) a;
INSERT INTO even
SELECT a FROM generate_series(2,1000000,2) a;
VACUUM ANALYZE odd, even;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT COUNT(o.a),COUNT(e.a) FROM odd o FULL JOIN even e ON o.a = e.a;

PG15 EXPLAIN output

                            QUERY PLAN
-------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Hash Full Join (actual rows=1000000 loops=1)
         Hash Cond: (o.a = e.a)
         ->  Seq Scan on odd o (actual rows=500000 loops=1)
         ->  Hash (actual rows=500000 loops=1)
               Buckets: 262144  Batches: 4  Memory Usage: 6439kB
               ->  Seq Scan on even e (actual rows=500000 loops=1)
 Planning Time: 0.079 ms
 Execution Time: 220.677 ms
(9 rows)

PG16 EXPLAIN output

                                    QUERY PLAN
--------------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   ->  Gather (actual rows=2 loops=1)
         Workers Planned: 1
         Workers Launched: 1
         ->  Partial Aggregate (actual rows=1 loops=2)
               ->  Parallel Hash Full Join (actual rows=500000 loops=2)
                     Hash Cond: (o.a = e.a)
                     ->  Parallel Seq Scan on odd o (actual rows=250000 loops=2)
                     ->  Parallel Hash (actual rows=250000 loops=2)
                           Buckets: 262144  Batches: 4  Memory Usage: 6976kB
                           ->  Parallel Seq Scan on even e (actual rows=250000 loops=2)
 Planning Time: 0.161 ms
 Execution Time: 129.769 ms
(13 rows)

PostgreSQL 16 was able to perform the join in parallel, which led to a significant reduction Execution Time.

6. Allow window functions to use faster ROWS mode when RANGE mode is active but not needed (David Rowley)

Allow window functions to adjust their frameOptions

When a request contains a window function such as row_number(), rank(), dense_rank(), percent_rank() cume_dist(), ntile() and, if the condition does not specify a parameter ROWSthen PostgreSQL will always use the parameter RANGE default.

Option RANGE forces the performer to look ahead until he finds the first “non-peer” line. Peer string is a string in the window frame that is compared equally according to ORDER BY in window expression. If ORDER BY is missing, all rows in the window frame are peers. When processing records containing many rows that are sorted the same according to ORDER BY in a window, the additional processing to determine these peer rows can be expensive.

The window functions mentioned above work unchanged, regardless of whether ROWS or RANGE. However, the executor in versions of PostgreSQL before 16 did not know this, and since for some window functions is a really important option ROWS/RANGEthe executor had to perform peer string checks in all cases.

PostgreSQL 16 query planner knows which window functions need an option ROWS/RANGEand passes this information to the executor so that he can skip unnecessary additional processing.

This optimization works especially well when row_number() is used to limit the number of results in a query, as shown in the example below.

-- Setup
CREATE TABLE scores (id INT PRIMARY KEY, score INT);
INSERT INTO scores SELECT s,random()*10 FROM generate_series(1,1000000)s;
CREATE INDEX ON scores(score);
VACUUM ANALYZE scores;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM (
    SELECT id,ROW_NUMBER() OVER (ORDER BY score) rn,score
    FROM scores
) m WHERE rn <= 10;

PG15 EXPLAIN output

                                 QUERY PLAN
-------------------------------------------------------------------------------
 WindowAgg (actual rows=10 loops=1)
   Run Condition: (row_number() OVER (?) <= 10)
   ->  Index Scan using scores_score_idx on scores (actual rows=50410 loops=1)
 Planning Time: 0.096 ms
 Execution Time: 29.775 ms
(5 rows)

PG16 EXPLAIN output

                                 QUERY PLAN
----------------------------------------------------------------------------
 WindowAgg (actual rows=10 loops=1)
   Run Condition: (row_number() OVER (?) <= 10)
   ->  Index Scan using scores_score_idx on scores (actual rows=11 loops=1)
 Planning Time: 0.191 ms
 Execution Time: 0.058 ms
(5 rows)

Index Scan in PG15 shows that 50410 lines had to be read from scores_score_idx index before stopping execution. In PostgreSQL 16, only 11 rows were read because the executor realized that once the row_number reached 11, there would be no more rows matching <= 10 condition. On PostgreSQL 16 this query is over 500 times faster.

7. Optimizing ever-increasing window functions ntile(), cume_dist() and percent_rank() (David Rowley)

Teach planner about more monotonic window functions

In PG15, the query scheduler was changed to allow the executor to terminate processing early WindowAgg nodes This can be done when the element is in WHERE filters the window function such that once a condition becomes false, it will never be true again.

row_number() is an example of a function that can give such guarantees, since it is a monotonically increasing function, meaning subsequent rows in the same section will never have a row_number less than the previous row.

PostgreSQL Query Planner 16 expands the scope of this optimization to also include ntile(), cume_dist() And percent_rank(). In PostgreSQL 15 this only worked for row_number(), rank(), dense_rank() count() And count(*).

-- Setup
CREATE TABLE marathon (id INT PRIMARY KEY, time INTERVAL NOT NULL);
INSERT INTO marathon
SELECT id,'03:00:00'::interval + (CAST(RANDOM() * 3600 AS INT) || 'secs')::INTERVAL - (CAST(RANDOM() * 3600 AS INT) || ' secs')::INTERVAL
FROM generate_series(1,50000) id;
CREATE INDEX ON marathon (time);
VACUUM ANALYZE marathon;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM (SELECT *,percent_rank() OVER (ORDER BY time) pr
FROM marathon) m WHERE pr <= 0.01;

PG15 EXPLAIN output

                              QUERY PLAN
-----------------------------------------------------------------------
 Subquery Scan on m (actual rows=500 loops=1)
   Filter: (m.pr <= '0.01'::double precision)
   Rows Removed by Filter: 49500
   ->  WindowAgg (actual rows=50000 loops=1)
         ->  Index Scan using marathon_time_idx on marathon (actual rows=50000 loops=1)
 Planning Time: 0.108 ms
 Execution Time: 84.358 ms
(7 rows)

PG16 EXPLAIN output

                              QUERY PLAN
-----------------------------------------------------------------------
 WindowAgg (actual rows=500 loops=1)
   Run Condition: (percent_rank() OVER (?) <= '0.01'::double precision)
   ->  Index Scan using marathon_time_idx on marathon (actual rows=50000 loops=1)
 Planning Time: 0.180 ms
 Execution Time: 19.454 ms
(5 rows)

From the output of PostgreSQL 16 EXPLAIN it can be seen that the scheduler was able to use the condition pr <= 0.01 How Run Conditionwhereas in PostgreSQL 15 this sentence appeared as Filter in a subquery. In PG16, the execution condition was used to terminate execution early WindowAgg node. As a result Execution Time V PG16 was more than 4 times fasterthan in PG15.

8. Allow deleting left joins and unique joins in partitioned tables (Arne Roland)

Allow left join removals and unique joins on partitioned tables

For a long time PostgreSQL has been able to delete LEFT JOINwhen the query did not require any columns from the left-joined table, and the join could not duplicate any rows.

However, versions prior to PostgreSQL 16 did not have support for deleting left joins in partitioned tables. Why? Because the evidence that the planner uses to determine the possibility that any inner row can duplicate any outer row was missing for partitioned tables.

PostgreSQL 16 query planner now allows for delete optimization LEFT JOIN in partitioned tables.

This join-elimination optimization will help when working with views, since it is often the case that not all columns that exist in a view are always queried.

-- Setup
CREATE TABLE part_tab (id BIGINT PRIMARY KEY, payload TEXT) PARTITION BY HASH(id);
CREATE TABLE part_tab_p0 PARTITION OF part_tab FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE part_tab_p1 PARTITION OF part_tab FOR VALUES WITH (modulus 2, remainder 1);
CREATE TABLE normal_table (id INT, part_tab_id BIGINT);

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT nt.* FROM normal_table nt LEFT JOIN part_tab pt ON nt.part_tab_id = pt.id;

PG15 EXPLAIN output

                            QUERY PLAN
-------------------------------------------------------------------
 Merge Right Join (actual rows=0 loops=1)
   Merge Cond: (pt.id = nt.part_tab_id)
   ->  Merge Append (actual rows=0 loops=1)
         Sort Key: pt.id
         ->  Index Only Scan using part_tab_p0_pkey on part_tab_p0 pt_1 (actual rows=0 loops=1)
               Heap Fetches: 0
         ->  Index Only Scan using part_tab_p1_pkey on part_tab_p1 pt_2 (actual rows=0 loops=1)
               Heap Fetches: 0
   ->  Sort (actual rows=0 loops=1)
         Sort Key: nt.part_tab_id
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on normal_table nt (actual rows=0 loops=1)
 Planning Time: 0.325 ms
 Execution Time: 0.037 ms
(14 rows)

PG16 EXPLAIN output

                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on normal_table nt (actual rows=0 loops=1)
 Planning Time: 0.244 ms
 Execution Time: 0.015 ms
(3 rows)

It's important to note that the PostgreSQL 16 plan does not include joining part_tabwhich means all you need to do is scan normal_table.

9. Use Limit instead of Unique to implement DISTINCT whenever possible (David Rowley)

Use Limit instead of Unique to implement DISTINCT, when possible

The PostgreSQL query planner may not enable scheduling nodes to remove duplicate results if it detects that all rows contain the same value. This is not difficult to detect, and optimization can lead to huge performance gains.

-- Setup
CREATE TABLE abc (a int, b int, c int);
INSERT INTO abc SELECT a%10,a%10,a%10 FROM generate_series(1,1000000)a;
VACUUM ANALYZE abc;

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT DISTINCT a,b,c FROM abc WHERE a = 5 AND b = 5 AND c = 5;

PG15 EXPLAIN output

                               QUERY PLAN
------------------------------------------------------------------------
 Unique (actual rows=1 loops=1)
   ->  Gather (actual rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Unique (actual rows=1 loops=3)
               ->  Parallel Seq Scan on abc (actual rows=33333 loops=3)
                     Filter: ((a = 5) AND (b = 5) AND (c = 5))
                     Rows Removed by Filter: 300000
 Planning Time: 0.114 ms
 Execution Time: 30.381 ms
(10 rows)

PG16 EXPLAIN output

                    QUERY PLAN
---------------------------------------------------
 Limit (actual rows=1 loops=1)
   ->  Seq Scan on abc (actual rows=1 loops=1)
         Filter: ((a = 5) AND (b = 5) AND (c = 5))
         Rows Removed by Filter: 4
 Planning Time: 0.109 ms
 Execution Time: 0.025 ms
(6 rows)

If you look closely at the SQL query, you will notice that each column in DISTINCT also contains the equality condition in WHERE. This means that all output rows in the query will have the same values ​​in each column. The PostgreSQL 16 query planner can take advantage of this knowledge and simply use LIMIT convert query results into 1 line. PostgreSQL 15 produced the same query result by reading all the results in their entirety and using the operator Uniqueto reduce all lines to one line. Execution Time for PostgreSQL 16 was more than 1200 times fasterthan for PostgreSQL 15.

10. Relaxed too strict rules in select_outer_pathkeys_for_merge() (David Rowley)

Relax overly strict rules in select_outer_pathkeys_for_merge()

Before PostgreSQL 16, when the query planner considered doing Merge Joinit checked whether the merge sort order matches any top-level plan operation (such as DISTINCT, GROUP BY or ORDER BY), and used this order only if it exactly met the requirements for the top level. This option has been slightly deprecated since for these top level operations you can use Incremental Sortsand incremental sorts can take advantage of results that are pre-sorted by only some of the leading columns on which the results need to be sorted.

PostgreSQL 16 query planner has changed the rule used when considering order Merge JoinWith “the order of the lines must match exactly” on “there must be at least 1 leading column, properly ordered“. This allows the scheduler to use Incremental Sorts to put the rows in the correct order to perform top-level operations. We learned earlier in this blog that incremental sort, when possible, requires less work than full sort because incremental sort can use partially sorted input and perform the sort in smaller chunks, resulting in less memory consumption and fewer comparisons overall .

-- Setup

CREATE TABLE a (a INT, b INT);
CREATE TABLE b (x INT, y INT);
INSERT INTO a SELECT a,a FROM generate_series(1,1000000) a;
INSERT INTO b SELECT a,a FROM generate_series(1,1000000) a;
VACUUM ANALYZE a, b;

SET enable_hashjoin=0;
SET max_parallel_workers_per_gather=0;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT a,b,count(*) FROM a INNER JOIN b ON a.a = b.x GROUP BY a,b ORDER BY a DESC, b;

PG15 EXPLAIN output

                                QUERY PLAN
---------------------------------------------------------------------------
 GroupAggregate (actual rows=1000000 loops=1)
   Group Key: a.a, a.b
   ->  Sort (actual rows=1000000 loops=1)
         Sort Key: a.a DESC, a.b
         Sort Method: external merge  Disk: 17664kB
         ->  Merge Join (actual rows=1000000 loops=1)
               Merge Cond: (a.a = b.x)
               ->  Sort (actual rows=1000000 loops=1)
                     Sort Key: a.a
                     Sort Method: external merge  Disk: 17664kB
                     ->  Seq Scan on a (actual rows=1000000 loops=1)
               ->  Materialize (actual rows=1000000 loops=1)
                     ->  Sort (actual rows=1000000 loops=1)
                           Sort Key: b.x
                           Sort Method: external merge  Disk: 11768kB
                           ->  Seq Scan on b (actual rows=1000000 loops=1)
 Planning Time: 0.175 ms
 Execution Time: 1010.738 ms
(18 rows)

PG16 EXPLAIN output

                                QUERY PLAN
---------------------------------------------------------------------------
 GroupAggregate (actual rows=1000000 loops=1)
   Group Key: a.a, a.b
   ->  Incremental Sort (actual rows=1000000 loops=1)
         Sort Key: a.a DESC, a.b
         Presorted Key: a.a
         Full-sort Groups: 31250  Sort Method: quicksort  Average Memory: 26kB  Peak Memory: 26kB
         ->  Merge Join (actual rows=1000000 loops=1)
               Merge Cond: (a.a = b.x)
               ->  Sort (actual rows=1000000 loops=1)
                     Sort Key: a.a DESC
                     Sort Method: external merge  Disk: 17672kB
                     ->  Seq Scan on a (actual rows=1000000 loops=1)
               ->  Materialize (actual rows=1000000 loops=1)
                     ->  Sort (actual rows=1000000 loops=1)
                           Sort Key: b.x DESC
                           Sort Method: external merge  Disk: 11768kB
                           ->  Seq Scan on b (actual rows=1000000 loops=1)
 Planning Time: 0.140 ms
 Execution Time: 915.589 ms
(19 rows)

In PG16 EXPLAIN you can see what has been used Incremental Sort (compared to PG15 which used instead Sort), and this led to a slight reduction Execution Time in PG16 and a significant reduction in the amount of memory used to perform sorting.

Conclusion

PostgreSQL 16 has had a lot of engineering work done to improve the query planner by many engineers around the world.

Each of 10 scheduler improvements PostgreSQL 16described above, is enabled by default – and is either applied in all cases where optimization is possible, or applied selectively by the query planner when it believes that optimization will help.

If you are using an older version of PostgreSQL, I recommend that you try your workload on PostgreSQL 16 to see which of your queries run faster. And, as always, feedback on actual use of PostgreSQL is welcome on the pgsql-general@postgresql.org mailing list – you don't have to write only about problems, you can always share positive experiences.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *