What's new in the planner
PostgreSQL 16 brings many improvements to the query planner, allowing many SQL queries to run faster than in previous versions of PostgreSQL.
If you look at the PG16 release notes, you'll see some of these planner improvements listed. But given the volume of changes in each PostgreSQL release, it is impossible to describe each change in much detail.
In this post you will get an in-depth look at 10 improvements made to the PostgreSQL 16 query planner. For each improvement there is a comparison of PG15 and PG16 planner output, as well as an example of what changed, written as a standalone test you can try for yourself.
1. Allow incremental sorting in more cases, including DISTINCT (David Rowley)
Have the planner consider Incremental Sort for DISTINCT
Incremental sorts were first added in PostgreSQL 13. They reduce the effort required to produce sorted results by using the knowledge that a given result set is already sorted on one or more leading columns, and sorting only on the remaining columns.
For example, if there is a btree index on column a and we need rows ordered by a, b, then we can use the btree index (which provides results presorted by a) and sort the scanned rows by b only each time the value of a changes. Because of how quicksort behaves, sorting many small groups is more efficient than sorting one large group.
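As a rough sketch of this idea (the table name t and the data are illustrative, not part of the test cases below), an index on the leading sort column is enough for the planner to consider an Incremental Sort:

```sql
-- Illustrative only: a btree index on t(a) provides rows presorted by a,
-- so ORDER BY a, b only needs to sort each group of equal "a" values by b.
CREATE TABLE t (a INT, b INT);
CREATE INDEX ON t (a);
INSERT INTO t SELECT x % 100, x FROM generate_series(1, 10000) x;
ANALYZE t;

-- The plan can use an Index Scan followed by an Incremental Sort
-- showing "Presorted Key: a", rather than a full sort on (a, b).
EXPLAIN SELECT * FROM t ORDER BY a, b;
```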
The PostgreSQL 16 query planner now considers incremental sort for SELECT DISTINCT queries. Before PG16, when choosing a sort method for SELECT DISTINCT, the planner only considered performing a full sort (which is more expensive than an incremental sort).
-- Setup
CREATE TABLE distinct_test (a INT, b INT);
INSERT INTO distinct_test
SELECT x,1 FROM generate_series(1,1000000)x;
CREATE INDEX on distinct_test(a);
VACUUM ANALYZE distinct_test;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT DISTINCT a,b FROM distinct_test;
PG15 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------
 HashAggregate (actual rows=1000000 loops=1)
   Group Key: a, b
   Batches: 81  Memory Usage: 11153kB  Disk Usage: 31288kB
   ->  Seq Scan on distinct_test (actual rows=1000000 loops=1)
 Planning Time: 0.065 ms
 Execution Time: 414.226 ms
(6 rows)
PG16 EXPLAIN output
QUERY PLAN
------------------------------------------------------------------
 Unique (actual rows=1000000 loops=1)
   ->  Incremental Sort (actual rows=1000000 loops=1)
         Sort Key: a, b
         Presorted Key: a
         Full-sort Groups: 31250  Sort Method: quicksort  Average Memory: 26kB  Peak Memory: 26kB
         ->  Index Scan using distinct_test_a_idx on distinct_test (actual rows=1000000 loops=1)
 Planning Time: 0.108 ms
 Execution Time: 263.167 ms
(8 rows)
In the PostgreSQL 16 EXPLAIN output, it is clear that the planner chose to scan the distinct_test_a_idx index to obtain rows presorted by column a, and then performed an Incremental Sort to order rows with equal values of a by b. This is indicated by Presorted Key: a. Since the INSERT statement above added only one value of b for each value of a, every group of tuples sorted by the incremental sort contains just one row.
The PostgreSQL 16 EXPLAIN output shows that Peak Memory for the Incremental Sort was only 26kB, while the hashing method used by PostgreSQL 15 needed so much memory that 31288kB had to be spilled to disk. Query execution in PostgreSQL 16 is about 1.6 times faster (263 ms vs 414 ms).
2. Added the ability for aggregates that have ORDER BY or DISTINCT to use pre-sorted data (David Rowley)
Improve performance of ORDER BY / DISTINCT aggregates
In PostgreSQL 15 and earlier, aggregate functions containing an ORDER BY or DISTINCT clause caused the executor to always perform a sort inside the Aggregate node. Because that sort was always performed, the planner never tried to generate a plan that provided presorted input for aggregating the rows in order.
The PostgreSQL 16 planner now tries to generate a plan that feeds rows to the Aggregate node in the required order. And the executor is now smart enough to recognize this and skip the sort when the rows are already presorted in the correct order.
-- Setup
CREATE TABLE aggtest (a INT, b text);
INSERT INTO aggtest SELECT a,md5((b%100)::text) FROM generate_series(1,10) a, generate_series(1,100000)b;
CREATE INDEX ON aggtest(a,b);
VACUUM FREEZE ANALYZE aggtest;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, BUFFERS)
SELECT a,COUNT(DISTINCT b) FROM aggtest GROUP BY a;
PG15 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------
 GroupAggregate (actual rows=10 loops=1)
   Group Key: a
   Buffers: shared hit=892, temp read=4540 written=4560
   ->  Index Only Scan using aggtest_a_b_idx on aggtest (actual rows=1000000 loops=1)
         Heap Fetches: 0
         Buffers: shared hit=892
 Planning Time: 0.122 ms
 Execution Time: 302.693 ms
(8 rows)
PG16 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------
 GroupAggregate (actual rows=10 loops=1)
   Group Key: a
   Buffers: shared hit=892
   ->  Index Only Scan using aggtest_a_b_idx on aggtest (actual rows=1000000 loops=1)
         Heap Fetches: 0
         Buffers: shared hit=892
 Planning Time: 0.061 ms
 Execution Time: 115.534 ms
(8 rows)
Besides PostgreSQL 16 running the query more than twice as fast as PG15, the only indication of this change in the EXPLAIN ANALYZE output is the temp read=4540 written=4560 line, which is absent in PostgreSQL 16. It appears in PG15 because the implicit sort spilled to disk.
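The same presorted input helps aggregates with an explicit ORDER BY as well. As a sketch reusing the aggtest table from the setup above (the array_agg query is my illustration, not part of the original benchmark):

```sql
-- With the (a,b) index feeding rows already ordered by a, b, PostgreSQL 16
-- can skip the sort that the executor would otherwise perform inside the
-- Aggregate node for the ORDER BY aggregate.
EXPLAIN (COSTS OFF)
SELECT a, array_agg(b ORDER BY b)
FROM aggtest
GROUP BY a;
```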
3. Allow memoize in UNION ALL (Richard Guo)
Enable use of Memoize atop an Append that came from UNION ALL
Memoize plan nodes were first introduced in PostgreSQL 14. Memoize acts as a caching layer between a parameterized Nested Loop and the inner side of that loop. When the same value has to be looked up multiple times, Memoize can provide a nice performance boost, since it can skip executing its subnode whenever the required rows have already been requested and are in the cache.
The PostgreSQL 16 query planner will now consider using Memoize when a UNION ALL appears on the inner side of a parameterized Nested Loop.
-- Setup
CREATE TABLE t1 (a INT PRIMARY KEY);
CREATE TABLE t2 (a INT PRIMARY KEY);
CREATE TABLE lookup (a INT);
INSERT INTO t1 SELECT x FROM generate_series(1,10000) x;
INSERT INTO t2 SELECT x FROM generate_series(1,10000) x;
INSERT INTO lookup SELECT x%10+1 FROM generate_series(1,1000000) x;
ANALYZE t1,t2,lookup;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t2) t
INNER JOIN lookup l ON l.a = t.a;
PG15 EXPLAIN output
QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop (actual rows=2000000 loops=1)
   ->  Seq Scan on lookup l (actual rows=1000000 loops=1)
   ->  Append (actual rows=2 loops=1000000)
         ->  Index Only Scan using t1_pkey on t1 (actual rows=1 loops=1000000)
               Index Cond: (a = l.a)
               Heap Fetches: 1000000
         ->  Index Only Scan using t2_pkey on t2 (actual rows=1 loops=1000000)
               Index Cond: (a = l.a)
               Heap Fetches: 1000000
 Planning Time: 0.223 ms
 Execution Time: 1926.151 ms
(11 rows)
PG16 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------------------------
 Nested Loop (actual rows=2000000 loops=1)
   ->  Seq Scan on lookup l (actual rows=1000000 loops=1)
   ->  Memoize (actual rows=2 loops=1000000)
         Cache Key: l.a
         Cache Mode: logical
         Hits: 999990  Misses: 10  Evictions: 0  Overflows: 0  Memory Usage: 2kB
         ->  Append (actual rows=2 loops=10)
               ->  Index Only Scan using t1_pkey on t1 (actual rows=1 loops=10)
                     Index Cond: (a = l.a)
                     Heap Fetches: 10
               ->  Index Only Scan using t2_pkey on t2 (actual rows=1 loops=10)
                     Index Cond: (a = l.a)
                     Heap Fetches: 10
 Planning Time: 0.229 ms
 Execution Time: 282.120 ms
(15 rows)
In the PostgreSQL 16 EXPLAIN output, you can see that Memoize was placed on top of the Append, which reduced the number of loops of the Append node from 1 million in PG15 to just 10 in PG16. Whenever Memoize hits its cache, the Append does not need to be executed to fetch the rows. As a result, the query runs roughly 6 times faster in PostgreSQL 16.
4. Allow anti-joins with the non-nullable input as the inner relation (Richard Guo)
Support “Right Anti Join” plan shapes
When performing a Hash Join for an INNER JOIN, PostgreSQL prefers to build the hash table from the smaller of the two tables. Smaller hash tables are better because they take less effort to build. They are also more CPU-cache friendly, so the CPU is less likely to stall waiting for data to arrive from main memory.
Before PostgreSQL 16, for an Anti Join the table referenced in NOT EXISTS was always placed on the inner side of the join. This meant there was no flexibility to hash the smaller of the two tables, possibly resulting in the hash table being built on the larger one.
The PostgreSQL 16 query planner can now choose to hash the smaller of the two tables, since PostgreSQL 16 supports the Right Anti Join plan shape.
-- Setup
CREATE TABLE small(a int);
CREATE TABLE large(a int);
INSERT INTO small
SELECT a FROM generate_series(1,100) a;
INSERT INTO large
SELECT a FROM generate_series(1,1000000) a;
VACUUM ANALYZE small,large;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM small s
WHERE NOT EXISTS(SELECT 1 FROM large l WHERE s.a = l.a);
PG15 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------
 Hash Anti Join (actual rows=0 loops=1)
   Hash Cond: (s.a = l.a)
   ->  Seq Scan on small s (actual rows=100 loops=1)
   ->  Hash (actual rows=1000000 loops=1)
         Buckets: 262144  Batches: 8  Memory Usage: 6446kB
         ->  Seq Scan on large l (actual rows=1000000 loops=1)
 Planning Time: 0.103 ms
 Execution Time: 139.023 ms
(8 rows)
PG16 EXPLAIN output
QUERY PLAN
-----------------------------------------------------------
 Hash Right Anti Join (actual rows=0 loops=1)
   Hash Cond: (l.a = s.a)
   ->  Seq Scan on large l (actual rows=1000000 loops=1)
   ->  Hash (actual rows=100 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 12kB
         ->  Seq Scan on small s (actual rows=100 loops=1)
 Planning Time: 0.094 ms
 Execution Time: 77.076 ms
(8 rows)
Because the PG16 planner chose a Hash Right Anti Join, Memory Usage in PostgreSQL 16 is far lower than in PostgreSQL 15, and the Execution Time is almost halved.
5. Allow parallelization of FULL and RIGHT OUTER hash joins (Melanie Plageman, Thomas Munro)
Parallel Hash Join first appeared in PostgreSQL 11. It allows multiple workers of a parallel query to cooperate in building a single shared hash table. In versions before 11, each worker built its own identical copy of the hash table, which meant additional memory overhead.
In PostgreSQL 16, Parallel Hash Join has been improved to support the FULL and RIGHT join types. This makes it possible to execute queries containing a FULL OUTER JOIN in parallel, and also to run parallel plans with Right Joins.
-- Setup
CREATE TABLE odd (a INT);
CREATE TABLE even (a INT);
INSERT INTO odd
SELECT a FROM generate_series(1,1000000,2) a;
INSERT INTO even
SELECT a FROM generate_series(2,1000000,2) a;
VACUUM ANALYZE odd, even;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT COUNT(o.a),COUNT(e.a) FROM odd o FULL JOIN even e ON o.a = e.a;
PG15 EXPLAIN output
QUERY PLAN
-------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Hash Full Join (actual rows=1000000 loops=1)
         Hash Cond: (o.a = e.a)
         ->  Seq Scan on odd o (actual rows=500000 loops=1)
         ->  Hash (actual rows=500000 loops=1)
               Buckets: 262144  Batches: 4  Memory Usage: 6439kB
               ->  Seq Scan on even e (actual rows=500000 loops=1)
 Planning Time: 0.079 ms
 Execution Time: 220.677 ms
(9 rows)
PG16 EXPLAIN output
QUERY PLAN
--------------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   ->  Gather (actual rows=2 loops=1)
         Workers Planned: 1
         Workers Launched: 1
         ->  Partial Aggregate (actual rows=1 loops=2)
               ->  Parallel Hash Full Join (actual rows=500000 loops=2)
                     Hash Cond: (o.a = e.a)
                     ->  Parallel Seq Scan on odd o (actual rows=250000 loops=2)
                     ->  Parallel Hash (actual rows=250000 loops=2)
                           Buckets: 262144  Batches: 4  Memory Usage: 6976kB
                           ->  Parallel Seq Scan on even e (actual rows=250000 loops=2)
 Planning Time: 0.161 ms
 Execution Time: 129.769 ms
(13 rows)
PostgreSQL 16 was able to perform the join in parallel, which led to a significant reduction in Execution Time.
6. Allow window functions to use faster ROWS mode when RANGE mode is active but not needed (David Rowley)
Allow window functions to adjust their frameOptions
When a query contains a window function such as row_number(), rank(), dense_rank(), percent_rank(), cume_dist() or ntile(), and the window definition does not specify the ROWS option, PostgreSQL always uses the default, RANGE.
The RANGE option forces the executor to look ahead until it finds the first "non-peer" row. A peer row is a row in the window frame that compares equally according to the ORDER BY of the window definition. If there is no ORDER BY, all rows in the window frame are peers. When processing a window over many rows that sort equally according to the window's ORDER BY, the extra work to identify these peer rows can be expensive.
The window functions listed above behave the same regardless of whether ROWS or RANGE is used. However, the executor in versions of PostgreSQL before 16 did not know this, and since the ROWS/RANGE option genuinely matters for some window functions, the executor had to perform peer-row checks in all cases.
The PostgreSQL 16 query planner knows which window functions care about the ROWS/RANGE option and passes that information to the executor, so it can skip the unnecessary extra processing.
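To see why the ROWS/RANGE distinction matters for some frame-dependent functions (though not for the ones listed above), here is a small sketch of my own: with duplicate ORDER BY values, RANGE includes all peer rows in the frame, while ROWS stops at the current row.

```sql
-- The two "2" rows are peers under ORDER BY v, so with RANGE each of them
-- sees the full peer group (1+2+2); with ROWS the frame ends at the
-- current row.
SELECT v,
       sum(v) OVER (ORDER BY v
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum_range,
       sum(v) OVER (ORDER BY v
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum_rows
FROM (VALUES (1), (2), (2), (3)) AS t(v);
-- sum_range: 1, 5, 5, 8
-- sum_rows:  1, 3, 5, 8
```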
This optimization works especially well when row_number() is used to limit the number of results of a query, as in the example below.
-- Setup
CREATE TABLE scores (id INT PRIMARY KEY, score INT);
INSERT INTO scores SELECT s,random()*10 FROM generate_series(1,1000000)s;
CREATE INDEX ON scores(score);
VACUUM ANALYZE scores;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM (
SELECT id,ROW_NUMBER() OVER (ORDER BY score) rn,score
FROM scores
) m WHERE rn <= 10;
PG15 EXPLAIN output
QUERY PLAN
-------------------------------------------------------------------------------
 WindowAgg (actual rows=10 loops=1)
   Run Condition: (row_number() OVER (?) <= 10)
   ->  Index Scan using scores_score_idx on scores (actual rows=50410 loops=1)
 Planning Time: 0.096 ms
 Execution Time: 29.775 ms
(5 rows)
PG16 EXPLAIN output
QUERY PLAN
----------------------------------------------------------------------------
 WindowAgg (actual rows=10 loops=1)
   Run Condition: (row_number() OVER (?) <= 10)
   ->  Index Scan using scores_score_idx on scores (actual rows=11 loops=1)
 Planning Time: 0.191 ms
 Execution Time: 0.058 ms
(5 rows)
The Index Scan in PG15 shows that 50410 rows had to be read from the scores_score_idx index before execution could stop. In PostgreSQL 16, only 11 rows were read, because the executor realized that once row_number reached 11, no further rows could satisfy the rn <= 10 condition. On PostgreSQL 16 this query is over 500 times faster.
7. Optimizing ever-increasing window functions ntile(), cume_dist() and percent_rank() (David Rowley)
Teach planner about more monotonic window functions
In PG15, the query planner was changed to allow the executor to stop processing WindowAgg nodes early. This is possible when a WHERE clause filters on the window function's output in such a way that, once the condition becomes false, it can never become true again.
row_number() is an example of a function that can make such guarantees: it is monotonically increasing, meaning subsequent rows in the same partition will never have a row_number lower than a previous row.
The PostgreSQL 16 query planner expands the scope of this optimization to also include ntile(), cume_dist() and percent_rank(). In PostgreSQL 15 it only worked for row_number(), rank(), dense_rank(), count() and count(*).
-- Setup
CREATE TABLE marathon (id INT PRIMARY KEY, time INTERVAL NOT NULL);
INSERT INTO marathon
SELECT id,'03:00:00'::interval + (CAST(RANDOM() * 3600 AS INT) || 'secs')::INTERVAL - (CAST(RANDOM() * 3600 AS INT) || ' secs')::INTERVAL
FROM generate_series(1,50000) id;
CREATE INDEX ON marathon (time);
VACUUM ANALYZE marathon;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT * FROM (SELECT *,percent_rank() OVER (ORDER BY time) pr
FROM marathon) m WHERE pr <= 0.01;
PG15 EXPLAIN output
QUERY PLAN
-----------------------------------------------------------------------
 Subquery Scan on m (actual rows=500 loops=1)
   Filter: (m.pr <= '0.01'::double precision)
   Rows Removed by Filter: 49500
   ->  WindowAgg (actual rows=50000 loops=1)
         ->  Index Scan using marathon_time_idx on marathon (actual rows=50000 loops=1)
 Planning Time: 0.108 ms
 Execution Time: 84.358 ms
(7 rows)
PG16 EXPLAIN output
QUERY PLAN
-----------------------------------------------------------------------
 WindowAgg (actual rows=500 loops=1)
   Run Condition: (percent_rank() OVER (?) <= '0.01'::double precision)
   ->  Index Scan using marathon_time_idx on marathon (actual rows=50000 loops=1)
 Planning Time: 0.180 ms
 Execution Time: 19.454 ms
(5 rows)
From the PostgreSQL 16 EXPLAIN output, it can be seen that the planner was able to use the pr <= 0.01 condition as a Run Condition, whereas in PostgreSQL 15 the clause appeared as a Filter in a subquery. In PG16 the run condition allowed execution of the WindowAgg node to stop early. As a result, the Execution Time in PG16 was more than 4 times faster than in PG15.
8. Allow left join removal and unique joins on partitioned tables (Arne Roland)
Allow left join removals and unique joins on partitioned tables
For a long time, PostgreSQL has been able to remove a LEFT JOIN when the query requires no columns from the left-joined table and the join cannot duplicate any rows.
However, versions prior to PostgreSQL 16 could not remove left joins to partitioned tables. Why? Because the proofs the planner uses to determine that no inner row can duplicate any outer row were missing for partitioned tables.
The PostgreSQL 16 query planner now allows the LEFT JOIN removal optimization on partitioned tables.
This join-removal optimization is particularly helpful when working with views, since queries often don't use every column a view exposes.
-- Setup
CREATE TABLE part_tab (id BIGINT PRIMARY KEY, payload TEXT) PARTITION BY HASH(id);
CREATE TABLE part_tab_p0 PARTITION OF part_tab FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE part_tab_p1 PARTITION OF part_tab FOR VALUES WITH (modulus 2, remainder 1);
CREATE TABLE normal_table (id INT, part_tab_id BIGINT);
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT nt.* FROM normal_table nt LEFT JOIN part_tab pt ON nt.part_tab_id = pt.id;
PG15 EXPLAIN output
QUERY PLAN
-------------------------------------------------------------------
 Merge Right Join (actual rows=0 loops=1)
   Merge Cond: (pt.id = nt.part_tab_id)
   ->  Merge Append (actual rows=0 loops=1)
         Sort Key: pt.id
         ->  Index Only Scan using part_tab_p0_pkey on part_tab_p0 pt_1 (actual rows=0 loops=1)
               Heap Fetches: 0
         ->  Index Only Scan using part_tab_p1_pkey on part_tab_p1 pt_2 (actual rows=0 loops=1)
               Heap Fetches: 0
   ->  Sort (actual rows=0 loops=1)
         Sort Key: nt.part_tab_id
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on normal_table nt (actual rows=0 loops=1)
 Planning Time: 0.325 ms
 Execution Time: 0.037 ms
(14 rows)
PG16 EXPLAIN output
QUERY PLAN
-----------------------------------------------------
Seq Scan on normal_table nt (actual rows=0 loops=1)
Planning Time: 0.244 ms
Execution Time: 0.015 ms
(3 rows)
It's important to note that the PostgreSQL 16 plan contains no join to part_tab at all, meaning all that's needed is a scan of normal_table.
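As a sketch of the view scenario mentioned above (the view name is hypothetical, not part of the original tests), join removal kicks in whenever a query through the view doesn't reference the partitioned table's columns:

```sql
-- part_tab.id is the primary key, so the LEFT JOIN cannot duplicate rows
-- of normal_table and can be removed when payload isn't needed.
CREATE VIEW normal_with_payload AS
SELECT nt.id, nt.part_tab_id, pt.payload
FROM normal_table nt
LEFT JOIN part_tab pt ON nt.part_tab_id = pt.id;

-- On PostgreSQL 16 this should plan as a plain scan of normal_table,
-- since neither selected column comes from part_tab.
EXPLAIN SELECT id, part_tab_id FROM normal_with_payload;
```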
9. Use Limit instead of Unique to implement DISTINCT whenever possible (David Rowley)
Use Limit instead of Unique to implement DISTINCT, when possible
The PostgreSQL query planner can now refrain from adding plan nodes to remove duplicate results when it detects that all rows must contain the same value. This is not difficult to detect, and the optimization can lead to huge performance gains.
-- Setup
CREATE TABLE abc (a int, b int, c int);
INSERT INTO abc SELECT a%10,a%10,a%10 FROM generate_series(1,1000000)a;
VACUUM ANALYZE abc;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT DISTINCT a,b,c FROM abc WHERE a = 5 AND b = 5 AND c = 5;
PG15 EXPLAIN output
QUERY PLAN
------------------------------------------------------------------------
 Unique (actual rows=1 loops=1)
   ->  Gather (actual rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Unique (actual rows=1 loops=3)
               ->  Parallel Seq Scan on abc (actual rows=33333 loops=3)
                     Filter: ((a = 5) AND (b = 5) AND (c = 5))
                     Rows Removed by Filter: 300000
 Planning Time: 0.114 ms
 Execution Time: 30.381 ms
(10 rows)
PG16 EXPLAIN output
QUERY PLAN
---------------------------------------------------
 Limit (actual rows=1 loops=1)
   ->  Seq Scan on abc (actual rows=1 loops=1)
         Filter: ((a = 5) AND (b = 5) AND (c = 5))
         Rows Removed by Filter: 4
 Planning Time: 0.109 ms
 Execution Time: 0.025 ms
(6 rows)
If you look closely at the SQL query, you will notice that each column in the DISTINCT clause also has an equality condition in the WHERE clause. This means every output row of the query must have the same value in each column. The PostgreSQL 16 query planner takes advantage of this knowledge and simply uses a LIMIT to reduce the query result to 1 row. PostgreSQL 15 produced the same result by reading all matching rows in full and using a Unique node to reduce them to a single row. The Execution Time on PostgreSQL 16 was more than 1200 times faster than on PostgreSQL 15.
10. Relax the overly strict rules in select_outer_pathkeys_for_merge() (David Rowley)
Relax overly strict rules in select_outer_pathkeys_for_merge()
Before PostgreSQL 16, when the query planner considered performing a Merge Join, it checked whether the merge's sort order matched any top-level plan operation (such as DISTINCT, GROUP BY or ORDER BY), and it only used that order if it exactly met the top-level requirements. This rule was slightly outdated, since those top-level operations can use Incremental Sorts, and incremental sorts can take advantage of results that are presorted on only some of the leading columns the results need to be sorted on.
The PostgreSQL 16 query planner changes the rule used when choosing the Merge Join order from "the row order must match exactly" to "at least 1 leading column must be ordered correctly". This allows the planner to use Incremental Sorts to put the rows into the order required by the top-level operations. As we saw earlier in this post, an incremental sort, when possible, requires less work than a full sort, since it can use partially sorted input and perform the sort in smaller chunks, resulting in lower memory consumption and fewer comparisons overall.
-- Setup
CREATE TABLE a (a INT, b INT);
CREATE TABLE b (x INT, y INT);
INSERT INTO a SELECT a,a FROM generate_series(1,1000000) a;
INSERT INTO b SELECT a,a FROM generate_series(1,1000000) a;
VACUUM ANALYZE a, b;
SET enable_hashjoin=0;
SET max_parallel_workers_per_gather=0;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT a,b,count(*) FROM a INNER JOIN b ON a.a = b.x GROUP BY a,b ORDER BY a DESC, b;
PG15 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------------------
 GroupAggregate (actual rows=1000000 loops=1)
   Group Key: a.a, a.b
   ->  Sort (actual rows=1000000 loops=1)
         Sort Key: a.a DESC, a.b
         Sort Method: external merge  Disk: 17664kB
         ->  Merge Join (actual rows=1000000 loops=1)
               Merge Cond: (a.a = b.x)
               ->  Sort (actual rows=1000000 loops=1)
                     Sort Key: a.a
                     Sort Method: external merge  Disk: 17664kB
                     ->  Seq Scan on a (actual rows=1000000 loops=1)
               ->  Materialize (actual rows=1000000 loops=1)
                     ->  Sort (actual rows=1000000 loops=1)
                           Sort Key: b.x
                           Sort Method: external merge  Disk: 11768kB
                           ->  Seq Scan on b (actual rows=1000000 loops=1)
 Planning Time: 0.175 ms
 Execution Time: 1010.738 ms
(18 rows)
PG16 EXPLAIN output
QUERY PLAN
---------------------------------------------------------------------------
 GroupAggregate (actual rows=1000000 loops=1)
   Group Key: a.a, a.b
   ->  Incremental Sort (actual rows=1000000 loops=1)
         Sort Key: a.a DESC, a.b
         Presorted Key: a.a
         Full-sort Groups: 31250  Sort Method: quicksort  Average Memory: 26kB  Peak Memory: 26kB
         ->  Merge Join (actual rows=1000000 loops=1)
               Merge Cond: (a.a = b.x)
               ->  Sort (actual rows=1000000 loops=1)
                     Sort Key: a.a DESC
                     Sort Method: external merge  Disk: 17672kB
                     ->  Seq Scan on a (actual rows=1000000 loops=1)
               ->  Materialize (actual rows=1000000 loops=1)
                     ->  Sort (actual rows=1000000 loops=1)
                           Sort Key: b.x DESC
                           Sort Method: external merge  Disk: 11768kB
                           ->  Seq Scan on b (actual rows=1000000 loops=1)
 Planning Time: 0.140 ms
 Execution Time: 915.589 ms
(19 rows)
In the PG16 EXPLAIN output, you can see that an Incremental Sort was used (whereas PG15 used a full Sort), which led to a slight reduction in Execution Time in PG16 and a significant reduction in the amount of memory used to perform the final sort.
Conclusion
A great deal of engineering work by many engineers around the world went into improving the query planner for PostgreSQL 16.
Each of the 10 PostgreSQL 16 planner improvements described above is enabled by default, and is either applied in every case where the optimization is possible, or applied selectively by the query planner when it believes the optimization will help.
If you are using an older version of PostgreSQL, I recommend trying your workload on PostgreSQL 16 to see which of your queries run faster. And, as always, feedback on real-world use of PostgreSQL is welcome on the pgsql-general@postgresql.org mailing list – you don't have to write only about problems; you can always share positive experiences too.