Using the coordinate descent method to optimize DBMS parameters
For simplicity, the following parameters were selected for optimization:
max_wal_size: The maximum size that WAL can grow to during automatic checkpoints.
bgwriter_lru_maxpages: The maximum number of buffers that the background writer process may write during one cleaning round.
bgwriter_flush_after: Whenever the background writer has written more than this amount of data, the server instructs the OS to flush these writes to the underlying storage. This limits the amount of dirty data in the kernel page cache and reduces the likelihood of stalls when fsync is issued at the end of a checkpoint, or when the OS writes data back to disk in large batches in the background.
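All three tuned parameters can be changed without a server restart. A minimal sketch of generating the statements used to apply a candidate value during the search (the `alter_system` helper is my name, not part of the article):

```python
def alter_system(name, value):
    """Render the ALTER SYSTEM statement that sets one parameter.

    For the three parameters above the new value takes effect after
    SELECT pg_reload_conf(); no server restart is required.
    """
    return f"ALTER SYSTEM SET {name} = '{value}';"

# Example: a value that appears in the article's runs.
print(alter_system("max_wal_size", 16384))
# ALTER SYSTEM SET max_wal_size = '16384';
```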
Test load on the DBMS
The standard load-generation tool pgbench was used.
pgbench options for one test iteration:
--protocol=extended --report-per-command --jobs=24 --client=100 --transactions=10000 test_pgbench
Test duration: 30 minutes.
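One iteration can be driven from a script; a sketch assuming pgbench is on PATH and the test_pgbench database is reachable (`parse_tps` and `run_iteration` are my names, not the article's):

```python
import re
import subprocess

# The per-iteration options from this section.
PGBENCH_ARGS = [
    "pgbench", "--protocol=extended", "--report-per-command",
    "--jobs=24", "--client=100", "--transactions=10000", "test_pgbench",
]

def parse_tps(output: str) -> float:
    """Pull the tps figure out of pgbench's summary output."""
    match = re.search(r"tps = ([\d.]+)", output)
    if match is None:
        raise ValueError("no tps line in pgbench output")
    return float(match.group(1))

def run_iteration() -> float:
    """Run one pgbench iteration and return its tps."""
    result = subprocess.run(PGBENCH_ARGS, capture_output=True,
                            text=True, check=True)
    return parse_tps(result.stdout)
```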
Optimization algorithm
1. Run the test.
2. Record the median DBMS performance value over the test period.
3. Change the parameter value.
4. Run the test.
5. Record the median DBMS performance value over the test period.
6. If the performance value has increased, go to step 3.
7. If the performance value has decreased, revert to the previous parameter value, select the next parameter to optimize, and go to step 3. If all parameters have been tried, the optimization is complete.
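The loop above can be sketched in Python (`tune` and `measure` are my names; in a real run, `measure` would apply the candidate value, invoke pgbench, and return the median over the test period):

```python
def tune(initial, candidates, measure):
    """Coordinate descent over DBMS parameters, as in the steps above.

    initial    -- dict of starting parameter values
    candidates -- dict: parameter name -> values to try, in order
    measure    -- callable(config) -> median performance for that config
    """
    config = dict(initial)
    best = measure(config)                       # steps 1-2: baseline median
    for name, values in candidates.items():
        for value in values:
            trial = dict(config, **{name: value})  # step 3: change the value
            score = measure(trial)                 # steps 4-5: re-measure
            if score > best:                       # step 6: keep and continue
                config, best = trial, score
            else:                                  # step 7: revert, next parameter
                break
    return config, best
```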
Implementation
1. max_wal_size = 8192
Median performance value = 235631.814286247
2. max_wal_size = 16384
Median performance value = 307091.478012516
Performance increased.
3. max_wal_size = 32768
Median performance value = 296544.027015618
Performance decreased; revert to the previous value max_wal_size = 16384.
We consider max_wal_size = 16384 optimal for this load profile.
Moving on to the next parameter.
5. bgwriter_lru_maxpages = 800
Median performance value = 332130.209038783
Performance increased.
6. bgwriter_lru_maxpages = 1600
Median performance value = 310819.820905112
Performance decreased; revert to the previous value bgwriter_lru_maxpages = 800.
We consider bgwriter_lru_maxpages = 800 optimal for this load profile.
Moving on to the next parameter.
7. bgwriter_flush_after = 32
Median performance value = 331818.714844122
Performance decreased relative to step 5 (bgwriter_lru_maxpages = 800); revert to the previous value bgwriter_flush_after = 64.
All selected parameters have been tested; the optimization is complete.
Optimal parameter values
Thus, for this load profile, the optimal values are:
max_wal_size = 16384
bgwriter_lru_maxpages = 800
bgwriter_flush_after = 64
Change in DBMS performance as a result of optimization
Base value: 235631.814286247
Performance value after optimization: 332130.209038783
Performance gains: ~41%
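The quoted gain is consistent with comparing the step-1 baseline against the best median observed for the final configuration (332130 TPS at step 5):

```python
base = 235631.814286247     # median performance, default settings (step 1)
tuned = 332130.209038783    # median performance, tuned settings (step 5)
gain_pct = (tuned / base - 1) * 100
print(f"~{gain_pct:.0f}%")  # ~41%
```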
Final pgbench test
pgbench initialization options
--no-vacuum --quiet --foreign-keys --scale=100 -i test_pgbench10
pgbench test parameters
--progress=60 --protocol=extended --report-per-command --jobs=24 --client=100 --time=1800 test_pgbench10
Results for basic parameter values
latency average = 10.616 ms
latency stddev = 8.605 ms
tps = 9333.052818 (without initial connection time)
pgbench (14.11)
transaction type: <builtin: TPC-B (sort of)>
default transaction isolation level: read committed
transaction maximum tries number: 1
scaling factor: 100
query mode: extended
number of clients: 100
number of threads: 24
duration: 1800 s
number of transactions actually processed: 16799220
latency average = 10.616 ms
latency stddev = 8.605 ms
initial connection time = 181.648 ms
tps = 9333.052818 (without initial connection time)
statement latencies in milliseconds:
0.012 \set aid random(1, 100000 * :scale)
0.002 \set bid random(1, 1 * :scale)
0.002 \set tid random(1, 10 * :scale)
0.002 \set delta random(-5000, 5000)
0.394 BEGIN;
0.648 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
0.388 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
1.020 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
3.170 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
0.678 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
4.331 END;
Results for optimized parameters
latency average = 9.759 ms
latency stddev = 7.164 ms
tps = 10147.221902 (without initial connection time)
pgbench (14.11)
transaction type: <builtin: TPC-B (sort of)>
default transaction isolation level: read committed
transaction maximum tries number: 1
scaling factor: 100
query mode: extended
number of clients: 100
number of threads: 24
duration: 1800 s
number of transactions actually processed: 18264638
latency average = 9.759 ms
latency stddev = 7.164 ms
initial connection time = 210.679 ms
tps = 10147.221902 (without initial connection time)
statement latencies in milliseconds:
0.011 \set aid random(1, 100000 * :scale)
0.002 \set bid random(1, 1 * :scale)
0.002 \set tid random(1, 10 * :scale)
0.002 \set delta random(-5000, 5000)
0.389 BEGIN;
0.519 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
0.380 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
0.918 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
2.912 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
0.562 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
4.088 END;
Results
The applicability of the coordinate descent method for selecting optimal values of other groups of DBMS configuration parameters still needs to be tested.
A further development of this idea would be a methodology for adaptive optimization of DBMS parameters in response to changing load on the DBMS.