Setting up memory, CPU and I/O

Basic Memory Configuration Parameters

shared_buffers determines the amount of memory the server allocates for caching data. It is recommended to set it to 15-25% of the total system RAM. For example, for a server with 32 GB of RAM, the recommended value is about 8 GB. Changing this parameter requires a restart of the database server.

work_mem controls the amount of memory used for sorting operations and hash tables during query execution. Setting an adequate work_mem value can reduce the number of disk writes, which in turn speeds up query execution. It is recommended to start with a low value, for example 32-64 MB, and adjust it based on observed performance and the sizes of temporary files recorded in the logs.
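
As an illustration (the values here are only examples, not recommendations for any particular workload), work_mem can be raised for a single session before a heavy query, and log_temp_files can be enabled so that created temporary files show up in the server log:

-- Raise work_mem only for the current session; the value is illustrative
SET work_mem = '256MB';

-- Log every temporary file (0 = log files of any size); applied with a reload, no restart needed
ALTER SYSTEM SET log_temp_files = 0;
SELECT pg_reload_conf();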

maintenance_work_mem sets the maximum amount of memory for maintenance operations such as VACUUM and CREATE INDEX. In most cases it is safe to set this value significantly higher than work_mem, as this can speed up vacuuming. A general recommendation is to set it to 10% of total system memory, but not more than 1 GB.
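
For example, a minimal sketch (table and index names are hypothetical) of raising maintenance_work_mem for just the current session before building a large index:

-- Raise maintenance_work_mem only for this session before a heavy index build
SET maintenance_work_mem = '2GB';
-- orders and idx_orders_customer_id are hypothetical names used for illustration
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);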

effective_cache_size provides an estimate of the memory available for disk caching by the operating system and the database itself. The PostgreSQL query planner uses it to judge whether the data touched by a query plan is likely to fit in RAM. It is recommended to set it to approximately 50% of the server's total RAM.

To apply these settings, edit the postgresql.conf file located in the PostgreSQL data directory ($PGDATA). For example:

shared_buffers = '8GB'
work_mem = '64MB'
maintenance_work_mem = '1GB'
effective_cache_size = '16GB'

After making changes, reload the configuration to apply the new settings; changing shared_buffers requires a full server restart, while the other parameters above only need a reload.
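
To check which of these parameters actually require a restart, the pg_settings view can be queried: entries with context = 'postmaster' (such as shared_buffers) need a restart, while the others can be applied with a reload or changed per session:

SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem',
               'maintenance_work_mem', 'effective_cache_size');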

However, it is worth remembering that these values are only a starting point. To achieve maximum performance, benchmark your specific workload and adjust the settings based on the results. Tools such as pgTune are useful for initial configuration recommendations, and pgBadger for analyzing logs.

Setting up WAL and autovacuum

WAL (Write-Ahead Log) records database changes in a special journal before they are applied to the data files. This makes it possible to recover the database after a failure without losing important data.

max_wal_size and min_wal_size: these parameters control the maximum and minimum WAL size. Increasing max_wal_size can reduce the number of checkpoints and, as a result, reduce write latency, but it increases recovery time after a failure. It is recommended to find a balance so that the WAL size is sufficient for the workload while keeping recovery time acceptable.

checkpoint_completion_target: this parameter sets the checkpoint completion target as a fraction of the total time between checkpoints. A value closer to 1.0 spreads the checkpoint I/O more evenly, reducing the likelihood of sudden latency spikes.
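
To judge whether max_wal_size is large enough, compare requested and timed checkpoints; a high share of requested checkpoints usually means the WAL fills up before checkpoint_timeout expires. On PostgreSQL 16 and earlier these counters live in pg_stat_bgwriter (newer releases move them to pg_stat_checkpointer):

-- PostgreSQL 16 and earlier; on 17+ query pg_stat_checkpointer instead
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;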

Autovacuum is designed to remove dead tuples. A poorly tuned autovacuum can lead to bloat accumulation and degraded performance.

autovacuum_vacuum_scale_factor and autovacuum_vacuum_threshold: these settings determine how often autovacuum runs for each table. The default values are usually conservative, and adjusting them can help remove dead tuples sooner, especially on busy databases (a per-table example follows the configuration block below).

autovacuum_max_workers: increasing the number of autovacuum worker processes can speed up cleanup, especially on systems with many cores.

Example of setting up WAL and autovacuum:

# WAL configuration
max_wal_size = '2GB'
min_wal_size = '1GB'
checkpoint_completion_target = 0.9

# Autovacuum configuration
autovacuum_vacuum_scale_factor = 0.02
autovacuum_vacuum_threshold = 50
autovacuum_max_workers = 5
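
In addition to the global settings above, autovacuum thresholds can be overridden per table, which is a common way to vacuum a particularly hot table more aggressively (the table name below is hypothetical):

-- Hypothetical hot table: trigger autovacuum after roughly 1% of rows become dead
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 100
);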

Disk Management

PostgreSQL uses temporary files for operations such as sorting or hashing when there is not enough RAM. The temp_file_limit parameter limits the maximum total size of temporary files that a single session can use; if the limit is exceeded, the offending query is cancelled. For example, temp_file_limit = '20GB' caps temporary-file usage at 20 GB, which helps prevent the disk from filling up when running operations that need a large amount of temporary storage.
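
Before picking a limit, it is worth checking how much temporary-file space the databases actually use; the cumulative counters in pg_stat_database give a rough picture:

-- Temporary files created and their total size, per database
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_size
FROM pg_stat_database
WHERE datname IS NOT NULL;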

effective_io_concurrency tells PostgreSQL how many I/O operations the storage subsystem can handle in parallel. For SSDs a value of around 200 is appropriate, while for HDDs lower values, such as 2, are more suitable, since spinning disks handle parallel reads poorly due to mechanical limitations.

Example configuration:

# Limit temporary file usage
temp_file_limit = '20GB'

# Optimization for SSDs
effective_io_concurrency = 200

# For HDDs
effective_io_concurrency = 2

Other Information

Configuration based on load types

For OLTP systems, where performance depends on transaction processing speed, it is recommended to use processors with many fast cores and a large L3 cache.

For OLAP systems, oriented towards analytical processing of large volumes of data, it is important to have enough RAM and fast storage (for example, SSD or NVMe), since this directly affects the speed of reading data and executing complex aggregation queries.

One of the tools that makes PostgreSQL configuration easier is pgTune. This tool analyzes the characteristics of your hardware and suggests optimized configuration settings for your use case, be it an OLTP or OLAP workload. PgTune allows you to configure settings such as the amount of memory used, sorting and maintenance options, and settings related to data saving and recovery after failures.

To improve performance for OLAP workloads, Foreign Data Wrappers can be used: for example, clickhousedb_fdw integrates PostgreSQL with ClickHouse, a columnar DBMS optimized for analytical queries. This speeds up OLAP queries by redirecting them to ClickHouse, where they run much faster than in PostgreSQL itself. The result is a heterogeneous environment in which each database management system performs the tasks it is optimized for, improving overall system performance.
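
As a rough sketch of the general foreign-data-wrapper workflow (the extension name comes from the text above, but the server options and table definition shown here are placeholders; consult the clickhousedb_fdw documentation for the actual option names):

-- Sketch only: option names and values are placeholders, not verified against clickhousedb_fdw
CREATE EXTENSION clickhousedb_fdw;

CREATE SERVER clickhouse_srv
    FOREIGN DATA WRAPPER clickhousedb_fdw
    OPTIONS (host '127.0.0.1', port '9000', dbname 'analytics');

CREATE USER MAPPING FOR CURRENT_USER SERVER clickhouse_srv;

-- events is a hypothetical analytical table stored in ClickHouse
CREATE FOREIGN TABLE events (
    event_time timestamptz,
    user_id    bigint,
    action     text
) SERVER clickhouse_srv;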

Important Monitoring Metrics

  • Active sessions: The maximum number of connections is configured via the max_connections parameter. It is recommended that active connections do not exceed 90% of max_connections (see the sample queries after this list).

  • Logs: Log monitoring allows you to identify errors such as long queries, authentication problems, deadlocks or any fatal errors.

  • Query performance: Query tracking allows you to optimize your database and improve its performance.

  • Replication delays: Replication health monitoring helps ensure high availability and data consistency between nodes.

  • Cache hit rate: Ideally, at least 99% of data reads come from the cache.

  • Ratio of index scans to total number of scans: It is desirable for this ratio to be above 99% for optimized databases.

  • Deadlock rate: Monitoring deadlocks helps prevent additional load on OS resources and delays in the future.
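
The following queries against the standard statistics views are one way to check the session, cache and index-usage metrics above (the thresholds from the list are the targets):

-- Active connections versus max_connections (target: below ~90%)
SELECT count(*) AS active,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity
WHERE state = 'active';

-- Cache hit rate per database (target: ~99% or higher)
SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_database
WHERE datname IS NOT NULL;

-- Share of index scans among all scans on user tables (target: ~99% or higher)
SELECT round(100.0 * sum(coalesce(idx_scan, 0))
             / nullif(sum(coalesce(idx_scan, 0) + seq_scan), 0), 2) AS index_scan_pct
FROM pg_stat_user_tables;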

Monitoring Tools

  • pg_stat_statements: Tracks planning and execution statistics for all SQL statements executed by the database server (see the example query after this list).

  • pg_buffercache: Shows which pages of which database objects are currently in the shared buffer cache.

  • pgBadger: PostgreSQL log analyzer providing detailed reports.

  • Prometheus with PostgreSQL Exporter: Prometheus integrates with the PostgreSQL Exporter to collect database metrics such as queries per second, rows processed per second, locks, active sessions, replication status, and so on.
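
For example, a typical pg_stat_statements query for the most time-consuming statements (the extension must be listed in shared_preload_libraries and created in the database; on versions before 13 the columns are total_time and mean_time instead of total_exec_time and mean_exec_time):

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 statements by total execution time (PostgreSQL 13+ column names)
SELECT round(total_exec_time::numeric, 2) AS total_ms,
       calls,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;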


Remember the importance of testing any changes in a secure environment before applying them to production servers to avoid any unwanted surprises.
