Tuning the Linux kernel to improve memory performance

The material has been translated. Link to original


Linux tries to make the most of memory by using otherwise free RAM as cache. Memory that is not used for anything is wasted memory.

The cache is filled with data as the system is running, and when applications require memory, the kernel searches among the cache pages for a block of suitable size, frees it, and allocates it to the application.

In some cases this can hurt performance, because freeing cache takes longer than simply handing out unused RAM.

The only cause is that RAM is fully in use, and there may be no symptoms other than occasional latency spikes. The same picture appears when the hard disk cannot keep up with reads and writes. The effect can also surface in other parts of the system, such as the network card, iptables, ebtables, or iproute2: you see network latency problems instead of the real cause. This article discusses the issue in more detail and shows how to minimize its impact on the system.


Linux has several types of caches:

  • dirty cache – blocks of data that have not yet been written to disk (on file systems that support caching, such as ext4). This cache can be flushed with the sync command. Flushing it can reduce performance, so during normal operation you should not do it unless you need to force data onto the hard drive, for example in an emergency.

  • clean cache – data blocks that are located both on the hard disk and in memory to speed up access. Clearing the clean cache can cause performance degradation as all data will be read from disk.

  • inode cache – a cache of inode location information. It can be dropped through the same /proc/sys/vm/drop_caches interface as the clean cache, again at the cost of performance.

  • slab cache – stores objects allocated inside the kernel (with kmalloc and related allocators, not by application malloc) so that they can later be re-allocated with the object data already filled in, which speeds up memory allocation.
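To see how much memory each kind of cache currently holds, the counters in /proc/meminfo can be read directly. The sketch below is only an illustration: cache_summary is a hypothetical helper name, and the mapping is approximate (Dirty corresponds to the dirty cache, Cached roughly to the page cache, Slab and SReclaimable to the slab cache, which includes inode and dentry objects).

```shell
# Print the cache-related fields of a meminfo-style file in MiB.
# cache_summary is a hypothetical helper; the optional argument exists
# so the function can be tried against a saved copy of /proc/meminfo.
cache_summary() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        # Values in /proc/meminfo are reported in kB; convert to MiB.
        $1 == "Dirty:" || $1 == "Cached:" || $1 == "Buffers:" ||
        $1 == "Slab:"  || $1 == "SReclaimable:" {
            printf "%-14s %d MiB\n", $1, $2 / 1024
        }
    ' "$meminfo"
}

# Usage:
#   cache_summary                 # reads /proc/meminfo
#   cache_summary /tmp/meminfo    # reads a saved copy
```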

Little can be done about the dirty cache, but the other types of cache can be cleared. Clearing them has two effects. In applications that consume a lot of memory, such as Aerospike, latency will decrease. On the other hand, I/O will slow down, since all data will have to be read from disk.

Clearing the slab cache can cause a short, temporary slowdown. For this reason, clearing caches is not recommended. Instead, it is better to tell the system that a certain amount of memory must always stay free and must not be taken by the cache.

If necessary, you can clear the cache as follows:

# clear the page cache (clean cache, type 2 above)
$ echo 1 > /proc/sys/vm/drop_caches

# clear reclaimable slab objects, including dentries and inodes (types 3 and 4 above)
$ echo 2 > /proc/sys/vm/drop_caches

# clear the page cache and slab objects (types 2, 3 and 4)
$ echo 3 > /proc/sys/vm/drop_caches

Most of the cached memory is page cache, so if you do need to clear a cache, clear the page cache (echo 1).
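Writing to drop_caches does not free dirty pages, so running sync first lets more of the cache be dropped. A minimal sketch of that sequence follows. The name drop_caches_safely and its second argument are hypothetical, added only so the function can be tried against an ordinary file; writing to the real /proc file requires root.

```shell
# Flush dirty pages, then drop the requested caches.
# drop_caches_safely is a hypothetical helper, not a standard command.
drop_caches_safely() {
    mode="$1"
    target="${2:-/proc/sys/vm/drop_caches}"
    case "$mode" in
        1|2|3) ;;
        *) echo "usage: drop_caches_safely 1|2|3 [target]" >&2; return 1 ;;
    esac
    sync                      # write dirty pages out so they become droppable
    echo "$mode" > "$target"  # needs root when target is the real /proc file
}

# Usage (as root):
#   drop_caches_safely 1      # page cache only
```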

To fix the problem, you can set the minimum amount of free memory. Consider the following example:

Total RAM: 100GB
Used: 10GB
Buffers: 40GB
Minimum free: 10GB
Cache: 40GB

In this example, 10 GB of memory is truly free, held in reserve by the minimum free parameter. If you need to allocate 5 GB of memory, the allocation succeeds instantly. The kernel then frees part of the cache to restore the 10 GB of free memory. Allocation stays fast, and the cache shrinks dynamically so that 10 GB is free at all times. After the allocation (the 5 GB appears under Buffers, and Cache shrinks by the same amount), memory will look like this:

Total RAM: 100GB
Used: 10GB
Buffers: 45GB
Minimum free: 10GB
Cache: 35GB

The exact values depend on your workload. For Aerospike, if available memory permits, min_free_kbytes should reserve at least 1.1 GB of free memory. The cache will then be sufficient while leaving room for applications. To check the current value:

$ cat /proc/sys/vm/min_free_kbytes

The setting is done as follows:

echo NUMBER > /proc/sys/vm/min_free_kbytes

NUMBER – the number of kilobytes that should be free in the system.

To leave 3% of the memory unused on a 100 GB computer, run the following command:

echo 3145728 > /proc/sys/vm/min_free_kbytes

Aerospike recommends keeping at least 1.1 GB in min_free_kbytes, i.e. 1153434.

On a system with more than 37 GB of total memory (the point at which 3% of RAM exceeds 1.1 GB), set min_free_kbytes to no more than 3% of memory so that the kernel does not waste too much time on unnecessary memory reclamation. On such systems the value will fall between 1.1 GB and 3% of total RAM.
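The arithmetic behind these limits can be sketched as a small function. It assumes the guidance above as stated (a 1.1 GB floor and a 3% ceiling) and ignores the "if available memory permits" caveat for small machines. min_free_range is a hypothetical helper name.

```shell
# Suggest a min_free_kbytes range for a given amount of total RAM (in kB).
# Prints two numbers: the lower bound and the upper bound, both in kB.
min_free_range() {
    total_kb="$1"
    floor_kb=1153434                      # 1.1 GB, the Aerospike minimum
    ceil_kb=$(( total_kb * 3 / 100 ))     # 3% of total RAM
    # Below roughly 37 GB, 3% is smaller than the floor, so the floor wins.
    if [ "$ceil_kb" -lt "$floor_kb" ]; then
        ceil_kb=$floor_kb
    fi
    echo "$floor_kb $ceil_kb"
}

# Usage on the running machine:
#   min_free_range "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

For 100 GB of RAM (104857600 kB) this prints `1153434 3145728`, which matches the 3% figure used earlier.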

Be careful when setting this parameter: a value that is too small or too large can hurt system performance. Too low a value of min_free_kbytes will prevent the system from reclaiming memory, which can lead to a system freeze or to processes being killed by the OOM killer.

Too large a value (5-10% of the total memory) will cause the system to run out of memory quickly. Linux uses all available RAM to cache file system data. Setting a high min_free_kbytes can cause the system to spend too much time reclaiming memory.

Red Hat recommends keeping min_free_kbytes at 1-3% of the amount of memory in the system. At the same time, Aerospike recommends leaving at least 1.1 GB, even if this is higher than the officially recommended value.

It is also recommended either to reduce the swappiness parameter to zero or not to use swap at all. In any case, for low-latency operations, using swap will dramatically decrease performance.

Set swappiness to 0 to reduce potential latency:

echo 0 > /proc/sys/vm/swappiness

Notes

IMPORTANT: None of the changes above are persistent. They last only until the machine is rebooted. To make them permanent, add them to /etc/sysctl.conf.

Add the following lines:

vm.min_free_kbytes = 1153434
vm.swappiness = 0
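As an alternative to editing /etc/sysctl.conf by hand, the same two lines can be written to a separate file and loaded with sysctl -p. This is a sketch under assumptions: write_vm_settings is a hypothetical helper, and the drop-in path in the usage comment is an illustrative name, not something the article prescribes. Note that the function overwrites the target file.

```shell
# Write the two VM settings to a sysctl configuration file.
# write_vm_settings is a hypothetical helper; it overwrites the target file.
write_vm_settings() {
    conf="$1"
    min_free_kb="$2"
    swappiness="$3"
    {
        echo "vm.min_free_kbytes = $min_free_kb"
        echo "vm.swappiness = $swappiness"
    } > "$conf"
}

# Usage (as root; the file name is only an example):
#   write_vm_settings /etc/sysctl.d/90-vm-tuning.conf 1153434 0
#   sysctl -p /etc/sysctl.d/90-vm-tuning.conf   # apply without rebooting
```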

As always, be careful when editing these parameters. Try them on test servers before making changes to the production environment.

Another parameter related to the above is zone_reclaim (vm.zone_reclaim_mode). Unfortunately, this option triggers aggressive reclaim and scanning, so it is better to turn it off. In all new kernels and distributions, this option is disabled by default.

To check that zone_reclaim is disabled, use the following command:

$ sysctl -a |grep zone_reclaim_mode
vm.zone_reclaim_mode = 0
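The same check can be scripted by reading the value from /proc. The sketch below assumes the setting is exposed at /proc/sys/vm/zone_reclaim_mode, which may be absent on kernels built without NUMA support. zone_reclaim_status is a hypothetical helper name.

```shell
# Report whether zone reclaim is disabled, enabled, or not available.
# The optional argument lets the function be tried against an ordinary file.
zone_reclaim_status() {
    f="${1:-/proc/sys/vm/zone_reclaim_mode}"
    if [ ! -r "$f" ]; then
        echo "unavailable"      # file missing or unreadable
    elif [ "$(cat "$f")" = "0" ]; then
        echo "disabled"         # 0 means zone reclaim is off
    else
        echo "enabled"          # any non-zero mode turns it on
    fi
}

# Usage:
#   zone_reclaim_status
```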

The translation of the material was prepared as part of the course “Administrator Linux. Advanced”.
