Impact of MD array checks on system performance and ways to reduce it

MD (Multiple Device) is a Linux technology that combines several physical disks into one logical drive using various RAID (Redundant Array of Independent Disks) schemes. An mdXXX device (hereinafter, md disk) is one of the devices created with this technology. To assess how checking an md array affects system performance, several aspects must be considered.

  1. Array Status Check (MD Check)
    A health check (or md check) is a process by which Linux verifies the integrity and consistency of data on all disks in an array.
    This may include:

    a) Checking and, in repair mode, restoring data integrity.

    b) Resynchronization (resync) of data between mirrored disks.

    c) Verifying and, in repair mode, recalculating parity for RAID 5/6.
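
The check described above is exposed by the kernel through sysfs. The following is a minimal sketch, assuming an array named md0 (a placeholder). The function names and the overridable sysfs root argument are illustrative helpers, not part of mdadm.

```shell
#!/bin/sh
# Print the current sync action and mismatch count for one md array.
# $1: array name (for example md0)
# $2: sysfs root; defaults to /sys/block (overridable only for dry runs, an assumption)
md_check_status() {
    name="$1"
    sysroot="${2:-/sys/block}"
    mddir="$sysroot/$name/md"
    if [ ! -r "$mddir/sync_action" ]; then
        echo "$name: no readable sync_action file, not an md array?" >&2
        return 1
    fi
    action=$(cat "$mddir/sync_action")
    mismatches=$(cat "$mddir/mismatch_cnt")
    echo "$name: sync_action=$action mismatch_cnt=$mismatches"
}

# Ask the kernel to start a pass over the array.
# $2 is "check" (only count inconsistencies) or "repair" (also rewrite them).
# Needs root on a real system.
md_start_check() {
    name="$1"
    mode="${2:-check}"
    sysroot="${3:-/sys/block}"
    echo "$mode" > "$sysroot/$name/md/sync_action"
}
```

On a real system, `md_start_check md0` followed by `md_check_status md0` would start a check and show its state; writing `idle` to the same file interrupts a running check.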

  2. Impact on performance
    Checking the health of an array can significantly impact system performance.
    Here are some key points:

    a) I/O operations: An array check issues intensive reads (and writes when inconsistencies are repaired), which can slow down access to data on these disks for other processes.

    b) CPU usage: The check uses processor resources for computation and I/O management.

    c) Latency: High disk access latencies can affect applications that work with data on those disks.
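
To see how hard a running check is hitting the disks, its progress and current speed can be read from /proc/mdstat. The sketch below assumes the usual progress line format (`check = 12.6% ... speed=117000K/sec`); the function name and the optional file argument are illustrative.

```shell
#!/bin/sh
# Print "<array> <percent done> <speed>" for every array that is currently
# being checked, resynced, recovered or reshaped.
# $1: mdstat file; defaults to /proc/mdstat (overridable for dry runs, an assumption)
md_check_progress() {
    mdstat="${1:-/proc/mdstat}"
    awk '
        # Remember the array name from lines such as "md0 : active raid1 ..."
        /^md[0-9a-zA-Z_]* :/ { dev = $1 }
        # Progress lines look like "[==>...]  check = 12.6% (...) finish=... speed=...K/sec"
        / (check|resync|recovery|reshape) = / {
            pct = ""; speed = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "=") pct = $(i + 1)
                if ($i ~ /^speed=/) speed = substr($i, 7)
            }
            print dev, pct, speed
        }
    ' "$mdstat"
}
```

Running `md_check_progress` alongside a tool such as iostat during a check gives a rough picture of how much of the disk bandwidth the check is taking.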

  3. Managing the impact of checks on performance

    To minimize the impact of array checks on performance, you can apply the following techniques:

    Setting sysctl parameters:
    dev.raid.speed_limit_min and dev.raid.speed_limit_max: These parameters set the minimum and maximum speed of array checking and resync.
    You can view the current settings as follows:

cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

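The values are in KiB per second, per device.
Accordingly, the commands below set the minimum speed to about 500 MB/s
and the maximum to about 2000 MB/s.
Raise the limits so the check finishes quickly during the least loaded period, and lower them for the period with the greatest load. Keep in mind that speed_limit_min is the rate the kernel tries to sustain even while other I/O is active, so a high minimum increases the impact on other processes.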

echo 500000 > /proc/sys/dev/raid/speed_limit_min
echo 2000000 > /proc/sys/dev/raid/speed_limit_max
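
The idea of raising the limits in quiet hours and lowering them under load can be scripted. This is a rough sketch: the quiet window (01:00 to 05:59) and all four values are illustrative assumptions, and both function names are hypothetical.

```shell
#!/bin/sh
# Choose "min max" speed limits (KiB/s per device) for a given hour of the day.
# The window and the numbers are placeholders; tune them for your workload.
md_limits_for_hour() {
    hour="${1#0}"          # "03" -> "3", "0" -> ""
    hour="${hour:-0}"
    if [ "$hour" -ge 1 ] && [ "$hour" -lt 6 ]; then
        echo "50000 500000"    # quiet window: let the check run fast
    else
        echo "1000 20000"      # busy hours: keep the check in the background
    fi
}

# Write the limits to the kernel.
# $3: target directory; defaults to /proc/sys/dev/raid (overridable for dry runs)
apply_md_limits() {
    procdir="${3:-/proc/sys/dev/raid}"
    echo "$1" > "$procdir/speed_limit_min"
    echo "$2" > "$procdir/speed_limit_max"
}

# Example (as root), e.g. from an hourly cron job:
#   set -- $(md_limits_for_hour "$(date +%H)")
#   apply_md_limits "$1" "$2"
```

The same values can also be set with `sysctl -w dev.raid.speed_limit_max=...`; settings made either way are lost on reboot unless they are added to the sysctl configuration.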

3.1. Scheduling the check: Run checks during periods of low system load (such as at night or on weekends).

3.2. QoS (Quality of Service): Use QoS mechanisms to manage I/O priorities.

3.1.1 Scheduling the check with cron: Specify the time for the check in the crontab:

0 3 * * 0 /usr/share/mdadm/checkarray --all

3.1.2 Or with a systemd timer:
Create a timer file:

sudo nano /etc/systemd/system/mdcheck.timer
Add the following content to the file:

[Unit]
Description=Run RAID check every Sunday at 3 AM

[Timer]
OnCalendar=Sun 03:00:00
Persistent=true

[Install]
WantedBy=timers.target

Create a service file that will be started by the timer:

sudo nano /etc/systemd/system/mdcheck.service
Add the following content to the file:

[Unit]
Description=Check RAID arrays

[Service]
Type=oneshot
ExecStart=/usr/share/mdadm/checkarray --all

Reload systemd configuration and enable the timer:

sudo systemctl daemon-reload
sudo systemctl enable --now mdcheck.timer
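
If you prefer not to edit the files by hand, the two units can be written from a script. A minimal sketch, with the same unit contents as above; the function name and the optional target directory argument are illustrative.

```shell
#!/bin/sh
# Write mdcheck.timer and mdcheck.service into a unit directory.
# $1: target directory; defaults to /etc/systemd/system (needs root there)
write_mdcheck_units() {
    unitdir="${1:-/etc/systemd/system}"

    cat > "$unitdir/mdcheck.timer" <<'EOF'
[Unit]
Description=Run RAID check every Sunday at 3 AM

[Timer]
OnCalendar=Sun 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

    cat > "$unitdir/mdcheck.service" <<'EOF'
[Unit]
Description=Check RAID arrays

[Service]
Type=oneshot
ExecStart=/usr/share/mdadm/checkarray --all
EOF
}
```

After running it, reload systemd and enable the timer as shown above; `systemctl list-timers mdcheck.timer` should then show the next scheduled run.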

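3.2. QoS (Quality of Service)
Using Quality of Service (QoS) mechanisms to manage the load on the disk system during RAID array health checks can help minimize the impact of these operations on performance. In Linux, you can use tools such as ionice and cgroups to manage input/output (I/O) priorities. Note that checkarray only asks the kernel to start the check and the check I/O itself is issued by a kernel thread, so verify on your system that the priority or limit you set actually applies to it.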

3.2.1 Using ionice
ionice allows you to set I/O priorities for processes in Linux.

Example of using ionice:
Define a class and priority level for the array check. For example, you can use the idle class so that the check is performed only when the system is not busy with other operations:

ionice -c 3 /usr/share/mdadm/checkarray --all
Here:

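-c 3 specifies that the process should use the idle class, that is, perform I/O operations only when no other process is using the disk. Whether the idle class is honored depends on the I/O scheduler in use. Debian's checkarray script also accepts an --idle option intended for the same purpose.
Include the ionice command in your cron job or systemd service.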

Example with cron:

0 3 * * 0 ionice -c 3 /usr/share/mdadm/checkarray --all

Example with a systemd service:

[Unit]
Description=Check RAID arrays

[Service]
Type=oneshot
ExecStart=/usr/bin/ionice -c 3 /usr/share/mdadm/checkarray --all
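
ionice has more classes than idle. The helper below is a small sketch that maps a class name to the matching ionice flags, which can be handy when the class is chosen in a script. The function name and the default level of 7 are assumptions; the class numbers (1 realtime, 2 best-effort, 3 idle) and the 0 to 7 level range are those documented for ionice.

```shell
#!/bin/sh
# Print the ionice flags for a named I/O scheduling class.
# $1: idle | best-effort | realtime
# $2: priority level 0 (highest) to 7 (lowest); ignored for idle, default 7
io_class_flags() {
    class="$1"
    level="${2:-7}"
    case "$level" in
        [0-7]) ;;
        *) echo "priority level must be 0-7" >&2; return 1 ;;
    esac
    case "$class" in
        idle)        echo "-c 3" ;;
        best-effort) echo "-c 2 -n $level" ;;
        realtime)    echo "-c 1 -n $level" ;;
        *) echo "unknown I/O class: $class" >&2; return 1 ;;
    esac
}

# Example:
#   ionice $(io_class_flags best-effort 7) /usr/share/mdadm/checkarray --all
# Inspect the class of a running process:
#   ionice -p <PID>
```

The best-effort class with level 7 is a milder alternative to idle when the check must still make progress on a disk that is rarely completely quiet.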

3.2.2 Using cgroups
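Cgroups (Control Groups) are a mechanism that allows you to limit and control the resources used by groups of processes. You can use cgroups to limit the use of I/O resources. The blkio.throttle settings used below belong to the cgroup v1 blkio controller, and the cgcreate, cgset and cgexec commands come from the libcgroup tools.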

Example of using cgroups:
Create a new cgroup for the RAID array check:

sudo cgcreate -g blkio:/raid_check
Set I/O limits for this cgroup. For example, you can limit the read/write speed for devices:

sudo cgset -r blkio.throttle.read_bps_device="8:0 1048576" raid_check
sudo cgset -r blkio.throttle.write_bps_device="8:0 1048576" raid_check

Here:

8:0 is the device identifier as major:minor numbers (you can find it in the MAJ:MIN column of lsblk or with ls -l on the device node).
1048576 is the speed limit in bytes per second (1 MiB/s).
Run the array check in the context of the new cgroup:

sudo cgexec -g blkio:raid_check /usr/share/mdadm/checkarray --all
Include this command in your cron job or systemd service.

Example with Cron:

0 3 * * 0 sudo cgexec -g blkio:raid_check /usr/share/mdadm/checkarray --all
Example with Systemd Service:

[Unit]
Description=Check RAID arrays

[Service]
Type=oneshot
ExecStart=/usr/bin/cgexec -g blkio:raid_check /usr/share/mdadm/checkarray --all
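
The throttle values are easy to get wrong by a factor of 1024, so it can help to generate them. The sketch below builds the "MAJ:MIN bytes" string used by the blkio files above and, as an assumption for systems that run cgroup v2 instead, the equivalent line for that hierarchy's io.max file. Both function names are hypothetical.

```shell
#!/bin/sh
# Value for blkio.throttle.read_bps_device / write_bps_device (cgroup v1).
# $1: device as MAJ:MIN, $2: limit in MiB per second
blkio_throttle_value() {
    echo "$1 $(( $2 * 1024 * 1024 ))"
}

# Line for the io.max file of a cgroup v2 group: same limit for reads and writes.
# $1: device as MAJ:MIN, $2: limit in MiB per second
io_max_value() {
    bytes=$(( $2 * 1024 * 1024 ))
    echo "$1 rbps=$bytes wbps=$bytes"
}

# Example (cgroup v1, as in the text above):
#   dev=$(lsblk -dno MAJ:MIN /dev/sda | tr -d ' ')
#   sudo cgset -r blkio.throttle.read_bps_device="$(blkio_throttle_value "$dev" 10)" raid_check
```

A limit of 1 MiB/s, as in the example above, would make a check of a large array take a very long time, so in practice a higher value is likely to be needed.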

Instead of a Conclusion
Checking the health of an md disk array in Linux can significantly affect system performance, especially during periods of heavy disk usage. However, by adjusting the check speed limits and scheduling these operations, you can minimize the negative impact on performance.
QoS with ionice and cgroups allows flexible management of the system load during a RAID array health check. ionice provides a simple way to manage I/O priorities, while cgroups offer more detailed control and the ability to limit various resources. The choice of the appropriate method depends on the specific requirements and capabilities of the system.
