How PIDs are generated in Linux
Overview
Have you ever wondered what happens behind the scenes when we start or terminate a process? In this tutorial, we will learn how Linux generates PIDs for processes.
Process table in Linux
The Linux kernel uses a data structure called the process table for various tasks such as process scheduling. Every time we start a process, the kernel inserts an entry into the table with the following information:
PID
Parent process
Environment variables
Elapsed time
Status: one of D (uninterruptible sleep), R (running), S (sleeping), T (stopped), or Z (zombie)
Memory usage
We can access this information through the procfs file system, mounted at /proc, or by using resource monitoring tools such as top.
Let's take a look at some of this data by running the top command:
Mem: 4241112K used, 12106916K free, 360040K shrd, 20K buff, 1772160K cached
CPU: 0.8% usr 0.8% sys 0.0% nic 98.3% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 0.26 0.33 0.35 1/489 21888
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
1 0 root S 2552 0.0 10 0.0 init
30405 30404 baeldung S 2552 0.0 7 0.0 /bin/sh
219 1 root S 2552 0.0 10 0.0 /bin/getty 38400 tty2
1309 1308 baeldung S 2552 0.0 9 0.0 /bin/sh
21873 640 baeldung S 2552 0.0 5 0.0 [sleep]
21874 21758 baeldung R 2552 0.0 8 0.0 top
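The same fields can also be read straight from procfs. The following is a minimal sketch, assuming a Linux system with /proc mounted; proc_field is a hypothetical helper name introduced here for illustration, not a standard tool.

```shell
#!/bin/sh
# proc_field PID FIELD: print the value of FIELD from /proc/PID/status.
# (Hypothetical helper; each line of the file has the form "Field:<whitespace>value".)
proc_field() {
    sed -n "s/^$2:[[:space:]]*//p" "/proc/$1/status"
}

# Show a few process-table fields for the current shell ($$ is its PID).
for field in Pid PPid State VmRSS; do
    printf '%-6s %s\n' "$field" "$(proc_field "$$" "$field")"
done
```

Here Pid and PPid correspond to the PID and PPID columns of top, State to STAT, and VmRSS to the resident memory of the process.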
PID generation
Linux assigns process IDs sequentially, starting from 0 and going up to a configured maximum.
PID 0 is reserved for the kernel's idle task, and PID 1 is reserved for the init system, which is the first process to start.
The idle task is a “dummy process” that runs only when no other process can run, so the scheduler always has a task ready.
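We can check the relationship between these two reserved PIDs from user space. This is a small sketch, assuming /proc is mounted and /proc/1 is readable; the variable names are illustrative only.

```shell
#!/bin/sh
# Read the parent PID and the command name of PID 1 from procfs.
init_ppid=$(sed -n 's/^PPid:[[:space:]]*//p' /proc/1/status)
init_name=$(cat /proc/1/comm)
printf 'PID 1 is "%s" and its parent PID is %s\n' "$init_name" "$init_ppid"

# The idle task is internal to the kernel, so we do not expect an entry for PID 0.
[ -e /proc/0 ] || echo "There is no /proc/0 entry"
```

The parent PID reported for PID 1 matches the PPID column of the first row in the top output.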
We can check the maximum allowed PID value in the system by looking into the file /proc/sys/kernel/pid_max. Usually it is a 5-digit number:
$ cat /proc/sys/kernel/pid_max
32768
We can raise the limit to at most 2^22 (4,194,304) by writing the desired number to the file as root:
# desired=4194304
# echo "$desired" > /proc/sys/kernel/pid_max
# desired=4194305
# echo "$desired" > /proc/sys/kernel/pid_max
sh: write error: Invalid argument
Our second attempt gave an error because the number is greater than 2^22.
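The upper bound is easy to verify with shell arithmetic. The sketch below only mirrors the limit stated in the text; valid_pid_max is a hypothetical helper, and the kernel performs its own check when the file is written.

```shell
#!/bin/sh
# Upper bound for pid_max as stated in the text: 2^22.
PID_MAX_LIMIT=$((1 << 22))

# valid_pid_max VALUE: succeed if VALUE does not exceed the upper bound.
# (Hypothetical helper for illustration; it does not touch /proc.)
valid_pid_max() {
    [ "$1" -le "$PID_MAX_LIMIT" ]
}

for desired in 4194304 4194305; do
    if valid_pid_max "$desired"; then
        echo "$desired: within the limit"
    else
        echo "$desired: exceeds the limit of $PID_MAX_LIMIT"
    fi
done
```

Checking the value first lets a script fail with a clear message instead of relying on the write error.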
When we start a process, the kernel generates a PID that uniquely identifies it. It does so simply by incrementing the most recently assigned PID by 1, which is also the current highest PID as long as the numbering has not yet wrapped around.
Let's confirm this with a simple script:
#!/bin/sh -e
# This script assumes that printf is a shell builtin and therefore does not consume extra PIDs.
highest=0
for pid in /proc/[0-9]*; do
    pid="${pid##*/}" # Extract the PID from the path
    [ "$pid" -gt "$highest" ] && highest="$pid" # -gt means "greater than"
done
printf "The highest PID is %d\n" "$highest"
for _ in $(seq 4); do
    printf "Started a new process with PID %d\n" "$(readlink /proc/self)"
done
First, we found the current highest PID in the system. Then we started four readlink processes, each of which printed the new PID assigned to it.
Let's take a look at the output of the script:
The highest PID is 10522
Started a new process with PID 10524
Started a new process with PID 10525
Started a new process with PID 10526
Started a new process with PID 10527
Notice the gap of one PID between the highest PID we found and the first new PID: the seq command itself runs as an additional process. In the same way, any other process started on the system during the test consumes a PID and can skew the result.
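A variation on this test avoids spending a PID on the loop itself. This is a minimal sketch, assuming a POSIX shell; spawn_pids is a hypothetical function name. On a quiet system the printed PIDs are typically consecutive, but that is not guaranteed, because other processes may start in between.

```shell
#!/bin/sh
# spawn_pids: start four background jobs and print the PID of each.
# The loop uses a literal word list, so no extra process is spent on seq.
spawn_pids() {
    for i in 1 2 3 4; do
        true &                 # fork a child that exits immediately
        printf '%s\n' "$!"     # $! holds the PID of the last background job
    done
    wait                       # collect the exit status of every child
}

spawn_pids
```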
Can PIDs run out?
In the previous section we discussed the maximum PID value in the system, so what happens when we reach this limit?
If we reach the maximum PID value, the search for the next PID starts again from the minimum value and continues until a free one is found, that is, a PID that belonged to a process that has already terminated.
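The wraparound search can be modeled in a few lines of shell. This is a toy model, not kernel code: next_pid and its parameters are hypothetical, the maximum is treated as an exclusive bound, and the lowest reusable PID is passed in explicitly. Both of those choices are assumptions made for the example.

```shell
#!/bin/sh
# next_pid LAST MAX MIN [USED...]: print the next free PID after LAST.
# MAX is treated as exclusive, MIN is the first PID tried after wrapping,
# and USED is the list of PIDs still in use. Returns 1 if none is free.
next_pid() {
    last=$1 max=$2 min=$3
    shift 3
    candidate=$((last + 1))
    tries=0
    while [ "$tries" -lt "$((max - min))" ]; do
        # Wrap around once the candidate reaches the maximum.
        [ "$candidate" -ge "$max" ] && candidate=$min
        case " $* " in
            *" $candidate "*) candidate=$((candidate + 1)) ;;  # still in use
            *) echo "$candidate"; return 0 ;;                  # free
        esac
        tries=$((tries + 1))
    done
    return 1
}

# PIDs 2, 3 and 7 are in use, the last assigned PID was 6, the maximum is 8.
next_pid 6 8 2 7 2 3   # prints 4
```

In the usage line, PID 7 is taken, the candidate reaches the maximum and wraps to 2, and the search skips 2 and 3 before settling on 4.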
Note that a process counts as terminated only once its parent has collected its exit status. Until then it remains a zombie and keeps its PID, so malicious software can exhaust the available PIDs in the system, for example by spawning children and never collecting their exit statuses.
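To see a process holding on to its PID after it has exited, we can create a short-lived zombie and count the entries in state Z. This is a sketch, assuming /proc is mounted; count_zombies is a hypothetical helper, and the count also includes any zombies that already exist on the system.

```shell
#!/bin/sh
# count_zombies: print how many processes are currently in state Z,
# according to the State: line of each /proc/<pid>/status file.
count_zombies() {
    grep -ls '^State:[[:space:]]*Z' /proc/[0-9]*/status | wc -l
}

# Demo: the child (true) exits at once, but its parent replaces itself
# with sleep, which never collects the child's exit status.
sh -c 'true & exec sleep 2' &
sleep 1
echo "Zombies while the parent is alive: $(count_zombies)"
wait
```

Once the sleep process exits, the zombie is handed over to another process, normally init, which is then expected to collect its exit status and free the PID.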
Conclusion
In this article, we learned about process IDs in Linux – how they are generated, how high they can go, and what happens when the limit is reached.