How PIDs are generated in Linux

Overview

Have you ever wondered what happens behind the scenes when we start or terminate a process? In this tutorial, we will learn how Linux generates PIDs for processes.

Process table in Linux

The Linux kernel uses a data structure called the process table for various tasks such as process scheduling. Every time we start a process, the kernel inserts an entry into the table with the following information:

  • PID

  • Parent process

  • Environment variables

  • Elapsed time

  • Status: one of D (Uninterruptible sleep), R (Running), S (Sleeping), T (Stopped), or Z (Zombie)

  • Memory usage

We can access this information through the procfs file system, which is mounted on the /proc directory, by using various resource monitoring tools such as top.

Let's take a look at some of this data by running the top command:

Mem: 4241112K used, 12106916K free, 360040K shrd, 20K buff, 1772160K cached
CPU:  0.8% usr  0.8% sys  0.0% nic 98.3% idle  0.0% io  0.0% irq  0.0% sirq
Load average: 0.26 0.33 0.35 1/489 21888
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
    1     0 root     S     2552  0.0  10  0.0 init
30405 30404 baeldung S     2552  0.0   7  0.0 /bin/sh
  219     1 root     S     2552  0.0  10  0.0 /bin/getty 38400 tty2
 1309  1308 baeldung S     2552  0.0   9  0.0 /bin/sh
21873   640 baeldung S     2552  0.0   5  0.0 [sleep]
21874 21758 baeldung R     2552  0.0   8  0.0 top
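The same fields can also be read straight from procfs without a monitoring tool. The following is a minimal sketch, assuming a Linux system with /proc mounted; proc_field is a hypothetical helper name introduced here for illustration, and the chosen keys are just a sample of what /proc/<pid>/status contains.

```shell
#!/bin/sh
# Hypothetical helper: print one field of /proc/<pid>/status.
# Usage: proc_field PID KEY   (KEY is e.g. Pid, PPid, State, VmRSS)
proc_field() {
    # Each line looks like "Key:<tab>value ..."; print the first value word.
    awk -v key="$2:" '$1 == key { print $2 }' "/proc/$1/status"
}

# Show a few process-table fields for the current shell ($$ is its PID).
for key in Pid PPid State VmRSS; do
    printf '%-6s %s\n' "$key" "$(proc_field "$$" "$key")"
done
```

The State value is the same single-letter code that top prints in its STAT column.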

PID generation

Linux assigns process IDs sequentially, starting from 0 and staying below a configured maximum value.

PID 0 is reserved for the kernel's idle task, which ensures that there is always a runnable task available for scheduling, and PID 1 is reserved for the init system, which is the first process to start.

The process with PID 0 is an idle task, which is a “dummy process.” It runs when no other processes can run, ensuring that the processor remains active and ready to handle other tasks as they become available.
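As a quick check of these two reserved PIDs, the sketch below looks them up in procfs. It assumes /proc is mounted; the name reported for PID 1 depends on the init system in use (inside a container it is usually the container's entry process), and the idle task is not expected to have a /proc entry at all.

```shell
#!/bin/sh
# /proc/1/comm holds the command name of PID 1 (the init process).
init_name=$(cat /proc/1/comm)
echo "PID 1 is: $init_name"

# PID 0 is internal to the kernel and is not exposed through procfs.
if [ -e /proc/0 ]; then
    echo "/proc/0 exists"
else
    echo "/proc/0 is not exposed"
fi
```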

We can check the maximum allowed PID value in the system by looking into the file /proc/sys/kernel/pid_max. Usually it is a 5-digit number:

$ cat /proc/sys/kernel/pid_max 
32768

We can raise the limit up to a maximum of 2^22 (4,194,304) by writing the desired number to the file as root:

# desired=4194304
# echo "$desired" > /proc/sys/kernel/pid_max
# echo "$((desired + 1))" > /proc/sys/kernel/pid_max

Our second attempt gave an error because the number is greater than 2^22.
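The upper bound can be checked before touching the file. This is a minimal sketch, assuming the 2^22 ceiling quoted above applies to the system at hand; pid_max_ok and PID_MAX_CEILING are hypothetical names, and the script only validates candidate values without writing anything.

```shell
#!/bin/sh
# Upper bound quoted in the text: 2^22 = 4194304 (assumed for this system).
PID_MAX_CEILING=$((1 << 22))

# Hypothetical helper: succeed only if $1 is a whole number
# that does not exceed the ceiling.
pid_max_ok() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -le "$PID_MAX_CEILING" ]
}

for candidate in 32768 4194304 4194305; do
    if pid_max_ok "$candidate"; then
        echo "$candidate: within the limit"
    else
        echo "$candidate: rejected, above $PID_MAX_CEILING"
    fi
done
```

Only the last candidate is rejected, since it is one above the ceiling.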

When we start a process, a PID is generated for it, which allows us to uniquely identify it. The kernel does this by incrementing the most recently assigned PID by 1, which, until the limit is reached, is also the current highest PID.

Let's confirm this with a simple script:

#!/bin/sh -e
# This script assumes that printf is a shell builtin and therefore does not consume extra PIDs.
highest=0

for pid in /proc/[0-9]*; do
    pid="${pid##*/}" # Extract the PID from the path
    [ "$pid" -gt "$highest" ] && highest="$pid" # -gt means "greater than"
done

printf "The highest PID is %d\n" "$highest"

for _ in $(seq 4); do
    printf "Started a new process with PID %d\n" "$(readlink /proc/self)"
done

First, we calculated the current highest PID in the system. Then, we started four readlink processes, each of which printed the new PID assigned to it.

Let's take a look at the output of the script:

The highest PID is 10522
Started a new process with PID 10524
Started a new process with PID 10525
Started a new process with PID 10526
Started a new process with PID 10527

We can notice a gap of one PID between the highest PID we found and the first new PID we printed, because the seq command itself runs as an additional process. External processes therefore affect such tests, since every process they start consumes a new PID.
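A variation that avoids scanning /proc is to start two children back to back and compare their PIDs, which the shell exposes through $!. This is only a sketch: on a busy system, other processes may claim PIDs in between, and after a wraparound the second PID can even be lower than the first, so the gap is not guaranteed to be 1.

```shell
#!/bin/sh
# Start two short-lived background children and record their PIDs.
sleep 0 &
first=$!
sleep 0 &
second=$!

# Collect both exit statuses so that no zombies are left behind.
wait

echo "first=$first second=$second gap=$((second - first))"
```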

Can PIDs run out?

In the previous section we discussed the maximum PID value in the system, so what happens when we reach this limit?

If we reach the maximum PID value, the search for the next PID wraps around to the minimum value and continues until a free one is found, that is, a PID that belonged to a process that has already terminated.
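The wraparound search can be illustrated with a toy model in shell. This is not the kernel's allocator, only a sketch of the behavior described above; next_free_pid is a hypothetical function, MAX is treated here as the highest usable PID, and the set of PIDs in use is passed in as a plain space-separated string.

```shell
#!/bin/sh
# Toy model of the wraparound search; not the kernel's actual code.
# Usage: next_free_pid LAST MIN MAX "space-separated PIDs in use"
next_free_pid() {
    last=$1 min=$2 max=$3 in_use=" $4 "
    candidate=$((last + 1))
    tries=$((max - min + 1))      # examine each PID in the range at most once
    while [ "$tries" -gt 0 ]; do
        # Past the maximum: wrap around to the minimum value.
        if [ "$candidate" -gt "$max" ]; then
            candidate=$min
        fi
        case "$in_use" in
            *" $candidate "*) ;;                 # taken, keep searching
            *) echo "$candidate"; return 0 ;;    # free, hand it out
        esac
        candidate=$((candidate + 1))
        tries=$((tries - 1))
    done
    return 1    # every PID in the range is in use
}

# PID 10 is taken, so the search wraps to 1, skips 1 and 2, and prints 3.
next_free_pid 9 1 10 "1 2 4 10"
```

When every PID in the range is in use, the function returns a non-zero status instead of printing anything, which mirrors the exhaustion case.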

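It should be noted that a process is considered “terminated” only once its exit status has been collected by its parent. Until then, it remains a zombie and keeps its PID. Thus, malware can exhaust the available PIDs in the system by spawning children and never collecting their exit statuses.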
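Processes whose exit status has not yet been collected show up with state Z, so counting them gives a rough idea of how many PIDs are being held this way. The sketch below assumes /proc is mounted; count_zombies is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count processes currently in state Z (zombie).
count_zombies() {
    count=0
    for status in /proc/[0-9]*/status; do
        # A process may disappear between the glob and the read; skip it then.
        state=$(awk '$1 == "State:" { print $2 }' "$status" 2>/dev/null) || continue
        if [ "$state" = "Z" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

echo "Zombie processes holding a PID: $(count_zombies)"
```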

Conclusion

In this article, we learned about process IDs in Linux – how they are generated, how high they can go, and what happens when the limit is reached.
