This material has been translated.
Have you been tasked with designing a container-based infrastructure? If so, you most likely understand the benefits containers can bring to developers, architects, and operations teams. You’ve already read something about them and are looking forward to exploring this technology in more depth. However, before diving into architecture discussions and deploying containers in a production environment, there are three important things to know:
All applications, including containerized ones, rely on the kernel of the underlying OS.
The kernel provides APIs to applications through system calls.
The version of this API matters, because it is the glue that provides deterministic communication between user space and kernel space.
Containers are sometimes thought of as virtual machines, but unlike with virtual machines, the kernel is the only layer of abstraction between programs and the resources they need to access. Let’s see why.
All processes make system calls:
And since containers are processes too, they also make system calls:
So, we understand what a process is, and that containers are processes too. But what about the files and programs inside the container? These files and programs make up what is known as user space. When the container starts, the program from the container image is loaded into memory. But the program running in the container still needs to make system calls into kernel space. What matters is that user space and kernel space can interact deterministically.
User space refers to all operating system code that resides outside the kernel. Most Unix-like operating systems (including Linux) come with a variety of preinstalled utilities, development tools, and graphical tools — all user-space applications.
Applications can be written in C, Java, Python, Ruby, and other programming languages. In the container world, these programs usually come in a container image format like Docker. When you run a Red Hat Enterprise Linux 7 container image from the Red Hat Registry, you are using the preconfigured minimal user space of Red Hat Enterprise Linux 7, which contains utilities such as bash, awk, grep, and yum (for installing additional software).
docker run -i -t rhel7 bash
All user applications (both containerized and not) work with data when running, but where is this data stored? Some data comes from the registers of the processor and from external devices, but more often it is stored in memory and on disk. Applications access data by making special requests to the kernel, called system calls, for example to allocate memory (for variables) or to open a file. Memory and files often contain confidential information belonging to different users, so access to them must be requested from the kernel using system calls.
The kernel provides an abstraction for security, hardware, and internal data structures. For example, the open() system call is used to get a file descriptor in Python, C, Ruby, and other programming languages. You probably don’t want your program to work with XFS at the bit level, so the kernel provides system calls and works with the drivers. In fact, this system call is so common that it is part of the POSIX library.
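As a minimal sketch of this idea, the Python snippet below uses os.open(), a thin wrapper around the open() system call, to obtain a file descriptor, reads through it, and closes it again. The helper name read_via_fd and the temporary file are illustrative assumptions, not something taken from the original article.

```python
import os
import tempfile


def read_via_fd(path, size=4096):
    """Return (descriptor, data) after reading up to `size` bytes from `path`."""
    # os.open() wraps the open() system call and returns an integer descriptor.
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.read() wraps read(); the kernel and its drivers deal with the
        # filesystem, so the program never touches on-disk structures itself.
        return fd, os.read(fd, size)
    finally:
        # os.close() wraps close() and releases the descriptor.
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical demo file; any readable path would do.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from user space\n")
    try:
        fd, data = read_via_fd(tmp.name)
        print("descriptor:", fd, "data:", data)
    finally:
        os.unlink(tmp.name)
```

The same three calls happen when the script runs inside a container; only the user-space files around it are different.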
In the following figure, notice that bash calls getpid(), which returns its own process ID, and that the cat command requests access to /etc/hosts by calling open(). In the next article we’ll look at how this works in the container world, but for now, note that some of the code is in user space and some is in the kernel.
Regular user-space programs constantly make system calls to do their work, for example:
ls ps top bash
Some programs running in user space map almost directly to system calls, for example:
chroot sync mount/umount swapon/swapoff
Digging one level deeper, you can find examples of system calls that are performed by the programs listed above. They are usually called through libraries such as glibc, through an interpreter (Ruby, Python), or through the Java Virtual Machine.
open (files) getpid (processes) socket (network)
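To make these layers a little more concrete, here is a hedged sketch for a typical Linux host. It asks for the process ID once through the Python interpreter and once through the C library loaded with ctypes, then creates a socket, which is one more descriptor handed out by the kernel. Loading the C library with ctypes.CDLL(None) is an assumption that holds on common Linux systems, and the function names are illustrative.

```python
import ctypes
import os
import socket


def pid_via_interpreter():
    # The interpreter's os module ends up calling getpid() for us.
    return os.getpid()


def pid_via_libc():
    # Assumption: CDLL(None) exposes the C library already loaded into this
    # process (true on typical Linux hosts); its getpid() wrapper makes the
    # same system call.
    libc = ctypes.CDLL(None)
    return libc.getpid()


def socket_descriptor():
    # socket() is a system call too; the kernel answers with a descriptor.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.fileno()


if __name__ == "__main__":
    print("pid via interpreter:", pid_via_interpreter())
    print("pid via libc:       ", pid_via_libc())
    print("socket descriptor:  ", socket_descriptor())
```

Both paths report the same process ID, because both end in the same request to the kernel.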
A typical program accesses kernel resources through multiple layers of abstraction, as shown in the following figure:
To get an idea of what system calls are available in the Linux kernel, see the syscalls man page. Interestingly, I am running this command on my laptop with Red Hat Enterprise Linux 7, but using a Red Hat Enterprise Linux 6 container to see how the system calls have changed:
docker run -t -i rhel6-base man syscalls
SYSCALLS(2)                Linux Programmer’s Manual               SYSCALLS(2)

NAME
       syscalls - Linux system calls

SYNOPSIS
       Linux system calls.

DESCRIPTION
       The system call is the fundamental interface between an application and the kernel.

       System call                 Kernel        Notes
       ------------------------------------------------------------------------------
       _llseek(2)                  1.2
       _newselect(2)
       _sysctl(2)
       accept(2)
       accept4(2)                  2.6.28
       access(2)
       acct(2)
       add_key(2)                  2.6.11
       adjtimex(2)
       afs_syscall(2)                            Not implemented
       alarm(2)
       alloc_hugepages(2)          2.5.36        Removed in 2.5.44
       bdflush(2)                                Deprecated (does nothing) since 2.6
       bind(2)
       break(2)                                  Not implemented
       brk(2)
       cacheflush(2)               1.2           Not on i386
Note that, according to the man page, some system calls (also known as interfaces) have been removed and some have been added. Linus Torvalds and others take great care to keep the behavior of system calls clear and stable. Red Hat Enterprise Linux 7 (kernel 3.10) has 382 system calls available. New ones are added from time to time, and some are declared obsolete. This should be taken into account when considering the lifecycle of your container infrastructure and the applications that will run on it.
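One way to picture what a changing system call set means for an application: getrandom() was only added in Linux 3.17, so a program that wants it has to cope with kernels that lack it. The sketch below is an illustration under that assumption, not a recommendation from the original article. The name random_bytes is hypothetical, and it assumes that an unsupported call surfaces in Python as an OSError with ENOSYS.

```python
import errno
import os


def random_bytes(n):
    """Return n random bytes, preferring the getrandom() system call."""
    # os.getrandom is missing on platforms whose Python build lacks it.
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        try:
            data = getrandom(n)
            if len(data) == n:
                return data
        except OSError as exc:
            # Assumption: a kernel without the call reports ENOSYS.
            if exc.errno != errno.ENOSYS:
                raise
    # Fallback for older kernels or short reads.
    return os.urandom(n)


if __name__ == "__main__":
    print(random_bytes(16).hex())
```

A containerized program carries this kind of logic in its own user space, while the kernel that decides which branch runs belongs to the host.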
There are a few important things to know about user and kernel space:
Applications contain business logic but use system calls.
Once a program is compiled, the set of system calls it uses is embedded in the binary (for higher-level languages, in the interpreter or JVM).
Containers do not abstract away the need for user space and kernel space to share the same set of system calls.
In the container world, user space is bundled and deployed on a variety of hosts from laptops to production servers.
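The split described in these points can be observed from inside any process. The following sketch, with illustrative function names and an assumed /etc/os-release file, prints the kernel release reported by the uname() system call next to the distribution name found in the user-space files. Run inside a container, the first value comes from the host's kernel and the second from the container image.

```python
import os


def kernel_release():
    # os.uname() wraps the uname() system call, so the kernel itself answers.
    return os.uname().release


def user_space_name(path="/etc/os-release"):
    # Assumption: the image ships an os-release file with a PRETTY_NAME entry.
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                if line.startswith("PRETTY_NAME="):
                    return line.split("=", 1)[1].strip().strip('"')
    except OSError:
        pass
    return "unknown"


if __name__ == "__main__":
    print("kernel space:", kernel_release())
    print("user space:  ", user_space_name())
```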
Problems may appear in the coming years.
Over time, it will be difficult to ensure that a container created today will work tomorrow. Imagine that it is 2024 (perhaps with real hoverboards) and you are still running a containerized application in production that requires a Red Hat Enterprise Linux 7 user space. How can you safely upgrade your container host and infrastructure? Will the containerized application perform equally well on the new hosts available on the market?
In the second part of this series (original in English: Architecting Containers Part 2: Why the User Space Matters), we’ll look at how the relationship between user space and kernel space influences architectural decisions and what can be done to minimize problems.
The translation of the material was prepared as part of the course “Administrator Linux. Professional”.