Adapting fuzzing to find vulnerabilities

Fuzzing is a very popular method of testing software with random input data. There is a huge amount of material on the Internet about using it to find software defects. At the same time, there are almost no public articles or talks about how to search for vulnerabilities with fuzzing. Perhaps security researchers do not want to share their secrets. That makes the topic all the more interesting to cover in this article.

I have been researching operating system security and fuzzing for several years. I like fuzzing because it lets you delegate the tedious task of writing software tests to the computer. However, the way a fuzzer is used varies greatly depending on the goal.

In particular, a developer fuzzes the project's own code to find every error in it. A developer therefore usually enables all available error detectors in the project and analyzes every report they produce during fuzzing.

The security researcher has different goals:

  1. Unlike the developer, the researcher is not interested in every error in the code but looks specifically for vulnerabilities: errors that an attacker can trigger by interacting with the attack surface of the system.

  2. Moreover, for a security researcher, errors that can be triggered relatively quickly and reliably in a system are of great value.

  3. Finally, a security researcher strives to find unique vulnerabilities that are unlikely to be found by competitors. It is very frustrating to spend time and effort analyzing a bug and then find out that someone else has already found and fixed it.

In this article I will discuss how these goals affect the fuzzer setup and usage.

To make the article more specific, I will discuss my favorite kernel fuzzer, syzkaller. It is a well-known open source project that is used for dynamic kernel analysis in many operating systems. I have been using it for several years to research the security of the Linux kernel.

Syzkaller fuzzer architecture

Figure 1 shows the architecture of syzkaller.

  Figure 1. Architecture of the syzkaller fuzzer


The core logic of the syzkaller fuzzer lives in the syz-manager component. It manages the virtual machines during fuzzing. syz-manager also maintains the corpus, the set of programs used to test the kernel: it adds new promising programs to the corpus and removes useless ones. These programs are essentially the random input data for kernel fuzzing. They are written in a special language, syzlang, which specifies the format and arguments of Linux system calls.

If the kernel crashes during fuzzing, the fuzzer saves the event in its database and tries to generate a minimal reproducer: the shortest sequence of system calls that triggers this error in the kernel.

The kernel fuzzing itself happens inside a virtual machine. Its user space contains the parts of syzkaller that execute system calls and collect kernel code coverage for each executed program. This information is passed to syz-manager, which uses it to select promising programs for the fuzzing corpus. This very effective technique is called coverage-guided fuzzing.
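
To make the idea concrete, here is a toy sketch of a coverage-guided loop in Python. It is not syzkaller's implementation: `toy_target`, `mutate`, and `fuzz` are hypothetical names of mine, and the "coverage" is a simulated set of branch labels rather than data collected from a kernel. The only point is the selection rule: an input stays in the corpus only if it reaches code that no earlier input reached.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical fuzz target: returns the set of branch labels it executed."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("FUZ")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Either overwrite one random byte or append a random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(target, seeds, iterations, rng):
    """Keep a mutated input only if it reaches coverage not seen before."""
    corpus = list(seeds)
    seen = set()
    for seed in corpus:
        seen |= target(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = target(candidate)
        if not coverage <= seen:  # new coverage: a promising program
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen
```

In this sketch the corpus grows by at most one entry per newly covered branch, which is what keeps the search focused compared with purely random input generation.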

Kernel error detectors and sanitizers are also very important when fuzzing Linux. They make the kernel fail when an abnormal situation occurs. Without them, an error such as a use-after-free would go undetected, and fuzzing would be essentially useless.

This is the basic architecture of the syzkaller fuzzer. Now let's look at how to adapt it to search for vulnerabilities in the Linux kernel.

How to Find Vulnerabilities in the Linux Kernel Using Fuzzing

Vulnerabilities in the Linux kernel can be divided into two classes:

  1. Local Privilege Escalation (LPE) vulnerabilities: When exploited, a local unprivileged user becomes root or another user with elevated privileges on the system.

  2. Remote Code Execution (RCE) vulnerabilities: When exploited, an attacker interacting with a Linux system over a network can execute arbitrary code in the kernel space.

To make syzkaller find only errors that can potentially lead to LPE, the only modification required is to run the fuzzer inside the virtual machine without administrator privileges. System calls are then executed under an unprivileged user account, so only the kernel attack surface available to such a user is tested (see Figure 2).

  Figure 2. Running the fuzzer without administrator privileges
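
One possible way to get this behavior is the `sandbox` setting in the syz-manager configuration. To my understanding, the `setuid` value makes the executor drop to an unprivileged user before running system calls, but this is an assumption that should be checked against the syzkaller version in use, and it is not necessarily the exact setup used for this article. The helper name `unprivileged_config` and all paths and sizes below are placeholders.

```python
import json


def unprivileged_config(workdir: str, kernel_obj: str, image: str,
                        sshkey: str, syzkaller_dir: str) -> dict:
    """Build a syz-manager config dict that runs test programs unprivileged."""
    return {
        "target": "linux/amd64",
        "http": "127.0.0.1:56741",
        "workdir": workdir,
        "kernel_obj": kernel_obj,
        "image": image,
        "sshkey": sshkey,
        "syzkaller": syzkaller_dir,
        "procs": 4,
        "type": "qemu",
        "vm": {"count": 2, "cpu": 2, "mem": 2048},
        # Assumption: "setuid" drops privileges to an unprivileged user
        # before system calls are executed.
        "sandbox": "setuid",
    }


if __name__ == "__main__":
    cfg = unprivileged_config("/path/to/workdir", "/path/to/linux",
                              "/path/to/rootfs.img", "/path/to/id_rsa",
                              "/path/to/syzkaller")
    print(json.dumps(cfg, indent=4))
```

The printed JSON can be saved to a file and passed to syz-manager as its configuration.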


Finding bugs that could potentially lead to RCE requires a different approach: fuzzing the Linux kernel's network interfaces. This is described in detail in Andrey Konovalov's excellent article Looking for Remote Code Execution bugs in the Linux kernel. In it, he explained how the TUN/TAP virtual network interface works and described a special call, syz_emit_ethernet, which together allow the syzkaller fuzzer to interact with the Linux kernel network stack (see Figure 3).

  Figure 3. Interaction of the syzkaller fuzzer with the Linux kernel network stack


How to Find Reliably Reproducible Vulnerabilities

As mentioned above, for a security researcher, errors that can be triggered relatively quickly and reliably in a system are of great value.

syz-manager has dedicated logic that is triggered when a kernel failure is detected. It starts with the entire large set of system calls that preceded the error and gradually narrows it down by bisection to the minimal reproducer that still causes the failure. This process is unstable because of various side effects and race conditions in the kernel. As a result, the search for a reproducer often produces both false positives and false negatives.
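
The narrowing-down idea can be sketched as follows. This is a simplified illustration, not syzkaller's actual algorithm: `minimize` is a hypothetical function, and `crashes` is a hypothetical oracle that runs a program (here just a list of system call names) and reports whether the kernel failed. The sketch assumes the oracle is deterministic, which is exactly what real kernel crashes often are not.

```python
def minimize(calls, crashes):
    """Shrink `calls` while `crashes(calls)` stays true.

    Chunks of decreasing size (half, quarter, ..., one call) are removed
    one at a time; a removal is kept only if the crash still reproduces.
    """
    assert crashes(calls), "the starting program must reproduce the crash"
    chunk = max(len(calls) // 2, 1)
    while True:
        i = 0
        while i < len(calls):
            candidate = calls[:i] + calls[i + chunk:]
            if candidate and crashes(candidate):
                calls = candidate  # the chunk was not needed, drop it
            else:
                i += chunk  # the chunk is needed, keep it and move on
        if chunk == 1:
            return calls
        chunk //= 2


if __name__ == "__main__":
    # Toy oracle: the "kernel" fails when ioctl is followed later by close.
    def oracle(prog):
        return ("ioctl" in prog and "close" in prog
                and prog.index("ioctl") < prog.index("close"))

    program = ["open", "ioctl", "mmap", "read", "close", "write"]
    print(minimize(program, oracle))
```

With a flaky oracle the same procedure can drop a call that was actually required or keep one that was not, which is where the false results mentioned above come from.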

To avoid wasting time and effort on analyzing unstable reproducers, a security researcher should build an automatic system for sorting fuzzing results (see Figure 4). I developed such automation for my own search criteria. This is easy to do with the syz-repro utility, which lets you repeat the search for a minimal reproducer several times.

  Figure 4. Automatic sorting of fuzzing results
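
A minimal sketch of such sorting: rerun each reproducer several times and keep only those that make the kernel fail in a sufficient share of runs. Everything here is an assumption for illustration: `run_reproducer` is a hypothetical callable standing in for whatever actually boots a virtual machine and executes the reproducer (for example, a wrapper around syz-repro), and the run count and threshold are arbitrary values.

```python
from typing import Callable, Dict, List


def stability(run_reproducer: Callable[[str], bool], repro: str,
              runs: int = 10) -> float:
    """Fraction of runs in which the reproducer triggered the kernel failure."""
    hits = sum(1 for _ in range(runs) if run_reproducer(repro))
    return hits / runs


def triage(run_reproducer: Callable[[str], bool], repros: List[str],
           runs: int = 10, threshold: float = 0.8) -> Dict[str, List[str]]:
    """Sort reproducers into 'stable' and 'unstable' buckets by success rate."""
    buckets: Dict[str, List[str]] = {"stable": [], "unstable": []}
    for repro in repros:
        rate = stability(run_reproducer, repro, runs)
        key = "stable" if rate >= threshold else "unstable"
        buckets[key].append(repro)
    return buckets
```

Only the reproducers in the stable bucket are then worth manual analysis; the rest can be retried later or discarded.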


How to Find Unique Vulnerabilities

Let's move on to the most interesting part of the article and look at how to find unique vulnerabilities that are unlikely to be found by other researchers.

The thing is, it's impossible to find something unique using standard tools that everyone has. So you need to somehow modify your fuzzing process to find unique vulnerabilities.

In Figure 5, I have marked in red and numbered the components of the syzkaller fuzzer and the Linux kernel that can be modified in order to obtain unique findings.

  Figure 5. Modifiable components of the syzkaller fuzzer and the Linux kernel


  1. The simplest idea is to limit the allowed Linux system calls that the fuzzer executes. This can be done in the syzkaller configuration. Using this method, you can narrow the attack surface that is fuzzed. This allows syzkaller to go deeper into the kernel code and get more coverage in the subsystem being examined.

  2. Another effective way to find as yet undiscovered vulnerabilities is to develop new kernel API descriptions in syzlang. As mentioned above, syzlang is a special language that describes the format and arguments of kernel system calls. Kernel interfaces that are not yet described in syzkaller are not fuzzed at all and therefore are an interesting target. Researchers have found many vulnerabilities using this method.

  3. There are many user-space fuzzers, and they compete with each other by improving how they mutate the fuzzing corpus and by using symbolic execution. This direction is also relevant for syzkaller: changing the mutation engine affects which code paths in the kernel the fuzzer reaches, so it can help find unique vulnerabilities. However, such a modification of the fuzzer requires a deep understanding of its design.

  4. An easier way to influence the fuzzing process is to start with a specially prepared corpus. Many studies show that programs in the initial corpus (also called seeds) have a significant impact on the fuzzing process.

  5. Let's move on to modifying Linux kernel components. Having the source code makes a great trick possible: modifying the Linux kernel so that it becomes more convenient for fuzz testing. This is how I found the vulnerability CVE-2019-18683, for which I then developed a prototype exploit, performed responsible disclosure, and wrote a fix. This Linux kernel vulnerability was hidden behind a kernel warning, and I found it by modifying the kernel to turn off all warnings. Changing the fuzz target can be very effective in finding new bugs.

  6. Now let's look at the most obvious way to differentiate yourself from your competitors: using even more computing power. The more servers you fuzz on and the more virtual machines you run on them, the more kernel failures they detect. What matters then is that the researcher has enough time and energy to analyze them.

  7. Another unusual way to find unique bugs in the Linux kernel is to modify the rootfs. The virtual machine file system image does not directly affect the Linux kernel being fuzzed, but sometimes changes to the rootfs have an unexpected effect and enable additional kernel APIs. This is how I discovered the vulnerability CVE-2017-2636, for which I also managed to develop a prototype exploit and perform responsible disclosure. In that case, I added compiled kernel modules to the VM file system image, and during fuzzing the kernel automatically loaded the n_hdlc module, which at that time contained the bug that I analyzed.

  8. A rather complex but very effective approach is to modify sanitizers and other means of detecting errors in the Linux kernel. Some types of errors remain unnoticed during fuzzing because they are not tracked by detectors and, therefore, do not lead to a kernel crash. An example is an out-of-bounds error inside the sk_buff kernel object, which is a representation of a network packet in the Linux kernel memory. Developing a detector for this class of errors would allow fuzzing to detect vulnerabilities that potentially lead to RCE.

  9. Another approach common in user-space fuzzer development is targeted fuzzing, which limits the set of code to be tested. The same can be done when fuzzing the Linux kernel by configuring cover_filter in syzkaller or modifying the kernel kcov subsystem to collect coverage only for the Linux subsystem in which we are looking for vulnerabilities.
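
Items 1 and 9 come down to configuration changes, so they can be sketched together. The helper below takes a base syz-manager configuration and adds `enable_syscalls` (item 1) and `cover_filter` (item 9). The field names reflect my understanding of the syzkaller manager configuration and may differ between versions; `focus_config`, the system call names, and the directory list are illustrative assumptions, not a recommended target.

```python
import json


def focus_config(base: dict, syscalls: list, source_dirs: list) -> dict:
    """Return a copy of `base` limited to some syscalls and kernel source dirs."""
    cfg = dict(base)
    # Item 1: fuzz only the listed system calls.
    cfg["enable_syscalls"] = list(syscalls)
    # Item 9: guide fuzzing by coverage from files under these directories only.
    # Each directory becomes an anchored regular expression such as "^net/core/".
    cfg["cover_filter"] = {
        "files": ["^" + d.strip("/") + "/" for d in source_dirs]
    }
    return cfg


if __name__ == "__main__":
    base = {"target": "linux/amd64", "workdir": "/path/to/workdir"}
    cfg = focus_config(base, ["socket", "bind", "sendmsg"], ["net/core"])
    print(json.dumps(cfg, indent=4))
```

Narrowing both the system calls and the coverage feedback points the fuzzer at one subsystem instead of spreading its effort across the whole kernel.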

Conclusion

Having shared these ideas, I will conclude with a short story.

In 2021, I found the Linux kernel vulnerability CVE-2021-26708. While researching ways to exploit it, I needed a special heap-spraying primitive: a kernel object whose size and contents can be controlled by an attacker from user space. None of the publicly known exploit primitives were suitable. After a long, tedious reading of the kernel source code, I decided to delegate this task to the computer and use fuzzing to find the required object. This is how I invented heap spraying using the msg_msg kernel object, which then became very popular in the research community.

So fuzzing is a great tool that can be useful to a security researcher for more than just finding vulnerabilities. However, fuzzing requires the researcher to be willing to risk their time and the computing power of their servers.

Thank you for your attention!
