Automating fuzz testing, using ClusterFuzz as an example

This is the second part of a series of articles on fuzz testing by the Secure Development team at the UCSB Cybersecurity Center. We have practical experience in testing programs and are ready to share it.

The first part of the series of articles is available here.

Introduction

In the previous part, we introduced the concept of fuzzing, compared approaches to this type of testing, and talked about the requirements of Russian legislation. In this article, we delve into the technical side of fuzz testing, introduce some specialized terms, and talk about fuzzing farms.

Fuzz testing is already an integral part of application security. With the growing threat of cyberattacks and the constant emergence of new vulnerabilities, it has become an important element of an application security strategy.

As modern applications grow more complex and their security requirements increase, effective testing becomes more urgent. Fuzzing automates vulnerability discovery, achieves high code coverage, and finds hidden errors that traditional testing methods might miss. Combined with methods such as SAST and DAST, it helps identify various classes of vulnerabilities.

How the fuzzer works

Fuzzers are fuzz testing tools that automatically generate large amounts of input data to detect vulnerabilities in software. Here is a brief description of some types of fuzzers:

1. Generative fuzzers (grammar-based)

  • principle of operation: they generate test cases from grammar rules that describe the input format, so the inputs are more realistic;

  • advantages: the ability to generate any state of a protocol or file format increases code coverage and the likelihood of detecting complex errors;

  • disadvantages: the “grammar” must be written in advance, which can take a long time for unknown protocols or file formats.

2. Mutation fuzzers

  • principle of operation: existing test inputs are modified to produce new variants;

  • advantages: quick to set up;

  • disadvantages: may miss more complex vulnerabilities that require significant changes to the data, and in practice will not reproduce every protocol or file format state.
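To make the generative approach more concrete, here is a toy grammar-based generator in Python. It is only a sketch: the `GRAMMAR` table, the message format it describes, and the `generate` function are assumptions invented for this illustration and do not come from any real fuzzer.

```python
import random

# Hypothetical grammar for a toy "key=digits;key=digits" message format.
GRAMMAR = {
    "<message>": [["<pair>"], ["<pair>", ";", "<message>"]],
    "<pair>": [["<key>", "=", "<value>"]],
    "<key>": [["id"], ["name"], ["len"]],
    "<value>": [["<digit>"], ["<digit>", "<value>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(symbol="<message>", rng=None, depth=0, max_depth=8):
    """Expand `symbol` into a string by choosing random productions."""
    if rng is None:
        rng = random.Random()
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit it as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest production so that
        # the recursion is guaranteed to terminate.
        productions = [min(productions, key=len)]
    production = rng.choice(productions)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)


if __name__ == "__main__":
    rng = random.Random(1)
    for _ in range(3):
        print(generate(rng=rng))
```

Every generated string is valid by construction, which is what lets such inputs get past the parser and into deeper code. The price is the grammar itself, which has to be written by hand.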

Mutation fuzzing currently receives the most attention. AFL (American Fuzzy Lop) played a key role in the development of fuzz testing thanks to its efficient mechanism for mutating test data. AFL was also one of the first fuzzers to use code coverage as feedback when deciding which inputs to keep mutating.
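The idea of coverage-guided mutation can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not AFL's actual algorithm: `target` is a made-up program that reports which branches it reached, and `mutate` and `fuzz` are hypothetical helper names.

```python
import random


def target(data: bytes):
    """Toy program under test: returns (branches reached, crashed?)."""
    covered = {0}
    crashed = False
    if data[:1] == b"F":
        covered.add(1)
        if data[1:2] == b"U":
            covered.add(2)
            if data[2:3] == b"Z":
                covered.add(3)
                crashed = True  # the "bug" hides behind three checks
    return covered, crashed


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random mutation: overwrite, insert, or delete a byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(seeds, iterations=200_000, seed=0):
    """Mutate corpus entries; keep only inputs that reach new branches."""
    rng = random.Random(seed)
    corpus = list(seeds)
    seen = set()
    for entry in corpus:
        seen |= target(entry)[0]
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        covered, crashed = target(candidate)
        if crashed:
            return candidate
        if not covered <= seen:  # new coverage: worth mutating further
            seen |= covered
            corpus.append(candidate)
    return None


if __name__ == "__main__":
    print(fuzz([b"A"]))
```

Without the coverage feedback, the fuzzer would have to guess all three bytes at once. With it, each byte that unlocks a new branch is saved in the corpus and becomes the starting point for the next mutation.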

Each type of fuzzer has its own advantages and limitations, and the choice depends on the specific testing requirements and characteristics of the target application. Sometimes a combined approach using several types of fuzzers can be most effective for detecting a variety of vulnerabilities.

Program code instrumentation

To get feedback from the target about each test input, the target is instrumented. In the context of fuzzing, instrumentation collects information about which parts of the program were reached during testing, which in turn helps assess how effective the testing is.

Instrumentation adds extra instructions to the program, either in its source code or, more commonly, at compile time. These instructions collect information about program execution and track specific events.

No discussion of instrumentation for fuzzing is complete without sanitizers. Sanitizers are tools that help find errors in program code, from buffer overflows to data races. They instrument the code so that, for example, memory operations can be checked at runtime.
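As a rough analogy, the sketch below collects line coverage for a Python function using the standard `sys.settrace` hook. Compilers such as Clang insert counters into the compiled program instead. The helper `run_with_coverage` and the sample function `classify` are assumptions made up for this example.

```python
import sys


def run_with_coverage(func, *args):
    """Run func(*args); return (result, executed line offsets inside func)."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only trace frames that belong to the function of interest.
        if frame.f_code is code:
            if event == "line":
                executed.add(frame.f_lineno - code.co_firstlineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous hook
    return result, executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


if __name__ == "__main__":
    print(run_with_coverage(classify, -1))
    print(run_with_coverage(classify, 1))
```

The two calls execute different lines of `classify`. A fuzzer relies on exactly this kind of difference to decide whether an input reached new code.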

Thus, the use of sanitizers in the fuzzing process makes it possible to detect new software errors and collect additional information for their analysis.

Fuzzing farms

A fuzzing farm is a distributed system in which several instances of fuzzers perform vulnerability testing of applications in parallel. This approach brings several benefits:

  • Increasing code coverage

    A fuzzing farm covers more of the target. Running tests in parallel on different fuzzer instances generates more test cases and detects vulnerabilities that a single tool would miss.

  • Speeding up the testing process

    Running fuzzing tasks in parallel speeds up vulnerability detection. A fuzzing farm can significantly reduce the time needed for testing, which is especially important when development deadlines are tight.

  • Working with large amounts of data

    Fuzzing farms are capable of processing large volumes of test data, which increases the likelihood of detecting various classes of vulnerabilities. This is especially useful when testing large and complex applications.

  • Easy scaling

    Fuzzing farms are easily scalable, allowing you to add new fuzzer instances to handle additional tests or expand test coverage as needed.

  • Automation and collection of results

    Fuzzing farms provide automated testing and collection of results. This makes it easier to manage the process and analyze the data obtained.
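The core idea of a farm, independent workers whose results are merged centrally, can be modelled with a short Python sketch. Everything here is an illustrative assumption: `toy_target`, `worker`, and `run_farm` are invented names, and threads stand in for what would be separate processes or machines in a real farm.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_target(data: bytes) -> set:
    """Hypothetical target: reports which of 16 'branches' an input hits."""
    return {byte % 16 for byte in data}


def worker(seed: int, iterations: int = 50) -> set:
    """One fuzzer instance: random inputs, returns the coverage it reached."""
    rng = random.Random(seed)
    covered = set()
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(2))
        covered |= toy_target(data)
    return covered


def run_farm(num_workers: int = 4):
    """Run several workers with different seeds and merge their coverage."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_worker = list(pool.map(worker, range(num_workers)))
    merged = set().union(*per_worker)
    return per_worker, merged


if __name__ == "__main__":
    per_worker, merged = run_farm()
    print([len(c) for c in per_worker], len(merged))
```

The merged coverage is never smaller than that of any single worker, which is the "increasing code coverage" benefit in miniature. Collecting the results in one place corresponds to the automation item in the list.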

An example of a fuzzing farm: ClusterFuzz

ClusterFuzz is open-source software for automated fuzz testing developed by Google. Its main purpose is to improve software security and stability by detecting vulnerabilities so that they can be fixed.

ClusterFuzz has a number of advantages, including an automated fuzz testing process, scalability, integration with CI/CD, and analysis of results.

Google provides instructions for deploying and using this tool on the Google Cloud Platform, but does not provide instructions for deploying a fuzzing farm on a local distributed computing cluster. In this regard, we decided to develop instructions for local configuration of ClusterFuzz on a single node.

Installing ClusterFuzz

Cloning a repository

~$ git clone https://github.com/google/clusterfuzz.git

Installing dependencies

Installing golang

Install using the package manager:

$ sudo apt install golang

Installing dependencies with a script

~/clusterfuzz$ local/install_deps.bash

To explicitly specify the Python version, use an environment variable:

~/clusterfuzz$ PYTHON=python3.7 ./local/install_deps.bash

Functionality check

We activate the virtual environment with all dependencies:

~/clusterfuzz$ pipenv shell

Let's run the main script that controls ClusterFuzz:

~/clusterfuzz$ python3 butler.py --help

usage: butler.py [-h]

{bootstrap,py_unittest,js_unittest,format,lint,package,deploy,run_server,run,run_bot,remote,clean_indexes,create_config,integration_tests}

Butler is here to help you with command-line tasks.

positional arguments:

{bootstrap,py_unittest,js_unittest,format,lint,package,deploy,run_server,run,run_bot,remote,clean_indexes,create_config,integration_tests}

bootstrap Install all required dependencies for running an appengine, a bot,and a mapreduce locally.

py_unittest Run Python unit tests.

js_unittest Run Javascript unit tests.

format Format changed code in current branch.

lint Lint changed code in current branch.

package Package clusterfuzz with a staging revision

deploy Deploy to Appengine

run_server Run the local Clusterfuzz server.

run Run a one-off script against a datastore (eg migration).

run_bot Run a local clusterfuzz bot.

remote Run command-line tasks on a remote bot.

clean_indexes Clean up undefined indexes (in index.yaml).

create_config Create a new deployment config.

integration_tests Run end-to-end integration tests.

optional arguments:

-h, --help show this help message and exit

There are no errors, which means that all the necessary modules have been installed and loaded.

To be more confident, you can run tests:

~/clusterfuzz$ python3 butler.py py_unittest -t appengine

----------------------------------------------------------------------

Ran 609 tests in 72.576s

OK (skipped=9)

If you encounter a Python version error at this point, see the “Possible errors” section at the end of the article.

Server initialization

First server launch

~/clusterfuzz$ python3 butler.py run_server --bootstrap

The line `[INFO] Listening at: http://0.0.0.0:9000 (5397)` indicates that the server has started successfully. From this point on, the server's web interface is available at that address.

The server uses other ports in addition to port 9000 (9004, 9008, 9009), so make sure that they are not occupied by other services and are not blocked by the firewall, or change them to others in the file clusterfuzz/src/local/butler/constants.py
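A quick way to check whether these ports are free before starting the server is to try binding to them. The sketch below does this with Python's standard `socket` module. The port list is taken from the text above, while the names `CLUSTERFUZZ_PORTS` and `port_is_free` are made up for this example.

```python
import socket

# Ports mentioned in the text above.
CLUSTERFUZZ_PORTS = (9000, 9004, 9008, 9009)


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    for port in CLUSTERFUZZ_PORTS:
        state = "free" if port_is_free(port) else "in use"
        print(f"{port}: {state}")
```

This only tells you whether the port is occupied on the local machine. Firewall rules still need to be checked separately.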

Running a local fuzzing bot

Activate the virtual environment:

$ pipenv shell

Let's launch the bot, specifying the directory where it is located:

~/clusterfuzz$ python3 butler.py run_bot ../bot

The bot logs its activity in the bot.log file:

~/bot/clusterfuzz/bot/logs$ tail bot.log

2024-02-07 08:29:08,860 - run_bot - INFO - Using local source, skipping source code update.

2024-02-07 08:29:08,860 - run_bot - INFO - Running platform initialization scripts.

2024-02-07 08:29:09,366 - run_bot - INFO - Completed running platform initialization scripts.

2024-02-07 08:29:09,678 - run_bot - ERROR - Failed to get any fuzzing tasks. This should not happen.

NoneType: None

The error on the last line occurs because the bot does not have tasks assigned to it.

At this point, the deployment of the local ClusterFuzz instance is complete.

Preparing a fuzzing target

Requirements

libFuzzer and AFL rely on instrumentation provided by the Clang compiler. Clang 6.0 or later is required.

Install the compiler via apt:

~$ sudo apt install clang

~$ clang --version

clang version 10.0.0-4ubuntu1

Target: x86_64-pc-linux-gnu

Thread model: posix

InstalledDir: /usr/bin

Source

We'll show how to detect the Heartbleed vulnerability using libFuzzer and ClusterFuzz.

Heartbleed (CVE-2014-0160) is a vulnerability in OpenSSL 1.0.1 that was discovered in 2014. It is caused by insufficient input validation in the handling of TLS heartbeat messages, which lets an attacker read sections of the server’s RAM, including confidential data (keys, passwords, etc.).
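The nature of the bug can be shown with a deliberately simplified Python model. It is not OpenSSL code: the memory layout in `build_memory`, the secret string, and both handler functions are assumptions invented for the illustration. The vulnerable handler trusts the length claimed in the request instead of the real payload size.

```python
def build_memory(payload: bytes) -> bytes:
    """Hypothetical process memory: the request payload followed by secrets."""
    return payload + b"|secret_key=hunter2|"


def heartbeat_vulnerable(payload: bytes, claimed_len: int) -> bytes:
    """Echo `claimed_len` bytes, trusting the attacker-controlled length."""
    memory = build_memory(payload)
    return memory[:claimed_len]  # may run past the end of the payload


def heartbeat_fixed(payload: bytes, claimed_len: int) -> bytes:
    """Discard requests whose claimed length exceeds the real payload."""
    if claimed_len > len(payload):
        return b""
    return payload[:claimed_len]


if __name__ == "__main__":
    print(heartbeat_vulnerable(b"hi", 2))   # honest request
    print(heartbeat_vulnerable(b"hi", 20))  # over-read leaks adjacent data
    print(heartbeat_fixed(b"hi", 20))       # malformed request is dropped
```

In real C code the over-read does not raise an error by itself, which is why the target is built with AddressSanitizer: the sanitizer turns the silent out-of-bounds read into a reported heap-buffer-overflow.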

First, let's download the vulnerable version of OpenSSL:

~$ curl -O https://www.openssl.org/source/old/1.0.1/openssl-1.0.1f.tar.gz && tar xf openssl-1.0.1f.tar.gz

Configuring the build:

~$ cd openssl-1.0.1f/ && ./config

Building a vulnerable version of OpenSSL:

~/openssl-1.0.1f$ make CC="clang -g -fsanitize=address,fuzzer-no-link"

Download the fuzzing target and the prepared certificate:

$ curl -O https://raw.githubusercontent.com/google/clusterfuzz/master/docs/setting-up-fuzzing/heartbleed/handshake-fuzzer.cc

$ curl -O https://raw.githubusercontent.com/google/clusterfuzz/master/docs/setting-up-fuzzing/heartbleed/server.key

$ curl -O https://raw.githubusercontent.com/google/clusterfuzz/master/docs/setting-up-fuzzing/heartbleed/server.pem

Building the fuzzing target

Let's build the instrumented executable under test with clang++:

~$ clang++ -g handshake-fuzzer.cc -fsanitize=address,fuzzer openssl-1.0.1f/libssl.a openssl-1.0.1f/libcrypto.a -std=c++17 -Iopenssl-1.0.1f/include/ -lstdc++fs -ldl -lstdc++ -o handshake-fuzzer

Let's add the compiled target and certificate to the archive for uploading to the server:

~$ zip openssl-fuzzer-build.zip handshake-fuzzer server.key server.pem
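Before uploading, it can be worth confirming that the archive really contains all three files. The helper below is a small sketch using Python's standard `zipfile` module. The names `REQUIRED` and `missing_from_archive` are made up for this example, and the file names are the ones used in the `zip` command above.

```python
import zipfile

# Files the zip command above is expected to have packed.
REQUIRED = ("handshake-fuzzer", "server.key", "server.pem")


def missing_from_archive(archive_path, required=REQUIRED):
    """Return the required file names that are absent from the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        names = set(archive.namelist())
    return [name for name in required if name not in names]


if __name__ == "__main__":
    import sys
    path = sys.argv[1] if len(sys.argv) > 1 else "openssl-fuzzer-build.zip"
    missing = missing_from_archive(path)
    print("archive OK" if not missing else f"missing: {missing}")
```

An empty result means the archive is complete.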

Fuzzing

Create a task

To create a task, you need to open the page at http://:9000/jobs and find the form called “ADD NEW JOB”.

Let's fill in the required fields:

  • Name: libfuzzer_asan_linux_openssl

  • Platform: Linux

  • Select/modify fuzzers: libFuzzer

  • Description: heartbleed example

  • Templates: engine_asan and libfuzzer

  • Custom Build: upload the zip archive generated in the previous section

  • Environment: CORPUS_PRUNE=True (to remove inputs that do not increase code coverage).

After clicking the ADD button, the target will be loaded into the data storage, and the server will assign the fuzzing task to the bot.

If at this stage the error “Failed to upload” occurs, please refer to the “Possible errors” section at the end of the article.

In the bot logs you can observe the fuzzing testing process:

~/bot/clusterfuzz/bot/logs$ tail -f bot.log

Some time later, a stack trace and the line AddressSanitizer: heap-buffer-overflow will appear in the log files.
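Instead of watching the log by eye, you can scan it for sanitizer reports. The following is a minimal sketch: the regular expression matches the `AddressSanitizer: <bug type>` text mentioned above, `find_asan_reports` is a hypothetical helper name, and the sample lines are invented for the demonstration rather than copied from a real bot.log.

```python
import re

# Matches the "AddressSanitizer: <bug-type>" text of a sanitizer report.
ASAN_PATTERN = re.compile(r"AddressSanitizer: ([\w-]+)")


def find_asan_reports(lines):
    """Yield (line_number, bug_type) for each AddressSanitizer report line."""
    for number, line in enumerate(lines, start=1):
        match = ASAN_PATTERN.search(line)
        if match:
            yield number, match.group(1)


if __name__ == "__main__":
    # Invented sample lines; in practice, pass an open bot.log file instead.
    sample = [
        "INFO - fuzzing in progress",
        "==1234==ERROR: AddressSanitizer: heap-buffer-overflow on address ...",
    ]
    for number, bug_type in find_asan_reports(sample):
        print(f"line {number}: {bug_type}")
```

Because the function accepts any iterable of lines, it works equally well on an open file object, for example `find_asan_reports(open("bot.log"))`.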

After a while, information about the found defect will appear on the main page of the server (/testcases): Heap-buffer-overflow READ{}.

Possible errors

Downgrading to Python 3.7

ClusterFuzz requires Python 3.7. Ubuntu 20.04 ships with Python 3.8 by default, so if you get Python version errors, you need to install the earlier version. You can do this by building Python from source, installing it from the deadsnakes PPA, or using the asdf version manager. We describe the last option.

Installing asdf

Clone the repository:

~$ git clone https://github.com/asdf-vm/asdf.git ~/.asdf --branch v0.14.0 # 0.14.0 was the latest version at the time of writing

Install asdf:

~$ echo '. "$HOME/.asdf/asdf.sh"' >> .bashrc

~$ echo '. "$HOME/.asdf/completions/asdf.bash"' >> .bashrc

Restart the shell and check that everything is installed:

~$ asdf --version

v0.14.0-ccdd47d

Installing Python 3.7

Install the Python plugin:

~$ asdf plugin-add python

Install the packages required to build Python:

~$ sudo apt-get install curl gcc libbz2-dev libev-dev libffi-dev libgdbm-dev liblzma-dev libncurses-dev libreadline-dev libsqlite3-dev libssl-dev make tk-dev wget zlib1g-dev

Install Python 3.7.4:

~$ asdf install python 3.7.4

Activate Python 3.7.4:

~$ asdf global python 3.7.4

Checking the version:

~$ python3 --version

Python 3.7.4

Failed to upload. (user@localhost)

This error may occur when trying to create a fuzzing job.

All objects must be uploaded to Google Cloud Storage, so when ClusterFuzz runs locally the storage is emulated. The emulator is implemented in local/emulators/gcs.go and listens on localhost:9008 by default. To solve the problem, change the listening address to 0.0.0.0:9008:

~/clusterfuzz$ sed -i 's/localhost:%d/0.0.0.0:%d/g' local/emulators/gcs.go

Then point the server at the new address of the emulated Google Cloud Storage:

~$ SERVER_IP= # e.g. 192.168.1.2

~/clusterfuzz$ sed -i 's|http://localhost|http://'"$SERVER_IP"'|g' src/local/butler/constants.py

Useful discussions

Google doesn't provide instructions for deploying ClusterFuzz on a distributed local computing cluster, but the project's official GitHub repository has several discussions on the topic where users share their ideas.

Conclusion

ClusterFuzz is a powerful tool for automating fuzz testing and detecting software vulnerabilities efficiently. In this article, we walked through deploying ClusterFuzz locally on a single compute node and used it to find a real vulnerability in OpenSSL.

Fuzz testing is becoming an integral part of application security strategy. It finds vulnerabilities that traditional testing methods might miss, and ClusterFuzz makes the process more efficient and convenient.

Author: Roman Kornilov, analyst in the Secure Development team at the UCSB

