Fuzzing libraries
Not long ago, while I was learning web hacking, I became interested in researching binary vulnerabilities on Linux and Windows. I think web hacking is the only way for a lone hacker in Russia to make money legally, but I still want to study every interesting aspect of both attack and defense. Who knows, maybe someday I will be on a red team. For now I’m just gnawing on the granite of science.
After thinking about the problem for a while, I worked out what I needed to solve it. I don’t know how others fuzz libraries that come without source code, but I came up with one approach. Below are two examples, one for Linux and one for Windows.
Linux
First of all, I started developing a template for Linux. I had to identify all the issues I would face. They made up the following list:
The library has no source code
Which assembler language should I use to write code?
How to call functions from a dynamic library
Yes, the third item seems like a very stupid question. But let me explain why I thought about it. I didn’t know how a dynamic library is linked to an assembly program. It’s straightforward to use dlopen and dlsym in a C program, but here we need something that lets us use C++ classes. I had never gone that deep into the assembly jungle before.
I chose the NASM assembler. I like it more than FASM, although I have used FASM before. NASM is cross-platform and, as you will see later, it is also suitable for Windows development.
I wrote the library under test myself, carelessly on purpose, as a stand-in target. I’m not showing its implementation, only the header.
#ifndef TE_H
#define TE_H
#include <cstdio>
#include <string>
class Handler {
public:
Handler ();
private:
FILE *fp = {nullptr};
};
class V8 {
public:
V8 ();
int parse_string (Handler& handle, std::string& code);
};
Handler *create_handler ();
V8 *create_js ();
#endif
We need to pass lines of code to V8::parse_string and see whether the result is correct, incorrect, or a segfault.
Here is the header of the fuzzer library as well.
#ifndef GETTER_H
#define GETTER_H
#include <string>
std::string *getter_string ();
#endif
Here the library returns a pointer to a std::string on each call. Passing a plain pointer was more convenient: the caller has nothing to manage except an 8-byte pointer value stored in memory.
The next step was to build the libraries and look up the names of the exported functions with radare2. Here they are:
extern _Z14create_handlerv
extern _Z9create_jsv
extern _Z13getter_stringB5cxx11v
extern _ZN2V812parse_stringER7HandlerRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
The mangled symbols obscure the declarations, and only the function names embedded in them made it clear that these were exactly what I was looking for.
All that remains is to write a program that fetches the next line and passes it to the class in the other library.
section .text
extern _Z14create_handlerv
extern _Z9create_jsv
extern _Z13getter_stringB5cxx11v
extern _ZN2V812parse_stringER7HandlerRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
global main
main:
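sub rsp, 8 + 8 + 8          ; three 8-byte slots; also realigns rsp to 16 bytes
call _Z13getter_stringB5cxx11v
mov [rsp + 16], rax         ; std::string*
call _Z14create_handlerv
mov [rsp + 0], rax          ; Handler*
call _Z9create_jsv
mov [rsp + 8], rax          ; V8*
mov rdi, [rsp + 8]          ; this
mov rsi, [rsp + 0]          ; Handler& is passed as the object's address
mov rdx, [rsp + 16]         ; std::string& is passed as the object's address
call _ZN2V812parse_stringER7HandlerRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
mov rax, 60                 ; sys_exit
mov rdi, 0                  ; the exit status goes in rdi, not rbx
syscall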
Assembly is my favorite language. I love it because it doesn’t make us check data types. In strict C++, it would be very hard to reconstruct a class well enough to use it with someone else’s library. An assembly program gives us an advantage here: if a C++ class in the library takes up 120 bytes, we either allocate 120 bytes on the stack or keep 8 bytes to store a pointer to it.
All that remains is to build everything. The Makefile looks like this.
all:
	nasm -felf64 main.asm -o main.o
	gcc main.o -Wl,-rpath=libs -Llibs -lte -lgetter -o test
clean:
	rm -f main.o test
As a result, we get a program that can receive data from one library and send it to another.
Windows
Windows turned out to be a little more complicated. I spent two hours working out how to do it. To link an assembly program against a DLL, the DLL needs a companion import library (a .lib file). As I understand it, the linker uses it to learn which functions the DLL exports and to record that information in our program.
I won’t show the DLL headers, since there is nothing unusual in them: the classes are declared the usual Windows way, with __declspec(dllexport). We build the DLL as usual and put it in the folder with the fuzzer. For the fuzzing library you can simply copy the .lib file produced by the build, but the DLL we want to find a bug in may come without source code, so for it we need a few extra steps.
First of all, we use dumpbin.
dumpbin /nologo /exports Dllcrackme.dll > Dllcrackme.def
This file shows the decorated function names that can be used from assembly. From everything in it, I kept only the exported functions I needed.
EXPORTS
??0Code@@QEAA@XZ = ??0Code@@QEAA@XZ @1
?check_code@Code@@QEAAHAEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z = ?check_code@Code@@QEAAHAEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z @2
?print@Code@@QEAAXXZ = ?print@Code@@QEAAXXZ @3
In the raw dumpbin output, the place where I put @1 held the undecorated C++ signature, something like (public: __cdecl Code::Code(void)).
Next, run the lib tool with this command line.
lib /nologo /def:Dllcrackme.def /MACHINE:x64 /out:Dllcrackme.lib
lib reported an error when the undecorated function name was left in place instead of @1; replacing it with @1 solved the problem. The number after @ is the export ordinal.
The output is an import library that is used to link the assembly program against the DLL. Only linking happens at this stage; the DLL itself is loaded every time the program starts.
The build commands turned out like this.
nasm -f win64 main.asm -o main.o
link main.o Dllcrackme.lib /entry:main /out:fuzzer.exe
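And the assembly program was like this.
section .text
global main
extern ?print@Code@@QEAAXXZ
main:
sub rsp, 40                 ; 32 bytes of shadow space + 8 to realign rsp to 16
; print is a member function, so rcx should point to a constructed Code object (this); that setup is omitted here
call ?print@Code@@QEAAXXZ
add rsp, 40
ret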
There is not much code here, but it shows that the approach works, and you can build on it from there.
In this article I showed how you can call into libraries in order to fuzz them, not how to write a fuzzer. I hope it opens some people’s eyes to how this can be done. If so, I’m glad I shared my knowledge and helped someone on the way to becoming a good expert. After all, someone will go on to invent something outstanding and share their knowledge too. I don’t know how useful my knowledge is, but I’m sure there are many researchers like me in the world, looking for answers to eternal questions that are almost never written about on the Internet.