Comparing Sanitizers and Valgrind

This article compares two tools for finding memory errors in programs written in memory-unsafe languages: Sanitizers and Valgrind. The two tools work in completely different ways. So while Sanitizers (developed by Google engineers) have a number of advantages over Valgrind, each tool has its own strengths and weaknesses. Note that the Sanitizers project has a plural name because it consists of several tools, which this article considers together.

Unlike Java, Python, and similar memory-safe languages, languages with direct memory access such as C and C++ need special tools to identify memory-handling errors. In a memory-unsafe language, it is quite easy to mistakenly write data past the end of a memory buffer or read memory after it has been freed. Programs containing such errors may run perfectly fine most of the time and crash only in rare situations. Such errors are very difficult to catch by hand, which is why dedicated tools exist.

To begin with, Valgrind slows programs down much more than Sanitizers do. A program running under Valgrind can run 20 to 50 times slower than normal, which can be a serious bottleneck for computationally intensive programs. With Sanitizers the slowdown is usually 2 to 4 times. Unlike Valgrind, which is applied when the program is run, Sanitizers are enabled at compile time.

Brief instructions for Sanitizers

The list of recommendations below summarizes some of the information from this article and may be useful to readers already familiar with Sanitizers:

  • Compiler and linker options: -fsanitize=address -fsanitize=undefined -fno-sanitize-recover=all -fsanitize=float-divide-by-zero -fsanitize=float-cast-overflow -fno-sanitize=null -fno-sanitize=alignment

  • For LLDB/GDB, and to prevent very short stack traces and leak reports that are usually false:

$ export ASAN_OPTIONS=abort_on_error=1:fast_unwind_on_malloc=0:detect_leaks=0 UBSAN_OPTIONS=print_stacktrace=1
$ export G_SLICE=always-malloc G_DEBUG=gc-friendly
  • Clang option to detect reads of uninitialized memory: -fsanitize=memory. This option cannot be combined with -fsanitize=address.

  • In rpmbuild .spec files you should additionally use: -Wp,-U_FORTIFY_SOURCE.

Performance Benefits of Sanitizers

Valgrind instruments the program dynamically at run time rather than statically at compile time, which introduces a large overhead that may be impractical for CPU-intensive applications. Sanitizers use static instrumentation at compile time, which allows similar checks to be performed with less overhead.
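To make the idea of compile-time instrumentation more concrete, here is a toy model of the kind of check that an instrumenting compiler places in front of a memory access. It is only a sketch under simplifying assumptions: the names ToyShadow, poison, unpoison, and is_accessible are hypothetical, and the real AddressSanitizer runtime maps addresses to shadow memory differently and also tracks partially addressable 8-byte granules, which this model ignores.

```cpp
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not the real AddressSanitizer runtime):
// one shadow flag per 8-byte granule of a small arena.
class ToyShadow {
public:
  static constexpr std::size_t kGranule = 8;

  // Every granule starts out poisoned (not addressable).
  explicit ToyShadow(std::size_t arena_bytes)
      : poisoned_((arena_bytes + kGranule - 1) / kGranule, true) {}

  // Mark [offset, offset + size) as valid, e.g. after an allocation.
  // For simplicity, offset and size are assumed to be multiples of kGranule.
  void unpoison(std::size_t offset, std::size_t size) {
    set(offset, size, false);
  }

  // Mark [offset, offset + size) as invalid, e.g. after free() or for a redzone.
  void poison(std::size_t offset, std::size_t size) {
    set(offset, size, true);
  }

  // The check that would be emitted before each load or store.
  bool is_accessible(std::size_t offset) const {
    std::size_t granule = offset / kGranule;
    return granule < poisoned_.size() && !poisoned_[granule];
  }

private:
  void set(std::size_t offset, std::size_t size, bool value) {
    for (std::size_t g = offset / kGranule;
         g < (offset + size) / kGranule && g < poisoned_.size(); ++g)
      poisoned_[g] = value;
  }

  std::vector<bool> poisoned_;
};
```

With a 16-byte block unpoisoned at offset 16, is_accessible(16 + 24) returns false, which mirrors the p[24] write in the buffer overflow example later in this article. Because such a check is compiled into the program, it costs only a few instructions per access.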

Table 1 provides a fairly detailed comparison of the capabilities of both tool sets and their impact on program execution time.

In my tests, Valgrind spends 23 seconds at startup reading system debugging information files. This overhead can be temporarily avoided by renaming the /usr/lib/debug directory. Just don't forget to rename it back! Otherwise, the system debugging information will appear to be missing and will also refuse to install (since it is already installed):

$ sudo mv /usr/lib/debug /usr/lib/debug-x; sleep 1h; sudo mv /usr/lib/debug-x /usr/lib/debug

The slowdown tests were compiled using the following command:

$ clang++ -g -O3 -march=native -ffast-math ...
clang-11.0.0-2.fc33.x86_64
valgrind-3.16.1-5.fc33.x86_64

The commands used the plain -fsanitize=address, -fsanitize=undefined, or -fsanitize=thread options, without any of the additional options suggested in the brief instructions for Sanitizers above.

This article uses the Clang compiler. GCC has the same support for Sanitizers, except that it lacks -fsanitize=memory (discussed in the section MSAN: Read uninitialized memory).

Installing debuginfo

The examples in this article assume that the *-debuginfo.rpm files are already installed. You can install them with the command dnf debuginfo-install packagename. On some versions of Red Hat Enterprise Linux (RHEL) you need to use yum instead of dnf. When you run a program in gdb, it tells you right away if any debuginfo RPMs are missing:

$ cat >vector.cpp <<'EOF'
// Your own program that you are debugging goes here;
// this std::vector code is only an example.
#include <vector>
int main() {
  std::vector<int> v;
}
EOF
$ clang++ -o vector vector.cpp -Wall -g
$ gdb ./vector
Reading symbols from ./vector...
(gdb) start
...
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64
...
Missing separate debuginfos, use: dnf debuginfo-install libgcc-10.2.1-9.fc33.x86_64 libstdc++-10.2.1-9.fc33.x86_64
(gdb) quit
$ sudo dnf debuginfo-install glibc-2.32-2.fc33.x86_64
...
$ sudo dnf debuginfo-install libgcc-10.2.1-9.fc33.x86_64 libstdc++-10.2.1-9.fc33.x86_64
...

Example situation: Buffer overflow

Below we see a completely normal program, except for the presence of a buffer overrun error. Although the error here seems obvious, in real programs such errors are much more hidden and difficult to find. C developers usually use Valgrind Memcheck, which can catch most of these errors. Running the program with Valgrind Memcheck shows:

$ cat >overrun.c <<'EOF'
#include <stdlib.h>
int main() {
  char *p = malloc(16);
  p[24] = 1; // buffer overrun, p has only 16 bytes
  free(p); // free(): invalid pointer
  return 0;
}
EOF
$ clang -o overrun overrun.c -Wall -g
$ valgrind ./overrun
...
==60988== Invalid write of size 1
==60988==    at 0x401154: main (overrun.c:4)
==60988==  Address 0x4a52058 is 8 bytes after a block of size 16 alloc'd
==60988==    at 0x4839809: malloc (vg_replace_malloc.c:307)
==60988==    by 0x401147: main (overrun.c:3)

Note: You can look at this code and its output in Compiler Explorer.

Since the program works fine without Valgrind, you might think that it contains no bug. After adding code that simulates a non-trivial program, however, the example crashes even when run normally. Note that the message free(): invalid pointer comes from glibc (the GNU C Library), not from the Sanitizers or Valgrind instrumentation:

$ cat >overrun.c <<'EOF'
#include <stdlib.h>
int main() {
  char *p = malloc(16);
  char *p2 = malloc(16);
  p[24] = 1; // buffer overrun, p has only 16 bytes
  free(p2); // free(): invalid pointer
  free(p);
  return 0;
}
EOF
$ clang -o overrun overrun.c -Wall -g
$ ./overrun
free(): invalid pointer
Aborted

Note: You can look at this code and its output in Compiler Explorer.

The Sanitizers counterpart of Valgrind for this bug is AddressSanitizer. The difference is that while Valgrind works with regular executables, AddressSanitizer requires the code to be recompiled; the result is then executed directly, without any additional tools:

$ clang -o overrun overrun.c -Wall -g -fsanitize=address
$ ./overrun
=================================================================
==61268==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000028 at pc 0x0000004011b8 bp 0x7fff37c8aa70 sp 0x7fff37c8aa68
WRITE of size 1 at 0x602000000028 thread T0
    #0 0x4011b7 in main overrun.c:4
    #1 0x7f4c94a2d1e1 in __libc_start_main ../csu/libc-start.c:314
    #2 0x4010ad in _start (overrun+0x4010ad)

0x602000000028 is located 8 bytes to the right of 16-byte region [0x602000000010,0x602000000020)
allocated by thread T0 here:
    #0 0x7f4c94c7b3cf in __interceptor_malloc (/lib64/libasan.so.6+0xab3cf)
    #1 0x401177 in main overrun.c:3
    #2 0x7f4c94a2d1e1 in __libc_start_main ../csu/libc-start.c:314


SUMMARY: AddressSanitizer: heap-buffer-overflow overrun.c:4 in main
...

Note: You can look at this code and its output in Compiler Explorer.
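Sanitizers find such a bug only on a run that actually executes the bad access. As a complementary sketch, in C++ code the same mistake can be turned into a deterministic, catchable error with bounds-checked container access. The helper name checked_write is hypothetical and is used here for illustration only.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if the write succeeded, false if index was out of range.
// std::vector::at() checks the index and throws std::out_of_range,
// unlike operator[] or raw pointer arithmetic.
bool checked_write(std::vector<char> &buf, std::size_t index, char value) {
  try {
    buf.at(index) = value;
    return true;
  } catch (const std::out_of_range &) {
    return false;
  }
}
```

For a 16-element vector, checked_write(buf, 24, 1) returns false instead of silently corrupting the heap as p[24] = 1 does in overrun.c.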

ASAN: Out-of-bounds access to stack variables

Because Valgrind does not require recompiling the program, it cannot detect some invalid memory accesses. One such error is accessing memory outside the range of automatic (local) variables and global variables (see the AddressSanitizer Stack Out of Bounds documentation).

Because Valgrind only gets involved while the program is running, it intercepts and tracks memory from malloc allocations. Unfortunately, allocating variables on the stack is an inherent part of the already compiled program and involves no calls to external functions such as malloc, so Valgrind cannot find out whether a stack memory access is valid. Sanitizers, on the other hand, instrument all of the code at compile time, when the compiler still knows which stack variable the program is trying to access and what the correct stack bounds of that variable are:

$ cat >stack.c <<'EOF'
int main(int argc, char **argv) {
  int a[100];
  return a[argc + 100];
}
EOF
$ clang -o stack stack.c -Wall -g -fsanitize=address
$ ./stack
==88682==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff54f500f4 at pc 0x0000004f4c51 bp 0x7fff54f4ff30 sp 0x7fff54f4ff28
READ of size 4 at 0x7fff54f500f4 thread T0
    #0 0x4f4c50 in main /tmp/stack.c:3:10
    #1 0x7f9983c7e1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #2 0x41c41d in _start (/tmp/stack+0x41c41d)

Address 0x7fff54f500f4 is located in stack of thread T0 at offset 436 in frame
    #0 0x4f4a9f in main /tmp/stack.c:1

  This frame has 1 object(s):
    [32, 432) 'a' (line 2) <== Memory access at offset 436 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /tmp/stack.c:3:10 in main
...
$ clang -o stack stack.c -Wall -g
$ valgrind ./stack
...
(nothing found by Valgrind)

Note: You can look at this code and its output in Compiler Explorer.

ASAN: Out-of-bounds access to global variables

As with stack variables, Valgrind cannot detect out-of-bounds access to a global variable, because it does not recompile the program (see the AddressSanitizer Global Out of Bounds documentation).

I have already described above why Valgrind cannot catch such errors. Here are the results from AddressSanitizer and Valgrind:

$ cat >global.c <<'EOF'
int a[100];
int main(int argc, char **argv) {
  return a[argc + 100];
}
EOF
$ clang -o global global.c -Wall -g -fsanitize=address
$ ./global
=================================================================
==88735==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000000dcee74 at pc 0x0000004f4b04 bp 0x7ffd5292b580 sp 0x7ffd5292b578
READ of size 4 at 0x000000dcee74 thread T0
    #0 0x4f4b03 in main /tmp/global.c:3:10
    #1 0x7fd416cda1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #2 0x41c41d in _start (/tmp/global+0x41c41d)

0x000000dcee74 is located 4 bytes to the right of global variable 'a' defined in 'global.c:1:5' (0xdcece0) of size 400
SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/global.c:3:10 in main
...
$ clang -o global global.c -Wall -g
$ valgrind ./global
...
(nothing found by Valgrind)

Note: You can see this code and its output in Compiler Explorer.

MSAN: Read uninitialized memory

AddressSanitizer does not detect reads of uninitialized memory. MemorySanitizer was developed for this purpose. It requires a separate compilation and run (see the MemorySanitizer documentation). Why AddressSanitizer was not designed to include the functionality of MemorySanitizer is unclear to me (and not just to me).

As a result of running MemorySanitizer we will see:

$ cat >uninit.c <<'EOF'
int main(int argc, char **argv) {
  int a[2];
  if (a[argc != 1])
    return 1;
  else
    return 0;
}
EOF
$ clang -o uninit uninit.c -Wall -g -fsanitize=address -fsanitize=memory
clang-11: error: invalid argument '-fsanitize=address' not allowed with '-fsanitize=memory'
$ clang -o uninit uninit.c -Wall -g -fsanitize=memory
$ ./uninit
==63929==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4985a9 in main /tmp/uninit.c:3:7
    #1 0x7f93e232c1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #2 0x41c39d in _start (/tmp/uninit+0x41c39d)
SUMMARY: MemorySanitizer: use-of-uninitialized-value /tmp/uninit.c:3:7 in main

It's easier to catch this error using Valgrind, which by default reports reading from uninitialized memory:

$ clang -o uninit uninit.c -Wall -g
$ valgrind ./uninit
...
==87991== Conditional jump or move depends on uninitialised value(s)
==87991==    at 0x401136: main (uninit.c:3)
...

Note: You can see this code and its output in Compiler Explorer.

ASAN: Stack usage after return

Detection of stack use after return has to be enabled at run time with ASAN_OPTIONS=detect_stack_use_after_return=1, since this feature imposes additional run-time overhead (see the AddressSanitizer Use After Return documentation). Below is an example of a program that runs without errors on its own or under Valgrind, but shows an error when run with AddressSanitizer:

$ cat >uar.cpp <<'EOF'
int *f() {
  int i = 42;
  int *p = &i;
  return p;
}
int g(int *p) {
  return *p;
}
int main() {
  return g(f());
}
EOF
$ clang++ -o uar uar.cpp -Wall -g -fsanitize=address
$ ./uar
(nothing found by default)
$ ASAN_OPTIONS=detect_stack_use_after_return=1 ./uar
=================================================================
==164341==ERROR: AddressSanitizer: stack-use-after-return on address 0x7fb71a561020 at pc 0x0000004f78e1 bp 0x7ffc299184c0 sp 0x7ffc299184b8
READ of size 4 at 0x7fb71a561020 thread T0
    #0 0x4f78e0 in g(int*) /home/lace/src/uar.cpp:7:10
    #1 0x4f790b in main /home/lace/src/uar.cpp:10:10
    #2 0x7fb71dbde1e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)
    #3 0x41c41d in _start (/home/lace/src/uar+0x41c41d)

Address 0x7fb71a561020 is located in stack of thread T0 at offset 32 in frame
    #0 0x4f771f in f() /home/lace/src/uar.cpp:1

  This frame has 1 object(s):
    [32, 36) 'i' (line 2) <== Memory access at offset 32 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-return /home/lace/src/uar.cpp:7:10 in g(int*)
...
$ clang++ -o uar uar.cpp -Wall -g
$ valgrind ./uar
...
(nothing found by Valgrind)

UBSAN: Undefined behavior

UndefinedBehaviorSanitizer protects the code from computations forbidden by the language standard (see the UndefinedBehaviorSanitizer documentation). For performance reasons, some undefined computations may go undetected at run time, but nothing can be guaranteed about a program in which they occur. Most often such numeric expressions simply compute an unexpected result. UndefinedBehaviorSanitizer can detect and report such operations.

UndefinedBehaviorSanitizer can be used together with the most commonly used sanitizer, AddressSanitizer:

$ cat >undefined.cpp <<'EOF'
int main(int argc, char **argv) {
  return 0x7fffffff + argc;
}
EOF
$ clang++ -o undefined undefined.cpp -Wall -g -fsanitize=undefined
$ export UBSAN_OPTIONS=print_stacktrace=1
$ ./undefined
undefined.cpp:2:21: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
    #0 0x429269 in main /tmp/undefined.cpp:2:21
    #1 0x7f1212a3e1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #2 0x40345d in _start (/tmp/undefined+0x40345d)


SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior undefined.cpp:2:21 in
$ valgrind ./undefined
...
(nothing found by Valgrind)

Note: You can look at this code and its output in Compiler Explorer.

Personally, I prefer to abort the program at the first such occurrence, because otherwise the error will be very hard to find. That is why I use -fno-sanitize-recover=all. I also prefer to extend UndefinedBehaviorSanitizer's coverage a little by enabling -fsanitize=float-divide-by-zero -fsanitize=float-cast-overflow.
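Once UndefinedBehaviorSanitizer has pointed at an overflowing addition such as the one in undefined.cpp, one possible fix is to check for overflow explicitly. The following sketch uses __builtin_add_overflow, a GCC and Clang builtin rather than standard C++; the wrapper name checked_add is hypothetical.

```cpp
// Adds a and b. On success stores the sum in *out and returns true;
// returns false if the sum does not fit in an int.
// __builtin_add_overflow is a GCC/Clang extension, not standard C++.
bool checked_add(int a, int b, int *out) {
  return !__builtin_add_overflow(a, b, out);
}
```

With this helper, checked_add(0x7fffffff, 1, &sum) returns false, so the caller can handle the failure instead of evaluating a signed overflow.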

LSAN: Memory leaks

LeakSanitizer reports allocated memory that was not freed before the program exited (see the LeakSanitizer documentation). Such behavior is not necessarily a bug. But freeing all allocated memory makes it easier, for example, to catch real, unintended memory leaks:

$ cat >leak.cpp <<'EOF'
#include <stdlib.h>
int main() {
  void *p = malloc(10);
  return p == nullptr;
}
EOF
$ clang++ -o leak leak.cpp -Wall -g -fsanitize=address
$ ./leak
=================================================================
==188539==ERROR: LeakSanitizer: detected memory leaks


Direct leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x4bfcdf in malloc (/tmp/leak+0x4bfcdf)
    #1 0x4f7728 in main /tmp/leak.cpp:3:13
    #2 0x7fd5a7a781e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)


SUMMARY: AddressSanitizer: 10 byte(s) leaked in 1 allocation(s).
$ clang++ -o leak leak.cpp -Wall -g
$ valgrind --leak-check=full ./leak
...
==188524== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==188524==    at 0x4839809: malloc (vg_replace_malloc.c:307)
==188524==    by 0x401148: main (leak.cpp:3)
...

Note: You can look at this code and its output in Compiler Explorer.
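In C++ code, one way to keep a program free of such leaks is to tie each malloc allocation to an owning object. The sketch below wraps malloc in a std::unique_ptr with a custom deleter; the names FreeDeleter, MallocPtr, and make_malloc_ptr are hypothetical and only illustrate the idea.

```cpp
#include <cstdlib>
#include <memory>

// Deleter that releases memory obtained from malloc().
struct FreeDeleter {
  void operator()(void *p) const { std::free(p); }
};

using MallocPtr = std::unique_ptr<void, FreeDeleter>;

// Allocates size bytes. The block is freed automatically when the
// returned pointer goes out of scope. The result may be null if
// malloc() fails.
MallocPtr make_malloc_ptr(std::size_t size) {
  return MallocPtr(std::malloc(size));
}
```

Replacing void *p = malloc(10); in leak.cpp with MallocPtr p = make_malloc_ptr(10); frees the block when main returns, so neither LeakSanitizer nor Valgrind should have anything to report for it.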

LSAN: Memory leaks when using certain libraries (glib2)

Some frameworks have their own memory allocators, which prevent LeakSanitizer from doing its job. The following example uses one such framework, glib2 (not glibc). Other libraries may have other run-time or compile-time options. Here is what the results from LeakSanitizer and Valgrind look like:

$ cat >gc.c <<'EOF'
#include <glib.h>
int main(void) {
    GHashTable *ht = g_hash_table_new(g_str_hash, g_str_equal);
    g_hash_table_insert(ht, "foo", "bar");
//    g_hash_table_destroy(ht); // memory leak in glib2
    g_malloc(100); // direct memory leak
    return 0;
}
EOF
$ clang -o gc gc.c -Wall -g $(pkg-config --cflags --libs glib-2.0) -fsanitize=address
$ ./gc
=================================================================
==233215==ERROR: LeakSanitizer: detected memory leaks


Direct leak of 100 byte(s) in 1 object(s) allocated from:
    #0 0x4bfd2f in malloc (/tmp/gc+0x4bfd2f)
    #1 0x7f1fcf12b908 in g_malloc (/lib64/libglib-2.0.so.0+0x5b908)
    #2 0x7f1fced961e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)


SUMMARY: AddressSanitizer: 100 byte(s) leaked in 1 allocation(s).
$ clang -o gc gc.c -Wall -g $(pkg-config --cflags --libs glib-2.0)
$ valgrind --leak-check=full ./gc
...
==233250== 100 bytes in 1 blocks are definitely lost in loss record 8 of 11
==233250==    at 0x4839809: malloc (vg_replace_malloc.c:307)
==233250==    by 0x48DF908: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.3)
==233250==    by 0x4011C5: main (gc.c:6)
==233250==
==233250== 256 (96 direct, 160 indirect) bytes in 1 blocks are definitely lost in loss record 9 of 11
==233250==    at 0x4839809: malloc (vg_replace_malloc.c:307)
==233250==    by 0x48DF908: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.3)
==233250==    by 0x48F71C1: g_slice_alloc (in /usr/lib64/libglib-2.0.so.0.6600.3)
==233250==    by 0x48C5A51: g_hash_table_new_full (in /usr/lib64/libglib-2.0.so.0.6600.3)
==233250==    by 0x401197: main (gc.c:3)
...

LeakSanitizer does not report the hash table leak, while Valgrind does. This happens because glib2 specifically detects Valgrind and, when running under it, turns off its own memory allocator (g_slice). However, glib2 can be made debuggable even with LeakSanitizer:

$ clang -o gc gc.c -Wall -g $(pkg-config --cflags --libs glib-2.0) -fsanitize=address
# otherwise the backtraces would have only 2 entries:
$ export ASAN_OPTIONS=fast_unwind_on_malloc=0
# Show all glib2 memory leaks:
$ export G_SLICE=always-malloc G_DEBUG=gc-friendly
$ ./gc
=================================================================
==233921==ERROR: LeakSanitizer: detected memory leaks


Direct leak of 100 byte(s) in 1 object(s) allocated from:
    #0 0x4bfd2f in malloc (/tmp/gc+0x4bfd2f)
    #1 0x7f2a7c302908 in g_malloc ../glib/gmem.c:106:13
    #2 0x4f4b35 in main /tmp/gc.c:6:5
    #3 0x7f2a7bf6d1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #4 0x41c46d in _start (/tmp/gc+0x41c46d)


Direct leak of 96 byte(s) in 1 object(s) allocated from:
    #0 0x4bfd2f in malloc (/tmp/gc+0x4bfd2f)
    #1 0x7f2a7c302908 in g_malloc ../glib/gmem.c:106:13
    #2 0x7f2a7c31a1c1 in g_slice_alloc ../glib/gslice.c:1069:11
    #3 0x7f2a7c2e8a51 in g_hash_table_new_full ../glib/ghash.c:1072:16
    #4 0x4f4b07 in main /tmp/gc.c:3:22
    #5 0x7f2a7bf6d1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #6 0x41c46d in _start (/tmp/gc+0x41c46d)


Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4bfd2f in malloc (/tmp/gc+0x4bfd2f)
    #1 0x7f2a7c302908 in g_malloc ../glib/gmem.c:106:13
    #2 0x7f2a7c317ce1  ../glib/gstrfuncs.c:392:17
    #3 0x7f2a7c317ce1 in g_memdup ../glib/gstrfuncs.c:385:1
    #4 0x7f2a7c2e8b65 in g_hash_table_ensure_keyval_fits ../glib/ghash.c:974:36
    #5 0x7f2a7c2e8b65 in g_hash_table_insert_node ../glib/ghash.c:1327:3
    #6 0x7f2a7c2e930f in g_hash_table_insert_internal ../glib/ghash.c:1601:10
    #7 0x7f2a7c2e930f in g_hash_table_insert ../glib/ghash.c:1630:10
    #8 0x4f4b28 in main /tmp/gc.c:4:5
    #9 0x7f2a7bf6d1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #10 0x41c46d in _start (/tmp/gc+0x41c46d)


Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4bfed7 in calloc (/tmp/gc+0x4bfed7)
    #1 0x7f2a7c302e20 in g_malloc0 ../glib/gmem.c:136:13
    #2 0x7f2a7c2e50ef in g_hash_table_setup_storage ../glib/ghash.c:592:24
    #3 0x7f2a7c2e8a90 in g_hash_table_new_full ../glib/ghash.c:1084:3
    #4 0x4f4b07 in main /tmp/gc.c:3:22
    #5 0x7f2a7bf6d1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #6 0x41c46d in _start (/tmp/gc+0x41c46d)


Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4c0098 in realloc (/tmp/gc+0x4c0098)
    #1 0x7f2a7c302f5f in g_realloc ../glib/gmem.c:171:16
    #2 0x7f2a7c2e50da in g_hash_table_realloc_key_or_value_array ../glib/ghash.c:380:10
    #3 0x7f2a7c2e50da in g_hash_table_setup_storage ../glib/ghash.c:590:24
    #4 0x7f2a7c2e8a90 in g_hash_table_new_full ../glib/ghash.c:1084:3
    #5 0x4f4b07 in main /tmp/gc.c:3:22
    #6 0x7f2a7bf6d1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #7 0x41c46d in _start (/tmp/gc+0x41c46d)


SUMMARY: AddressSanitizer: 292 byte(s) leaked in 5 allocation(s).

TSAN: Data races

ThreadSanitizer reports data races, where multiple threads access data without any protection against concurrent access (see the ThreadSanitizer documentation). Here is an example:

$ cat >tiny.cpp <<'EOF'
#include <thread>

static volatile bool flip1{false};
static volatile bool flip2{false};

int main() {
  std::thread t([&]() {
    while (!flip1);
    flip2 = true;
  });
  flip1 = true;
  while (!flip2);
  t.join();
}
EOF
$ clang++ -o tiny tiny.cpp -Wall -g -pthread -fsanitize=thread
$ ./tiny
==================
WARNING: ThreadSanitizer: data race (pid=4057433)
  Write of size 1 at 0x000000fb4b09 by thread T1:
    #0 main::$_0::operator()() const /tmp/tiny.cpp:9:11 (tiny+0x4cfc98)
    #1 void std::__invoke_impl(std::__invoke_other, main::$_0&&) /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60:14 (tiny+0x4cfc30)
    #2 std::__invoke_result::type std::__invoke(main::$_0&&) /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:95:14 (tiny+0x4cfb40)
    #3 void std::thread::_Invoker >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/thread:264:13 (tiny+0x4cfae8)
    #4 std::thread::_Invoker >::operator()() /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/thread:271:11 (tiny+0x4cfa88)
    #5 std::thread::_State_impl > >::_M_run() /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/thread:215:13 (tiny+0x4cf97f)
    #6 execute_native_thread_routine ../../../../../libstdc++-v3/src/c++11/thread.cc:80:18 (libstdc++.so.6+0xd65f3)

  Previous read of size 1 at 0x000000fb4b09 by main thread:
    #0 main /tmp/tiny.cpp:12:11 (tiny+0x4cf51f)

  Location is global 'flip2' of size 1 at 0x000000fb4b09 (tiny+0x000000fb4b09)

  Thread T1 (tid=4057435, running) created by main thread at:
    #0 pthread_create (tiny+0x488b7d)
    #1 <null> /usr/src/debug/gcc-10.2.1-9.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/gthr-default.h:663:35 (libstdc++.so.6+0xd6898)
    #2 std::thread::_M_start_thread(std::unique_ptr >, void ()) ../../../../../libstdc++-v3/src/c++11/thread.cc:135:37 (libstdc++.so.6+0xd6898)
    #3 main /tmp/tiny.cpp:7:15 (tiny+0x4cf4f4)

SUMMARY: ThreadSanitizer: data race /tmp/tiny.cpp:9:11 in main::$_0::operator()() const
==================
ThreadSanitizer: reported 1 warnings
$ clang++ -o tiny tiny.cpp -Wall -g -pthread
$ valgrind --tool=helgrind ./tiny
...
==4057510== ----------------------------------------------------------------
==4057510==
==4057510== Possible data race during write of size 1 at 0x40406D by thread #1
==4057510== Locks held: none
==4057510==    at 0x4011DC: main (tiny.cpp:11)
==4057510==
==4057510== This conflicts with a previous read of size 1 by thread #2
==4057510== Locks held: none
==4057510==    at 0x4015F8: main::$_0::operator()() const (tiny.cpp:8)
==4057510==    by 0x4015DC: void std::__invoke_impl(std::__invoke_other, main::$_0&&) (invoke.h:60)
==4057510==    by 0x40156C: std::__invoke_result::type std::__invoke(main::$_0&&) (invoke.h:95)
==4057510==    by 0x401544: void std::thread::_Invoker >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (thread:264)
==4057510==    by 0x401514: std::thread::_Invoker >::operator()() (thread:271)
==4057510==    by 0x40148D: std::thread::_State_impl > >::_M_run() (thread:215)
==4057510==    by 0x49575F3: execute_native_thread_routine (thread.cc:80)
==4057510==    by 0x4840737: mythread_wrapper (hg_intercepts.c:387)
==4057510==  Address 0x40406d is 0 bytes inside data symbol "_ZL5flip1"
==4057510==
==4057510== ----------------------------------------------------------------
==4057510==
==4057510== Possible data race during read of size 1 at 0x40406D by thread #2
==4057510== Locks held: none
==4057510==    at 0x4015F8: main::$_0::operator()() const (tiny.cpp:8)
==4057510==    by 0x4015DC: void std::__invoke_impl(std::__invoke_other, main::$_0&&) (invoke.h:60)
==4057510==    by 0x40156C: std::__invoke_result::type std::__invoke(main::$_0&&) (invoke.h:95)
==4057510==    by 0x401544: void std::thread::_Invoker >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (thread:264)
==4057510==    by 0x401514: std::thread::_Invoker >::operator()() (thread:271)
==4057510==    by 0x40148D: std::thread::_State_impl > >::_M_run() (thread:215)
==4057510==    by 0x49575F3: execute_native_thread_routine (thread.cc:80)
==4057510==    by 0x4840737: mythread_wrapper (hg_intercepts.c:387)
==4057510==    by 0x4BD33F8: start_thread (pthread_create.c:463)
==4057510==    by 0x4CED902: clone (clone.S:95)
==4057510==
==4057510== This conflicts with a previous write of size 1 by thread #1
==4057510== Locks held: none
==4057510==    at 0x4011DC: main (tiny.cpp:11)
==4057510==  Address 0x40406d is 0 bytes inside data symbol "_ZL5flip1"
==4057510==
==4057510== ----------------------------------------------------------------
==4057510==
==4057510== Possible data race during write of size 1 at 0x40406E by thread #2
==4057510== Locks held: none
==4057510==    at 0x401613: main::$_0::operator()() const (tiny.cpp:9)
==4057510==    by 0x4015DC: void std::__invoke_impl(std::__invoke_other, main::$_0&&) (invoke.h:60)
==4057510==    by 0x40156C: std::__invoke_result::type std::__invoke(main::$_0&&) (invoke.h:95)
==4057510==    by 0x401544: void std::thread::_Invoker >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (thread:264)
==4057510==    by 0x401514: std::thread::_Invoker >::operator()() (thread:271)
==4057510==    by 0x40148D: std::thread::_State_impl > >::_M_run() (thread:215)
==4057510==    by 0x49575F3: execute_native_thread_routine (thread.cc:80)
==4057510==    by 0x4840737: mythread_wrapper (hg_intercepts.c:387)
==4057510==    by 0x4BD33F8: start_thread (pthread_create.c:463)
==4057510==    by 0x4CED902: clone (clone.S:95)
==4057510==
==4057510== This conflicts with a previous read of size 1 by thread #1
==4057510== Locks held: none
==4057510==    at 0x4011E4: main (tiny.cpp:12)
==4057510==  Address 0x40406e is 0 bytes inside data symbol "_ZL5flip2"
==4057510==
==4057510== ----------------------------------------------------------------
==4057510==
==4057510== Possible data race during read of size 1 at 0x40406E by thread #1
==4057510== Locks held: none
==4057510==    at 0x4011E4: main (tiny.cpp:12)
==4057510==
==4057510== This conflicts with a previous write of size 1 by thread #2
==4057510== Locks held: none
==4057510==    at 0x401613: main::$_0::operator()() const (tiny.cpp:9)
==4057510==    by 0x4015DC: void std::__invoke_impl(std::__invoke_other, main::$_0&&) (invoke.h:60)
==4057510==    by 0x40156C: std::__invoke_result::type std::__invoke(main::$_0&&) (invoke.h:95)
==4057510==    by 0x401544: void std::thread::_Invoker >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (thread:264)
==4057510==    by 0x401514: std::thread::_Invoker >::operator()() (thread:271)
==4057510==    by 0x40148D: std::thread::_State_impl > >::_M_run() (thread:215)
==4057510==    by 0x49575F3: execute_native_thread_routine (thread.cc:80)
==4057510==    by 0x4840737: mythread_wrapper (hg_intercepts.c:387)
==4057510==  Address 0x40406e is 0 bytes inside data symbol "_ZL5flip2"
...

Note: You can see this code and its output in Compiler Explorer.
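The volatile qualifier in tiny.cpp does not synchronize threads in C++, which is why both tools report the accesses. A minimal sketch of one possible fix, assuming the handshake itself is the intended behavior, replaces the flags with std::atomic<bool>. The function name run_handshake is hypothetical, and the flags are made local only to keep the sketch self-contained.

```cpp
#include <atomic>
#include <thread>

// Same handshake as tiny.cpp, but the flags are std::atomic<bool>,
// so the concurrent reads and writes are synchronized and are not
// a data race.
bool run_handshake() {
  std::atomic<bool> flip1{false};
  std::atomic<bool> flip2{false};

  std::thread t([&]() {
    while (!flip1.load())
      std::this_thread::yield();  // wait for the main thread
    flip2.store(true);
  });

  flip1.store(true);
  while (!flip2.load())
    std::this_thread::yield();    // wait for the worker thread
  t.join();
  return flip1.load() && flip2.load();
}
```

Built with -pthread -fsanitize=thread, a program that calls run_handshake() should no longer produce the data race warning shown above.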

Recompiling libraries

AddressSanitizer automatically processes all calls to glibc. This does not apply to other system or user libraries. For AddressSanitizer to work best, you must also recompile such libraries with the -fsanitize=address option. This is not required for Valgrind.

The following error in the library used by libuser.c is still caught by AddressSanitizer thanks to the glibc interceptor, even though the library is not compiled with AddressSanitizer:

$ cat >library.c <<'EOF'
#include <string.h>
void library(char *s) {
  strcpy(s,"string");
}
EOF
$ cat >libuser.c <<'EOF'
#include <stdlib.h>
void library(char *s);
int main(void) {
  char *s = malloc(1);
  library(s);
  free(s);
}
EOF
$ clang -o library.so library.c -Wall -g -shared -fPIC
$ clang -o libuser libuser.c -Wall -g ./library.so -fsanitize=address
$ ./libuser
=================================================================
==128657==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011 at pc 0x000000484a6d bp 0x7fff13a4ace0 sp 0x7fff13a4a490
WRITE of size 7 at 0x602000000011 thread T0
    #0 0x484a6c in __interceptor_strcpy.part.0 (/tmp/libuser+0x484a6c)
    #1 0x7fae9f53512b in library /tmp/library.c:3:3
    #2 0x4f4abe in main /tmp/libuser.c:5:3
    #3 0x7fae9f1be1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #4 0x41c42d in _start (/tmp/libuser+0x41c42d)

0x602000000011 is located 0 bytes to the right of 1-byte region [0x602000000010,0x602000000011)
allocated by thread T0 here:
    #0 0x4bfcef in malloc (/tmp/libuser+0x4bfcef)
    #1 0x4f4ab1 in main /tmp/libuser.c:4:13
    #2 0x7fae9f1be1e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16

SUMMARY: AddressSanitizer: heap-buffer-overflow (/tmp/libuser+0x484a6c) in __interceptor_strcpy.part.0
...

In the following case, AddressSanitizer misses the memory corruption if the library has not been recompiled with AddressSanitizer:

$ cat >library.c <<'EOF'
void library(char *s) {
  const char *cs = "string";
  while (*cs)
    *s++ = *cs++;
  *s = 0;
}
EOF
$ cat >libuser.c <<'EOF'
#include <stdlib.h>
void library(char *s);
int main(void) {
  char *s = malloc(1);
  library(s);
  free(s);
}
EOF
$ clang -o library.so library.c -Wall -g -shared -fPIC
$ clang -o libuser libuser.c -Wall -g ./library.so -fsanitize=address
$ ./libuser
(nothing found by AddressSanitizer)

Valgrind can find the error without recompilation:

$ clang -o library.so library.c -Wall -g -shared -fPIC; clang -o libuser libuser.c -Wall -g ./library.so; valgrind ./libuser
...
==128708== Invalid write of size 1
==128708==    at 0x4849146: library (library.c:4)
==128708==    by 0x40116E: main (libuser.c:5)
==128708==  Address 0x4a57041 is 0 bytes after a block of size 1 alloc'd
==128708==    at 0x4839809: malloc (vg_replace_malloc.c:307)
==128708==    by 0x401161: main (libuser.c:4)
...

AddressSanitizer can also find the error if we recompile the library with AddressSanitizer:

$ clang -o library.so library.c -Wall -g -shared -fPIC -fsanitize=address
$ clang -o libuser libuser.c -Wall -g ./library.so -fsanitize=address
$ ./libuser
=================================================================
==128719==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011 at pc 0x7f7e4e68b269 bp 0x7ffc40c0dc30 sp 0x7ffc40c0dc28
WRITE of size 1 at 0x602000000011 thread T0
    #0 0x7f7e4e68b268 in library /tmp/library.c:4:10
    #1 0x4f4abe in main /tmp/libuser.c:5:3
    #2 0x7f7e4e3141e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16
    #3 0x41c42d in _start (/tmp/libuser+0x41c42d)

0x602000000011 is located 0 bytes to the right of 1-byte region [0x602000000010,0x602000000011)
allocated by thread T0 here:
    #0 0x4bfcef in malloc (/tmp/libuser+0x4bfcef)
    #1 0x4f4ab1 in main /tmp/libuser.c:4:13
    #2 0x7f7e4e3141e1 in __libc_start_main /usr/src/debug/glibc-2.32-20-g5c36293f06/csu/../csu/libc-start.c:314:16

SUMMARY: AddressSanitizer: heap-buffer-overflow /tmp/library.c:4:10 in library
...

Interaction of Sanitizers with _FORTIFY_SOURCE

By default, rpmbuild uses the -Wp,-D_FORTIFY_SOURCE=2 option, which implements its own memory access checks. Unfortunately, it disables some of the memory checks performed by AddressSanitizer. This problem may be fixed in the future. For now, to prepare for checking with Sanitizers, simply turn off _FORTIFY_SOURCE with -Wp,-U_FORTIFY_SOURCE (a more universal form of the plain -D_FORTIFY_SOURCE=0):

$ cat >strcpyfrom.spec <<'EOF'
Summary: strcpyfrom
Name: strcpyfrom
Version: 1
Release: 1
License: GPLv3+
%description
%build
cat >strcpyfrom.c <<'EOH'
#include <stdlib.h>
#include <string.h>
int main(void) {
  char *s = malloc(1);
  char d[0x1000]

; strcpy(d, s); return 0; } EOH gcc -o strcpyfrom strcpyfrom.c $RPM_OPT_FLAGS -fsanitize=address echo no error caught: ./strcpyfrom gcc -o strcpyfrom strcpyfrom.c $RPM_OPT_FLAGS -fsanitize=address -Wp,-U_FORTIFY_SOURCE echo error caught: ./strcpyfrom EOF $ rpmbuild -bb strcpyfrom.spec Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.KTLr7c + umask 022 + cd src/rpm/BUILD + cat + gcc -o strcpyfrom strcpyfrom.c -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/ rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fsanitize=address + echo no error caught: no error caught: + ./strcpyfrom + gcc -o strcpyfrom strcpyfrom.c -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc -switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs= /usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fsanitize=address -Wp,-U_FORTIFY_SOURCE annobin: strcpyfrom.c : Warning: -D_FORTIFY_SOURCE defined as 0 + echo error caught: error caught: + ./strcpyfrom ================ ===================================== ==412157==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011 at pc 0x7fe75d8b2075 bp 0x7ffccf5dd1e0 sp 0x7ffccf5dc990 READ of size 2 at 0x602000000011 thread T0 #0 0x7fe75d8b2074 (/lib64/libasan.so.6+0x52074 ) #1 0x4011be in main strcpyfrom.c:6 #2 0x7fe75d6bd1e1 in __libc_start_main ../ csu/libc-start.c:314 #3 0x40127d in _start (strcpyfrom+0x40127d) 0x602000000011 is located 0 bytes to the right of 1-byte 
region[0x6020000000100x602000000011)allocatedbythreadT0here:#00x7fe75d90b3cfin__interceptor_malloc(/lib64/libasanso6+0xab3cf)#10x4011b2inmainstrcpyfromc:4#20x40200f(strcpyfrom+0x40200f)SUMMARY:AddressSanitizer:heap-buffer-overflow(/lib64/libasanso6+0x52074)error:Badexitstatusfrom/var/tmp/rpm-tmpKTLr7c(%build)RPMbuilderrors:Badexitstatusfrom/var/tmp/rpm-tmpKTLr7c(%build)
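The library examples turn on whether a call actually reaches an AddressSanitizer interceptor. When the overflow happens inside strcpy, the call is routed to the runtime's replacement (__interceptor_strcpy in the trace), which can check the destination. When the library writes with its own loop, an uninstrumented library has nothing standing between the store and memory. As far as we can tell, _FORTIFY_SOURCE has a similar effect: it redirects strcpy to a fortified glibc variant (__strcpy_chk), which the runtime used in the example apparently does not intercept. The sketch below is only a toy model of the interceptor idea. It is not how AddressSanitizer is implemented (the real tool uses shadow memory, not a lookup table), and every toy_* name is hypothetical.

```c
/* Toy model of an interceptor. NOT AddressSanitizer's implementation;
 * all toy_* names are hypothetical and exist only for this illustration. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_BLOCKS 16
#define TOY_UNKNOWN ((size_t)-1)

/* Side table of heap blocks, standing in for the sanitizer's metadata. */
struct toy_block {
  const char *start;
  size_t size;
};

static struct toy_block toy_blocks[TOY_MAX_BLOCKS];
static size_t toy_block_count;

/* Allocation wrapper: remembers the size of each block it hands out.
 * Simplification: freed blocks are never removed from the table. */
static char *toy_malloc(size_t size) {
  char *p = malloc(size);
  if (p != NULL && toy_block_count < TOY_MAX_BLOCKS) {
    toy_blocks[toy_block_count].start = p;
    toy_blocks[toy_block_count].size = size;
    toy_block_count++;
  }
  return p;
}

/* Size of the block starting at p, or TOY_UNKNOWN if p is not tracked.
 * Simplification: only pointers to the start of a block are recognised. */
static size_t toy_room_at(const char *p) {
  for (size_t i = 0; i < toy_block_count; i++)
    if (toy_blocks[i].start == p)
      return toy_blocks[i].size;
  return TOY_UNKNOWN;
}

/* "Intercepted" strcpy: checks the destination before copying.
 * Returns 0 on success, -1 if the copy would overflow a tracked block. */
static int toy_strcpy(char *dst, const char *src) {
  assert(dst != NULL && src != NULL);
  size_t need = strlen(src) + 1; /* bytes written, including the NUL */
  size_t room = toy_room_at(dst);
  if (room != TOY_UNKNOWN && need > room)
    return -1; /* report instead of corrupting memory */
  memcpy(dst, src, need);
  return 0;
}
```

With s = toy_malloc(1), the call toy_strcpy(s, "string") returns -1 instead of writing 7 bytes, while a hand-written loop like the one in library.c never consults the table at all. Recompiling the library with -fsanitize=address is what puts a check in front of every such store.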

Conclusion

If you are used to Valgrind, try AddressSanitizer. To get started, just add the -fsanitize=address option to both compilation and linking (that is, to all of CFLAGS, CXXFLAGS, and LDFLAGS). If you like it, return to the section "Brief instructions for Sanitizers" for recommendations on fine-tuning.
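When adding -fsanitize=address to the build flags, it can help to confirm that the option really reached the compiler and that _FORTIFY_SOURCE is no longer active. The sketch below relies on two things we are confident of: Clang exposes __has_feature(address_sanitizer), and GCC defines __SANITIZE_ADDRESS__ under -fsanitize=address. The macro names BUILT_WITH_ASAN and FORTIFY_LEVEL and all helper functions are hypothetical, chosen only for this example.

```c
/* Sketch: detect an AddressSanitizer build and an active _FORTIFY_SOURCE.
 * BUILT_WITH_ASAN, FORTIFY_LEVEL and the helpers below are hypothetical names. */
#include <assert.h>

/* Clang: __has_feature(address_sanitizer). GCC: __SANITIZE_ADDRESS__. */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#if !defined(BUILT_WITH_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

/* After -Wp,-U_FORTIFY_SOURCE the macro is undefined; treat that as level 0. */
#if defined(_FORTIFY_SOURCE)
#  define FORTIFY_LEVEL (_FORTIFY_SOURCE + 0)
#else
#  define FORTIFY_LEVEL 0
#endif

/* 1 if this translation unit was compiled with AddressSanitizer, else 0. */
static int asan_enabled(void) { return BUILT_WITH_ASAN; }

/* The _FORTIFY_SOURCE level seen by this translation unit (0 if disabled). */
static int fortify_level(void) { return FORTIFY_LEVEL; }

/* 1 if the combination described in the _FORTIFY_SOURCE section is present. */
static int asan_fortify_conflict(void) {
  return asan_enabled() && fortify_level() > 0;
}

/* Optional guard for test binaries: abort early on the problematic mix. */
static void require_no_fortify_under_asan(void) {
  assert(!asan_fortify_conflict() && "rebuild with -Wp,-U_FORTIFY_SOURCE");
}
```

Calling require_no_fortify_under_asan() at the start of a test program is one possible use. The same condition could instead be turned into a compile-time #error if a hard build failure is preferred.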
