A case from the practice of code analysis

Greetings to all Habr readers. In addition to administrative work, my main activity is searching for various vulnerabilities. Most often, my toolkit is a set of some debuggers, dynamic analyzers, and other similar things. But sometimes I have to analyze the source code of an arbitrary degree curvatures understandability. And this is practically a separate parallel world in the field of security.

A 20 minute adventure, literally

A 20 minute adventure, literally

This time, my luck led me to analyze the source code of OpenJDK (you can look at it, for example, on github). I started with something simple: I got the static code analysis logs and started studying them. In general, the log was pretty boring – about 1500 warnings, 99% of which were false positives of the analyzer. The most annoying thing is that, despite the fact that I know in advance that almost always more than 99% of warnings are useless, I still need to spend time on each one to check whether it is included in that 1%.

I will still analyze one relatively interesting case. In the best traditions of the PVS-Studio blog, I will first show a piece of code and suggest finding the problem with your own eyes.

static boolean setHuffTable(JNIEnv *env,
                         JHUFF_TBL *huff_ptr,
                         jobject table) {

    jshortArray huffLens;
    jshortArray huffValues;
    jshort *hlensBody, *hvalsBody;
    jsize hlensLen, hvalsLen;
    int i;

    // lengths
    huffLens = (*env)->GetObjectField(env,
                                      table,
                                      JPEGHuffmanTable_lengthsID);
    hlensLen = (*env)->GetArrayLength(env, huffLens);
    hlensBody = (*env)->GetShortArrayElements(env,
                                              huffLens,
                                              NULL);
    CHECK_NULL_RETURN(hlensBody, FALSE);

    if (hlensLen > 16) {
        /* Ignore extra elements of bits array. Only 16 elements can be
           stored. 0-th element is not used. (see jpeglib.h, line 107)  */
        hlensLen = 16;
    }
    for (i = 1; i <= hlensLen; i++) {
        huff_ptr->bits[i] = (UINT8)hlensBody[i-1];
    }
    (*env)->ReleaseShortArrayElements(env,
                                      huffLens,
                                      hlensBody,
                                      JNI_ABORT);
    // values
    huffValues = (*env)->GetObjectField(env,
                                        table,
                                        JPEGHuffmanTable_valuesID);
    hvalsLen = (*env)->GetArrayLength(env, huffValues);
    hvalsBody = (*env)->GetShortArrayElements(env,
                                              huffValues,
                                              NULL);
    CHECK_NULL_RETURN(hvalsBody, FALSE);

    if (hvalsLen > 256) {
        /* Ignore extra elements of hufval array. Only 256 elements
           can be stored. (see jpeglib.h, line 109)                  */
        hlensLen = 256;
    }
    for (i = 0; i < hvalsLen; i++) {
        huff_ptr->huffval[i] = (UINT8)hvalsBody[i];
    }
    (*env)->ReleaseShortArrayElements(env,
                                      huffValues,
                                      hvalsBody,
                                      JNI_ABORT);
    return TRUE;
}

If anyone wants to see it for themselves on GitHub, here is a link to the file jdk/src/java.desktop/share/native/libjavajpeg/imageioJPEG.c. The function starts on line 746.

Although the problem can be seen in the given piece of code, to understand the visible part of the problem you need to look at the neighboring header files.

Hidden text

Actually, the problem is in line 46 of the fragment above or line 791 on github.

Note that one variable is checked. hvalsLenand the meaning of the other one changes – hlenslenwhich is not used further in the code. That is, if hvalsLen will be greater than 256, then the calls in line 49 will go beyond the expected limit.

Now I'll tell you why this is especially bad. The thing is that hvalsBody is an array of fixed size of 256 elements. The memory for this array (or rather for the structure of which this array is a field) is allocated on the heap, so potentially everything looks like a heap buffer overflow (CWE-122).

Now let's start thinking logically and with a lyrical digression. When I triage warnings, I assign one of the following possible statuses: “potential error”, “unlikely error”, “not an error”, and “false positive of the analyzer”.

  • False positive from the analyzer. Exactly what is written – the code analyzer gave out some nonsense (maybe the parser broke or some other logic glitched). In short, the analyzer is lying.

  • Not an error. The analyzer is basically right with its warning, but for some reason it is not an error.

  • Potential error. This is what I write when I see a really serious bug in the code. But I'm cautious – it's not always possible to guess what the error will actually turn into, so I still write “potential”.

  • An unlikely error. This is the most interesting part – I see a problem in the code, but at the same time I understand that the strange code is in some rather ordinary place. And here we have the following situation. If the problem were serious, the code would constantly crash or work poorly. But in most cases, crashes are not that frequent. So there is some trick here, for example, even if I do not see any restrictions in the code, they may still exist somewhere.

The code fragment above clearly shows a “low probability error” situation: if the virtual machine crashed because of a crooked jpeg image, it would have been detected long ago. So, the problem is probably not that serious and, perhaps, the check for 256 is redundant and is actually provided by logic, and not by this check itself.

I went online with this question, because I will leave studying standards to the weak-willed. And I found a wonderful code in Python that decodes jpeg. Here I firmly believed that this is my path – code analysis java-runtime written in Prosby studying pythonic code.

For my purposes, everything turned out to be quite simple – at the beginning of the file 0xFFD8, at the end of the file 0xFFD9, and between them we simply stick a block with the header 0xFFC4, after which there are two bytes of length (which gives a potential 65535, which is really good, more than 256), a descriptor byte and then there are correctly formed arrays of these 16 and 256+ bytes.

For example, here

For example, here

IrfanView gives a completely honest error “Bogus Huffman table definition”, apparently, it knows something. But we are not checking IrfanView, but OpenJDK. Therefore, we download the runtime binaries, enable plugins in vscode and write our short masterpiece.

try {
  BufferedImage img = ImageIO.read(new File("C:\\tmp\\huffman.jpg"));
} catch (IOException e) {
  System.err.println(e.getMessage());
}

The result of the launch, unfortunately, is not a starting calculator, but just the same error: Bogus Huffman table definition.

Then it gets boring and sad: we find a message in github. And then its use in different files.

For example here
Or here

In general, the heap, so to speak, did not overflow, the simple option did not work. Well, okay, I at least found where this is checked in the code.

An attentive programmer might reasonably note that I tried to cheat loading the image and calling that function with the error are not generally related. That is, the check performed, to put it mildly, does not cover all cases and, since I took it upon myself, I need to check it properly. The real scenario looks like this: they read the image, worked with it, and somewhere after that the above-mentioned function was called.

On the one hand, it is extremely logical. When loading, the check was performed, but then something was done with the picture, as a hypothetical example – it was rotated, and so, suddenly, it turned out that the Huffman table became wider.

On the other hand, if you use common sense, then vulnerability becomes a problem only if it can be triggered in some ready-made code. If you also need to write special code in Java for the trigger, then it doesn't make much sense. In general, you can do simple things, but not simple ones.

What conditions must be met for an error to occur? hvalsLen must be greater than 256 elements, and this array is taken as a field JPEGHuffmanTable_valuesID in the object class tablewhich is transmitted as job project table into function setHuffTable. Let's see who calls the function in question: there are only two options and both are in the function setHTables V in the same file. We continue to build the collstack upwards, in the calls the argument table passed as array elements jobjectArray DCHuffmanTables And jobjectArray ACHuffmanTableswhich are both arguments to the function setHTables. Very convenient, let's go higher.

Function setHTables is already called in three different places. But the names of the calling functions are very specific and clearly indicate that they have Java prototypes, and as arguments for setHTables arguments of the corresponding functions are passed without processing. So we immediately look for native in the java code.

For these functions, the arguments we are interested in are of the type JPEGHuffmanTableThe constructor of this class greets us with joyful checks.

JPEGHuffmanTable class constructor

This means that you cannot create an object with a field using this constructor. values longer than 256 elements. The field itself is private, and its value is returned via a copy, which eliminates the possibility of external modification.

The last option to spoil the object is the second constructor, which fills the fields without checking.

JPEGHuffmanTable class constructor

But there is nothing special to check here – the constructor is closed, the comment says that it is for internal use, and it is called on quite constant data. So we can sum it up.

The general conclusion we can make is this: despite the obvious typo in the check, it cannot affect the security of openJDK. An incorrectly formed image will pass several checks during loading, which will not allow access beyond the allocated memory. No correct interactions in Java will allow us to create an object processed in this function so that it implements an exit beyond the buffer.

And that's it. I promised a 20-minute adventure.

Well, and a bit of empty talk at the end of the article. Did I inform the developers, did I send them a patch, a pull request or something else? Nope. There are even several reasons for this. In fact, the error does not deserve special attention, that's one. I don't know what is better in this case – to fix the check or remove it altogether, that's two. From the OpenJDK website, I didn't understand where it would be correct to write about this, that's three. I was too lazy, that's four.

Conclusion on the article, in three words

Conclusion on the article, in three words

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *