Reverse Engineering Go Binaries with Ghidra. Part 2

This is the second part of our series on reverse engineering Go binaries using Ghidra. In the previous article we discussed how to recover function names in stripped Go files and how to help Ghidra recognize and define strings in these binaries. We focused on ELF binaries and only briefly mentioned the differences for PE files.

This article will discuss a new topic – the process of extracting type information from Go binaries. We will also explain in more detail how to handle Windows PE files. Finally, we explore the differences between the different versions of Go, including changes that have occurred since our last blog post.

At the time of writing, the latest version of Go is go1.18 and we are using the latest version of Ghidra, 10.1.4. As before, all the Ghidra scripts we have created can be found in our GitHub repository, along with the “Hello World” test files. The malicious files used in the examples can be downloaded from VirusTotal.

Extracting Types

The following article gives a detailed explanation of the Go type system.

Go has built-in basic types such as bool, string, and float64, as well as so-called composite types such as structs, functions, and interface types. Go also allows users to declare their own types. Extracting these types is an important step in static malware analysis and helps analysts understand specific parts of the code.

Below you can find some example type definitions extracted from sys.x86_64_unp using redress.

redress is a tool for analyzing stripped Go binaries compiled with the Go compiler. It extracts data from the binary and uses it to reconstruct symbols and perform analysis. Essentially, it tries to “dress up” a “stripped” binary. It can be downloaded from its GitHub page.

Definition of the miner.Process structure type

Defining the interface type _Exploit.exploiter

To understand how type definitions are stored in a binary file, we need to look at the Go source code.

The first useful piece of information is the list of available kinds of types.
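As a rough illustration, the kinds can be represented as a lookup table. This is a minimal sketch in standalone Python 3, not code from our scripts: the names KIND_NAMES, KIND_MASK and kind_name are made up for this example, while the numeric values follow the Kind constants in Go's reflect package.

```python
# Kind values mirror the Kind constants in Go's reflect package (reflect/type.go).
# The list index is the numeric kind value.
KIND_NAMES = [
    "Invalid", "Bool", "Int", "Int8", "Int16", "Int32", "Int64",
    "Uint", "Uint8", "Uint16", "Uint32", "Uint64", "Uintptr",
    "Float32", "Float64", "Complex64", "Complex128",
    "Array", "Chan", "Func", "Interface", "Map", "Pointer",
    "Slice", "String", "Struct", "UnsafePointer",
]

# Only the low five bits of the kind byte hold the kind; higher bits are flags.
KIND_MASK = (1 << 5) - 1


def kind_name(kind_byte):
    """Return the kind name for the raw kind byte of a type descriptor."""
    kind = kind_byte & KIND_MASK
    if kind < len(KIND_NAMES):
        return KIND_NAMES[kind]
    return "Unknown(%d)" % kind
```

For example, a kind byte of 25 maps to Struct, and flag bits above the mask are ignored.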

Detailed descriptions of the different types follow. According to the documentation, rtype is the common implementation of most values. It is embedded in other struct types, so the most important step is to understand the rtype structure.

There is quite a lot of useful information here, but the most important data for reverse engineering is the kind and the name offset. These tell us what kind of type we are dealing with and what that particular type is called. In the examples above, the first is a struct type called miner.Process and the second is an interface type called exploit.exploiter.
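The following sketch shows how the kind and the name offset could be read from raw bytes. It is standalone Python 3 rather than an excerpt from our Ghidra scripts, and it assumes the go1.18 rtype layout on a 64-bit little-endian target (48 bytes). The names RType, parse_rtype and read_name are hypothetical. Because the encoding of the name length differs between Go versions, the decoder takes a flag instead of guessing. In a real binary the name offset is relative to the start of the types region; the sketch simply treats it as an offset into the buffer.

```python
import struct
from collections import namedtuple

# Assumed field layout of rtype on a 64-bit little-endian target (go1.18).
RType = namedtuple(
    "RType",
    "size ptrdata hash tflag align field_align kind equal gcdata str_off ptr_to_this",
)
RTYPE_FORMAT = "<QQIBBBBQQii"  # 48 bytes in total


def parse_rtype(data, offset):
    """Unpack one rtype header from data, starting at offset."""
    return RType(*struct.unpack_from(RTYPE_FORMAT, data, offset))


def read_name(data, offset, varint_length=True):
    """Decode a type name: one flag byte, a length, then the name bytes."""
    pos = offset + 1  # skip the flag byte
    if varint_length:
        # Newer encoding: the length is stored as a varint.
        length, shift = 0, 0
        while True:
            byte = data[pos]
            pos += 1
            length |= (byte & 0x7F) << shift
            if byte & 0x80 == 0:
                break
            shift += 7
    else:
        # Older encoding: the length is a two-byte big-endian integer.
        length = (data[pos] << 8) | data[pos + 1]
        pos += 2
    return data[pos:pos + length].decode("utf-8", errors="replace")
```

A caller would combine the two: parse the header, mask the kind byte, then pass the resolved name offset to read_name.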

To find more information about a specific kind of type, we have to examine the description of each kind separately. Below we show one example, but all of them are in the same file, type.go. Let’s look at the struct type.

The struct type starts with the rtype structure we just discussed. It is followed by the name of the package containing that particular struct. In the case of miner.Process, the package is called shell/miner. Finally, there is an array of struct fields.

The structField structure contains the field name and a pointer to an rtype structure, which tells us the type of the field. So, in our example, the miner.Process structure contains five fields:

  1. A field of type int called pid

  2. It is followed by three fields of type string: name, path and cmdline

  3. A slice of uint8 values called buf
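To make the layout concrete, here is a minimal sketch of walking the field array of a struct type descriptor. It is standalone Python 3 and assumes the go1.18 layout on a 64-bit little-endian target: a 48-byte rtype, a package path pointer, then a slice header (pointer, length, capacity), with each structField holding a name pointer, a type pointer and an offsetEmbed word. The function name and the image_base parameter are hypothetical, and other Go versions may lay out these structures differently.

```python
import struct

RTYPE_SIZE = 48         # size of the embedded rtype on 64-bit targets
STRUCT_FIELD_SIZE = 24  # name pointer, type pointer, offsetEmbed


def parse_struct_fields(data, struct_off, image_base):
    """Return (name_addr, type_addr, byte_offset, embedded) for each field.

    data holds the bytes mapped at image_base; struct_off is the position
    of the struct type descriptor inside data.
    """
    # After the rtype header: pkgPath pointer, then the fields slice header.
    _pkg_path, fields_addr, count, _cap = struct.unpack_from(
        "<QQQQ", data, struct_off + RTYPE_SIZE)
    fields = []
    for i in range(count):
        entry = fields_addr - image_base + i * STRUCT_FIELD_SIZE
        name_addr, type_addr, offset_embed = struct.unpack_from("<QQQ", data, entry)
        # offsetEmbed stores the byte offset shifted left by one bit;
        # bit 0 marks an embedded field.
        fields.append((name_addr, type_addr, offset_embed >> 1, bool(offset_embed & 1)))
    return fields
```

Each returned type address points at another rtype, which is what makes the extraction recursive.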

Now we understand what information to look for and how the most important type data is stored. The remaining question is how to find these structures inside the binary file.

Module data

First, we need to understand the so-called moduledata structure. This table has been present in Go binaries since version 1.5. It has changed several times over the years, so we always have to take into account the version of Go that was used to build a particular binary. A good introduction to moduledata can be found here. Below we discuss the latest form of this structure, as available in go1.18.

Let’s look at the source again.

According to this, “moduledata records information about the layout of the executable image.”

This table contains a lot of additional useful information, but the following data is useful for extracting types:

  1. types, etypes – address of the beginning and end of the section containing type descriptions

  2. typelinks – a slice containing 32-bit integers that are offsets of type structures from types

In ELF binaries it is very easy to find these addresses and offsets without even locating the moduledata structure. The offsets can be found in the .typelinks section, and types is actually the start of the .rodata section.
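Resolving the offsets is simple arithmetic. The sketch below is standalone Python 3, assumes a little-endian target, and uses a made-up function name. It expects the raw bytes of the typelinks table and the address where the types region starts, both of which the caller has to obtain from the binary.

```python
import struct


def typelink_addresses(typelinks, types_base):
    """Turn the raw bytes of the typelinks table into type descriptor addresses."""
    count = len(typelinks) // 4  # each entry is a 32-bit integer
    offsets = struct.unpack_from("<%di" % count, typelinks)
    return [types_base + off for off in offsets]
```

For an ELF file, typelinks would be the contents of the .typelinks section and types_base the start address of .rodata.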

Retrieving type definitions is a recursive process. As you can see from the examples above, some types refer to other types, such as struct fields within a struct type.

Note: Our script currently extracts this information from ELF binaries without locating the moduledata structure. However, be aware that this approach may fail due to changes in future versions of Go or due to obfuscation that changes section names. In that case, use the same method as for PE files, which is explained in the section on Windows PE files below.

Summary

In this section, we summarize the necessary steps to extract type information from ELF binaries.

  1. Find the .typelinks section and read the offsets it contains.

  2. Locate the type descriptions by adding these offsets to the start of the .rodata section.

  3. Determine the kind of each type.

  4. Extract the available type information according to its kind.

  5. Find the types it references and repeat steps 3-5 for each of them.

Our script follows the steps above and creates labels and comments that help with reverse engineering the binaries. Each type is labeled with its name, and for some types additional information is added in the form of pre comments.
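The recursion in steps 3 to 5 can be sketched as a worklist that remembers which descriptors it has already visited. This is an illustrative outline in standalone Python 3, not the code of our script: walk_types is a made-up name, and describe is a hypothetical callback that the caller supplies to parse one descriptor and return the addresses of the types it refers to.

```python
def walk_types(roots, describe):
    """Visit every type reachable from roots exactly once.

    describe(addr) is a caller-supplied callback that parses the descriptor
    at addr and returns the addresses of the types it refers to.
    """
    seen = set()
    order = []
    pending = list(roots)
    while pending:
        addr = pending.pop()
        if addr in seen:
            continue  # already handled; this also guards against reference cycles
        seen.add(addr)
        order.append(addr)
        pending.extend(describe(addr))
    return order
```

Tracking visited addresses matters because types can refer to each other in cycles, for example a struct with a pointer field to its own type.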

The script currently adds detailed descriptions to function types, interfaces, and structures. Next we’ll look at an example of what this looks like in Ghidra.

Example: Extracting Golang types into Ghidra

To give you a better idea of what this looks like, here is a detailed example in which extracting type information provides quick and easy hints for reverse engineering a Go binary.

The eCh0raix Ransomware

After restoring the function names and strings (see the previous post for more details), we already have a lot of useful information about the purpose of the file. We can easily find the main functions and get some insight into their behavior. In the example below, we’ll look at the main.getInfo function, where we see some network communication taking place. The question is what data is transferred through this connection. Just before the runtime.newobject function is called, we see an interesting data reference: DAT_824bd20.

Unfortunately, looking at the raw data at this address does not tell us much. However, a closer look reveals that a type description structure is actually stored there.

After running our type extraction script, we see in the listing view that, just above the call to runtime.newobject, the data reference has been renamed to something meaningful: main.Info, which is the name of the extracted type.

If we follow this reference, we find more information about this specific type. In this case, main.Info is a struct type containing two fields (RsaPublicKey and Readme), both of type string. From this, we can safely assume that this is where the RSA public key and the contents of the ransom note are transferred between the C2 server and the victim.

Reverse engineering Windows PE files using Ghidra

As we saw above, extracting type information from ELF binaries requires just a few simple steps, and thanks to clearly defined sections such as .typelinks (or .gopclntab in the case of function name recovery), finding the required data is very easy. Unfortunately, these named sections are not available in PE files. So, to extract the type information, we have to locate the moduledata table directly.

In our script, the findModuledata and isModuledata functions are responsible for locating the moduledata table. The script takes advantage of the fact that this table starts with a pointer to the pclntab structure (pcHeader in later versions). So we first locate the pclntab structure and then follow the references to it, since one of the references pointing to pclntab must be the start of moduledata. Finally, we check other fields of moduledata to make sure we have found the correct address.
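Outside of Ghidra, where no reference database is available, the same idea can be approximated by a raw byte search for the pclntab address. The sketch below is a stand-in for that reference lookup, not the implementation of findModuledata. It is standalone Python 3, assumes a little-endian target, and uses a hypothetical function name and parameters. Each hit is only a candidate and still has to be validated against the other moduledata fields.

```python
import struct


def find_pointer_refs(data, base, target_addr, ptr_size=8):
    """Return addresses in data (mapped at base) that hold a pointer to target_addr."""
    needle = struct.pack("<Q" if ptr_size == 8 else "<I", target_addr)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        # Pointers are stored at aligned positions (assumes base itself is aligned).
        if pos % ptr_size == 0:
            hits.append(base + pos)
        pos = data.find(needle, pos + 1)
    return hits
```

Here data would be the contents of a data section, base its load address, and target_addr the address of the pclntab structure found earlier.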

The findPclntabPE and isPclntab functions are used to find the pclntab structure, which in ELF files is a separate section called .gopclntab. For PE files, we search for the magic value at the beginning of the structure and check the next few bytes for known values. More information about the structure of pclntab can be found in our previous article on reverse engineering Go binaries.
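A magic-value search of this kind might look as follows. This is a sketch in standalone Python 3 rather than the code of findPclntabPE and isPclntab, and the function names are made up. The magic values and the header checks (two zero padding bytes, an instruction size quantum of 1, 2 or 4, and a pointer size of 4 or 8) follow the Go sources; a little-endian target is assumed.

```python
import struct

# Magic values of the pclntab header as defined in the Go sources.
PCLNTAB_MAGICS = {
    0xFFFFFFFB: "go1.2",
    0xFFFFFFFA: "go1.16",
    0xFFFFFFF0: "go1.18",
}


def is_pclntab(data, pos):
    """Check the header bytes that follow the magic value."""
    if pos + 8 > len(data):
        return False
    magic, pad1, pad2, min_lc, ptr_size = struct.unpack_from("<IBBBB", data, pos)
    return (magic in PCLNTAB_MAGICS and pad1 == 0 and pad2 == 0
            and min_lc in (1, 2, 4) and ptr_size in (4, 8))


def find_pclntab(data):
    """Return (offset, header version) of a plausible pclntab header, or None."""
    for magic, version in PCLNTAB_MAGICS.items():
        needle = struct.pack("<I", magic)
        pos = data.find(needle)
        while pos != -1:
            if is_pclntab(data, pos):
                return pos, version
            pos = data.find(needle, pos + 1)
    return None
```

Checking the bytes after the magic value filters out most accidental matches, since four bytes of mostly 0xFF are not rare in a binary.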

Recovering function names

Our function name recovery script has been updated to work with both ELF and PE files, as well as the latest version of Go. We use the same functions as for type extraction to find the pclntab structure and from there find the function names – everything works the same as for ELF binaries.

Golang version differences

The biggest challenge in analyzing Go binaries is the constant version changes. What works for one version may not work for another. For this reason, we must keep an eye on version updates and update our scripts accordingly. Most importantly, we must be able to determine the Go version of a particular binary file in order to properly parse that file.

Our current approach is string-based, which means we look for the string “go1.x” in the binary and use the first occurrence to determine the version. Although this approach worked in all cases where we used our scripts to analyze malware, it has several disadvantages:

  1. It’s slow.

  2. Strings of different versions can be found in the same binary if certain Go packages are included from different versions.

  3. Can be easily faked.
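To illustrate the second disadvantage, the sketch below collects every version string instead of stopping at the first one. It is a variation on the string-based approach, not the code our scripts use. It is standalone Python 3, and the regular expression and function name are assumptions made for this example.

```python
import re
from collections import Counter

# Matches strings such as go1.18 or go1.16.5 in raw bytes.
GO_VERSION_RE = re.compile(rb"go1\.\d+(?:\.\d+)?")


def go_version_candidates(data):
    """Count every go1.x version string found in the raw bytes."""
    return Counter(m.group().decode("ascii") for m in GO_VERSION_RE.finditer(data))
```

If the resulting counter holds more than one distinct version, the first occurrence alone is not a reliable indicator and the result should be checked manually.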

According to our research, the most important version changes are the following:

  1. Pclntab header updates (available from version 1.2, changes in versions 1.16, 1.18)

  2. Moduledata structure updates (available from version 1.5, changes in versions 1.7, 1.8, 1.10, 1.16)

  3. Type name structure update (1.18)

Script improvements and future plans

  1. Improve the method for detecting the Go version.

  2. Add detailed type definitions for kinds other than functions, structures, and interfaces. (At this time, we may miss a few type declarations because we do not perform the recursion step for kinds such as chan or map.)

Links and further reading

Source:
https://cujo.com/reverse-engineering-go-binaries-with-ghidra-part-2-type-extraction-windows-pe-files-and-golang-versions/
