Reverse engineering applications for absolute beginners

Reverse engineering is the study of a device or a program in order to understand how it works or to discover undocumented capabilities. It plays a significant role in information security: specialists use it to investigate malicious applications and understand how they work, for example in order to write signatures for antivirus databases and protect other users from an impending digital threat.

There are 3 methods of reverse engineering:

  • analysis of application data exchange using various traffic analyzers;

  • disassembly of the program’s machine code (this kind of study takes quite a lot of time);

  • decompilation of the program to recover its source code in a high-level programming language.

Disassembling machine code is inextricably linked with reverse engineering, but it can easily discourage a beginner from ever going near it. The reason is the level of training it requires and the time the “disassembly” takes. Nevertheless, each case can turn into a rather fascinating “journey” into the bowels of the application. So today we will disassemble one application step by step to show that disassembly is not at all scary and that, with due diligence and patience, you can level up a new skill.

As a test case, let’s take the task that was used in Test lab 15. For greater clarity, the program was compiled without debugging information and without optimizations (gcc -O0). The strip utility was then run on it with the -s parameter, which removes all symbolic information that the executable does not need in order to run, such as variable and function names.

The working tool will be Radare2, a cross-platform set of utilities for reverse engineering and debugging. It includes tools for analysis, disassembly, decompilation and patching. Unlike the popular IDA Pro, it is distributed free of charge.

Installation

The installation and update method recommended by the developers is to build from the official git repository. The git, build-essential and make packages must already be installed on the system:

$ sudo apt install git build-essential make
$ git clone https://github.com/radareorg/radare2
$ cd radare2
$ sys/install.sh

It is recommended to run the installation as a regular user rather than as root; otherwise the script will drop its privileges by itself.

Next, we install a graphical shell for Radare2: the official GUI, called Iaito. First install the packages that Iaito requires:

$ sudo apt install qttools5-dev-tools qt5-default libqt5svg5-dev

For Debian-based Linux distributions there are ready-made packages; links to them are on the Iaito releases page on GitHub. Download and install the required version of the package:

$ wget https://github.com/radareorg/iaito/releases/download/5.2.2/iaito_5.2.2_amd64.deb
$ sudo dpkg -i iaito_5.2.2_amd64.deb

Now let’s install the r2ghidra plugin, which integrates the Ghidra decompiler into Radare2. The plugin does not require a separate installation of Ghidra, as it contains everything it needs. It is available for installation as an r2pm package:

$ r2pm update
$ r2pm -ci r2ghidra

The installed plugin is integrated into the Iaito GUI automatically. After installation, we launch the graphical shell and, if everything was done correctly, see the start screen:

Now we can start on our example. The essence of the example program is as follows: when launched, it prints a token, which is needed to record completion of the task, and a private SSH key. But something went wrong: the key is printed in a corrupted form and the token is not accepted as correct.

Running the sample program for the first time

Open the file in Iaito and leave the default analysis settings:

Settings

After Radare2 has analyzed the file, we look at the result, which opens in the Dashboard tab:

The program is written in C and compiled for 64-bit Linux. On the left, we see the list of functions that Radare2 was able to detect. Among them are printf, puts and putchar, imported from the libc library, which print formatted strings, plain strings and single characters.

main is the program’s main function; execution starts with it. Double-clicking its name opens the Disassembly tab with the result of disassembling it:

Result

A little about Assembler

Assembler (assembly language) is a low-level, machine-oriented programming language. It is a notation used to represent machine-code programs in an easy-to-read form. It takes its name from the assembler, the utility that translates such a program into the machine code of the processor.

Assembler commands

Each Assembler command is an instruction for the processor. The command syntax consists of several parts:

Command – specifies which operation is to be performed. For example:

  • mov – data transfer command. Copies the contents of one operand to another;

  • lea – calculates the effective address of the source operand and stores it in the register;

  • cmp – comparison of two operands;

  • unconditional and conditional jumps (jmp, jne, je, jle, …) – jumps to labels, taken either always or only when a condition is met. For example, jmp @exit jumps unconditionally to the label exit;

  • nop – a single-byte command that does nothing and only takes up space and time. It is mainly used to create a delay in a program or as a placeholder for removed instructions. For example, in cracked programs the command that checks the license key is replaced with “do nothing”;

  • etc.

Operands – what the command is executed on. Operands can be register names, memory locations, or immediate values written directly in the instruction.

Comment – as the name suggests, a note that makes the code easier to read. It is written after a semicolon.

Labels – names that mark sections of code. They also improve readability, but above all they are needed as targets for jumps to the section they mark.

mov ax, 0 ; Put the value 0 into the ax register

Where:

  • mov – command (moving a value from one operand to another);

  • ax, 0 – operands (register and value);

  • ; and the text after it – comment.

Consider another example, showing what raising a number to a power looks like in Assembler:

mov ax,2 ; Put the value 2 into the AX register
mov bx,ax ; Copy the value of the AX register into BX
mul bx ; Two mul commands raise the number 2 to the third power
mul bx

In a high-level language such as C, the same action looks like this:

pow (2, 3);
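To make the correspondence concrete, here is a minimal C sketch that replays the assembly listing step by step. The variables ax, bx and dx merely stand in for the 16-bit registers, and the function name is made up for this illustration. It assumes the 16-bit form of mul, which multiplies AX by the operand and leaves the low half of the product in AX and the high half in DX.

```c
#include <stdint.h>

/* Hypothetical helper: replays the four assembly commands on
 * ordinary variables that stand in for the registers. */
uint16_t pow_2_3_like_asm(void)
{
    uint16_t ax, bx, dx;
    uint32_t product;

    ax = 2;                       /* mov ax,2  */
    bx = ax;                      /* mov bx,ax */

    product = (uint32_t)ax * bx;  /* mul bx: DX:AX = AX * BX */
    ax = (uint16_t)product;
    dx = (uint16_t)(product >> 16);

    product = (uint32_t)ax * bx;  /* mul bx again */
    ax = (uint16_t)product;
    dx = (uint16_t)(product >> 16);

    (void)dx;                     /* the high half stays 0 for values this small */
    return ax;                    /* 2 * 2 * 2 */
}
```

Unlike pow, which works on floating-point numbers, this version works on integers only, just as the assembly does.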

Back to our assignment

To better understand the logic of the program’s execution, you can switch to the Graph tab at the bottom of the window. There we see the blocks of commands of the current function and the transitions between them, which Radare2 builds from the conditional and unconditional jump commands.
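The blocks in the graph are easier to read once you have seen how a familiar construct turns into jumps. Below is a rough C sketch, not taken from the program under study: the same counting loop is written once as a for loop and once with goto, split into the kind of blocks a disassembler would draw. The function names and labels are invented for illustration, and the exact block order a compiler emits may differ.

```c
/* A loop as it is written in a high-level language. */
int sum_for(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* The same loop in the shape an unoptimized compiler typically emits:
 * each label starts one block, each goto is one edge in the graph. */
int sum_blocks(int n)
{
    int total = 0;
    int i = 0;
    goto check;             /* jmp: unconditional jump to the condition */

body:
    total += i;
    i++;                    /* falls through into the check block */

check:
    if (i < n) goto body;   /* cmp + jl: conditional jump back to the body */

    return total;           /* exit block, reached when the jump is not taken */
}
```

Both functions return the same value for any n; only the second one shows the jumps explicitly.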

Zooming on this tab is done with the keyboard shortcuts Ctrl + “-” and Ctrl + “+”. We could start working out how the program operates from here, but it is possible to view the program in an even more “readable” form. Switch to the Decompiler tab at the bottom of the window, and we see pseudocode obtained by decompilation (restoring code in the language in which the program was written, in our case C) using Radare2’s built-in decompiler.

The resulting text still contains many references to registers and unconditional jumps. Let’s switch to the Ghidra decompiler that we installed earlier. To do this, select “pdg” instead of “pdc” in the drop-down list in the lower right corner of the window.

Now the program code has become almost completely readable, with the exception of variable names.

undefined8 main(void)
{
    uint32_t uVar1;
    uint32_t uVar2;
    uint32_t var_ch;
    undefined8 var_8h;
    
    puts("Token:");
    fcn.00001189(0x5020, 0x5090);
    for (var_8h._0_4_ = 0; (int32_t)var_8h < 0xf; var_8h._0_4_ = (int32_t)var_8h + 1) {
        putchar(*(undefined *)((int64_t)(int32_t)var_8h * 8 + 0x5020));
    }
    puts(0x2c78);
    puts("Key:");
    printf("\n-----BEGIN RSA PRIVATE KEY-----");
    var_ch = 0;
    for (var_8h._4_4_ = 0; (int32_t)var_8h._4_4_ < 0xc68; var_8h._4_4_ = var_8h._4_4_ + 2) {
        if ((var_8h._4_4_ & 0x7f) == 0) {
            putchar(10);
        }
        uVar1 = (*(uint8_t *)((int32_t)var_8h._4_4_ + *(int64_t *)0x5098) & 0x1f) + 9;
        uVar2 = (*(uint8_t *)(*(int64_t *)0x5098 + (int64_t)(int32_t)var_8h._4_4_ + 1) & 0x1f) + 9;
        putchar((char)uVar2 + (char)(uVar2 / 0x19) * -0x19 + ((char)uVar1 + (char)(uVar1 / 0x19) * -0x19) * '\x10' ^
                *(uint8_t *)((int64_t)(int32_t)var_ch * 8 + 0x5020));
        var_ch = var_ch + 1;
        if (var_ch == 0xf) {
            var_ch = 0;
        }
    }
    putchar(10);
    puts("-----END RSA PRIVATE KEY-----");
    return 0;
}
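Two compiler idioms make the second loop look more cryptic than it is. The expression x + (x / 0x19) * -0x19 is simply x % 25, and (c & 0x1f) + 9 taken modulo 25 maps the characters 0-9 and a-f to the values 0-15, so each pair of input characters appears to be one hex-encoded byte. The sketch below restates that arithmetic in plain C. The function names are invented here, the multiplier written as a character constant in the listing is read as 0x10, and the “hex digit” interpretation is an inference from the arithmetic, not something the decompiler states.

```c
#include <stdint.h>

/* Hypothetical name. Same arithmetic as uVar1/uVar2 in the listing:
 * ((c & 0x1f) + 9) % 25 turns '0'..'9' into 0..9 and 'a'..'f' into 10..15. */
unsigned hex_digit_value(uint8_t c)
{
    unsigned v = (c & 0x1fu) + 9u;
    return v % 25u;            /* the listing spells this v + (v / 25) * -25 */
}

/* Hypothetical name. One iteration of the key loop: two input characters
 * form a byte (high digit * 0x10 + low digit), which is then XORed with
 * the current token character. */
uint8_t decode_key_char(uint8_t hi, uint8_t lo, uint8_t token_char)
{
    unsigned byte = hex_digit_value(hi) * 0x10u + hex_digit_value(lo);
    return (uint8_t)(byte ^ token_char);
}
```

With a token character of 0 the XOR changes nothing, so decode_key_char('4', '1', 0) yields the byte 0x41.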

In the code, we see that the “Token:” line is printed first, and then a function is called with two parameters. Next comes a loop over the variable var_8h, which runs through the values from 0 to 14 inclusive and prints one character per iteration, read from memory address 0x5020 plus the counter multiplied by 8. Since only one byte is read at each 8-byte step, we can conclude that an array of 15 structures, each 8 bytes in size, is located in memory starting at address 0x5020. It is also worth noting that the address 0x5020 is passed as the first parameter to the function called before this loop. For simplicity, we will refer to this array as the “token” below.

Further on, the code prints the opening line of the private key and then prints the private key character by character in a loop. Inside this key output loop, the variable var_ch cycles repeatedly through the previously discovered array of structures. Before being printed, each character of the private key is XORed (exclusive OR) with the current token character. After the loop, the line that closes the private SSH key is printed.

Since the token printed by the program is incorrect, we can conclude that something is wrong in the previously discovered two-parameter function fcn.00001189, which is called before the token is printed. Let’s go to it by double-clicking its name in the list of functions on the left.
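The layout inferred from main can be written down as a small C sketch. The structure and function names are hypothetical, and since main reads only the first byte of each 8-byte element, the remaining seven bytes are left as an opaque field. The function models the var_ch counter: it walks the data and wraps around the 15-element token.

```c
#include <stddef.h>
#include <stdint.h>

#define TOKEN_LEN 15   /* 0xf elements, as in the loop bound in main */

/* Hypothetical reconstruction of one 8-byte element at 0x5020. */
struct token_item {
    uint8_t ch;         /* the byte main prints with putchar */
    uint8_t unknown[7]; /* not used by main; meaning not established here */
};

/* XOR every byte of data with the token characters, cycling the token
 * index the way var_ch is reset to 0 after reaching 0xf. */
void xor_with_token(uint8_t *data, size_t len,
                    const struct token_item token[TOKEN_LEN])
{
    size_t t = 0;
    for (size_t i = 0; i < len; i++) {
        data[i] ^= token[t].ch;   /* *(base + t * 8) in the listing */
        t++;
        if (t == TOKEN_LEN) {
            t = 0;
        }
    }
}
```

Because the key is XORed with the token in this way, a token whose characters are in the wrong order corrupts the key as well, which matches the symptoms described at the start.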

In the decompiled code of the function, we see a nested loop in which two values taken from structure elements are compared and the elements are swapped if one value is less than the other. More than anything, this looks like a sorting algorithm, specifically one of the implementations of bubble sort. Comparing what we know about bubble sort with the code we obtained, we can conclude that the exit condition of the inner loop is written with an error: the pass does not run to the end of the array of structures.

_var_18h < (undefined8 *)(arg2 + -8)
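To see why that bound matters, here is a minimal bubble sort sketch over the same kind of 8-byte elements. It is an illustration under assumptions, not the recovered source: the field names, the ascending order and the exact loop shape are guesses. Only the pointer to the last element (arg2) and the bound shifted back by 8 bytes come from the decompiled code. Because each element is 8 bytes, arg2 + -8 points one element before the last.

```c
#include <stdint.h>

/* Hypothetical 8-byte element: a character plus the value sorted on. */
struct item {
    char    ch;
    int32_t key;
};

static void swap_items(struct item *a, struct item *b)
{
    struct item tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Buggy variant: the inner bound stops one element early (last - 1),
 * so the final element is never compared with its neighbour. */
void sort_buggy(struct item *first, struct item *last)
{
    for (struct item *end = last; end > first; end--) {
        for (struct item *p = first; p < last - 1; p++) {
            if (p[1].key < p[0].key) {
                swap_items(&p[0], &p[1]);
            }
        }
    }
}

/* Fixed variant: the inner loop runs up to the last element itself. */
void sort_fixed(struct item *first, struct item *last)
{
    for (struct item *end = last; end > first; end--) {
        for (struct item *p = first; p < last; p++) {
            if (p[1].key < p[0].key) {
                swap_items(&p[0], &p[1]);
            }
        }
    }
}
```

In main the call is fcn.00001189(0x5020, 0x5090), and 0x5090 is 0x5020 + 14 * 8, the address of the fifteenth element, so the second argument plays the role of last here. With the buggy bound, whatever sits in the last slot stays there, no matter what its key is.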

So this is what we need to fix. Let’s switch to the disassembler tab:

In the disassembly, we find only one command that subtracts 8:

0x00001211 sub rax, 8

Let’s switch to graph mode to correlate the assembler code with the decompilation result:

Graph representation

Having analyzed the logic of the transitions and correlated it with the assembler code, we confirm that this is exactly the part of the function we are interested in.

Now we need to remove the sub rax, 8 command. The easiest way to do this is to overwrite it with the nop (No Operation) command, that is, to replace the subtraction command with “do nothing”.

To do this, while on the disassembler tab, put the cursor on this command and switch to the Hexdump tab:

Using the relative address of the command, 0x00001211, make sure the cursor is where it needs to be. Select 4 bytes starting at address 0x00001211 and open the “Parsing” tab on the right. It shows the result of disassembling the selected bytes.

Now we need to replace the selected bytes with 4 bytes with the value 90 (the hexadecimal machine code of the nop command), but here we run into the fact that Iaito does not let us simply edit the hexadecimal value at an address. We can see the list of available actions by right-clicking the selected bytes.

Yes, you can use a third-party hex editor, but that would be “unsportsmanlike”. Since we are trying to perform all actions only within the Radare2 functionality, we will use what we have.

First, let’s select “Write zeros”. Iaito will remind us that the file is open in read-only mode and will offer to reopen it either in write mode or enable caching mode. In cached mode, all changes to the original file will be applied only after selecting the “File → Commit changes” menu item.

A warning

Let’s select caching mode and then try to write the zeros again. This time it works. Then, for each of the four bytes, select “Edit → Increment / Decrement” from the context menu and add the value 144 (the decimal form of the hexadecimal number 90).
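The byte-level effect of these two steps can be sketched in C. The buffer stands in for the file contents; the function name, the offset parameter and the check against the expected bytes are additions made for this illustration. The bytes 48 83 e8 08 are the x86-64 encoding of sub rax, 8, and 0x90 is nop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP_OPCODE 0x90u   /* 144 in decimal */

/* Hypothetical helper. Overwrites n bytes at off with nop, in the same
 * two steps as in the GUI: write zeros, then add 144 to every byte.
 * Returns 0 on success, -1 if the range is out of bounds or the bytes
 * found there are not the ones we expected to replace. */
int nop_out(uint8_t *image, size_t image_len, size_t off,
            const uint8_t *expected, size_t n)
{
    if (off > image_len || n > image_len - off) {
        return -1;
    }
    if (memcmp(image + off, expected, n) != 0) {
        return -1;
    }
    memset(image + off, 0, n);                 /* "Write zeros" */
    for (size_t i = 0; i < n; i++) {
        image[off + i] += NOP_OPCODE;          /* "Increment" by 144 */
    }
    return 0;
}
```

Checking the expected bytes first is a cheap guard against patching the wrong place. In our binary the four bytes start at address 0x00001211; whether that address equals the offset inside the file depends on how the file is laid out, which the GUI takes care of for us.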

Let’s look at the result:

After making the changes, do not forget to click “File → Commit Changes”. Run the dechip program again to see the result of our actions:

Before
After

It is worth noting that some of our actions were based on assumptions, and assumptions are not always confirmed so quickly and successfully. To succeed reliably, you need a deeper knowledge of the Assembler language of the processor architecture whose programs you want to reverse engineer, as well as of the most common algorithms.

Conclusion

In general, Radare2 is a pretty good free counterpart of IDA Pro. The official GUI lets you navigate conveniently between the Radare2 tools and displays information more conveniently than the console version, but it is not yet fully developed and does not provide all the features available through the console. All the features of the console version are described in the official book on Radare2.

As for reverse engineering, it turned out to be not scary at all: even with a basic knowledge of the Assembler language, you can work out how a simple application is structured.
