Executable body kit

Hello, Khabrovites. For future students of the course “Reverse-Engineering. Basic”, Alexander Kolesnikov has prepared a useful article.

We also invite everyone to attend an open webinar on the topic “Exploiting driver vulnerabilities”. The first part of the webinar will walk through examples of a classic buffer overflow vulnerability and an integer overflow vulnerability. Together with an expert, participants will also analyze the specifics of developing exploits in kernel mode.


The article is divided into two parts: a theoretical minimum for understanding the main elements of hinged protection of executable files, and a second part that shows several examples of analyzing files. The material does not claim to be complete. To fully understand the topic and unpack files protected by the tools discussed here, you need patience and a solid background in OS internals.

Disclaimer: All information is provided for educational purposes only.

Hinged protection

The term “hinged protection” refers to a class of packers that are applied on top of a finished executable and whose main purpose is to prevent an engineer or novice researcher from quickly understanding the algorithm and modifying or copying it. The key word here is “quickly”. There are many ways to achieve this goal, up to detecting and blocking any tools for dynamic instrumentation and debugging. Let’s try to describe the range of possibilities:

  • Security procedures that detect the presence of a debugger in the system;

  • Security procedures for detecting the sandbox;

  • Security procedures for detecting dynamic instrumentation;

  • Blocking the debugging capabilities of the protected process;

  • A virtual machine with its own set of commands;

  • Code obfuscation methods such as “fake insertions” (junk code) and “Nanomites”.
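
As a rough illustration of the first item in the list, the sketch below shows one simple way a protected program might check for an attached debugger. On Windows it calls the documented `IsDebuggerPresent` API. The other branch, which reads the `TracerPid` field from `/proc/self/status` on Linux, is an assumption added here only so that the example also builds outside Windows. The function names `tracer_pid_from_status` and `debugger_attached` are hypothetical, and commercial protectors combine many stronger checks than this one.

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: extracts the "TracerPid:" value from text in the
// format of Linux's /proc/<pid>/status. Returns 0 when the field is absent.
long tracer_pid_from_status(const std::string& status_text) {
    const std::string key = "TracerPid:";
    std::istringstream lines(status_text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.compare(0, key.size(), key) == 0) {
            // strtol skips the tab after the key and yields 0 if no digits follow.
            long pid = std::strtol(line.c_str() + key.size(), nullptr, 10);
            assert(pid >= 0 && "a tracer PID is never negative");
            return pid;
        }
    }
    return 0;
}

// Returns true if a debugger appears to be attached to the current process.
bool debugger_attached() {
#ifdef _WIN32
    return IsDebuggerPresent() != 0;  // documented Win32 API
#else
    std::ifstream file("/proc/self/status");
    std::stringstream buffer;
    buffer << file.rdbuf();           // stays empty if the file cannot be opened
    return tracer_pid_from_status(buffer.str()) != 0;
#endif
}
```

A protected application would typically call something like `debugger_attached()` at several points and change its behavior when the result is true, which is why removing such checks is the first step of any analysis.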

The list is impressive. Making all of this work for a protected application is a rather difficult task, but not an impossible one. Such protections exist today and are applied quite successfully. Examples are WinLicense, Themida, and Enigma Protector.

All of these systems are commercial and are used for licensing and protecting applications. These packers are designed for a legitimate purpose, namely to reliably protect the work of software developers, but technologies of this kind can also be used to pack malicious software. Malware authors can use the listed packers as they are or write their own partial implementation. One of the most recent and striking examples is the FinSpy RAT, which was protected by its own implementation of “hinged protection”.

The cornerstone of all the protections listed above is a custom virtual machine, a trick that can be described mathematically as a modified finite automaton. It is this element that always causes headaches for beginner reverse engineers. How do you analyze such mechanisms? What tools should you use? Let’s try to figure it out.

Research methods

What is a code protection virtual machine? It is an algorithm that operates on its own set of instructions, which may differ from the instructions of the processor on which the application runs. There are two types of virtual machines:

  • A virtual machine that converts the commands of the executable file into its own form and saves them to the file. When such an application is launched, the commands are converted back into processor commands and the application runs as usual.

  • A virtual machine with its own set of commands that are used to execute the algorithm of the protected application. The protected application is converted into a sequence of commands that only the virtual machine can parse, and these commands perform all operations of the protected application. In other words, at no point do the virtual machine instructions become ordinary processor instructions again. Such virtual machines are sometimes referred to as “virtual obfuscators”.

A typical virtual machine architecture with its own instruction set has the following distinctive features:

  • A virtual machine initialization algorithm;

  • A huge switch block that implements every command of the virtual machine;

  • A large block of data that is contained in the file and cannot be decompiled. This is usually the bytecode used by the virtual machine. Themida, for example, can use 4 different kinds of bytecode, and when packing an application the user can choose which variant to use.
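
To make these three features concrete, here is a minimal sketch of a toy stack-based virtual machine. Everything in it is an assumption made for illustration: the opcodes, the `VmState` layout, and the names `vm_init`, `vm_run`, and `kBytecode` are invented and do not correspond to VMProtect, Themida, or any other real protector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical opcodes, invented for this sketch only.
enum Opcode : std::uint8_t {
    OP_PUSH = 0x01,  // push the following byte onto the VM stack
    OP_ADD  = 0x02,  // pop two values, push their sum
    OP_XOR  = 0x03,  // pop two values, push their XOR
    OP_HALT = 0xFF   // stop and return the top of the stack
};

struct VmState {
    const std::uint8_t* ip;            // virtual instruction pointer
    const std::uint8_t* end;           // one past the last bytecode byte
    std::vector<std::uint32_t> stack;  // virtual stack
};

// Feature 1: the initialization algorithm.
VmState vm_init(const std::vector<std::uint8_t>& bytecode) {
    return VmState{bytecode.data(), bytecode.data() + bytecode.size(), {}};
}

std::uint32_t pop(VmState& vm) {
    assert(!vm.stack.empty() && "malformed bytecode: stack underflow");
    std::uint32_t value = vm.stack.back();
    vm.stack.pop_back();
    return value;
}

// Feature 2: the switch block that implements every VM command.
std::uint32_t vm_run(VmState& vm) {
    while (vm.ip < vm.end) {
        switch (*vm.ip++) {
        case OP_PUSH:
            assert(vm.ip < vm.end && "malformed bytecode: missing operand");
            vm.stack.push_back(*vm.ip++);
            break;
        case OP_ADD: {
            std::uint32_t b = pop(vm);
            std::uint32_t a = pop(vm);
            vm.stack.push_back(a + b);
            break;
        }
        case OP_XOR: {
            std::uint32_t b = pop(vm);
            std::uint32_t a = pop(vm);
            vm.stack.push_back(a ^ b);
            break;
        }
        case OP_HALT:
            return pop(vm);
        default:
            assert(false && "unknown opcode");
            return 0;
        }
    }
    return 0;  // ran off the end of the bytecode without OP_HALT
}

// Feature 3: the opaque data block. This one computes (2 + 3) ^ 1.
const std::vector<std::uint8_t> kBytecode = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 1, OP_XOR, OP_HALT
};
```

Calling `vm_init(kBytecode)` and passing the resulting state to `vm_run` evaluates the expression without the arithmetic ever appearing as ordinary instructions in the protected function. To a disassembler, `kBytecode` is just data.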

Every action that the protected application performs at the lowest level is carried out by calling commands from the switch block of the virtual machine. The pictures below show how these blocks might look in the decompiled version:

Picture borrowed from here

How do you analyze this? The general analysis algorithm comes down to:

  • Removal of all anti-debugging techniques;

  • Identification of the place where all registers are saved in the structure for starting the virtual machine;

  • Identification of the used set of commands;

  • Identification of the used bytecode;

  • Identification of the type of virtual machine (whether the code is decoded to native in memory and then works as usual or not);

  • Creation of a decoder to work with the identified bytecode.
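
The last step, writing a decoder, can be sketched as a small table-driven disassembler. The opcode table below is purely hypothetical: it stands in for whatever command set was identified in the earlier steps, and the names `OpInfo`, `kOpcodes`, and `decode` are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical result of the "identify the command set" step:
// opcode -> mnemonic and number of operand bytes.
struct OpInfo {
    const char* mnemonic;
    std::size_t operand_bytes;
};

const std::map<std::uint8_t, OpInfo> kOpcodes = {
    {0x01, {"push", 1}},
    {0x02, {"add", 0}},
    {0x03, {"xor", 0}},
    {0xFF, {"halt", 0}},
};

// Turns raw bytecode into a readable listing, one string per instruction.
std::vector<std::string> decode(const std::vector<std::uint8_t>& code) {
    std::vector<std::string> listing;
    std::size_t i = 0;
    while (i < code.size()) {
        const std::uint8_t op = code[i++];
        const auto it = kOpcodes.find(op);
        if (it == kOpcodes.end()) {
            // Unknown byte: emit it as data so that nothing is silently skipped.
            listing.push_back("db " + std::to_string(op));
            continue;
        }
        std::string text = it->second.mnemonic;
        for (std::size_t n = 0; n < it->second.operand_bytes; ++n) {
            if (i < code.size()) {
                text += " " + std::to_string(code[i++]);
            } else {
                text += " <truncated>";
            }
        }
        listing.push_back(text);
    }
    assert(i == code.size());  // every byte was consumed exactly once
    return listing;
}
```

Keeping the command set in a table rather than in the decoding loop makes it easy to extend the decoder as more handlers of the virtual machine are identified.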

Let’s do a little practice.

Getting started

As an example, we will use the ancient version 1.1 of VMProtect. Let’s build a simple application and obfuscate it. Application source code:

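#include <cstdlib>   // std::system
#include <iostream>  // std::cout, std::endl

int main()
{
  std::cout << "Hello World!" << std::endl;
  std::system("pause");  // Windows-only: keeps the console window open
  return 0;
}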

We will build the application in the release configuration and decompile it. As a result, the main function will look like this:

And after conversion through VMProtect:

Since the application contains few commands overall, the graph shows only a few additional blocks rather than a huge switch block.

Let’s go through each of them:

The first two blocks perform the first steps of initialization. In these steps, the general-purpose registers are saved in a common structure for more convenient use, and a pointer to the bytecode that drives the application algorithm is initialized.
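
In C++ terms, this initialization could be modeled roughly as follows. This is only a sketch under stated assumptions: the `VmContext` layout, the slot order, and the function name `vm_enter` are hypothetical, and in the real binary the registers are saved by machine instructions rather than by a function call.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Snapshot of the x86 general-purpose registers and flags at VM entry.
struct NativeRegisters {
    std::uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi, eflags;
};

// Hypothetical slot indices inside the VM context.
enum Slot {
    SLOT_EAX, SLOT_ECX, SLOT_EDX, SLOT_EBX, SLOT_ESP,
    SLOT_EBP, SLOT_ESI, SLOT_EDI, SLOT_EFLAGS, SLOT_COUNT
};

// Hypothetical VM context; the layout used by a real protector differs.
struct VmContext {
    std::array<std::uint32_t, SLOT_COUNT> saved;  // native registers, one slot each
    const std::uint8_t* bytecode_ip;              // next bytecode byte to execute
};

// Saves the native registers and points the VM at its bytecode.
VmContext vm_enter(const NativeRegisters& r, const std::uint8_t* bytecode) {
    assert(bytecode != nullptr);
    VmContext ctx{};
    ctx.saved[SLOT_EAX] = r.eax;
    ctx.saved[SLOT_ECX] = r.ecx;
    ctx.saved[SLOT_EDX] = r.edx;
    ctx.saved[SLOT_EBX] = r.ebx;
    ctx.saved[SLOT_ESP] = r.esp;
    ctx.saved[SLOT_EBP] = r.ebp;
    ctx.saved[SLOT_ESI] = r.esi;
    ctx.saved[SLOT_EDI] = r.edi;
    ctx.saved[SLOT_EFLAGS] = r.eflags;
    ctx.bytecode_ip = bytecode;
    return ctx;
}
```

Once the native state is stored in one place, the handlers of the virtual machine can treat the saved registers as ordinary memory slots. That is why locating this structure is one of the first steps of the analysis.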

The next two blocks exist to obfuscate the algorithm. This is weak obfuscation that merely distracts the eye. The main actions are performed further on, in the last block of this function:

The last block initializes the edx register with the address of the handler function, and from there the algorithm is driven by the virtual machine executing the bytecode.
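
The idea of fetching a handler address and transferring control to it can be sketched with a table of function pointers. This is a simplified model and not VMProtect's actual code: the opcodes, the `Vm` structure, and the names `make_handler_table` and `execute` are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

struct Vm {
    const std::uint8_t* ip;            // virtual instruction pointer
    const std::uint8_t* end;           // one past the last bytecode byte
    std::vector<std::uint32_t> stack;  // virtual stack
    bool running;
};

using Handler = void (*)(Vm&);

void h_push(Vm& vm) {
    assert(vm.ip < vm.end && "malformed bytecode: missing operand");
    vm.stack.push_back(*vm.ip++);
}

void h_add(Vm& vm) {
    assert(vm.stack.size() >= 2 && "malformed bytecode: stack underflow");
    std::uint32_t b = vm.stack.back();
    vm.stack.pop_back();
    vm.stack.back() += b;
}

void h_halt(Vm& vm) { vm.running = false; }

// Hypothetical opcode assignment: 0x01 = push, 0x02 = add, anything else halts.
std::array<Handler, 256> make_handler_table() {
    std::array<Handler, 256> table;
    table.fill(h_halt);
    table[0x01] = h_push;
    table[0x02] = h_add;
    return table;
}

std::uint32_t execute(const std::vector<std::uint8_t>& bytecode) {
    static const std::array<Handler, 256> table = make_handler_table();
    Vm vm{bytecode.data(), bytecode.data() + bytecode.size(), {}, true};
    while (vm.running && vm.ip < vm.end) {
        // Comparable to loading the handler address into a register
        // and then jumping to it.
        Handler handler = table[*vm.ip++];
        handler(vm);
    }
    return vm.stack.empty() ? 0 : vm.stack.back();
}
```

For the bytecode `{0x01, 40, 0x01, 2, 0x02, 0xFF}` this loop pushes 40 and 2, adds them, and stops on the last byte. Each opcode is only an index into the handler table, which is why identifying the handlers gives you the command set.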

By the way, the bytecode looks like this:

In the next article, we will see how to decode commands and debug such an application.


