Using WinDbg for reverse engineering

This article is a tutorial on how to use WinDbg. It covers the “classic” version of the debugger. We will customize its appearance and go through the commands that can be used to examine an application.

Installation

The debugger is installed through the Windows SDK. The Windows 10 version can be found here… To get the debugger, start the SDK installer and select only the options for the debugging tools. An example of the selection is shown in the picture below.

After installing the toolbox, you can find all the files in the “Windows Kits” directory. The debugger is installed for several processor architectures at once. To follow the commands in this tutorial, choose the build that matches the architecture of the file under study. For the study, take this file here and try to find the key. Before examining the file, it is recommended to make a few additional settings:

  • Set the directory and server for debug symbols. The easiest way to do this is through the menu: File -> Symbol File Path. In the dialog that opens, enter this line: “SRV*C:\symbols*http://msdl.microsoft.com/download/symbols”. Then create the directory C:\symbols;

  • Set up a Workspace with a convenient layout of work panels. You can take a ready-made Workspace from here… As a result, if you run notepad.exe in the debugger as a test, it will look like this:

Now you can move on to exploring commands. Let’s open the file in the debugger and get started.

Command set and application analysis

A complete reference for all debugger commands can be opened with the “.hh” command. The help window will appear as shown below. There you can search by description or for a specific command.

A set of commands presented as a plain list is not very useful, so every command described in this article is accompanied by an example of its use. Let’s start simple.

  1. Let’s determine where the application starts. Simply put, the application begins at the EntryPoint from the file header, but this is not entirely true. In programs built with modern languages, EntryPoint marks the start of the runtime code that prepares the whole application. Therefore, to find the code written by the programmer, we have to travel a little further through the disassembly listing.

The first command is lm. It shows the list of modules loaded into the process memory; from its output you can get the base address of the file.

The main commands that will become your eyes and ears when examining data in RAM are the d* family (db, dc, da, dp, dw, dq). Each of them displays a memory dump at the specified address, differing only in how the data is formatted. The address can be given as a number or as a register. For example, you can see what part of the file header looks like in memory:
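To make the layout of such a dump easier to read, here is a minimal Python sketch that formats a byte buffer roughly the way db does: address, sixteen hex bytes, then the printable ASCII characters. It is only an approximation for illustration (for instance, it omits the dash db prints in the middle of each row), and the function name `hexdump` is an assumption of this sketch, not part of WinDbg.

```python
def hexdump(data: bytes, base: int = 0) -> str:
    """Format bytes in a db-like layout: address, hex bytes, ASCII column."""
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as in the debugger output.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # 47 = 16 two-digit bytes plus 15 separating spaces.
        lines.append(f"{base + off:08x}  {hex_part:<47}  {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The first bytes of any PE file are the "MZ" signature.
    print(hexdump(b"MZ\x90\x00\x03\x00\x00\x00", base=0x00400000))
```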

The !dh command parses the file and displays its headers. We need the file header, so let’s add the -f flag. If you need all the information about the file and section headers, run the command without any flags.

As you can see in the figure, the data produced by the command can be used to locate data within the file.

We find the entry point by adding the entry point offset from the header to the base address. This can be done with a few commands: ? evaluates an expression, uf disassembles a function, and bp sets a breakpoint. And now, in order, using an example:

Calculation of the address.
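The same calculation can be sketched in Python: read the AddressOfEntryPoint field from the PE header and add it to the base address reported by lm. This is a minimal sketch under stated assumptions: the function names, the synthetic demo image, the base address 0x00400000 and the RVA 0x14C0 are all hypothetical and are not taken from the file under study.

```python
import struct


def entry_point_rva(image: bytes) -> int:
    """Return AddressOfEntryPoint (an RVA) from a PE image's headers."""
    if image[:2] != b"MZ":
        raise ValueError("MZ signature not found")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + IMAGE_FILE_HEADER (20 bytes) precede the optional
    # header; AddressOfEntryPoint sits 16 bytes into the optional header.
    (rva,) = struct.unpack_from("<I", image, e_lfanew + 4 + 20 + 16)
    return rva


def entry_point_va(image_base: int, image: bytes) -> int:
    """Base address plus entry point RVA, like '? base + rva' in the debugger."""
    return image_base + entry_point_rva(image)


def make_demo_image(entry_rva: int) -> bytes:
    """Build a tiny synthetic header (hypothetical, for this demo only)."""
    img = bytearray(0x200)
    img[0:2] = b"MZ"
    struct.pack_into("<I", img, 0x3C, 0x80)
    img[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<I", img, 0x80 + 4 + 20 + 16, entry_rva)
    return bytes(img)


if __name__ == "__main__":
    demo = make_demo_image(0x14C0)
    print(hex(entry_point_va(0x00400000, demo)))  # prints 0x4014c0
```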

Disassembling the function from the given address up to the ret instruction.

Setting a breakpoint. By the way, to manage breakpoints you can change the last letter of the command: for example, bl lists them, bc clears them, bd disables them and be enables them. The figure shows an example of setting a breakpoint and viewing the list of existing breakpoints with the bl command.

To reach the breakpoint, enter the g command, which resumes execution of the application.

As soon as the OS loader finishes its preparatory work, we will see the following output in the command window:

Now you can start looking for the main function. This is quite simple: you need to locate the string constants that the program shows when it prints the greeting and asks for data:

To search for data, we will use the s command. This command searches a given memory range for the specified pattern. Accordingly, to find the location of the prompt for entering the key, you need to specify the entire address range of the application under study. You also need to indicate the type of data to search for (for example, -a for an ASCII string).
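What the search does can be illustrated with a short Python sketch that scans a byte buffer for an ASCII string and reports every match as an absolute address. The function name `search_ascii`, the base address and the prompt text "Enter key:" are assumptions made for this example; the real strings in the file under study may differ.

```python
def search_ascii(memory: bytes, base: int, pattern: str) -> list[int]:
    """Return the absolute addresses of all occurrences of an ASCII pattern."""
    needle = pattern.encode("ascii")
    hits = []
    pos = memory.find(needle)
    while pos != -1:
        hits.append(base + pos)
        # Continue one byte further so overlapping matches are not missed.
        pos = memory.find(needle, pos + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical memory image: the prompt appears at offsets 0x10 and 0x20.
    image = bytes(0x10) + b"Enter key:" + bytes(6) + b"Enter key:"
    for address in search_ascii(image, 0x00400000, "Enter key:"):
        print(f"{address:08x}")
```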

Now that we know the address of the data being used, we can set a breakpoint that monitors access to it. This can be done with the ba command. For such a breakpoint, you need to specify the type of access, the size of the data the debugger will watch, and the address. The address must be aligned to the chosen size (4-byte aligned for a 4-byte breakpoint). After setting the breakpoint, resume the application with the g command. The figure below shows a variant of the command:
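The constraints on such a breakpoint can be expressed as a small Python helper that checks the access type, size and alignment before composing the command text. This is only a sketch: the helper name `ba_command` and the address used in the demo are hypothetical, and the debugger itself remains the authority on what it accepts.

```python
def ba_command(access: str, size: int, address: int) -> str:
    """Compose a 'ba' command string after checking its basic constraints."""
    if access not in ("e", "r", "w", "i"):
        raise ValueError("access must be one of e, r, w, i")
    if size not in (1, 2, 4, 8):
        raise ValueError("size must be 1, 2, 4 or 8 bytes")
    if access == "e" and size != 1:
        raise ValueError("execute breakpoints use size 1")
    if address % size != 0:
        raise ValueError("address must be aligned to the breakpoint size")
    return f"ba {access}{size} {address:08x}"


if __name__ == "__main__":
    # Hypothetical address of the prompt string, 4-byte aligned.
    print(ba_command("r", 4, 0x004040A4))  # prints ba r4 004040a4
```

If the string you want to watch starts at an unaligned address, one option is to watch a smaller size, or the aligned address just before it.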

When the breakpoint hits, you need to find the part of the application that prints the greeting to the screen. The call stack is convenient for this; you can display it with the k command.

The figure shows that the string is copied by a library function, and that the call was made from “For_Crackme + 0x15f2”.

2. Let’s locate the key validation function. It should be close to the prompt that asks the user for input. In the previous step, we found the offset inside the function just before this operation. Let’s use u, a variant of uf, to see several instructions after the address "For_Crackme+0x15f2":

The code fragment has no debug symbols, so let’s simply look at the data it references:

  • offset For_Crackme+0x40a2

  • offset For_Crackme+0x40bb

To do this, use the db command:

It looks like the function is preparing data to show to the user, so the key check must be somewhere nearby. Note the two constants that are written to memory by the following instructions:

...
00401612 c744241c30372f31 mov     dword ptr [esp+1Ch],312F3730h
0040161a c7442420302f3937 mov     dword ptr [esp+20h],37392F30h
...

If we decode the constants, we get the following value: “07/10/97”. The .formats 312F3730h command can help with the decoding. From the list of formats it prints, we are interested in Chars, the character representation. Keep in mind that data in memory is stored in little-endian order, so the bytes of each constant must be read in reverse to obtain the data needed to pass the validation.
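The decoding can also be checked outside the debugger. The short Python sketch below packs the two constants from the listing as little-endian 32-bit values, which reproduces the byte order they have in memory, and reads the result as ASCII. The helper name `decode_dwords` is an assumption of this example.

```python
import struct


def decode_dwords(*values: int) -> str:
    """Lay out 32-bit constants as they appear in memory and read them as ASCII."""
    # "<" selects little-endian, so the lowest byte of each value comes first.
    raw = struct.pack(f"<{len(values)}I", *values)
    return raw.decode("ascii")


if __name__ == "__main__":
    # Constants taken from the two mov instructions in the listing above.
    print(decode_dwords(0x312F3730, 0x37392F30))  # prints 07/10/97
```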

In this way, you can analyze applications using WinDbg alone, without resorting to additional tools.


