how to compress images 20 times faster than STBI and without loss


The QOI image compression format introduced a month ago already has implementations in various languages, plugins for GIMP, Xn View MP and Paint.NET, and dll for displaying thumbnails in Windows Explorer… Can download image and immediately look at it here… Read more about qoi from the author of the format under the cut.


Introducing QOI – Quite OK Image Format… It losslessly compresses RGB and RGBA images down to PNG size while providing acceleration 20-50 times when compressed and accelerated by 3-4 times when unpacking. Everything is single-threaded, no SIMD. And simple to the point of stupidity.

In the beginning, I want to say that I have no idea what I am doing. I am not a fan of compression and barely understand how Huffman and DCT coding works. Fortunately, QOI does not use either one or the other. I was just working through some ideas that I thought could help compress images. The result surprised me a lot.

For what? Short tirade

The file formats are terrible. I absolutely do not like PNG, JPEG or, even worse, MPEG, MOV, MP4. They are torn at the seams from complexity. Every little thing in them screams: “Design by the consortium!”

Some time ago I pampered a little with MPEG. The basic ideas of MPEG video compression are ingenious, and even more ingenious for 1993, but the resulting file format is an abomination.

I imagine a moving image expert group meeting where some random lawsuit demanded that the format have a way to indicate that the video stream is copyrighted. And the copyright bit flag went into the standard and successfully stopped film piracy before it even started.

The industry standard MPEG was conceived thirty years ago, all patents have expired long ago, and professional interest has been lost. But a sacred spec called ISO / IEC 11172-2Is a well-kept secret, only disclosed to those who shell out $ 200 to sponsor the ISO.

There are alternative, open source video codecs, but again they are terribly complex. They compete with the latest technologies, require huge libraries, computationally intensive and difficult to work with. The PNG alternatives compete with each other in terms of compression ratio, and their complexity is constantly increasing.

There is certainly a market for video, audio, and image codecs that traded compression for speed and simplicity, but no one serves it. These guys could, but everything is patented there.

Yes, stb_image spared us all the pain of libpng and is therefore used in countless games and applications. A while ago I tried to do the same for a video using pl_mpeg and has had some success.

But with all the things I’ve learned, why not take a step back and implement simple a compression scheme capable of competing with PNG, but without the jarring? Or why not implement simple video compression scheme similar to MPEG, but in normal format?

And I was messing with all this, wanted to take MPEG-1 elements and make the parsing easier, speed it up on the GPU, but came across a solution in terms of a lossless image format, sometimes able to compete with PNG. The compression ratio is slightly worse, but the format is simpler.

Technical details

The QOI file format specification has been updated. Please read it at qoiformat.org… QOI encodes and decodes images in one pass, and it goes through each pixel only once. The pixel is encoded in different ways.

The resulting values ​​are packed into blocks that start with a 2-4-bit tag (indicating one of these four methods) followed by a series of data bits. All of these blocks (tag and data bits) are byte-aligned, so there is no need to wiggle out bits between blocks.

There are four compression methods:

1. Run by pixels

If the current pixel is exactly the same as the previous one, the path length is increased by 1. When a pixel that is different from the previous one is detected, the path length is stored in the encoded data, and the current pixel is packed using one of three other methods. The path length fragment has two options, the option depends on the number of required bits:

QOI_RUN_8 {
    u8 tag  :  3;   // b010
    u8 run  :  5;   // 5-bit run-length repeating the previous pixel: 1..32
}
QOI_RUN_16 {
u8 tag  :  3;   // b011
u16 run : 13;   // 13-bit run-length repeating the previous pixel: 33..8224
}

2. Index in the previously seen pixel

The encoder stores a 64 pixel array of mileage that it encountered earlier. Upon detecting that the current pixel is present in this array, the program saves the index in this array to the stream. To make encoding take O (n) time, only one lookup is performed on this array.

The position of the search is determined by the “hash” of the rgba value (actually just r ^ g ^ b ^ a). Linear search or bookkeeping more difficult will not improve compression too much, but will slow things down a bit.

QOI_INDEX {
u8 tag  :  2;   // b00
u8 idx  :  6;   // 6-bit index into the color index array: 0..63
}

3. The difference with the previous pixel

When the current color of a pixel is not too far from the previous one, the difference from the previous pixel is stored in the stream. There are three options here, depending on how big the difference is. The focus is on the RGB value; alpha changes are more expensive:

QOI_DIFF_8 {
u8 tag  :  2;   // b10
u8 dr   :  2;   // 2-bit   red channel difference: -1..2
u8 dg   :  2;   // 2-bit green channel difference: -1..2
u8 db   :  2;   // 2-bit  blue channel difference: -1..2
}
QOI_DIFF_16 {
u8 tag  :  3;   // b110
u8 dr   :  5;   // 5-bit   red channel difference: -15..16
u8 dg   :  4;   // 4-bit green channel difference:  -7.. 8
u8 db   :  4;   // 4-bit  blue channel difference:  -7.. 8
}
QOI_DIFF_24 {
u8 tag  :  4;   // b1110
u8 dr   :  5;   // 5-bit   red channel difference: -15..16
u8 dg   :  5;   // 5-bit green channel difference: -15..16
u8 db   :  5;   // 5-bit  blue channel difference: -15..16
u8 da   :  5;   // 5-bit alpha channel difference: -15..16
}

4. Full rgba values

If all three previous methods fail, the r, g, b, a values ​​(but only those that differ from the previous pixel) are stored in the stream as full bytes:

QOI_COLOR {
u8 tag  :  4;   // b1111
u8 has_r:  1;   //   red byte follows
u8 has_g:  1;   // green byte follows
u8 has_b:  1;   //  blue byte follows
u8 has_a:  1;   // alpha byte follows
u8 r;           //   red value if has_r == 1: 0..255
u8 g;           // green value if has_g == 1: 0..255
u8 b;           //  blue value if has_b == 1: 0..255
u8 a;           // alpha value if has_a == 1: 0..255
}

That’s all. If you have a minute, please read the source qoi.h

What’s next?

Seriously, I’m stunned. BMP and TIFF have mileage encoding, GIF works with LZW. But what is in between? Nothing. Why? I found that the field to work between RLE and LZW is large enough to spend more than a day on it. The QOI was a lot of fun. Seeing how each change that was made affected the compression ratio was breathtaking.

Better designed, QOI can serve as the basis for a lossless video codec for screencasts and the like. Speed ​​up SIMD for QOI would also be cool, but (based on my very limited knowledge of some of the SIMD instructions on ARM) the format doesn’t seem to be a good fit for this. Maybe someone with more experience can shed more light on this?

I am also willing to explore the rather vast field of simple lossy image compression format. Many texture compression schemes implement very interesting ideas, but there is nothing that can compete with JPEG and be simpler.

This concludes the author’s post. And you can learn how to solve problems using algorithms in our courses:

find out details of the action

Other professions and courses

Data Science and Machine Learning

Python, web development

Mobile development

Java and C #

From the basics to the depth

As well as

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *