How do smartphones process photos? A breakdown

Take a modern smartphone, open the camera, press the shutter — and you get a good shot right away, without adjusting a single setting or even wondering whether the camera will cope.

The raw image, before any processing, looks rather gray and dull. Yet what we end up with is a bright, saturated frame.

So how did we reach the point where modern smartphones take such great pictures, as if by magic? Spoiler: it is not magic…

Today we will talk about the six stages a digital photo goes through before it turns into a masterpiece in your smartphone's memory.

1. Photons of light

So, the first stage: light passes through the lens and hits the sensor. And that is where the magic begins… In our case, the magic takes place on the flagship Samsung Galaxy S20 Fan Edition.

We know that a sensor is made up of millions of pixels — in this case, 12 megapixels. The phone has a triple camera: wide-angle, ultra-wide-angle and 3x zoom.

But first, let's talk about the main module, which takes most of the pictures. Each pixel consists of many parts, but the key one is the photodiode. Its job is to capture photons of light and convert them into electricity. How does it do that?

A photodiode is made of silicon, a material with interesting properties: under an applied voltage it responds to electromagnetic radiation in the range from 400 to 1100 nm, which neatly covers the spectrum visible to humans. We see wavelengths from 380 to 740 nm; silicon responds from 400 to 1100 nm. So silicon "sees" almost the same range we do, plus a bit of infrared.

But what does "sees" mean? When a photon of light hits the photodiode and penetrates the silicon, it knocks out an electron, which falls into a so-called potential well — a trap for electrons.

Then, by counting the electrons in this trap, we can tell how much light hit the pixel, and thus its brightness: few photons means a black pixel, many photons means a white one.
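The electron-counting step above can be sketched in a few lines. This is a toy model, not real ADC firmware: the full-well capacity and bit depth here are illustrative assumptions.

```python
# Sketch: mapping a pixel's electron count to a digital brightness value.
# full_well (max electrons the well can hold) and the 10-bit depth are
# invented for illustration.

def electrons_to_brightness(electrons: int, full_well: int = 4000, bits: int = 10) -> int:
    """Quantize an electron count into a digital brightness value."""
    electrons = max(0, min(electrons, full_well))        # clip to well capacity
    return round(electrons / full_well * (2**bits - 1))  # 0 = black, max = white

print(electrons_to_brightness(0))     # 0 -> black pixel
print(electrons_to_brightness(4000))  # 1023 -> white pixel (10-bit maximum)
```

Anything beyond the well's capacity is clipped — which is exactly why blown highlights cannot be recovered later.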

And here is an important point: at this stage, the sensor should not lose a single photon. The more photons are converted into electrons, the more efficiently the sensor works.

And honestly, even three years ago sensors were not particularly efficient: roughly 10 photons produced only 4-6 electrons. This ratio is called quantum efficiency, and it used to sit at 40-60%.
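Quantum efficiency is just the ratio of electrons generated to photons received, which the numbers from the text show directly:

```python
# Quantum efficiency: electrons generated per incident photon.
def quantum_efficiency(photons: int, electrons: int) -> float:
    return electrons / photons

# The older sensors described above: ~10 photons yielded 4-6 electrons.
print(quantum_efficiency(10, 4))  # 0.4 -> 40%
print(quantum_efficiency(10, 6))  # 0.6 -> 60%
```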

Why? Mostly because many photons simply never reached the photodiode. Even though each pixel has long had a microlens above it that focuses light toward its center, plenty of photons were still reflected and lost — or, worse, landed in a neighboring pixel, which lowered efficiency and caused cross-talk.

Samsung addressed this with ISOCELL Plus — essentially an isolated-cell technology. Thin walls were built up on all sides of each pixel, fully isolating it from its neighbors.

Samsung also enlarged the pixels — not in width (the pixels actually moved closer together), but in depth. That increased each pixel's volume and therefore the capacity of its potential well, which in turn improved dynamic range.

According to Samsung, all this pushed the fraction of "working" photons — the quantum efficiency — up to 120%, meaning one photon of light can excite more than one electron. Hence the remarkable sensitivity of ISOCELL Plus sensors.

For example, the Galaxy S20 Fan Edition uses what is, in my opinion, Samsung's best-balanced sensor: the Samsung S5K2LD.

Why best-balanced? First, the resolution is 12 MP — and more is simply unnecessary. Second, it is ISOCELL Plus. And most importantly, the pixel size is 1.8 microns, which is a lot…

2. RAW

Okay, now stage two. After collecting electrons in the potential well, we need to count and digitize them — that is, gather all the raw data. From there, there are two paths. One option: all the raw data is saved into a RAW file. Professionals and enthusiast photographers love such files. (The other option is the full processing pipeline, which we will follow through the remaining stages.)

The point is that a RAW file can be manipulated however you please: tweak the white balance, pull detail out of the shadows — and, if you're lucky, out of the highlights — play with noise reduction, and so on.

This format used to be available only on large digital cameras. Now, on the same Samsung smartphones, you can shoot RAW in the manual camera mode, set the exposure, shutter speed and whatever else you like, and edit the result in a mobile editor.
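Why is RAW so flexible? Because edits like white balance are just per-channel gains applied to linear sensor data, so they can be redone losslessly after the shot. A minimal sketch (the gain values are illustrative, not from any real camera profile):

```python
import numpy as np

# Sketch: white balance on linear RAW data is a per-channel multiplication.
# The gains below are invented for illustration.

def apply_white_balance(raw_rgb: np.ndarray, gains=(2.0, 1.0, 1.5)) -> np.ndarray:
    """Multiply each color channel of linear RAW data by its gain."""
    return raw_rgb * np.asarray(gains)

raw = np.full((2, 2, 3), 100.0)        # toy linear RAW values
print(apply_white_balance(raw)[0, 0])  # [200. 100. 150.]
```

Since nothing is clipped or quantized here, a different set of gains can be applied later without losing information — which is exactly what RAW editors exploit.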

3. Debayering

The third stage is debayering. What is that?

Remember that silicon responds to a fairly wide spectrum of light but cannot distinguish colors. So, to get a color image, a grid of color filters is placed over the sensor. The most common layout is the Bayer RGB filter, where for every blue and red cell there are two green ones — a quirk that matches our own perception, since the human eye is most sensitive to green.

As a result, filtering light this way gives us a mosaic of red, green and blue pixels, with a lot of gaps in each color channel.

So, to restore a full color image, we need to fill in the missing data in each color channel — for example, by averaging the values of neighboring pixels that do contain data. This process is called debayering (also known as demosaicing).
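The neighbor-averaging idea can be sketched with a naive bilinear demosaicer. Real pipelines use far more sophisticated edge-aware algorithms; this only shows the principle, assuming an RGGB Bayer layout:

```python
import numpy as np

# Naive demosaicing sketch for an RGGB Bayer mosaic: fill each color channel's
# gaps by averaging the known samples in the pixel's 3x3 neighborhood.

def demosaic(raw: np.ndarray) -> np.ndarray:
    h, w = raw.shape
    mask = np.zeros((h, w, 3))
    mask[0::2, 0::2, 0] = 1                            # red samples
    mask[0::2, 1::2, 1] = 1; mask[1::2, 0::2, 1] = 1   # two greens per 2x2 block
    mask[1::2, 1::2, 2] = 1                            # blue samples
    sparse = raw[..., None] * mask                     # mosaic split into 3 sparse planes
    pad_v = np.pad(sparse, ((1, 1), (1, 1), (0, 0)))
    pad_m = np.pad(mask, ((1, 1), (1, 1), (0, 0)))
    num = sum(pad_v[i:i+h, j:j+w] for i in range(3) for j in range(3))
    den = sum(pad_m[i:i+h, j:j+w] for i in range(3) for j in range(3))
    return num / np.maximum(den, 1)                    # average of available neighbors

# A uniform gray scene should survive demosaicing unchanged.
flat = np.full((4, 4), 100.0)
print(np.allclose(demosaic(flat), 100.0))  # True
```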

This stage used to be fairly straightforward. But with the arrival of ISOCELL and similar technologies, manufacturers learned to make very small pixels — under one micron — and began placing four pixels (such sensors are called TetraCell) or even nine (Nonacell, as in the S20 Ultra) under one color filter.

This allows pixels to be combined in low light into one superpixel made up of four or nine pixels. In daylight, conversely, you can shoot at full resolution using a reverse debayering (re-mosaicing) algorithm.
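The low-light superpixel trick is just binning: summing a block of same-color pixels into one. A TetraCell-style 2x2 sketch:

```python
import numpy as np

# Sketch of TetraCell-style 2x2 pixel binning: four neighboring same-color
# pixels are summed into one "superpixel", trading resolution for sensitivity.

def bin_2x2(channel: np.ndarray) -> np.ndarray:
    h, w = channel.shape
    return channel.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))

full_res = np.ones((32, 32))   # toy 32x32 single-color plane, 1 unit of signal each
binned = bin_2x2(full_res)
print(binned.shape)            # (16, 16) -- a quarter of the resolution
print(binned[0, 0])            # 4.0 -- each superpixel collects 4x the signal
```

A Nonacell sensor does the same with 3x3 blocks, collecting nine times the signal per superpixel.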

In this smartphone, all the main modules have a regular Bayer structure, but the front camera is TetraCell. That is why you can choose your selfie resolution: 8 or 32 MP.


4. HDR

And so, we've stitched a color photo together. You might think the processing ends here — no, it is only beginning. Next, to widen the image's dynamic range and reduce noise, HDR algorithms come into play. Traditionally, there are two ways to get an HDR image: Image Stacking or Image Bracketing. What are they?

Image Stacking means taking several identical shots in a row and then merging them pixel by pixel, averaging each pixel's value. Why merge identical pictures, you ask? Simple: this greatly reduces noise in the photo and also makes the picture more saturated, because averaging refines the color information too. After that, you can programmatically lift the shadows, recover the highlights a little — and the HDR image is ready.
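A quick numerical sketch of why stacking works: averaging N frames of the same scene suppresses random noise by roughly a factor of sqrt(N). The scene, noise level and frame count below are invented for illustration:

```python
import numpy as np

# Sketch of image stacking: average N noisy frames of the same scene.
rng = np.random.default_rng(0)
scene = np.full((64, 64), 128.0)                       # "true" scene brightness
frames = [scene + rng.normal(0, 10, scene.shape) for _ in range(16)]

stacked = np.mean(frames, axis=0)                      # pixel-by-pixel average

print(np.std(frames[0] - scene))   # noise of a single frame (around 10)
print(np.std(stacked - scene))     # noise after stacking 16 frames (around 2.5)
```

With 16 frames the noise drops roughly fourfold (sqrt(16) = 4), which is exactly the headroom that lets the pipeline lift shadows afterwards without amplifying grain.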

But there is another option — Image Bracketing, or, in photography jargon, an exposure fork. Here at least three shots are taken: one normal, one overexposed so that detail in the shadows is visible, and one underexposed so that the highlights are not blown out. Then it is all stitched together like Frankenstein's monster.
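A toy stand-in for that stitching: weight each pixel of each exposure by how well-exposed it is (far from pure black or pure white), then blend. This is a simplified illustration, not any vendor's actual fusion algorithm:

```python
import numpy as np

# Sketch of exposure fusion: blend bracketed frames, favoring well-exposed pixels.
def fuse(frames):
    frames = np.stack([f.astype(float) for f in frames])   # (n, h, w)
    # Weight peaks at mid-gray (128/255) and falls to 0 at pure black/white.
    weight = 1.0 - np.abs(frames / 255.0 - 0.5) * 2.0
    weight = np.maximum(weight, 1e-6)                      # avoid division by zero
    return (frames * weight).sum(axis=0) / weight.sum(axis=0)

under  = np.full((2, 2), 5.0)     # underexposed frame keeps highlight detail
normal = np.full((2, 2), 128.0)   # normally exposed frame
over   = np.full((2, 2), 250.0)   # overexposed frame keeps shadow detail

merged = fuse([under, normal, over])
print(merged[0, 0])  # close to 128: the well-exposed frame dominates
```

In a real scene the weights differ per pixel, so shadows are taken mostly from the bright frame and highlights from the dark one — hence the Frankenstein comparison.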

The result is wide dynamic range and saturated colors, though artifacts such as ghosting can occur. Samsung appears to use a combined algorithm, since the benefits of both approaches are visible. And HDR works across all the cameras, including the ultra-wide-angle and front cameras.

5. Segmentation and NPU

And so, we get an almost perfect shot. A couple of years ago, HDR stitching would have been the last step. But when smartphones started shipping with neural processors, everything changed — and a fifth stage appeared: neural processing.

Even before you press the shutter button, everything you see on the screen is already being fed to the neural processor, which recognizes objects and scenes. Its purpose is to work adaptively and help the camera choose the ideal settings.

The machine learning capabilities of the neural processor (NPU) inside Exynos automatically detect objects in the scene, letting the image signal processor (ISP) generate and apply shooting parameters tailored to the specific subject, thereby improving frame quality. Depending on which scene or object is recognized in advance, the shooting parameters are adjusted: if a dog is running fast through the frame, the camera shortens the shutter speed so the subject is not blurred; if you are photographing a person, the NPU in the Exynos processor automatically corrects the white balance for a natural skin tone, and the exposure adjusts to the face. And after the shot is taken, the NPU segments the image to achieve optimal contrast and texture for different objects. All of this happens in a split second thanks to the tight integration of the CPU, ISP and NPU.
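Conceptually, the scene-to-settings step is a lookup from a detected label to tuned parameters. The labels and values below are entirely hypothetical — Samsung's actual tuning tables are not public — but they illustrate the flow described above:

```python
# Hypothetical sketch of NPU-driven scene tuning: a detected scene label maps
# to shooting parameters the ISP then applies. All names and values invented.

SCENE_SETTINGS = {
    "running_dog": {"shutter": 1 / 1000, "note": "short shutter to freeze motion"},
    "portrait":    {"shutter": 1 / 120, "white_balance": "skin_tone",
                    "note": "expose and balance for the face"},
    "landscape":   {"shutter": 1 / 250, "note": "default daylight settings"},
}

def settings_for(detected_scene: str) -> dict:
    """Return tuned parameters for a recognized scene, with a safe fallback."""
    return SCENE_SETTINGS.get(detected_scene, SCENE_SETTINGS["landscape"])

print(settings_for("running_dog")["shutter"])  # 0.001 -- a very short shutter speed
```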

For example, with the Exynos processor, smartphones can blur the background not only in photos but also in video, in real time. And depending on which lens the image was taken with, additional processing steps may appear — such as distortion correction for the ultra-wide-angle lens.

And at the thirty-fold zoom this smartphone is capable of, upscaling algorithms kick in.
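The crudest form of upscaling is nearest-neighbor enlargement — each pixel is simply repeated. Modern phones use ML-based super-resolution instead, but this sketch shows what "upscaling" means at its most basic:

```python
import numpy as np

# Toy upscaling sketch: nearest-neighbor enlargement of a small zoomed crop.
def upscale_nearest(img: np.ndarray, factor: int) -> np.ndarray:
    """Enlarge an image by repeating each pixel factor x factor times."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

crop = np.arange(4.0).reshape(2, 2)   # a tiny 2x2 crop from a zoomed frame
big = upscale_nearest(crop, 3)
print(big.shape)    # (6, 6)
print(big[0, :3])   # [0. 0. 0.] -- the top-left pixel repeated
```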

6. Merging and saving the final JPEG

And after this whole complex processing pipeline, the final JPEG is saved. All of it happens instantly!

But that is not even the striking part. What's striking is that smartphones have become so powerful that they can instantly stitch together not just one super HDR image, but do it from all the cameras at the same time.

For example, Samsung's flagships have a multi-frame capture feature: press the button once, and all the cameras simultaneously take several pictures and even record several videos, after which a neural network trims, frames and even stabilizes the footage.

By the way, these smartphones also have a video stabilization mode, Super Steady — a genuinely wild feature.
