What are deepfakes of faces and how to detect them

What are deepfakes and where did they come from?

Both the term and its implementation first appeared on Reddit in 2017, when one of its users posted several fake videos featuring celebrities.

The word “deepfake” is a blend of two terms: deep learning, a machine-learning approach based on multi-layer neural networks, and fake. Initially, the term referred only to the replacement of a person’s face; gradually, the concept expanded to other categories, including audio, video, text, and so on.

Deepfakes are different from other fakes. If a fraudster tries to pay for a purchase by holding up an image of his victim in front of the payment terminal camera, this is not a deepfake. Methods of protection against digital and physical counterfeits are also different.

What is the difference? Liveness is a technology that helps a system distinguish a living person from an imitation. It detects a physical (not digital!) forgery: a mask, even one made of silicone or latex, a photo, a voice recording, or a replayed video.

The algorithm works in conjunction with a facial recognition system, but the main difference is that it answers not the question “Is this the right person?”, but the question “Is this a living person?”

Liveness is a very interesting and broad topic. We’ll talk about it another time.

Now let’s dive into the world of deepfakes and look at their most common types:

  • face transfer;

  • facial expression transfer;

  • attribute customization: adding glasses, darkening or lightening the skin, changing the hairstyle, and so on;

  • face generation from scratch: creating people who do not exist in reality;

  • lip synchronization during a conversation.

Deepfakes of faces are used both in completely legal areas and in criminal activities. A legal way to use deepfakes is, for example, to develop masks for entertainment applications. There are other options for completely ethical use:

  • advertising integration;

  • digital avatars;

  • film dubbing;

  • virtual fitting room (makeup, hair styling, etc.);

  • anonymization via an avatar during a video conference;

  • digital actors – this technology is becoming increasingly popular in Hollywood.

Now – about the illegal use of deepfakes:

  • compromising content;

  • manipulation of public opinion;

  • realistic bots for social networks that are used to deceive users;

  • bypassing the authentication system.

One of the most popular types of deepfakes is Face Swap. Fakes of this type are the hardest to detect, especially when there are many of them and a fake needs to be found quickly.

Generation algorithms are divided into two types:

  1. Auto-Encoder. The quality is very good, but training the neural network requires a database of videos (a rough sketch of the shared-encoder idea appears right after this list).

  2. ID-Injection. Here the features of the original face are preserved: the identity of another person is injected into them, and the result is something in between the two people.
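
To make the first approach concrete, here is a minimal, hypothetical sketch of the shared-encoder idea in PyTorch: one encoder learns a common face representation, each identity gets its own decoder, and a swap means decoding person A’s frames with person B’s decoder. The layer sizes are illustrative, not taken from any real system.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared encoder: maps an aligned 64x64 face crop to a latent feature map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Per-identity decoder: reconstructs a face from the shared latent."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
    def forward(self, z):
        return self.net(z)

encoder = Encoder()
decoder_a, decoder_b = Decoder(), Decoder()  # one decoder per person

# Training (omitted): each decoder learns to reconstruct its own person,
#   e.g. loss_a = MSE(decoder_a(encoder(faces_a)), faces_a), same for B.
# Swapping: encode a frame of person A, decode it with B's decoder.
frame_a = torch.rand(1, 3, 64, 64)     # stand-in for an aligned face crop
swapped = decoder_b(encoder(frame_a))  # B's appearance with A's pose
```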

In reality, there are even more types of generation. I described these two to make a point: having protected ourselves against some algorithms, we have no guarantee of protection against others.

Counterfeit protection

There are a huge number of deepfakes, both primitive and hard to detect. A modern person, caught up in a dense stream of information, often does not even have time to wonder whether what they are seeing is a deepfake.

There are other problems when detecting fakes. In my experience, I divide them into four main categories:

  • there are a lot of difficulties due to data compression, including JPEG and H.264. Deepfake detectors focus on low-level artifacts, and compression removes some of them (a small illustration follows this list);

  • new types of attacks keep appearing, and a deepfake detector transfers poorly to new generation algorithms: if it was not trained on the algorithm used to create a face, it may make errors;

  • test data is not fully established. Different detection methods have no “gold standard” set of samples for verification, which makes it difficult to compare the results of different detection algorithms;

  • only models built from many neural networks with a large number of parameters can boast good accuracy. They make almost no mistakes, but they run slowly even on a video card, which makes it very difficult to create real-time deepfake detectors.
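
To illustrate the first problem on this list, here is a small self-contained experiment: re-save a frame as JPEG at decreasing quality and watch the high-frequency residual shrink. The file name and quality levels are arbitrary placeholders; the point is that the artifacts low-level detectors rely on live in exactly the frequencies compression discards.

```python
import io

import numpy as np
from PIL import Image, ImageFilter

frame = Image.open("frame.png").convert("RGB")  # hypothetical video frame

def high_freq_energy(img):
    """Energy of the residual left after subtracting a blurred (low-pass) copy."""
    arr = np.asarray(img, dtype=np.float32)
    low = np.asarray(img.filter(ImageFilter.GaussianBlur(2)), dtype=np.float32)
    return float(np.mean((arr - low) ** 2))

def recompress(img, quality):
    """Round-trip the image through in-memory JPEG at the given quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

for q in (95, 70, 40):
    print(q, high_freq_energy(recompress(frame, q)))
# The residual energy drops as quality falls -- and the generation artifacts
# living in those high frequencies drop with it.
```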

Of course, different options are possible. Let’s take, for example, a relatively simple fake detection algorithm: there is a video and a face track. Train a binary classifier on the cropped faces, and you get a fairly effective tool.
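
As a hedged sketch of that baseline (not any particular production system): crop the faces from the track into folders and fine-tune an off-the-shelf CNN as a real/fake classifier. The `faces/real` and `faces/fake` layout is an assumption made for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
# Hypothetical layout: faces/real/*.jpg and faces/fake/*.jpg
data = datasets.ImageFolder("faces", transform=tf)
loader = DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: real, fake

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:  # one epoch is enough for the sketch
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```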

A simple and working solution. But it’s better to take a classifier that is “tailored” for working with video and deepfakes.

The idea, for example, is to take a neural network trained to understand what a person is saying from the movement of their lips. Using this model as a basis, you can establish whether a video in which the speaking face is clearly visible is real.

What does this pre-training provide? The network picks up on inconsistencies and adapts to new attacks better than one trained from scratch. This shows up when the effectiveness of neural networks is compared by the AUC metric (the results of different detection algorithms measured on four test datasets).

If you freeze the main part of the network and do not train it, updating only its last part, it will identify new fakes better, since it will not overfit to the artifacts of old attack types.
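
A minimal sketch of this freezing trick, with a generic torchvision backbone standing in for the pretrained lip-reading network:

```python
import torch
import torch.nn as nn
from torchvision import models

# Stand-in for the pretrained backbone (in the article's case, a lip-reading model).
model = models.resnet18(weights="IMAGENET1K_V1")

for param in model.parameters():  # freeze everything that was pretrained...
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 2)  # ...then replace the head;
# a freshly created layer has requires_grad=True, so only it will be updated

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)  # optimizer sees only the head
```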

Another good way to detect deepfakes is to look for artifacts across different frames of the same video. This method works well because videos with fake faces are mostly generated frame by frame: if artifacts appear and disappear from frame to frame, the neural network will notice and flag it.
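
A toy sketch of that idea: score every frame with any single-frame detector and flag videos whose scores flicker between neighbouring frames. The threshold and the sample scores below are made up for illustration.

```python
import numpy as np

def temporal_flicker(frame_scores, threshold=0.05):
    """Flag a video if per-frame fake scores jump around between neighbours."""
    scores = np.asarray(frame_scores, dtype=np.float32)
    jitter = float(np.mean(np.abs(np.diff(scores))))  # mean frame-to-frame change
    return jitter > threshold, jitter

# Real footage tends to give smooth scores; frame-by-frame generation lets
# artifacts appear and vanish, producing spiky score sequences.
is_suspicious, jitter = temporal_flicker([0.10, 0.80, 0.20, 0.90, 0.15])
print(is_suspicious, jitter)
```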

Now about the situation where someone generates a face and pastes it onto a head from the video, blurring the edges to hide the changes. In this case, deepfakes are caught simply by searching for the boundaries of the inserted image.

In the previous methods we compared fakes with real data, but here we have to synthesize “fake” data ourselves and train the network on it. This way, the network learns to predict which parts of the picture have been altered.
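
Here is a hedged sketch of how such “fake” training data could be synthesized: paste one face onto another through a soft mask and keep the mask as the ground-truth map of the altered region. The file names and blend geometry are placeholders.

```python
import cv2
import numpy as np

def make_blended_sample(target, donor, center, radius):
    """Blend `donor` into `target` inside a soft circular region."""
    mask = np.zeros(target.shape[:2], dtype=np.float32)
    cv2.circle(mask, center, radius, 1.0, thickness=-1)
    mask = cv2.GaussianBlur(mask, (31, 31), 0)  # soft seam, like a real paste
    mask3 = mask[..., None]
    blended = (donor * mask3 + target * (1 - mask3)).astype(np.uint8)
    return blended, mask  # image + ground truth for the changed region

target = cv2.imread("face_a.jpg")  # hypothetical aligned face crops,
donor = cv2.imread("face_b.jpg")   # both the same size
sample, boundary_label = make_blended_sample(target, donor,
                                             center=(128, 128), radius=80)
# A segmentation network trained on such pairs learns to predict the pasted
# region, and can generalize to real swaps that blur their seams the same way.
```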

How to Protect Your Images

Yes, photographs can indeed be protected so that fraudsters cannot make a fake based on them.

Adversarial attacks already play a role here. We add noise to the face image, and when a deepfake is created from it, elements with distorted colors stand out. For example, after replacing a face, the fraudster will get a blue-tinted image in which it is hard to make out the person. You can’t fool anyone with that anymore!

The protection method relies on neural networks that add noise to the images: we take the original image as input, “noise” it, and pass it through a special algorithm. You can even apply different noise variations for different types of attacks; they are added to different parts of the image so as not to interfere with one another.

For a person, the picture looks completely normal, but a machine will not be able to use it.
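
As a rough illustration of the idea (an FGSM-style step against a stand-in model, not the actual protection algorithm): take the gradient of a face model’s response with respect to the pixels and nudge every pixel slightly in that direction.

```python
import torch
from torchvision import models

surrogate = models.resnet18(weights="IMAGENET1K_V1").eval()  # stand-in face model

def protect(image, epsilon=4 / 255):
    """Return a copy of `image` with near-invisible adversarial noise added."""
    x = image.clone().requires_grad_(True)
    loss = surrogate(x).norm()  # push the model's response as far as possible
    loss.backward()
    noisy = x + epsilon * x.grad.sign()  # FGSM step: noise along the gradient
    return noisy.clamp(0, 1).detach()

face = torch.rand(1, 3, 224, 224)  # stand-in for a face photo scaled to [0, 1]
protected = protect(face)
# To a person the two images look the same; a generator fed the protected one
# produces visibly corrupted output.
```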

Conclusion

Both the methods for creating deepfakes and the methods for protecting against them are being improved: increasingly complex technologies keep appearing on both sides. None of the methods is 100% effective, so you need to understand that there will be even more deepfakes.

I will be glad to answer your questions in the comments. And write if you are interested in any of the detection methods listed here – perhaps we will cover it in more detail in future articles.
