Google has developed an algorithm that automatically crops video around the important objects in the frame

The Google Research team has introduced a new tool: an algorithm for automatic video cropping. The machine edits the source material; the user only needs to set basic parameters, such as the target aspect ratio of the frame.

The software then does the rest on its own, tracking the important objects in the video and cropping the frames so that everything essential remains in the final version.

The team has published the results of their work on GitHub as an open-source framework called AutoFlip. The code and instructions for running the program are available in the same repository.

The developers started the project because most video is shot in horizontal format: it has always been that way, since the vast majority of video equipment is designed with the frame wider than it is tall. But now that smartphones are ubiquitous, horizontal video is not always convenient, and content often has to be adapted for several platforms at once.

If a horizontal video is simply cropped, many important details that fall outside the final frame are lost. To avoid this, the developers taught the algorithm to track important objects and crop the frame around them. As a result, everything that matters stays in the frame and nothing is lost.

Google is not the first to tackle this problem. Not long ago, Adobe worked on a similar idea and created a product that also performs well, but it comes with various limitations and is not accessible to every user. Google solved the problem in its own way, making the results available to everyone.

To get started, you set the initial cropping parameters mentioned above: the target aspect ratio and the objects that must remain in the frame. Once these settings are made, the algorithm begins analyzing the source file, first splitting it into scenes. One of the main signals for detecting a scene change is the color histogram: if it changes sharply between frames, the scene has changed.
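The histogram idea can be illustrated with a minimal sketch. This is not AutoFlip's code: the function names, the bucket count, and the L1-distance threshold are all illustrative assumptions; the real implementation works on decoded video frames inside a MediaPipe graph.

```python
def color_histogram(frame, bins=8):
    """Quantize each RGB pixel into one of bins**3 color buckets and
    return a normalized histogram. `frame` is a list of (r, g, b) tuples."""
    hist = [0] * (bins ** 3)
    for (r, g, b) in frame:
        idx = (r * bins // 256) * bins * bins + (g * bins // 256) * bins + (b * bins // 256)
        hist[idx] += 1
    total = float(len(frame)) or 1.0
    return [h / total for h in hist]

def is_scene_cut(prev_hist, cur_hist, threshold=0.5):
    """Flag a shot boundary when consecutive histograms diverge
    (L1 distance over the normalized buckets)."""
    distance = sum(abs(p - c) for p, c in zip(prev_hist, cur_hist))
    return distance > threshold
```

A mostly-red frame followed by a mostly-blue one lands in different buckets, so the distance spikes and a cut is reported; two identical frames give a distance of zero.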

According to the developers, each scene is processed separately. This is done because in different scenes the objects in the frame are located differently, sometimes some objects disappear, others appear. So that the final version of the video does not lose anything important, the algorithm marks the objects in the frame, cropping it so that they remain in sight.

There are several cropping modes, ranging from a static crop window that pans from one side of the frame to the other, to dynamic cropping that follows the movements of objects in the frame. If the task specifies that all objects in the frame must be preserved, the algorithm can instead widen the crop zone, adding padding on the sides of the frame to fill the resulting empty space.

As the developers explain, the algorithm is available on GitHub and is implemented as a MediaPipe pipeline. MediaPipe itself can run in a web browser, so computer vision algorithms can, if desired, be run in a browser on a computer or smartphone. The developers say they do not plan to stop there and will keep improving the framework; both individual developers and entire companies can join the project.
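For readers who want to try it, the MediaPipe repository's AutoFlip instructions at the time of the announcement looked roughly like the following build-and-run sequence. Treat the exact targets, flags, and paths as a sketch and check the repository's current documentation, since build instructions change; the input/output paths here are placeholders.

```shell
git clone https://github.com/google/mediapipe.git
cd mediapipe

# Build the AutoFlip desktop example (CPU-only).
bazel build -c opt --define MEDIAPIPE_DISABLE_GPU=1 \
  mediapipe/examples/desktop/autoflip:run_autoflip

# Re-crop a video to a 1:1 aspect ratio.
GLOG_logtostderr=1 bazel-bin/mediapipe/examples/desktop/autoflip/run_autoflip \
  --calculator_graph_config_file=mediapipe/examples/desktop/autoflip/autoflip_graph.pbtxt \
  --input_side_packets=input_video_path=/path/to/input.mp4,output_video_path=/path/to/output.mp4,aspect_ratio=1:1
```

The target aspect ratio is passed as an input side packet, which matches the article's point that the user only supplies the basic parameters and the graph does the rest.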

Typical applications of the algorithm include tracking the face of a speaker in the frame or, for example, a character in a cartoon.

In the future, the developers plan to add handling of the frame's border areas, as well as detection of overlaid text and images. Both capabilities already exist as standalone algorithms, so integrating them into the cropping solution should not be a problem.
