How I made movies using neural networks

In the last article, I talked about how I used neural networks to create jewelry. Today I’ll tell you how I made a film for the planetarium using neural networks.

As a result, the film received an award at a dome film festival in Tokyo.

Horizontal movie poster

One of my goals was to see whether neural networks are suitable for practical work. Now that the project is finished, there is no doubt about it, but back then everything was just beginning, and it was unclear how to do many things. Moreover, when I started there were practically no animation methods: you could generate pictures on demand, and that was it.

An additional complication was that the film was for a dome, which meant a fisheye projection was needed. When asked to draw in a fisheye projection, Midjourney (and SD) produced something resembling the desired projection, but invariably added a fish to it, or even a literal fish eye!

The inevitable fish whenever “fisheye” appears in the prompt

Or just a literal fish eye. Hard to argue it isn’t a fisheye

You could get rid of the fish with negative prompts, but the quality of the panorama still came out unconvincing.

So the panoramas had to be assembled the old-fashioned way: by stitching together rectangular pictures and transforming the result using polar coordinates. The rectangular pictures themselves I generated with Midjourney. I needed the longest possible images, so I set an extreme aspect ratio of 16:1. To my surprise, the neural network did not object, although, of course, it did not actually deliver that proportion. Still, it was the most elongated format possible (this was back in v3; v4 still can’t do it).
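
To make the polar-coordinate step concrete, here is a minimal sketch (not my exact pipeline) of wrapping a wide stitched strip into a square fisheye “dome master” with NumPy and OpenCV. It assumes the strip covers 360° horizontally, with the zenith along the top edge and the horizon along the bottom; the file names and the 4096 px output size are illustrative assumptions.

```python
# Minimal sketch: wrap a wide stitched panorama strip into a circular
# fisheye "dome master" frame using polar coordinates.
import cv2
import numpy as np

def strip_to_fisheye(strip: np.ndarray, size: int = 4096) -> np.ndarray:
    h, w = strip.shape[:2]
    # Pixel grid of the square output, centred at the dome zenith.
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float32)
    cx = cy = (size - 1) / 2.0
    dx, dy = xs - cx, ys - cy

    # Polar coordinates: the angle picks the column of the strip (360 degrees
    # around the dome), the radius picks the row (zenith at r=0, horizon at r=1).
    angle = (np.arctan2(dy, dx) + np.pi) / (2 * np.pi)     # 0..1 around
    radius = np.sqrt(dx * dx + dy * dy) / (size / 2.0)     # 0..1 outward

    map_x = angle * (w - 1)
    map_y = radius * (h - 1)
    out = cv2.remap(strip, map_x, map_y, cv2.INTER_LINEAR,
                    borderMode=cv2.BORDER_CONSTANT)
    # Black out everything outside the dome circle.
    out[radius > 1.0] = 0
    return out

if __name__ == "__main__":
    pano = cv2.imread("stitched_panorama_strip.png")   # hypothetical input
    cv2.imwrite("dome_master.png", strip_to_fisheye(pano))
```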

Panorama source. The overall composition required three or four of these

By the way, when stitching I also used the DALL-E 2 neural network; it turned out to be convenient for stitching panoramas.

Now that the work is finished, convenient tools have appeared for getting an equirectangular panorama and, from it, a fisheye projection. For example, the service https://skybox.blockadelabs.com/, which generates 360° panoramas on demand, or LatentLabs360, a LoRA for creating such panoramas in SD. How much easier life has become after just a couple of months 🙂

Just creating a panorama is not enough, even in 2D animation. For a scene to feel alive, there must be objects in it. When the camera moves, objects shift relative to the background and to each other, and that is how we perceive space. In ordinary animation it is enough to put different planes of the scene on different layers and move them at different speeds. In a spherical cinema, alas, that won’t work, since we have a genuinely three-dimensional space with objects placed all around us. So that is exactly what had to be done: planes with the drawn objects were placed around the camera. These planes are turned toward the camera, and as it moves they keep turning to follow it, so we never see such a plane edge-on. Luckily, all of this can be done in AAE (Adobe After Effects), with no need for 3D software.
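
If it helps to picture the billboard setup, here is a tiny NumPy sketch of the underlying math: a rotation that keeps a card’s normal pointed at the camera. In the film this was After Effects’ auto-orientation doing the work; the function and the positions below are purely illustrative.

```python
# Sketch of the "billboard" idea: each flat card is placed somewhere around
# the camera and rotated so its normal always points back at the camera,
# so we never see it edge-on.
import numpy as np

def billboard_rotation(card_pos: np.ndarray, cam_pos: np.ndarray,
                       world_up=np.array([0.0, 1.0, 0.0])) -> np.ndarray:
    """Return a 3x3 rotation matrix whose +Z axis points from the card
    to the camera (a classic look-at basis)."""
    forward = cam_pos - card_pos
    forward = forward / np.linalg.norm(forward)
    right = np.cross(world_up, forward)
    right = right / np.linalg.norm(right)
    up = np.cross(forward, right)
    return np.column_stack([right, up, forward])

# Example: a tree card 10 units in front of the origin, camera slightly moved.
R = billboard_rotation(np.array([0.0, 0.0, 10.0]), np.array([2.0, 0.0, 0.0]))
print(R @ np.array([0.0, 0.0, 1.0]))  # the card normal, now aimed at the camera
```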

There is another way to create space: adding parallax with a depth map. This is how volume was added to the interior of the hut. Stable Diffusion can work with the MiDaS neural network, which produces a depth map.

Interior element depth map
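
As a rough illustration of depth-map parallax (not my exact setup), the sketch below shifts every pixel sideways in proportion to its MiDaS-style depth value, so near pixels slide more than far ones as a virtual camera pans. The file names and the shift strength are assumptions.

```python
# Depth-map parallax sketch: displace pixels horizontally by an amount
# proportional to depth, rendering one frame per virtual camera offset.
import cv2
import numpy as np

def parallax_frame(image: np.ndarray, depth: np.ndarray,
                   camera_offset: float) -> np.ndarray:
    """depth is a single-channel map where brighter = closer (MiDaS-style)."""
    h, w = image.shape[:2]
    d = depth.astype(np.float32) / 255.0
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    # Near pixels (d close to 1) get the largest horizontal displacement.
    map_x = xs - camera_offset * d
    return cv2.remap(image, map_x, ys, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)

if __name__ == "__main__":
    img = cv2.imread("hut_interior.png")                        # hypothetical
    dep = cv2.imread("hut_interior_depth.png", cv2.IMREAD_GRAYSCALE)
    for i, offset in enumerate(np.linspace(-20, 20, 48)):       # 48-frame pan
        cv2.imwrite(f"parallax_{i:03d}.png", parallax_frame(img, dep, offset))
```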

And, of course, the whole space must live: trees sway, candle flames tremble in a draft, and so on – all those little things that make the space feel real.

Overall, the animation in this film was of three types:

  1. Classic animation, where layers with objects are animated by standard means – effects or puppet deformation. This is how the candles, trees and flying birds were animated.

  2. Character animation, made with EbSynth. I’ll say more about it a little later.

  3. Dream animation. This is the kind of animation neural networks do best, which makes a lullaby a very good choice for a neural-network film adaptation. Images flow into one another as the generation parameters and keywords change. To do this, scripts such as Deforum or Animation are hooked up to Stable Diffusion (a sketch of such a keyframed schedule follows this list). The only drawback is that you can only create dream-like or psychedelic animation this way.
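
For a sense of what drives this kind of animation, here is a keyframed prompt-and-motion schedule in the spirit of Deforum’s settings. The field names and values are my own illustration, not an exact copy of Deforum’s config format.

```python
# Illustrative only: prompts and camera motion are keyed to frame numbers and
# the diffusion script interpolates between them, so one image melts into the
# next. Field names are assumptions modelled on Deforum-style schedules.
dream_sequence = {
    "animation_prompts": {
        0:   "a wooden cradle in a moonlit hut, oil painting",
        120: "the cradle dissolves into a starry night sky",
        240: "a child flying over a sleeping village, dreamlike",
    },
    "zoom":          {0: 1.02},            # slow continuous zoom-in
    "translation_x": {0: 0.0, 120: 2.0},   # drift sideways after frame 120
    "strength":      {0: 0.6},             # how much of the previous frame is kept
}
```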

I’ll tell you a little more about character animation. The difficulty was that at the time neural networks could not yet animate a pose. Things are much better now: there is ControlNet, where the pose can be set with a mannequin, and if you animate the mannequin you can then generate a moving character from that animation (it’s still far from perfect – the character’s clothes may change, or it will “flicker” – but it’s much better than nothing). Back then there was nothing of the kind, which is why I used EbSynth.

You can often find the claim that EbSynth is a neural network. In fact it is not a neural network at all. The program simply takes a video, looks at how some pixels move relative to others, and transfers that movement to another picture – the same kind of puppet deformation as in Photoshop.
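
As a toy illustration of that “transfer the pixel motion” idea (EbSynth itself uses patch-based synthesis and is far more robust than plain flow warping), the sketch below estimates dense optical flow on a driving video with OpenCV and drags a single stylised keyframe along it. The file names are assumptions.

```python
# Toy motion transfer: estimate where each pixel of the new frame came from
# in the previous frame (dense optical flow), then sample the stylised
# keyframe there. This only sketches the principle behind EbSynth.
import cv2
import numpy as np

def warp_by_flow(styled: np.ndarray, prev_gray: np.ndarray,
                 next_gray: np.ndarray) -> np.ndarray:
    # Flow from the new frame back to the previous one.
    flow = cv2.calcOpticalFlowFarneback(next_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = next_gray.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    map_x = xs + flow[..., 0]
    map_y = ys + flow[..., 1]
    return cv2.remap(styled, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)

if __name__ == "__main__":
    cap = cv2.VideoCapture("mannequin_animation.mp4")    # driving video
    styled = cv2.imread("painted_keyframe.png")          # stylised keyframe
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    frame_idx = 0
    while True:
        ok, nxt = cap.read()
        if not ok:
            break
        next_gray = cv2.cvtColor(nxt, cv2.COLOR_BGR2GRAY)
        styled = warp_by_flow(styled, prev_gray, next_gray)
        cv2.imwrite(f"styled_{frame_idx:03d}.png", styled)
        prev_gray, frame_idx = next_gray, frame_idx + 1
```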

First of all, I created a character, generated with Midjourney. Getting her to stand in a T-pose was a story in itself (remember, this was in the ancient times of three months ago, when such tasks simply could not be solved).

Main character, created in Midjourney

After that I made a 3D model with the same silhouette and animated it in UE5. That gave me a sequence of frames from which EbSynth could capture the movement.

Mannequin animation

Unfortunately, due to how it works, EbSynth cannot correctly transfer the deformation when some parts of the object start to overlap others. So I had to split the image into several layers and animate them separately.

The mask was also animated

Masks of individual parts
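
To show the layering workaround more concretely, here is a small sketch of cutting a character into parts with masks and compositing the parts back in depth order, so overlapping limbs don’t break the motion transfer. The file names and the layer order are illustrative assumptions.

```python
# Split a character into masked layers, (animate each separately), then
# alpha-composite them back-to-front into one frame.
import cv2
import numpy as np

def extract_layer(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Return an RGBA layer: the character part where the mask is white."""
    rgba = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)
    rgba[:, :, 3] = mask
    return rgba

def composite(base: np.ndarray, layer: np.ndarray) -> np.ndarray:
    """Alpha-composite one RGBA layer over a BGR base image."""
    alpha = layer[:, :, 3:4].astype(np.float32) / 255.0
    return (layer[:, :, :3] * alpha + base * (1 - alpha)).astype(np.uint8)

if __name__ == "__main__":
    character = cv2.imread("character.png")
    # Back-to-front order: torso first, then the arms over it.
    part_masks = ["mask_torso.png", "mask_arm_left.png", "mask_arm_right.png"]
    frame = np.zeros_like(character)
    for mask_path in part_masks:
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
        layer = extract_layer(character, mask)
        # ...in practice each layer is animated on its own before compositing.
        frame = composite(frame, layer)
    cv2.imwrite("composited_frame.png", frame)
```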

The conclusions I came away with after this work:

  1. Neural networks can be used in creative work.

  2. Although everyone is afraid that neural networks will replace artists, there is more human work in the final result than neural-network output.

  3. EbSynth can only be used for very simple animations. If you use it to animate the movement of characters, you will be sad and turn gray prematurely.

  4. The ways to solve particular problems with neural networks change every day. If something isn’t working out, go to sleep; maybe tomorrow someone else will solve your problem and hand you the solution on a silver platter.

All in all, it was an interesting experience: I learned a great deal about what should be done and what should not. And of course, I’m glad the audience likes the result. Just today I received a letter saying the film had been shown at a school planetarium in Brazil and the children were delighted 🙂

You can watch the video here. Just keep in mind that it is meant to be viewed on a dome, where it looks like this:

Dome projection
