Non-obvious life hacks for 3D human reconstruction

3D human reconstruction before and after using our life hacks

People are scanned for a variety of purposes: creating a digital double for movie special effects, building a virtual blogger or social media assistant, and more. Most often it is convenient, or even necessary, to scan a real person so that the 3D model's appearance is well defined and photorealistic. In this article we discuss non-obvious life hacks for photogrammetry-based 3D human reconstruction.

Obvious life hacks

Let's start with the basics. Photogrammetry, i.e. the reconstruction of 3D scene models from many photographs, has been in use for decades for various purposes, mainly for scanning terrain and large architectural objects. At Twin3D we use photogrammetry to scan people, although we have scanned other subjects too, for example wolfdogs or a DJ console. So what are the obvious life hacks for 3D-scanning people?

  1. The more photos, the better. The most obvious life hack. Photogrammetry works by casting rays from each virtual camera through image pixels; where rays corresponding to the same point intersect, a 3D point is recovered. For this you need points that are repeated across different photographs: the more photographs of the object, the more such matches, and hence the better the final 3D model.

  2. There should be at least 30% overlap between adjacent photos (preferably 50%). This follows directly from the previous explanation.

  3. If the subject can move (and a person moves a lot, even when trying to stand still), all photos must be taken simultaneously. This is the hardest requirement to implement. It turns out that simply moving one camera around a person is not the best option (although some do try). Instead, you need many cameras that can shoot at the same moment. Incidentally, we have built exactly such a camera rig (pictured below).

  4. Use paid software. There is plenty of free and open-source software for 3D reconstruction, such as COLMAP or Meshroom, but at least for people, their results do not compare with commercial tools. Among the paid options we recommend (this is not an advertisement) Agisoft Metashape or RealityCapture. The first has the advantage of a convenient Python API; the second is well tuned for scanning people. The picture below shows the dramatic difference between free and paid software. By the way, Agisoft has its own list of tips for human reconstruction; take a look.

Our full-height scanner
Test build in MeshRoom, COLMAP, Metashape and RealityCapture, respectively
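The ray-intersection principle behind hack #1 can be made concrete with a toy example. This is a minimal numpy sketch, not what Metashape or RealityCapture do internally (their pipelines are far more elaborate): given two known camera projection matrices and the pixel coordinates of the same point in both images, linear (DLT) triangulation recovers the 3D point.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen by two cameras.

    P1, P2 : (3, 4) camera projection matrices
    x1, x2 : (2,) pixel coordinates of the same point in each image
    Returns the 3D point in world coordinates.
    """
    # Each observation contributes two linear constraints on the
    # homogeneous point X (from x cross (P @ X) = 0)
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution of A @ X = 0 is the right singular vector with the
    # smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def proj(P, X):
    """Project a 3D point into pixel coordinates."""
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

# Two toy cameras: identity pose and a camera shifted 1 unit along x
K = np.eye(3)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

point = np.array([0.2, 0.3, 5.0])       # ground-truth 3D point
x1, x2 = proj(P1, point), proj(P2, point)
recovered = triangulate(P1, P2, x1, x2)  # matches `point` up to float error
```

With noise-free synthetic pixels the recovered point is exact; on real photos, each matched feature pair yields one such 3D point, and the reconstruction software solves millions of these at once.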

Non-obvious life hacks

While the points above are clear to anyone who has tried building 3D models and has some photography experience, the points below do not lie on the surface and require some background in computer science and deep learning. What unites them is that none of them require manual intervention from a 3D modeler or photographer.

Incidentally, we previously wrote an article about automatic reconstruction of a human face down to the level of pores; it uses a completely different setup and different reconstruction approaches.

1. Background matting

Below are raw photos from our cameras. As you can see, they show crops of the human body against a rather cluttered background. Not surprisingly, 3D reconstruction software struggles to find matching points across such photos: the complex background introduces noise and artifacts.



Raw photos from a scanner

Ideally we would remove the background entirely. But how? There are many papers on image segmentation, but they are mostly tuned to a specific training dataset where people usually appear in full (not as a fragment of a leg). Besides, the background removal should be as precise as possible (ideally millimeter-accurate) and fast to compute.

Fortunately, the paper “Real-Time High-Resolution Background Matting” (Shanchuan Lin et al., CVPR 2021) has it all: highly accurate background removal regardless of the subject, fast computation, and, most importantly, publicly available code under an MIT license.

Here is what we get after running the photos above through it:

Photos after background matting

Let's zoom in and remove the object's texture:

The result of background matting – the background is removed with hair-level accuracy

The results are amazing! The background is removed right down to individual hairs. Now we certainly expect the 3D reconstruction to improve. What actually happens?

In the picture below we built two models: one without background removal (left), the other with it. The difference is obvious (literally).

3D reconstruction before and after background removal

2. Image enhancement

The eye of an experienced photographer will notice that our raw photographs clearly need improvement: they lack sharpness, the lighting is too bright, there is lens distortion, and so on. The hypothesis is that fixing these issues should lead to better 3D models.

We decided to try software from the world-famous photography company DxO, namely DxO PureRaw (again, not an advertisement). It can do all of the above, and it also removes noise from photos with a neural network based on DxO's in-house DeepPRIME technology. Here is the result of applying it:

DxO PureRaw application

The main question is: how does this affect the final model?

Models before and after photo enhancement

We see that improving the original photographs significantly improves the final model.
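DxO PureRaw is proprietary, so we cannot show its pipeline, but the general idea behind sharpening can be illustrated with classic unsharp masking: subtract a blurred copy from the image and add the residual back, steepening edges. A toy numpy sketch (illustrative only, not DxO's algorithm):

```python
import numpy as np

def box_blur(image, k=5):
    """Separable k-tap box blur (zero-padded at the borders)."""
    kernel = np.ones(k) / k
    rows = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def unsharp_mask(image, amount=1.0):
    """Sharpen a single-channel float image in [0, 1].

    The high-frequency residual (image minus its blur) is scaled by
    `amount` and added back, then the result is clipped to [0, 1].
    """
    blurred = box_blur(image)
    return np.clip(image + amount * (image - blurred), 0.0, 1.0)

# A soft-contrast step edge: 0.2 on the left half, 0.8 on the right
img = np.full((16, 16), 0.2)
img[:, 8:] = 0.8
out = unsharp_mask(img)
# Pixels just left of the edge get darker, just right get brighter,
# so the edge looks crisper
```

Real raw processors combine this kind of sharpening with demosaicing, lens-distortion correction, and learned denoising, but the edge-boosting intuition is the same.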

3. Mesh denoising

The previous tricks dealt with enhancing the input images for 3D reconstruction. The question now is whether we can also improve the outputs, i.e. the mesh and the texture. The output mesh is usually quite noisy, especially in regions seen by few cameras (for example, the lower part of an arm).

Raw hand model

In fact, we have already written a detailed article about different approaches to mesh denoising, from ordinary filters to graph neural networks, so let's go straight to the results on humans.

Applying noise cancellation to a girl model

If you look closely at this girl model, you can easily see that after denoising the final render becomes much more pleasant, while the detail is preserved. We can also look at the lower arm:

Refined hand model
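The simplest of the "ordinary filters" mentioned above is Laplacian smoothing: repeatedly pull each vertex toward the average of its neighbors. A toy sketch on a three-vertex path (on real meshes the neighbor lists come from the triangle faces; note that plain Laplacian smoothing also shrinks the mesh, which is one reason fancier methods such as Taubin smoothing or learned denoisers exist):

```python
import numpy as np

def laplacian_smooth(vertices, neighbors, lam=0.5, iterations=10):
    """Iterative Laplacian smoothing.

    vertices  : (N, 3) array of vertex positions
    neighbors : list of N lists with each vertex's neighbor indices
    lam       : step size in (0, 1]; larger means stronger smoothing
    """
    v = vertices.astype(float).copy()
    for _ in range(iterations):
        # Average position of each vertex's neighbors
        avg = np.array([v[n].mean(axis=0) if n else v[i]
                        for i, n in enumerate(neighbors)])
        # Move each vertex a fraction lam toward that average
        v += lam * (avg - v)
    return v

# A noisy vertex between two clean ones on the x-axis
verts = np.array([[0.0, 0.0, 0.0],
                  [0.5, 0.3, 0.0],   # y = 0.3 is the "noise"
                  [1.0, 0.0, 0.0]])
nbrs = [[1], [0, 2], [1]]

smoothed = laplacian_smooth(verts, nbrs, lam=0.5, iterations=20)
# The middle vertex's y-noise is damped; the endpoints drift inward,
# which is the shrinkage effect mentioned above
```

On a scanned human this is applied to hundreds of thousands of vertices, and the step size and iteration count trade smoothness against detail loss.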

Conclusion

The field of 3D human reconstruction is broad and interesting, and a huge range of methods can be applied to reconstruction from photographs. We have covered only a few of them here, yet even these already yield significant improvements in the results.

In future articles we will describe how to restore textures, compute realistic skin normals, create virtual cameras between real ones, and many other interesting things.
