How AI Systems Aim to Simplify Sound Engineering
Photo: Free To Use Sounds / Unsplash
The difficult task of the Foley artist
Sounds for films and TV shows, such as the sound of rain, are very difficult to record properly on set while a scene is being shot. There is a lot of extraneous noise, and the sound can clash with the actors' voices or with other equipment. For this reason, almost all sounds are recorded separately and mixed in during editing. This is the job of Foley artists…
If a movie needs the sound of a window breaking, the sound designers go into the studio and start breaking glass under controlled acoustic conditions. They keep recording until the sound matches what is happening on screen. In particularly difficult cases this can take dozens of attempts, which makes filmmaking more complicated and more expensive.
Engineers from the University of Texas have proposed an alternative. They developed an AI system that detects what is happening in the frame and automatically suggests a matching sound.
How it works
The engineers described how the system works in their paper for the IEEE (PDF). They designed two machine learning models. The first extracts image features, such as color, from the footage. The second analyzes how an object moves across frames and determines the nature of that movement in order to select the appropriate sound.
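To make the division of labor between the two models more concrete, here is a toy sketch in Python. It is not the authors' code: their models are trained neural networks, while this example substitutes two hand-written stand-ins, an average-color feature for the first model and simple frame differencing for the second. The names `mean_color`, `motion_energy`, and `describe_clip` are hypothetical, and frames are assumed to be small grids of RGB tuples.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]  # rows of RGB pixels (assumed representation)


def mean_color(frame: Frame) -> Tuple[float, ...]:
    """Average RGB value of a frame: a crude stand-in for learned image features."""
    pixels = [p for row in frame for p in row]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def motion_energy(prev: Frame, curr: Frame) -> float:
    """Mean absolute per-channel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(prev, curr):
        for pa, pb in zip(row_a, row_b):
            total += sum(abs(a - b) for a, b in zip(pa, pb))
            count += 3
    return total / count


def describe_clip(frames: List[Frame]) -> Dict[str, list]:
    """Collect per-frame color features and frame-to-frame motion for a clip."""
    colors = [mean_color(f) for f in frames]
    motion = [motion_energy(a, b) for a, b in zip(frames, frames[1:])]
    return {"colors": colors, "motion": motion}


if __name__ == "__main__":
    black = [[(0, 0, 0)] * 2 for _ in range(2)]
    white = [[(255, 255, 255)] * 2 for _ in range(2)]
    print(describe_clip([black, white, white]))
```

In a real system, the motion values would feed a classifier that decides what kind of action is on screen, and that decision would drive the choice of sound. Here they are only raw numbers.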
To produce the audio itself, the engineers developed a program called AutoFoley. It generates a new sound based on thousands of short audio samples, such as rain, a ticking clock, or a galloping horse. The result is quite convincing: