How artificial intelligence was taught to solve differential equations

Today, ahead of the start of a new cohort of the course “Mathematics and Machine Learning for Data Science”, we are sharing a translation of an MIT Technology Review article on how Caltech researchers taught AI to solve partial differential equations, why that matters, and how it could change the world. All the details are below the cut.

Unless you are a physicist or an engineer, you have no particular reason to know about partial differential equations. Even after years of studying mechanical engineering in graduate school, I have not used them in real life since.

But such equations (PDEs for short) have a magic of their own. They are a category of mathematical equations that are very good at describing change over space and time, which makes them very convenient for describing physical phenomena in our universe. They can be used to model everything from planetary orbits to plate tectonics to the air turbulence that disturbs a flight, which in turn lets us do useful things such as predicting seismic activity and designing safe aircraft.

The catch is that PDEs are notoriously difficult to solve. What “solve” means here is perhaps best illustrated with an example. Say you are trying to simulate air turbulence to test a new airplane design. There is a well-known PDE called the Navier-Stokes equation, which describes the motion of any fluid. Solving the Navier-Stokes equation lets you take a “snapshot” of the air’s motion (the wind conditions) at any moment and model how it will continue to move, or how it was moving before.

These calculations are very complex and computationally expensive, so PDE-heavy disciplines often rely on supercomputers to do the math. This is why AI researchers take a special interest in these equations. If deep learning could speed up the process of solving them, it would be of great benefit to research and engineering.

Caltech researchers have introduced a new deep learning method for solving PDEs that is significantly more accurate than the deep learning methods developed previously. The method is also general enough to solve entire families of PDEs, such as the Navier-Stokes equation for any type of fluid, without retraining. Finally, it is 1,000 times faster than traditional mathematical formulas, which would reduce our reliance on supercomputers and let us model even larger problems. Sounds good. We’ll take two!

Hammertime

[Translator’s note: the subheading is a reference to “U Can’t Touch This” by rapper MC Hammer.]

Before we dive into how the researchers did it, let’s first appreciate the results. The gif below shows an impressive demo. The first column shows two snapshots of the fluid’s motion; the second column shows how the fluid actually continued to move; and the third column shows the neural network’s prediction. It looks essentially identical to the second.

The paper generated a lot of buzz on Twitter, and even rapper MC Hammer retweeted it.

But back to how scientists achieved this.

When the function fits

The first thing to understand is that neural networks are fundamentally function approximators. When they train on a set of paired inputs and outputs, they are actually estimating a function, that is, a series of mathematical operations that turns one into the other. Consider a cat detector. You train the neural network by feeding it many images of cats and of other things, labeling the two groups 1 and 0. The neural network then looks for the best function that maps each cat image to 1 and each image of anything else to 0. That is how the network can look at a new image and tell whether there is a cat in it. It uses the function it found to compute its answer, and if the training went well, the answer will be right most of the time.
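To make “looking for the best function” a bit more concrete, here is a minimal sketch in plain Python. It is not a real cat detector: the two numeric features per “image”, the cluster positions, and the names `train_detector` and `predict` are all assumptions made up for illustration. It uses the simplest possible model (logistic regression trained by gradient descent) to find a function that maps one group of inputs to 1 and the other to 0.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_detector(samples, labels, epochs=500, lr=0.5):
    """Fit weights w and bias b so that sigmoid(w.x + b) approximates the labels."""
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log-loss with respect to the score
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # One gradient-descent step on the averaged gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 ("cat") or 0 ("not a cat") for one input."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Toy data (hypothetical): "cats" cluster around (1, 1), everything else around (-1, -1).
rng = random.Random(0)
cats = [[1 + rng.gauss(0, 0.3), 1 + rng.gauss(0, 0.3)] for _ in range(50)]
others = [[-1 + rng.gauss(0, 0.3), -1 + rng.gauss(0, 0.3)] for _ in range(50)]
samples = cats + others
labels = [1] * 50 + [0] * 50

w, b = train_detector(samples, labels)
accuracy = sum(predict(w, b, x) == y for x, y in zip(samples, labels)) / len(samples)
print("training accuracy:", accuracy)
```

A real image classifier has millions of parameters instead of three, but the principle is the same: training adjusts the parameters until the function’s outputs match the labels.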

Conveniently, function approximation is exactly what we need when solving a PDE. Ultimately, you are trying to find a function that best describes, say, the motion of air particles over space and time.

This is the crux of the work. Neural networks are usually trained to approximate functions between inputs and outputs defined in Euclidean space, the classic graph with x, y and z axes. But this time, the researchers decided to define the inputs and outputs in Fourier space, a special type of space for plotting wave frequencies. The point is that something like the motion of air can actually be described as a combination of waves, says Anima Anandkumar, a Caltech professor who led the research together with her colleagues, Professors Andrew Stuart and Kaushik Bhattacharya. The general direction of the wind at the macro level is like a low frequency with very long, sluggish waves, while the small eddies that form at the micro level are like high frequencies with very short, fast waves.
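The idea of “a combination of waves” can be sketched with a toy one-dimensional signal. The example below is an illustration only, not the researchers’ code: the signal, its amplitudes, and the helper name `dft` are assumptions. It builds a signal from one slow, large wave (the “prevailing wind”) and one fast, small wave (the “eddies”), then uses a naive discrete Fourier transform to recover which frequencies it contains.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform, O(n^2); fine for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


n = 64
# A slow, large wave (1 cycle over the window) plus a fast, small wave (12 cycles).
signal = [
    1.0 * math.sin(2 * math.pi * 1 * t / n) + 0.2 * math.sin(2 * math.pi * 12 * t / n)
    for t in range(n)
]

spectrum = dft(signal)
# Amplitude per frequency; for a real signal only the first n/2 entries are independent.
# (The factor 2 is right for every frequency except 0, which is empty in this signal.)
amplitudes = [2 * abs(c) / n for c in spectrum[: n // 2]]

# The two strongest frequencies are exactly the two waves we put in.
dominant = sorted(range(n // 2), key=lambda k: amplitudes[k], reverse=True)[:2]
print(dominant)  # [1, 12]
```

In Fourier space the whole 64-point signal is summarized by two numbers: amplitude 1.0 at frequency 1 and amplitude 0.2 at frequency 12.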

Why does this matter? Because it is much easier to approximate a function in Fourier space than to wrestle with PDEs in Euclidean space, which greatly simplifies the neural network’s job. It also brings a significant gain in accuracy: in addition to the huge speed advantage over traditional methods, the new method reduces the error rate on Navier-Stokes problems by 30% compared with previous deep learning methods.
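A textbook case hints at why Fourier space can be the easier place to work. The sketch below does not use the Navier-Stokes equation or a neural network; it takes the much simpler periodic 1-D heat equation, u_t = ν·u_xx, as an assumed stand-in, and the names `dft` and `heat_step_fourier` are made up for illustration. In Fourier space this PDE falls apart into independent frequencies: advancing the solution by time t just multiplies the wave with integer frequency k by exp(−ν·k²·t).

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform (or its inverse), O(n^2)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [c / n for c in out] if inverse else out


def heat_step_fourier(u0, nu, t):
    """Advance the periodic heat equation u_t = nu * u_xx by time t on [0, 2*pi)."""
    n = len(u0)
    u_hat = dft(u0)
    for k in range(n):
        wavenumber = k if k <= n // 2 else k - n  # signed integer frequency
        # Each frequency decays on its own; no frequency affects any other.
        u_hat[k] *= math.exp(-nu * wavenumber ** 2 * t)
    return [c.real for c in dft(u_hat, inverse=True)]


n = 32
xs = [2 * math.pi * i / n for i in range(n)]
u0 = [math.sin(x) + 0.5 * math.sin(4 * x) for x in xs]

u1 = heat_step_fourier(u0, nu=0.1, t=1.0)

# Known exact solution for this initial condition, used only as a check.
exact = [
    math.exp(-0.1) * math.sin(x) + 0.5 * math.exp(-1.6) * math.sin(4 * x) for x in xs
]
max_error = max(abs(a - b) for a, b in zip(u1, exact))
print("max error vs exact solution:", max_error)
```

Here the per-frequency multipliers are known in closed form. For harder equations they are not, and the intuition (stated loosely, as an assumption about the general approach rather than a description of the researchers’ architecture) is that a network working in Fourier space can learn how to transform each frequency instead of learning the fluid’s behavior point by point.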

All of this makes sense, and on top of that, the method can generalize. Previous deep learning methods had to be trained separately for each type of fluid, whereas this method needs to be trained only once to handle all of them, as the researchers’ experiments confirm. Although they have not yet tried to extend the approach to other media, the method should also be able to handle the composition of the earth’s crust when solving seismic PDEs, or different types of material when solving PDEs related to thermal conductivity.

Supersimulation

The professors and their graduate students did this research for more than the theoretical fun of it. They want to bring AI to more scientific disciplines. It was through conversations with collaborators working in climate science, seismology and materials science that Anandkumar first decided to take on the PDE challenge with her colleagues and students. They are now working to put the method into practice with other researchers from Caltech and Lawrence Berkeley National Laboratory.

One research topic that Anandkumar cares about in particular is climate change. The Navier-Stokes equation is good not only for modeling air turbulence; it is also used in weather modeling. “Good, accurate global weather forecasts are challenging,” she says, “and even on the largest supercomputers, we cannot make global forecasts today.” So if the new method can be used to speed all of this up, the impact would be huge.

“There are many, many other applications of the method,” she adds. “In that sense, there is no limit, because we have a common way to speed up the work with all these applications.”

Now that artificial intelligence can solve differential equations, what’s next? Maybe you will be one of the people who teach it to solve even harder problems.
And we will be happy to help you with that by offering the special promo code HABR, which adds 10% to the discount shown on the banner.