PINNs (Physics-Informed Neural Networks) and what they're all about

A well-known, and in many ways sad, fact: simulating real physical systems requires complex numerical methods and very long run times on supercomputers.

A less well-known, but more cheerful, fact: there are neural networks that can do the same thing faster (albeit with lower accuracy).

In real, complex industries, such as thin-film deposition or membrane development, production requires either the sacred knowledge of technologists, passed down from generation to generation, or at least some approximate estimates of the process input parameters (equipment settings, its geometry, ratios of substances, etc.). Unfortunately, any adequate physical estimates can only be obtained through lengthy simulations of the dynamics of the processes involved, and they will be valid at best for one specific setup under specific conditions. What to do? Approximate!

A typical task of a physicist-programmer

Suppose we want to see how a highly rarefied gas behaves under some specific conditions.

Such phenomena are described by differential equations with boundary conditions, i.e., by a Cauchy problem.

For example, the one-dimensional Boltzmann equation without collisions:

\frac{\partial f}{\partial t} + v \frac{\partial f}{\partial x} = 0,

where f is the distribution function and v is the velocity.
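Incidentally, this particular equation can be solved exactly by the method of characteristics: the initial profile is simply transported with velocity v,

f(x, t) = f_0(x - vt),

which makes it a convenient toy problem for sanity-checking the numerical and neural approximations discussed below.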

To solve such problems, one introduces so-called difference grids: at their nodes we define the parameters of the system, introduce grid functions, compute discretized derivatives, and look for solutions of our equations. A typical discretized version of the Boltzmann equation looks like this:

\frac{f^{j+1}_{i} - f^{j}_{i}}{\tau} + v\frac{f^{j}_{i} - f^{j}_{i-1}}{h} = 0

where \tau is the time step and h is the spatial step of the grid.
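As a rough illustration (a minimal sketch, not anyone's production solver), here is what this explicit upwind scheme might look like in Python with NumPy; the velocity, grid sizes, boundary conditions, and initial profile are arbitrary demo assumptions:

```python
import numpy as np

v = 1.0                      # advection velocity (demo value)
nx, nt = 200, 400            # number of spatial nodes / time steps (demo values)
h = 1.0 / nx                 # spatial grid step
tau = 0.5 * h / abs(v)       # time step chosen to satisfy the CFL stability condition

x = np.linspace(0.0, 1.0, nx, endpoint=False)
f = np.exp(-((x - 0.3) ** 2) / 0.005)   # some initial distribution f(x, 0)

for _ in range(nt):
    # explicit upwind scheme: (f_i^{j+1} - f_i^j)/tau + v*(f_i^j - f_{i-1}^j)/h = 0
    # with periodic boundary conditions handled by np.roll
    f = f - v * tau / h * (f - np.roll(f, 1))

# f now approximates the distribution at time nt * tau
```

Note that the time step is tied to the spatial step through the CFL condition: refine the grid in space and you are forced to refine it in time as well, which is part of why these calculations get so expensive.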

The accuracy of the final calculation depends on the grid resolution, and so do the duration and cost of the computation.

For real problems, a typical coordinate grid contains on the order of hundreds of thousands of points, and such grids are introduced for every variable. For multidimensional problems the situation gets dramatically worse: a separate grid is needed for each coordinate, so the number of nodes multiplies across dimensions.

Such calculations for real problems usually take months on supercomputers.

We approximate the approximation

When I was a student, we often joked that publications about neural networks in physics journals are usually an attempt to approximate some approximate method with a neural network, i.e., to obtain an approximation squared.

Well, in general, we were not that far off.

So. We have neural networks and many different architectures for them. The naive solution: take an arbitrary architecture and train it on our data.

At the output, we will get some neural network that will play the role of our distribution function.

f' = NN(x, t)
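To make the naive approach concrete, here is a minimal sketch (all names, layer sizes, and hyperparameters are illustrative assumptions) of such a purely data-driven surrogate in PyTorch:

```python
import torch
import torch.nn as nn

# A plain MLP surrogate f'(x, t) = NN(x, t); layer sizes are arbitrary demo choices
model = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(xt, f_true):
    """One purely data-driven step: xt has shape (N, 2) for (x, t), f_true has shape (N, 1)."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(xt), f_true)  # data term only, no physics
    loss.backward()
    optimizer.step()
    return loss.item()
```

Nothing in this setup knows about the physics: the network is judged only by how well it reproduces the training data.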

Let’s even assume that this neural network has a very low error, including on the test dataset.

But here is the problem: in a number of cases the conservation laws end up violated, if only slightly, and nothing in this setup guarantees that they hold.

Unfortunately, this makes such an approximation practically useless: solutions that violate conservation laws are non-physical and lead to downright magical phenomena, such as energy sources appearing in the system out of nowhere. Dynamical systems are sensitive enough to such defects that almost the entire solution becomes worthless.

Another landscape

From a formal point of view, we have been solving the problem of optimizing an arbitrary approximation of the distribution function over an arbitrary parameter space.

But what if we set up the loss function in advance so that the conservation laws are enforced?

For example, let's directly add a term to the loss function that vanishes only when the conservation laws are satisfied.

Loss = MSE(f, f') + L_{phys}

Thus we have restricted the parameters to those that lie in a certain subspace; think of it as having identified a surface on which the conservation laws hold.

Is this a sufficient solution?

Almost. In fact, the additional term should take into account all the properties of the approximated equations that we know: boundary conditions, conservation laws, and so on. That is:

L_{phys} = L_{BC} + L_{DE}

where BC stands for the boundary conditions and DE for the differential equation itself. Such a loss function is minimized on those solutions that satisfy both the boundary conditions and the differential equation.
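As a sketch of what this can look like in code (assumptions: the same collisionless Boltzmann equation as above, collocation points supplied by the caller, and an initial condition standing in for the boundary data), L_DE can be computed as the mean squared residual of the equation via automatic differentiation, and L_BC as the mismatch with the initial data:

```python
import torch

def physics_loss(model, v, x_col, t_col, x0, f0):
    """L_phys = L_BC + L_DE for df/dt + v*df/dx = 0.

    x_col, t_col: collocation points inside the domain, shape (N, 1).
    x0, f0: samples of the initial condition f(x, 0) = f0(x), shape (M, 1).
    """
    x_col = x_col.clone().requires_grad_(True)
    t_col = t_col.clone().requires_grad_(True)

    f = model(torch.cat([x_col, t_col], dim=1))
    # derivatives of the network output w.r.t. its inputs via automatic differentiation
    f_t = torch.autograd.grad(f, t_col, torch.ones_like(f), create_graph=True)[0]
    f_x = torch.autograd.grad(f, x_col, torch.ones_like(f), create_graph=True)[0]

    residual = f_t + v * f_x                 # vanishes on exact solutions of the equation
    loss_de = (residual ** 2).mean()         # L_DE

    f_pred0 = model(torch.cat([x0, torch.zeros_like(x0)], dim=1))
    loss_bc = ((f_pred0 - f0) ** 2).mean()   # L_BC (here: the initial condition)

    return loss_bc + loss_de

# Total loss as in the article: Loss = MSE(f, f') + L_phys, e.g.
# loss = torch.nn.functional.mse_loss(model(xt), f_true) \
#        + physics_loss(model, v, x_col, t_col, x0, f0)
```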

A neural network trained with this type of loss function is called a PINN, a physics-informed neural network.

This type of loss function solves several problems at once:

  1. Preservation of the physicality of the solution.

  2. Faster training, since the search space of network parameters is reduced.

  3. Less data needed for training, since we inject a priori information about the properties of the solution directly into the network.

And what’s next?

In fact, PINN is a special case of a deeper idea that is now being actively developed in neural networks.

Adding extra terms associated not only with the features we are studying but also with their derivatives (in this case, differential equations) requires the neural network to respect the smoothness of the solution space. In this way we effectively take into account certain invariants of our manifold. Networks can be designed with the geometric properties of the problem under study in mind in various ways. Details not directly related to PINNs can be found in a rather controversial but interesting article: link.
