PINN, or Physics-Informed Neural Networks

What is PINN and what is its application area?

PINNs appeared relatively recently (the paper by Raissi et al. was published in 2019), but they are already actively used for a range of physics problems. A distinctive feature of these neural networks is that the loss function includes the residuals of the equations that describe the physical process under consideration. The input of such a network is coordinates (spatial or temporal, depending on the task). Another feature is that no targets are required for training, since, I repeat, the loss minimizes the deviation of the predicted values from the equations.
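
Schematically (my notation, not taken from the original paper): if the physics is described by an equation \mathcal{N}[x](t) = 0 and x_\theta(t) is the prediction of a network with parameters \theta, then the physics part of the loss, evaluated at a set of collocation points t_1, \dots, t_N, can be written as

\mathcal{L}_{\mathrm{pde}} = \frac{1}{N}\sum_{i=1}^{N}\big(\mathcal{N}[x_\theta](t_i)\big)^2,

and training minimizes it with respect to \theta without any target values.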

We can say that PINN is a replacement for numerical modeling, and then the question arises: "Are neural networks needed where numerical methods work well?" But it's not that simple. Imagine that you or a colleague conducted an experiment: for example, measured the velocities of particles in a fluid flow, or obtained point measurements of temperature. If you really have experience with experiments, you probably know that experimental data are far from ideal and can cause a lot of headaches during processing. Now imagine that you have nevertheless carried out this processing, obtained a dataset from the experiment, and now want to plug these data into equations to recover other flow parameters. For example, you measured the velocity and, from the hydrodynamic equations, want to obtain the pressure. In other words, to perform data assimilation, to put it in fashionable quasi-scientific language. Numerical modeling can fail here, because even carefully filtered data can be noisy (especially if you need to take derivatives of them, and if you also need second derivatives, things get really messy). Or there may be very few data points (for example, the temperature was measured with a thermocouple at just a few locations). So there is an experiment, and it is potentially possible to recover some quantities from others by solving the equations. This is where PINNs can come to the rescue, because they work differently from numerical methods: they do not rely on grid-based discretization schemes; instead, the network parameters are adjusted so that the residuals are minimized at a chosen set of points.

In addition, PINNs do not use traditional numerical derivatives; they rely on a tool called automatic differentiation. After all, PINNs exist to minimize the residuals of differential equations. Let me try to explain "naively" what "automatic" derivatives are. A PINN is a fully connected neural network, that is, several layers with a certain number of neurons in each. During the backward pass, derivatives of the output are taken with respect to the input parameters (recall that the inputs here are spatial coordinates and time). If you think about it, these are nothing more than ordinary derivatives, only evaluated exactly at a point, in the limit. An ordinary numerical derivative is a finite difference, while here the derivative is assembled by the chain rule from the weights embedded in the neural network. We can take these derivatives as many times as we need: second derivatives, third ones... Yes, we can. And this is a powerful advantage of PINNs. Remember that experimental data are often noisy and sparse. A neural network, if trained correctly, can learn these data and provide us with automatic derivatives at the points we need.

This is the idea behind using neural networks for data assimilation.
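
To make the idea of "automatic" derivatives concrete, here is a minimal sketch (my own illustration, not code from this post) that differentiates the output of a small fully connected network with respect to its input using torch.autograd.grad:

import torch
import torch.nn as nn

# a toy fully connected network: t -> x(t)
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

t = torch.linspace(0, 1, 100).unsqueeze(1)   # column of input points
t.requires_grad = True                       # needed to differentiate with respect to t

x = net(t)                                   # network prediction x(t)

# dx/dt at every input point, obtained by backpropagation rather than finite differences
dxdt = torch.autograd.grad(x, t, torch.ones_like(x), create_graph=True)[0]

# a second derivative is just another call applied to the first one
d2xdt2 = torch.autograd.grad(dxdt, t, torch.ones_like(dxdt), create_graph=True)[0]

Both dxdt and d2xdt2 have the same shape as t, and they are exact derivatives of the network, not finite-difference approximations.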

A simple example with a harmonic oscillator

In this post, I want to show, using a simple example, how to train a neural network to solve the equation of an ordinary harmonic oscillator (a mass m on a spring with stiffness k), demonstrate what "automatic" derivatives are, and point out some features of PINNs. Let's take the most common weight on a spring. This example is studied at school and is well known. Fig. 1 shows the familiar diagram.

Fig. 1. Diagram of a weight on a spring performing harmonic oscillations

The equation with which we will physically "inform" our PINN, that is, the equation of motion of a harmonic oscillator:

\frac{\mathrm{d}^2 x}{\mathrm{d}t^2} + \omega^2 x = 0

Let the weight be displaced from its equilibrium position ($$x=0$$) to the position x = x_0 and released without initial velocity: \frac{\mathrm{d}x}{\mathrm{d}t}(t=0) = 0. As follows from the equation of motion, \omega = \sqrt{\frac{k}{m}}. Solving this equation, we obtain the law of motion of the weight:

x(t) = x_0 \cos(\omega t)
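
Now to the code. The training snippet below refers to several objects that are not defined in this excerpt: model, device, x0_true, dx0dt_true, omega and the run logger. A minimal sketch of how they might be set up (my assumption; the names are chosen to match the code, the values and architecture are illustrative):

import torch
import torch.nn as nn
from collections import defaultdict

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# a small fully connected network mapping time t to coordinate x(t)
model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
).to(device)

nu = 2
omega = 2 * torch.pi * nu                          # matches nu=2 used in pde() below

x0_true = torch.tensor([[1.0]], device=device)     # initial displacement x(0) = x_0
dx0dt_true = torch.tensor([[0.0]])                 # released without initial velocity

run = defaultdict(list)                            # simple metric logger used below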

import torch
import torch.nn as nn
from tqdm import tqdm
from torch.utils.tensorboard import SummaryWriter

steps = 1000
pbar = tqdm(range(steps), desc="Training Progress")

# collocation points in time on [0, 1]; we will differentiate the output with respect to t
t = (torch.linspace(0, 1, 100).unsqueeze(1)).to(device)
t.requires_grad = True

metric_data = nn.MSELoss()
writer = SummaryWriter()
optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1)

def train():
    for step in pbar:
        def closure():
            optimizer.zero_grad()
            loss = pdeBC()          # total loss: equation residual + boundary conditions
            loss.backward()
            return loss
        optimizer.step(closure)
        if step % 2 == 0:
            current_loss = closure().item()
            pbar.set_description("Step: %d | Loss: %.6f" % (step, current_loss))
            writer.add_scalar('Loss/train', current_loss, step)

train()
writer.close()

Perhaps the most important thing here is the pdeBC() function:

def pdeBC():
    # prediction at the collocation points and the residual of the equation there
    out = model(t)
    f1 = pde(out, t)

    # initial conditions: select the point t = 0 from the collocation grid
    inlet_mask = (t[:, 0] == 0)
    t0 = t[inlet_mask]
    x0 = model(t0).to(device)
    dx0dt = torch.autograd.grad(x0, t0, torch.ones_like(t0), create_graph=True, \
                        retain_graph=True)[0]

    # deviation from the initial conditions x(0) = x_0, dx/dt(0) = 0
    loss_bc = metric_data(x0, x0_true) + \
                metric_data(dx0dt, dx0dt_true.to(device))
    # residual of the equation at all collocation points
    loss_pde = metric_data(f1, torch.zeros_like(f1))

    # weight the boundary term so the network does not ignore it
    loss = 1e3*loss_bc + loss_pde

    # metrics against the analytic solution x(t) = x_0*cos(omega*t), for monitoring only
    metric_x = metric_data(out, x0_true * torch.sin(omega*t + torch.pi / 2))
    metric_x0 = metric_data(x0, x0_true)
    metric_dx0dt = metric_data(dx0dt, dx0dt_true.to(device))

    acc_metrics = {'metric_x': metric_x,
                'metric_x0': metric_x0,
                'metric_dx0dt': metric_dx0dt,
                }

    metrics = {'loss': loss,
                'loss_bc': loss_bc,
                'loss_pde': loss_pde,
                }
    for k, v in metrics.items():
        run[k].append(v)
    for k, v in acc_metrics.items():
        run[k].append(v)

    return loss

Along with the equation itself:

def pde(out, t, nu=2):
    # residual of the oscillator equation: d2x/dt2 + omega^2 * x
    omega = 2 * torch.pi * nu
    dxdt = torch.autograd.grad(out, t, torch.ones_like(out), create_graph=True,
                            retain_graph=True)[0]
    d2xdt2 = torch.autograd.grad(dxdt, t, torch.ones_like(dxdt), create_graph=True,
                            retain_graph=True)[0]
    f = d2xdt2 + (omega ** 2) * out
    return f

Please note that torch.autograd.grad is the automatic derivative. And, as you can see, loss = 1e3*loss_{bc} + loss_{pde}. That is, the loss function consists of two parts: the residual of the equations and the deviation from the boundary conditions. More complex PINNs include even more parts; for example, as noted in the introduction, if there are experimental data, you can include a part associated with deviations from these experimental data. It is also important to choose the weights correctly so that the neural network does not strive to minimize only one part. In my example, I chose a weight of 1000 in front of loss_{bc}.
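
As a rough sketch of what such an experimental-data term could look like (the points t_exp, the values x_exp and the weight 1e2 are my own illustration, not from the original code), it is just one more MSE term with its own weight:

# hypothetical measurements of the coordinate at a few time points
t_exp = torch.tensor([[0.1], [0.3], [0.7]], device=device)
x_exp = torch.tensor([[0.3], [-0.9], [0.4]], device=device)   # noisy measured values

def pdeBC_with_data():
    loss = pdeBC()                                   # residual + boundary parts as before
    loss_data = metric_data(model(t_exp), x_exp)     # deviation from the experimental points
    return loss + 1e2 * loss_data                    # the data weight also has to be tuned

The weight in front of loss_data plays the same balancing role as the 1000 in front of loss_bc.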

Here is the training curve:

Fig. 3. Training curve

And here is a comparison of the values predicted by the neural network with the analytical solution for the coordinate itself:
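
A sketch of how such a comparison can be produced (the plotting code is my own, not from the post): evaluate the trained model on the time grid and plot it next to the analytical solution x_0\cos(\omega t).

import matplotlib.pyplot as plt

t_plot = torch.linspace(0, 1, 200).unsqueeze(1).to(device)
with torch.no_grad():
    x_pred = model(t_plot)                     # PINN prediction
x_true = x0_true * torch.cos(omega * t_plot)   # analytical solution

plt.plot(t_plot.cpu(), x_true.cpu(), label="analytical")
plt.plot(t_plot.cpu(), x_pred.cpu(), "--", label="PINN")
plt.xlabel("t")
plt.ylabel("x")
plt.legend()
plt.show()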
