An example of how new diagnostics appear in PVS-Studio

New C++ diagnostics for PVS-Studio

Users sometimes ask how new diagnostics appear in the PVS-Studio static analyzer. Our answer is that we draw inspiration from a wide variety of sources: books, coding standards, our own mistakes, letters from our users, and so on. Recently we came up with an interesting new diagnostic and decided to tell the story of how it happened.

It all started with checking the COVID-19 CovidSim Model project and an article about an uninitialized variable. The project turned out to be small and written in modern C++. That makes it a good addition to the set of test projects used for regression testing of the PVS-Studio analyzer core.

However, before adding a new project to the test base, it is useful to review the warnings issued for it, identify new patterns of false positives, and write them down for further refinement of the analyzer. It is also an extra opportunity to notice that something else is wrong, for example, that a message describes an error poorly for a particular code construct.

The programmer who was assigned to add the project to the test base approached the task thoroughly and decided to look at the MISRA diagnostics as well. Strictly speaking, this was not necessary: MISRA is a very specific group of diagnostics that makes no sense to enable for projects such as CovidSim.

MISRA C and MISRA C++ diagnostics are intended for developers of embedded systems, and their essence boils down to restricting the use of unsafe programming constructs. For example, using the goto statement is not recommended (V2502), since it encourages complex code in which it is easy to make a logical error. More information about the philosophy of the MISRA coding standard can be found in the article “What is MISRA and how to prepare it”.

For application software, which is exactly what CovidSim is, it makes no sense to enable the MISRA diagnostics. The user will simply drown in a huge number of messages that are of little use to them. For example, while experimenting with this set of diagnostics, we received over a million warnings for some medium-sized open source projects. Roughly speaking, from the point of view of MISRA, there may be something wrong in every third line of code :). Naturally, nobody will review all of these warnings, let alone fix them. A project is either developed with the MISRA recommendations in mind from the start, or this coding standard is irrelevant to it.

But we digress. While skimming through the MISRA warnings, our colleague noticed warning V2507, issued for this code snippet:

if (radiusSquared > StateT[tn].maxRad2) StateT[tn].maxRad2 = radiusSquared;
{
  SusceptibleToLatent(a->pcell);
  if (a->listpos < Cells[a->pcell].S)
  {
    UpdateCell(Cells[a->pcell].susceptible, a->listpos, Cells[a->pcell].S);
    a->listpos = Cells[a->pcell].S;
    Cells[a->pcell].latent[0] = ai;
  }
}
StateT[tn].cumI_keyworker[a->keyworker]++;

Rule V2507 forces conditional statement bodies to be wrapped in curly braces.

At first, our meticulous colleague thought that the analyzer had malfunctioned. After all, there is a block of code in curly braces! Is this a false positive?

Let’s take a closer look. The code only seems to be correct, but it is not! The curly braces do not belong to the if statement.

Let’s format the code for clarity:

if (radiusSquared > StateT[tn].maxRad2)
  StateT[tn].maxRad2 = radiusSquared;

{
  SusceptibleToLatent(a->pcell);
  ....
}

You have to agree, this is a nice bug. It will surely make it into the Top 10 C++ bugs we found in 2021.
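To see the effect in isolation, here is a minimal sketch of the same pattern. The function run and the log strings are hypothetical and are not taken from CovidSim; they only record which statements execute. The point it illustrates is that the braced block runs even when the condition is false.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the snippet above: records which statements run.
std::vector<std::string> run(double radiusSquared, double maxRad2)
{
    std::vector<std::string> log;
    if (radiusSquared > maxRad2) log.push_back("update maxRad2");
    {
        // A free-standing block: it is NOT the body of the if above.
        log.push_back("block");
    }
    return log;
}

// Usage: the block executes regardless of the condition.
void demo()
{
    assert(run(1.0, 2.0).size() == 1);  // condition false: only the block ran
    assert(run(3.0, 2.0).size() == 2);  // condition true: assignment and block
}
```

If the block was meant to depend on the condition, the one-line then-branch hides that dependency completely.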

And what follows from this? The MISRA approach works! Yes, it makes you write curly braces everywhere. Yes, it’s exhausting. But it’s a reasonable price to pay for improving the reliability of embedded applications used in medical technology, automobiles, and other high-responsibility systems.

Okay, developers who use the MISRA standard are covered. However, telling everyone to use curly braces everywhere is a bad idea. With that approach, it is very easy to bring the analyzer to the point where it becomes unusable: there will be so many messages that nobody will read them.

So this is how we arrived at an idea for a new general-purpose diagnostic. The following rule can be formulated.

Issue a warning for an if statement when the following conditions are met:

  • the whole conditional statement is written on one line and has only a then-branch;
  • the next statement after the if is a compound statement, and it is not on the same line as the if.
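As a rough illustration, the two conditions can be approximated with a line-based check. This is only a sketch under strong assumptions: it works on raw text lines rather than on a syntax tree, it ignores comments, string literals, and macros, and the names trim, isOneLineIf, and findSuspiciousIfs are made up for this example. It is not how PVS-Studio implements the rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// True if the line looks like a complete one-line `if (...) stmt;` without else.
static bool isOneLineIf(const std::string& line)
{
    const std::string t = trim(line);
    if (t.compare(0, 2, "if") != 0) return false;
    std::size_t i = 2;
    while (i < t.size() && (t[i] == ' ' || t[i] == '\t')) ++i;
    if (i >= t.size() || t[i] != '(') return false;
    // Find the parenthesis that closes the condition.
    int depth = 0;
    std::size_t j = i;
    for (; j < t.size(); ++j) {
        if (t[j] == '(') ++depth;
        else if (t[j] == ')' && --depth == 0) break;
    }
    if (j >= t.size()) return false;  // condition continues on another line
    const std::string body = trim(t.substr(j + 1));
    // The then-branch must be a simple statement on the same line, with no else.
    // The "else" substring test is deliberately crude.
    return !body.empty() && body.front() != '{' && body.back() == ';'
        && body.find("else") == std::string::npos;
}

// Returns 0-based indices of one-line ifs followed by a block on a later line.
std::vector<std::size_t> findSuspiciousIfs(const std::vector<std::string>& lines)
{
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (!isOneLineIf(lines[i])) continue;
        std::size_t k = i + 1;
        while (k < lines.size() && trim(lines[k]).empty()) ++k;  // skip blanks
        if (k < lines.size() && trim(lines[k]).front() == '{')
            hits.push_back(i);
    }
    return hits;
}

// Usage: the shape of the CovidSim snippet is flagged, a braced if is not.
void demoCheck()
{
    const std::vector<std::string> bad = {"if (x > m) m = x;", "{", "  f();", "}"};
    const std::vector<std::string> ok = {"if (x > m)", "{", "  m = x;", "}"};
    assert(findSuspiciousIfs(bad).size() == 1);
    assert(findSuspiciousIfs(ok).empty());
}
```

A real diagnostic would work on the parsed code, which avoids the false matches a text-based check like this one is prone to.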

Even before implementing it, we can predict that this will be a good rule with a low number of false positives.

This is how the idea is currently described in our task tracker. Some details may change during implementation, but that is not important. What matters is that a good diagnostic rule will appear and start detecting a new error pattern. Later, we will extend it to the C# and Java cores of the PVS-Studio analyzer.

We have just looked at an interesting example of how a new diagnostic rule was formulated, which will then be implemented in PVS-Studio. Thanks go to the CovidSim project, the MISRA coding standard, and the attentiveness of our colleague.

Thanks for reading, and follow me into the world of C++ and bugs :).

Additional links:

  1. Technologies used in the PVS-Studio code analyzer to find bugs and potential vulnerabilities
  2. PVS-Studio for Java under the hood: developing diagnostics
  3. Using machine learning in static analysis of program source code

If you want to share this article with an English-speaking audience, please use the translation link: Andrey Karpov. Example of How New Diagnostics Appear in PVS-Studio.
