Roslyn API: Why PVS-Studio Was Analyzing the Project So Long

How many of you have used third-party libraries in your code? The question is rhetorical: without third-party libraries, many products would take far longer to build, because every problem would require “reinventing the wheel”. However, third-party libraries have drawbacks as well as advantages. One of those drawbacks recently affected the PVS-Studio analyzer for C#. For a long time, the analyzer could not finish analyzing a large project because the V3083 diagnostic used the SymbolFinder.FindReferencesAsync method from the Roslyn API.

Life at PVS-Studio went on as usual. New diagnostics were developed, the static analyzer was improved, new articles were written. Suddenly! For one of our users, the analysis of a large project ran for a whole day and still would not finish. Alarm! Alarm! All hands on deck! We got dumps from the user and began to investigate why the analysis took so long. A detailed study showed that three C# diagnostics took the most time. One of them was diagnostic V3083. This diagnostic had already drawn extra attention, but now it was time to take concrete action. V3083 warns about incorrect invocations of C# events. For example, in the code:

public class IncorrectEventUse
{
  public event EventHandler EventOne;
  protected void InvokeEventTwice(object o, EventArgs args)
  {
    if (EventOne != null)
    {
      EventOne(o, args);
      EventOne.Invoke(o, args);
    }
  }
}

V3083 flags the invocations of the EventOne event in the InvokeEventTwice method. You can learn more about why this code is dangerous in the documentation for this diagnostic. From the outside, the logic of V3083 is very simple:

  • find the event call;

  • check if this event is being called correctly;

  • issue a warning if the event is called incorrectly.

This simplicity makes it all the more interesting to find out why the diagnostic ran for so long.
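To make the warning itself more concrete: the danger in the example above is usually described as a race, where another thread may unsubscribe the last handler between the null check and the invocation, leaving the event null at the moment it is called. The sketch below shows two commonly used safer patterns. The class name SaferEventUse, its method names and the Demo helper are hypothetical and are not taken from PVS-Studio or its documentation.

```csharp
using System;

public class SaferEventUse
{
    public event EventHandler EventOne;

    // Pattern 1: copy the delegate to a local variable first.
    // The local cannot become null after the check, even if another
    // thread unsubscribes from EventOne in the meantime.
    public void InvokeViaLocalCopy(object o, EventArgs args)
    {
        EventHandler handler = EventOne;
        if (handler != null)
        {
            handler(o, args);
        }
    }

    // Pattern 2 (C# 6 and later): the null-conditional operator reads
    // the event once and invokes it only if it is not null.
    public void InvokeViaNullConditional(object o, EventArgs args)
    {
        EventOne?.Invoke(o, args);
    }

    // Hypothetical helper: returns how many times the handler ran.
    public static int Demo()
    {
        var source = new SaferEventUse();

        // With no subscribers, neither call throws.
        source.InvokeViaLocalCopy(source, EventArgs.Empty);
        source.InvokeViaNullConditional(source, EventArgs.Empty);

        int calls = 0;
        source.EventOne += (sender, e) => calls++;
        source.InvokeViaLocalCopy(source, EventArgs.Empty);
        source.InvokeViaNullConditional(source, EventArgs.Empty);
        return calls;
    }
}
```

Both patterns read the event only once, which is why a change made by another thread after that read cannot turn the call into a NullReferenceException.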

The reason for the slowdown

In fact, the logic is a little more complicated. For each type in each file, V3083 produces only one analyzer warning per event, listing the numbers of all lines where the event is invoked incorrectly (for navigation in the various plugins: Visual Studio, Rider, SonarQube). So the first step is to find all the places where the event is invoked. The Roslyn API already has a method for a similar task, SymbolFinder.FindReferencesAsync, and V3083 used it so as not to “reinvent the wheel”.
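As an illustration of the kind of call described above, here is a minimal sketch that asks SymbolFinder.FindReferencesAsync for every reference to an event and collects the line numbers. It is not the actual V3083 code. It assumes the Microsoft.CodeAnalysis.CSharp.Workspaces NuGet package is referenced, and the class FindEventReferencesSketch, the method GetEventReferenceLinesAsync and the in-memory "Demo" project are hypothetical names made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.FindSymbols;

public static class FindEventReferencesSketch
{
    // The sample from the article, analyzed from an in-memory document.
    private const string Source = @"
using System;
public class IncorrectEventUse
{
    public event EventHandler EventOne;
    protected void InvokeEventTwice(object o, EventArgs args)
    {
        if (EventOne != null)
        {
            EventOne(o, args);
            EventOne.Invoke(o, args);
        }
    }
}";

    public static async Task<List<int>> GetEventReferenceLinesAsync()
    {
        // Build a throwaway solution containing a single C# document.
        using var workspace = new AdhocWorkspace();
        Project project = workspace
            .AddProject("Demo", LanguageNames.CSharp)
            .AddMetadataReference(
                MetadataReference.CreateFromFile(typeof(object).Assembly.Location));
        Document document = project.AddDocument("Demo.cs", Source);
        Solution solution = document.Project.Solution;

        // Locate the symbol of the EventOne event.
        Compilation compilation = await document.Project.GetCompilationAsync();
        INamedTypeSymbol type = compilation.GetTypeByMetadataName("IncorrectEventUse");
        IEventSymbol eventSymbol =
            type.GetMembers("EventOne").OfType<IEventSymbol>().Single();

        // The search is given the whole solution, not just one file.
        IEnumerable<ReferencedSymbol> references =
            await SymbolFinder.FindReferencesAsync(eventSymbol, solution);

        // Convert each reference location to a 1-based line number.
        return references
            .SelectMany(r => r.Locations)
            .Select(l => l.Location.GetLineSpan().StartLinePosition.Line + 1)
            .OrderBy(line => line)
            .ToList();
    }
}
```

Note that the call receives the entire solution as its search scope, which fits the observation that the method takes longer as the codebase grows.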

Many tutorials recommend this method. In some simple cases its speed is probably sufficient. However, the larger the project’s codebase, the longer the method takes. We became 100% convinced of this only after changing V3083.

V3083 speedup after the change

Whenever you change diagnostic code or the analyzer core, you need to check that nothing that used to work is broken. For this we have positive and negative tests for each diagnostic, unit tests for the analyzer core, and a base of open-source projects (almost 90 of them). Why do we need a base of open-source projects? We run the analyzer on it to test it in “combat conditions”, and the run also serves as an extra check that we have not broken anything. We already had a run on this base from before the V3083 change. All that remained was to make the same run after the change and measure the time gained. The results pleasantly surprised us: without SymbolFinder.FindReferencesAsync in V3083, the tests ran 9% faster. If that figure seems insignificant to anyone, here are the characteristics of the computer on which the measurements were made:

I think that after this even the most stubborn skeptics are convinced of the scale of the problem that had been quietly living in the V3083 diagnostic.

Conclusion

Let this note be a warning to everyone who uses the Roslyn API, so that you do not repeat our mistakes! This applies not only to the SymbolFinder.FindReferencesAsync method but also to all other methods of the Microsoft.CodeAnalysis.FindSymbols.SymbolFinder class that use the same mechanism.

I also advise all developers to study the libraries they use very carefully. I am not saying this for nothing! To understand why this is so important, I recommend two of our other notes (the first and the second), which consider this issue in more detail.

Besides the diagnostics, we have taken up other optimizations of the PVS-Studio analyzer, which we will discuss in upcoming articles and notes.

The V3083 change has not been released yet, so analyzer version 7.12 still uses SymbolFinder.FindReferencesAsync.

As mentioned earlier, the slowdown was found in two more C# diagnostics besides V3083. Just for fun, I invite readers to leave their guesses about which diagnostics these are in the comments. Once there are more than 50 guesses, I will lift the veil of secrecy and give the numbers of these diagnostics.

If you want to share this article with an English-speaking audience, please use the translation link: Valery Komarov. Roslyn API: Why PVS-Studio Was Analyzing the Project So Long.