Single line code or Nethermind validation using PVS-Studio C # for Linux

Picture 1

This article is dedicated to the launch of the PVS-Studio C # beta test for Linux, as well as the plug-in for Rider. For such a wonderful reason, using these tools, we checked the source code of the Nethermind product and in this article we will look at interesting and sometimes funny errors.

Nethermind Is a fast client for .NET Core Ethereum for Linux, Windows, MacOs. It can be used in projects when setting up Ethereum or dApps private networks. Nethermind has open source located on github. The project was founded in 2017 and is constantly evolving.

Introduction

Do you like manual labor? For example, such as finding errors in program code. Agree, it is rather tedious to read and analyze the site you wrote or the whole project in search of some tricky bug. Well, if the project is small, let’s say 5,000 lines, but if its size has already exceeded one hundred thousand or a million lines? Moreover, it was written by different programmers and sometimes not in a very digestible form. What to do in this case? Do you really have to sleep somewhere, somewhere underestimate and 100% of your time to realize all these endless lines in order to understand where this nasty mistake is? I doubt that you would like to do this. So what to do, maybe there are modern means to somehow automate this?

Figure 3

And then a tool like a static code analyzer comes into play. Static analyzer is a tool for detecting defects in the source code of programs. The advantage of this tool over manual verification is as follows:

  • almost does not spend your time looking for the location of the error (at least it’s definitely faster than looking for the failed copy-paste with your eyes);
  • does not get tired, unlike a person who after some time of searching will need rest;
  • knows a lot of error patterns that a person may not even be aware of;
  • uses technologies such as: data flow analysis, symbolic execution, pattern matching, and others;
  • allows you to regularly and at any time to analyze;
  • etc.

Of course, using a static code analyzer does not replace or cancel code reviews. But code reviews are becoming more productive and useful. You can focus on finding high-level errors, on transferring knowledge, rather than tiring to read the code in search of typos.

If you are interested in reading more about it, I suggest next articleas well as an article on technologyused in PVS-Studio.

PVS-Studio C # for Linux / macOS

At the moment, we are porting our C # analyzer to .NET Core, and we are also actively developing a plug-in for the IDE Rider.

If you are interested, you can sign up for a beta test by filling out the form on this page. Installation instructions will be sent to your mail (don’t worry, it’s very simple), as well as a license to use the analyzer.

This is what Rider looks like with the PVS-Studio plugin:

Picture 4

A bit of indignation

I want to say that in some places it was very difficult to read the source code of Nethermind, because in it lines of 300-500 characters in length are quite normal. Yes, it is all the code without formatting in 1 line. And these lines, for example, contain several ternary operators, and logical operators, and that only there. In general, enjoyment as from the last season of the game of thrones.

I will explain a little to understand the scope. I have an UltraWide monitor with a length of about 82 centimeters. Opening the IDE on it in full screen, it fits about 340 characters, that is, the lines that I’m talking about do not even fit. If you want to see how it looks, I left links to files on GitHub:

Example 1

private void LogBlockAuthorNicely(Block block, ISyncPeer syncPeer)
{
    string authorString = (block.Author == null ? null : "sealed by " +
(KnownAddresses.GoerliValidators.ContainsKey(block.Author) ?
KnownAddresses.GoerliValidators[block.Author] : block.Author?.ToString())) ??
(block.Beneficiary == null ? string.Empty : "mined by " +
(KnownAddresses.KnownMiners.ContainsKey(block.Beneficiary) ?
KnownAddresses.KnownMiners[block.Beneficiary] : block.Beneficiary?.ToString()));
    if (_logger.IsInfo)
    {
        if (_logger.IsInfo) _logger.Info($"Discovered a new block
{string.Empty.PadLeft(9 - block.Number.ToString().Length, '
')}{block.ToString(Block.Format.HashNumberAndTx)} {authorString}, sent by
{syncPeer:s}");
    }
}

File link

Example 2

private void BuildTransitions()
{
    ...
    releaseSpec.IsEip1283Enabled = (_chainSpec.Parameters.Eip1283Transition ??
long.MaxValue) <= releaseStartBlock &&
((_chainSpec.Parameters.Eip1283DisableTransition ?? long.MaxValue) 
> releaseStartBlock || (_chainSpec.Parameters.Eip1283ReenableTransition ??
long.MaxValue) <= releaseStartBlock);           
    ...
}

File link

public void 
Will_not_reject_block_with_bad_total_diff_but_will_reset_diff_to_null()
{
    ...
    _syncServer = new SyncServer(new StateDb(), new StateDb(), localBlockTree,
NullReceiptStorage.Instance, new BlockValidator(Always.Valid, new
HeaderValidator(localBlockTree, Always.Valid, MainnetSpecProvider.Instance,
LimboLogs.Instance), Always.Valid, MainnetSpecProvider.Instance, 
LimboLogs.Instance), Always.Valid, _peerPool, StaticSelector.Full, 
new SyncConfig(), LimboLogs.Instance);
    ...     
}

File link

Now imagine that an error will occur in such a site, will it be pleasant to look for it? I’m sure that it’s not, and everyone understands that it’s impossible to write like that. And by the way, there is a similar place with an error in this project.

Validation Results

Conditions that do not like 0

Condition 1

public ReceiptsMessage Deserialize(byte[] bytes)
{
    if (bytes.Length == 0 && bytes[0] == Rlp.OfEmptySequence[0])
        return new ReceiptsMessage(null);
    ...
}

PVS-Studio warning: V3106 Possibly index is out of bound. The '0' index is pointing beyond 'bytes' bound. Nethermind.Network ReceiptsMessageSerializer.cs 50

To notice the error, consider the case when the number of elements in the array is 0. Then the condition bytes.Length == 0 will be true and when accessing the 0 element of the array, an exception of type IndexOutOfRangeException.

They wanted to exit this method immediately if the array is empty or the 0 element is equal to a certain value, but it seems that they accidentally mixed up "||" with "&&". I propose to fix this problem as follows:

public ReceiptsMessage Deserialize(byte[] bytes)
{
    if (bytes.Length == 0 || bytes[0] == Rlp.OfEmptySequence[0])
        return new ReceiptsMessage(null);
    ...
}

Condition 2

public void DiscoverAll()
{
    ...
    Type? GetStepType(Type[] typesInGroup)
    {
        Type? GetStepTypeRecursive(Type? contextType)
        {
            ...
        }
        ...
        return typesInGroup.Length == 0 ? typesInGroup[0] :
               GetStepTypeRecursive(_context.GetType());
    }
    ...
}

PVS-Studio warning: V3106 Possibly index is out of bound. The '0' index is pointing beyond 'typesInGroup' bound. Nethermind.Runner EthereumStepsManager.cs 70

A situation similar to that described above occurs. If the number of items in typesInGroup is equal to 0, then when accessing the 0 element an exception of type IndexOutOfRangeException.

Only in this case I do not understand what the developer wanted. Most likely instead typesInGroup[0] need to write null.

Error or unfinished optimization?

private void DeleteBlocks(Keccak deletePointer)
{
   ...
   if (currentLevel.BlockInfos.Length == 1)
   {
      shouldRemoveLevel = true;
   }
   else
   {
      for (int i = 0; i < currentLevel.BlockInfos.Length; i++)
      {
         if (currentLevel.BlockInfos[0].BlockHash == currentHash) // <=
         {
            currentLevel.BlockInfos = currentLevel.BlockInfos
                                      .Where(bi => bi.BlockHash != currentHash)
                                      .ToArray();
            break;
         }
      }
   }
   ...
}

PVS-Studio warning: V3102 Suspicious access to element of 'currentLevel.BlockInfos' object by a constant index inside a loop. Nethermind.Blockchain BlockTree.cs 895

At first glance, the error is obvious - the loop is aimed at iterating over the elements currentLevel.BlockInfosbut when handling instead currentLevel.BlockInfos[i] have written currentLevel.BlockInfos[0]. Fix 0 on i and the mission is completed. But do not rush, let's figure it out.

Now we Length times we turn to Blockhash zero element. If it is equal currentHash, then we take from currentLevel.BlockInfos all elements that are not equal currentHash, write to it and exit the loop. It turns out that the cycle is superfluous.

I think that before there was an algorithm that we decided to change / optimize using linqbut something went wrong. Now, in the case when the condition is false, we get meaningless iterations.

By the way, if the developer who wrote this would use the mode incremental analysis, then he immediately realized that something would be wrong and quickly corrected everything. At the moment, I would rewrite the code like this:

private void DeleteBlocks(Keccak deletePointer)
{
    ...
    if (currentLevel.BlockInfos.Length == 1)
    {
        shouldRemoveLevel = true;
    }
    else
    {
        currentLevel.BlockInfos = currentLevel.BlockInfos
                                  .Where(bi => bi.BlockHash != currentHash)
                                  .ToArray();
    }
    ...
}

Dereferencing null reference

Dereferencing 1

public void Sign(Transaction tx, int chainId)
{
    if (_logger.IsDebug)
        _logger?.Debug($"Signing transaction: {tx.Value} to {tx.To}");
    IBasicWallet.Sign(this, tx, chainId);
}

PVS-Studio warning: V3095 The '_logger' object was used before it was verified against null. Check lines: 118, 118. Nethermind.Wallet DevKeyStoreWallet.cs 118

The error is in the wrong sequence. First comes the appeal to _logger.IsDebug and only after that there is a check _logger on the null. Accordingly, in the case when _logger is equal to nullwe get a type exception NullReferenceException.

Dereferencing 2

private void PopAction()
{
  if (IsTracingInstructions)
  {
      _currentVmTrace.VmTrace.Operations 
                      = _currentVmTrace.Ops.ToArray();        // <=
      _vmTraceStack.Pop();
      _currentVmTrace = _vmTraceStack.Count == 0 ? 
                        (null, null) : _vmTraceStack.Peek();
      _currentOperation = _currentVmTrace.Ops?.Last();        // <=
      _gasAlreadySetForCurrentOp = false;
      }
  }
  ...
}

V3095 The '_currentVmTrace.Ops' object was used before it was verified against null. Check lines: 218, 221. Nethermind.Evm ParityLikeTxTracer.cs 218

The same error as above. First comes the appeal _currentVmTrace.Ops.ToArray ()but only after _currentVmTrace.Ops checked for null.

Dereferencing 3

private void BuildNodeInfo()
{
    _nodeInfo = new NodeInfo();
    _nodeInfo.Name = ClientVersion.Description;
    _nodeInfo.Enode = _enode.Info;                           // <=
    byte[] publicKeyBytes = _enode?.PublicKey?.Bytes;        // <=
    _nodeInfo.Id = (publicKeyBytes == null ? Keccak.Zero :
                   Keccak.Compute(publicKeyBytes)).ToString(false);
    _nodeInfo.Ip = _enode?.HostIp?.ToString();
    _nodeInfo.ListenAddress = $"{_enode.HostIp}:{_enode.Port}";
    _nodeInfo.Ports.Discovery = _networkConfig.DiscoveryPort;
    _nodeInfo.Ports.Listener = _networkConfig.P2PPort;
    UpdateEthProtocolInfo();
}

PVS-Studio warning: V3095 The '_enode' object was used before it was verified against null. Check lines: 55, 56. Nethermind.JsonRpc AdminModule.cs 55

The error is completely similar to the one described above, only this time the culprit is _enode.

I want to add that if you forgot to check for null, then remember this only when your program crashes. The analyzer will remind you of this and everything will be fine.

Our favorite Copy-Paste

Case 1

public static bool Equals(ref UInt256 a, ref UInt256 b)
{
    return a.s0 == b.s0 && a.s1 == b.s1 && a.s2 == b.s2 && a.s2 == b.s2;
}

PVS-Studio warning: V3001 There are identical sub-expressions 'a.s2 == b.s2' to the left and to the right of the '&&' operator. Nethermind.Dirichlet.Numerics UInt256.cs 1154

Here the same condition is checked 2 times:

a.s2 == b.s2

Since the parameters a and b there is a field s3, then I assume that when you copy just forgot to change s2 on the s3.

It turns out that the parameters will be equal more often than expected by the author of the code. At the same time, some developers think that they do not do this and begin to look for a mistake in a completely different place, spending a lot of energy and nerves.

By the way, errors in comparison functions are generally a classic. Apparently, programmers, considering such functions simple, relate to their writing very casually and inattentively. Proof. Knowing this, now be careful :)!

Case 2

public async Task 
PublishBlockAsync(SignedBeaconBlock signedBlock,
                  CancellationToken cancellationToken)
{
    bool acceptedLocally = false;
    ...
    if (acceptedLocally)
    {
        return new ApiResponse(StatusCode.Success);
    }
    else
    {
        return new ApiResponse(StatusCode.Success);
    }
    ...
}

PVS-Studio warning: V3004 The 'then' statement is equivalent to the 'else' statement. Nethermind.BeaconNode BeaconNodeFacade.cs 177

For any value of the variable acceptedLocally the method returns the same thing. It is difficult to say whether it is a mistake or not. Suppose a programmer copied a line and forgot to change StatusCode.Success for something else, then this is a real mistake. Especially since Statuscode is available InternalError and InvalidRequest. But perhaps the code refactoring is to blame and we don’t care about the value anyway acceptedLocally, in this case, the condition becomes the place that makes you sit and think whether it is a mistake or not. So in any case, this case is extremely unpleasant.

Case 3

public void TearDown()
{
    ...
    foreach (var testResult in _results)
    {
         string message = $"{testResult.Order}. {testResult.Name} has " 
               + $"{(testResult.Passed ? "passed [+]" : "failed [-]")}";
         if (testResult.Passed)
         {
               TestContext.WriteLine(message);
         }
         else
         {
               TestContext.WriteLine(message);
         }
     }
}

PVS-Studio warning: V3004 The 'then' statement is equivalent to the 'else' statement. Nethermind.Overseer.Test TestBuilder.cs 46

And again, we do not care for verification, because as a result we get the same thing. And again we sit and suffer, thinking, but what was the developer wanting to write here. A waste of time that could have been avoided by using static analysis and immediately fixing such ambiguous code.

Case 4

public void Setup()
{
    if (_decoderBuffer.ReadableBytes > 0)
    {
        throw new Exception("decoder buffer");
    }

    if (_decoderBuffer.ReadableBytes > 0)
    {
        throw new Exception("decoder buffer");
    }
    ...
}

PVS-Studio warning: V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless Nethermind.Network.Benchmark InFlowBenchmarks.cs 55

Just clicked once again Ctrl + V. We remove the excess check and everything is fine. I’m sure that if another condition were important here, then everyone would write in one if through the logical operator I.

Case 5

private void LogBlockAuthorNicely(Block block, ISyncPeer syncPeer)
{
    if (_logger.IsInfo)
    {
        if (_logger.IsInfo)
        {
            ...
        }
    }
}

PVS-Studio warning: V3030 Recurring check. The '_logger.IsInfo' condition was already verified in line 242. Nethermind.Synchronization SyncServer.cs 244

As in the fourth case, an extra check is performed. However, the difference is that _logger there is not only one property, but also for example 'bool IsError {get; }'. Therefore, the code should probably look like this:

private void LogBlockAuthorNicely(Block block, ISyncPeer syncPeer)
{
    if (_logger.IsInfo)
    {
        if (!_logger.IsError) // <=
        {
            ...
        }
    }
}

Well, or all the fault of the code refactoring and just one check is no longer needed.

Case 6

if (missingParamsCount != 0)
{
    bool incorrectParametersCount = missingParamsCount != 0; // <=
    if (missingParamsCount > 0)
    {
        ...
    }
    ...
}

PVS-Studio warning: V3022 Expression 'missingParamsCount! = 0' is always true. Nethermind.JsonRpc JsonRpcService.cs 127

We check the condition (missingParamsCount! = 0) and if it is true, then we calculate it again and assign the result to the variable. Agree that this is a fairly original way to write true.

Confusing check

public async Task 
DownloadHeaders(PeerInfo bestPeer, 
                BlocksRequest blocksRequest, 
                CancellationToken cancellation)
{
  ...
  for (int i = 1; i < headers.Length; i++)
  {
    ...
    BlockHeader currentHeader = headers[i];
    ...
    bool isValid = i > 1 ? 
        _blockValidator.ValidateHeader(currentHeader, headers[i - 1], false):
        _blockValidator.ValidateHeader(currentHeader, false);
    ...
    if (HandleAddResult(bestPeer, 
                        currentHeader, 
                        i == 0,                              // <=
                        _blockTree.Insert(currentHeader))) 
    {
       headersSynced++;
    }

    ...
  }
  ...
}

PVS-Studio warning: V3022 Expression 'i == 0' is always false. Nethermind.Synchronization BlockDownloader.cs 192

Let's start in order. When initializing a variable i the value is assigned 1. Next, the variable is only incremented, therefore, the value will always be passed to the function false.

Now let's look at HandleAddResult:

private bool HandleAddResult(PeerInfo peerInfo, 
                             BlockHeader block,
                             bool isFirstInBatch, 
                             AddBlockResult addResult)
{
    ...
    if (isFirstInBatch)
    {
        ...
    }
    else
    {
        ...
    }
    ...
}

Here we are interested isFirstInBatch. Judging by the name of this parameter, it is responsible for whether something is the first in the party. Hm, first. Let's look again above and see that there are 2 hits using i:

BlockHeader currentHeader = headers[i];
_blockValidator.ValidateHeader(currentHeader, headers[i - 1], false)

Do not forget that the countdown in this case goes from 1. It turns out that we have 2 options: either “first” means an element under the index 1, or under the index 0. But in any case i will be equal to 1.

It turns out that the function call should look like this:

HandleAddResult(bestPeer, currentHeader, 
                i == 1, _blockTree.Insert(currentHeader))

Or like this:

HandleAddResult(bestPeer, currentHeader, 
                i - 1 == 0, _blockTree.Insert(currentHeader))

And again, if a developer constantly used a static analyzer, then he would write this code and see a warning, would quickly fix it and enjoy life.

A priority ??

Situation 1

public int MemorySize
{
  get
  {
    int unaligned = (Keccak == null ? MemorySizes.RefSize : 
        MemorySizes.RefSize + Keccak.MemorySize) 
        + (MemorySizes.RefSize + FullRlp?.Length 
                                 ?? MemorySizes.ArrayOverhead)   // <=
        + (MemorySizes.RefSize + _rlpStream?.MemorySize 
                                 ?? MemorySizes.RefSize)         // <=
        + MemorySizes.RefSize + (MemorySizes.ArrayOverhead + _data?.Length 
        * MemorySizes.RefSize ?? MemorySizes.ArrayOverhead) 
        + MemorySizes.SmallObjectOverhead + (Key?.MemorySize ?? 0);
    return MemorySizes.Align(unaligned);
  }
}

PVS-Studio warnings:

  • V3123 Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. Nethermind.Trie TrieNode.cs 43
  • V3123 Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. Nethermind.Trie TrieNode.cs 44

The analyzer advises you to check how we use the "??" operators, and to understand what the problem is, I propose to consider the following situation. We look at this line here:

(MemorySizes.RefSize + FullRlp?.Length ?? MemorySizes.ArrayOverhead)

MemorySizes.RefSize and MemorySizes.ArrayOverhead are constants:

public static class MemorySizes
{
    ...
    public const int RefSize = 8;
    public const int ArrayOverhead = 20;
    ...
}

Therefore, for clarity, I propose to rewrite the string, substituting their values:

(8 + FullRlp?.Length ?? 20)

Now suppose that Fullrlp will be equal null Then (8 + null) will be null. Next we get the expression (null ?? twenty), which will return 20.

It turns out that provided that when Fullrlp equally null, will always return a value from MemorySizes.ArrayOverhead no matter what is stored in MemorySizes.RefSize. The fragment on the line below is similar.

But the question is, did the developer want this behavior? Let's look at the following line:

MemorySizes.RefSize + (MemorySizes.ArrayOverhead 
    + _data?.Length * MemorySizes.RefSize ?? MemorySizes.ArrayOverhead) 

Here, as in the sections considered above MemorySizes.RefSize add up with expression but

note that after the first operator "+" is a bracket. It turns out that to MemorySizes.RefSize we must add some expression, and if it is equal nullthen add another. So the code should look like this:

public int MemorySize
{
  get
  {
    int unaligned = (Keccak == null ? MemorySizes.RefSize : 
       MemorySizes.RefSize + Keccak.MemorySize) 
       + (MemorySizes.RefSize + (FullRlp?.Length 
                                 ?? MemorySizes.ArrayOverhead))    // <=
       + (MemorySizes.RefSize + (_rlpStream?.MemorySize 
                                 ?? MemorySizes.RefSize))          // <=
       + MemorySizes.RefSize + (MemorySizes.ArrayOverhead + _data?.Length 
       * MemorySizes.RefSize ?? MemorySizes.ArrayOverhead) 
       + MemorySizes.SmallObjectOverhead + (Key?.MemorySize ?? 0);
    return MemorySizes.Align(unaligned);
  }
}

Again, this is only an assumption, however, if the developer wanted a different behavior, then you should explicitly indicate this:

((MemorySizes.RefSize + FullRlp?.Length) ?? MemorySizes.ArrayOverhead)

And then, the one who reads this code would not have to delve into for a long time what is happening here and what the programmer wanted.

Situation 2

private async Task 
ExecuteAsync(JsonRpcRequest request, 
             string methodName,
             (MethodInfo Info, bool ReadOnly) method)
{
    var expectedParameters = method.Info.GetParameters();
    var providedParameters = request.Params;
    ...
    int missingParamsCount = expectedParameters.Length 
            - (providedParameters?.Length ?? 0) 
            + providedParameters?.Count(string.IsNullOrWhiteSpace) ?? 0; // <=
    if (missingParamsCount != 0)
    {
        ...
    }
    ...
}

PVS-Studio warning: V3123 Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. Nethermind.JsonRpc JsonRpcService.cs 123

And again, the priority of operation "??", therefore, as in the last time, consider the situation. We look at this line here:

expectedParameters.Length 
            - (providedParameters?.Length ?? 0) 
            + providedParameters?.Count(string.IsNullOrWhiteSpace) ?? 0;

Assume that providedParameters equally nullthen, for clarity, we immediately replace everything related to providedParameters on the null but instead expectedParameters.Length substitute a random value:

100 - (null ?? 0) + null ?? 0;

Now it’s immediately noticeable that there are two similar checks, only in one case there are brackets, and in the other there. Let's run this example. First we get that (null ?? 0) will return 0, then subtract 0 from 100 and get 100:

100 + null ?? 0;

Now, instead of executing "null ?? 0"and eventually get (100 + 0), we get a completely different one.

First run (100 + null) and we get null. Then checked (null ?? 0), which causes the missingParamsCount variable to be 0.

Since then comes the condition checking whether missingParamsCount zero, we can assume that the developer sought exactly this behavior. And I’ll say, why not put brackets and express my thoughts explicitly? In general, it is possible that this check was due to a misunderstanding of why 0 is sometimes returned, and this is nothing more than a crutch.

And again, we are wasting time, although we might not have done it, use the developer when writing code incremental analysis.

Conclusion

In conclusion, I hope that I was able to convey to you that the static analyzer is your friend, and not an evil overseer who is just waiting for you to make a mistake.

It should also be noted that by using the analyzer once or rarely using it, you will of course find errors and some of them will even be quickly fixed, but there will also be ones that you need to smash your head at. Therefore, you need to use a static analyzer constantly. Then you will find many more errors and correct them at the moment when the program code is written and you understand exactly what you are trying to do.

The simple truth is that everyone makes mistakes and that’s normal. We all learn from mistakes, but only from those that were able to notice and understand. Therefore, use modern tools to search for these very errors, for example - PVS-Studio. Thank you for the attention.

If you want to share this article with an English-speaking audience, then please use the link to the translation: Nikolay Mironov. Single line code or check of Nethermind using PVS-Studio C # for Linux.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *