The most beautiful bug I found using PVS-Studio in 2024

Beautiful bug

I'll warn you right away, my tastes are very specific. The beauty of a bug is that it is very difficult for humans to find. I don't believe it can be seen when reviewing code. If only you know in advance that it exists and look for it purposefully.

I found an error in the project DPDK. There are other errors in it, but more about them later. They pale in comparison to this diamond. Just don't expect anything like that. The error is simple as hell. It’s just that it’s not easy to find it by looking through the code. Actually, try it yourself.

To do this, first examine the list of named constants:

enum dbg_status

enum dbg_status {
  DBG_STATUS_OK,
  DBG_STATUS_APP_VERSION_NOT_SET,
  DBG_STATUS_UNSUPPORTED_APP_VERSION,
  DBG_STATUS_DBG_BLOCK_NOT_RESET,
  DBG_STATUS_INVALID_ARGS,
  DBG_STATUS_OUTPUT_ALREADY_SET,
  DBG_STATUS_INVALID_PCI_BUF_SIZE,
  DBG_STATUS_PCI_BUF_ALLOC_FAILED,
  DBG_STATUS_PCI_BUF_NOT_ALLOCATED,
  DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS,
  DBG_STATUS_NO_MATCHING_FRAMING_MODE,
  DBG_STATUS_VFC_READ_ERROR,
  DBG_STATUS_STORM_ALREADY_ENABLED,
  DBG_STATUS_STORM_NOT_ENABLED,
  DBG_STATUS_BLOCK_ALREADY_ENABLED,
  DBG_STATUS_BLOCK_NOT_ENABLED,
  DBG_STATUS_NO_INPUT_ENABLED,
  DBG_STATUS_NO_FILTER_TRIGGER_256B,
  DBG_STATUS_FILTER_ALREADY_ENABLED,
  DBG_STATUS_TRIGGER_ALREADY_ENABLED,
  DBG_STATUS_TRIGGER_NOT_ENABLED,
  DBG_STATUS_CANT_ADD_CONSTRAINT,
  DBG_STATUS_TOO_MANY_TRIGGER_STATES,
  DBG_STATUS_TOO_MANY_CONSTRAINTS,
  DBG_STATUS_RECORDING_NOT_STARTED,
  DBG_STATUS_DATA_DIDNT_TRIGGER,
  DBG_STATUS_NO_DATA_RECORDED,
  DBG_STATUS_DUMP_BUF_TOO_SMALL,
  DBG_STATUS_DUMP_NOT_CHUNK_ALIGNED,
  DBG_STATUS_UNKNOWN_CHIP,
  DBG_STATUS_VIRT_MEM_ALLOC_FAILED,
  DBG_STATUS_BLOCK_IN_RESET,
  DBG_STATUS_INVALID_TRACE_SIGNATURE,
  DBG_STATUS_INVALID_NVRAM_BUNDLE,
  DBG_STATUS_NVRAM_GET_IMAGE_FAILED,
  DBG_STATUS_NON_ALIGNED_NVRAM_IMAGE,
  DBG_STATUS_NVRAM_READ_FAILED,
  DBG_STATUS_IDLE_CHK_PARSE_FAILED,
  DBG_STATUS_MCP_TRACE_BAD_DATA,
  DBG_STATUS_MCP_TRACE_NO_META,
  DBG_STATUS_MCP_COULD_NOT_HALT,
  DBG_STATUS_MCP_COULD_NOT_RESUME,
  DBG_STATUS_RESERVED0,
  DBG_STATUS_SEMI_FIFO_NOT_EMPTY,
  DBG_STATUS_IGU_FIFO_BAD_DATA,
  DBG_STATUS_MCP_COULD_NOT_MASK_PRTY,
  DBG_STATUS_FW_ASSERTS_PARSE_FAILED,
  DBG_STATUS_REG_FIFO_BAD_DATA,
  DBG_STATUS_PROTECTION_OVERRIDE_BAD_DATA,
  DBG_STATUS_DBG_ARRAY_NOT_SET,
  DBG_STATUS_RESERVED1,
  DBG_STATUS_NON_MATCHING_LINES,
  DBG_STATUS_INSUFFICIENT_HW_IDS,
  DBG_STATUS_DBG_BUS_IN_USE,
  DBG_STATUS_INVALID_STORM_DBG_MODE,
  DBG_STATUS_OTHER_ENGINE_BB_ONLY,
  DBG_STATUS_FILTER_SINGLE_HW_ID,
  DBG_STATUS_TRIGGER_SINGLE_HW_ID,
  DBG_STATUS_MISSING_TRIGGER_STATE_STORM,
  MAX_DBG_STATUS
};

Now examine the array of strings:

static const char * const s_status_str[] = ….

static const char * const s_status_str[] = {
  /* DBG_STATUS_OK */
  "Operation completed successfully",

  /* DBG_STATUS_APP_VERSION_NOT_SET */
  "Debug application version wasn't set",

  /* DBG_STATUS_UNSUPPORTED_APP_VERSION */
  "Unsupported debug application version",

  /* DBG_STATUS_DBG_BLOCK_NOT_RESET */
  "The debug block wasn't reset since the last recording",

  /* DBG_STATUS_INVALID_ARGS */
  "Invalid arguments",

  /* DBG_STATUS_OUTPUT_ALREADY_SET */
  "The debug output was already set",

  /* DBG_STATUS_INVALID_PCI_BUF_SIZE */
  "Invalid PCI buffer size",

  /* DBG_STATUS_PCI_BUF_ALLOC_FAILED */
  "PCI buffer allocation failed",

  /* DBG_STATUS_PCI_BUF_NOT_ALLOCATED */
  "A PCI buffer wasn't allocated",

  /* DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS */
  "The filter/trigger constraint dword offsets are not "
  "enabled for recording",

  /* DBG_STATUS_VFC_READ_ERROR */
  "Error reading from VFC",

  /* DBG_STATUS_STORM_ALREADY_ENABLED */
  "The Storm was already enabled",

  /* DBG_STATUS_STORM_NOT_ENABLED */
  "The specified Storm wasn't enabled",

  /* DBG_STATUS_BLOCK_ALREADY_ENABLED */
  "The block was already enabled",

  /* DBG_STATUS_BLOCK_NOT_ENABLED */
  "The specified block wasn't enabled",

  /* DBG_STATUS_NO_INPUT_ENABLED */
  "No input was enabled for recording",

  /* DBG_STATUS_NO_FILTER_TRIGGER_256B */
  "Filters and triggers are not allowed in E4 256-bit mode",

  /* DBG_STATUS_FILTER_ALREADY_ENABLED */
  "The filter was already enabled",

  /* DBG_STATUS_TRIGGER_ALREADY_ENABLED */
  "The trigger was already enabled",

  /* DBG_STATUS_TRIGGER_NOT_ENABLED */
  "The trigger wasn't enabled",

  /* DBG_STATUS_CANT_ADD_CONSTRAINT */
  "A constraint can be added only after a filter was "
  "enabled or a trigger state was added",

  /* DBG_STATUS_TOO_MANY_TRIGGER_STATES */
  "Cannot add more than 3 trigger states",

  /* DBG_STATUS_TOO_MANY_CONSTRAINTS */
  "Cannot add more than 4 constraints per filter or trigger state",

  /* DBG_STATUS_RECORDING_NOT_STARTED */
  "The recording wasn't started",

  /* DBG_STATUS_DATA_DID_NOT_TRIGGER */
  "A trigger was configured, but it didn't trigger",

  /* DBG_STATUS_NO_DATA_RECORDED */
  "No data was recorded",

  /* DBG_STATUS_DUMP_BUF_TOO_SMALL */
  "Dump buffer is too small",

  /* DBG_STATUS_DUMP_NOT_CHUNK_ALIGNED */
  "Dumped data is not aligned to chunks",

  /* DBG_STATUS_UNKNOWN_CHIP */
  "Unknown chip",

  /* DBG_STATUS_VIRT_MEM_ALLOC_FAILED */
  "Failed allocating virtual memory",

  /* DBG_STATUS_BLOCK_IN_RESET */
  "The input block is in reset",

  /* DBG_STATUS_INVALID_TRACE_SIGNATURE */
  "Invalid MCP trace signature found in NVRAM",

  /* DBG_STATUS_INVALID_NVRAM_BUNDLE */
  "Invalid bundle ID found in NVRAM",

  /* DBG_STATUS_NVRAM_GET_IMAGE_FAILED */
  "Failed getting NVRAM image",

  /* DBG_STATUS_NON_ALIGNED_NVRAM_IMAGE */
  "NVRAM image is not dword-aligned",

  /* DBG_STATUS_NVRAM_READ_FAILED */
  "Failed reading from NVRAM",

  /* DBG_STATUS_IDLE_CHK_PARSE_FAILED */
  "Idle check parsing failed",

  /* DBG_STATUS_MCP_TRACE_BAD_DATA */
  "MCP Trace data is corrupt",

  /* DBG_STATUS_MCP_TRACE_NO_META */
  "Dump doesn't contain meta data - it must be provided in image file",

  /* DBG_STATUS_MCP_COULD_NOT_HALT */
  "Failed to halt MCP",

  /* DBG_STATUS_MCP_COULD_NOT_RESUME */
  "Failed to resume MCP after halt",

  /* DBG_STATUS_RESERVED0 */
  "",

  /* DBG_STATUS_SEMI_FIFO_NOT_EMPTY */
  "Failed to empty SEMI sync FIFO",

  /* DBG_STATUS_IGU_FIFO_BAD_DATA */
  "IGU FIFO data is corrupt",

  /* DBG_STATUS_MCP_COULD_NOT_MASK_PRTY */
  "MCP failed to mask parities",

  /* DBG_STATUS_FW_ASSERTS_PARSE_FAILED */
  "FW Asserts parsing failed",

  /* DBG_STATUS_REG_FIFO_BAD_DATA */
  "GRC FIFO data is corrupt",

  /* DBG_STATUS_PROTECTION_OVERRIDE_BAD_DATA */
  "Protection Override data is corrupt",

  /* DBG_STATUS_DBG_ARRAY_NOT_SET */
  "Debug arrays were not set "
  "(when using binary files, dbg_set_bin_ptr must be called)",

  /* DBG_STATUS_RESERVED1 */
  "",

  /* DBG_STATUS_NON_MATCHING_LINES */
  "Non-matching debug lines - in E4, all lines must be of "
  "the same type (either 128b or 256b)",

  /* DBG_STATUS_INSUFFICIENT_HW_IDS */
  "Insufficient HW IDs. Try to record less Storms/blocks",

  /* DBG_STATUS_DBG_BUS_IN_USE */
  "The debug bus is in use",

  /* DBG_STATUS_INVALID_STORM_DBG_MODE */
  "The storm debug mode is not supported in the current chip",

  /* DBG_STATUS_OTHER_ENGINE_BB_ONLY */
  "Other engine is supported only in BB",

  /* DBG_STATUS_FILTER_SINGLE_HW_ID */
  "The configured filter mode requires a single Storm/block input",

  /* DBG_STATUS_TRIGGER_SINGLE_HW_ID */
  "The configured filter mode requires that all the constraints of a "
  "single trigger state will be defined on a single Storm/block input",

  /* DBG_STATUS_MISSING_TRIGGER_STATE_STORM */
  "When triggering on Storm data, the Storm to trigger on must be specified"
};

The last one is the code that turns the named constant into a string:

const char *qed_dbg_get_status_str(enum dbg_status status)
{
  return (status < MAX_DBG_STATUS) ?
    s_status_str[status] : "Invalid debug status";
}

Where is the mistake?

Where is the bug?

Found it?

In fact, there is a hint that you can theoretically cling to. In an array, all rows are separated by one empty line, except for one place. But we’ll start with a warning from the PVS-Studio code analyzer:

V557 Array overrun is possible. The value of 'status' index could reach 58. qede_debug.c 7149

It points to the place where a pointer to a string is retrieved from the array by index:

return (status < MAX_DBG_STATUS) ?
  s_status_str[status] : "Invalid debug status";

At first I made the wrong boring diagnosis. Inadvertently, I thought that in front of me off-by-one error. This is very similar to a typical incorrect check that I have seen many times in different projects.

The essence of the classic antipattern. Eat enumwhose last element is used as the number of elements in it.

enum Efoo {
  A,
  B,
  C,
  COUNT
};

Constants in enum values ​​are assigned starting from 0. Accordingly, the auxiliary constant COUNT will be equal to 3, which corresponds to the number of main named constants.

There is some kind of array where each named constant corresponds to something:

char Cfoo[] = { 'A', 'B', 'C' };

A common bug is when they check that the index does not exceed the array boundary:

char Convert(unsigned id)
{
  return (id > COUNT) ? 0 : Cfoo[id];
}

The check does not work correctly: if id turns out to be equal COUNTthen the array will go beyond the boundary. For such errors, the PVS-Studio analyzer issues a warning similar to the one discussed earlier.

One of the following options will be correct:

return (id >= COUNT) ? 0 : Cfoo[id];  // OK
return (id < COUNT) ? Cfoo[id] : 0;   // OK

Nothing interesting. Let's go our separate ways. You can only include the warning itself in the article along with other out-of-bounds errors, but the bug itself cannot be discussed in the article about checking DPDK.

And then at the last moment – STOP!

HERE IS THE CORRECT CHECK WRITTEN!

return (status < MAX_DBG_STATUS) ?
  s_status_str[status] : "Invalid debug status";

Now it’s unclear and interesting! Why did the analyzer issue a warning? False positive? This is unlikely, there is no place for him to get confused here.

Rolling up my sleeves, I begin to carefully study the code. Here it is! Missing line for constant DBG_STATUS_NO_MATCHING_FRAMING_MODE.

In enum:

DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS,
DBG_STATUS_NO_MATCHING_FRAMING_MODE,
DBG_STATUS_VFC_READ_ERROR,

In the array:

/* DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS */
"The filter/trigger constraint dword offsets are not "
"enabled for recording",

/* DBG_STATUS_VFC_READ_ERROR */
"Error reading from VFC",

Pay attention to the separator consisting of two empty lines. I mentioned this artifact before. Something went wrong here. Perhaps there was some kind of failure when filling the array, perhaps a line was accidentally deleted, perhaps an unsuccessful branch merge, or something else.

However, now that a bug has been found, two empty lines look suspicious. When reviewing, no one will pay attention to them. And even if they accidentally notice it, they will simply remove one of them for beauty. No one will say, “Well, I guess we missed something here, let’s go check it out” 🙂

As a result of this error:

  • it is possible to go beyond the array boundary;
  • the function returns incorrect strings for all constants starting with DBG_STATUS_NO_MATCHING_FRAMING_MODE.

In my opinion, it's beautiful. Thank you for your attention. Check your code regularly using PVS-Studio.

PS

A colleague suggested adding to the article that such an error can be avoided by using compile-time checks. Code examples with static_assert for C and C++. For C++ it turns out to be more dexterous with calculating the sizes of arrays, but that’s another story story. Perhaps adding such checks everywhere is overkill. But for such large arrays, why not.

If you want to share this article with an English-speaking audience, please use the translation link: Andrey Karpov. Most striking error I found with PVS-Studio in 2024.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *