Will Arabella sink?

Recently, the network has news that the source code of the game “The Adventures of Captain Blood” has been opened. We couldn’t pass by and checked its quality with PVS-Studio. Will the legendary ship of Captain Blood sink from the bugs found? Let’s find out!

about the project

The Adventures of Captain Blood game was created by the 1C Sea Wolf studio based on the books by Rafael Sabatini. It tells about the adventures of the protagonist of these works, Captain Peter Blood. The game is set in medieval New England. The main character, a military doctor by education, was unfairly convicted and exiled to the island of Barbados. He manages to escape from the island, after which, having captured a Spanish ship, he becomes the most famous pirate of that time.

The project was completed in 2010, but never saw the light of day. This is because 1C and Playlogic could not decide who owns the rights to the game. Then there was news that the final assembly of the game was irretrievably lost. However, there has been some news recently that this is not the case. There was a video with the passage on YouTube and a working build of the game on one well-known tracker. Later posted on GitHub sources licensed under GPLv3.

So, after we got acquainted with the project, I propose to see what interesting errors we managed to find in it.

Test Results

Warning N1

V1075 The function expects the file to be opened for writing, but it was opened for reading. Debugger.cpp 172

We can congratulate my colleague Sergey Larin on the first trophy triggering of his diagnostic rule, released in the PVS-Studio 7.15 release. This diagnostic rule complains about writing to files that are open for reading, and vice versa. Honestly, we didn’t even think it would ever work. Here is the problematic function in full:

void appDebuger::CopyDump(const char * src, DWORD from, const char * dst)
{
  FILE * file = fopen(src, "rb");
  if(file)
  {
    fseek(file, 0, SEEK_END);
    DWORD size = ftell(file);
    fseek(file, from, SEEK_SET);
    if(size > from)
    {
      FILE * to = fopen(dst, "a+b");
      if(to)
      {
        char buf[128];
        while(from < size)
        {          
          DWORD s = size - from;
          if(s > sizeof(buf)) s = sizeof(buf);
          memset(buf, ' ', sizeof(buf));
          fread(buf, s, 1, file);
          fwrite(buf, s, 1, file);            // <=
          from += s;
        }
        fclose(to);
      }
    }
    fclose(file);
  }
}

So, the function should read data from one file along the path srcstarting from position* from*, and write them to another file along the path dst. The variable is responsible for the source file filefor the resulting file variable to.

Unfortunately, the code reads from a variable file and writing to a variable file. You can correct the code like this:

fwrite(buf, s, 1, to);

Warning N2

V1065 [CWE-682] Expression can be simplified, check ‘bi’ and similar operands. PreviewAnimation.cpp 146

PreviewAnimation::PreviewAnimation(....)
{
  // ....
  const int bw = 30;
  const int bh = 30;
  const int bi = 3;
  const int cn = 9;

  bar.r.w = 7 + (bw + bi)*cn + 7 - bi + 1 + (bw + bi) + 7 - bi + 1;
  // ....
}

And this is one of my favorite diagnostic rules! It tells us that in the presented expression, you can reduce the variable bi:

bar.r.w = 7 + (bw + bi)*cn + 7 - bi + 1 + bw + 7 + 1;

Possibly in place of one of bi there must be some other variable. Or is the expression (bw+bi) forgot to multiply by something. I wonder how the animations in the game would change after that?

Also, I note that giving similar names to constants: bw, bh, bi, cn is bad practice. This confuses the code for the person reading it. It would be much better to name the variables more meaningfully, or at least accompany their definitions with explanatory comments.

Warning N3

V570 [CWE-480] The ‘ldMaxChars’ variable is assigned to itself. StringAttr.cpp 198

void StringAttribute::LoadFromFile (....)
{
  int ldMinChars, ldMaxChars;
  // ....
  minChars = ldMinChars;
  ldMaxChars = ldMaxChars;
  // ....
}

class StringAttribute : public BaseAttribute
{
  int minChars;
  int maxChars;
}

Here we see the classic assignment of a variable to itself. To see what should be in place of the variable to which the assignment takes place, look at the class definition StringAttribute. It shows a similar field maxChars. It is obvious that the programmer made a typo and actually should have assigned a value to the variable ldMaxChars. Corrected code snippet:

void StringAttribute::LoadFromFile (....)
{
  int ldMinChars, ldMaxChars;
  // ....
  minChars = ldMinChars;
  maxChars = ldMaxChars;
  // ....
}

Warning N4

V781 [CWE-20, CERT-API00-C] The value of the ‘start’ index is checked after it was used. Perhaps there is a mistake in program logic. coreloader.h 136

__forceinline void GetBootParams(const char * bootIni)
{
  long start = 0;
  // ....
  if (bootIni[start] && start > 0)
  {
    // ....
  }
  // ....
}

In this code snippet, the programmer first used the value start for array indexing bootIni, and only then checked it for greater than* 0*. The check may be meaningless, then you should remove it. It may also make sense, then it should be swapped with the operation of taking the value by index.

if (start > 0 && bootIni[start])
{
  // ....
}

Otherwise, the programmer risks dropping the game. By the way, there were quite a lot of such positives in the program. Here are a few to look out for first:

  • V781 [CWE-20, CERT-API00-C] The value of the ‘i’ index is checked after it was used. Perhaps there is a mistake in program logic. tinyxml.h 352

  • V781 [CWE-20, CERT-API00-C] The value of the ‘start’ index is checked after it was used. Perhaps there is a mistake in program logic. coreloader.h 136

  • V781 [CWE-20, CERT-API00-C] The value of the ‘pathLen’ index is checked after it was used. Perhaps there is a mistake in program logic. winworkpath.h 50

  • ….

Warning N5

V700 [CWE-480] Consider inspecting the ‘mView = mView = chr.Render().GetView()’ expression. It is odd that variable is initialized through itself. CharacterItems.cpp 705

void CharacterItems::DrawFlares(float dltTime, const Matrix & toWorld)
{
  // ....
  Matrix mView = mView = chr.Render().GetView();
  // ....
}

Strange code with assigning a value to a variable to itself. Most likely there was a typo. In any case, the redundant assignment can be removed:

void CharacterItems::DrawFlares(float dltTime, const Matrix & toWorld)
{
  // ....
  Matrix mView = chr.Render().GetView();
  // ....
}

Warning N6

V654 [CWE-834] The condition ‘n >= 0’ of loop is always true. resourceselect.cpp 42

typedef unsigned int dword;

TResourceSelectorWindow::TResourceSelectorWindow ()
{
  string PakPath;
  // ....
  miss->EditorGetPackPath(PakPath);
  //....
  for (dword n = PakPath.Size()-1; n >= 0; n--)
  {
    if (PakPath[n] == '\\')
    {
      PakPath.Delete(0, n+1);
      break;
    }
  }

  // ....
}

Here we see a potentially infinite loop if there is no character in the string \. Variable n unsigned type, which means it will always be greater than or equal to 0.

But in fact, there will be no eternal loop, because when a variable overflow occurs n a crash will occur. Overflowing an unsigned variable does not lead to undefined behavior, but access outside the array will. In practice, there is likely to be Access Violation.

Here the programmer wrote an algorithm that looks for the first occurrence of a character \ from the end and removes the substring from the beginning up to and including this character. The most correct fix would be to use rbegin And rendbut there is a nuance – this is a self-written line (analogue std::string with additional functionality), in which, unfortunately, there are no iterators. Then, to correct the code, we can take a signed variable with a dimension 2 times larger (int64_t) and loop through it.

for (int64_t n = PakPath.Size() - 1; n >= 0; n--)
{
  if (PakPath[n] == '\\')
  {
    PakPath.Delete(0, n+1);
    break;
  }
}

But this method will no longer work when porting an application to a 64-bit system, in which the function size should already return a value of type uint64_t. And then iterators would be a good way to write correct code. For example, this is how it would look in C++14 if the self-written string still had no iterators, but at least there were functions for accessing the underlying buffer:

auto r_begin = std::make_reverse_iterator(PakPath.GetBuffer() + PakPath.Size());
auto   r_end = std::make_reverse_iterator(PakPath.GetBuffer());

while (r_begin != r_end )
{
  if (*r_begin == '\\')
  {
    PakPath.Delete(0, std::distance(r_begin, r_end));
    break;
  }

  ++r_begin;
}

Warning N7

V611 [CWE-762, CERT-MEM51-CPP] The memory was allocated using ‘new T[]’ operator but was released using the ‘delete’ operator. Consider inspecting this code. It’s probably better to use ‘delete’ [] ldName;’. ColorAttr.cpp 198

void ColorAttribute::LoadFromFile (....)
{
  DWORD slen = 0;
  // ....
  char* ldName = NEW char[slen+1];
  // ....
  delete ldName;
}

Here the developer cleared the dynamically allocated memory, however, he did it wrong. Instead of calling the operator delete[] to free previously allocated memory using the operator new[]he called a simple delete. Correct option:

void ColorAttribute::LoadFromFile (....)
{
  DWORD slen = 0;
  // ....
  char* ldName = NEW char[slen+1];
  // ....
  delete ldName[];
}

The problem of such errors is covered in more detail in the article of my colleague “Why in C++ arrays need to be deleted via delete[]“.

Warning N8

V607 Ownerless expression ‘isSetPause ? 10’. SoundsEngine.cpp 741

void SoundsEngine::SetPause(bool isSetPause)
{
  //....
  isSetPause ? 1 : 0;
  //....
}

The analyzer points to dead code. As I noticed in the class SoundsEngine contains a similar field isPausehowever, it has type long. It is likely that earlier the code did the parameter conversion isSetPause from type bool V long, but then it was refactored. The line is harmless, and it can be completely removed.

Warning N9

V583 [CWE-783] The ‘?:’ operator, regardless of its conditional expression, always returns one and the same value: 0xffff0000. NoFatality.cpp 63

void _cdecl NoFatality::Draw(float dltTime, long level)
{
  //....
  for (....)
  {
    Render().DrawBox(...., IsActive() ? 0xffff0000 : 0xffff0000);
  }
  //....
}

Here we see that the second and third operands of the ternary operator are equal. It turns out that the second argument of the function drawbox will always be a constant 0xffff0000. It is most likely that another value should have been passed in one of the branches, otherwise the code can be simplified.

Warning N10

V523 [CWE-691] The ‘then’ statement is equivalent to the ‘else’ statement. TaskViewer.cpp 298

void TaskViewer::SetText(const char * rawText)
{
  // ....
  if (rawText[0] == '#')
  {
    // Подменим идентификатор на реальную строку
    // потом
    str = rawText;
  }
  else
  {
    str = rawText;
  }
  // ....
}

In this code fragment, the analyzer noted that the operator if identical branches then And else. It turns out that regardless of the condition, the same piece of code will be executed. Perhaps this is not the behavior the programmer intended.

N11 warnings

V614 [CWE-457, CERT-EXP53-CPP] Uninitialized variable ‘color.c’ used. Color.h 1268

mathinline dword mathcall Color::GetDword() const
{
  DColor color;
  color.r = (byte)(r * 255.0f);
  color.g = (byte)(g * 255.0f);
  color.b = (byte)(b * 255.0f);
  color.a = (byte)(a * 255.0f);

  return color.c;
}

Let’s see what is dColor:

class DColor
{
public:

  union
  {
#ifndef _XBOX
  struct
  {
    ///Синий
    unsigned char b;
    ///Зелённый
    unsigned char g;
    ///Красный
    unsigned char r;
    ///Прозрачность
    unsigned char a;
  };
#else
  struct
    {
    ///Прозрачность
    unsigned char a;
    ///Красный
    unsigned char r;
    ///Зелённый
    unsigned char g;
    ///Синий
    unsigned char b;
  };
#endif
  union
  {
    ///Упакованный цвет
    dword c;
    ///Упакованный цвет
    dword color;
  };
};

In the code, the four RGBA components are written byte-by-byte to the desired fields, and then returned from the function as a value of type dword (4-byte unsigned type). In C, such code is harmless and the programmer would get what he wanted. In C++ reading from an inactive field union leads to undefined behavior. The reason for this is that C++ operates on objects whose lifetime must start. Only one field can be active at any given time. For an inactive field (last union) it didn’t start.

Some compilers allow you to write such code as a non-standard extension, and most of the time it works as you would expect it to. But it is worth considering that it is intolerable. On the current compiler and platform, everything can work correctly, but it’s worth changing one thing – and quirks can begin.

How to write C++ code that will work everywhere and always the same? There may be several approaches. You can still leave the four RGBA components as four fields, and optionally get their combination as dword using bitwise operations:

class DColor
{
public:
  ///Синий
  uint8_t b;
  ///Зелённый
  uint8_t g;
  ///Красный
  uint8_t r;
  ///Прозрачность
  uint8_t a;

  dword getAsDword() const noexcept
  {
#ifndef _XBOX
    return (static_cast<uint32_t>(b) << 0)
         | (static_cast<uint32_t>(g) << 8)
         | (static_cast<uint32_t>(r) << 16)
         | (static_cast<uint32_t>(a) << 24);
#else
    return (static_cast<uint32_t>(a) << 0)
         | (static_cast<uint32_t>(r) << 8)
         | (static_cast<uint32_t>(g) << 16)
         | (static_cast<uint32_t>(b) << 24);
#endif
  }
};

Another way could be to use an array of 4 bytes, and to convert it to dword you can use the function memcpy (or std::bit_caststarting with C++20):

class DColor
{
private:
  uint8_t m_components[4];

public:
  DColor(uint8_t r, uint8_t g, uint8_t b, uint8_t a) noexcept; 
  DColor(dword comp) noexcept; // To construct from dword

  DColor& operator=(dword) noexcept; // To be assigned to dword

  uint8_t& R() noexcept;
  uint8_t& G() noexcept;
  uint8_t& B() noexcept;
  uint8_t& A() noexcept;

  dword getAsDword() const noexcept
  {
#if __cplusplus >= 202002L && __has_cpp_attribute(__cpp_lib_bit_cast)
    return std::bit_cast<dword>(m_components);
#else
    dword result;
    memcpy(&result, m_components, sizeof(m_components));
    return result;
#endif
  }
};

Comment by colleague Andrey Karpov. This use of conditional compilation directives makes the code extremely difficult to read. Nesting is broken, and you have to struggle to understand what is written, including and excluding different code fragments in your head.

The code should not be written in such a way that it is as short as possible, but that it is easy to read and understand. I would make two different implementations of the function here:

#if __cplusplus >= 202002L && __has_cpp_attribute(__cpp_lib_bit_cast)
  dword getAsDword() const noexcept
  {
    return std::bit_cast<dword>(m_components);
  }
#else
  dword getAsDword() const noexcept
  {
    dword result;
    memcpy(&result, m_components, sizeof(m_components));
    return result;
  }
#endif

The gist is longer, but the code is much nicer to read.

In the case of large functions, with pieces of conditional compilation, of course, this will not work. In this case, it makes sense to put these pieces into separate functions, implemented differently for different cases. Those. it is necessary to separate the general code as much as possible from the one that depends on the compilation conditions.

N12 warnings

V579 [CWE-687, CERT-ARR01-C] The memset function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. Debugger.cpp 282

void appDebuger::debuggerMakeThreadDump(....)
{
  CONTEXT ct;
  memset (&ct, 0, sizeof(&ct));
  // ....
}
typedef struct DECLSPEC_NOINITALL _CONTEXT 
{
  DWORD ContextFlags;
  // ....
  DWORD   SegCs;              // MUST BE SANITIZED
  DWORD   EFlags;             // MUST BE SANITIZED
  DWORD   Esp;
  DWORD   SegSs;
  BYTE   ExtendedRegisters[MAXIMUM_SUPPORTED_EXTENSION];

} CONTEXT;

The programmer wanted to initialize all fields of the structure to zero. Object size* ct *– 716 bytes, and passed into it the size of the pointer – 4 bytes. Fix a call memset could be like this:

memset (&ct, 0, sizeof(ct));

PS In general, it is strange why programmers so stubbornly use this pattern of initializing structures using memset. It’s possible to write:

CONTEXT ct {};

N13 warnings

Well, I would like to finish the list of errors with a couple of cool typos:

  • V501 [CWE-571] There are identical sub-expressions ‘lleg.to >= 0′ to the left and to the right of the ‘&&’ operator. CharacterLegs.cpp 51

  • V501 [CWE-571] There are identical sub-expressions ‘lleg.fo >= 0′ to the left and to the right of the ‘&&’ operator. CharacterLegs.cpp 51

void CharacterLegs::Invalidate()
{
  lleg.th = ani->FindBone("Bedro_left_joint"  ,true);
  lleg.kn = ani->FindBone("Golen_left_joint"  ,true);
  lleg.fo = ani->FindBone("Foolt_left_joint"  ,true);
  lleg.to = ani->FindBone("Foolt_left_joint_2",true);

  rleg.th = ani->FindBone("Bedro_right_joint"  ,true);
  rleg.kn = ani->FindBone("Golen_right_joint"  ,true);
  rleg.fo = ani->FindBone( "Foot_right_joint"  ,true);
  rleg.to = ani->FindBone( "Foot_right_joint_2",true);

  if (   lleg.th >= 0 && lleg.kn >= 0 
      && lleg.fo >= 0 && lleg.to >= 0 
      && rleg.th >= 0 && rleg.kn >= 0 
      && lleg.fo >= 0 && lleg.to >= 0 )
  {
    // ....
  }

  // ....
}

Here, in the operator condition if the analyzer noticed two identical subexpressions lleg.to >= 0 And lleg.fo >= 0. It can be seen that the developer wanted to check all the fields and made a typo. In place of two identical subexpressions, there should be checks of other variables:

if (   lleg.th >= 0 && lleg.kn >= 0 
    && rleg.th >= 0 && rleg.kn >= 0 
    && lleg.fo >= 0 && lleg.to >= 0
    && rleg.fo >= 0 && rleg.to >= 0)
{
  // ....
}

Conclusion

As a result of the check, the analyzer found various kinds of errors in the project. I described a damn dozen of them in the article. The rest are an incentive for you to download the analyzer and try it (if you haven’t already, of course). And it will be even more interesting to see what you can find in your own code or the code of your colleagues :).

More thanks to the people thanks to whom the game was still able to see the light of day, as well as to the reviews, articles and discussions that develop and support it. Well, whether the Arabella will sink, let each reader decide for himself!

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *