How not to check the size of an array in C++

How often do you come across the construct sizeof(array)/sizeof(array[0]) for determining the size of an array? I really hope not often, because it's already 2024. In this post we'll talk about the construct's shortcomings, where it comes from in modern code, and how to finally get rid of it.

A little more context

Not long ago I was surfing the Internet in search of an interesting project to check. My eye fell on OpenTTD, an open-source simulator inspired by Transport Tycoon Deluxe (a transport company simulator). “A good, mature project,” I thought at first. And with good reason: it recently turned 20! Even PVS-Studio is younger 🙂

This would be a good place to move on to the errors the analyzer found, but it didn't work out that way. I'd like to praise the developers: even though the project has existed for more than 20 years, their code base looks great, with CMake, modern C++ standards, and relatively few errors in the code. If only everyone did that.

However, as you can guess, if nothing at all had been found, this note would not exist. I suggest you look at the following code (GitHub):

NetworkCompanyPasswordWindow(WindowDesc *desc, Window *parent) 
: Window(desc)
, password_editbox(
    lengthof(_settings_client.network.default_company_pass)    // <=
  )
{
  ....
}

It looks unremarkable, but the analyzer was suspicious of how the size of the container _settings_client.network.default_company_pass is calculated. On closer examination it turned out that lengthof is a macro, and after expansion the code looks like this (slightly formatted for convenience):

NetworkCompanyPasswordWindow(WindowDesc *desc, Window *parent) 
: Window(desc)
, password_editbox(
    (sizeof(_settings_client.network.default_company_pass) /
       sizeof(_settings_client.network.default_company_pass[0]))
  )
{
  ....
}

Well, since we’re laying our cards on the table, we can show the analyzer’s warning:

V1055 [CWE-131] The 'sizeof (_settings_client.network.default_company_pass)' expression returns the size of the container type, not the number of elements. Consider using the 'size()' function. network_gui.cpp 2259

In this case, _settings_client.network.default_company_pass turns out to be a std::string. The size of a container object obtained via sizeof usually says nothing about the number of elements it holds. Trying to get the length of a string this way is almost always a mistake.

It's all about how modern standard library containers, and std::string in particular, are implemented. Most often the object itself holds a few fixed-size fields, for example two pointers (the beginning and end of the buffer) plus a variable with the actual number of elements, while the characters live in a separately allocated buffer. This is why sizeof applied to a std::string gives the same value regardless of the actual buffer size. You can verify this by looking at a small example which I have already prepared for you.
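To make this concrete, here is a minimal sketch. LEGACY_LENGTHOF and ObjectSize are hypothetical names of mine, not OpenTTD code; the macro only mimics the sizeof-based idiom discussed above. The exact number of bytes depends on your standard library, so the sketch compares sizes instead of printing a specific value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical macro that mimics the classic sizeof-based idiom.
#define LEGACY_LENGTHOF(x) (sizeof(x) / sizeof((x)[0]))

// Hypothetical helper: bytes occupied by the string object itself,
// not by the characters it manages.
std::size_t ObjectSize(const std::string& text)
{
  return sizeof(text);
}
```

For a two-character string and a thousand-character string ObjectSize returns the same number, and LEGACY_LENGTHOF applied to either of them yields sizeof(std::string), because sizeof(char) is 1. Only size() reports the real length.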

Of course, the implementation and the resulting size of the container object depend on the standard library used, as well as on various optimizations (see Small String Optimization), so your results may vary. An interesting study of std::string internals can be read here.

Why?

So, we've figured out the problem and established that you shouldn't do this. But how did the code end up like this?

In the case of OpenTTD, everything is quite simple. Judging by blame, almost four years ago the type of the default_company_pass field changed from char[NETWORK_PASSWORD_LENGTH] to std::string. Interestingly, the value the lengthof macro returns now differs from the one expected before: 32 versus 33. I confess I did not dig deeper into the project code, but I hope the developers took this nuance into account. Judging by the comment below, the 33rd character of the old default_company_pass field was reserved for the null terminator.

// The maximum length of the password, in bytes including '\0'
// (must be >= NETWORK_SERVER_ID_LENGTH)

Legacy and a little inattention during refactoring: that would seem to be the reason. But, oddly enough, this way of calculating the size of an array is found even in new code. With C everything is clear, there is no other way, but what's wrong with C++? I went to Google Search for the answer, and I can't say I was surprised…

Right at the very top, even before the main search results, this is what gets displayed 🙁 It's worth noting that I searched in private mode, on a clean computer, and with other precautions that rule out the suspicion that the results were tailored to my past queries.

Author's note: I even got a little curious. Write in the comments what shows up at the top of the results for the same query.

Sad. I hope that AIs trained on today's code will not make similar mistakes.

How to

It would be rude to identify a problem and not offer good solutions. All that remains is to figure out what to do about it. I propose to go in order and gradually arrive at the best solution available today.

So, sizeof((expr)) / sizeof((expr)[0]) is a real magnet for errors. Judge for yourself:

  1. For dynamically allocated buffers, sizeof will measure the wrong thing (the pointer, not the buffer);

  2. If a built-in array was passed to a function as a parameter, it has decayed to a pointer, and sizeof will also return the wrong thing.
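Both points come down to the same thing: the function sees a pointer. A minimal sketch, where SizeSeenInsideFunction is a hypothetical name used only for illustration:

```cpp
#include <cstddef>

// Hypothetical helper: what sizeof sees once an array has decayed
// to a pointer (or when the buffer came from new[]).
std::size_t SizeSeenInsideFunction(const int* data)
{
  // Size of the pointer itself; the caller's array length is lost here.
  return sizeof(data);
}
```

Whether you pass an int[3], an int[100], or the result of new int[n], the function returns sizeof(int*), so dividing it by sizeof(data[0]) cannot give the element count.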

Since we're writing in C++ here, let's take advantage of the power of templates! This brings us to the legendary ArraySizeHelpers (aka “safe sizeof” in some articles), which sooner or later get written in almost every project. In ancient times, before C++11, you could come across monsters like this:

template <typename T, size_t N>
char (&ArraySizeHelper(T (&array)[N]))[N];

#define countof(array) (sizeof(ArraySizeHelper(array)))

For those who don't understand what's going on here:

ArraySizeHelper is a function template that takes a reference to an array of type T and size N. The function returns a reference to an array of char of size N.

To understand how this thing works, let's look at a small example:

void foo()
{
  int arr[10];
  const size_t count = countof(arr);
}

When ArraySizeHelper is called, the compiler has to deduce the template parameters from the function arguments. In our case T is deduced as int, and N as 10. The return type of the function is char(&)[10]. In the end, sizeof returns the size of that array, which equals the number of elements.

As you can see, the function has no body. This is deliberate, so that it can be used ONLY in an unevaluated context, for example when the function call is the operand of sizeof.
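The deduction described above can be checked by the compiler itself. Below is a small sketch; the arrays samples and weights are made-up names for illustration. Both decltype and sizeof are unevaluated contexts, so the missing function body is never needed.

```cpp
#include <cstddef>
#include <type_traits>

template <typename T, std::size_t N>
char (&ArraySizeHelper(T (&array)[N]))[N];  // declared, never defined

#define countof(array) (sizeof(ArraySizeHelper(array)))

int    samples[10];  // illustrative arrays
double weights[3];

// T = int, N = 10, so the call has type "reference to char[10]".
static_assert(
  std::is_same<decltype(ArraySizeHelper(samples)), char (&)[10]>::value,
  "return type carries the element count");

// sizeof(char[10]) == 10, whatever the original element type was.
static_assert(countof(samples) == 10, "number of elements, not bytes");
static_assert(countof(weights) == 3, "element type does not matter");
```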

Separately, I note that the function signature clearly states that it takes an array, and not just anything. This is how the protection against pointers works. If you still try to pass a pointer to ArraySizeHelper, you get a compilation error:

void foo(uint8_t* data)
{
  auto count = countof(data); // compilation error: data is a pointer, not an array
  ....
}

I'm not exaggerating about ancient times. Back in 2011 my colleague figured out how this magic works in the Chromium project. With the arrival of C++11 and C++14, writing such helper functions has become much easier:

template <typename T, size_t N>
constexpr size_t countof(T (&arr)[N]) noexcept
{
  return N;
}

But that's not all – it can be better!

Most likely, the next thing you will want is to get the size of containers: std::vector, std::string, QList, it doesn't matter which. Such containers already have the function we need: size. We just need to call it. Let's add an overload of the function above:

template <typename Cont>
constexpr auto countof(const Cont &cont) noexcept(noexcept(cont.size()))
  -> decltype(cont.size())   // the noexcept specifier must precede the trailing return type
{
  return cont.size();
}

Here we simply defined a function that accepts any object and returns the result of calling its size member function. Now our function is protected against pointers and works with both built-in arrays and containers, even at compile time.
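Put together, the two overloads look roughly like this. It is a sketch of the idea rather than a drop-in library: for a built-in array only the first overload is viable, because an array has no size() and the second one drops out of overload resolution; for a container it is the other way around.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Overload for built-in arrays: N is deduced from the array type.
template <typename T, std::size_t N>
constexpr std::size_t countof(T (&)[N]) noexcept
{
  return N;
}

// Overload for anything that has a size() member function.
template <typename Cont>
constexpr auto countof(const Cont& cont) noexcept(noexcept(cont.size()))
  -> decltype(cont.size())
{
  return cont.size();
}
```

With these in place, countof(raw) for int raw[4] can be used in a static_assert, while countof(someVector) and countof(someString) simply forward to size(). Passing a plain pointer matches neither overload and fails to compile.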

Aaaand congratulations, we have successfully reinvented std::size. This is what I propose to use, starting with C++17, instead of outdated sizeof crutches and ArraySizeHelpers. You don't need to write it yourself every time, either: it becomes available after including the header of almost any container 🙂

Modern C++: Correctly calculating the number of elements in arrays and containers

Below I also propose to go through a couple of common scenarios for those who landed here from a search. In what follows I will assume that std::size is available in your standard library. Otherwise, you can copy the functions described above and use them as substitutes.

I'm using some modern container (std::vector, QList, etc.)

In most cases it is better to use the class's size member function. For example: std::string::size, std::vector::size, QList::size and so on. Starting with C++17, I recommend switching to std::size, described above.

std::vector<int> first  { 1, 2, 3 };
std::string      second { "hello" };
....
const auto firstSize  = first.size();
const auto secondSize = second.size();

I have a regular array

Again, use the free function std::size. As we found out above, it can return the number of elements not only in containers, but also in regular arrays.

static const int MyData[] = { 2, 9, -1, ...., 14 };
....
const auto size = std::size(MyData);

The obvious advantage of this function is that if we try to give it an inappropriate type or pointer, we will get a compilation error.
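A compilation error is hard to show in running code, so here is a sketch that asks the compiler the same question through a type trait. CanStdSize is a hypothetical helper of mine (C++17), not something from the standard library: it is true when the expression std::size(x) is well-formed for a given type.

```cpp
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: false by default...
template <typename T, typename = void>
struct CanStdSize : std::false_type {};

// ...and true when std::size(value of type T) is a valid expression.
template <typename T>
struct CanStdSize<T, std::void_t<decltype(std::size(std::declval<T>()))>>
  : std::true_type {};

static_assert(CanStdSize<int (&)[3]>::value, "built-in arrays are accepted");
static_assert(CanStdSize<std::vector<int>&>::value, "containers are accepted");
static_assert(!CanStdSize<int*>::value, "pointers are rejected");
static_assert(!CanStdSize<int>::value, "so are scalars");
```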

I'm inside the template and don't know what container/object is actually being used

Again, use the free function std::size. In addition to being undemanding about the object type, it also works at compile time.

template <typename Container>
void DoSomeWork(const Container& data)
{
  const auto size = std::size(data);
  ....
}
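A slightly fuller sketch of both properties, assuming C++17. The names kPrimes, kAxes and CountElements are made up for illustration. One generic function serves arrays and containers alike, and for objects known at compile time the result can go straight into a static_assert.

```cpp
#include <array>
#include <cstddef>
#include <iterator>
#include <vector>

constexpr int kPrimes[] = { 2, 3, 5, 7 };
constexpr std::array<int, 3> kAxes = { 0, 1, 2 };

// Evaluated entirely by the compiler.
static_assert(std::size(kPrimes) == 4, "built-in array");
static_assert(std::size(kAxes) == 3, "std::array");

// Hypothetical generic helper: no idea what Container is, and no need to know.
template <typename Container>
std::size_t CountElements(const Container& data)
{
  return std::size(data);
}
```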

I have two pointers or iterators (start and end)

There are two options here depending on your needs. If you just need to know the size, then just use std::distance:

void SomeFunc(iterator begin, iterator end)
{
  const auto size = static_cast<size_t>(std::distance(begin, end));
}
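Here is the same idea as a compilable sketch; CountRange is a hypothetical name. Keep in mind that std::distance is constant-time only for random-access iterators (pointers, std::vector), for something like std::list it walks the whole range, and end must be reachable from begin.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: number of elements in the range [begin, end).
template <typename Iterator>
std::size_t CountRange(Iterator begin, Iterator end)
{
  // std::distance returns a signed difference_type, hence the cast.
  return static_cast<std::size_t>(std::distance(begin, end));
}
```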

If you need something more interesting than simply getting the size, you can use non-owning view classes: std::string_view for strings, std::span (C++20) in the general case, etc. For example:

void SomeFunc(const char* begin, const char* end)
{
  // The iterator-pair constructor of std::string_view is available since C++20
  std::string_view view { begin, end };
  const auto size = view.size();
  ....
  char first = view[0];
}
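If your compiler is still on C++17, the pointer-pair constructor of std::string_view is not there yet. One possible workaround, sketched with a hypothetical MakeView helper, is to pass the pointer together with a length obtained from std::distance:

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical helper for C++17: builds a view over [begin, end).
std::string_view MakeView(const char* begin, const char* end)
{
  const auto length = static_cast<std::size_t>(std::distance(begin, end));
  return std::string_view(begin, length);
}
```

The resulting view behaves the same way: size(), operator[], comparison and so on, without copying the characters.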

Experienced readers might add an option with pointer arithmetic, but I will leave it out, because the target audience of this note is novice programmers. Let's not teach them bad things 🙂

I have only one pointer (for example, an array was created via new)

In most cases, you will have to rewrite the program a little and pass the array size along. Alas.
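What "passing the size along" might look like in its simplest form is sketched below. SumValues is a hypothetical function; the point is only that the element count travels next to the pointer, because it cannot be recovered from the pointer alone.

```cpp
#include <cstddef>

// Hypothetical function: the caller supplies the element count explicitly.
long long SumValues(const int* data, std::size_t count)
{
  long long total = 0;
  for (std::size_t i = 0; i < count; ++i)
  {
    total += data[i];
  }
  return total;
}
```

The caller that wrote new int[count] still knows count, so it passes both: SumValues(values, count).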

If you work specifically with strings (const char *, const wchar_t *, etc.) and you know for sure that the string ends with a null terminator, then the situation is a little better. In this case, you can use std::basic_string_view:

const char *text = GetSomeText();
std::string_view view { text };

As in the example above, we get all the advantages of view classes while starting with only one pointer.
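For instance, once the view exists, the usual string operations work without copying anything. A small sketch with a hypothetical FirstWord helper; it assumes text is not null and is null-terminated:

```cpp
#include <string_view>

// Hypothetical helper: returns the part of the text before the first space.
// Assumes `text` points to a valid null-terminated string.
std::string_view FirstWord(const char* text)
{
  const std::string_view view { text };  // scans for '\0' once to find the length
  // find returns npos when there is no space; substr then keeps the whole view.
  return view.substr(0, view.find(' '));
}
```

The returned view points into the original buffer, so it must not outlive it.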

I will also mention a less preferable option that is nevertheless useful in some situations, std::char_traits::length:

const char *text = GetSomeText();
const auto size = std::char_traits<char>::length(text);

std::char_traits is a real Swiss army knife for working with strings. With its help, you can write generic algorithms regardless of the character type used in the string (char, wchar_t, char8_t, char16_t, char32_t). This saves you from thinking about which function to call in each case: std::strlen or std::wcslen. Note that it was no accident that I insisted on the null terminator being present in the string. Without it you get undefined behavior.
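As an illustration of such a generic algorithm, here is a sketch of a length function that does not care about the character type. TextLength is a hypothetical name, and the same precondition applies: the string must be null-terminated.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: one function instead of strlen / wcslen / ...
// Precondition: `text` points to a null-terminated string.
template <typename CharT>
std::size_t TextLength(const CharT* text)
{
  return std::char_traits<CharT>::length(text);
}
```

The same call works for "hello", L"hello", u"hello" and U"hello"; the character type is deduced from the argument.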

Conclusion

I hope I was able to show good alternatives to such a simple but dangerous construct as sizeof(array) / sizeof(array[0]). If you think I have undeservedly forgotten or left out something, welcome to the comments 🙂

If you want to share this article with an English-speaking audience, please use the translation link: Mikhail Gelvikh. How not to check array size in C++.
