We write our own variant type

Introduction

C++17 brought quite a lot of innovations to the language, including the std::variant pattern (although Boost has had it for quite some time). In fact, the last released and fully implemented C++ standard at the time I started learning this language was C++17, so at one time I paid the most attention to the innovations of this standard.
At some point, I became interested in how exactly std::variant works, so I googled a little about its fundamental structure and, armed with variadic templates, sat down to write my implementation. This template is designed quite interestingly, so people who are not at all familiar with its structure will find this article useful. If more experienced developers read this article, I would welcome their comments on my implementation.
Let me mention a few points before starting the article:

  • implemented the ability to call an arbitrary functor for an object stored in std::variant

  • you can pass objects of specified types

  • Naturally, stored objects are deleted correctly

  • using C++20 the code can be made more beautiful, but in this article we will use the 17th standard

  • the functionality listed below is NOT implemented: multiple calls of various std::variant objects (I’m doing it little by little, but it’s quite difficult. If I don’t forget and finish it, there will be a short continuation of the article); emplace creation; all other auxiliary functionality

  • The author in no way claims that the implementation is optimal. The article is intended for those who are not/are new to variadic templates and how to use them in general, and are also new to std::variant

  • it is assumed that the reader is at least superficially familiar with metaprogramming

Basic device

Before presenting some code, I would like to mention what is used at the heart of the variant type. As is known (or perhaps not known to anyone), the size of std::variant depends on the maximum size among the passed types. Already from this property we can assume that a buffer is used internally, the size of which is equal to the size of the largest type.
Before C++11, implementing a variant type was a rather non-trivial task (for example, one of the possible crutches is to define a template with a large number of type parameters with a default value assigned to all but the first), however, with the advent of variadic templates, the complexity of writing this code has greatly decreased.
How to store information about what type of data is stored in an instance of a variant type? You can use an index for this task, matching it with the order of the types passed to the template.
At the same time, in the future, using the index, it will be possible to quite easily call the specialization of the functor for the desired type, using an array of pointers.
All that remains is to write the code and also solve some implementation nuances

Let's start writing code

To begin with, I propose to implement auxiliary functions – we need to be able to find the maximum size among the passed types, and also be able to associate a type with some index. This must be done at compile-time, especially in the case of the function for finding the maximum size

template <std::uint32_t CurMax, class T, class...Types>
constexpr std::uint32_t FindMaxSize() {
  constexpr std::uint32_t sZT = sizeof(T);
  if constexpr (sZT > CurMax) {
    if constexpr (sizeof...(Types) != 0) {
      return FindMaxSize<sZT, Types...>();
    } else {
      return sZT;
    }
  } else {
    if constexpr (sizeof...(Types) != 0) {
      return FindMaxSize<CurMax, Types...>();
    } else {
      return CurMax;
    }
  }
}

Let's analyze this function. It iterates recursively over the passed types, finding the largest size. At the same time, the function uses constexpr calculations, due to which we can use the result of executing this function to determine the size of the std::array. The template takes in the maximum size of the type from the previous iteration (we explicitly pass 0 by hand when calling this function), and also a construction of the form class T, class… Types will allow us to pinch off one type in each subsequent iteration, and also guarantees that although at least one type (class… Types can accept 0 types). Based on the size of the current type under consideration, we decide which size to pass on – the maximum size from the previous iteration or the current size. The condition for stopping the recursion is that the size of Types… is equal to 0. The result of execution is the maximum size.
Now let's write a function for determining the index of a specific type:

template <std::int16_t curN, class T, class Val, class... Vals>
constexpr std::int16_t FindNumber() {

  if constexpr (std::is_same_v<std::decay_t<T>, std::decay_t<Val>>) {
    return curN;
  } else if constexpr (sizeof...(Vals) != 0) {
    return FindNumber<curN + 1, T, Vals...>();
  } else {
    return -1;
  }
}

The fundamental device is identical. Class T is the type whose index we define, the class Val class… Vals construct will allow us to iterate. However, there is one nuance in this code that needs to be considered. Namely, the call to std::decay_t. If this function in one form or another is called in template code (and it will be called in it), then, for example, Val can be of type Foo, and T can be const Foo. In this case, std::is_same_v without using std::decay_t will return false, since the types have different qualifiers. This behavior is unlikely to be expected. Therefore, it is better to discard qualifiers when comparing types. Indeed, std::variant allows you to use types with qualifiers other than those specified, so this behavior is more or less correct.
Thanks to these functions, we can write the following code:

template <class... PTypes> 
class MyVariant {
private:
  std::int16_t curT;
  std::array<char, FindMaxSize<0, PTypes...>()> data;

public:
  MyVariant() : curT(-1) {}
  template <class T> MyVariant(T &&val) {
    constexpr auto nextN = FindNumber<0, T, PTypes...>();
    static_assert(nextN != -1, "Uknown type passed");
    using valueT = std::decay_t<T>;
    new (this->data.data()) valueT(std::forward<T>(val));
	this->curT = nextN;
  }
};

Let’s also agree that since type indexing starts at 0, the value -1 will indicate the absence of a type. In this case, the default constructor indicates that no object is stored.
What does the copy\move constructor do:

  • find the type index

  • if the type is not found, then we notify about this using static_assert

  • if all is well, then create this type using a buffer. Creation features: use of std::forward(val) for forward transmission. So we implement both a copy constructor and a move constructor; memory is allocated in a buffer by calling new(pointer to buffer)DataType(arguments); just in case, we discard qualifiers when creating

  • The assignment of the index occurs precisely after the object has been successfully created.

I would like to draw your attention to the last point. The fact is that a custom type can throw exceptions quite easily and naturally. And since the index will also be used in the future to properly clean up the stored object, the reverse order may lead to an attempt to clean up an unconstructed object. What will come of this is unknown.

And now we smoothly come to one of the most important points – clearing memory. Memory leaks are not a good thing, so we need to be sure to delete them properly.
Let's define an auxiliary template function:

template <class T> void Clearer(void *data) {
  T *casted = reinterpret_cast<T *>(data);
  casted->~T(); 
}

It is as simple as possible: it accepts void* data and the type T, which is actually the real type of the object. Why is this necessary? With a similar maneuver, we can then determine the specializations of this template and store them in one array, simply by calling the necessary cleanup function on the index stored inside our variant type. There is one point that we must pay attention to – this function must be able to call the object's destructor. However, this limitation is not unusual, so it is not a problem.
Using this helper function, we can define the following public method for our MyVariant class:

bool TryClear() {
    bool cleared = false;
    if (this->curT != -1) {
      static constexpr std::array<void (*)(void *), sizeof...(PTypes)> cont{
          &Clearer<PTypes>...};
      cont[this->curT](this->data.data());
      this->curT = -1;
      cleared = true;
    }
    return cleared;
  }

In theory, the static modifier can be removed here, because constexpr is implicitly static, but so be it. Here, using a collapse expression, an array is created from pointers to the specializations of the previously defined auxiliary function. Also, I want to draw your attention to the fact that the destructor, in an amicable way, should not throw exceptions. If the destructor of your class is capable of throwing exceptions, then in any case you will have to think through the mechanics of properly removing it in this situation separately. The call to the required cleanup function occurs by calling the necessary pointer to the function at the current type index.
At the same time, the destructor is ready:

~MyVariant() { this->TryClear(); }

And also the copy/move operator:

template <class T> 
MyVariant &operator=(T &&val) {
    constexpr auto nextN = FindNumber<0, T, PTypes...>();
    static_assert(nextN != -1, "Uknown type passed");
    this->TryClear();
    using valueT = std::decay_t<T>;
    new (this->data.data()) valueT(std::forward<T>(val));
    this->curT = nextN;
    return *this;
  }

One point worth paying attention to is that if the constructor of a custom type throws an exception, then we will get the following situation: old data has already been lost, and new data has not been created. In some cases, this behavior may not be desirable.
So, we have everything ready, except for one thing – the function call itself. In fact, the general principle has already been invented – we implemented something similar in the Clearer() auxiliary template, only in this case we will additionally accept a predicate type and a flag responsible for the need to move the object

template <class Predicate, class T, bool needToMove>
void CallP(Predicate pr, void *data) {
  T *casted = reinterpret_cast<T *>(data);
  if constexpr (needToMove) {
    pr(std::move(*casted));
  } else {
    pr(*casted);
  }
}

This function is similar to Clearer, but with one exception – I accept needToMove. Why am I passing this as a bool rather than extracting information from the type? I decided to design this way, since when calling Visit, the user will be passing exactly the variant type, and not the value lying inside (which is obvious). In this regard, it is easier to extract information about lvalue/rvalue directly from the variant type, and when passing the value stored inside, simply indicate what we are using.
To use this template, we define the following structure:

struct SimpleVisiterInternal {
  template <class Visitor, class... VariantTypes>
  static bool TryVisit(Visitor &&visitor, MyVariant<VariantTypes...> &variant) {
    bool result = false;
    if (variant.curT != -1) {
      result = true;
      constexpr auto TypesCount = sizeof...(VariantTypes);
      constexpr std::array<void (*)(Visitor, void *), TypesCount> cont{
          &CallP<Visitor, VariantTypes, false>...};
      cont[variant.curT](std::forward<Visitor>(visitor),
                         (void *)(variant.data.data()));
    }
    return result;
  }
  template <class Visitor, class... VariantTypes>
  static bool TryVisit(Visitor &&visitor,
                       MyVariant<VariantTypes...> &&variant) {
    bool result = false;
    if (variant.curT != -1) {
      constexpr auto TypesCount = sizeof...(VariantTypes);
      result = true;
      constexpr std::array<void (*)(Visitor, void *), TypesCount> cont{
          &CallP<Visitor, VariantTypes, true>...};
      cont[variant.curT](std::forward<Visitor>(visitor),
                         (void *)(variant.data.data()));
    }
    return result;
  }
};

This structure for an arbitrary Visitor defines an array of pointers to the specialization of the CallP template. Also, here we distinguish whether our MyVariant object is lvalue or rvalue and set the desired flag when called. And yes, this structure must be a friend of MyVariant in order to be able to work with its internal representation.
Now let’s wrap the implementation in a function to make it easier to use

template <class Visitor, class Variant>
bool TryVisit(Visitor &&visitor, Variant &&variant) {
  return SimpleVisiterInternal::TryVisit(std::forward<Visitor>(visitor),
                                         std::forward<Variant>(variant));
}

Essentially, everything is ready, you can use our variant type template

Where is TryVisit with multiple calls?

While it is being written. The general principle is the same – you need to make an array. In this case, I believe that it would be most convenient to make a one-dimensional array, which, by extracting the number of possible stored types for each of MyVariant, will correctly select the desired pointer. The difficulty is to generate this array, because… you have to operate with an arbitrary number of MyVariant, each of which can contain an arbitrary number of possible types. And all these sizes are different. There seems to be progress and ideas, but due to other matters, implementation may be delayed indefinitely.

A little about why TryVisit is a function and not a method

std::visit is a function, not a method. Moreover, you cannot call a functor using a method at all. Why is that? For myself personally, I identified 2 reasons:

  • std::visit can work with an arbitrary number of passed variant types. This possibility is quite controversial, since the process generates a very large number of specializations of functions N1 * N2 * … and so on in the number of variant types, but it is present, so it is better to use the function

  • It will be quite difficult to recognize where move semantics are needed, because We cannot directly extract information about whether std::variant is an lvalue or an rvalue (deducing this has not yet been introduced). Moving the functionality into a function provides a much simpler solution to this problem

Conclusion

This code was written purely out of interest. I tried to take everything into account, and also tested the code for various types, but I still cannot guarantee the absence of any errors.
This is my first article on habr, so I hope that it was useful to someone.
The code can be viewed at this link – https://github.com/KirillVelichk0/MyVariant

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *