On the edge between exceptions and std::expected

While looking at std::expected, a new type in the upcoming standard, I came to think that its essence can be rethought a little to bring it closer to exceptions.

In this article I describe a small experiment: an implementation of expected that erases the error type.

A bit about std::expected

This type was conceived as one of the options for error handling. Unlike exceptions, it avoids the cost of unwinding the stack on the error path, and it also frees the programmer from routine tasks such as explicitly specifying noexcept throughout a project’s API. It is a kind of middle ground between exceptions (already a familiar mechanism in C++) and returned error codes (as is customary in C).

In essence, this type is a container in the style of std::optional, but with two template parameters: T (the type of the contained value) and E (the type of the contained error).

std::expected<std::string, int> foo = "hello";

When using this type, we can either try to get the value out of it or ask what the error is. It usually looks like this:

#include <expected>
#include <iostream>

enum MathError : unsigned char
{
    ZeroDivision,
    NegativeNotAllowed
};

std::expected<int, MathError> Bar(int a, int b)
{
  if (b == 0)
    return std::unexpected(MathError::ZeroDivision);
  if (a < 0 || b < 0)
    return std::unexpected(MathError::NegativeNotAllowed);

  return a / b;
}

int main()
{
  std::expected<int, MathError> foo = Bar(1, 3);
  
  if (foo.has_value())
  {
    std::cout << *foo;
  }
  else if (foo.error() == MathError::ZeroDivision)
  {
    std::cout << "Divided by zero";
  } else if (foo.error() == MathError::NegativeNotAllowed)
  {
    std::cout << "Negative numbers not allowed";
  }
}

That is, we know in advance which error type to expect, in this example MathError. But what if Bar hides more ambiguous logic, where the failure may be an arithmetic error or a system error? The first thing that comes to mind is a single enum covering all the error values, which binds us explicitly to that enum. Is it possible instead to “hide” this type and detect the error dynamically at runtime?

Erasing the error type

Type erasure is a pattern in C++ that is based on the use of templates and polymorphism. What does this give us? It will allow any type of error to be stored in an object of type expected at any time, while the signature of the variable declaration is reduced to a single template argument, for example expected<int>.
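As a minimal sketch of the pattern itself, independent of expected: a non-template interface is combined with a templated implementation that remembers the concrete type. The names AnyPrintable, Concept, Model, ToText, and CheckAnyPrintable are hypothetical and are not part of the article's code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical overloads that turn a stored value into text.
inline std::string ToText(int Value) { return std::to_string(Value); }
inline std::string ToText(const std::string& Value) { return Value; }

class AnyPrintable
{
  // Non-template interface: all the wrapper knows about its contents.
  struct Concept
  {
    virtual std::string Text() const = 0;
    virtual ~Concept() = default;
  };

  // Templated implementation: the only place that knows the real type T.
  template<typename T>
  struct Model : Concept
  {
    explicit Model(T InValue) : Value(std::move(InValue)) {}
    std::string Text() const override { return ToText(Value); }
    T Value;
  };

public:
  // The constructor template erases T behind the Concept pointer.
  template<typename T>
  AnyPrintable(T Value) : Impl(std::make_unique<Model<T>>(std::move(Value))) {}

  std::string Text() const { return Impl->Text(); }

private:
  std::unique_ptr<Concept> Impl;
};

// Both objects have the same static type although they store different types.
inline void CheckAnyPrintable()
{
  const AnyPrintable Number(42);
  const AnyPrintable Word(std::string("hello"));
  assert(Number.Text() == "42");
  assert(Word.Text() == "hello");
}
```

The erased-error expected follows the same shape: the wrapper is a template only over the value type, and the error type survives only inside the templated holder.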

The general interface of the class would then look something like this:

template<typename T>
class Expected
{
public:

  template<typename E>
  Expected(Unexpected<E> Unexp)
  {
    SetError(Unexp.Error);
  }

  Expected(T Value)
  {
    StoredValue = Value;
  }

  bool HasError() const;

  template<typename E>
  void SetError(E&& Error);

  template<typename E>
  const E* GetError() const;

  bool HasValue() const;

  inline operator T() const;

protected:

  std::optional<T> StoredValue;

  // The error itself will be stored here
  std::unique_ptr<ErrorHolderBase> StoredError;
};

// Structure needed to pass an error into Expected
template<typename E>
struct Unexpected
{
  Unexpected(E InError)
  {
    Error = InError;
  }
  E Error;
};

Now to the implementation using type erasure. The first thing we do is declare the base class of the error holder.

struct ErrorHolderBase
{
  // Returns the error text
  virtual std::string GetErrorText() const = 0;

  // Returns a pointer to the stored error
  virtual void* GetErrorPtr() const = 0;

  virtual ~ErrorHolderBase() {}
};

The question arises: once the pointer to the error has been erased to void*, how do we know what it points to? To solve this we can use type identifiers, as is done in std::any, and here there are two ways to go: use RTTI with typeid, or abandon RTTI and write our own counter of unique type identifiers to tell error types apart. The second option appeals to me more, since I work on a project in which RTTI is disabled by convention (hello, UnrealEngine). So I will reinvent the wheel and give an example implementation of such a counter:

Custom TypeId
struct TypeIdCnt
{
  template<typename>
  static uint32 GetUniqueId()
  {
    static const uint32 TypeId = NewTypeId();
    return TypeId;
  }

private:
  static uint32 NewTypeId()
  {
    // thread-safe
    static std::atomic<uint32> CurrentId = 0;
    return CurrentId++;
  }
};

template<typename T>
static uint32 GetTypeId()
{
  return TypeIdCnt::GetUniqueId<T>();
}

The essence is this: each new type T instantiates a new function whose static variable is initialized exactly once, taking the next value of the counter.
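The two properties the rest of the article relies on (the same type always yields the same id, and different types yield different ids) can be checked with a standalone sketch. Here std::uint32_t stands in for Unreal Engine's uint32, and CheckTypeIds is a hypothetical helper added only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct TypeIdCnt
{
  template<typename>
  static std::uint32_t GetUniqueId()
  {
    // Initialized once per instantiation, on the first call.
    static const std::uint32_t TypeId = NewTypeId();
    return TypeId;
  }

private:
  static std::uint32_t NewTypeId()
  {
    // Atomic, so concurrent first calls for different types get distinct ids.
    static std::atomic<std::uint32_t> CurrentId{0};
    return CurrentId++;
  }
};

template<typename T>
std::uint32_t GetTypeId()
{
  return TypeIdCnt::GetUniqueId<T>();
}

// Hypothetical check, not part of the article's code.
inline void CheckTypeIds()
{
  assert(GetTypeId<int>() == GetTypeId<int>());       // stable for one type
  assert(GetTypeId<int>() != GetTypeId<double>());    // distinct across types
  assert(GetTypeId<int>() != GetTypeId<const int>()); // cv-qualifiers count as a new type
}
```

Note that ids are handed out in order of first use, so the numeric values are only meaningful within a single run of the program and should not be stored or compared across runs.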

We will also store the type identifier in the error store. Now the code will look like this:

struct ErrorHolderBase
{
  // Returns the error text
  virtual std::string GetErrorText() const = 0;

  // Returns a pointer to the stored error
  virtual void* GetErrorPtr() const = 0;

  // Returns a pointer to the error, or nullptr if the type does not match
  virtual void* RetrieveError(uint32 ErrorTypeId) const = 0;

  virtual ~ErrorHolderBase() {}

  // Type identifiers of the stored error and of its base classes
  std::set<uint32> Bases;
};

template<typename ErrorType>
struct ErrorHolder : ErrorHolderBase
{
  // Member initialization, so ErrorType does not have to be default-constructible
  ErrorHolder(ErrorType InError)
    : Error(std::move(InError))
  {
  }

  std::string GetErrorText() const override
  {
    // error_to_str can be overloaded for each error type to produce its text representation
    return error_to_str(Error);
  }

  void* GetErrorPtr() const override
  {
    // The interface hands out void*, so constness is cast away here
    return const_cast<ErrorType*>(&Error);
  }

  void* RetrieveError(uint32 ErrorTypeId) const override
  {
    if (GetTypeId<ErrorType>() == ErrorTypeId)
      return GetErrorPtr();
    return nullptr;
  }

  ErrorType Error;
};
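To illustrate how such a holder behaves, here is a standalone sketch of exact-type retrieval. It is a simplified variant rather than the article's exact code: the text interface is omitted, the pointer is kept const, std::uint32_t replaces uint32, and GetErrorAs and CheckExactMatch are hypothetical helpers (GetErrorAs plays the role of Expected<T>::GetError<E>()).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

inline std::atomic<std::uint32_t> NextTypeId{0};

template<typename T>
std::uint32_t GetTypeId()
{
  static const std::uint32_t Id = NextTypeId++; // assigned once per type
  return Id;
}

struct ErrorHolderBase
{
  // Returns the stored error if its type id matches, otherwise nullptr.
  virtual const void* RetrieveError(std::uint32_t ErrorTypeId) const = 0;
  virtual ~ErrorHolderBase() = default;
};

template<typename ErrorType>
struct ErrorHolder : ErrorHolderBase
{
  explicit ErrorHolder(ErrorType InError) : Error(std::move(InError)) {}

  const void* RetrieveError(std::uint32_t ErrorTypeId) const override
  {
    return GetTypeId<ErrorType>() == ErrorTypeId ? &Error : nullptr;
  }

  ErrorType Error;
};

// Hypothetical helper: the cast is safe because a non-null result
// means the holder really stores an object of type E.
template<typename E>
const E* GetErrorAs(const ErrorHolderBase& Holder)
{
  return static_cast<const E*>(Holder.RetrieveError(GetTypeId<E>()));
}

inline void CheckExactMatch()
{
  const ErrorHolder<std::string> Holder("disk full");
  assert(GetErrorAs<std::string>(Holder) != nullptr); // exact type: found
  assert(GetErrorAs<int>(Holder) == nullptr);         // any other type: not found
}
```

The caller gets a typed pointer only by naming exactly the stored type, which is the limitation discussed next.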

We can now get an error out of the container if we know its type. However, the container can hold an error of any type, and we only get it back if we name exactly the right type as the template parameter. This limits our ability to classify errors into categories; the approach suits base types, strings, and other errors that need no categorization. I would like to add a method called Catch that emulates the exception mechanism (to some extent) and extracts errors from the new expected by category: store a derived error, catch it by its base. Example code might look like this:

struct BaseError{};
struct MathError : BaseError{};
struct SystemError : BaseError{};

Expected<int> ValueOrError = Unexpected(MathError());

if (auto Error = ValueOrError.Catch<BaseError>())
{
  // ...
}

To solve this problem, we can store, for each error, the identifiers of every class in its hierarchy. A similar trick is used by the extensible RTTI in LLVM that backs dyn_cast: https://llvm.org/doxygen/ExtensibleRTTI_8h_source.html

To do this, we will use the “decorator” pattern, which will add the identifier of its parent to the child. This way we can get the identifiers of all types in the hierarchy:

template<typename T>
struct DeriveError : T
{
  using T::T;

  // Recursively collects the identifiers of the whole hierarchy
  static std::set<uint32> GetBaseIds()
  {
    std::set<uint32> Bases = { GetTypeId<T>() };
    // Also ask the parent for its identifiers
    Bases.merge(T::GetBaseIds());
    return Bases;
  }
};

We also need a base error class, similar to std::exception, which reports its own identifier and provides an interface for getting information about the error.

struct ErRuntimeError
{
  ErRuntimeError(const std::string& InMessage)
  {
    Message = InMessage;
  }

  // Root of the hierarchy: reports only its own identifier
  static std::set<uint32> GetBaseIds()
  {
    return { GetTypeId<ErRuntimeError>() };
  }

  std::string What() const
  {
    if (Message.empty())
      return GetErrorType();
    return GetErrorType() + ": " + Message;
  }

  virtual std::string GetErrorType() const
  {
    return "RuntimeError";
  }

  virtual ~ErRuntimeError() = default;

protected:
  std::string Message;
};

// Overload that returns the text representation of the error
inline std::string error_to_str(const ErRuntimeError& Error)
{
  return Error.What();
}

It also makes sense to create a macro that defines new error types, to avoid the boilerplate code:

#define DEFINE_RUNTIME_ERROR(Error, Parent) \
  struct Error : DeriveError<Parent> \
  { \
    using ParentType = DeriveError<Parent>; \
    using ParentType::ParentType; \
    virtual std::string GetErrorType() const override \
    { \
      return #Error; \
    } \
  };

Now the error declarations will look like this:

// Math error
DEFINE_RUNTIME_ERROR(ErMathError, ErRuntimeError);
// Math error: division by zero
DEFINE_RUNTIME_ERROR(ErZeroDivisionError, ErMathError);
// Value error
DEFINE_RUNTIME_ERROR(ErValueError, ErRuntimeError);
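To see which identifiers such a hierarchy actually reports, here is a reduced, standalone sketch. Messages, GetErrorType, and the macro are left out, std::uint32_t stands in for uint32, and CheckHierarchyIds is a hypothetical helper added for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <set>

using TypeId = std::uint32_t; // stands in for Unreal Engine's uint32
inline std::atomic<TypeId> NextTypeId{0};

template<typename T>
TypeId GetTypeId()
{
  static const TypeId Id = NextTypeId++; // assigned once per type
  return Id;
}

struct ErRuntimeError
{
  static std::set<TypeId> GetBaseIds() { return { GetTypeId<ErRuntimeError>() }; }
};

template<typename T>
struct DeriveError : T
{
  // The parent's own id plus everything the parent reports.
  static std::set<TypeId> GetBaseIds()
  {
    std::set<TypeId> Bases = { GetTypeId<T>() };
    Bases.merge(T::GetBaseIds());
    return Bases;
  }
};

// Same shape as the macro-generated types, minus messages and names.
struct ErMathError : DeriveError<ErRuntimeError> {};
struct ErZeroDivisionError : DeriveError<ErMathError> {};
struct ErValueError : DeriveError<ErRuntimeError> {};

inline void CheckHierarchyIds()
{
  const std::set<TypeId> Ids = ErZeroDivisionError::GetBaseIds();
  assert(Ids.count(GetTypeId<ErMathError>()) == 1);         // direct parent
  assert(Ids.count(GetTypeId<ErRuntimeError>()) == 1);      // root
  assert(Ids.count(GetTypeId<ErValueError>()) == 0);        // sibling branch
  assert(Ids.count(GetTypeId<ErZeroDivisionError>()) == 0); // own id is not included
}
```

GetBaseIds reports only the ancestors, so whoever stores the error has to add the identifier of the error's own type separately.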

Now that we have a hierarchy of errors with identifiers, we can write our own Catch for the new expected. After retrieving the error we are entitled to cast void* to E* explicitly, because CatchError returns the error pointer only if the requested identifier exists in the hierarchy, and nullptr otherwise. (The cast relies on single inheritance, where a base subobject shares the address of the derived object.)

template<typename T>
template<typename E>
const E* Expected<T>::Catch() const
{
  // No stored error means there is nothing to catch
  if (!StoredError)
    return nullptr;

  const uint32 ErrorTypeId = GetTypeId<E>();
  return static_cast<const E*>(StoredError->CatchError(ErrorTypeId));
}

And when setting the error, we do something like this:

template<typename T>
template<typename E>
void Expected<T>::SetError(E&& Error)
{
  // E may be deduced as a reference type, so strip references and cv-qualifiers
  using ErrorType = std::decay_t<E>;

  StoredError = std::make_unique<ErrorHolder<ErrorType>>(std::forward<E>(Error));

  if constexpr (std::is_base_of_v<ErRuntimeError, ErrorType>)
  {
    std::set<uint32> Bases = ErrorType::GetBaseIds();
    Bases.insert(GetTypeId<ErrorType>());
    StoredError->SetBases(Bases);
  }
}

This is where the error store itself is created. The type identifiers of the whole hierarchy above E, plus the identifier of E itself, are then handed to the error store.

Instead of the RetrieveError shown earlier, we can now use the CatchError method of our polymorphic error store. Before handing out the error pointer, it checks whether the requested type identifier is in the stored hierarchy list, and returns nullptr if it is not.

void* ErrorHolderBase::CatchError(uint32 ErrorTypeId) const
{
  if (Bases.contains(ErrorTypeId))
    return GetErrorPtr();
  return nullptr;
}

void ErrorHolderBase::SetBases(const std::set<uint32>& InBases)
{
  Bases = InBases;
}

Now we can check the contents of expected in the style of C++ exceptions:

Expected<int> Bar(int a, int b)
{
  if (b == 0)
    return Unexpected(ErZeroDivisionError("b is Zero"));
  
  if (a < 0 || b < 0)
    return Unexpected(ErValueError("a < 0 or b < 0"));
  
  return a / b;
}

int main()
{
  Expected<int> foo = Bar(1, 3);

  if (foo.HasValue())
  {
    std::cout << static_cast<int>(foo);
  }
  else if (auto ZDError = foo.Catch<ErZeroDivisionError>())
  {
    std::cout << ZDError->What();
  }
}
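For readers who want to try the idea, the pieces can be condensed into one standalone C++17 sketch. It is a simplified reading of the design, not the article's exact code: std::uint32_t replaces uint32, text helpers are omitted, the members are public, every error is assumed to derive from ErRuntimeError, and DEFINE_ERROR and Divide are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <set>
#include <string>
#include <utility>

using TypeId = std::uint32_t; // stands in for Unreal Engine's uint32
inline std::atomic<TypeId> NextTypeId{0};

template<typename T>
TypeId GetTypeId()
{
  static const TypeId Id = NextTypeId++; // assigned once per type
  return Id;
}

struct ErRuntimeError
{
  explicit ErRuntimeError(std::string InMessage) : Message(std::move(InMessage)) {}
  static std::set<TypeId> GetBaseIds() { return { GetTypeId<ErRuntimeError>() }; }
  std::string Message;
};

template<typename T>
struct DeriveError : T
{
  using T::T;
  static std::set<TypeId> GetBaseIds()
  {
    std::set<TypeId> Bases = { GetTypeId<T>() };
    Bases.merge(T::GetBaseIds());
    return Bases;
  }
};

#define DEFINE_ERROR(Name, Parent) \
  struct Name : DeriveError<Parent> { using DeriveError<Parent>::DeriveError; }
DEFINE_ERROR(ErMathError, ErRuntimeError);
DEFINE_ERROR(ErZeroDivisionError, ErMathError);
DEFINE_ERROR(ErValueError, ErRuntimeError);

struct ErrorHolderBase
{
  virtual const void* GetErrorPtr() const = 0;
  virtual ~ErrorHolderBase() = default;
  std::set<TypeId> Bases; // ids of the stored error and of all its bases
};

template<typename E>
struct ErrorHolder : ErrorHolderBase
{
  explicit ErrorHolder(E InError) : Error(std::move(InError)) {}
  const void* GetErrorPtr() const override { return &Error; }
  E Error;
};

template<typename E>
struct Unexpected
{
  explicit Unexpected(E InError) : Error(std::move(InError)) {}
  E Error;
};

template<typename T>
struct Expected
{
  Expected(T InValue) : Value(std::move(InValue)) {}
  template<typename E>
  Expected(Unexpected<E> Unexp)
    : Error(std::make_unique<ErrorHolder<E>>(std::move(Unexp.Error)))
  {
    Error->Bases = E::GetBaseIds();      // ancestors
    Error->Bases.insert(GetTypeId<E>()); // the error's own type
  }

  // Returns the error viewed as E if E is its type or a base; assumes single inheritance.
  template<typename E>
  const E* Catch() const
  {
    if (!Error || Error->Bases.count(GetTypeId<E>()) == 0)
      return nullptr;
    return static_cast<const E*>(Error->GetErrorPtr());
  }

  std::optional<T> Value;
  std::unique_ptr<ErrorHolderBase> Error;
};

Expected<int> Divide(int a, int b)
{
  if (b == 0)
    return Unexpected(ErZeroDivisionError("b is zero"));
  return a / b;
}
```

With this sketch, Divide(1, 0).Catch<ErMathError>() yields a non-null pointer because ErMathError is an ancestor of the stored ErZeroDivisionError, while Catch<ErValueError>() yields nullptr because that type sits on a different branch.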

Conclusion

Why might you need this variant of expected?

My reasons for using expected with the error type erased are:

  1. Simpler semantics for declaring expected values. Using the erased error type in expected allows you to reduce the signature of a variable declaration to a single template argument. This improves the readability and understanding of the code.

  2. Ability to set any type of error on the fly. If a function can produce all sorts of errors, why not take an error from where it really originated (another expected) and pass it on in the current expected? This fits nicely with the monadic style of using expected with coroutines, which brings the usability of expected even closer to exceptions. I’m thinking of writing an article about this as well.

  3. Error handling by category. Using the erased error type allows you to handle errors by category, which brings expected closer to the exception mechanism. This gives flexibility and convenience in handling various types of errors.

Reasons why you should not use the erased error type:

  1. Lack of ability to use at compile-time. Dynamic polymorphism makes it impossible to use expected at compile time.

  2. An unspecified error type makes it harder to understand where an error came from. In my opinion, though, this is easily solved with additional language tools, for example by storing the source location where the error occurred in the error store: std::source_location https://en.cppreference.com/w/cpp/utility/source_location

  3. Consuming more memory. Using an erased error type in expected requires storing the class IDs of the error’s ancestors, which can increase memory consumption.

What do you think about it?
