static_ptr smart pointer concept in C++

There are several “smart pointers” in C++: std::unique_ptr, std::shared_ptr, std::weak_ptr. There are also non-standard smart pointers, for example in Boost [1]: intrusive_ptr, local_shared_ptr.

In this article, we’ll look at a new kind of smart pointer, which we can call static_ptr. It most closely resembles std::unique_ptr, but without dynamic memory allocation.

std::unique_ptr

std::unique_ptr<T> [2] is a wrapper over a raw pointer T*. Probably every C++ programmer has used this class.

One of the most popular reasons for using this pointer is dynamic polymorphism.

If at compile time we do not know which class of object will be created at a certain point of execution, we also do not know by how much the stack pointer should be moved. Such an object therefore cannot be created on the stack; it can only be created on the heap.

Suppose we have a polymorphic class IEngine and its descendants TSteamEngine, TRocketEngine, TEtherEngine. An object of “some descendant of IEngine known only at run time” is most often held in a std::unique_ptr<IEngine>, in which case the memory for the object is allocated on the heap.

Figure: std::unique_ptr with objects of different sizes

Allocation of small objects

Heap allocations are needed for “large objects” (a std::vector with a bunch of elements, etc.), while the stack is better for “small objects”.

On Linux, to get the stack size for a process, you can run:

ulimit -s

By default this shows a small number; on my systems it is 8192 KiB = 8 MiB, whereas the heap can provide gigabytes of memory.

Allocating a large number of small objects fragments memory and hurts cache behavior. A memory pool can mitigate these problems; there is a good article on this topic [3] that I recommend reading.

Objects on the stack

How can you make an object that is similar to std::unique_ptr but lives entirely on the stack?

C++ has std::aligned_storage [4], which provides raw memory (on the stack, when used as a local variable), and in this memory you can create an object of the desired class T using placement new [5]. You must check that the buffer is at least sizeof(T) bytes.

Thus, at the cost of a microscopic overhead (a few unused bytes), objects of an arbitrary class can be created on the stack.
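The idea can be tried in isolation. The sketch below is only an illustration: the Counter type and the demo_placement_new function are made up for this example, and the buffer is an alignas-qualified std::byte array rather than std::aligned_storage (which is deprecated as of C++23).

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical payload type, used only for this illustration.
struct Counter {
    int value;
    explicit Counter(int v) : value(v) {}
    int next() { return ++value; }
};

int demo_placement_new() {
    // Raw storage with automatic (stack) lifetime, aligned for Counter.
    // It is deliberately a few bytes larger than needed.
    alignas(Counter) std::byte buf[sizeof(Counter) + 12];
    static_assert(sizeof(buf) >= sizeof(Counter), "buffer too small");

    Counter* c = new (buf) Counter(41);  // placement new: no heap allocation
    int result = c->next();
    c->~Counter();                       // the destructor must be called by hand
    return result;
}
```

The explicit destructor call is the part that is easy to forget; a static_ptr-like wrapper exists largely to automate it.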

sp::static_ptr

Intending to make a stack-only analog of std::unique_ptr<T>, I decided to look for ready-made implementations, because the idea seems to lie on the surface.

Thinking of names like stack_ptr, static_ptr, etc., and searching for them on GitHub, I found a sane implementation in the ceph project [6], in ceph/static_ptr.h [7], and saw some useful ideas there. However, this class is rarely used in the project, and the implementation has a number of significant flaws.

The implementation may look like this: there is a buffer for the object itself (a std::aligned_storage), plus some data that lets you handle the object correctly, for example to call the destructor of exactly the type currently contained in static_ptr.

Figure: sp::static_ptr with objects of different sizes (32-byte buffer)

Implementation: how hard is move?

Here I describe a step-by-step implementation and the many pitfalls that may come up.

I decided to put the static_ptr class inside the namespace sp (short for static pointer).

Implementing containers, smart pointers, and similar things is among the hardest code to write in C++, because you have to think about cases that ordinary projects never encounter.

Let’s say we want to move-construct an object from one memory location into another. You can write it like this:

template <typename T>
struct move_constructer {
    static void call(T* lhs, T* rhs) {
        new (lhs) T(std::move(*rhs));
    }
};
// call `move_constructer<T>::call(dst, src);`

However, what if the class T doesn’t have a move constructor?

T may still have a move assignment operator, in which case we use it on a default-constructed object. If that is missing too, compilation must fail.

The newer the C++ standard, the easier it is to write code for such things. We get the following code (if constexpr requires C++17; the templated lambda in the last branch is formally C++20 syntax, which GCC and Clang accept in C++17 mode with a warning):

template <typename T>
struct move_constructer {
    static void call(T* lhs, T* rhs) {
        if constexpr (std::is_move_constructible_v<T>) {
            new (lhs) T(std::move(*rhs));
        } else if constexpr (std::is_default_constructible_v<T> && std::is_move_assignable_v<T>) {
            new (lhs) T();
            *lhs = std::move(*rhs);
        } else {
            []<bool flag = false>(){ static_assert(flag, "move constructor disabled"); }();
        }
    }
};

(In the final else branch, compilation is deliberately broken: the static_assert is wrapped in a hack [8] so that it fires only when that branch is instantiated.)

However, it would be nice to specify noexcept when possible. In C++20 we get the following code, about as simple as it can be at the moment:

template <typename T>
struct move_constructer {
    static void call(T* lhs, T* rhs)
        noexcept (std::is_nothrow_move_constructible_v<T>)
        requires (std::is_move_constructible_v<T>)
    {
        new (lhs) T(std::move(*rhs));
    }

    static void call(T* lhs, T* rhs)
        noexcept (std::is_nothrow_default_constructible_v<T> && std::is_nothrow_move_assignable_v<T>)
        requires (!std::is_move_constructible_v<T> && std::is_default_constructible_v<T> && std::is_move_assignable_v<T>)
    {
        new (lhs) T();
        *lhs = std::move(*rhs);
    }
};

Similarly, by case analysis, you can write a move_assigner structure. copy_constructer and copy_assigner could be written too, but our implementation does not need them: in static_ptr the copy constructor and copy assignment operator are deleted (as in unique_ptr).

Implementation: a homemade std::type_info

Although static_ptr can hold any object, we still need to somehow “know” what type is in there, for example so that we can call the destructor of that particular object, among other things.

After several attempts, I settled on this design, built around a structure ops:

struct ops {
    using binary_func = void(*)(void* dst, void* src);
    using unary_func = void(*)(void* dst);

    binary_func move_construct_func;
    binary_func move_assign_func;
    unary_func destruct_func;
};

And a couple of helper functions to convert void* to T*:

template<typename T, typename Functor>
void call_typed_func(void* dst, void* src) {
    Functor::call(static_cast<T*>(dst), static_cast<T*>(src));
}

template<typename T>
void destruct_func(void* dst) {
    static_cast<T*>(dst)->~T();
}

Now each type T can have its own instance of ops:

template<typename T>
static constexpr ops ops_for{
    .move_construct_func = &call_typed_func<T, move_constructer<T>>,
    .move_assign_func = &call_typed_func<T, move_assigner<T>>,
    .destruct_func = &destruct_func<T>,
};
using ops_ptr = const ops*;

static_ptr will store a pointer to ops_for<T>, where T is the class of the object currently held in static_ptr.
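To see why a pointer to a per-type table is enough to “remember” the type, here is a reduced sketch with a single operation. The names mini_ops, erased, put and clear are invented for this illustration and are not part of the library.

```cpp
#include <cassert>
#include <new>

// Reduced table with one operation (the article's ops has three).
struct mini_ops {
    void (*destruct_func)(void* obj);
};

template <typename T>
void destruct_as(void* obj) { static_cast<T*>(obj)->~T(); }

// One table instance per type; its address identifies the type.
template <typename T>
inline constexpr mini_ops mini_ops_for{&destruct_as<T>};

struct Apple  { static inline int destroyed = 0; ~Apple()  { ++destroyed; } };
struct Banana { static inline int destroyed = 0; ~Banana() { ++destroyed; } };

// Type-erased holder: raw storage plus a pointer to the table.
struct erased {
    alignas(16) unsigned char buf[16];
    const mini_ops* ops = nullptr;
};

template <typename T>
void put(erased& e) {
    new (e.buf) T();
    e.ops = &mini_ops_for<T>;
}

void clear(erased& e) {
    if (e.ops) {
        e.ops->destruct_func(e.buf);  // calls the destructor of the stored type
        e.ops = nullptr;
    }
}
```

Comparing two such pointers is also what lets the move logic tell “same type” from “different types” without RTTI.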

Implementation: I like to move it, move it

A static_ptr cannot be copied; it can only be moved into another static_ptr. How the move is performed depends on the types of the objects held by the two static_ptr instances:

  1. Both static_ptr are empty (dst_ops and src_ops are both nullptr): do nothing.

  2. Both static_ptr contain the same type (dst_ops == src_ops): move-assign, then destroy the object in src.

  3. The static_ptr contain different types (dst_ops != src_ops): destroy the object in dst, move-construct the new object there, destroy the object in src, then assign dst_ops = src_ops.

In the last two cases src ends up empty. The result is this method:

// moving objects using ops
static void move_construct(void* dst_buf, ops_ptr& dst_ops,
                           void* src_buf, ops_ptr& src_ops) {
    if (!src_ops && !dst_ops) {
        // both pointers are empty, nothing to do
        return;
    } else if (src_ops == dst_ops) {
        // objects have the same type: move-assign, then destroy the source
        (*src_ops->move_assign_func)(dst_buf, src_buf);
        (*src_ops->destruct_func)(src_buf);
        src_ops = nullptr;
    } else {
        // objects have different types
        // delete the old object
        if (dst_ops) {
            (*dst_ops->destruct_func)(dst_buf);
            dst_ops = nullptr;
        }
        // construct the new object
        if (src_ops) {
            (*src_ops->move_construct_func)(dst_buf, src_buf);
            (*src_ops->destruct_func)(src_buf);
        }
        dst_ops = src_ops;
        src_ops = nullptr;
    }
}
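The three cases can be checked with a small self-contained harness. This is a sketch rather than the library's test: Tracked, slot, fill and the global trace string are invented here, and the ops table is filled by three small helper templates instead of move_constructer and move_assigner. The move_construct function itself is copied from above.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

struct ops {
    using binary_func = void (*)(void* dst, void* src);
    using unary_func = void (*)(void* dst);
    binary_func move_construct_func;
    binary_func move_assign_func;
    unary_func destruct_func;
};
using ops_ptr = const ops*;

std::string trace;  // records which operations ran: C, A or D

// Hypothetical payload; Tag only serves to create distinct types.
template <char Tag>
struct Tracked {
    Tracked() = default;
    Tracked(Tracked&&) noexcept { trace += 'C'; }
    Tracked& operator=(Tracked&&) noexcept { trace += 'A'; return *this; }
    ~Tracked() { trace += 'D'; }
};

template <typename T> void do_move_construct(void* d, void* s) {
    new (d) T(std::move(*static_cast<T*>(s)));
}
template <typename T> void do_move_assign(void* d, void* s) {
    *static_cast<T*>(d) = std::move(*static_cast<T*>(s));
}
template <typename T> void do_destruct(void* d) { static_cast<T*>(d)->~T(); }

template <typename T>
inline constexpr ops ops_for{&do_move_construct<T>, &do_move_assign<T>, &do_destruct<T>};

// Same logic as the method above.
static void move_construct(void* dst_buf, ops_ptr& dst_ops,
                           void* src_buf, ops_ptr& src_ops) {
    if (!src_ops && !dst_ops) {
        return;
    } else if (src_ops == dst_ops) {
        (*src_ops->move_assign_func)(dst_buf, src_buf);
        (*src_ops->destruct_func)(src_buf);
        src_ops = nullptr;
    } else {
        if (dst_ops) {
            (*dst_ops->destruct_func)(dst_buf);
            dst_ops = nullptr;
        }
        if (src_ops) {
            (*src_ops->move_construct_func)(dst_buf, src_buf);
            (*src_ops->destruct_func)(src_buf);
        }
        dst_ops = src_ops;
        src_ops = nullptr;
    }
}

// Raw storage plus table pointer, standing in for two static_ptr objects.
struct slot {
    alignas(16) unsigned char buf[16];
    ops_ptr ops = nullptr;
};

template <typename T>
void fill(slot& s) {
    new (s.buf) T();
    s.ops = &ops_for<T>;
}
```

With two slots a and b, moving b into a when both hold Tracked<'x'> appends "AD" to trace (assign, destroy source). When they hold different types it appends "DCD" (destroy destination, construct, destroy source).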

Implementation: buffer size and alignment

Now we need to decide on the default buffer size and the alignment [9], because std::aligned_storage requires both values.

Clearly, the alignment of a descendant class may exceed the alignment of its ancestor [10]. Therefore the alignment should be the largest one possible. The type std::max_align_t [11] helps with this:

static constexpr std::size_t align = alignof(std::max_align_t);

On my systems this value is 16, but other values are possible on other platforms.

By the way, memory from the heap (from malloc) is also automatically aligned for any fundamental type, that is, to at least this alignment.

The default buffer size can be set to 16 bytes or sizeof(T), whichever is larger.
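A tiny compile-time check illustrates the alignment point. SmallBase and WideDerived are made-up types for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hierarchy: the base holds only a char, the derived adds a double.
struct SmallBase { char tag; };
struct WideDerived : SmallBase { double payload; };

// The derived class needs stricter alignment than its base...
constexpr bool derived_is_stricter = alignof(WideDerived) > alignof(SmallBase);
static_assert(derived_is_stricter, "expected the double member to raise the alignment");

// ...but still fits within the alignment chosen for the buffer.
static_assert(alignof(WideDerived) <= alignof(std::max_align_t),
              "ordinary types never exceed max_align_t");
```

Types declared with a larger alignas can exceed even std::max_align_t, so a check of alignof(Derived) against the buffer alignment may be worth adding next to the size check. That is a suggestion, not something the article's code does.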

template<typename T>
struct static_ptr_traits {
    static constexpr std::size_t buffer_size = std::max(static_cast<std::size_t>(16), sizeof(T));
};

Clearly, this value will almost always need to be overridden so that objects of all descendant classes fit. It is convenient to do this with a macro so that it is quick to write. Such a macro can override the buffer size for a single class:

#define STATIC_PTR_BUFFER_SIZE(Tp, size)                   \
namespace sp {                                             \
    template<> struct static_ptr_traits<Tp> {              \
        static constexpr std::size_t buffer_size = size;   \
    };                                                     \
}

// example:
STATIC_PTR_BUFFER_SIZE(IEngine, 1024)

However, this is not enough for the chosen size to be “inherited” by all descendant classes. Another macro, based on std::is_base_of, handles that:

#define STATIC_PTR_INHERITED_BUFFER_SIZE(Tp, size)         \
namespace sp {                                             \
    template<typename T> requires std::is_base_of_v<Tp, T> \
    struct static_ptr_traits<T> {                          \
        static constexpr std::size_t buffer_size = size;   \
    };                                                     \
}

// example:
STATIC_PTR_INHERITED_BUFFER_SIZE(IEngine, 1024)

Implementation: sp::static_ptr

Now we can give the implementation of the class itself. It has only two fields: a pointer to ops and a buffer for the object:

template<typename Base>
requires(!std::is_void_v<Base>)
class static_ptr {
private:
    static constexpr std::size_t buffer_size = static_ptr_traits<Base>::buffer_size;
    static constexpr std::size_t align = alignof(std::max_align_t);

    // Struct for calling object's operators
    // equals to `nullptr` when `buf_` contains no object
    // equals to `ops_for<T>` when `buf_` contains a `T` object
    ops_ptr ops_;

    // Storage for underlying `T` object
    // this is mutable so that `operator*` and `get()` can
    // be marked const
    mutable std::aligned_storage_t<buffer_size, align> buf_;

    // ...

First, let’s implement the reset method, which destroys the object; it is used often:

    // destruct the underlying object
    void reset() noexcept(std::is_nothrow_destructible_v<Base>) {
        if (ops_) {
            (ops_->destruct_func)(&buf_);
            ops_ = nullptr;
        }
    }

We implement basic constructors by analogy with std::unique_ptr:

    // operators, ctors, dtor
    static_ptr() noexcept : ops_{nullptr} {}

    static_ptr(std::nullptr_t) noexcept : ops_{nullptr} {}
    static_ptr& operator=(std::nullptr_t) noexcept(std::is_nothrow_destructible_v<Base>) {
        reset();
        return *this;
    }

Now we can implement the move constructor and the move assignment operator. To accept a static_ptr of the same type, write:

    static_ptr(static_ptr&& rhs) : ops_{nullptr} {
        move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
    }

    static_ptr& operator=(static_ptr&& rhs) {
        move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
        return *this;
    }

However, it is better if we can also accept a static_ptr of another type. The other type must fit into the buffer and be a descendant of the current type:

    template<typename Derived>
    struct derived_class_check {
        static constexpr bool ok = sizeof(Derived) <= buffer_size && std::is_base_of_v<Base, Derived>;
    };

All instantiations of the class must also be declared friends of each other:

    // support static_ptr's conversions of different types
    template <typename T> friend class static_ptr;

Then the two previous methods can be rewritten like this:

    template<typename Derived = Base>
    static_ptr(static_ptr<Derived>&& rhs)
        requires(derived_class_check<Derived>::ok)
        : ops_{nullptr}
    {
        move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
    }

    template<typename Derived = Base>
    static_ptr& operator=(static_ptr<Derived>&& rhs)
        requires(derived_class_check<Derived>::ok)
    {
        move_construct(&buf_, ops_, &rhs.buf_, rhs.ops_);
        return *this;
    }

Copying is prohibited:

    static_ptr(const static_ptr&) = delete;
    static_ptr& operator=(const static_ptr&) = delete;

The destructor destroys the object in the buffer:

    ~static_ptr() {
        reset();
    }

To create an object in the buffer, let’s add an emplace method. The old object is destroyed (if there is one), a new one is constructed in the buffer, and the pointer to ops is updated.

    // in-place (re)initialization
    template<typename Derived = Base, typename ...Args>
    Derived& emplace(Args&&... args)
        noexcept(std::is_nothrow_constructible_v<Derived, Args...>)
        requires(derived_class_check<Derived>::ok)
    {
        reset();
        Derived* derived = new (&buf_) Derived(std::forward<Args>(args)...);
        ops_ = &ops_for<Derived>;
        return *derived;
    }

The accessor methods are modeled on those of std::unique_ptr (with an overloaded operator& added):

    // accessors
    Base* get() noexcept {
        return ops_ ? reinterpret_cast<Base*>(&buf_) : nullptr;
    }
    const Base* get() const noexcept {
        return ops_ ? reinterpret_cast<const Base*>(&buf_) : nullptr;
    }

    Base& operator*() noexcept { return *get(); }
    const Base& operator*() const noexcept { return *get(); }

    Base* operator&() noexcept { return get(); }
    const Base* operator&() const noexcept { return get(); }

    Base* operator->() noexcept { return get(); }
    const Base* operator->() const noexcept { return get(); }

    explicit operator bool() const noexcept { return ops_ != nullptr; }

By analogy with std::make_unique and std::make_shared, let’s add a function sp::make_static:

template<typename T, class ...Args>
static static_ptr<T> make_static(Args&&... args) {
    static_ptr<T> ptr;
    ptr.emplace(std::forward<Args>(args)...);
    return ptr;
}

The implementation is available on GitHub [12]!

How to use sp::static_ptr?

It’s simple! I wrote unit tests that show the lifetime of the objects living inside static_ptr [13].

In the test, you can see typical scenarios for working with static_ptr and what happens to the objects inside them.
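For a feel of the typical scenario without the repository at hand, here is a condensed stand-in. mini_static_ptr is not the library's class: it is a simplified sketch with a fixed 32-byte buffer, no move support, and a single destructor pointer playing the role of ops_. The engine class bodies are likewise invented.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Simplified stand-in for sp::static_ptr (assumption, not the real class).
template <typename Base, std::size_t BufferSize = 32>
class mini_static_ptr {
    using destroy_fn = void (*)(void*);
    destroy_fn destroy_ = nullptr;  // null when the buffer holds no object
    alignas(std::max_align_t) unsigned char buf_[BufferSize];

public:
    mini_static_ptr() = default;
    mini_static_ptr(const mini_static_ptr&) = delete;
    mini_static_ptr& operator=(const mini_static_ptr&) = delete;
    ~mini_static_ptr() { reset(); }

    void reset() {
        if (destroy_) {
            destroy_(buf_);
            destroy_ = nullptr;
        }
    }

    template <typename Derived, typename... Args>
    Derived& emplace(Args&&... args) {
        static_assert(std::is_base_of_v<Base, Derived>, "Derived must inherit from Base");
        static_assert(sizeof(Derived) <= BufferSize, "object does not fit into the buffer");
        reset();
        Derived* obj = new (buf_) Derived(std::forward<Args>(args)...);
        destroy_ = [](void* p) { static_cast<Derived*>(p)->~Derived(); };
        return *obj;
    }

    // As in the article, this assumes the Base subobject starts at the
    // beginning of the stored object (single inheritance).
    Base* get() {
        return destroy_ ? std::launder(reinterpret_cast<Base*>(buf_)) : nullptr;
    }
    Base* operator->() { return get(); }
    explicit operator bool() const { return destroy_ != nullptr; }
};

struct IEngine {
    virtual ~IEngine() = default;
    virtual std::string name() const = 0;
};
struct TSteamEngine : IEngine {
    std::string name() const override { return "steam"; }
};
struct TRocketEngine : IEngine {
    double thrust;
    explicit TRocketEngine(double t) : thrust(t) {}
    std::string name() const override { return "rocket"; }
};
```

A mini_static_ptr<IEngine> declared as a local variable can first emplace a TSteamEngine and later a TRocketEngine into the same buffer. Each emplace destroys the previous object first, and virtual calls through operator-> dispatch to whichever engine is currently stored.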

Benchmark

For benchmarks, I used the google/benchmark library [14]. The code is in the repository [15].

I considered two scenarios; each is run for both std::unique_ptr and sp::static_ptr:

  1. Creating a smart pointer and calling an object method.

  2. Iterating over a vector of 128 smart pointers, calling a method on each.

In the first scenario, sp::static_ptr should win because there is no allocation; in the second, because of memory locality. Of course, compilers are very smart and can optimize “bad” scenarios well, depending on the optimization flags.

Let’s run the benchmark in a Debug build:

***WARNING*** Library was built as DEBUG. Timings may be affected.
-------------------------------------------------------------------------------------------------
Benchmark                                                       Time             CPU   Iterations
-------------------------------------------------------------------------------------------------
BM_SingleSmartPointer<std::unique_ptr<IEngine>>               207 ns          207 ns      3244590
BM_SingleSmartPointer<sp::static_ptr<IEngine>>               39.1 ns         39.1 ns     17474886
BM_IteratingOverSmartPointer<std::unique_ptr<IEngine>>       3368 ns         3367 ns       204196
BM_IteratingOverSmartPointer<sp::static_ptr<IEngine>>        1716 ns         1716 ns       397344

In Release build:

-------------------------------------------------------------------------------------------------
Benchmark                                                       Time             CPU   Iterations
-------------------------------------------------------------------------------------------------
BM_SingleSmartPointer<std::unique_ptr<IEngine>>              14.5 ns         14.5 ns     47421573
BM_SingleSmartPointer<sp::static_ptr<IEngine>>               3.57 ns         3.57 ns    197401957
BM_IteratingOverSmartPointer<std::unique_ptr<IEngine>>        198 ns          198 ns      3573888
BM_IteratingOverSmartPointer<sp::static_ptr<IEngine>>         195 ns          195 ns      3627462

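Thus, sp::static_ptr, the stack-only counterpart of std::unique_ptr, shows a measurable gain where allocation dominates: in the Release build, creating a single pointer and calling a method takes 3.57 ns versus 14.5 ns (about 4 times faster). Iterating over 128 pointers is practically the same in Release (195 ns versus 198 ns), although in Debug it is about twice as fast (1716 ns versus 3368 ns).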

Links

  1. Boost.SmartPtr

  2. std::unique_ptr – cppreference.com

  3. C++ Memory Pool and Small Object Allocator | by Debby Nirwan

  4. std::aligned_storage – cppreference.com

  5. Placement new operator in C++ – GeeksforGeeks

  6. ceph – github.com

  7. ceph/static_ptr.h – github.com

  8. c++ – constexpr if and static_assert

  9. Objects and alignment – cppreference.com

  10. godbolt.com – the alignment of the descendant class is greater than that of the ancestor class

  11. std::max_align_t – cppreference.com

  12. Izaron/static_ptr – github.com

  13. Izaron/static_ptr, test_derives.cc – github.com

  14. google/benchmark – github.com

  15. Izaron/static_ptr, benchmark.cc – github.com
