The complex made simple: move in C++

What do people usually say about move? That it is a cool feature and that code runs faster with it. How much faster? Let’s check.

To evaluate the performance, let’s take the following class:

#include <chrono>
#include <iostream>
#include <string>
#include <utility>

class LogDuration {
public:
    LogDuration(std::string id)
        : id_(std::move(id)) {
    }

    ~LogDuration() {
        const auto end_time = std::chrono::steady_clock::now();
        const auto dur = end_time - start_time_;
        std::cout << id_ << ": ";
        std::cout << "operation time"
                  << ": " << std::chrono::duration_cast<std::chrono::milliseconds>(dur).count()
                  << " ms" << std::endl;
    }

private:
    const std::string id_;
    const std::chrono::steady_clock::time_point start_time_ = std::chrono::steady_clock::now();
};

Don’t be alarmed: we only need it as a simple stopwatch for our experiments. To measure how long an operation takes, it is enough to write:

    {
        LogDuration ld("identifier");
        // some operations
    }

Here the curly braces define a scope. When execution leaves it, destructors run for the objects created inside, including ~LogDuration(), which prints how long the operations inside the block took.
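
To make the scope rule concrete, here is a minimal sketch that is separate from LogDuration. The names ScopeTracer and RunScopeDemo are hypothetical and exist only for this illustration; the class records in a log when its constructor and destructor run.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration: writes a line to an external log
// when it is constructed and when it is destroyed.
class ScopeTracer {
public:
    ScopeTracer(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back(name_ + " created");
    }

    ~ScopeTracer() {
        log_.push_back(name_ + " destroyed");
    }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Returns the sequence of events recorded around one block scope.
std::vector<std::string> RunScopeDemo() {
    std::vector<std::string> log;
    {
        ScopeTracer first("first", log);
        ScopeTracer second("second", log);
        log.push_back("inside block");
    }  // second is destroyed here, then first (reverse order of creation)
    log.push_back("after block");
    return log;
}
```

Calling RunScopeDemo() from main and printing the returned lines shows that both destructors run at the closing brace, before anything that follows the block. This is the same mechanism that lets LogDuration report its time.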

So let’s start experimenting.

People say that for vectors and strings (std::string), move should be used whenever possible. Let’s check with the following code:

int main() {
    vector<uint8_t> big_vector(1e9, 0);

    {
        LogDuration ld("vector copy");
        vector<uint8_t> reciever(big_vector);
    }
    cout << "size of big_vector is " << big_vector.size() << '\n';
}

Here we create a vector big_vector of 10^9 zeros and then create a new vector as a copy of it. The time taken to create the copy is printed to the console:

vector copy: operation time: 484 ms
size of big_vector is 1000000000

Valgrind shows that about 2 GB of heap memory was allocated while the program ran:

total heap usage: 4 allocs, 4 frees, 2,000,073,728 bytes allocated

So we got two identical vectors at a cost of half a second and 2 GB of RAM. The next question: what if we never need the original vector again later in the code? Then we could save 1 GB. Let’s see what happens if we add move by making this replacement:

- vector<uint8_t> reciever(big_vector);
+ vector<uint8_t> reciever(move(big_vector));

And lo and behold! The execution time dropped by more than a factor of 10, and the size of the original vector became zero:

vector move: operation time: 34 ms
size of big_vector is 0

The Valgrind report looks better too:

total heap usage: 3 allocs, 3 frees, 1,000,073,728 bytes allocated

It turns out that by using move we gained speed but sacrificed the original vector. I suggest you try the same experiment with a long string instead of a vector.
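
For the string experiment, a minimal sketch might look like the code below. The names Sizes, CopyLongString and MoveLongString are hypothetical and are not part of the original program; the functions only report the sizes of the source and the receiver after each operation.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Sizes of the source string and the receiver after an operation.
struct Sizes {
    std::size_t source;
    std::size_t receiver;
};

// Copy: the receiver gets its own buffer, the source keeps its data.
Sizes CopyLongString(std::size_t length) {
    std::string big_string(length, 'a');
    std::string receiver(big_string);
    return {big_string.size(), receiver.size()};
}

// Move: the receiver takes over the contents of the source.
Sizes MoveLongString(std::size_t length) {
    std::string big_string(length, 'a');
    std::string receiver(std::move(big_string));
    // The standard leaves big_string in a valid but unspecified state;
    // common implementations leave it empty, but that is not guaranteed.
    return {big_string.size(), receiver.size()};
}
```

To time the two variants, wrap each call in its own block with a LogDuration object, exactly as in the vector experiment, and use a length large enough for the difference to be visible.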

Now let’s try to figure out what is going on here. Let’s write our own vector, or more precisely, a simple wrapper around the standard vector:

template <typename T>
class Vector {
public:
    Vector(size_t size, T value)
        : data_(size, value) {
    }
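    // Deliberately copies nothing: it only reports that it was called.
    Vector(const Vector& rhs) {
        cout << "copy constructor was called\n";
    }
    // Deliberately moves nothing: it only reports that it was called.
    Vector(Vector&& rhs) noexcept {
        cout << "move constructor was called\n";
    }
    size_t size() const {
        return data_.size();
    }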

private:
    vector<T> data_;
};

Again, don’t be alarmed: only the public section matters here. Add this code before main() in your program, and inside main capitalize the first letter of vector wherever it appears. For the line:

Vector<uint8_t> reciever(big_vector);

the console will output:

copy constructor was called
vector copy: operation time: 0 ms
size of big_vector is 1000000000

And for the variant with move:

move constructor was called
vector move: operation time: 0 ms
size of big_vector is 1000000000

This brings us to the key observation: despite its name, the move function does not move anything by itself. It only converts its argument to an rvalue reference so that overload resolution picks the move constructor, Vector(Vector&& rhs) in this example. Since the constructors of our wrapper class only print text, it is no surprise that the operation takes almost no time and that the original vector stays intact.
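
The observation above can be checked with a small sketch. The overloaded function Which and the Call... helpers are hypothetical names used only for this illustration; the point is that std::move behaves like a static_cast to an rvalue reference and changes only which overload is selected.

```cpp
#include <string>
#include <utility>

// Hypothetical overload pair that reports which kind of reference was chosen.
// Neither overload touches its argument.
std::string Which(const std::string&) { return "lvalue overload"; }
std::string Which(std::string&&) { return "rvalue overload"; }

std::string CallPlain() {
    std::string text = "some text";
    return Which(text);  // a named object binds to const std::string&
}

std::string CallWithMove() {
    std::string text = "some text";
    return Which(std::move(text));  // the cast selects std::string&&
}

std::string CallWithCast() {
    std::string text = "some text";
    // Same effect as std::move, written out by hand.
    return Which(static_cast<std::string&&>(text));
}

std::string TextAfterMoveAlone() {
    std::string text = "some text";
    Which(std::move(text));  // Which ignores its argument, so nothing is moved
    return text;             // still "some text"
}
```

CallWithMove() and CallWithCast() both select the rvalue overload, and TextAfterMoveAlone() returns the unchanged string, because no move constructor or move assignment ever ran.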

The use of move is not limited to class constructors. For example:

void CopyFoo(string text) {}
void CopyRefFoo(const string& text) {}
void MoveFoo(string&& text) {}

int main() {
    string text;
    text = "some text";
  
    CopyRefFoo(text);
    CopyFoo(text);

//    MoveFoo(text); // compile error
    MoveFoo("another text");
    MoveFoo(move(text));
}

Look at the commented-out call MoveFoo(text) (line 12 of the listing). The signature of this function contains the “magic” characters &&, which prevent the text object from being passed to it, yet an ownerless string in quotes is accepted. Now look at line 7, where text is assigned “some text”. How do the two sides differ fundamentally, apart from sitting to the left and to the right of the assignment operator?
The difference is that text is a named object with an address in memory, while the string built from an expression such as “another text” has no name, and its address is hard to get hold of because it is short-lived. The address of a persistent object can be printed like this:

cout << &text << '\n';
// 0x7ffdfd45dce0

So, for MoveFoo to accept an argument, that argument “must not have an address”, like “another text”. Such objects are called temporaries. Now we can finally say what the move function does: it makes its argument pretend to be “unaddressed”, i.e. temporary, which is why the call MoveFoo(move(text)) on line 14 compiles fine. And if MoveFoo does nothing with text, the string will not vanish or be transferred anywhere on its own. Then what is all the fuss about? Well, if you write:

void MoveFoo(string&& text) {
    string tmp(move(text));
}

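then after this function runs, the text variable in the outer block will be empty (observed with gcc 7.5, C++17), just like the vector in the move experiment at the very beginning. The standard itself only promises that a moved-from string is left in a valid but unspecified state.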
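
Because the state of a moved-from string is not pinned down by the standard, a safe habit is to assign a new value before reading the variable again. Here is a minimal sketch; TakeText and ReuseAfterMove are hypothetical names for this illustration.

```cpp
#include <string>
#include <utility>

// Takes over the contents of its argument by move-constructing a local
// string, then hands that string back to the caller.
std::string TakeText(std::string&& text) {
    std::string tmp(std::move(text));
    return tmp;
}

std::string ReuseAfterMove() {
    std::string text = "some text";
    std::string taken = TakeText(std::move(text));
    // text is still a valid object, but its contents are unspecified here.
    // Assigning a new value makes it safe to read again.
    text = "fresh value";
    return taken + " | " + text;
}
```

ReuseAfterMove() returns "some text | fresh value": the original contents ended up in taken, and text is fully usable again after the assignment.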

Now let’s return to the question of why the original vector “moved” into the new one so quickly.
We have one observation to start from: with move, the program allocated only about as much memory as the original vector occupies.
Let’s picture a vector as a data structure that, in its most simplified form, stores an address (a pointer) to the place in memory where all its elements live. Remember that the elements of a vector are laid out contiguously in memory, without gaps. The second field is a variable holding the current size of the vector. We also know that after the “move” the original vector is empty.
Now imagine two vectors: one with 10^9 elements, the other empty. The simplest solution is for them to “exchange” their contents. The new vector changes its pointer to the start of the data block that the original owned and updates its size, while the original takes the same fields from the empty vector. It’s that simple. If you follow the call chain from the move constructor in a debugger, you can find the following code in the standard library file stl_vector.h:

    void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
    {
      std::swap(_M_start, __x._M_start);
      std::swap(_M_finish, __x._M_finish);
      std::swap(_M_end_of_storage, __x._M_end_of_storage);
    }

The real implementation is, of course, much more involved, but the general principle is roughly this.
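
The same principle can be sketched with a toy container that holds just a pointer and a size, as in the simplified picture of a vector above. ToyBuffer is a hypothetical class written for this illustration only; it is not how std::vector is actually implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

// Toy container for illustration only: one pointer to the data, one size.
class ToyBuffer {
public:
    explicit ToyBuffer(std::size_t size)
        : data_(new std::uint8_t[size]()), size_(size) {
    }

    // Move constructor: take over the pointer and the size, and leave rhs
    // empty. No element is copied, so the cost does not depend on the size.
    ToyBuffer(ToyBuffer&& rhs) noexcept
        : data_(std::exchange(rhs.data_, nullptr)),
          size_(std::exchange(rhs.size_, 0)) {
    }

    // Copying and assignment are left out to keep the sketch short.
    ToyBuffer(const ToyBuffer&) = delete;
    ToyBuffer& operator=(const ToyBuffer&) = delete;

    ~ToyBuffer() {
        delete[] data_;  // deleting a null pointer is a no-op
    }

    std::size_t size() const { return size_; }
    const std::uint8_t* data() const { return data_; }

private:
    std::uint8_t* data_;
    std::size_t size_;
};
```

After ToyBuffer receiver(std::move(original)), receiver.data() returns the address that original.data() returned before the move, and original reports a size of zero. Only two small fields changed hands, which is why the time does not grow with the number of elements.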

I really hope the main points of using move are now clearer. As a next step, I recommend reading more rigorous material on move semantics, where I hope you will easily spot the analogies with lvalues, rvalues, and so on. And to more experienced developers: if you have read to the end, I will be glad to hear your comments and remarks.
