C++ vtables. Part 2 (Virtual Inheritance + Compiler-Generated Code)

This translation was prepared specifically for students of the "C++ Developer" course. Interested in developing in this direction? Watch the recording of the Google Test Framework practice class!

Part 3 – Virtual Inheritance

In the first and second parts of this article, we talked about how vtables work in the simplest cases, and then in multiple inheritance. Virtual inheritance complicates the situation even more.

As you may recall, virtual inheritance means that a class contains only one instance of a virtually inherited base class, no matter how many inheritance paths lead to it. For example:

class ios ...
class istream: virtual public ios ...
class ostream: virtual public ios ...
class iostream: public istream, public ostream

If not for the virtual keyword above, iostream would actually contain two instances of ios, which could cause synchronization headaches and would simply be inefficient.

To understand virtual inheritance, we will consider the following code fragment:

#include <iostream>
using namespace std;

class Grandparent {
 public:
  virtual void grandparent_foo() {}
  int grandparent_data;
};

class Parent1 : virtual public Grandparent {
 public:
  virtual void parent1_foo() {}
  int parent1_data;
};

class Parent2 : virtual public Grandparent {
 public:
  virtual void parent2_foo() {}
  int parent2_data;
};

class Child : public Parent1, public Parent2 {
 public:
  virtual void child_foo() {}
  int child_data;
};

int main() {
  Child child;
}

Let's explore child. I'll start by dumping a lot of memory beginning exactly where the vtable for Child starts, as we did in the previous parts, and then analyze the results. I suggest taking a quick look at the output now and returning to it as I reveal the details below.

(gdb) p child
$1 = {<Parent1> = {<Grandparent> = {_vptr$Grandparent = 0x400998 <vtable for Child+96>, grandparent_data = 0}, _vptr$Parent1 = 0x400950 <vtable for Child+24>, parent1_data = 0}, <Parent2> = {_vptr$Parent2 = 0x400978 <vtable for Child+64>, parent2_data = 4195888}, child_data = 0}
(gdb) x/600xb 0x400938
0x400938 : 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400940 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400948 : 0x00 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400950 : 0x70 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400958 : 0xa0 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400960 : 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400968 : 0xf0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400970 : 0x00 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400978 : 0x90 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400980 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400988 : 0xe0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400990 : 0x00 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400998 : 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x4009a0 : 0x50 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009a8 : 0xf8 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009b0 : 0x18 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009b8 : 0x98 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009c0 : 0xb8 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009c8 : 0x98 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009d0 : 0x78 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009d8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4009e0 : 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4009e8 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4009f0 : 0x50 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009f8 : 0x70 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400a00 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a08 : 0xe0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400a10 : 0x50 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a18 : 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400a20 : 0x37 0x50 0x61 0x72 0x65 0x6e 0x74 0x31
0x400a28 : 0x00 0x31 0x31 0x47 0x72 0x61 0x6e 0x64
0x400a30 : 0x70 0x61 0x72 0x65 0x6e 0x74 0x00 0x00
0x400a38 : 0x50 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400a40 : 0x29 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a48: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a50 : 0xa0 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400a58 : 0x20 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a60 : 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x400a68 : 0x38 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a70 : 0x03 0xe8 0xff 0xff 0xff 0xff 0xff 0xff
0x400a78: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a80 : 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a88 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a90 : 0xd0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a98 : 0x90 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400aa0 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400aa8 : 0xf0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400ab0 : 0xd0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400ab8 : 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400ac0 : 0x37 0x50 0x61 0x72 0x65 0x6e 0x74 0x32
0x400ac8 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400ad0 : 0xa0 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400ad8 : 0xc0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400ae0 : 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x400ae8 : 0x38 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400af0 : 0x03 0xe8 0xff 0xff 0xff 0xff 0xff 0xff
0x400af8 : 0x35 0x43 0x68 0x69 0x6c 0x64 0x00 0x00
0x400b00 : 0xa0 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400b08 : 0xf8 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b10 : 0x02 0x00 0x00 0x00 0x02 0x00 0x00 0x00
0x400b18 : 0x50 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b20 : 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b28 : 0xd0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b30 : 0x02 0x10 0x00 0x00 0x00 0x00 0x00 0x00
0x400b38 : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b40 : 0x38 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b48 : 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00

Wow, that is a lot of information. gdb labels parts of this region as VTT for Child and construction vtable for X-in-Child, which raises two new questions: what is a VTT, and what is a construction vtable? We will answer both soon enough.
Let's start with the memory layout of Child:

Size      Value
8 bytes   _vptr$Parent1
4 bytes   parent1_data (+ 4 padding bytes)
8 bytes   _vptr$Parent2
4 bytes   parent2_data
4 bytes   child_data
8 bytes   _vptr$Grandparent
4 bytes   grandparent_data (+ 4 padding bytes)

Indeed, Child contains only one instance of Grandparent. The non-trivial part is that it comes last in memory, even though it is the highest in the hierarchy.
Here is the structure of the vtable:

Address    Value       Content
0x400938   0x20 (32)   virtual-base offset (we will discuss this soon)
0x400940   0           top_offset
0x400948   0x400b00    typeinfo for Child
0x400950   0x400870    Parent1::parent1_foo(). Parent1's vtable pointer points here.
0x400958   0x4008a0    Child::child_foo()
0x400960   0x10 (16)   virtual-base offset
0x400968   -16         top_offset
0x400970   0x400b00    typeinfo for Child
0x400978   0x400890    Parent2::parent2_foo(). Parent2's vtable pointer points here.
0x400980   0           virtual-base offset
0x400988   -32         top_offset
0x400990   0x400b00    typeinfo for Child
0x400998   0x400880    Grandparent::grandparent_foo(). Grandparent's vtable pointer points here.

The table above introduces a new concept: the virtual-base offset. We will soon see what it is doing there.
Next, let's explore those weird-looking construction vtables. Here is the construction vtable for Parent1-in-Child:

Value       Content
0x20 (32)   virtual-base offset
0           top_offset
0x400a50    typeinfo for Parent1
0x400870    Parent1::parent1_foo()
0           virtual-base offset
-32         top_offset
0x400a50    typeinfo for Parent1
0x400880    Grandparent::grandparent_foo()

At the moment, I think it would be more understandable to describe the process than to pile more tables with random numbers on you. So:

Imagine you are Child. You are asked to construct yourself in a fresh piece of memory. Since you inherit Grandparent directly (this is what virtual inheritance means), you first call its constructor yourself (if this weren't virtual inheritance, you would call Parent1's constructor, which in turn would call Grandparent's constructor). You set this += 32 bytes, since that is where Grandparent's data lives, and call the constructor. Very simple.

Then it is time to construct Parent1. Parent1 can safely assume that by the time it constructs itself, Grandparent has already been constructed, so it can, for example, access Grandparent's data and methods. But wait, how does it know where to find that data? It is not stored next to Parent1's own variables!

Enter the construction vtable for Parent1-in-Child. This table tells Parent1 where to find the pieces of data it is allowed to access. this points to Parent1's data. The virtual-base offset says where Grandparent's data can be found: step 32 bytes forward from this and you reach Grandparent's memory. Get it? The virtual-base offset is similar to top_offset, but for virtual base classes.

Now that we understand this, constructing Parent2 is basically the same, only using the construction vtable for Parent2-in-Child. And indeed, Parent2-in-Child has a virtual-base offset of 16 bytes.

Let the information soak in a bit. Are you ready to continue? Good.
Now let's get back to the VTT. Here is its structure:

Address    Value      Symbol                                          Content
0x4009a0   0x400950   vtable for Child + 24                           Parent1 entries in the vtable for Child
0x4009a8   0x4009f8   construction vtable for Parent1-in-Child + 24   Parent1 methods in Parent1-in-Child
0x4009b0   0x400a18   construction vtable for Parent1-in-Child + 56   Grandparent methods for Parent1-in-Child
0x4009b8   0x400a98   construction vtable for Parent2-in-Child + 24   Parent2 methods in Parent2-in-Child
0x4009c0   0x400ab8   construction vtable for Parent2-in-Child + 56   Grandparent methods for Parent2-in-Child
0x4009c8   0x400998   vtable for Child + 96                           Grandparent entries in the vtable for Child
0x4009d0   0x400978   vtable for Child + 64                           Parent2 entries in the vtable for Child

VTT stands for virtual-table table, which means that it is a table of vtables. It is a translation table that knows, for example, whether Parent1's constructor is being called for a standalone object, for Parent1-in-Child, or for Parent1-in-SomeOtherObject. It always appears right after the vtable so that the compiler knows where to find it. Therefore, there is no need to store another pointer in the objects themselves.

Phew... that was a lot of detail, but I think we covered everything I wanted to cover. In Part 4 we will talk about vtables at a higher level. Don't skip it, as it is probably the most important part of this article!

Part 4 – Code Generated by the Compiler

At this point in the article, we have learned how vtable and typeinfo records are placed in our binaries and how the compiler uses them. Now we will look at the part of the work that the compiler does for us automatically.

Constructors

For the constructor of any class, the following code is generated:

  • Calls to the parent constructors, if any;
  • Setting of the vtable pointers, if any;
  • Initialization of members according to the initializer list;
  • Execution of the code inside the constructor's braces.

All of the above can happen without any explicit code:

  • Parent constructors are called automatically (the default constructor unless specified otherwise);
  • Members are default-initialized if they have neither a default value nor an entry in the initializer list;
  • The entire constructor can be marked = default;
  • Only the vtable assignment is always hidden.

Here is an example:

#include <iostream>
#include <string>
using namespace std;

class Parent {
public:
    Parent() { Foo(); }
    virtual ~Parent() = default;
    virtual void Foo() { cout << "Parent" << endl; }
    int i = 0;
};

class Child : public Parent {
public:
    Child() : j(1) { Foo(); }
    void Foo() override { cout << "Child" << endl; }
    int j;
};

class Grandchild : public Child {
public:
    Grandchild() { Foo(); s = "hello"; }
    void Foo() override { cout << "Grandchild" << endl; }
    string s;
};

int main() {
    Grandchild g;
}

Let's write pseudo-code for the constructor of each class:

Parent                         Child                                     Grandchild
1. vtable = Parent's vtable;   1. Calls Parent's default constructor;    1. Calls Child's default constructor;
2. i = 0;                      2. vtable = Child's vtable;               2. vtable = Grandchild's vtable;
3. Calls Foo();                3. j = 1;                                 3. Calls the default constructor of s;
                               4. Calls Foo();                           4. Calls Foo();
                                                                         5. Calls operator= for s;

Given this, it is not surprising that inside a class's constructor the vtable pointer refers to the vtable of that class itself, not to the vtable of the object's concrete (most-derived) class. This means that virtual calls are resolved as if no inheriting classes existed. Thus, the output is:

Parent
Child
Grandchild

What about pure virtual functions? If one is called this way and is not implemented (yes, you can implement pure virtual functions, but why would you?), you will probably (and hopefully) crash with a segfault. Some toolchains report an error at runtime instead, which is nice.

Destructors

As you can imagine, destructors behave in the same way as constructors, only in the reverse order.

Here's a quick exercise to think about: why do destructors change the vtable pointer so that it points to their own class's vtable, rather than leaving it pointing to the concrete class's vtable? Answer: because by the time a destructor runs, every inheriting class has already been destroyed. Calling methods of such a class is not something you want to do.

Implicit cast

As we saw in the second and third parts, a pointer to a child object is not necessarily equal to the parent pointer of the same instance (as in the case of multiple inheritance).

However, you (the developer) need no extra work to call a function that takes a parent pointer. This is because the compiler implicitly adjusts the address whenever you convert pointers and references to parent classes.

Dynamic Cast (RTTI)

dynamic_cast uses the typeinfo tables that we investigated in the first part. It works at runtime by looking at the typeinfo record located one pointer before the place the vtable pointer points to, and uses the class recorded there to check whether the cast is possible.

This explains why dynamic_cast can be costly when used frequently.

Method Pointers

I plan to write a full post about pointers to methods in the future. Before that, I would like to emphasize that a pointer to a method that points to a virtual function will actually call an overridden method (as opposed to pointers to functions that are not members).

// TODO: add a link when the post is ready

Check yourself!

Now you can explain to yourself why the following code fragment behaves the way it behaves:

#include <iostream>
using namespace std;

class FooInterface {
public:
    virtual ~FooInterface() = default;
    virtual void Foo() = 0;
};

class BarInterface {
public:
    virtual ~BarInterface() = default;

    virtual void Bar() = 0;
};

class Concrete : public FooInterface, public BarInterface {
public:
    void Foo() override { cout << "Foo()" << endl; }
    void Bar() override { cout << "Bar()" << endl; }
};

int main() {
    Concrete c;
    c.Foo();
    c.Bar();

    FooInterface* foo = &c;
    foo->Foo();

    BarInterface* bar = (BarInterface*)(foo);
    bar->Bar(); // Prints "Foo()" - WTF?
}

This concludes my four-part article. I hope you learned something new, just as I did.
