C++ Tutorial
Object Slicing and Virtual Table - 2020
When a derived class object is passed by value as a base class object, as in foo(Base derived_obj), the base class copy constructor is called.
So, the specific behaviors of a derived class object are sliced off. We're left with a base class object. In other words, if we upcast (Upcasting and Down Casting) to an object instead of a pointer or reference, the object is sliced. As a result, all that remains is the subobject that corresponds to the destination type of our cast as shown in the example below:
    #include <vector>
    using namespace std;

    class General {};
    class Specific : public General {};

    int main()
    {
        vector<General> gVec;
        Specific sp;

        /* sp is copied as a base class object and put into gVec,
           so its specialness is lost during the copying */
        gVec.push_back(sp);
        return 0;
    }
This is not surprising, because the copy was created by a base class constructor. This is called object slicing. It is one of the reasons we should prefer pass-by-reference to pass-by-value. In the example above, we should have created a container of pointers rather than a container of objects.
Here is another example of object slicing.
    #include <iostream>
    #include <string>
    using namespace std;

    class Animal {
    public:
        Animal(const string& s) : name(s) {}
        virtual void eat() const {
            cout << "animal: " << name << " eat()" << endl;
        }
    private:
        string name;
    };

    class Bird : public Animal {
    private:
        string name;
        string habitat;
    public:
        Bird(const string& sp, const string& s, const string& h)
            : Animal(sp), name(s), habitat(h) {}
        virtual void eat() const {
            cout << "bird: " << name << " eat() in " << habitat << endl;
        }
    };

    void WhatAreYouDoingValue(Animal a) { a.eat(); }

    void WhatAreYouDoingReference(const Animal& a) { a.eat(); }

    int main()
    {
        Animal animal("Animal");
        Bird bird("Eagle", "Bald", "US and Canada");

        cout << "pass-by-value" << endl;
        WhatAreYouDoingValue(animal);
        WhatAreYouDoingValue(bird);

        cout << "\npass-by-reference" << endl;
        WhatAreYouDoingReference(animal);
        WhatAreYouDoingReference(bird);
    }
Output of the run is:
    pass-by-value
    animal: Animal eat()
    animal: Eagle eat()

    pass-by-reference
    animal: Animal eat()
    bird: Bald eat() in US and Canada
Note that in main() we call two functions, WhatAreYouDoingValue(Animal a) and WhatAreYouDoingReference(const Animal &a), each once with animal and once with bird. The first function takes its argument by value; the second takes it by reference.
For each function, we might expect the call with animal to produce animal: Animal eat() and the call with bird to produce bird: Bald eat() in US and Canada. In fact, in the pass-by-value case both calls use the base-class version of eat().
What happened?
In the pass-by-value case, eat() was invoked on an Animal object (rather than through a pointer or reference). The parameter of WhatAreYouDoingValue() is an object the size of Animal, created on the stack. This means that if an object of a class derived from Animal is passed to WhatAreYouDoingValue(), the compiler accepts it, but it copies only the Animal portion of the object. It slices the derived portion (such as the habitat member) off the object.
Because the object is passed by value, the compiler knows its precise type: the derived object has been forced to become a base object. When passing by value, the copy constructor for an Animal object is used, which initializes the vptr (virtual table pointer) to the Animal vtbl (virtual table) and copies only the Animal parts of the object. There's no explicit copy constructor here, so the compiler implicitly generates one. The copy has lost everything that made the original bird-like; it is simply an Animal.
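The second output, which used pass-by-reference, did what we expected: the reference refers to the original Bird object, so the call to eat() is dispatched to the Bird version.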
The usual way compilers handle virtual functions is to add a hidden member to each object. The C++ standard does not mandate this mechanism, but it is what mainstream compilers do. The hidden member holds a pointer to an array of function addresses. Such an array is usually termed a virtual function table (vtbl). The vtbl holds the addresses of the virtual functions declared for objects of that class.
An object of a base class, for example, contains a pointer to a table of addresses of all the virtual functions for that class. An object of a derived class contains a pointer to a separate table of addresses. If the derived class provides a new definition of a virtual function, the vtbl holds the address of the new function. If the derived class doesn't redefine the virtual function, the vtbl holds the address of the original version of the function. If the derived class defines a new function and makes it virtual, its address is added to the vtbl. So, whether we define 1 or 100 virtual functions for a class, we are adding just one address member to an object.
When we call a virtual function, the program looks at the vtbl address stored in the object and goes to the corresponding table of function addresses. If we use the first virtual function defined in the class declaration, the program uses the first function address in the array and executes the function that has that address, and so on.
Using virtual functions therefore has the following modest costs in memory and execution speed:
- Each object has its size increased by the amount needed to hold that address.
- For each class, the compiler creates a table of addresses of virtual functions.
- For each function call, there's an extra step of going to a table to look up an address.
In summary, virtual functions are implemented using a table of function pointers, called the vtable. There is one entry in the table per virtual function in the class. The compiler typically builds one such table per class, and each constructor sets the object's vptr to point to the table of its own class. When a derived class object is constructed, its base class is constructed first, and the base class constructor points the vptr at the base class vtable. The derived class constructor then re-points the vptr at the derived class vtable, whose entries for any overridden functions refer to the derived versions. This is why you should avoid calling virtual functions from a constructor: while the base class constructor runs, the object is still treated as a base class object, so the call resolves to the base class implementation rather than the derived class override.
Even though virtual functions provide dynamic binding, we need to use them judiciously. Scott Meyers said (in his book "Effective C++"), "The bottom line is that gratuitously declaring all destructors virtual is just as wrong as never declaring them virtual. In fact, many people summarize the situation this way: declare a virtual destructor in a class if and only if that class contains at least one virtual function."