Hello, and welcome to the 12th part of the C / C++ low level curriculum. Really soon after part 11! (No, of course part 11 didn’t get too big and need to be split. Why would you ask?)

Last time we looked at the basics of how inheritance is implemented at the low level. This time we’re going to examine how multiple inheritance affects this picture (note: we’re leaving the keyword virtual until next time).

Before We Begin

I will assume that you have already read the previous posts in the series, but I will also put in-line links to any important terms or concepts that you might need to know about to make sense of what you’re reading. I’m helpful like that.

Another big assumption I’m going to make is that you’re already very familiar with C++ and comfortable using the language features we’re discussing. If I need to demonstrate anything out of the ordinary I’ll explain it – or at least link to an explanation.

In this series I discuss what happens with vanilla unoptimised win32 debug code generated by the VS 2010 compiler. Whilst the specifics will differ on other platforms (and probably with other compilers), the general sweep of the code should be basically the same, because it’s assembly that has been generated by a C++ compiler. Following the examples given here with a source / disassembly debugger on your platform of choice should therefore provide you with the same insights we get here.

With this in mind, in case you missed them, here are the backlinks to the previous posts in the series:

  1. /2011/11/09/a-low-level-curriculum-for-c-and-c/
  2. /2011/11/24/c-c-low-level-curriculum-part-2-data-types/
  3. /2011/12/14/c-c-low-level-curriculum-part-3-the-stack/
  4. /2011/12/24/c-c-low-level-curriculum-part-4-more-stack/
  5. /2012/02/07/c-c-low-level-curriculum-part-5-even-more-stack/
  6. /2012/03/07/c-c-low-level-curriculum-part-6-conditionals/
I have lovingly zipped up a hand-crafted VS2010 solution / project / source code combo to go with this sample, which contains the following code:

    class CTestBaseOne
    {
    public:
        int _iA;
        int _iB;

        CTestBaseOne( int iA, int iB )
        : _iA( iA )
        , _iB( iB )
        {}

        int SumBase( void )
        {
            return _iA + _iB;
        }
    };

    class CTestBaseTwo
    {
    public:
        int _iC;
        int _iD;

        CTestBaseTwo( int iC, int iD )
        : _iC( iC )
        , _iD( iD )
        {}

        int SumBaseTwo( void )
        {
            return _iC + _iD;
        }
    };

    class CTestDerived
    : public CTestBaseOne
    , public CTestBaseTwo
    {
    public:
        int _iE;
        int _iF;

        CTestDerived( int iA, int iB, int iC, int iD )
        : CTestBaseOne ( iA, iB )
        , CTestBaseTwo ( iC, iD )
        , _iE ( iB )
        , _iF ( iD )
        {}

        int SumDerived( void )
        {
            return SumBase() + SumBaseTwo() + _iE + _iF;
        }
    };

    int main( int argc, char* argv[] )
    {
        CTestBaseOne    cTestBaseOne( argc, argc + 1 );
        CTestBaseTwo    cTestBaseTwo( argc, argc + 1 );
        CTestDerived    cTestDerived( argc, argc + 1, argc + 2, argc + 3 );

        return      cTestBaseOne.SumBase()  + cTestBaseTwo.SumBaseTwo()
                +   cTestDerived.SumBase()  + cTestDerived.SumBaseTwo() + cTestDerived.SumDerived();
    }

    Once you’ve unzipped it, go ahead and build it.

    Don’t forget to pay attention to the build output – it shows the memory layout which we’re going to talk about next.

     

    Memory Layout

    As you can see, we now have two base classes, and one class that derives from both of them.

    When you build the project, you should see that the memory layout of these classes looks like this:

    1>  class CTestBaseOne  size(8):
    1>      +---
    1>   0  | _iA
    1>   4  | _iB
    1>      +---
    1>
    1>  class CTestBaseTwo  size(8):
    1>      +---
    1>   0  | _iC
    1>   4  | _iD
    1>      +---
    1>
    1>  class CTestDerived  size(24):
    1>      +---
    1>      | +--- (base class CTestBaseOne)
    1>   0  | | _iA
    1>   4  | | _iB
    1>      | +---
    1>      | +--- (base class CTestBaseTwo)
    1>   8  | | _iC
    1>  12  | | _iD
    1>      | +---
    1>  16  | _iE
    1>  20  | _iF
    1>      +---

    This is all as we might expect – given what we found out last time.

    In particular note that:

    • the memory layout of both base classes is embedded into CTestDerived
    • CTestBaseOne and CTestBaseTwo appear in the memory layout in the same order they are declared in the base-specifier-list of CTestDerived.

    n.b. the base-specifier-list is the part of the declaration of a class where the base classes are specified.
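If you would rather check the sizes from code than from the build output, something like the following would do it. This is only a minimal sketch and is not part of the sample project: the classes are cut down to their data members, and LayoutSizes and QueryLayoutSizes() are names I have made up for illustration. The sizes quoted in the build output (8, 8 and 24) assume a 4 byte int and no padding, which is true for the win32 build discussed here but is not guaranteed elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Cut-down copies of the sample classes: data members only.
class CTestBaseOne { public: int _iA; int _iB; };
class CTestBaseTwo { public: int _iC; int _iD; };

class CTestDerived
: public CTestBaseOne
, public CTestBaseTwo
{
public:
    int _iE;
    int _iF;
};

// Hypothetical helper type (not in the sample project).
struct LayoutSizes
{
    std::size_t baseOne;
    std::size_t baseTwo;
    std::size_t derived;
};

// Gathers the sizes the compiler chose for the three classes.
LayoutSizes QueryLayoutSizes()
{
    LayoutSizes sizes;
    sizes.baseOne = sizeof( CTestBaseOne );
    sizes.baseTwo = sizeof( CTestBaseTwo );
    sizes.derived = sizeof( CTestDerived );

    // Both (non-empty) bases live inside the derived object, so it can
    // never be smaller than the two of them put together.
    assert( sizes.derived >= sizes.baseOne + sizes.baseTwo );
    return sizes;
}
```

Calling QueryLayoutSizes() from main() and printing the three values should give the same numbers as the size(...) entries in the build output, on a compiler that lays the classes out the same way.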

    In the simple case of single inheritance we considered in the last post, we saw that the functions of a base class B could be called on instances of a derived class D because:

    • the memory layout of D contains a literal instance of B at an offset of 0 bytes within itself and…
    • …this means that the member data of an instance of B is at the same offset relative to the memory layout of an instance of D
    • …and so the hard coded offsets used to access these members within functions belonging to B are also valid for instances of D

    Looking at the memory layout for this multiply inherited class we can see that:

    1. this relationship still holds for CTestBaseOne and CTestDerived - CTestBaseOne is at an offset of 0 bytes within the memory layout of CTestDerived
    2. however, this same relationship is not true of CTestBaseTwo and CTestDerived

    Given this situation, how do functions of CTestBaseTwo work with instances of CTestDerived?

    As usual the best thing to do is take a look…

     

    Calling a function of CTestBaseTwo on CTestDerived

    Put a breakpoint on the return statement from main(), run the code, and when it stops right click then choose ‘Go To Disassembly’.

    Rather than paste the disassembly as text this time, I’ve inserted a screenshot of my debugger window – this allows more formatting and highlighting options.

    N.B. in this screenshot I have “Show symbol names” checked under viewing options. Whilst this typically makes it easier to relate disassembly to C or C++ code, it does hide detail (i.e. the addresses of the symbols).

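    [Screenshot: CCPPLLC_12Inheritance_P2_MultipleInheritance_00 (debugger window showing the disassembly of the return statement in main(), with the watch window below it)]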

    Let’s pick this apart then, starting at the current line indicator where the breakpoint is and working down:

    • we can see that (following the x86 thiscall convention) before each function is called, the address of the corresponding object is stored into ecx using lea.
    • First it loads the address of cTestDerived into ecx and then calls CTestDerived::SumDerived()
    • then it…
    • Oh, wait… it’s loading the address [ebp-20h] into ecx…
    • that symbol isn’t resolving in the disassembly window, so what witchcraft is this!?

    I have helpfully highlighted the most salient areas of the screenshot with red boxes :)

    If you look at the function calls made in the disassembly, and compare them to the calls in the C++ code, you will see that all of the high level function calls have an obvious analogue at the assembly level except for cTestDerived.SumBaseTwo().

    CTestBaseTwo::SumBaseTwo is getting called, but with [ebp-20h] as the this pointer in ecx, not [cTestDerived] (n.b. see the top red box in the screenshot).

    So, the question is: how does the address [ebp-20h] relate to the address of cTestDerived?

    This would be a good time to reiterate that the watch window is your friend. We can use the watch window to Sherlock Holmes our way to an answer.

    If you look in the watch window below the disassembly view (shown again below by itself for those of you who are vertical resolution challenged) you can see that I have used watch window expression evaluation to find out some information about these values:

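    [Screenshot: CCPPLLC_12Inheritance_P2_MultipleInheritance_WatchCasting_00 (watch window showing the address of cTestDerived, that address cast to each base class pointer type, and the value of ebp-20h)]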

    This shows us that:

    • the address of cTestDerived is 0x0048fa84…
    • … and the address of cTestDerived cast to a pointer to CTestBaseOne has the same address, …
    • …but when the address of cTestDerived is cast to CTestBaseTwo we get 0x0048fa8c…
    • …which is the same value as [ebp-20h]…
    • …or an 8 byte offset from the address of cTestDerived…
    • …which is the offset of CTestBaseTwo within CTestDerived
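The same experiment can be run without a debugger. The sketch below is my own illustration rather than code from the sample project: the classes are cut down to their data members, and ByteOffset(), OffsetOfBaseOne() and OffsetOfBaseTwo() are hypothetical helper names. It converts the address of a CTestDerived to each base pointer type, exactly as the watch window casts do, and measures how far each result is from the start of the object. The 0 and 8 byte results are what the layout above predicts; other compilers are free to differ.

```cpp
#include <cassert>
#include <cstddef>

// Cut-down copies of the sample classes: data members only.
class CTestBaseOne { public: int _iA; int _iB; };
class CTestBaseTwo { public: int _iC; int _iD; };

class CTestDerived
: public CTestBaseOne
, public CTestBaseTwo
{
public:
    int _iE;
    int _iF;
};

// Hypothetical helper: distance in bytes from the start of the derived
// object to the address of one of its subobjects.
std::ptrdiff_t ByteOffset( const CTestDerived* pDerived, const void* pSubobject )
{
    const char* pStart = reinterpret_cast< const char* >( pDerived );
    const char* pInner = static_cast< const char* >( pSubobject );
    assert( pInner >= pStart );  // a subobject cannot start before its owner
    return pInner - pStart;
}

std::ptrdiff_t OffsetOfBaseOne( const CTestDerived& rDerived )
{
    // Derived-to-base conversion, like the (CTestBaseOne*) watch expression.
    const CTestBaseOne* pBaseOne = &rDerived;
    return ByteOffset( &rDerived, pBaseOne );
}

std::ptrdiff_t OffsetOfBaseTwo( const CTestDerived& rDerived )
{
    // Derived-to-base conversion, like the (CTestBaseTwo*) watch expression.
    // This is where the compiler quietly adds the offset of the base.
    const CTestBaseTwo* pBaseTwo = &rDerived;
    return ByteOffset( &rDerived, pBaseTwo );
}
```

Note that nothing in the C++ source mentions the number 8; the adjustment is entirely the compiler's doing, which is why it is so easy to miss.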

     

    Should this be surprising?

    Here’s the memory layout of CTestDerived again:

    1>  class CTestDerived  size(24):
    1>      +---
    1>      | +--- (base class CTestBaseOne)
    1>   0  | | _iA
    1>   4  | | _iB
    1>      | +---
    1>      | +--- (base class CTestBaseTwo)
    1>   8  | | _iC
    1>  12  | | _iD
    1>      | +---
    1>  16  | _iE
    1>  20  | _iF
    1>      +---

    Since we know that:

    • (within non-static member functions) member variables are accessed via constant offsets from their this pointer
    • the memory for CTestBaseTwo starts at an offset of 8 bytes from the start of the memory layout of an instance of CTestDerived

    it follows that CTestBaseTwo::SumBaseTwo() wouldn’t work if the compiler passed the address of an instance of CTestDerived because the constant offsets used to access the members of CTestBaseTwo would be off by 8 bytes.

    Consequently, any time a CTestBaseTwo member function is called on an instance of CTestDerived the compiler must ensure that a compatible this pointer is generated to pass to the function - i.e. pointing at the start address of CTestBaseTwo within the instance of CTestDerived.
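To make the "off by 8 bytes" problem concrete, here is a rough sketch that fakes what the compiled body of CTestBaseTwo::SumBaseTwo() does. SumAtFixedOffsets() is a hypothetical stand-in of my own, not real compiler output: it simply adds the two ints found at byte offsets 0 and 4 from whatever address it is handed. The two wrapper functions are also made up for illustration, and the "wrong" result relies on CTestBaseOne being laid out first, as it is in the layout shown above.

```cpp
#include <cassert>
#include <cstring>

class CTestBaseOne
{
public:
    int _iA;
    int _iB;
    CTestBaseOne( int iA, int iB ) : _iA( iA ), _iB( iB ) {}
};

class CTestBaseTwo
{
public:
    int _iC;
    int _iD;
    CTestBaseTwo( int iC, int iD ) : _iC( iC ), _iD( iD ) {}
};

class CTestDerived
: public CTestBaseOne
, public CTestBaseTwo
{
public:
    CTestDerived( int iA, int iB, int iC, int iD )
    : CTestBaseOne( iA, iB )
    , CTestBaseTwo( iC, iD )
    {}
};

// Hypothetical stand-in for the compiled SumBaseTwo(): it knows nothing
// about types, it just reads two ints at fixed offsets from 'this'.
int SumAtFixedOffsets( const void* pThis )
{
    assert( pThis != 0 );
    const unsigned char* pBytes = static_cast< const unsigned char* >( pThis );
    int iFirst  = 0;
    int iSecond = 0;
    std::memcpy( &iFirst,  pBytes,                 sizeof( int ) );
    std::memcpy( &iSecond, pBytes + sizeof( int ), sizeof( int ) );
    return iFirst + iSecond;
}

// Passes the address of the whole object: the fixed offsets land on
// _iA and _iB (assuming CTestBaseOne is at offset 0).
int SumWithUnadjustedThis( const CTestDerived& rDerived )
{
    return SumAtFixedOffsets( &rDerived );
}

// Passes the address of the CTestBaseTwo part: the fixed offsets land
// on _iC and _iD, which is what SumBaseTwo() is supposed to add up.
int SumWithAdjustedThis( const CTestDerived& rDerived )
{
    const CTestBaseTwo* pBaseTwo = &rDerived;
    return SumAtFixedOffsets( pBaseTwo );
}
```

With an object constructed as CTestDerived( 1, 2, 30, 40 ), the adjusted call adds up _iC and _iD, whereas the unadjusted call adds up _iA and _iB instead (given the layout above).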

    Frighteningly obvious once you know, isn’t it?

    I honestly don’t think it should be surprising though – given the way that we know data within user defined types is accessed at the assembly level (see part 10), it pretty much had to work like this.

     

    …one more little thing

    In the example above, cTestDerived is a local variable that lives in the Stack Frame of main().

    This means that the compiler can work out where the instance of CTestBaseTwo within cTestDerived sits relative to ebp at compile time, and can therefore access it at no extra cost compared to any other Stack variable.

    We should probably check whether this is any different when we’re dealing with a pointer to a CTestDerived at an arbitrary point in memory, just to be thorough.

    Luckily I have already thought of this :)

    If you place a breakpoint on the return statement of CTestDerived::SumDerived  you can check the disassembly yourself, but here are the relevant lines from my disassembly window:

        52:     int SumDerived( void )
        53:     {
    001010A0  push        esi
    001010A1  mov         esi,ecx
    001010A3  push        edi
        54:         return SumBase() + SumBaseTwo() +_iE + _iF;
    001010A4  lea         ecx,[esi+8]
    001010A7  call        CTestBaseTwo::SumBaseTwo (101050h)

    As you should be able to see by now, the code in this function adds a constant offset of 8 bytes to the this pointer it was passed in order to generate the this pointer it passes to CTestBaseTwo::SumBaseTwo.

    If you’re having trouble seeing it, remember that the ‘thiscall’ win32 member function calling convention uses ecx to pass the this pointer.

    Most significantly, looking back to the last post, we can see that this is essentially the same way that member variables of user defined types were accessed when we had a pointer to an instance of a user defined type. In fact, at the level of the assembly code there is really no difference between a member variable and a base class; this distinction is really only meaningful at the level of the C++ code.

    We now also know that multiple inheritance can cost a small amount of additional pointer arithmetic whenever you call a member function of a base type that sits at a non-zero offset within the derived type’s memory layout.
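The same pointer arithmetic happens whenever a pointer is converted between the derived type and a base at a non-zero offset, and it runs in reverse when you cast back down. Here is a small sketch of my own (ToBaseTwo(), ToDerived() and RoundTripsCorrectly() are hypothetical names, and the classes are cut down to their data members) showing the round trip, plus one detail that is easy to forget: a null pointer has to stay null rather than turning into address 8.

```cpp
#include <cassert>
#include <cstddef>

// Cut-down copies of the sample classes: data members only.
class CTestBaseOne { public: int _iA; int _iB; };
class CTestBaseTwo { public: int _iC; int _iD; };

class CTestDerived
: public CTestBaseOne
, public CTestBaseTwo
{
public:
    int _iE;
    int _iF;
};

// Implicit derived-to-base conversion: for a non-null pointer the
// compiler adds the offset of CTestBaseTwo within CTestDerived.
CTestBaseTwo* ToBaseTwo( CTestDerived* pDerived )
{
    return pDerived;
}

// The reverse conversion subtracts the same offset. It is only valid
// when pBaseTwo really does point at the base part of a CTestDerived.
CTestDerived* ToDerived( CTestBaseTwo* pBaseTwo )
{
    return static_cast< CTestDerived* >( pBaseTwo );
}

// Hypothetical check: converting to the base and back again should
// give us the pointer we started with.
bool RoundTripsCorrectly( CTestDerived* pDerived )
{
    CTestBaseTwo* pBaseTwo = ToBaseTwo( pDerived );

    // A null derived pointer must convert to a null base pointer.
    assert( ( pDerived == 0 ) == ( pBaseTwo == 0 ) );

    return ToDerived( pBaseTwo ) == pDerived;
}
```

It works just the same for a CTestDerived allocated with new as for one on the Stack; the only difference is that the starting address is not known until run time, so the addition has to be done then.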

     

    What was that earlier about declaration order?

    If you’re paying attention, you should have noticed that when we looked at the memory layout of CTestDerived I mentioned in passing that the ordering of CTestBaseOne and CTestBaseTwo within it matches the textual order they were listed in its base-specifier-list.

    This is obviously significant, since it implies that if the textual order in which CTestBaseOne and CTestBaseTwo are listed changes, then the memory layout of CTestDerived will change to reflect this.

    If you swap the order of CTestBaseOne and CTestBaseTwo around here’s the memory layout printed during the build process:

    1>  class CTestDerived  size(24):
    1>      +---
    1>      | +--- (base class CTestBaseTwo)
    1>   0  | | _iC
    1>   4  | | _iD
    1>      | +---
    1>      | +--- (base class CTestBaseOne)
    1>   8  | | _iA
    1>  12  | | _iB
    1>      | +---
    1>  16  | _iE
    1>  20  | _iF
    1>      +---

    Given what we have discovered so far, we can see that with this new memory layout it is CTestBaseTwo that sits at an offset of 0 bytes, so the address of a CTestDerived can now be used unchanged as the address of its CTestBaseTwo.

    We can also see that with this new layout, the compiler would need to adjust CTestDerived pointers in order to call CTestBaseOne functions.
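If you want a quick check before diving into the disassembly, the two orderings can be compared side by side. In this sketch CDerivedOneFirst, CDerivedTwoFirst and BaseSharesAddress() are names I have invented for illustration; the sample project only has the one CTestDerived. The expected answers assume the compiler lays bases out in base-specifier-list order, as VS2010 does here.

```cpp
#include <cassert>

// Cut-down copies of the sample base classes: data members only.
class CTestBaseOne { public: int _iA; int _iB; };
class CTestBaseTwo { public: int _iC; int _iD; };

// Same bases, listed in the original order...
class CDerivedOneFirst
: public CTestBaseOne
, public CTestBaseTwo
{
public:
    int _iE;
    int _iF;
};

// ...and with the order swapped.
class CDerivedTwoFirst
: public CTestBaseTwo
, public CTestBaseOne
{
public:
    int _iE;
    int _iF;
};

// Hypothetical helper: true if the TBase part of rDerived starts at the
// same address as rDerived itself, i.e. no this pointer fix-up is needed.
template< typename TBase, typename TDerived >
bool BaseSharesAddress( const TDerived& rDerived )
{
    const TBase* pBase = &rDerived;  // implicit derived-to-base conversion
    assert( pBase != 0 );            // the address of a real object is never null
    return static_cast< const void* >( pBase )
        == static_cast< const void* >( &rDerived );
}
```

For example, BaseSharesAddress< CTestBaseTwo >( cTwoFirst ) asks whether a CDerivedTwoFirst can be handed to CTestBaseTwo functions without adjustment.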

    I leave it as an exercise for you, o budding expert reader of x86 disassembly, to check this for yourself :)

     

    Aside: construction and destruction with single inheritance

    Something we entirely skipped past in the last post was construction and destruction of inherited types.

    This was intentional – construction and destruction behaviour is straightforward with single inheritance.

    We all should know the expected high level behaviour for single inheritance (of arbitrary depth) - in summary:

    • each constructor calls the constructor of its base class before it does the work of its own function definition – i.e. classes are constructed in order “from inside to out” or “least to most derived”.
    • destructors do the opposite, each destructor does its own work before calling the destructor of its base class – i.e. classes are destructed in order “from outside to in” or “most to least derived”.
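If you would like to see that ordering without stepping through any disassembly, a tiny logging test does the job. This is a minimal sketch with made-up names (CBase, CMiddle, CMost, g_log and RecordLifetime() are all hypothetical and not part of the sample project); each constructor and destructor just appends its own name to a list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical global log, used only by this sketch.
std::vector< std::string > g_log;

class CBase
{
public:
    CBase()  { g_log.push_back( "CBase()" ); }
    ~CBase() { g_log.push_back( "~CBase()" ); }
};

class CMiddle : public CBase
{
public:
    CMiddle()  { g_log.push_back( "CMiddle()" ); }
    ~CMiddle() { g_log.push_back( "~CMiddle()" ); }
};

class CMost : public CMiddle
{
public:
    CMost()  { g_log.push_back( "CMost()" ); }
    ~CMost() { g_log.push_back( "~CMost()" ); }
};

// Creates and destroys one CMost and returns what was logged.
std::vector< std::string > RecordLifetime()
{
    g_log.clear();
    {
        CMost cMost;  // constructors run here, least derived first...
    }                 // ...destructors run here, most derived first

    // Every constructor call is matched by exactly one destructor call.
    assert( g_log.size() % 2 == 0 );
    return g_log;
}
```

The log should read CBase(), CMiddle(), CMost(), then ~CMost(), ~CMiddle(), ~CBase(): the second half is the first half in reverse, which is the ‘stack-like’ behaviour described above.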

    The disassembly matches the high level behaviour in a very straightforward way and I leave it as an exercise for the reader to step through the disassembly of construction & destruction in some test code to see this in action.

    Like the rest of the behaviour we’ve discovered so far, when you think about it, it’s actually pretty obvious that this sort of ‘stack-like’ construction / destruction behaviour is required in order to make inheritance work correctly.

     

    Construction and destruction with multiple inheritance

    It was pretty obvious that we were coming to this, right?

    What happens with construction and destruction when multiple inheritance is involved is less simple.

    For example, what order do the constructors of multiple base classes get called in? … and what order do their destructors get called in?

    We also assume that – since the constructor and destructor are member functions – there must be some fiddling with this pointers during this process too.

    Luckily, this is very easy to empirically determine: we can just add some text output into the constructors and destructors of the sample classes to print the name of the function and the value of their this pointer.

    Here’s a link to a VS2010 project I prepared earlier to do just that, I’ve just added a little extra code to the original example code.

    Below is the command line output produced when it is run:

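    [Screenshot: CCPPLLC_12Inheritance_P2_ConDestructionThisPointerFixing00 (command line output listing each constructor and destructor call along with the value of its this pointer)]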

    You can see that:

    • constructors are called in the textual order they appear in the class declaration for CTestDerived - from least derived (i.e. CTestBaseOne) to most (CTestDerived).
    • destructors are called in the opposite order – this is to ensure that work done in the constructors is un-done in the opposite order.
    • this also shows that the this pointers are adjusted for the constructor and destructor of CTestBaseTwo, just as they were when we were calling regular member functions
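In case the linked project is unavailable, here is a rough sketch of the same idea. It is my own reconstruction, not the code from the download: SEvent, g_events, Record() and RecordLifetime() are hypothetical names, and the constructors take no arguments to keep things short. Each constructor and destructor records its name and the value of its this pointer so the two can be compared afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one constructor / destructor call.
struct SEvent
{
    std::string    name;
    std::uintptr_t thisAddress;
};

std::vector< SEvent > g_events;

void Record( const char* pName, const void* pThis )
{
    SEvent event;
    event.name        = pName;
    event.thisAddress = reinterpret_cast< std::uintptr_t >( pThis );
    g_events.push_back( event );
}

class CTestBaseOne
{
public:
    int _iA;
    int _iB;
    CTestBaseOne() : _iA( 0 ), _iB( 0 ) { Record( "CTestBaseOne()", this ); }
    ~CTestBaseOne()                     { Record( "~CTestBaseOne()", this ); }
};

class CTestBaseTwo
{
public:
    int _iC;
    int _iD;
    CTestBaseTwo() : _iC( 0 ), _iD( 0 ) { Record( "CTestBaseTwo()", this ); }
    ~CTestBaseTwo()                     { Record( "~CTestBaseTwo()", this ); }
};

class CTestDerived
: public CTestBaseOne
, public CTestBaseTwo
{
public:
    int _iE;
    int _iF;
    CTestDerived() : _iE( 0 ), _iF( 0 ) { Record( "CTestDerived()", this ); }
    ~CTestDerived()                     { Record( "~CTestDerived()", this ); }
};

// Creates and destroys one CTestDerived and returns what was recorded.
std::vector< SEvent > RecordLifetime()
{
    g_events.clear();
    {
        CTestDerived cTestDerived;
    }
    assert( g_events.size() == 6 );  // three constructors, three destructors
    return g_events;
}
```

Printing each entry of the returned list gives output along the same lines as the screenshot: the CTestBaseTwo entries should show a this pointer that sits 8 bytes above the one seen by CTestBaseOne and CTestDerived (on a compiler using the layout shown earlier).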

    At this point you should feel free to swap around the order of CTestBaseOne and CTestBaseTwo in CTestDerived’s base-specifier-list to check that construction and destruction follow the same rules as the ordering of base types in a derived type’s memory layout (they do, I promise).

     

    Summary

    That’s it for this time and it was massive! I bet you’re glad I split this off from post 11 now :)

    The astute amongst you will have noticed that we have not looked at any code using the keyword virtual. This is entirely deliberate, and that’s for next time.

    So, let’s recap what we’ve discovered so far about inheritance…

    First, what we learned about single inheritance in part 11:

    1) We know that the memory layout of user defined types is fixed at compile time…

    2) …and so code accessing a data member of a user defined type can use a constant offset relative to the start address of an instance of the type (see part 10).

    3) In single inheritance, the memory layout of a derived type D literally extends that of its base type B.

    4) This ensures that the inherited members of B are at the same constant offsets relative to the start address of an instance of D as they would be relative to the start address of an instance of B

    5) …which means that a pointer to an instance of type D can safely be treated as a pointer to an instance of type B

    6) …which in turn guarantees that member functions of type B can safely be called on instances of type D.

     

    We’ve also discovered that if a derived class D inherits from multiple base types A and B, then this multiple inheritance breaks the convenience of the single inheritance approach somewhat:

    7) As with single inheritance, the memory layout of an instance of D contains the member data of both A and B, laid out exactly as it was in each base class.

    8) Member functions of both type A and type B will use constant offsets relative to their this pointers to access their data members.

    9) Logically, only one of A and B can have an offset of 0 bytes within the memory layout of an instance of D

    10) … consequently a pointer to an instance of type D can only be safely treated as a pointer to whichever of A or B has a 0 byte offset within its memory layout

    11) Which base type has a 0-byte offset is determined by the textual ordering of the A and B types within the base-specifier-list of D’s class declaration

    12) …if A were at the 0 byte offset within D, the compiler would need to calculate a compatible ‘this’ pointer whenever a member function of B is called on an instance of D (and vice versa)

    13) …when an instance of D is created, the instances of A and B contained within its memory layout will be constructed by their own constructors before the constructor for D is called, and…

    14) …the order in which A and B are constructed depends on their textual ordering within the base-specifier-list in D’s class declaration (i.e. they will be constructed in memory offset order).

     

    Disclaimer

    The above numbered bullet points are facts we have discovered empirically by examining the behaviour of win32 x86 code created by the Visual Studio 2010 compiler.

    Do not assume that code generated by other platforms / compilers will behave identically. It should behave very similarly, but you should check.

    POD type you should be able to save out its memory to file as binary and load it back into the memory of a different instance of the same type.

    iff you don’t change target platform, compiler, compiler options, alignment specifications, or the declaration of the type.

    Thanks

    Thanks for peer review go out to Bruce Dawson and Amir Embrahimi; and for general #altdevblogaday admin assistance to Luke Dicken.

     

    Appendix: What does the C++ standard have to say about all this?

    I spent some time reading the C++11 ISO standard (or a near final revision of it at least), but even after consulting the source it is not 100% clear to me exactly what is and what isn’t guaranteed by the standard – see the below for more information.

    I used this near-final draft of the ISO C++11 standard document when I was looking up various bits for this post (you have to pay ANSI for the actual one!).

    If you want more information on any of the points below, click the link above to download the .pdf and search for the indented text – that will get you to the page it was on. This document does not make for light reading!

     

    1. Ordering of base classes in memory within a multiply inherited class.

    As far as I can tell, this is not guaranteed by the standard – in fact the draft standard says this:

    10.1 Multiple base classes [class.mi]
    1 A class can be derived from any number of base classes. [Note: The use of more than one direct base class
    is often called multiple inheritance. — end note ] [Example:
    class A { /* ... */ };
    class B { /* ... */ };
    class C { /* ... */ };
    class D : public A, public B, public C { /* ... */ };
    — end example ]

    2 [Note: The order of derivation is not significant except as specified by the semantics of initialization by
    constructor (12.6.2), cleanup (12.4), and storage layout (9.2, 11.1). — end note ]

    which more or less says that it’s not guaranteed.

     

    2. Ordering of data members within a class.

    So, it appears that a C++ compiler is allowed to reorder data members of a class in memory vs. their textual declaration order if (and only if) their access control (i.e. public, private, protected) is different:

    15 Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified (11). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other;

    I can’t imagine this would ever be a problem for you unless you’re writing a reflection library or similar.
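The part of that paragraph which is guaranteed is easy to check for yourself. This is a minimal sketch using a made-up struct (SSameAccess and both function names are hypothetical): all three members share the same access control, so later members must end up at higher addresses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type: every member has the same (public) access.
struct SSameAccess
{
    int first;
    int second;
    int third;
};

// Compile-time view: offsets must increase in declaration order.
bool LaterMembersHaveHigherOffsets()
{
    return offsetof( SSameAccess, first )  < offsetof( SSameAccess, second )
        && offsetof( SSameAccess, second ) < offsetof( SSameAccess, third );
}

// Run-time view of the same thing, using the members' actual addresses.
bool AddressesIncrease( const SSameAccess& rObject )
{
    return &rObject.first  < &rObject.second
        && &rObject.second < &rObject.third;
}
```

Both functions should return true on any conforming compiler. What the quoted text does not promise is anything about how a block of public members is ordered relative to a block of private or protected ones.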

     

    3. Ordering of constructor calls when constructing types with base types

    Thankfully, there does appear to be some sanity remaining in the universe; as I managed to find the part of the standard that specifies the order in which constructors are called for types that use inheritance:

    In a non-delegating constructor, initialization proceeds in the following order:
    — First, and only for the constructor of the most derived class (1.8), virtual base classes are initialized in the order they appear on a depth-first left-to-right traversal of the directed acyclic graph of base classes, where “left-to-right” is the order of appearance of the base classes in the derived class base-specifier-list.
    — Then, direct base classes are initialized in declaration order as they appear in the base-specifier-list (regardless of the order of the mem-initializers).
    — Then, non-static data members are initialized in the order they were declared in the class definition (again regardless of the order of the mem-initializers).
    — Finally, the compound-statement of the constructor body is executed.
    [Note: The declaration order is mandated to ensure that base and member subobjects are destroyed in the
    reverse order of initialization. — end note ]

    TL;DR – (if you are not using virtual base classes) each constructor initialises its base classes in declaration order, then the class members in declaration order, and then the constructor’s body is executed.
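As a final sanity check, the TL;DR can be turned into a few lines of test code. This is just a sketch with invented names (CBaseA, CBaseB, SLogged, CDerived, g_order and RecordConstructionOrder() are all hypothetical): the bases, the members and the constructor body each log a line when they run.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical global log, used only by this sketch.
std::vector< std::string > g_order;

// A member type that logs when it is constructed.
struct SLogged
{
    explicit SLogged( const char* pName ) { g_order.push_back( pName ); }
};

class CBaseA { public: CBaseA() { g_order.push_back( "base CBaseA" ); } };
class CBaseB { public: CBaseB() { g_order.push_back( "base CBaseB" ); } };

class CDerived
: public CBaseA
, public CBaseB
{
public:
    SLogged _first;
    SLogged _second;

    CDerived()
    : CBaseA()
    , CBaseB()
    , _first( "member _first" )
    , _second( "member _second" )
    {
        g_order.push_back( "body of CDerived()" );
    }
};

// Constructs one CDerived and returns the order things happened in.
std::vector< std::string > RecordConstructionOrder()
{
    g_order.clear();
    CDerived cDerived;
    (void)cDerived;  // only constructed for its side effects

    // Two bases + two members + one constructor body.
    assert( g_order.size() == 5 );
    return g_order;
}
```

The log should list the two bases in base-specifier-list order, then the two members in declaration order, and the constructor body last. Since the quoted text says the order of the mem-initializers is irrelevant, shuffling them around should not change the log (though your compiler may warn about it).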