Organization of C++ objects in memory


-Vormav-
01-15-2007, 06:58 AM
I've come across a case where I need to know how the data members of C++ objects are organized in memory. My initial thought was that it would probably work the same way as for structs: members stored in memory, in line, in the order in which they were declared. Obviously, there are bound to be some deviations from this for static vs. non-static members, but in the tests I've written so far, the members have been organized just as expected. You can cast a pointer to an object directly to a pointer to a struct that has the same attributes, and vice versa, without any issues that I've noticed so far.

However, I'm hesitant to buy into this. If any of you know your compilers well enough, would you know if that's really the case or not? For instance, if I start using a lot of inheritance, is that going to make the order of object members in memory a sloppy, unordered mess? And will the declaration of object functions ever conflict with this, too (from what I've seen, it seems like functions are stored somewhere else in memory, separate from object attributes)?


I've also noticed something very weird when I declare character attributes in objects: though they act like characters, they seem to be laid out as if they were integers. For example, say I have a class like this:

class SomeClass
{
    char a;
    char b;
};

When I create an instance of this and look up the addresses of chars a and b, I find that they are 4 bytes apart, although chars are only 1 byte apiece. I get this with objects declared with class, but not with structs.
This is just on Visual Studio, though. Can anyone confirm whether this holds true for other compilers?
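
One way to check what a particular compiler does is to ask it directly with sizeof and offsetof. The sketch below is only an illustration: the type names are made up for the example, and the numbers it prints depend on the compiler, target, and packing settings. It declares the same two chars once with class and once with struct, plus a third type where an int sits between the chars, which is where padding commonly shows up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// The same two members, declared once with `class` and once with `struct`.
// (Type names here are hypothetical, chosen for this example only.)
class CharsInClass {
public:
    char a;
    char b;
};

struct CharsInStruct {
    char a;
    char b;
};

// A wider member between the chars usually forces the compiler to add padding.
struct CharIntChar {
    char a;
    int  i;
    char b;
};

// Print where each member lands and how large each type is on this compiler.
void reportLayout() {
    std::printf("class : b at offset %zu, size %zu\n",
                offsetof(CharsInClass, b), sizeof(CharsInClass));
    std::printf("struct: b at offset %zu, size %zu\n",
                offsetof(CharsInStruct, b), sizeof(CharsInStruct));
    std::printf("mixed : i at offset %zu, b at offset %zu, size %zu\n",
                offsetof(CharIntChar, i), offsetof(CharIntChar, b),
                sizeof(CharIntChar));

    // `class` and `struct` differ only in default access, not in layout.
    assert(sizeof(CharsInClass) == sizeof(CharsInStruct));
}
```

Calling reportLayout() from main() prints the offsets for the compiler at hand, which makes it easy to compare Visual Studio against other compilers.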



EDIT:
Never mind, guess I posted too soon. Google finally turned something up that explains all of the above: http://www.cprogramming.com/tutorial/size_of_class_object.html

playmesumch00ns
01-15-2007, 11:29 AM
The statement "I need to know the organisation of my data members in memory" sounds very dangerous to me. Can I ask why? Doing anything that relies on this will be very likely to break (potentially subtly and with disastrous consequences) if you change compiler, and even compiler version.

Ian Jones
01-15-2007, 12:34 PM
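To answer one of your questions (I'm a newbie, but have been reading lots, so AFAIK):
Instances of a class live wherever they are created: local objects on the stack, objects created with new on the heap, and globals or statics in static storage. Member functions are compiled once into the program's code, separate from the data of any individual object. These are distinct areas of the process's memory.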

The book I've been reading recently (a free ebook), 'Thinking in C++', is a great book that goes into a lot of detail and theory about how C++ works and includes information about when to watch out for compiler differences. It also talks about memory, compiler workings, linking, etc. It might be useful for you.

gga
01-15-2007, 04:56 PM
-Vormav- wrote: "However, I'm hesitant to buy into this. If any of you know your compilers well enough, would you know if that's really the case or not? For instance, if I start using a lot of inheritance, is that going to make the order of object members in memory a sloppy, unordered mess? And will the declaration of object functions ever conflict with this, too (from what I've seen, it seems like functions are stored somewhere else in memory, separate from object attributes)?"

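You can indeed rely on the non-static data members of your class following their declaration order, at least among members declared within the same access section (public, protected, or private); mainstream compilers keep declaration order throughout.
However, relying on a member always being at a particular offset may or may not be safe.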

You can expect your data to be tightly packed as long as each member is properly aligned. Pointers, ints, floats and doubles are, on today's CPUs, expected to be word or long-word aligned.
As such, you should always try to define variables of similar types together, roughly following this layout:

class myClass
{
    // all pointers
    // all doubles
    // all floats
    // all ints
    // all shorts
    // all chars
    // all bools
};

The above layout will generally keep your class as small as possible, because the members with the strictest alignment come first and the small ones fill in the tail. Mixing the order of floats/doubles/chars/bools will usually make your class bigger than expected. The same is true for simple inheritance: if the total size of a base class is not a multiple of the word size, but the derived class's first member requires word alignment, your derived class will be a little bigger than expected, as the compiler "pads" the data to make sure that member is indeed word-aligned.
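
As a rough illustration of the effect of member ordering, the sketch below declares the same four members in two orders. The struct and member names are invented for the example, and the exact sizes depend on the compiler and target (on many 64-bit targets the first comes out larger than the second, but that is not guaranteed).

```cpp
#include <cassert>
#include <cstddef>

// Members interleaved without regard to size: each wide member that follows
// a char typically needs padding in front of it to stay aligned.
struct Mixed {
    char   flag;
    double weight;
    char   tag;
    int    count;
};

// The same members ordered from widest to narrowest, as suggested above.
struct Sorted {
    double weight;
    int    count;
    char   flag;
    char   tag;
};

// Number of bytes per object saved by the sorted ordering on this compiler.
std::size_t bytesSaved() {
    assert(sizeof(Sorted) <= sizeof(Mixed));
    return sizeof(Mixed) - sizeof(Sorted);
}
```

Printing sizeof(Mixed), sizeof(Sorted) and bytesSaved() from main() shows what the padding costs on a given platform.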

All bets are off when you are dealing with virtual inheritance, ordinary inheritance from classes with virtual methods, or multiple inheritance. In those cases, applying a C cast, static_cast<> or reinterpret_cast<> to memory that you merely assume has a particular layout is asking for trouble, as compilers can lay out such objects differently (and often do).
The best you can do is rely on dynamic_cast<> in those cases.
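
To make the multiple-inheritance case concrete, here is a small sketch with invented class names. It shows that a pointer to the second base class normally does not hold the same address as a pointer to the whole object, so code that treats the two addresses as interchangeable reads the wrong bytes, and that dynamic_cast<> can recover the full object from a base pointer (or report failure). The exact offsets are compiler-specific.

```cpp
#include <cassert>

// Two unrelated polymorphic bases (hypothetical names for this example).
struct Shape {
    int id;
    virtual ~Shape() {}
};

struct Named {
    int nameIndex;
    virtual ~Named() {}
};

// A class inheriting from both; each base is a separate subobject inside it.
struct NamedShape : Shape, Named {
    int extra;
};

// True when the Named subobject does not start at the object's own address.
bool secondBaseIsShifted(NamedShape& obj) {
    Named* asNamed = &obj;            // implicit conversion; the compiler adjusts the address
    const void* objectStart = &obj;
    const void* namedStart  = asNamed;
    return objectStart != namedStart;
}

// Recover the derived object from a base pointer; yields a null pointer
// when the base pointer does not belong to a NamedShape.
NamedShape* recover(Named* base) {
    return dynamic_cast<NamedShape*>(base);
}
```

On the common ABIs secondBaseIsShifted() returns true, which is exactly why a struct-style reinterpretation of such an object cannot be trusted.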

If you want a thorough discussion of how compilers align data, get a copy of "Inside the C++ Object Model" by Stanley Lippman. You will find 300+ pages devoted to how, what and WHY compilers align data, structures and functions the way they do.

-Vormav-
01-16-2007, 05:59 AM
playmesumch00ns wrote: "The statement "I need to know the organisation of my data members in memory" sounds very dangerous to me. Can I ask why? Doing anything that relies on this will be very likely to break (potentially subtly and with disastrous consequences) if you change compiler, and even compiler version."
Sure. It was just a rather naive idea for a method of creating a C++ interface in Mental Ray. The issue is that, when Mental Ray loads up, it allocates memory for all of your shader's parameters (based on how they've been defined in the .mi definition) in line, just as if they had been allocated as a struct. A pointer to the beginning of this block of data is then passed on to the shader author.
Traditionally, in order to access individual parameters from this block of data, the shader writer is required to define a struct that matches the shader parameters exactly. When the shader is called up, the pointer is cast as a struct pointer of this type. And if the layout of your struct is identical to that of the memory block that's been allocated for your parameters, the members of your struct will point to each individual parameter.
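
The traditional approach described above can be sketched roughly as follows. This is not Mental Ray code: Color is a hypothetical stand-in for miColor (assumed here to be four floats), the struct and function names are invented, and the block pointer is assumed to point at memory laid out exactly like the struct.

```cpp
#include <cassert>

// Hypothetical stand-in for miColor; the real type comes from the Mental Ray headers.
struct Color {
    float r, g, b, a;
};

// Struct declared to mirror the shader's parameter block, in the same order
// as the parameters are defined (names are made up for this sketch).
struct MyShaderParams {
    int   someInt;
    Color param1;
    Color param2;
};

// View the raw parameter block handed to the shader through the matching struct.
// Only valid if the block really has this layout and alignment.
const MyShaderParams* asParams(const void* block) {
    return static_cast<const MyShaderParams*>(block);
}

// Example accessor: read one field through the struct view.
float redOfParam1(const void* block) {
    return asParams(block)->param1.r;
}
```

The whole scheme rests on the struct declaration and the block agreeing field for field, which is the assumption the rest of this post tries to carry over to classes.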

That's fine. However, I wanted a nice C++-ified interface. (Just for the record, I know that this isn't totally necessary. Writing shaders in pure C is still easy. I'm attempting it more for the sake of experimentation, i.e., just for the hell of it.) For the sake of this interface, I wanted to have a generalized ParameterList object that would point to the parameters instead (and do a whole lot of other nice things). I also wanted creating such an object to remove the need to define the struct, too.
So you could have, say...

class MyShaderParameters : public ParameterList
{
    Parameter<int> someInt;
    Parameter<miColor> param1;
    Parameter<miColor> param2;
};

...and be done with it.
But in order to use this class to access the individual parameters, without ever defining the struct that goes right along with it, I'd need to be able to cast a void pointer to my block of data (which is organized like a struct) as a pointer of my class type...which is probably unsafe, since many of the attributes of that pointer might point to unallocated memory space. But as long as you get the pointer to point at the right place, and only ever attempt to access the appropriate memory blocks, I think it'd be safe.

Byte-padding (which I wasn't previously aware of) not considered, this actually works... until you start adding more data before the parameters are declared. Then, before you cast the void pointer (that points to the block of data holding your parameters) as your class type, you'd just have to offset the address pointed to by your void pointer, so that when you cast it as your class type, the parameters in memory are in line with the parameters declared in your object.
For that, I thought up a little trick to get the appropriate offset. Take something like this:

class BaseParamList
{
    char initializer; //always the first attribute
    ...
};

class ParamList : public BaseParamList
{
    //(object related crap)
    char structInitializer; //always right before parameter declarations
    //(Parameters)
    //(other object related crap)
};


Again, byte-padding not considered, you could create a "ParamList" object, and find out the offset of the parameters from the beginning of the object by finding the difference between the address of the "initializer" and "structInitializer" attributes. Then you could cast the void* of your data (which is basically a struct) as your class object, and offset it appropriately so that everything lines up. Say...
ParameterList *paramDataPtr = (ParameterList*)(((char*)pointerToData) + offsetFromObjectStart);
ParameterList params(paramDataPtr);
miColor inColor = params->param1.asColor();

...which actually does work, in simple cases. But yeah, I guess with byte padding, and the various other things that each compiler does with objects, it won't work. Cool idea, but impractical.
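
For comparison, here is a sketch of how an offset like this can be obtained without marker members, using offsetof. It swaps in composition for the inheritance used above so that the type stays standard-layout, which is what offsetof requires; all names are invented for the example. Stepping back from the parameters to the enclosing object is only valid when the parameters really are a member of such an object, which is not the case for a bare block allocated by the renderer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical bookkeeping data that precedes the parameters.
struct Bookkeeping {
    int  refCount;
    bool dirty;
};

// Hypothetical parameter block.
struct Params {
    int   someInt;
    float gain;
};

// Composition instead of inheritance keeps the layout rules simple.
struct ParamHolder {
    Bookkeeping header;
    Params      params;
};

static_assert(std::is_standard_layout<ParamHolder>::value,
              "offsetof is only well-defined for standard-layout types");

// Step back from a Params member to the ParamHolder that contains it.
// Assumption: p points at the `params` member of a live ParamHolder.
ParamHolder* holderFromParams(Params* p) {
    char* raw = reinterpret_cast<char*>(p) - offsetof(ParamHolder, params);
    return reinterpret_cast<ParamHolder*>(raw);
}
```

Because the compiler computes the offset itself, padding is accounted for automatically; what the sketch cannot fix is the underlying problem of pointing an object at memory it does not own.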

Oh well. There are other ways of implementing a 'ParameterList' object. They're just not quite as nice as what I was hoping for. Had the memory organization for objects been identical to structs - or even if it was something easily predictable - I could have even said to hell with this 'ParameterList' object, and have the parameters be contained as attributes within the shader object itself (again, without ever needing to define the struct), which would've been pretty nice...
:shrug:
