Binary data and endian...ness


-Vormav-
02-05-2007, 12:52 PM
Hey guys. I'm writing a binary data translator, and am starting to deal with the issue of big vs. little endian. I understand the issue of byte-swapping just fine; that's easy. But there's one little issue I'm getting confused on: whether endian checking should be implemented as a preprocessor stage, or as a runtime process.

The basic question I have is this: Is the endian type of data in a program determined by the machine running the compiled code, or is the endian type a factor of the compiler being used, or both?
And if this is something decided by the compiler, are there any common predefined macros in C++ to identify the endianness, or is this the sort of custom little macro you'd have to define yourself, based on your knowledge of your compiler and settings? I've looked around for such a macro, but can't seem to find any standard predefined one for it. Yet, I have seen at least one example of endianness being checked as a preprocessor step... So, just looking for the proper way of setting all of this up. If I have a definite way of knowing my endian type, then my job is really quite easy.

Thanks

UrbanFuturistic
02-05-2007, 01:59 PM
As it is, I've never had to deal with any of this directly as I've never compiled on a PPC Mac and, as it goes, I'll probably be using SDL which handles all this stuff for you.

What I can say is that it's both. Endianness at compile time is defined through compiler options so that the data held in the program is in the correct byte order for the processor that will be running it. If you're compiling on the target platform it should be set correctly by default, and if you're cross-compiling it should hopefully be set correctly for your target platform.

gga
02-06-2007, 03:14 AM
"The basic question I have is this: Is the endian type of data in a program determined by the machine running the compiled code, or is the endian type a factor of the compiler being used, or both?"

Neither.

The endianness of the file formats (data) used in your program is determined by you, the programmer. If you want your image/scene/file formats to be all big-endian, for example, you can, even when running on a little endian machine.
Your code can avoid all complications about endianness if your file formats provide an endianness check in and by themselves, by, say, checking for a magic number. If you read your file and find the magic number reversed, you know you need to swap your data, regardless of where the data came from (and without even caring what endianness your OS uses).
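
A minimal sketch of the magic-number idea, decoding bytes with shifts so the host's own byte order never enters the picture. The magic value 0x1A2B3C4D and the names kMagic, FileOrder, detect_order and read_u32 are made up for illustration; they do not come from any real file format.

```cpp
#include <cstdint>

// Hypothetical magic number for an imaginary file format.
constexpr std::uint32_t kMagic = 0x1A2B3C4D;

enum class FileOrder { Big, Little, Unknown };

// Decode 4 bytes as a big-endian value. Shifts work on values, not on
// memory layout, so the result is the same on any host.
inline std::uint32_t decode_be32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Decode 4 bytes as a little-endian value.
inline std::uint32_t decode_le32(const unsigned char* p) {
    return  std::uint32_t(p[0])        | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Look at the first four bytes of the file and decide how it was written.
inline FileOrder detect_order(const unsigned char* header) {
    if (decode_be32(header) == kMagic) return FileOrder::Big;
    if (decode_le32(header) == kMagic) return FileOrder::Little;
    return FileOrder::Unknown;  // not our format, or corrupted
}

// Read any later 32-bit field using the order found in the header.
inline std::uint32_t read_u32(const unsigned char* p, FileOrder order) {
    return order == FileOrder::Big ? decode_be32(p) : decode_le32(p);
}
```

The magic number must not read the same in both directions (0x01000001 would be a poor choice), otherwise the two cases cannot be told apart.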

But to answer your question: the actual endianness is OS dependent. If you do need to find the endianness of your machine, it is often ideal and more efficient to determine what OS you are compiling for at compile/preprocessor time. Usually, this is done by checking the macros the compiler sets for you (#ifdef WIN32, etc.) and extrapolating the endianness from them.
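
A rough sketch of what a preprocessor-time check can look like. It relies on the __BYTE_ORDER__ family of macros that newer GCC and Clang releases predefine, and falls back on the assumption that any target defining _WIN32 is little endian. HOST_IS_LITTLE_ENDIAN and host_is_little_endian are hypothetical names chosen for this example. (C++20 also offers std::endian in <bit>, which makes the macro dance unnecessary where it is available.)

```cpp
// Prefer the compiler's own byte-order macros when they exist.
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  define HOST_IS_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#  define HOST_IS_LITTLE_ENDIAN 0
#elif defined(_WIN32)
   // Assumption: the Windows targets this code is built for are little endian.
#  define HOST_IS_LITTLE_ENDIAN 1
#else
   // Refuse to guess; make the person building the code decide.
#  error "Unknown byte order: define HOST_IS_LITTLE_ENDIAN yourself"
#endif

// Usable in ordinary code and in constant expressions.
constexpr bool host_is_little_endian() {
    return HOST_IS_LITTLE_ENDIAN != 0;
}
```

Failing the build on an unknown platform is a deliberate choice here: a silent wrong guess would only show up later as scrambled data.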

UrbanFuturistic
02-06-2007, 08:51 AM
"But to answer your question, the actual endianness, however, is OS dependent."

It's really not.

Introduction to Endianness (http://www.netrino.com/Publications/Glossary/Endianness.php)
Wikipedia on Endianness (http://en.wikipedia.org/wiki/Endianness)

It doesn't matter which OS you're using on an x86 CPU, it's little endian. End of story. Similarly, the Power PC chip is always big endian. :wip:

gga
02-07-2007, 01:03 AM
Actually, it IS OS dependent (besides being chip dependent). While your first statement is true, the second is not. The PPC chip you mentioned, for example, is a chip that can run either in big endian or in little endian mode (Macs with G4s/G5s use it in big endian mode). It is up to the OS/BIOS to configure it to run either way. The same is true for DEC Alphas, IA64, MIPS, etc.
But, to clarify any confusion with my statement... Just because you are running Linux, Solaris or OSX, for example, does not immediately mean that you know your endianness, as those OSes can run on different chip architectures.

-Vormav-
02-07-2007, 11:46 AM
Thanks for the pointers guys. I think I've got everything all cleared up, though I ended up checking it a little differently. There's an integer type with only two valid values towards the header of the file. I read in that value and check if the value I read is one of those valid values or not. If it is, I know I'm dealing with the right byte order. Otherwise, I try and swap the bytes. If it works, I byte-swap everything else. If it doesn't, I return an error.
Simple enough solution I guess; it's not like the byte order would vary within the file. I probably should have thought of that earlier. :wip:
It's working just fine now, anyway. Definitely needed the byte-swapping on my system.
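
The check described above might look roughly like this. The valid values 1 and 2 and the names OrderCheck, is_valid_type and check_order are invented for the sketch; the real format's field and values are not given in this post.

```cpp
#include <cstdint>
#include <cstring>

enum class OrderCheck { Native, Swapped, Invalid };

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical rule: the header field may only hold 1 or 2.
inline bool is_valid_type(std::uint32_t v) {
    return v == 1u || v == 2u;
}

// Read the field in the host's native order, then decide whether the
// file matches the host, needs swapping, or is not valid at all.
inline OrderCheck check_order(const unsigned char* field) {
    std::uint32_t raw;
    std::memcpy(&raw, field, sizeof raw);  // avoids alignment and aliasing trouble
    if (is_valid_type(raw))         return OrderCheck::Native;
    if (is_valid_type(swap32(raw))) return OrderCheck::Swapped;
    return OrderCheck::Invalid;
}
```

If check_order reports Swapped, every other multi-byte field read from the file goes through swap32 (or its 16/64-bit equivalent) as well. This only works when no valid value turns into another valid value once its bytes are reversed.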

stew
02-07-2007, 08:48 PM
If you're developing for Mac OS X, you should definitely read the Universal Binary Programming Guidelines (http://developer.apple.com/documentation/MacOSX/Conceptual/universal_binary/index.html). Even if you're not on OS X, some of that information might still be useful.

mummey
02-09-2007, 02:39 AM
gga: actually the G5 didn't have a little endian mode.

-Vormav-
02-09-2007, 08:10 AM
Just out of curiosity, would the following snippet of code be a safe way to test for the current (running) endian type (assuming we have no weird types like middle endian)?



int i = 1;
char* ptr = (char*)&i;

// Assumes a 4-byte int. On a little endian machine the low-order byte (1) is stored first.
if( ptr[0] == 1 && ptr[1] == 0 && ptr[2] == 0 && ptr[3] == 0 )
{
    byteOrder = LITTLE_ENDIAN;
}
else
{
    byteOrder = BIG_ENDIAN; //assuming support of only two endian types
}


I may be off on this, but by the way I understand it, if you read an integer (1) byte by byte, then if you're in little endian mode, you can expect to see the data in the first byte. If you're in big endian mode, you'd expect to see the data in the last byte.

...right? :curious:
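
For what it's worth, that reading of byte order is right: the low-order byte comes first in memory on a little endian machine and last on a big endian one. A slightly more defensive variant of the same test, as a sketch only: it uses a fixed-width type and a probe value whose four bytes are all different, so anything that is neither ordering is reported rather than assumed to be big endian. ByteOrder and runtime_byte_order are names made up for this example.

```cpp
#include <cstdint>
#include <cstring>

enum class ByteOrder { Little, Big, Other };

// Inspect how a known 32-bit value is laid out in memory at run time.
inline ByteOrder runtime_byte_order() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);  // copy out the raw bytes

    if (bytes[0] == 0x04 && bytes[1] == 0x03 &&
        bytes[2] == 0x02 && bytes[3] == 0x01) {
        return ByteOrder::Little;  // low-order byte stored first
    }
    if (bytes[0] == 0x01 && bytes[1] == 0x02 &&
        bytes[2] == 0x03 && bytes[3] == 0x04) {
        return ByteOrder::Big;     // high-order byte stored first
    }
    return ByteOrder::Other;       // e.g. a mixed ("middle") ordering
}
```

The enum deliberately avoids the names LITTLE_ENDIAN and BIG_ENDIAN, since some system headers already define macros with those names.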
