If you have ever had to deal with pointer alignment issues, then the C offsetof macro is your friend. Alignment issues are often hard to identify, because the only symptom is odd behaviour in your program. We usually assume the compiler will pad structures appropriately for the architecture we are compiling for. Unfortunately, the layout is not the same everywhere, and some architectures have stricter alignment requirements than others (for example, needing data to sit on 32-bit boundaries). The layout can be controlled with compiler extensions such as GCC's __attribute__((packed)) and __attribute__((aligned(n))), or with some gcc flags. You can, however, also use some of the standard macros in your program to help track down these odd behaviours, one of which is the offsetof macro.
The offsetof macro returns the offset, in bytes, of a member from the start of its structure type. It is defined as:
// Header required
#include <stddef.h>
// Synopsis (offsetof is a macro, so this is not a real function prototype)
size_t offsetof(type, member);
This is useful because the size and alignment of the fields that make up the structure can differ across implementations. Compilers insert padding bytes between fields to align them correctly for the target CPU. This means that just because you have 1 char followed by 1 int, the int may not start 1 byte after the start of the char.
Knowing how the bytes are laid out in memory can help identify where pointer alignment issues may occur, especially if you are pointing to structures and modifying data in memory.
An example of using the offsetof macro:
#include <stdio.h>
#include <stddef.h>

struct test {
    char a;
    int b;
    double c;
    long d;
};

int main(void)
{
    /* offsetof yields a size_t, so print it with %zu rather than %u */
    fprintf(stdout, "char: %zu, int: %zu, double: %zu, long: %zu\n",
            offsetof(struct test, a),
            offsetof(struct test, b),
            offsetof(struct test, c),
            offsetof(struct test, d));
    return 0;
}
Running the program:
$ ./offset
char: 0, int: 4, double: 8, long: 16
All offsets are measured from the start of the structure, so our char sits at offset 0 and the integer begins 4 bytes later. Wait, isn't a char only 1 byte? It is. The compiler has inserted 3 bytes of padding so that the integer begins on a 32-bit boundary, which avoids a misaligned access. This is what we want, and from our perspective it makes no difference, as long as any code that references the integer looks 4 bytes past the char rather than 1 byte past it. Keep in mind that these numbers come from one particular compiler and architecture, and another platform may lay the structure out differently.
With GCC, the offsetof macro is defined using a compiler built-in. Let's take a look:
#define offsetof(st, m) __builtin_offsetof(st, m)
Behind the scenes the compiler works out the offset of the member at compile time and hands it back to the programmer as a constant of type size_t. Older implementations got the same result with pointer arithmetic on a null pointer cast to the structure type, roughly ((size_t)&((st *)0)->m). Nifty macro! By using offsetof and displaying pointer addresses, you can narrow down oddities in your code that may be due to pointer alignment and byte boundary issues.