Friday, 19 April 2013

Alignment Agnosticism (avoid #pragma pack)

A nasty problem that can occur in C/C++ is mismatched alignment when passing structs as parameters. For example, the provider of a module may build their library or DLL using the default alignment setting and ship a header file that declares its structs. If the consumer of the module compiles that header file with a different alignment setting then all sorts of nasty problems can occur, since the caller and the callee have different ideas of the layout of the structure in memory. Alignment is generally controlled by using a command line switch (eg, /Zp in MSC) or with the #pragma pack() compiler directive.

How should this sort of problem be avoided? I have seen arguments from both sides (depending on whether you are the provider or consumer of the library).

Consumer: The provider should protect the use of their header (using #pragma pack) to make it independent of the current alignment setting.

Provider: The client should make sure the default alignment setting is active before #including header files.  If it needs to be changed before a struct declaration then it should be set back to the default immediately afterwards.

So who is correct? Actually, both sides have good points. If either (or preferably both) follow my recommendations (see below) the problem can be avoided.

What is padding for?

The C standard allows a compiler to add padding bytes to structs (and unions) after any field to allow the following field to be aligned according to any specific requirements of the compiler (or the user of the compiler). The standard does not specify how the alignment is chosen, but typically a compiler will provide a command line argument to specify the (default) alignment. Good compilers also invariably support the de facto standard of #pragma pack, including push and pop options.

Padding bytes improve performance by reducing the number of memory accesses needed to read or write suitably aligned data. For example, on a 32-bit processor (more specifically a system which uses memory with 32 data lines) reading or writing a 32-bit integer will require two memory accesses rather than just one unless it is stored on a 4-byte boundary (ie, unless the bottom two bits of the address of the integer are zero).
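The "bottom two bits" test can be written directly in C. This is only a minimal sketch, assuming a C99 compiler that provides uintptr_t; the function name is_aligned is hypothetical and chosen just for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* uintptr_t, uint32_t (C99) */

/* Returns 1 if p sits on a boundary that is a multiple of n bytes.
   n must be a power of two, so (n - 1) is a mask of the low address bits. */
int is_aligned(const void *p, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);   /* powers of two only */
    return ((uintptr_t)p & (n - 1)) == 0;
}
```

For a 4-byte boundary the mask is 3, that is, the bottom two bits of the address.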

Some processors have strict alignment requirements, which means that accessing badly aligned data will actually terminate the program. For example, on the M68000 integers must be stored at even addresses. However, the compiler should not allow the use of #pragma pack() to cause this problem.

The performance advantage of aligning data properly can be significant. In the past I have found that inner loop speeds can be more than doubled by fixing alignment issues. Nowadays I suspect that CPU caches mean that mis-alignment does not affect performance so much.

How does it work?

I won't explain in detail how padding is done as there are probably many other sources of this information. The rules are fairly simple but the consequences may be hard to comprehend, so you may want to read further and ponder different scenarios.

In brief, the values allowed for #pragma pack() are invariably powers of two. For example, Microsoft C has always supported 1, 2, 4 and 8 (and now also supports 16). Each scalar field of a struct has an alignment requirement equal to its size. (An array has an alignment requirement equal to that of its base type; a nested struct has a requirement equal to the largest requirement among its fields.)

Padding bytes are added before a field (except the first) to force alignment to the smaller of the alignment requirement of the field and the current alignment setting. Padding bytes may also be added at the end of the struct so that its size is a multiple of the largest (effective) alignment requirement among its fields. (This is mainly so that all elements of an array of the struct will be suitably aligned.)
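These rules are mechanical enough to simulate. The following is a rough sketch of the rules as described here, not of what any particular compiler actually does; struct field_desc, align_up and simulate_layout are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one field: its natural alignment requirement
   and its total size in bytes (an array of 5 chars has align 1, size 5). */
struct field_desc { size_t align; size_t size; };

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Fills offsets[] with the start of each field and returns the struct size,
   given the current alignment setting (pack). */
size_t simulate_layout(const struct field_desc *fields, size_t count,
                       size_t pack, size_t *offsets)
{
    size_t offset = 0;
    size_t largest = 1;   /* largest effective alignment seen so far */
    size_t i;

    for (i = 0; i < count; i++) {
        /* effective alignment: smaller of the field's own and the setting */
        size_t a = fields[i].align < pack ? fields[i].align : pack;

        offset = align_up(offset, a);    /* padding before the field */
        offsets[i] = offset;
        offset += fields[i].size;
        if (a > largest)
            largest = a;
    }
    return align_up(offset, largest);    /* padding at the end */
}
```

Describing the fields of a struct as (alignment, size) pairs and trying pack values of 1, 4 and 8 is a quick way to check a layout by hand, including the worked example that follows.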

For example, consider the following struct.


#pragma pack(4)
struct
{
    char a;         // 1 padding byte after a
    short b;
    char c[5];      // 3 padding bytes after c
    double d;
    char e[2];      // 2 padding bytes added at the end
};

Here one padding byte is added before b so it is aligned to a 2-byte boundary, because a short has an alignment requirement of 2 bytes. Three bytes are added before d to take it to a 4-byte boundary, because a double (assuming it's 8 bytes) has an alignment requirement of 8, but the alignment as specified in the #pragma pack() is only 4. Finally, two padding bytes are added at the end to take the struct size to 24 - this ensures that the elements of an array of structs have the required alignment (in this case 4, due to the presence of d).
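The offsets can be confirmed with offsetof. This is a sketch under some assumptions: a 2-byte short, an 8-byte double, and a compiler that supports the push/pop form of #pragma pack (used here instead of a bare #pragma pack(4) so that the setting does not leak into later code). The tag example and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdio.h>

#pragma pack(push, 4)     /* same alignment setting as the struct above */
struct example            /* hypothetical tag; the original is unnamed */
{
    char a;
    short b;
    char c[5];
    double d;
    char e[2];
};
#pragma pack(pop)

/* Print where each field starts and the overall size. */
void print_example_layout(void)
{
    printf("a at %zu\n", offsetof(struct example, a));
    printf("b at %zu\n", offsetof(struct example, b));
    printf("c at %zu\n", offsetof(struct example, c));
    printf("d at %zu\n", offsetof(struct example, d));
    printf("e at %zu\n", offsetof(struct example, e));
    printf("size %zu\n", sizeof(struct example));
}

/* Check the layout against the description in the text. */
void check_example_layout(void)
{
    assert(offsetof(struct example, b) == 2);    /* 1 padding byte after a  */
    assert(offsetof(struct example, c) == 4);
    assert(offsetof(struct example, d) == 12);   /* 3 padding bytes after c */
    assert(offsetof(struct example, e) == 20);
    assert(sizeof(struct example) == 24);        /* 2 padding bytes at end  */
}
```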

Graphically, the bytes are laid out like this:

offset:  0   1    2-3  4-8  9-11  12-19  20-21  22-23
field:   a   pad  b    c    pad   d      e      pad

Problems

Problems occur when structures are created with one alignment setting but used elsewhere, or at a later time, with a different setting. One way this could happen is if structures have been written to a file and then the software that uses the structures is rebuilt with a different alignment setting. Old files, written using the original alignment setting, will be unusable due to misalignment of the data bytes.

However, the most common way that this problem occurs is in header files that define an interface to a module of some sort, such as a library or DLL.

For example, I encountered a very nasty problem when I was a young C programmer in the mid-1980s. I had developed a program that wrote data to binary files using different structs. I also used structs to pass data between different modules. At some stage we required the use of a third party library which provided a header file allowing us to interface with the library.

After #including the header file (in two of many modules) many strange and inexplicable things started to happen to the software. Suddenly structures passed between modules were completely different between the caller and the callee. Also reading data from old files resulted in seemingly "corrupted" values.

The problem was that the vendor had used #pragma pack(2) in their header file, but neglected to set it back to the default. This affected private structures in my source code. It could also affect structures in other header files, if #included after the offending one, unless they themselves were protected using #pragma pack().

In those days, the Microsoft compiler I was using did not support the push/pop options of #pragma pack. However, the sensible thing for the vendor to do would have been to set alignment back to the default after changing it, like this:

#pragma pack(4)      // change alignment to 4
struct { ... };      // struct that requires alignment of 4
#pragma pack()       // reset alignment back to the default

Even better would be to design their structs to avoid the need for ever using #pragma pack() as I will explain now.

Alignment Agnostic

After the above experience my strategy was to always create structs that are what I call alignment agnostic. That is, the struct will have the same layout in memory, irrespective of the active alignment setting. In other words, carefully manage the size and placement of the fields of a struct so the compiler will never add padding bytes automatically.

How do you create alignment agnostic structs?

First, you have to understand how and where padding bytes would be added. If you did not understand my brief explanation above then you might try reading about it elsewhere.

You also need to know the size of your data types, especially if the structures need to be portable. For example, is an int 16 bits or 32 bits? Of course, if the structures need to be portable then you should define your own types, such as INT16, INT32, etc and use those in the structures.
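One possible way to define such types is sketched below. It assumes a C99 compiler with <stdint.h>; on an older compiler the typedefs would have to name short, long and so on directly. The check names are hypothetical.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* exact-width integer types (C99) */

typedef int16_t INT16;
typedef int32_t INT32;

/* Compile-time check that also works before C11: the array size becomes
   negative (a compile error) if a type is not the size its name promises. */
typedef char INT16_is_2_bytes[(sizeof(INT16) == 2) ? 1 : -1];
typedef char INT32_is_4_bytes[(sizeof(INT32) == 4) ? 1 : -1];

/* Run-time version of the same check, eg for a start-up self-test. */
void check_fixed_sizes(void)
{
    assert(CHAR_BIT == 8);
    assert(sizeof(INT16) == 2);
    assert(sizeof(INT32) == 4);
}
```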

Then you can use these techniques:

1. Reorder fields in the structure. For example, you might group two short fields together to make a unit of 4 bytes, then follow it by a long (32-bit integer) so all three fields take up 8 bytes.

Not only does this avoid alignment problems but you may find that the struct is in fact smaller. And the decrease in size is not accompanied by a decrease in performance, as everything is still nicely aligned.

2. Increase the size of numeric fields. For example, using a short instead of an unsigned char may avoid the addition of a padding byte if it is on an even (word) boundary and the following field has an alignment requirement of 2.

This also has the advantage that you might avoid an unanticipated overflow with integers or loss of significance with floating point numbers.

3. Increase the length of strings. For example, add one or two more characters to the end of the string to allow the following field to align to a 4-byte boundary (for a following 32-bit value) or 8-byte boundary (eg for a double).

This also has the advantage of lessening the chance of buffer overflow problems. For example, someone may forget to allow for the string terminator and copy an extra byte into a character array.

4. Add dummy padding bytes manually.

For example, see char unused[6] in the example below.

Using the above struct as an example, we could reorganize like this.

struct
{
    char a;
    char c[5];
    short b;
    double d;
    char e[8];
};

Note that the struct is still 24 bytes in length but the positions of the fields (and the size of the struct) will not vary depending on the current alignment setting. This is because
  1. the short field (b) is on a 2-byte boundary
  2. the double field (d) is on an 8-byte boundary
  3. the size of e is increased so the length is a multiple of 8 bytes

Alternatively you can add a padding field at the end instead of increasing the size of e. This has the advantage that you could add extra field(s) in place of the unused bytes.


struct
{
    char a;
    char c[5];
    short b;
    double d;
    char e[2];
    char unused[6];
};

Note that for alignment up to 4, you only need to make unused 2 bytes in size, giving the struct a size of 20. But if alignment of 8 (or greater) is ever used then d needs to be on an 8-byte memory boundary and the struct size needs to be 24 bytes (ie, a multiple of 8).
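A simple way to confirm that a struct really is alignment agnostic is to compile the same field list under two different settings and compare the results. The sketch below assumes a 2-byte short, an 8-byte double and a compiler that supports #pragma pack(push, n); the macro and struct names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* The field list is written once and expanded under two pack settings. */
#define AGNOSTIC_FIELDS \
    char a;             \
    char c[5];          \
    short b;            \
    double d;           \
    char e[2];          \
    char unused[6];

#pragma pack(push, 1)
struct agnostic_pack1 { AGNOSTIC_FIELDS };
#pragma pack(pop)

#pragma pack(push, 8)
struct agnostic_pack8 { AGNOSTIC_FIELDS };
#pragma pack(pop)

#define SAME_OFFSET(field) \
    (offsetof(struct agnostic_pack1, field) == \
     offsetof(struct agnostic_pack8, field))

/* Fails an assertion if the layout depends on the alignment setting. */
void check_agnostic_layout(void)
{
    assert(sizeof(struct agnostic_pack1) == sizeof(struct agnostic_pack8));
    assert(SAME_OFFSET(a));
    assert(SAME_OFFSET(c));
    assert(SAME_OFFSET(b));
    assert(SAME_OFFSET(d));
    assert(SAME_OFFSET(e));
    assert(SAME_OFFSET(unused));
}
```

Running the same check on the original (unreordered) struct would fail, since packing to 1 removes all of its padding.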


Bit-fields

At this stage it would be remiss of me not to mention how bit-fields work. Bit-fields probably cause the most confusion when dealing with alignment. (If you don't use bit-fields you can skip this section.)

First, the C standard only requires support for bit-fields declared as int, signed int or unsigned int (and _Bool since C99); whether other types such as char, long or unsigned short are accepted, and how they are treated, is up to the compiler. If a compiler treats all bit-fields the same, packing them together in the same storage unit, and the storage unit is the natural integer size on the processor (usually the size of int), then the following structures would all have the same size. This would typically be 4 bytes on a 32-bit processor (probably 2 bytes on a 16-bit processor).

struct { unsigned char a: 1; };
struct { unsigned short b: 1; };
struct { unsigned int c: 1; };

However, most compilers allow an extension to the standard where the bit-field storage unit is taken from the base field type. In many compilers the above structures would have sizes of 1, 2 and 4 bytes respectively.

Note that consecutive bit-fields are placed by the compiler into the same storage unit, until it overflows and a new storage unit is begun. However, a bit-field of a different base type may also trigger a new storage unit. So, with a compiler that behaves this way, the total size of the following struct would be 4, which may surprise you. The first storage unit (containing a and b) has size 1, then there is a padding byte, followed by another storage unit (containing c) of size 2.

struct
{
   unsigned char a: 1;
   unsigned char b: 2;
   unsigned short c: 1;
};

Graphically:

offset:  0      1    2-3
field:   a, b   pad  c

[Actually, if alignment is 1, then there is no padding byte and the struct size is 3.]

I guess the point is that when creating alignment agnostic structures with bit-fields you have to know how the compiler(s) you use allocate bit-field storage units. When bit-field storage units are used they have padding bytes added just as if they were integers of the same size.
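Since the results depend on the compiler, the simplest way to find out is to ask it. The following sketch just prints the sizes of the structs discussed above; the struct tags and function name are hypothetical, and it assumes the compiler accepts unsigned char and unsigned short as bit-field types (an extension, as noted). No expected output is given because it legitimately differs between compilers.

```c
#include <assert.h>
#include <stdio.h>

struct bits_char  { unsigned char  a : 1; };
struct bits_short { unsigned short b : 1; };
struct bits_int   { unsigned int   c : 1; };

struct bits_mixed
{
    unsigned char  a : 1;
    unsigned char  b : 2;
    unsigned short c : 1;
};

/* Print the size the current compiler and alignment setting give each struct. */
void print_bitfield_sizes(void)
{
    struct bits_mixed m = { 1, 3, 1 };

    /* Whatever the layout, the stored values themselves must survive. */
    assert(m.a == 1 && m.b == 3 && m.c == 1);

    printf("char base:  %zu\n", sizeof(struct bits_char));
    printf("short base: %zu\n", sizeof(struct bits_short));
    printf("int base:   %zu\n", sizeof(struct bits_int));
    printf("mixed:      %zu\n", sizeof(struct bits_mixed));
}
```

Building this with each compiler (and each alignment setting) you need to support shows whether a struct containing bit-fields can be made alignment agnostic at all.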

Recommendations

For creators of header files:
  • make structures alignment agnostic
  • test that sizeof(your_struct) is invariant with different alignments (say 1, 8)
  • if you don't want to do that, or don't know how, then use #pragma pack push/pop
  • add #pragma pack(push, n) in the header immediately before your struct(s)
  • add #pragma pack(pop) immediately after your last struct
  • never change the alignment before #including another header file
  • if push/pop is not supported at least use #pragma pack() to restore the default

For users of header files:
  • never change alignment before #including a header (in case it has problems)
  • protect structures in your own code with #pragma pack push/pop
  • OR better make your structures alignment agnostic
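As an illustration of the push/pop recommendation, the sketch below shows what the body of a well-behaved header might look like, together with a check that the consumer's own structs are unaffected. All struct and function names are hypothetical, and it assumes a compiler that supports #pragma pack(push, n) and an 8-byte double.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Declared before the (hypothetical) vendor header: consumer's setting. */
struct before_header { char tag; double value; };

/* ---- what the body of a protected vendor header might contain ---- */
#pragma pack(push, 4)     /* save the current setting, then use 4 */
struct vendor_record { char tag; double value; };
#pragma pack(pop)         /* restore whatever the consumer was using */
/* ------------------------------------------------------------------ */

/* Declared after the header: should be laid out like before_header. */
struct after_header { char tag; double value; };

/* Fails an assertion if the header leaked its alignment setting. */
void check_header_is_safe(void)
{
    assert(sizeof(struct before_header) == sizeof(struct after_header));
    assert(offsetof(struct before_header, value) ==
           offsetof(struct after_header, value));

    /* Inside the header the setting of 4 caps the alignment of value. */
    assert(offsetof(struct vendor_record, value) <= 4);
}
```

If the pop were missing, after_header would be compiled with an alignment of 4, which is exactly the kind of silent change described in the Problems section.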

Summary

Luckily, this sort of problem is not common any more. C software does not often write structures to disk or pass stuff around using structs - all sorts of other things are available such as XML serialisation. When structs are used vendors have learnt to protect their header files.

The problem highlights another reason why I like C#. .NET does not require the use of header files to define the interface for a module; instead an assembly's metadata is used by the compiler to find out about functions, their parameters, and anything else that is exposed publicly. This system avoids this and other problems that affect C and C++ programs.