Aligning data on boundaries can help performance. The Intel® compiler attempts to align data on boundaries for you. However, as in all areas of optimization, coding practices can either help or hinder the compiler and can lead to performance problems. Always attempt to optimize using compiler options first. See Optimization Options Summary for more information.
To avoid performance problems, keep the following guidelines in mind. They are grouped by architecture:
IA-32, Intel® 64, and IA-64 architectures:
Do not access or create data at large intervals that are separated by exactly 2^n (for example, 1 KB, 2 KB, 4 KB, 16 KB, 32 KB, 64 KB, 128 KB, 512 KB, 1 MB, 2 MB, 4 MB, 8 MB, etc.).
Align data so that memory accesses do not cross cache lines (for example, 32 bytes, 64 bytes, 128 bytes).
Use the Application Binary Interface (ABI) for the Itanium® compiler to ensure that ITP pointers are 16-byte aligned.
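The first two guidelines above can be sketched in C11 as follows. This is a minimal illustration, not compiler-specific guidance: the 64-byte cache-line size, the grid dimensions, and every name (CACHE_LINE, padded_grid, record, and the helper functions) are assumptions chosen for the example. The grid pads each row so that rows are not exactly 4 KB apart, and the small record is aligned to its own size so that one record never straddles a cache line.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed cache-line size in bytes; check your target */
#define COLS 1024      /* 1024 floats = 4096 bytes, exactly a power of two */
#define PAD 16         /* 16 floats = 64 bytes of padding per row (assumed) */

/* Hypothetical grid: each row holds COLS + PAD floats, so walking down a
   column strides by 4160 bytes instead of exactly 4096. */
typedef struct {
    alignas(CACHE_LINE) float cell[4][COLS + PAD];
} padded_grid;

/* Hypothetical 16-byte record, aligned to 16 bytes so that a single
   record never crosses a 64-byte cache-line boundary. */
typedef struct {
    alignas(16) int32_t v[4];
} record;

/* Returns nonzero when n is an exact power of two. */
int is_power_of_two(size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* Distance in bytes between the starts of two consecutive grid rows. */
size_t row_stride_bytes(void) {
    return (COLS + PAD) * sizeof(float);
}

/* Returns nonzero when the byte range [p, p + len) spans more than one
   cache line. len must be greater than zero. */
int crosses_cache_line(const void *p, size_t len) {
    assert(len > 0);
    uintptr_t first = (uintptr_t)p / CACHE_LINE;
    uintptr_t last = ((uintptr_t)p + len - 1) / CACHE_LINE;
    return first != last;
}
```

Padding costs a little memory per row. Whether it pays off depends on the access pattern, so measure before and after.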
IA-32 and Intel® 64 architectures:
Align data to correspond to the SIMD or Streaming SIMD Extensions (SSE) register sizes.
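As a rough sketch of the guideline above, the following C11 code aligns both static and heap data to 16 bytes, the size of one SSE register. The names SIMD_WIDTH, scratch, is_simd_aligned, alloc_simd_floats, and add_arrays are hypothetical, and the 16-byte width is an assumption; wider vector registers would need a larger value. The loop is plain C, and the alignment only gives the compiler the opportunity to use aligned vector loads and stores.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIMD_WIDTH 16  /* bytes in one SSE register (assumed target width) */

/* Static data: request the alignment in the declaration. */
static alignas(SIMD_WIDTH) float scratch[8];

/* Returns nonzero when p sits on a SIMD_WIDTH-byte boundary. */
int is_simd_aligned(const void *p) {
    return (uintptr_t)p % SIMD_WIDTH == 0;
}

/* Returns the statically aligned scratch buffer. */
float *scratch_buffer(void) {
    return scratch;
}

/* Heap data: C11 aligned_alloc requires the size to be a multiple of the
   alignment, so round the byte count up first. Release with free(). */
float *alloc_simd_floats(size_t count) {
    size_t bytes = count * sizeof(float);
    bytes = (bytes + SIMD_WIDTH - 1) / SIMD_WIDTH * SIMD_WIDTH;
    return aligned_alloc(SIMD_WIDTH, bytes);
}

/* Element-wise dst = a + b. The assert documents the alignment that the
   caller is expected to provide; it does not vectorize anything itself. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n) {
    assert(is_simd_aligned(dst) && is_simd_aligned(a) && is_simd_aligned(b));
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```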
IA-64 architecture:
Avoid using packed structures.
Avoid casting pointers of small data elements to pointers of large data elements.
Do computations on unpacked data, then repack the data if necessary so that it is output correctly.
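The three guidelines above might look like this in practice. The sketch assumes a hypothetical 7-byte packed layout (a 1-byte tag, a 4-byte count, and a 2-byte flags field) stored in native byte order; the names WIRE_SIZE, work_record, unpack_record, and pack_record are invented for the example. Fields are copied with memcpy rather than by casting a byte pointer to a wider pointer type, all arithmetic happens on the naturally aligned work_record, and the data is repacked only for output.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed packed layout, native byte order:
   offset 0: tag (1 byte), offset 1: count (4 bytes), offset 5: flags (2 bytes) */
#define OFF_TAG 0
#define OFF_COUNT 1
#define OFF_FLAGS 5
#define WIRE_SIZE 7

/* Unpacked working copy with natural alignment for every field. */
typedef struct {
    uint32_t count;
    uint16_t flags;
    uint8_t tag;
} work_record;

/* Copy each field out of the packed bytes. memcpy avoids dereferencing a
   misaligned uint32_t or uint16_t pointer into the buffer. */
work_record unpack_record(const unsigned char *buf) {
    work_record w;
    memcpy(&w.tag, buf + OFF_TAG, sizeof w.tag);
    memcpy(&w.count, buf + OFF_COUNT, sizeof w.count);
    memcpy(&w.flags, buf + OFF_FLAGS, sizeof w.flags);
    return w;
}

/* Write the working copy back into the packed layout for output. */
void pack_record(unsigned char *buf, const work_record *w) {
    memcpy(buf + OFF_TAG, &w->tag, sizeof w->tag);
    memcpy(buf + OFF_COUNT, &w->count, sizeof w->count);
    memcpy(buf + OFF_FLAGS, &w->flags, sizeof w->flags);
}
```

A typical use is to call unpack_record once, update the fields of the work_record as often as needed, and call pack_record only when the bytes have to leave the program.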
In general, keeping data in cache improves performance more than keeping the data aligned. Try to use techniques that conform to the guidelines listed above.
See Setting Data Type and Alignment for more detailed information on aligning data.