The Intel compiler supports a variety of options and features that create optimization opportunities; in most cases, however, you will benefit most by applying the optimization strategies in the order listed below.
Use the automatic optimization options, like -O1, -O2, -O3, or -fast (Linux* and Mac OS*) and /O1, /O2, /O3, or /fast (Windows*) to determine what works best for your application. Compile with each option in turn and measure the resulting performance after each compilation.
You might find that specific options work better on a particular architecture:
IA-32 and Intel® 64 architectures: start with -O2 (Linux and Mac OS) or /O2 (Windows).
IA-64 architecture: start with -O3 (Linux and Mac OS) or /O3 (Windows).
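The compile-and-measure loop described above can be sketched as a small script. This is only an illustration under stated assumptions: the source file name `app.c`, the output naming scheme, and the helper names are hypothetical, the compiler driver is assumed to be `icc` on Linux or Mac OS, and the application is assumed to run to completion without arguments.

```python
import subprocess
import time

# Automatic optimization options named in the text (Linux and Mac OS spellings).
LEVELS = ["-O1", "-O2", "-O3", "-fast"]


def build_command(level, source="app.c", compiler="icc"):
    """Return the compile command for one optimization level.

    The source name and compiler driver are assumptions; adjust to your project.
    """
    output = "app" + level.replace("-", "_")  # e.g. "-O2" becomes "app_O2"
    return [compiler, level, source, "-o", output]


def time_run(executable, repeats=3):
    """Run the executable several times and return the best wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([executable], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def measure_all(levels=LEVELS):
    """Compile once per level, time each binary, and return {level: seconds}."""
    results = {}
    for level in levels:
        cmd = build_command(level)
        subprocess.run(cmd, check=True)           # compile
        results[level] = time_run("./" + cmd[-1])  # measure
    return results
```

Calling `measure_all()` on a machine with the compiler installed would return one timing per option, which makes it easy to see whether a more aggressive level actually helps a given workload.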
If you plan to run your application on specific architectures, experiment with combining the automatic optimizations with compiler options that specifically target processors.
You can combine the -x and -ax (Linux* and Mac OS*) or /Qx and /Qax (Windows*) options to generate both code that is optimized for specific Intel processors and generic code that will run on most processors based on IA-32 architecture.
Attempt to combine the automatic optimizations with the processor-specific options before applying other optimization techniques.
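As a rough illustration of combining an automatic optimization level with processor-targeting options, the sketch below assembles the flag list for either platform. The helper name `target_flags` is hypothetical, and the processor codes `SSE2` and `SSE4.2` are assumptions chosen for the example; check your compiler version's documentation for the codes it accepts.

```python
def target_flags(opt_level, baseline, dispatch=None, windows=False):
    """Combine an optimization level with -x/-ax (or /Qx, /Qax) options.

    opt_level: "O1", "O2", or "O3" without the platform prefix.
    baseline:  processor code passed to -x (assumed example: "SSE2").
    dispatch:  optional processor code passed to -ax for an extra code path.
    """
    if windows:
        opt, x_prefix, ax_prefix = "/" + opt_level, "/Qx", "/Qax"
    else:
        opt, x_prefix, ax_prefix = "-" + opt_level, "-x", "-ax"
    flags = [opt, x_prefix + baseline]
    if dispatch is not None:
        flags.append(ax_prefix + dispatch)
    return flags


# Example: print the flags for both platforms.
print(" ".join(target_flags("O2", "SSE2", "SSE4.2")))
print(" ".join(target_flags("O2", "SSE2", "SSE4.2", windows=True)))
```

Keeping the optimization level and the processor options in one place makes it straightforward to rebuild and re-measure with a different target before moving on to other techniques.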
Experiment with Interprocedural Optimization (IPO) and Profile-guided Optimization (PGO). Measure performance after applying the optimizations to determine whether the application performance improved.
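A PGO experiment typically has three phases: an instrumented build, a training run that collects profile data, and a feedback build that uses the data. The sketch below lists the commands for those phases, optionally adding IPO to the final build. It assumes the Linux and Mac OS option spellings `-prof-gen`, `-prof-use`, and `-ipo`; the function name, file names, and the `icc` driver are hypothetical placeholders.

```python
def pgo_phases(source="app.c", output="app", compiler="icc",
               opt="-O2", use_ipo=True):
    """Return the command lists for a three-phase PGO build.

    All names here are illustrative assumptions, not a prescribed workflow.
    """
    ipo = ["-ipo"] if use_ipo else []
    return [
        # Phase 1: build an instrumented executable.
        [compiler, opt, "-prof-gen", source, "-o", output],
        # Phase 2: run it on representative input to collect profile data.
        ["./" + output],
        # Phase 3: rebuild using the collected profile (plus IPO if requested).
        [compiler, opt, "-prof-use", *ipo, source, "-o", output],
    ]


# Example: show the commands without executing them.
for step in pgo_phases():
    print(" ".join(step))
```

Timing the binary from the final phase against a plain build at the same optimization level shows whether IPO and PGO improved performance for your application.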
Use a top-down, iterative method for identifying and resolving performance-hindering code using performance monitoring tools, like the Intel® VTune™ Performance Analyzer or the compiler reports.
If you plan to run the application on multi-core or multi-processor systems, start parallelizing it with the parallelism options or the OpenMP* options.
Use automatic optimization options and other processor-independent compiler options to generate optimized code that does not take advantage of advances in processor design or extension support. Use the -x (Linux* and Mac OS*) or /Qx (Windows*) option to generate processor dispatch for older processors.