
An Overview of the Intel IA-64 Compiler (continued)
PROFILE-GUIDED OPTIMIZATIONS
The compiler can take fullest advantage of the IA-64 architecture when it has accurate information about the program's execution behavior, called profile information. Profile information consists of an execution frequency for each basic block and a probability for each branch in the program.
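As a rough illustration of what such profile information might look like, the sketch below stores a frequency per basic block and a probability per branch, and derives how often each control-flow edge is expected to execute. The `ProfileInfo` class, its field names, and the sample numbers are hypothetical; the compiler's actual internal representation is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ProfileInfo:
    """Hypothetical container for profile information (not the compiler's own format)."""

    # Execution frequency of each basic block, keyed by block name.
    block_freq: Dict[str, float] = field(default_factory=dict)
    # Probability of each branch, keyed by (source block, target block).
    branch_prob: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def edge_freq(self, src: str, dst: str) -> float:
        """Expected number of times control flows from src to dst."""
        return self.block_freq[src] * self.branch_prob[(src, dst)]


# Made-up numbers for a simple loop that iterates about 100 times.
profile = ProfileInfo(
    block_freq={"entry": 1.0, "loop": 100.0, "exit": 1.0},
    branch_prob={
        ("entry", "loop"): 1.0,
        ("loop", "loop"): 0.99,
        ("loop", "exit"): 0.01,
    },
)

back_edge = profile.edge_freq("loop", "loop")  # roughly 99 traversals
loop_exit = profile.edge_freq("loop", "exit")  # roughly 1 traversal
```

Keeping block frequencies and branch probabilities separate means an optimization can ask either "how hot is this block?" or "which way does this branch usually go?" without recomputing one from the other.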
The Intel IA-64 compiler gathers profile information about the specified program and annotates the program's intermediate language with it. The compiler supports two modes for determining profile information: static and dynamic. Static profile information, as the name suggests, is produced by the compiler without any trial runs of the program. The compiler uses a collection of heuristics, based on knowledge of "typical" program characteristics, to estimate the frequencies and probabilities. Static information is necessarily approximate because it must be general enough to work with all programs. The compiler uses static profile information whenever the optimizer is active, unless the developer selects dynamic profiling.
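The particular heuristics are not listed here, so the following is only a sketch of the general idea, assuming one common rule of thumb: a branch back to a loop header is usually taken. The function name `estimate_branch_probs`, the 0.9 default, and the even split for all other branches are illustrative assumptions, not the Intel compiler's actual rules.

```python
def estimate_branch_probs(successors, back_edges, loop_prob=0.9):
    """Guess a probability for every control-flow edge without running the program.

    successors: dict mapping each block to a list of its successor blocks
    back_edges: set of (src, dst) edges that jump back to a loop header
    loop_prob:  assumed probability that a loop back edge is taken
    """
    probs = {}
    for src, dsts in successors.items():
        loop_targets = [d for d in dsts if (src, d) in back_edges]
        others = [d for d in dsts if (src, d) not in back_edges]
        if loop_targets and others:
            # Assumed heuristic: loops usually iterate, so favor the back edge.
            for d in loop_targets:
                probs[(src, d)] = loop_prob / len(loop_targets)
            for d in others:
                probs[(src, d)] = (1.0 - loop_prob) / len(others)
        else:
            # No applicable heuristic: treat every successor as equally likely.
            for d in dsts:
                probs[(src, d)] = 1.0 / len(dsts)
    return probs


# A tiny control-flow graph: entry -> loop, loop -> loop or exit.
successors = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
back_edges = {("loop", "loop")}
static_probs = estimate_branch_probs(successors, back_edges)
```

Because the same rule is applied to every program, the estimate is wrong for any loop that typically exits early, which is the kind of approximation that dynamic profiling is meant to correct.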

Figure 2: Steps in dynamic profile-guided compilation
Dynamic profile information, or profile feedback, is gathered in a three-step process, as shown in Figure 2. The first step is instrumented compilation: the application developer compiles all or part of the application with the prof_gen option, which produces executable code instrumented to collect profile information. In the second step, the developer runs the instrumented code one or more times with "typical" input sets to gather execution profiles. Finally, the developer compiles the application again, this time with the prof_use option, which combines the gathered profiles and annotates the internal representation of the program with the observed frequencies and probabilities. Many optimizations then read this information and use it to guide their decisions.
The Intel IA-64 compiler uses profile information to guide several optimizations:
- The compiler uses profile information to integrate (inline) the most frequently executed procedures into their call sites, thereby providing the benefits of larger program scope while minimizing code growth.
- Profile information is also used to guide the layout of procedures, and of blocks within procedures, to reduce instruction cache and TLB misses.
- Finally, the compiler uses profile information to make the best use of the machine's instruction width and speculation features. Because the program's execution behavior is known at scheduling time, the instruction scheduler can select the right candidates for speculation.
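To make the feedback step and the first optimization above more concrete, here is a minimal sketch of how profiles from several instrumented runs might be combined into branch probabilities, and how call counts might then steer procedure integration under a code-growth limit. All function names, the tuple layout, and the greedy budget policy are assumptions made for illustration; the compiler's actual profile format and selection policy are not described here.

```python
from collections import Counter


def merge_profiles(runs):
    """Sum per-edge execution counts gathered from several instrumented runs."""
    total = Counter()
    for counts in runs:
        total.update(counts)  # adds counts for matching edges
    return total


def branch_probabilities(edge_counts):
    """Turn raw edge counts into a probability for each outgoing branch."""
    out_total = Counter()
    for (src, _dst), n in edge_counts.items():
        out_total[src] += n
    return {
        (src, dst): n / out_total[src]
        for (src, dst), n in edge_counts.items()
        if out_total[src] > 0
    }


def pick_inline_candidates(call_sites, growth_budget):
    """Greedily choose the hottest call sites that fit a code-growth budget.

    call_sites: list of (name, call_count, callee_size) tuples (hypothetical layout)
    """
    chosen, used = [], 0
    for name, _count, size in sorted(call_sites, key=lambda c: c[1], reverse=True):
        if used + size <= growth_budget:
            chosen.append(name)
            used += size
    return chosen


# Two made-up runs of the same program with different inputs.
run1 = {("A", "B"): 90, ("A", "C"): 10}
run2 = {("A", "B"): 60, ("A", "C"): 40}
merged = merge_profiles([run1, run2])
observed_probs = branch_probabilities(merged)

hot_sites = pick_inline_candidates(
    [("main->f", 1000, 40), ("main->g", 5, 30), ("f->h", 800, 50)],
    growth_budget=100,
)
```

Summing counts before normalizing weights each run by how much work it did; a real tool might instead normalize each run first so that one long run does not dominate the combined profile.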