As mentioned previously, an instrumenting profiler needs to modify the profiled code. Doing this at compile-time is one reasonable method: it is simple to implement, and can provide exact data. Its main drawbacks are the inconvenience of recompiling the target code, and the risk that the inserted profiling code skews the measurements.
The GNU C compiler provides a small number of mechanisms that a profiler can build on. When the -pg option is passed to gcc, the compiler inserts a call to _mcount() into each function prologue (for details see final.c:profile_function() in the gcc sources). This function is ultimately supplied by the C library, and collects the from and to PC values into a data structure, which can then be used to construct call-graph information. The same mechanism is used with the -a option, which is intended to allow basic-block profiling, although it is reputed to work poorly or not at all in a large number of cases.
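To make the "from and to PC values" idea concrete, here is a minimal sketch of an arc table that counts how often each (from, to) pair is seen. It is not the C library's actual data structure: the names arc_record and arc_count, the table size, and the hash function are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arc table, NOT the layout the C library really uses. */
#define ARC_SLOTS 1024u            /* power of two, so we can mask instead of mod */

struct arc {
    uintptr_t from;                /* PC of the call site; 0 marks an empty slot */
    uintptr_t to;                  /* PC of the called function */
    unsigned long count;           /* number of times this arc was traversed */
};

static struct arc arc_table[ARC_SLOTS];

/* Mix both addresses into a slot index. */
static size_t arc_hash(uintptr_t from, uintptr_t to)
{
    uintptr_t h = from * 31u + to;
    h ^= h >> 16;
    return (size_t)(h & (ARC_SLOTS - 1u));
}

/* Record one traversal of the arc from -> to.
 * Returns the updated count, or 0 if the table is full. */
unsigned long arc_record(uintptr_t from, uintptr_t to)
{
    assert(from != 0);             /* address 0 is reserved as the empty marker */
    size_t i = arc_hash(from, to);
    for (size_t probes = 0; probes < ARC_SLOTS; probes++) {
        struct arc *a = &arc_table[i];
        if (a->from == 0) {        /* empty slot: claim it for this arc */
            a->from = from;
            a->to = to;
            a->count = 1;
            return 1;
        }
        if (a->from == from && a->to == to)
            return ++a->count;
        i = (i + 1) & (ARC_SLOTS - 1u);   /* linear probing */
    }
    return 0;
}

/* Look up how many times an arc has been recorded (0 if never). */
unsigned long arc_count(uintptr_t from, uintptr_t to)
{
    size_t i = arc_hash(from, to);
    for (size_t probes = 0; probes < ARC_SLOTS; probes++) {
        const struct arc *a = &arc_table[i];
        if (a->from == 0)
            return 0;
        if (a->from == from && a->to == to)
            return a->count;
        i = (i + 1) & (ARC_SLOTS - 1u);
    }
    return 0;
}
```

A call-graph report is then a matter of walking the table and translating each address pair into function names. The sketch ignores thread safety, which a real implementation would have to handle.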
The GNU C compiler provides another mechanism that can be used for profiling: the -finstrument-functions option[gcc]. With this option, the compiler generates calls at the start and end of each function to the following functions:
void __cyg_profile_func_enter(void (*fn)(), void (*parent)());
void __cyg_profile_func_exit(void (*fn)(), void (*parent)());
You can implement these functions yourself, and use the function pointer values they receive to construct profiling data. Typically, a profiler would use these addresses to look up the function names in the binary image, so that a user-readable call-graph report can be generated. Note that these are weak symbols, so a profiler using this method can be injected via LD_PRELOAD.
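As an illustration, the two hooks could be implemented roughly as follows. This is a sketch under several assumptions: the parameters are declared as void * (the form the GCC manual documents), the trace buffer, its capacity, and the helpers trace_count and trace_at are hypothetical, and the code is not thread-safe. The hooks are marked no_instrument_function so that they are not instrumented themselves if this file is compiled with -finstrument-functions.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 4096             /* hypothetical capacity; extra events are dropped */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

struct trace_event {
    void *fn;                      /* address of the function entered or left */
    void *call_site;               /* address it was called from */
    int depth;                     /* nesting depth of this call */
    int is_enter;                  /* 1 for entry, 0 for exit */
};

static struct trace_event trace_buf[TRACE_MAX];
static size_t trace_len;
static int trace_depth;

/* Append one event, silently dropping it if the buffer is full. */
NO_INSTRUMENT static void trace_push(void *fn, void *call_site, int is_enter)
{
    if (trace_len < TRACE_MAX) {
        struct trace_event *e = &trace_buf[trace_len++];
        e->fn = fn;
        e->call_site = call_site;
        e->depth = trace_depth;
        e->is_enter = is_enter;
    }
}

/* Called by instrumented code on entry to every function. */
NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    trace_push(this_fn, call_site, 1);
    trace_depth++;
}

/* Called by instrumented code just before every function returns. */
NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    if (trace_depth > 0)
        trace_depth--;             /* the exit event gets the same depth as its entry */
    trace_push(this_fn, call_site, 0);
}

/* Hypothetical accessors for whatever later builds the report. */
NO_INSTRUMENT size_t trace_count(void)
{
    return trace_len;
}

NO_INSTRUMENT const struct trace_event *trace_at(size_t i)
{
    assert(i < trace_len);
    return &trace_buf[i];
}
```

The recorded addresses are raw values; turning them into function names requires a separate symbol lookup against the binary image, which is left out here.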
GCC provides increasing support for profile-directed optimization. This technique uses profile data from a run of the program to guide compilation decisions, in the hope that later runs will behave similarly to the profiled one, improving overall performance. Profile collection is enabled by the -fprofile-arcs option; running the resulting binary produces a .da profile containing arc traversal data (in this context, an arc represents a program branch to a basic block, and a basic block is a straight-line section of code). The profile can then be fed back in for a second compile run, this time additionally using -fbranch-probabilities. See the GCC manual and [gccprofiledriven] for more information.
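The kind of code that benefits is code with heavily skewed branches. The sketch below is a made-up example, not taken from the GCC documentation: sum_magnitudes and run_workload are hypothetical names, and the 1-in-100 ratio is an arbitrary choice. The arc into the negative-sample path is traversed far less often than the arc into the common path, which is exactly what the arc counts would reveal to the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Rare path: only reached for negative samples. */
static long handle_negative(long v)
{
    return -v;
}

/* Sum of absolute values, written with an explicit branch so that
 * there are two distinct arcs for the profile to count. */
long sum_magnitudes(const long *samples, size_t n)
{
    assert(samples != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] < 0)
            total += handle_negative(samples[i]);   /* cold arc */
        else
            total += samples[i];                    /* hot arc */
    }
    return total;
}

/* Hypothetical training workload: one negative sample in every 100. */
long run_workload(void)
{
    long samples[1000];
    for (size_t i = 0; i < 1000; i++)
        samples[i] = (i % 100 == 99) ? -(long)i : (long)i;
    return sum_magnitudes(samples, 1000);
}
```

Assuming a program whose main() calls run_workload(), the two-step cycle described above would be: compile with -fprofile-arcs, run the program once to write the profile, then recompile with -fbranch-probabilities added so the compiler can favor the hot arc.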