Profiling ASAP and other Python extensions written in C/C++

A number of profiling tools exist. Most of them work even for dynamically loaded objects, but they generally require special compilation and introduce an overhead that perturbs the measurement. On Linux, the oprofiler does not suffer from these drawbacks, but it does require root access and support from the kernel.

oprofiler prerequisites

The oprofiler must be installed, and support must be enabled in the kernel, for example as a module. You can check for oprofiler support in the current kernel like this:

~ $ zcat /proc/config.gz | grep OPROFILE
CONFIG_OPROFILE=m
# CONFIG_OPROFILE_IBS is not set
CONFIG_HAVE_OPROFILE=y
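
Not every kernel exposes /proc/config.gz. The following is a minimal sketch of the same check with a fallback, assuming the distribution installs the kernel configuration as /boot/config-<release>. That location is an assumption about the system, and the function names filter_oprofile and show_oprofile_config are only illustrative.

```shell
#!/bin/sh
# Sketch: print the OPROFILE-related kernel config lines from whichever
# config source is available. The /boot fallback is an assumption about
# the distribution's layout, not something the oprofiler requires.

# Filter OPROFILE options out of a kernel config read from stdin.
filter_oprofile() {
    grep OPROFILE
}

# Locate a kernel config and filter it; return non-zero if none is found.
show_oprofile_config() {
    if [ -r /proc/config.gz ]; then
        zcat /proc/config.gz | filter_oprofile
    elif [ -r "/boot/config-$(uname -r)" ]; then
        filter_oprofile < "/boot/config-$(uname -r)"
    else
        echo "no kernel config found" >&2
        return 1
    fi
}
```

After sourcing the file, calling show_oprofile_config prints the same kind of lines as the zcat command above, or fails with a message if neither file is readable.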

Compiling the module

Nothing special needs to be done when compiling your module. However, optimization makes it difficult to attribute timings to individual lines of the source code, so if you do line-by-line profiling and the results look weird, try recompiling with -O1 -g.

Starting oprofiler

Disable any CPU frequency throttling, since it may distort the results.
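
How throttling is disabled depends on the system. As one possible sketch, assuming the kernel exposes the cpufreq interface in sysfs, every CPU can be pinned to a single governor by writing to its scaling_governor file. The function name set_governor is hypothetical, and writing to the real sysfs files requires root.

```shell
#!/bin/sh
# Sketch: pin every CPU to one cpufreq governor by writing to sysfs.
# Assumes cpufreq is exposed under the given base directory
# (normally /sys/devices/system/cpu); needs root on a real system.

set_governor() {
    base=$1
    governor=$2
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern when no CPU exposes cpufreq.
        [ -e "$f" ] || continue
        echo "$governor" > "$f"
    done
}
```

For example, set_governor /sys/devices/system/cpu performance asks the kernel to keep all CPUs at their highest frequency for the duration of the measurement. Whether that is sufficient on a given machine is not covered here.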

Only root can start the oprofiler:

# Load the kernel module etc.
opcontrol --init
# We do not want to profile the kernel
opcontrol --no-vmlinux

Taking data

Again, root must start and stop the actual data acquisition:

# Delete any data from previous measurements
opcontrol --reset
# Start taking data
opcontrol --start

Now the kernel samples everything on the system. Any user can run the program being measured:

$ asap-python MyScript.py

Note that I use the static binary compiled for parallel simulations (although the script is serial). This is not necessary, but it avoids mixing Asap profiling data with data from other Python programs running on the system.

After taking data, root must stop the profiler:

# Stop taking data
opcontrol --stop
# Flush the buffers
opcontrol --dump
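
The reset/start and stop/dump steps can be wrapped around a single run so that no step is forgotten. This is only a sketch: the function profile_run and the variable OPCONTROL are hypothetical, and using something like "sudo opcontrol" for non-root callers is an assumption about the local setup.

```shell
#!/bin/sh
# Sketch: wrap one measurement in the reset/start ... stop/dump sequence.
# OPCONTROL is a hypothetical variable; set it to e.g. "sudo opcontrol"
# if the caller is not root (sudo is an assumption, not part of this guide).
OPCONTROL=${OPCONTROL:-opcontrol}

profile_run() {
    $OPCONTROL --reset || return 1   # delete data from previous measurements
    $OPCONTROL --start || return 1   # start taking data
    "$@"                             # run the program being measured
    status=$?
    $OPCONTROL --stop                # stop taking data
    $OPCONTROL --dump                # flush the buffers
    return $status
}
```

With this in place, profile_run asap-python MyScript.py performs the whole measurement and returns the exit status of the script.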

Generating reports

Function/method level profiling

$ opreport --demangle=smart --symbols `which asap-python`

Call graph

$ opreport -cl --demangle=smart `which asap-python`

Line-by-line profiling

$ opannotate --source --output-dir=$HOME/src/annotated `which asap-python`

This creates a directory tree under $HOME/src/annotated/path/to/original/source containing all the source files, annotated with timing information.

For more examples, see the oprofiler home page.

Shutting down oprofiler

Finally, root can stop the oprofiler and unload the kernel module:

opcontrol --deinit