Measuring performance
In general, profilers only give useful information if the binary was compiled with suitable flags, such as debug symbols (-g).
Perf
To use perf, compile OOFEM with the flag -fno-omit-frame-pointer, so that perf can reconstruct the call stacks.
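As a rough sketch of how the flag could be passed to the build, assuming a CMake-based, out-of-tree build: the helper name `configure_for_perf` and the source directory argument are hypothetical, and `RelWithDebInfo` is chosen here only to keep optimizations while adding debug symbols.

```shell
# Hypothetical helper: configure a build that keeps frame pointers.
# Assumes CMake is used; "$1" is the path to the source tree (a placeholder).
configure_for_perf() {
    cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
          -DCMAKE_C_FLAGS="-fno-omit-frame-pointer" \
          -DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer" \
          "$1"
}
```

From an empty build directory this might be called as `configure_for_perf ../oofem && make`, where `../oofem` stands for wherever the sources are checked out.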
First record a session using
perf record -g ./oofem -f myinputfile.in
which will generate a perf.data file.
Then you can visualize the results in several ways. A simple, easy-to-understand method is to use Gprof2Dot to generate a complete call graph:
perf script | ./gprof2dot.py -f perf | dot -Tsvg -o output.svg
Or use the ncurses program
perf report -G --sort comm,dso
Perf has very low overhead, but it only does statistical sampling, so the reported costs are estimates rather than exact counts.
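For large runs the complete call graph can become hard to read. The following is a minimal sketch of a wrapper around the Gprof2Dot pipeline that prunes small contributions. It assumes `gprof2dot` is available on the PATH (for example, installed through pip) rather than as a local `./gprof2dot.py`; the function name `perf_callgraph` and the threshold values are illustrative choices only.

```shell
# Hypothetical helper: render a pruned call graph from a perf recording.
# $1: perf data file (default perf.data), $2: output SVG (default output.svg)
perf_callgraph() {
    data_file="${1:-perf.data}"
    svg_file="${2:-output.svg}"
    # -n 1 hides nodes below 1% and -e 0.5 hides edges below 0.5% of total time
    perf script -i "$data_file" \
        | gprof2dot -f perf -n 1 -e 0.5 \
        | dot -Tsvg -o "$svg_file"
}
```

Calling `perf_callgraph` with no arguments reads `perf.data` from the current directory and writes `output.svg`.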
Callgrind
Callgrind is a tool in valgrind, and should only be used on problems of at most medium size. It has a huge overhead, so expect OOFEM to run over a hundred times slower through valgrind. Simply run
valgrind --tool=callgrind ./oofem -f myinputfile.in
and you will produce a new file named callgrind.out.123456
where the number at the end is the process ID of the run.
Open this file in KCachegrind to visualize the results.
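Because every run writes a new callgrind.out.* file, it can be handy to pick up the most recent one automatically. A minimal sketch, where the helper name `latest_callgrind_out` is hypothetical and the approach assumes the output files have ordinary names without spaces or newlines:

```shell
# Hypothetical helper: print the most recently modified callgrind.out.* file
# in the given directory (default: current directory).
latest_callgrind_out() {
    # ls -t sorts by modification time, newest first
    ls -t "${1:-.}"/callgrind.out.* 2>/dev/null | head -n 1
}
```

The result can be passed straight to the viewer, for example `kcachegrind "$(latest_callgrind_out)"`, or to `callgrind_annotate`, which ships with valgrind, for a plain-text summary.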