Benchmarking and performance

Finding performance bottlenecks is important. The cases in test_assembly.cc are already well optimized, but they do not cover every possible use case.

Profiling a specific program based on GetFEM only requires that the program (if it is a compiled program at all) and GetFEM itself are compiled with the debug flag "-g". The program can then be started under perf, as in these examples:

OMP_NUM_THREADS=1 perf record --call-graph dwarf ./test_assembly

or

OMP_NUM_THREADS=1 perf record --call-graph dwarf python3 gf_benchmark.py
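Both commands only give useful call graphs if the debug information is really there. As a quick sanity check before recording, something like the following could be used. This is only a sketch: has_debug_info is a made-up helper name, and it assumes an ELF binary (Linux) and that readelf from binutils is installed.

```shell
# Hypothetical helper, not part of GetFEM or perf.
# Succeeds if the given ELF file contains a .debug_info section,
# which is what compiling with -g normally produces.
has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Example use (prints nothing if the section is missing):
#   has_debug_info ./test_assembly && echo "debug info present"
```

The same check can be pointed at the GetFEM shared library that the program or the Python interface loads; its location depends on the installation.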

The profiled program will run at close to its normal speed, and perf will create a fairly large file called perf.data containing all the profiling information. To visualize this file, just start hotspot in the same folder with

hotspot

By default, hotspot looks for a file called perf.data in the current folder and visualizes its contents.
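If the recording is stored somewhere else, or hotspot is not installed on the machine, a small wrapper along these lines might help. It is a sketch under assumptions: show_profile is a made-up name, hotspot is assumed to accept the path of the recording as its argument, and the fallback is the plain text report that perf itself can print.

```shell
# Hypothetical helper, not part of hotspot or perf.
# Opens a perf recording in hotspot, or prints a text report
# with perf if hotspot is not available.
show_profile() {
    data="${1:-perf.data}"    # default file name written by perf record
    if [ ! -s "$data" ]; then
        echo "no recording found at $data" >&2
        return 1
    fi
    if command -v hotspot >/dev/null 2>&1; then
        hotspot "$data"
    else
        perf report -i "$data" --stdio
    fi
}

# Example use:
#   show_profile                  # uses ./perf.data
#   show_profile runs/perf.data   # the path here is only an illustration
```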
