perf is installed with
sudo apt install linux-perf
Hello Kostas,
That gives me
[sudo] password for eac:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package linux-perf
Sincerely,
Eric Comstock
ok, it seems that the package name is a bit different in Ubuntu than in Debian, try with
sudo apt install linux-tools-common
or
sudo apt install linux-tools-generic
in any case, install it with apt; do not use any other method to install software on your Ubuntu system.
Hello Kostas,
Thank you! I am still having a problem installing, though - during the installation step of the guide you sent, I got
eac@GREIVOUS2:/mnt/c/Users/Admin/Documents/Important Files/GT/Research/Summer 2025/Memos/06 02 2025 - getFEM high-D parellelization testing$ uname -r
5.15.146.1-microsoft-standard-WSL2
eac@GREIVOUS2:/mnt/c/Users/Admin/Documents/Important Files/GT/Research/Summer 2025/Memos/06 02 2025 - getFEM high-D parellelization testing$ sudo apt install linux-tools-5.15.146.1-generic
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package linux-tools-5.15.146.1-generic
E: Couldn't find any package by glob 'linux-tools-5.15.146.1-generic'
eac@GREIVOUS2:/mnt/c/Users/Admin/Documents/Important Files/GT/Research/Summer 2025/Memos/06 02 2025 - getFEM high-D parellelization testing$
Please advise.
Sincerely,
Eric Comstock
the link I provided is just FYI, you were not meant to run this command. Just install linux-tools-generic
Hello Kostas,
I tried that as well, and I am still getting an error when I try to run perf. I am getting the error message
You may need to install the following packages for this specific kernel:
linux-tools-5.15.146.1-microsoft-standard-WSL2
linux-cloud-tools-5.15.146.1-microsoft-standard-WSL2
You may also want to install one of the following packages to keep up to date:
linux-tools-standard-WSL2
linux-cloud-tools-standard-WSL2
However, when I try to install the mentioned packages, I get the error that they do not exist.
Sincerely,
Eric Comstock
what is the output of these commands?
whereis perf
ls /usr/lib/linux-tools/
Hello Kostas,
Thank you!
I tried running those commands, and I got the following output:
eac@GREIVOUS2:/mnt/c/Users/Admin$ whereis perf
perf: /usr/bin/perf /usr/share/man/man1/perf.1.gz
eac@GREIVOUS2:/mnt/c/Users/Admin$ ls /usr/lib/linux-tools/
5.15.0-141-generic
However, I still have an error when I try to use perf.
Sincerely,
Eric Comstock
instead of perf use
/usr/lib/linux-tools/5.15.0-141-generic/perf
in the command for running your program
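A minimal sketch of how the versioned perf binary could be located from a script, rather than typing the path by hand. It only assumes the directory layout shown by the ls output above (/usr/lib/linux-tools/<version>/perf); the function name find_perf_binaries is made up for this illustration.

```python
import glob
import os


def find_perf_binaries(tools_root="/usr/lib/linux-tools"):
    """Return executable perf binaries found in versioned subdirectories.

    Assumes the layout <tools_root>/<kernel-version>/perf, as in the
    'ls /usr/lib/linux-tools/' output quoted in this thread.
    """
    candidates = sorted(glob.glob(os.path.join(tools_root, "*", "perf")))
    # Keep only regular files that the current user may execute.
    return [p for p in candidates if os.path.isfile(p) and os.access(p, os.X_OK)]
```

Calling find_perf_binaries() with no arguments searches the default location; an empty list means no versioned perf binary was found there.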
Hello Kostas,
Thank you! That seems to have worked. I will parse the results and send you what I find.
Sincerely,
Eric Comstock
Hello Kostas,
First, I had to turn down the sampling frequency to 100 to prevent errors relating to the perf.data file being 5 GB. I got some results that did not make a whole lot of sense to me. The command I ran was
LD_LIBRARY_PATH=/home/eac/FEM_cpp_testing/test_compilation_folder/opt/lib/ PYTHONPATH=/home/eac/FEM_cpp_testing/test_compilation_folder/opt/lib/python3.10/site-packages/getfem /usr/lib/linux-tools/5.15.0-141-generic/perf record -F 100 --call-graph dwarf python3 test6D_linear_BCs_fast_default.py
and then
hotspot
which generated a number of errors:
QStandardPaths: wrong permissions on runtime directory /run/user/1000/, 0755 instead of 0700
QStandardPaths: wrong permissions on runtime directory /run/user/1000/, 0755 instead of 0700
feature not properly read PerfHeader::BPF_PROG_INFO 4 0
feature not properly read PerfHeader::SAMPLE_TIME 16 0
feature not properly read PerfHeader::CACHE 6828 0
feature not properly read PerfHeader::CPU_TOPOLOGY 1068 756
feature not properly read PerfHeader::BPF_BTF 4 0
Linux version "5.15.179" detected. Switching to automatic buffering.
unhandled event type 73 PERF_RECORD_THREAD_MAP
unhandled event type 74 PERF_RECORD_CPU_MAP
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/hwloc.sm. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/initial-pmix_shared-segment-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/smlockseg-3960668161. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds12_27269/initial-pmix_shared-segment-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds12_27269/dstore_sm.lock. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds12_27269/dstore_sm.lock. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds12_27269/initial-pmix_shared-segment-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/smlockseg-3960668161. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/initial-pmix_shared-segment-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/smseg-3960668161-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/smdataseg-3960668161-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/smseg-3960668161-0. This can break stack unwinding and lead to missing symbols.
PerfUnwind::MissingElfFile: Could not find ELF file for /tmp/ompi.GREIVOUS2.1000/pid.27269/pmix_dstor_ds21_27269/smdataseg-3960668161-0. This can break stack unwinding and lead to missing symbols.
Invalid memory read requested by dwfl fffffffffffffffc
Please let me know if any of these invalidate the results.
Here are the results:
Format:
Symbol
Binary
CPU clock time %
libstdc++.so.6.0.30std::_Rb_tree_increment(std::_Rb_tree_node_base const*)
libstdc++.so.6.0.30
29.3%
gmm::strongest_value_type<std::vector<double>, gmm::rsvector<double> >::value_type gmm::vect_sp<std::vector<double>, gmm::rsvector<double> >(std::vector<double> const&, gmm::rsvector<double> const&)
_getfem.cpython-310-x86_64-linux-gnu.so
24.8%
void gmm::add_spec<gmm::scaled_vector_const_ref<gmm::rsvector<double>, double>, std::vector<double> >(gmm::scaled_vector_const_ref<gmm::rsvector<double>, double> const&, std::vector<double>&, gmm::abstract_vector) [clone .isra.0]
_getfem.cpython-310-x86_64-linux-gnu.so
24.1%
void gmm::range_basis_eff_Lanczos<gmm::col_matrix<gmm::rsvector<double> > >(gmm::col_matrix<gmm::rsvector<double> > const&, std::set<unsigned long>&, double)
_getfem.cpython-310-x86_64-linux-gnu.so
7.27%
dgemm_
libblas.so.3.10.0
1.97%
I could only find strongest_value_type, add_spec, range_basis_eff_Lanczos, and dgemm_ in header files in the source code, and I could not find _Rb_tree_increment.
Do you know what these do?
Sincerely,
Eric Comstock
useful results
dgemm_ is the BLAS matrix-matrix product; most of your computation time should be spent in calls like this
the function called in this line is gmm::vect_sp, not strongest_value_type; strongest_value_type<...>::value_type is only its return type
gmm::add_spec is just addition of vectors; you need to see whether this function runs slowly (e.g. because it is applied to very large vectors), or whether it is just called a gazillion times (because of some poor programming logic)
range_basis_eff_Lanczos has to do with eliminating the null space of a multiplier when you have used the add_multiplier function to add the multiplier. You could save this computation time by defining your multiplier as a normal (filtered) variable and restricting the region and order of the variable yourself, so that the resulting constraint has no null space.
In general try also to check who is the caller of these time consuming functions, to get an idea why they are called so often (or with too much data) in the first place.
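One possible way to check callers outside a GUI is to aggregate "folded" stacks. The sketch below assumes the perf.data file has been converted to folded-stack lines of the form "root;...;leaf count", for example with the stackcollapse-perf.pl script from the FlameGraph project (that tool is an assumption here; it is not mentioned in this thread). The function name callers_of is made up for this illustration.

```python
from collections import Counter


def callers_of(folded_lines, target):
    """Sum sample counts per direct caller of frames whose name contains `target`.

    Each input line is expected to look like 'root;...;leaf count'.
    Matching is by substring, so a short target may match several symbols,
    and a recursive stack is counted once per matching frame.
    """
    totals = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # The count is the last space-separated field; C++ symbol names
        # may themselves contain spaces, so split from the right.
        stack, _, count = line.rpartition(" ")
        frames = stack.split(";")
        for i, frame in enumerate(frames):
            if target in frame:
                caller = frames[i - 1] if i > 0 else "<root>"
                totals[caller] += int(count)
    return totals
```

Calling callers_of(open("out.folded"), "add_spec").most_common(5) would then list the five callers responsible for the most samples, where out.folded is a hypothetical file name.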
Hello Kostas,
Thank you for the detailed breakdown!
Okay - it makes sense that dgemm is called, but it only takes up 2% of the runtime. It is called by a series of functions, which is called by standard_solve (which makes sense).
Thank you!
Got it - is there a way to find this out in perf, or should I modify the source code to measure it directly (e.g. printing something into a file whenever it is run)?
Okay - how do I do that? I remember I was using the filtered boundary conditions earlier, but we switched to add_linear_term.
It seems that range_basis_eff_Lanczos also is what calls add_spec, vect_sp, and _Rb_tree_increment. This function seems to take up most of the time. It is called by range_basis, which is called by model::actualize_sizes(), which is from model::nd_dof, from ga_workspace::ga_workspace, from add_linear_term_, from add_linear_term.
Sincerely,
Eric Comstock
your interpretation looks correct. it all comes from your use of “add_multiplier”, which triggers an expensive null-space computation (only once, the first time the dofs of the multiplier variable need to be defined).
You need to figure out how to replace your add_multiplier with add_filtered_fem_variable, and still get the same results (select correct order of multiplier fem, and correct region). This will save you all these heavy computations.
Hello Kostas,
Thank you!
I got it to partially work with add_filtered_fem_variable, but it caused me to get the “ICNTL(14) too low” error again. To change this, I only need to edit ICNTL(14) in gmm_MUMPS_interface.h, correct?
I am using
md.add_filtered_fem_variable("mult46", mf, 46)
logging.info('Multiplier 1/6')
md.add_filtered_fem_variable("mult47", mf, 47)
logging.info('Multiplier 2/6')
md.add_filtered_fem_variable("mult48", mf, 48)
logging.info('Multiplier 3/6')
md.add_filtered_fem_variable("mult49", mf, 49)
logging.info('Multiplier 4/6')
md.add_filtered_fem_variable("mult50", mf, 50)
logging.info('Multiplier 5/6')
md.add_filtered_fem_variable("mult51", mf, 51)
logging.info('Multiplier 6/6')
to get my multipliers, and the error is
thon3.10/site-packages/getfem python3 test6D_add_filtered_fem_variable.py
Level 1 Warning in getfem_regular_meshes.cc, line 33: CAUTION : Simplexification in dimension >= 5 has not been tested and the resulting mesh should be not conformal
message from gf_model_get follow:
List of model variables and data:
Variable f 1 copy fem dependant 4096 doubles.
Variable mult40 1 copy fem dependant 3068 doubles.
Variable mult46 1 copy fem dependant 3056 doubles.
Variable mult47 1 copy fem dependant 3056 doubles.
Variable mult48 1 copy fem dependant 3056 doubles.
Variable mult49 1 copy fem dependant 3056 doubles.
Variable mult50 1 copy fem dependant 1024 doubles.
Variable mult51 1 copy fem dependant 1024 doubles.
Data B1 1 copy fem dependant 4096 doubles.
Data B2 1 copy fem dependant 4096 doubles.
Data B3 1 copy fem dependant 4096 doubles.
Data DirichletData 1 copy fem dependant 4096 doubles.
Data E1 1 copy fem dependant 4096 doubles.
Data E2 1 copy fem dependant 4096 doubles.
Data E3 1 copy fem dependant 4096 doubles.
Data zeros 1 copy fem dependant 4096 doubles.
Trace 2 in getfem_models.cc, line 3308: Generic source term assembly
Trace 2 in getfem_models.cc, line 3319: (source term): generic source term assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3476: Generic linear assembly brick: generic matrix assembly
Trace 2 in getfem_models.cc, line 3308: Generic source term assembly
Trace 2 in getfem_models.cc, line 3319: Source term: generic source term assembly
Trace 2 in getfem_models.cc, line 3308: Generic source term assembly
Trace 2 in getfem_models.cc, line 3319: Source term: generic source term assembly
Trace 2 in getfem_models.cc, line 3308: Generic source term assembly
Trace 2 in getfem_models.cc, line 3319: Source term: generic source term assembly
Trace 2 in getfem_models.cc, line 3308: Generic source term assembly
Trace 2 in getfem_models.cc, line 3319: Source term: generic source term assembly
Trace 2 in getfem_models.cc, line 3308: Generic source term assembly
Trace 2 in getfem_models.cc, line 3319: Source term: generic source term assembly
Trace 2 in getfem_models.cc, line 2652: Global generic assembly RHS
Assembly time 2.48828
iter 0 residual 1
Trace 2 in getfem_models.cc, line 2654: Global generic assembly tangent term
Assembly time 1.49739
logic_error exception caught
Traceback (most recent call last):
File "/mnt/c/Users/Admin/Documents/Important Files/GT/Research/Summer 2025/Memos/06 16 2025 - getFEM term filtering/test6D_add_filtered_fem_variable.py", line 309, in <module>
force, stability, result_arrays = calc_F(0e-6, 0, -7.8, 0., 1e6+0.0, 2.9, make_grids(N, 10))
File "/mnt/c/Users/Admin/Documents/Important Files/GT/Research/Summer 2025/Memos/06 16 2025 - getFEM term filtering/test6D_add_filtered_fem_variable.py", line 259, in calc_F
md.solve("noisy", "lsolver", "mumps")
File "/home/eac/FEM_cpp_testing/test_compilation_folder/opt/lib/python3.10/site-packages/getfem/getfem.py", line 2989, in solve
return self.get("solve", *args)
File "/home/eac/FEM_cpp_testing/test_compilation_folder/opt/lib/python3.10/site-packages/getfem/getfem.py", line 2813, in get
return getfem('model_get',self.id, *args)
RuntimeError: (Getfem::InterfaceError) -- Error in ../../src/gmm/gmm_MUMPS_interface.h, line 205 bool gmm::mumps_error_check(int, int):
Solve with MUMPS failed: error -9, increase ICNTL(14)
Note - it did go much more quickly at first, though - my logging file now reads
2025-06-16 08:59:24,077 [DEBUG] Basis functions per element: ('1 - x - y - z - w - v - u', 'x', 'y', 'z', 'w', 'v', 'u')
2025-06-16 08:59:48,650 [INFO] Multiplier 1/6
2025-06-16 08:59:48,651 [INFO] Multiplier 2/6
2025-06-16 08:59:48,651 [INFO] Multiplier 3/6
2025-06-16 08:59:48,652 [INFO] Multiplier 4/6
2025-06-16 08:59:48,652 [INFO] Multiplier 5/6
2025-06-16 08:59:48,653 [INFO] Multiplier 6/6
2025-06-16 08:59:48,709 [INFO] Linear 1/6
2025-06-16 08:59:48,715 [INFO] Linear 2/6
2025-06-16 08:59:48,722 [INFO] Linear 3/6
2025-06-16 08:59:48,727 [INFO] Linear 4/6
2025-06-16 08:59:48,730 [INFO] Linear 5/6
2025-06-16 08:59:48,732 [INFO] Linear 6/6
…but it errors afterwards.
Sincerely,
Eric Comstock
no, you should not change ICNTL(14), you need to reflect on why the add_filtered_fem_variable gives you a different behavior than the add_multiplier.
You pass two things to add_filtered_fem_variable: the fem of the multiplier and the region.
if you define these two correctly, you will not get an overly dense tangent matrix (which has the ICNTL(14) error as a side effect)
Hello Kostas,
Is there any way to examine each of those objects, to determine if they are being defined correctly?
Sincerely,
Eric Comstock
to begin with, for a given mesh, you need to know beforehand (calculate by hand) how many degrees of freedom your multiplier variable should have. Then you can print
model.mesh_fem_of_variable("mult_varname").nb_dof()
and compare.
You can also examine the value of
model.mesh_fem_of_variable("mult_varname").nb_basic_dof()
which counts all dofs, without restricting the variable to the region.
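As a rough sketch of the hand calculation, the expected counts can be written down under some assumptions that this thread does not confirm: a structured grid on a hypercube with the same number of nodes in every direction, first-order Lagrange elements (one dof per mesh node), and a region covering exactly one face of the hypercube. The function names are made up for this illustration.

```python
def total_nodes(nodes_per_direction, dim):
    """Number of nodes of a structured grid on a dim-dimensional hypercube."""
    return nodes_per_direction ** dim


def face_nodes(nodes_per_direction, dim):
    """Number of nodes lying on one (dim - 1)-dimensional face of that grid."""
    return nodes_per_direction ** (dim - 1)


# Assumed example: 4 nodes per direction in 6 dimensions.
print(total_nodes(4, 6))  # 4096
print(face_nodes(4, 6))   # 1024
```

Under these assumptions, 4096 agrees with the size of the variable f in the model listing earlier in the thread, and a multiplier restricted to a single face would be expected to have 1024 dofs. Whether regions 46 to 51 really are single faces is something only the mesh definition can tell.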
Hello Kostas,
I got the error “AttributeError: 'MeshFem' object has no attribute 'nb_dof'. Did you mean: 'nbdof'?”
I switched to md.mesh_fem_of_variable("mult46").nbdof(), and got some very odd results - it was exactly the same as md.mesh_fem_of_variable("mult46").nb_basic_dof(), which were both the total number of dofs I expect in my problem.
Please advise.
Sincerely,
Eric Comstock
yes, I meant nbdof(). It seems we are coming closer to the root cause of the issue.
Your region for the filtered variable is either not defined properly, or ignored.
print the content of mesh.region(region_number)
What does it look like?
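The dof comparison discussed above could be repeated for all multipliers at once with a small helper like the following. It is only a sketch: it uses the mesh_fem_of_variable, nbdof and nb_basic_dof calls named in this thread, and the helper name check_filtered_multipliers is made up for this illustration.

```python
def check_filtered_multipliers(md, names):
    """Return (name, nbdof, nb_basic_dof, looks_unfiltered) for each variable.

    `md` is expected to be a GetFEM Model (or any object offering
    mesh_fem_of_variable). looks_unfiltered is True when the filtered
    dof count equals the basic dof count, i.e. the region restriction
    appears to have had no effect.
    """
    report = []
    for name in names:
        mf = md.mesh_fem_of_variable(name)
        nbdof = mf.nbdof()
        nb_basic = mf.nb_basic_dof()
        report.append((name, nbdof, nb_basic, nbdof == nb_basic))
    return report
```

For the model in this thread it could be called as check_filtered_multipliers(md, ["mult46", "mult47", "mult48", "mult49", "mult50", "mult51"]); any entry whose last field is True points at a region that is empty, wrongly defined, or ignored.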