elpa issues
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues
2023-12-15T21:47:03Z

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/102
Undefined reference to elpa_skew functions
2023-12-15T21:47:03Z
Petr Karpov

There is a problem with undefined references to the elpa_skew functions when skew-symmetric support is disabled (--disable-skew-symmetric-support).
Here is the reproducer for raven:
module load anaconda/3/2021.11 intel/21.6.0 impi/2021.6 mkl/2022.1 gcc/11 cuda/11.4
../configure CC=mpicc FC=mpiifort CXX=mpiicpc CFLAGS="-O3 -march=skylake-avx512" FCFLAGS="-O3 -xCORE-AVX512" SCALAPACK_FCFLAGS="-I/mpcdf/soft/SLE_15/packages/x86_64/intel_oneapi/2021.3/mkl/latest/include/intel64/lp64" SCALAPACK_LDFLAGS="-L/mpcdf/soft/SLE_15/packages/x86_64/intel_oneapi/2021.3/mkl/latest/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -Wl,-rpath,/mpcdf/soft/SLE_15/packages/x86_64/intel_oneapi/2021.3/mkl/latest/lib/intel64" --disable-openmp --disable-64bit-integer-math-support --disable-64bit-integer-mpi-support --enable-mpi-module --enable-detect-mpi-launcher --enable-generic --disable-sparc64 --disable-neon-arch64 --disable-vsx --enable-sse --enable-sse-assembly --enable-avx --enable-avx2 --enable-avx512 --disable-sve128 --disable-sve256 --disable-sve512 --disable-bgp --disable-bgp --enable-assumed-size --disable-ifx-compiler --enable-Fortran2008-features --enable-option-checking=fatal --disable-heterogenous-cluster-support --enable-timings --enable-band-to-full-blocking --without-threading-support-check-during-build --disable-runtime-threading-support-checks --disable-allow-thread-limiting --disable-gpu --enable-nvidia-gpu --disable-amd-gpu --disable-intel-gpu-sycl --disable-nvidia-sm80-gpu --disable-NVIDIA-gpu-memory-debug --disable-cuda-aware-mpi --disable-gpu-streams --disable-nvtx --disable-c-tests --disable-cpp-tests --disable-skew-symmetric-support --with-mpi=yes --disable-redirect --enable-single-precision --disable-autotuning --disable-scalapack-tests --disable-autotune-redistribute-matrix --with-papi=no --with-likwid=no --disable-store-build-config --disable-python --disable-python-tests --with-cuda-path="/mpcdf/soft/SLE_15/packages/x86_64/cuda/11.4.2" --with-NVIDIA-GPU-compute-capability=sm_80 --with-cusolver
make -j 18
Here is the error message we get:
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvectors_a_h_a_f'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvalues_d_ptr_f'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvalues_a_h_a_d'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvectors_d_ptr_f'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvectors_d_ptr_d'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvectors_a_h_a_d'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvalues_a_h_a_f'
ld: ./.libs/libelpa.so: undefined reference to `elpa_skew_eigenvalues_d_ptr_d'

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/99
ELPA 2022 release crashes for a test case in FHI-aims
2023-02-08T11:29:48Z
Sebastian Kokott

I compiled FHI-aims with external ELPA on the MPCDF raven cluster using the current release version:
```
/mpcdf/soft/SLE_15/packages/skylake/elpa/intel_21.6.0-2021.6.0-impi_2021.6-2021.6.0/2022.05.001-standard/lib/libelpa.so
```
First, I did some tests for smaller systems, and everything worked fine and was reproducible compared to the default version used in FHI-aims (2020).
Then I tested a large-scale system on 64 nodes. Here, the ELPA calls in the first and second cycles worked, but the call in the third cycle crashed.
ELPA stopped after:
```
Updating Kohn-Sham eigenvalues and eigenvectors using ELSI and the ELPA eigensolver.
Starting ELPA eigensolver
Finished transformation to standard eigenproblem
| Time : 33.756 s
```
I'm attaching the runs with elpa2020 (success) and elpa2022 (fail).
Any idea what the source of the crash might be? Many thanks in advance!
[64_elpa_2022.tgz](/uploads/3ad66120371fbfff1e887b2c193de3d8/64_elpa_2022.tgz)
[64_elpa_2020.tgz](/uploads/73350b9a39d8da794888165a720d071f/64_elpa_2020.tgz)

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/93
Internal Compiler Error with Intel compiler on branch cusolver_device_ptr
2022-05-11T07:37:24Z
Andreas Marek

In a new branch, after adding a few new API routines, the Intel compiler produces an internal compiler error (ICE).
branch:
git checkout cusolver_device_ptr
Software used
Currently Loaded Modulefiles:
1) autoconf/2.69 2) automake/1.15 3) libtool/2.4.6 4) intel/21.3.0 5) impi/2021.3 6) mkl/2021.3
Configure line:
../configure CC=mpiicc CFLAGS="-O3 -march=skylake-avx512 -g" FC=mpiifort FCFLAGS="-O3 -g" SCALAPACK_FCFLAGS="-I/mpcdf/soft/SLE_15/packages/x86_64/intel_oneapi/2021.3/mkl/latest/include/intel64/lp64" SCALAPACK_LDFLAGS="-L/mpcdf/soft/SLE_15/packages/x86_64/intel_oneapi/2021.3/mkl/latest/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -Wl,-rpath,/mpcdf/soft/SLE_15/packages/x86_64/intel_oneapi/2021.3/mkl/latest/lib/intel64" --enable-avx512

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/86
Could ELPA let user specify how many lowest eigenvalues to compute?
2021-08-25T11:50:15Z
I-Te Lu

Dear developers,
I have a quick question about ELPA: is there an ELPA subroutine that returns only the few lowest eigenvalues and eigenvectors of a Hermitian matrix, without solving for the full spectrum? I looked through the source code but could not find one (I may have missed something...).
Thanks,
I-Te

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/82
Check MPI calls within OpenMP parallelized regions
2021-09-03T13:04:11Z
Andreas Marek

Currently we require the MPI library to provide the threading level "MPI_THREAD_SERIALIZED" or "MPI_THREAD_MULTIPLE". This is done for safety and might not be necessary in all cases where ELPA is called.
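The requirement described above can be sketched as a simple comparison, because the MPI standard orders the four thread-support levels from MPI_THREAD_SINGLE up to MPI_THREAD_MULTIPLE. The following C sketch is illustrative only and is not taken from the ELPA sources: `threading_level_accepted`, `current_level_accepted` and the build macro `HAVE_MPI` are hypothetical names, and the stand-in constants exist only so that the snippet also compiles without an MPI installation.

```c
#include <assert.h>

#ifdef HAVE_MPI   /* hypothetical build macro: define it when compiling against MPI */
#include <mpi.h>
#else
/* Stand-in constants; the MPI standard guarantees this ascending order. */
enum {
    MPI_THREAD_SINGLE,
    MPI_THREAD_FUNNELED,
    MPI_THREAD_SERIALIZED,
    MPI_THREAD_MULTIPLE
};
#endif

/* Returns 1 if the provided level is SERIALIZED or MULTIPLE, else 0. */
static int threading_level_accepted(int provided)
{
    /* Sanity check: the value must be one of the four defined levels. */
    assert(provided >= MPI_THREAD_SINGLE && provided <= MPI_THREAD_MULTIPLE);
    return provided >= MPI_THREAD_SERIALIZED;
}

#ifdef HAVE_MPI
/* Asks the running MPI library which level it actually provides. */
static int current_level_accepted(void)
{
    int provided = MPI_THREAD_SINGLE;
    MPI_Query_thread(&provided);
    return threading_level_accepted(provided);
}
#endif
```

If the analysis listed below shows that only the master thread ever communicates, the threshold in `threading_level_accepted` could in principle be lowered to MPI_THREAD_FUNNELED; whether that is safe for ELPA is exactly what this issue asks to be checked.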
Todo:
- make a list of all MPI calls (also from subroutines) which are called from within OpenMP parallel regions
- check for each call whether it can be guaranteed which thread (master or any) will initiate the communication and which thread (master, the one that initiated the call, or any) can end the communication
- adapt the required threading level accordingly
Assignee: Soheil Soltani

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/80
Print line can exceed 132 characters on Summit
2021-04-15T06:56:48Z
Andreas Marek

On Summit, when compiling the test programs it can happen (why?) that the line printing the program name exceeds 132 characters:
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/blob/master/test/Fortran/test.F90#L248

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/74
Less verbosity if environment variables ELPA_DEFAULT_xxx is set
2021-03-24T06:24:33Z
Andreas Marek

Do not print on each task.

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/73
Service Desk (from dev@stellardeath.org): A gitlab test issue using the service-desk
2021-02-24T09:37:42Z
GitLab Support Bot

Foobar

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/70
suspected problem for matrix of size 200k
2021-02-24T09:45:56Z
Pavel Kus

Reported by Phillip Coles
-> check again the setup
Assignee: Pavel Kus

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/69
elpa tries to use GPU kernel when it should not
2019-10-31T09:37:21Z
Pavel Kus

Happening during autotuning.
Proposed solution: disable the GPU kernel altogether (it does not work or perform well anyway).
Assignee: Pavel Kus

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/68
probable memory leak on GPU
2019-10-31T09:38:43Z
Pavel Kus

Can be observed during autotuning, when different routines can run on GPU or CPU.
Assignee: Pavel Kus

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/67
cannot build ELPA on talos with cuda 10.1
2019-11-20T11:05:57Z
Pavel Kus

Works with cuda 10.0.
Assignee: Pavel Kus

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/66
ELPA changes the number of the OpenMP threads, even for calling program
2019-04-18T06:46:34Z
Andreas Marek

When ELPA changes the number of OpenMP threads, e.g. during autotuning, the change is made globally, so it also affects the calling program.
To cure this, the original number of threads should be stored and restored when ELPA finishes.
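The store-and-restore idea proposed here can be sketched in C with the standard OpenMP runtime calls `omp_get_max_threads()` and `omp_set_num_threads()`. This is a minimal illustration under stated assumptions, not ELPA's actual implementation: `solve_with_threads` is a hypothetical stand-in for a library routine, and the fallback stubs exist only so the snippet also builds when OpenMP is not enabled.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs so the sketch also builds without OpenMP support. */
static int stub_num_threads = 1;
static int omp_get_max_threads(void) { return stub_num_threads; }
static void omp_set_num_threads(int n) { stub_num_threads = n; }
#endif

/* Hypothetical stand-in for a library routine that retunes the thread count.
 * Returns the thread count that was in effect while the routine worked. */
static int solve_with_threads(int tuned_threads)
{
    assert(tuned_threads > 0);

    int saved_threads = omp_get_max_threads();  /* remember the caller's setting */
    omp_set_num_threads(tuned_threads);         /* library-internal choice */

    int used = omp_get_max_threads();           /* the real work would run here */

    omp_set_num_threads(saved_threads);         /* restore before returning */
    return used;
}
```

A caller that reads `omp_get_max_threads()` before and after `solve_with_threads(2)` should see the same value both times, which is the behaviour this issue asks for. A routine with several exit paths would need the restore on each of them.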
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/65
API change in elpa_deallocate()
2019-02-18T12:44:14Z
Ask Hjorth Larsen

Hi! Please excuse me if this is not the right place to post this, or if I have missed info in the docs.
`elpa_deallocate()` recently got another argument, namely the error code:
https://gitlab.mpcdf.mpg.de/elpa/elpa/commit/69b68de30e21d2d959baa426b968e39603ebd758
This will require existing interfaces to be updated as reported here:
https://gitlab.com/gpaw/gpaw/issues/197
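To make the two call forms concrete, here is a minimal C sketch of a caller that supports both signatures through a compile-time switch. Everything in it is a hypothetical stand-in: `fake_elpa_t` and `fake_elpa_deallocate` only mimic the one-argument and two-argument forms, and the macro `MY_ELPA_DEALLOCATE_TAKES_ERROR` would have to be defined by the calling project's own build system (for example from a configure-time check). Treating 0 as the success code is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct fake_elpa *fake_elpa_t;  /* hypothetical opaque handle */

static int freed_count = 0;             /* counts stub calls, for illustration */

#ifdef MY_ELPA_DEALLOCATE_TAKES_ERROR
/* Newer form: the error code is returned through a second argument. */
static void fake_elpa_deallocate(fake_elpa_t handle, int *error)
{
    (void)handle;
    freed_count++;
    *error = 0;
}
#define COMPAT_DEALLOCATE(handle, errptr) \
    fake_elpa_deallocate((handle), (errptr))
#else
/* Older form: no error argument, so the wrapper reports success itself. */
static void fake_elpa_deallocate(fake_elpa_t handle)
{
    (void)handle;
    freed_count++;
}
#define COMPAT_DEALLOCATE(handle, errptr) \
    (fake_elpa_deallocate(handle), *(errptr) = 0)
#endif

/* Caller code is written once against the two-argument form. */
static int release_handle(fake_elpa_t handle)
{
    int error = -1;
    COMPAT_DEALLOCATE(handle, &error);
    return error;
}
```

With this layout only the macro definition differs between library versions, while `release_handle` and the rest of the calling code stay unchanged.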
Is there a recommended way to write interfaces that are compatible with both this *and* the older version? For example by accessing the version number in the preprocessor?

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/63
Job Failed #386213: on power8 with na=1500
2019-04-17T19:36:24Z
Andreas Marek

On Power8 with na=1500
the tests:
test_real_single_hermitian_multiply_1stage_random_default.sh
test_real_single_hermitian_multiply_1stage_gpu_random_default.sh
fail due to slightly too large error residuals.

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/62
Gitlab CI causes several problems on new appdev cluster
2018-09-05T06:08:15Z
Andreas Marek

- Frank matrix does not work with coverage
- GPU runs hang sometimes
- Pinning does not work
- Sometimes "stale file handle"
- Knl 1-4, maik create problems

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/60
No GPU complex GPU version in elpa2_bandred
2017-08-31T10:24:59Z
Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/59
add scalapack test to gitlab CI
2018-01-07T10:01:37Z
Pavel Kus

The scalapack tests are built only when the --enable-scalapack-tests option is passed to configure. We should test this in gitlab CI as well, but MKL 11.3 is strangely failing on buildtest. The problem seems to disappear when switching to MKL 2017 (even though it works on Hydra for both 11.3 and 2017). So we should restore this test when we switch to MKL 2017 on buildtest.
Assignee: Pavel Kus

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/57
Missing: test_project for a C-code
2018-01-07T10:24:24Z
Andreas Marek

Analogous to the already existing test projects, there should be one for a C program.
Assignee: Pavel Kus

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/56
Problem with ancient Ubuntu/Debian "mawk"
2017-08-23T06:30:30Z
Lorenz Huedepohl

Pavel discovered that on Ubuntu the system default awk is a 1996 "mawk" that does not properly understand `[[^,]]` constructions in a regex, which leads to errors in the generated `elpa/elpa_constants.h` file.
Assignee: Lorenz Huedepohl