elpa issues
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues
Updated: 2021-04-15T06:16:37Z

Print the number of GPUs used in the test programs
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/79
Updated: 2021-04-15T06:16:37Z
Author: Andreas Marek

If ELPA is built with GPU support, print the number of GPUs used at startup of the test programs.

QR decomposition does not work with "analytic" matrix
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/61
Updated: 2017-09-13T09:30:39Z
Author: Andreas Marek

Since it does not work, this combination is not enabled at the moment.
Assignee: Pavel Kus

Refactor QR part
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/6
Updated: 2017-07-16T17:31:51Z
Author: Andreas Marek

The QR part should be refactored and cleaned up.

Remove unnecessary data copies if MPI is not used
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/27
Updated: 2022-12-12T07:47:11Z
Author: Andreas Marek

Setting of GPU kernel depends on order of set calls
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/94
Updated: 2022-02-03T17:05:00Z
Author: Andreas Marek

When setting
first set("solver",2stage) and then
set("kernel",GPU_KERNEL)
it uses the CPU kernel (the default kernel seems to be set)
In the other order it works correctly.

stripe_width in trans_ev_tridi_to_band
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/50
Updated: 2017-09-06T19:06:07Z
Author: Andreas Marek

SYCL kernels for multiply missing
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/105
Updated: 2023-10-25T06:03:39Z
Author: Andreas Marek
Assignee: Petr Karpov

Toeplitz test cases hang for really small matrices (na=4)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/81
Updated: 2021-05-06T12:37:33Z
Author: Andreas Marek

If you use 4 MPI tasks for a setup of na=4 nev=4 nblk=1, the test-cases for Toeplitz matrices hang.
The test-cases for other matrix setups do work, however.
It seems that the code hangs in the "solve" step.
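To make the reported configuration concrete: ELPA uses a ScaLAPACK-style block-cyclic matrix distribution, so the local matrix size per MPI task follows from na, nblk and the process grid. The following is a minimal sketch, assuming the 4 MPI tasks are arranged as a 2x2 process grid (the issue does not state the grid shape). The `numroc` function is a Python transcription of ScaLAPACK's NUMROC routine; the other names are illustrative and not part of ELPA.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Count the rows (or columns) owned by process `iproc` when a dimension
    of size `n` is distributed block-cyclically with block size `nb` over
    `nprocs` processes, starting at process `isrcproc`."""
    # Distance of this process from the process holding the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    # Number of complete blocks in the dimension.
    nblocks = n // nb
    # Every process gets at least this many full blocks.
    count = (nblocks // nprocs) * nb
    # Leftover full blocks go to the first `extra` processes.
    extra = nblocks % nprocs
    if mydist < extra:
        count += nb
    elif mydist == extra:
        # This process receives the trailing partial block, if any.
        count += n % nb
    return count


na, nblk = 4, 1          # values from the issue
np_rows, np_cols = 2, 2  # ASSUMPTION: 4 MPI tasks arranged as a 2x2 grid

# Local (rows, columns) held by each task, keyed by its grid coordinates.
local_shapes = {
    (pr, pc): (numroc(na, nblk, pr, 0, np_rows),
               numroc(na, nblk, pc, 0, np_cols))
    for pr in range(np_rows)
    for pc in range(np_cols)
}

for coords, shape in sorted(local_shapes.items()):
    print(coords, shape)
```

Under this assumption every task holds a 2x2 local block made up of single-element blocks. This does not explain the hang; it only shows how small the per-task problem is in the failing setup.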

UCX warnings for GPU complex_double tests with OpenMPI
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/108
Updated: 2023-12-15T13:01:02Z
Author: Petr Karpov

Reproducer:
```
module purge
module load cuda/11.4 gcc/11 openmpi/4 mkl/2022.1 nccl/2.11.4
export OMPI_MCA_coll=^hcoll
../configure --prefix=$HOME/soft/elpa_mpi_00 --enable-option-checking=fatal CC=mpicc FC=mpif90 CXX=mpicxx CFLAGS="-O3 -g -march=skylake-avx512 -I$MKL_HOME/include/intel64/lp64 -I$CUDA_HOME/include" CXXFLAGS="-std=c++17 -O3 -march=skylake-avx512 -I$MKL_HOME/include/intel64/lp64 -I$CUDA_HOME/include" FCFLAGS="-O3 -g -march=skylake-avx512 -I$MKL_HOME/include/intel64/lp64 -I$CUDA_HOME/include" LDFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -lpthread -Wl,-rpath,$MKL_HOME/lib/intel64" --with-mpi=yes --enable-assumed-size --enable-band-to-full-blocking --enable-nvidia-gpu --with-NVIDIA-GPU-compute-capability=sm_70 -with-cuda-path=$CUDA_HOME --enable-avx512 --enable-cpp-tests=no --enable-single-precision --enable-nvtx
```
Warnings like
```
[1702644215.666499] [ravg1002:132812:0] mpool.c:55 UCX WARN object 0xcf82c0 {{cpml|cb|snd_tag|rk_use} send length 41943040 ucp_proto_progress_tag_rndv_rts() comp:mca_pml_ucx_send_nbx_completion()host me was not returned to mpool ucp_requests
```
appear for complex_double tests, e.g. `validate_complex_double_eigenvectors_1stage_gpu_random`, but not for real_double.
Assignee: Petr Karpov

Unify real/complex QR paths
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/45
Updated: 2017-07-16T17:31:50Z
Author: Andreas Marek