elpa issues
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues

Issue #14: Header of test programs wrong (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/14
In some cases, e.g. elpa2_test_complex, the header gives wrong information.

Issue #4: Parameters in GPU version (still development) (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/4
The values for cudaMemcpyHostToDevice etc. should not be hard-coded, but parsed from the CUDA header files.

Issue #13: Complex generic kernel produces wrong results (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/13
In some cases the error residual is wrong.

Issue #3: Wrong results in complex calculation (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/3
It seems that under certain conditions the ELPA2 complex case produces wrong results. This depends on the number of MPI tasks used, and only appears if the matrix size is larger by one than the block size used, e.g. ./elpa2_test_complex 17 17 16.

Issue #2: Datatype mixing (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/2
In at least one subroutine, double-precision real data is used as double-precision complex data. This has been done on purpose for performance reasons (packing). However, on the one hand the compilers throw a warning message, and on the other hand it forbids Fortran interface checking. This should be changed with modern Fortran functionality.

Issue #11: Crash of ELPA AVX kernels (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/11
As received per email:
I am experiencing stability issues with the ELPA AVX kernels. I have
tested the latest stable tar-ball release (2015-11) as well as the
latest git master version.
I have used the following hardware/software combinations, both of which
support AVX2 instructions:
Intel i5-5200U (Haswell)/gcc&gfortran 5.3/openmpi 1.10.2/netlib
blas/lapack 3.6.0-4 / netlib scalapack 2.0.2.-4
Cray XC40 Intel Xeon E5-2690v3 (Haswell)/gcc&gfortran 5.2 (with cray
wrappers)/cray-mpich 7.3.1/cray-libsci 13.3.0
I have configured ELPA with ./configure --prefix=/home/nico/lib
--with-avx-optimization FCFLAGS="-O3 -march=haswell -mavx2 -mfma"
CFLAGS="-O3 -march=haswell -mavx2 -mfma" CXXFLAGS="-O3 -march=haswell
-mavx2 -mfma". A config.log is attached for the first machine.
I have tested the different available ELPA kernels using the included
test file test_real2_choose_kernel_with_api.F90 by changing the kernel
directly in the call to solve_evp_real_2stage (REAL_ELPA_KERNEL_*) and
recompiling. Running the script with e.g. "mpiexec -n 1
./elpa2_test_real_choose_kernel_with_api 64 32 16" (the test matrix is
the same with each call to the program) everything runs without issue
with the generic, generic_simple and SSE kernels. However, with all the
AVX kernels the program crashes roughly 75% of the time during the
backtransformation tridi->band.
I have attached an example output of a crash with the AVX_BLOCK_2
kernel. As far as I have been able to debug (using strategically placed
prints), the crash occurs on line 191 of
elpa2_kernels_real_avx-avx2_2hv.c (__m256d x1 =
_mm256_load_pd(&q[ldq]);)

Issue #10: ELPA2 real case does not work anymore with OpenMP (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/10
The ELPA2 real case produces wrong results if more than one OpenMP thread is used.

Issue #9: ELPA1 GPU version (Andreas Marek, updated 2018-02-05; assignee: Pavel Kus)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/9
The GPU version of ELPA1 (in the branch ELPA_GPU_development) should be updated.

Issue #8: Single precision GPU version (Andreas Marek, updated 2018-02-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/8
The GPU version has not yet been ported to single-precision calculations.

Issue #62: Gitlab CI causes several problems on new appdev cluster (Andreas Marek, updated 2018-09-05)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/62
- Frank matrix does not work with coverage
- GPU runs hang sometimes
- Pinning does not work
- Sometimes "stale file handle"
- Knl 1-4, maik create problems

Issue #65: API change in elpa_deallocate() (Ask Hjorth Larsen, updated 2019-02-18)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/65
Hi! Please excuse me if this is not the right place to post this, or if I have missed info in the docs.
`elpa_deallocate()` recently got another argument, namely the error code:
https://gitlab.mpcdf.mpg.de/elpa/elpa/commit/69b68de30e21d2d959baa426b968e39603ebd758
This will require existing interfaces to be updated as reported here:
https://gitlab.com/gpaw/gpaw/issues/197
Is there a recommended way to write interfaces that are compatible with both this *and* the older version, for example by accessing the version number in the preprocessor?

Issue #63: Job Failed #386213: on power8 with na=1500 (Andreas Marek, updated 2019-04-17)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/63
On Power8 with na=1500
the tests
  test_real_single_hermitian_multiply_1stage_random_default.sh
  test_real_single_hermitian_multiply_1stage_gpu_random_default.sh
fail, due to slightly too large error residuals.

Issue #66: ELPA changes the number of OpenMP threads, even for the calling program (Andreas Marek, updated 2019-04-18)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/66
When ELPA, e.g. in the autotuning, changes the number of OpenMP threads, this is done globally.
But this also affects the calling program. To cure this, the original number of threads should be stored, and restored at the end of ELPA.
(Assignee: Andreas Marek)

Issue #71: investigate CPU memory allocation (Pavel Kus, updated 2019-10-22; assignee: Pavel Kus)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/71
Investigate why it is not possible to run with a much larger matrix on machines with very large memory (e.g. an Optane-memory-equipped node).

Issue #72: lift the restriction bandwidth == nblk in the GPU version of ELPA 2 (Pavel Kus, updated 2019-10-29)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/72
For the GPU version of ELPA 2, the intermediate bandwidth is always taken as the ScaLAPACK block size. It would be better if the optimal value (as for the CPU version) could be selected, since it is very important for performance. It is, however, hard-coded somewhere in the band reduction step.

Issue #69: elpa tries to use GPU kernel when it should not (Pavel Kus, updated 2019-10-31; assignee: Pavel Kus)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/69
Happening during autotuning. Proposed solution: disable the GPU kernel altogether (it does not work/perform well anyway).

Issue #68: probable memory leak on GPU (Pavel Kus, updated 2019-10-31; assignee: Pavel Kus)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/68
Can be observed during autotuning, when different routines can run on GPU or CPU.

Issue #67: cannot build ELPA on talos with cuda 10.1 (Pavel Kus, updated 2019-11-20; assignee: Pavel Kus)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/67
Works with cuda 10.0.

Issue #5: GPU version (still development) does not work with OpenMP host code (Andreas Marek, updated 2021-02-24; assignee: Pavel Kus)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/5
This path should be implemented; at the moment the code aborts.

Issue #73: Service Desk (from dev@stellardeath.org): A gitlab test issue using the service-desk (GitLab Support Bot, updated 2021-02-24)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/73
Foobar