elpa issues
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues
Updated: 2023-11-06T20:38:45Z

Issue 1: Assumed-size arrays
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/1
Updated: 2023-11-06T20:38:45Z
Author: Andreas Marek

Some subroutines/functions of the Fortran code still use (deprecated) assumed-size arrays. This was introduced for simplicity and performance (to avoid unnecessary copying of arrays), but it makes debugging hard or even impossible. This should be changed.

Issue 2: Datatype mixing
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/2
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

In at least one subroutine, double precision real data is used as double precision complex data. This was done on purpose for performance reasons (packing). However, compilers emit a warning, and it prevents Fortran interface checking. This should be changed using modern Fortran functionality.

Issue 3: Wrong results in complex calculation
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/3
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

It seems that under certain conditions the ELPA2 complex case produces wrong results. This depends on the number of MPI tasks used, and only appears if the matrix size is larger by one than the block size, e.g.
./elpa2_test_complex 17 17 16

Issue 4: Parameters in GPU version (still development)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/4
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

The values for cudaMemcpyHostToDevice etc. should not be hard-coded, but parsed from the CUDA header files.

Issue 5: GPU version (still development) does not work with OpenMP host code
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/5
Updated: 2021-02-24T09:35:22Z
Author: Andreas Marek
Assignee: Pavel Kus

This path should be implemented; at the moment the code aborts.

Issue 6: Refactor QR part
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/6
Updated: 2017-07-16T17:31:51Z
Author: Andreas Marek

The QR part should be refactored and cleaned up.

Issue 7: Single precision ELPA2 kernels
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/7
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

The assembler kernel and the kernels using gcc intrinsic assembler directives have not yet been ported to single precision.

Issue 8: Single precision GPU version
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/8
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

The GPU version has not yet been ported to single precision calculations.

Issue 9: ELPA1 GPU version
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/9
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek
Assignee: Pavel Kus

The GPU version of ELPA1 (in branch ELPA_GPU_development version) should be updated.

Issue 10: ELPA2 real case does not work anymore with OpenMP
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/10
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

The ELPA2 real case produces wrong results if more than one OpenMP thread is used.

Issue 11: Crash of ELPA AVX kernels
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/11
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

As received by email:
I am experiencing stability issues with the ELPA AVX kernels. I have
tested the latest stable tar-ball release (2015-11) as well as the
latest git master version.
I have used the following hardware/software combinations, both of which
support AVX2 instructions:
Intel i5-5200U (Haswell)/gcc&gfortran 5.3/openmpi 1.10.2/netlib
blas/lapack 3.6.0-4 / netlib scalapack 2.0.2.-4
Cray XC40 Intel Xeon E5-2690v3 (Haswell)/gcc&gfortran 5.2 (with cray
wrappers)/cray-mpich 7.3.1/cray-libsci 13.3.0
I have configured ELPA with ./configure --prefix=/home/nico/lib
--with-avx-optimization FCFLAGS="-O3 -march=haswell -mavx2 -mfma"
CFLAGS="-O3 -march=haswell -mavx2 -mfma" CXXFLAGS="-O3 -march=haswell
-mavx2 -mfma". A config.log is attached for the first machine.
I have tested the different available ELPA kernels using the included
test file test_real2_choose_kernel_with_api.F90 by changing the kernel
directly in the call to solve_evp_real_2stage (REAL_ELPA_KERNEL_*) and
recompiling. Running the script with e.g. "mpiexec -n 1
./elpa2_test_real_choose_kernel_with_api 64 32 16" (the test matrix is
the same with each call to the program) everything runs without issue
with the generic, generic_simple and SSE kernels. However, with all the
AVX kernels the program crashes roughly 75% of the time during the
backtransformation tridi->band.
I have attached an example output of a crash with the AVX_BLOCK_2
kernel. As far as I have been able to debug (using strategically placed
prints), the crash occurs on line 191 of
elpa2_kernels_real_avx-avx2_2hv.c (__m256d x1 =
_mm256_load_pd(&q[ldq]);)

Issue 13: Complex generic kernel produces wrong results
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/13
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

In some cases the error residual is wrong.

Issue 14: Header of test programs wrong
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/14
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

In some cases, e.g. elpa2_test_complex, the header gives wrong information.

Issue 15: Not all test programs are built if OpenMP is enabled
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/15
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

The C test programs do not get built, for unclear reasons. They are compiled but not linked.

Issue 16: Setting of specific kernels at build time does not work anymore in ELPA_GPU branch
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/16
Updated: 2018-02-05T19:48:54Z
Author: Andreas Marek

At least setting the AVX_BLOCK6 kernel with --with-real-avx_block6-kernel-only does not work: it produces wrong results. Not specifying this option but calling the kernel works!

Issue 17: Single precision SSE/AVX/AVX BLOCK1 kernel does not work
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/17
Updated: 2017-05-21T22:14:19Z
Author: Andreas Marek

Due to this, the BLOCK2 kernels do not work either.

Issue 18: Single precision AVX Block6 crashes sometimes
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/18
Updated: 2021-04-15T06:55:36Z
Author: Andreas Marek

The double precision case works fine, but single precision crashes sometimes:
e.g. 1500 50 16, or 150 50 16
1500 500 16 works fine

Issue 19: GPU Branch version crashes with FCFLAGS = "-02, O3"
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/19
Updated: 2017-05-21T22:14:19Z
Author: Andreas Marek

Issue 20: QR decomposition crashes with segfault with Debian patch
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/20
Updated: 2017-05-21T22:14:19Z
Author: Andreas Marek

Issue 21: QR decomposition in single precision crashes for matrix 1500 50 16
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/issues/21
Updated: 2017-05-21T22:14:19Z
Author: Andreas Marek