elpa merge requests
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests
Updated: 2022-05-07T05:38:54Z

Fix problems when compiling merge_systems with -O2
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/101
Updated: 2022-05-07T05:38:54Z
Author: Andreas Marek
With the Intel compiler, floating-point exceptions occur if the module
merge_systems is compiled with -O2. This does not happen with -O1. A
directive was added to force the optimization level to be less than or
equal to 1.
Fixes #95
Assignee: Andreas Marek

Start to implement SYCL GPU backend
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/102
Updated: 2022-05-11T06:09:07Z
Author: Andreas Marek
- does not work yet
- some mkl offload functions still missing
Assignee: Andreas Marek

Gpu streams
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/104
Updated: 2022-06-02T05:56:20Z
Author: Andreas Marek
Assignee: Andreas Marek

Master pre stage
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/105
Updated: 2022-06-10T05:28:41Z
Author: Andreas Marek
Assignee: Andreas Marek

Master pre stage
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/106
Updated: 2022-06-12T06:13:39Z
Author: Andreas Marek
Assignee: Andreas Marek

Improvements for AMD Mi250
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/107
Updated: 2022-08-04T06:32:30Z
Author: Andreas Marek
Assignee: Andreas Marek

Allow to use NVIDIA cub in real GPU kernel (might give ~10% speedup)
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/108
Updated: 2022-08-06T07:03:58Z
Author: Andreas Marek
Assignee: Andreas Marek

If user does not set "omp_threads", then...
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/109
Updated: 2022-08-08T07:30:22Z
Author: Andreas Marek
use the value specified by omp_get_max_threads()
Assignee: Andreas Marek

Add configure flag to enable hipcub
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/110
Updated: 2022-08-16T06:46:39Z
Author: Andreas Marek
For nblk > 64 the HIPCUB implementation gives
wrong results!
Assignee: Andreas Marek

Fix the MPI communicators per elpa object
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/111
Updated: 2022-10-07T07:52:42Z
Author: Andreas Marek
Clarification of the ELPA usage:
It has always been intended with the ELPA API that one should
only set the MPI communicators ("mpi_comm_parent", "mpi_comm_rows",
and "mpi_comm_cols") _once_ per ELPA object.
Technically, it has been possible to change these communicators for
an existing ELPA object, which leads, depending on the exact
configuration of the ELPA object, to correct or erroneous behaviour.
With this commit, it is _technically_ impossible to set the
communicators more than once (per ELPA object).
Assignee: Andreas Marek

Fixed ev datatype for d_ptr interface for the complex case
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/112
Updated: 2022-10-10T15:20:17Z
Author: Petr Karpov
Changed size_of_datatype -> size_of_real_datatype for ev_dev for gpu device pointer usage
Assignee: Andreas Marek

fixed a problem with small nev for generalized problem, caused by cannon
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/113
Updated: 2022-10-13T07:20:51Z
Author: Petr Karpov
There was a problem with division by 0 inside cannon, which is fixed now
Assignee: Andreas Marek

Amd mi250
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/114
Updated: 2022-10-20T09:28:20Z
Author: Andreas Marek
Assignee: Andreas Marek

Add configure flag to enable hipcub
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/115
Updated: 2022-10-28T14:19:50Z
Author: Andreas Marek
For nblk > 64 the HIPCUB implementation gives
wrong results!
Assignee: Andreas Marek

If user does not set "omp_threads", then...
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/116
Updated: 2022-10-29T14:47:02Z
Author: Andreas Marek
use the value specified by omp_get_max_threads()
Assignee: Andreas Marek

Fix error when using hipCUB
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/117
Updated: 2022-10-31T10:11:51Z
Author: Andreas Marek
Assignee: Andreas Marek

Master pre stage
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/119
Updated: 2022-11-02T15:12:58Z
Author: Andreas Marek
Assignee: Andreas Marek

fixes gpu id memory leak
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/120
Updated: 2022-11-04T06:49:31Z
Author: Petr Karpov
Fixed multiple creation of cublas handles (and other gpu-handles). Added destruction of gpu-handles and handle arrays in elpa_uninit.
Assignee: Andreas Marek

Fixed C-interfaces and added new C-tests [updated]
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/122
Updated: 2022-11-11T05:56:08Z
Author: Petr Karpov
Fixed minor typos in C-interfaces; added feature for the "explicit" interfaces that distinguishes host and GPU allocated arrays (currently only for cuda and hip). Added C-tests for eigenvalues, cholesky, hermitian_multiply. Added Fortran and C-test for invert_triangular. Added device pointer (d_ptr) API tests.
Added C-interface/C-test and fixed man page for elpa_solve_tridiagonal.
Assignee: Andreas Marek

Fixed C-interfaces and added new C-tests
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/118
Updated: 2022-11-11T05:56:09Z
Author: Petr Karpov
Fixed minor typos in C-interfaces; added feature for the "explicit" interfaces that distinguishes host and GPU allocated arrays (currently only for cuda and hip).
Added C-tests for eigenvalues, cholesky, hermitian_multiply.
Added Fortran and C-test for invert_triangular.
Added device pointer (d_ptr) API tests.
Added C-interface/C-test and fixed man page for elpa_solve_tridiagonal.
Assignee: Andreas Marek
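
The set-once rule for MPI communicators described in merge request 111 can be illustrated with a minimal C sketch. This is not ELPA's implementation: `demo_object`, `demo_init`, `demo_set_comm_parent`, and the `DEMO_*` codes are hypothetical names, and the communicator is modelled as a plain int rather than a real MPI handle. The sketch only shows the general idea of accepting the first value and rejecting every later attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ELPA object; NOT the real elpa_t. */
typedef struct {
    int  mpi_comm_parent;  /* communicator modelled as a plain int (assumption) */
    bool parent_is_set;    /* true once the communicator has been stored */
} demo_object;

/* Hypothetical return codes for this sketch only. */
enum { DEMO_OK = 0, DEMO_ERROR_ALREADY_SET = -1 };

/* Put the object into the "nothing set yet" state. */
static void demo_init(demo_object *obj) {
    assert(obj != NULL);
    obj->mpi_comm_parent = 0;
    obj->parent_is_set = false;
}

/* Store the communicator on the first call. Every later call is
 * rejected and leaves the stored value untouched. */
static int demo_set_comm_parent(demo_object *obj, int comm) {
    assert(obj != NULL);
    if (obj->parent_is_set) {
        return DEMO_ERROR_ALREADY_SET;
    }
    obj->mpi_comm_parent = comm;
    obj->parent_is_set = true;
    return DEMO_OK;
}
```

With this shape, a caller that needs different communicators would create a second object instead of mutating the first, which matches the intended usage stated in the merge request.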