elpa merge requests
https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests
Updated: 2023-12-05T08:44:18Z

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/155
elpa1 gpu optimization
Updated: 2023-12-05T08:44:18Z
Author: Petr Karpov
Fixes for the ELPA1 GPU optimizations concerning the synchronization in dot-product-like kernels. The vulnerability was exposed by SYCL on CPU tests.
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/153
optimization_26 nccl: implemented elpa_gpu_reduce_add_vectors, for...
Updated: 2023-11-24T10:49:09Z
Author: Andreas Marek
optimization_26 nccl: implemented elpa_gpu_reduce_add_vectors; for the NCCL tridiagonalization in ELPA1, everything is on GPU now
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/151
Merge of ELPA1-GPU optimization to master_pre_stage
Updated: 2023-11-23T06:54:05Z
Author: Petr Karpov
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/152
Redistribute and dptr
Updated: 2023-11-21T18:36:01Z
Author: Andreas Marek
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/150
Sycl dev
Updated: 2023-11-21T09:44:24Z
Author: Andreas Marek
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/149
Master pre stage
Updated: 2023-11-16T07:58:37Z
Author: Andreas Marek
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/148
Fix cholesky
Updated: 2023-11-16T07:55:07Z
Author: Andreas Marek
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/147
Skew headers
Updated: 2023-11-09T12:56:54Z
Author: Andreas Marek
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/146
Elpa 2 kernel improvements.
Updated: 2023-10-27T15:30:51Z
Author: Alexander Poeppl
Merge for the ELPA 2 SYCL functionality
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/145
New API call 'setup_gpu'
Updated: 2023-10-26T18:28:15Z
Author: Andreas Marek
New API call to set up GPU usage; speeds up the setup tremendously
- must be called _after_ the call to setup
- must be called _after_ setting one of the keywords 'nvidia-gpu',
'amd-gpu' or 'intel-gpu'
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/144
update to enable SVE 256 kernels; the kernels have been tested with FHIaims a...
Updated: 2023-10-25T06:06:39Z
Author: Andreas Marek
Update to enable SVE 256 kernels; the kernels have been tested with FHIaims and ELPA unit tests
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/143
Fix bug in c/cpp multiple_objs tests by adding sleep before autotune_load_state
Updated: 2023-10-20T13:09:39Z
Author: Petr Karpov
This fixes issue #103
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/142
C interfaces: add print_times, fix cublas/rocblas/hipblas axpy
Updated: 2023-10-17T15:13:04Z
Author: Petr Karpov
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/141
Fix AMD ROCm compile bug
Updated: 2023-09-29T12:01:59Z
Author: Petr Karpov
A proper fix for the AMD ROCm bug that doesn't break the HIP code path.
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/140
Fixed leaks of mpi communicators in test programs
Updated: 2023-08-09T08:34:59Z
Author: Petr Karpov
Fixed leaks of MPI communicators in Fortran tests: test.F90 and test_split_comm.F90
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/139
Fixed bug in invert_triangular with streams
Updated: 2023-07-07T05:55:28Z
Author: Petr Karpov
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/138
ELPA 2023.05.001 release
Updated: 2023-06-14T11:01:10Z
Author: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/137
Several changes to configure script
Updated: 2023-05-09T05:14:30Z
Author: Andreas Marek
- first check whether the C++ compiler supports the C++17 standard (needed for
SYCL); only if it does not, enforce the C++11 standard
- for clarity: rename --enable-avx (and so forth) to --enable-avx-kernels;
the old flags are still supported for the time being
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/136
sycl C/C++ tests and implementation of cholesky and invert_trm
Updated: 2023-05-04T05:56:00Z
Author: Petr Karpov
Key changes:
- implemented SYCL versions of cholesky and invert_trm
- adapted C/C++ tests for SYCL on GPU (in particular, created gpuFree function specific to SYCL C-tests).
@amarek please check the switched-off gpu_api_explicit tests, e.g. line 284 of test.C
- added workaround paths for non-implemented SYCL functions (e.g. sycl_host_register) in hermitian_multiply
- also cleaned up duplicated checks for GPU in /src/GPU/vendor_agnostic_layer_template.F90; @amarek please check it, e.g. lines 272-285
- check_for_GPU: getdevicecount -> moved to vendor_agnostic_layer_template
- added checks for NaNs and Infinities in test correctness routines
- changed all "stop" Fortran instructions to "stop 1"
Assignee: Andreas Marek

https://gitlab.mpcdf.mpg.de/elpa/elpa/-/merge_requests/135
fix obs installation cxx11
Updated: 2023-02-14T07:57:52Z
Author: Petr Karpov
Add -std=c++11 flag to fix OBS installation for older compiler versions on cobra
Assignee: Andreas Marek
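
Merge request 136 above mentions added checks for NaNs and Infinities in the test correctness routines, along with a switch from "stop" to "stop 1". The feed does not show that code, so the following is only a minimal sketch in C of why such a check matters: a comparison like `NaN > tol` evaluates to false, so a plain tolerance test can silently accept a NaN result. The function names `count_non_finite` and `check_residuals` are hypothetical and are not taken from the ELPA sources.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not an ELPA routine): counts the entries of
   values[0..n-1] that are NaN or +/-Infinity. */
size_t count_non_finite(const double *values, size_t n)
{
    assert(values != NULL || n == 0);
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!isfinite(values[i])) {
            ++bad;
        }
    }
    return bad;
}

/* Hypothetical correctness check: returns 0 if every residual is finite and
   no larger than tol in absolute value, and 1 otherwise. The non-finite test
   runs first because fabs(NaN) > tol is false and would not be caught by the
   tolerance comparison alone. */
int check_residuals(const double *residuals, size_t n, double tol)
{
    if (count_non_finite(residuals, n) > 0) {
        return 1;
    }
    for (size_t i = 0; i < n; ++i) {
        if (fabs(residuals[i]) > tol) {
            return 1;
        }
    }
    return 0;
}
```

A test program could pass the return value of `check_residuals` on as its exit status, so that a failed check yields a non-zero exit code in the same spirit as "stop 1" in the Fortran tests.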