optimization_26 nccl: implemented elpa_gpu_reduce_add_vectors, for...
optimization_26 nccl: implemented elpa_gpu_reduce_add_vectors; the NCCL tridiagonalization in ELPA1 now runs entirely on the GPU