Changelog for next release

- not yet decided

Upcoming changes for ELPA 2021.05.001

- allow the user to set the mapping of MPI tasks to GPU id per set/get
- experimental feature: port to AMD GPUs, works correctly, performance yet unclear

Changelog for ELPA 2020.11.001

- this release contains mostly bugfixes:
- fix determination whether a _ is needed to link Fortran to C
- fix an error in the real block4 kernel for aarch64 NEON
- add missing test_scalapack_template.F90 to EXTRA_DIST list
- fix error in the GPU kernel
- switch from python2 to python3
- experimental feature: complex kernels for aarch64 NEON
- experimental feature: kernels for ARM SVE

Changelog for ELPA 2020.05.001

- Enable compilation with gcc v10
- Fix a bug in elpa_multiply_a_b (GPU)
- improved documentation, including fixing of typos and errors in markdown
- Fix a bug in the calling of Cannon's algorithm which might lead to crashes for a square process grid
- improvements and bugfixes of the ELPA2 stage GPU version, see https://arxiv.org/abs/2002.10991
- bugfix for the build of AVX-512 KNL kernels
- clean separation of SIMD instructions for AVX and AVX2 kernels
- better error checking for allocations / deallocations of CPU and GPU memory
- experimental feature of matrix redistribution
- bugfix in the cpuid tests
- bugfix in elpa2_print_kernels
- bugfix when configuring --with-gpu-support-only

Changelog for ELPA 2019.11.001

- solve a bug when using parallel make builds
- check the cpuid set during build time
- add experimental feature "heterogenous-cluster-support"
- add experimental feature for 64bit integer BLAS/LAPACK/SCALAPACK support
- add experimental feature for 64bit integer MPI support
- support of ELPA for real valued skew-symmetric matrices, please cite: https://arxiv.org/abs/1912.04062
- cleanup of the GPU version
- bugfix in the OpenMP version
- bugfix on the Power8/9 kernels
- bugfix on ARM aarch64 FMA kernels

Changelog for ELPA 2019.05.002

- repacking of the src since the legacy
  interface has been forgotten in the 2019.05.001 release

Changelog for ELPA 2019.05.001

- elpa_print_kernels supports GPU usage
- fix an error if PAPI measurements are activated
- new simple real kernels: block4 and block6
- C functions can be built with optional arguments if the compiler supports it (configure option)
- allow measurements with the likwid tool
- users can define the default-kernel at build time
- ELPA versioning number is provided in the C header files
- as announced a year ago, the following deprecated routines have finally been removed; see DEPRECATED_FEATURES for the replacement routines, which were introduced a year ago. Removed routines:
  -> mult_at_b_real
  -> mult_ah_b_complex
  -> invert_trm_real
  -> invert_trm_complex
  -> cholesky_real
  -> cholesky_complex
  -> solve_tridi
- new kernels for ARM aarch64 added
- fix an out-of-bound-error in elpa2

Changelog for ELPA 2018.11.001

- improved autotuning
- improved performance of generalized problem via Cannon's algorithm
- checkpointing functionality of elpa objects
- store/read/resume of autotuning
- Python interface for ELPA
- more ELPA functions have an optional error argument (Fortran) or required error argument (C) => ABI and API change

Changelog for ELPA 2018.05.001

- significantly improved performance on K-computer
- added interface for the generalized eigenvalue problem
- extended autotuning functionality

Changelog for ELPA 2017.11.001

- significant improvement of performance of GPU version
- added new compute kernels for IBM Power8 and Fujitsu Sparc64 processors
- a first implementation of autotuning capability
- correct some type statements in Fortran
- correct detection of PAPI in configure step

Changelog for ELPA 2017.05.003

- remove bug in invert_triangular, which had been introduced in ELPA 2017.05.002

Changelog for ELPA 2017.05.002

Mainly bugfixes for ELPA 2017.05.001:
- fix memory leak of MPI communicators
- tests for hermitian_multiply, cholesky decomposition and
- deal with a problem on
  Debian (mawk)

Changelog for ELPA 2017.05.001

Final release of ELPA 2017.05.001. Since rc2 the following changes have been made:
- more extensive tests during "make check"
- distribute missing C headers
- introduce analytic tests
- Fix stack overflow in some kernels

Changelog for ELPA 2017.05.001.rc2

This is the release candidate 2 for the ELPA 2017.05.001 version.
Additionally to the changes from rc1, it fixes some smaller issues:
- add missing script "manual_cpp"
- cleanup of code

Changelog for ELPA 2017.05.001.rc1

This is the release candidate 1 for the ELPA 2017.05.001 version.
It provides a first version of the new, more generic API of the ELPA library.
Smaller changes to the API might be possible in the upcoming release
candidates. For users who would like to use the older API of the ELPA
library, the API as defined with release 2016.11.001.pre is frozen and
still supported.

Apart from the API change to be more flexible for the future, this release
offers the following changes:

- faster GPU implementation, especially for ELPA 1stage
- the restriction of the block-cyclic distribution blocksize = 128 in the GPU case is relaxed
- Faster CPU implementation due to better blocking
- support of already banded matrices (new API only!)
- improved KNL support

Changelog for pre-release ELPA 2016.11.001.pre

This pre-release contains an experimental API which will most likely
change in the next stable release.

- also support of single-precision (real and complex case) eigenvalue problems
- GPU support in ELPA 1stage and 2stage (real and complex case)
- change of API (w.r.t.
  ELPA 2016.05.004) to support runtime-choice of GPU usage

Changelog for release ELPA 2016.05.004

- fix a problem with the private state of module precision
- distribute test_project with dist tarball
- generic driver routine for ELPA 1stage and 2stage
- test case for elpa_mult_at_b_real
- test case for elpa_mult_ah_b_complex
- test case for elpa_cholesky_real
- test case for elpa_cholesky_complex
- test case for elpa_invert_trm_real
- test case for elpa_invert_trm_complex
- fix building of static library
- better choice of AVX, AVX2, AVX512 kernels
- make assumed size Fortran arrays default

Changelog for release ELPA 2016.05.003

- fix a problem with the build of SSE kernels
- make some (internal) functions public, such that they can be used outside of ELPA
- add documentation and interfaces for new public functions
- shorten file names and directory names for test programs in order to bypass the "make argument list too long" error

Changelog for release ELPA 2016.05.002

- fix problem with generated *.sh- check scripts
- name library differently if built without MPI support
- install only public modules

Changelog for release ELPA 2016.05.001

- support building without MPI for one node usage
- doxygen and man pages documentation for ELPA
- cleanup of documentation
- introduction of SSE gcc intrinsic kernels
- Remove errors due to unaligned memory
- removal of Fortran "contains functions"
- Fortran interfaces for assembly and C kernels