- 31 May, 2016 1 commit
-
-
Andreas Marek authored
-
- 30 May, 2016 7 commits
-
-
Andreas Marek authored
-
Lorenz Huedepohl authored
-
Lorenz Huedepohl authored
A small missing '$' caused a lot of mischief. The tests have also been augmented to detect something like this in the future
-
Lorenz Huedepohl authored
Remove all references to private functions and symbols from the public Fortran modules. Also install only the public modules
-
Andreas Marek authored
-
Andreas Marek authored
This allows several versions of the library to be installed in the same directory
-
Andreas Marek authored
If compiled with MPI, the necessary "mpiexec -n 2" was not written in the check scripts. This commit is based on a patch provided by Michael Banck from debian.org
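The effect of this fix can be pictured with a minimal sketch. The helper `check_command` and the test binary name are hypothetical; the real check scripts are generated by the build system and are not shown in this log. Only the `mpiexec -n 2` prefix is taken from the commit message.

```python
def check_command(test_binary, with_mpi, ntasks=2):
    """Return the argument list a check script would run for one test program.

    Hypothetical helper for illustration; not part of ELPA.
    """
    if with_mpi:
        # MPI builds have to be started through the launcher, here with 2 tasks
        return ["mpiexec", "-n", str(ntasks), test_binary]
    # Builds without MPI run the test program directly
    return [test_binary]


if __name__ == "__main__":
    # "./elpa_test" is a made-up binary name
    print(" ".join(check_command("./elpa_test", with_mpi=True)))
    print(" ".join(check_command("./elpa_test", with_mpi=False)))
```

Returning an argument list rather than a single string keeps the launcher prefix and the test program separate, which makes the MPI and non-MPI cases easy to compare.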
-
- 25 May, 2016 2 commits
-
-
Andreas Marek authored
The optional build via "--enable-assumed-size-arrays" is also tested in the CI
-
Andreas Marek authored
Using Fortran assumed size arrays makes debugging a lot harder, so by default they are not used in ELPA. However, some compilers might make unwanted copies when subroutines are called with array slices. In that case, switching back to assumed size arrays might yield a performance gain
-
- 24 May, 2016 1 commit
-
-
Lorenz Huedepohl authored
Remove all references to private functions and symbols from the public Fortran modules. Also install only the public modules
-
- 23 May, 2016 1 commit
-
-
Andreas Marek authored
-
- 10 May, 2016 2 commits
-
-
Lorenz Huedepohl authored
-
Lorenz Huedepohl authored
Now this is done consistently both in autoconf and automake. One can now safely call make clean and the header files are re-generated automatically.
-
- 09 May, 2016 2 commits
-
-
Lorenz Huedepohl authored
In my humble opinion it is much more obvious to specify --with-mpi=no instead of --enable-shared-memory-only to configure without MPI
-
Lorenz Huedepohl authored
This should make it easier to compile ELPA for Debian/Ubuntu; it should now be sufficient to just run ./configure, without any need to set LIBS or SCALAPACK_LDFLAGS
-
- 04 May, 2016 1 commit
-
-
Andreas Marek authored
-
- 23 Apr, 2016 1 commit
-
-
Andreas Marek authored
In the case of SSE/AVX/AVX2 it could happen that more than one kernel was called (since some kernels depend on other kernels, e.g. block 6 on block 4 and block 2)
-
- 22 Apr, 2016 1 commit
-
-
Andreas Marek authored
setting default kernels This fixes issue #16: due to a mess in setting the default kernels, several kernels were called at the same time, which produced wrong results
-
- 20 Apr, 2016 1 commit
-
-
Andreas Marek authored
It turned out that if a CPU supports SSE, the already existing test for SSE assembly instructions always passes. However, the compilation of gcc SSE intrinsic instructions might nevertheless fail if gcc is not called with one of the options "-msse3", "-msse4", "-msse4.1", "-msse4.2", "-mavx", or "-mavx2"! Evidently gcc still does not consider SSE a standard on X86_64 Intel CPUs. An additional configure test has been introduced, which tests for gcc intrinsic SSE instructions. If this test fails, the corresponding kernels are switched off.
-
- 19 Apr, 2016 2 commits
-
-
Andreas Marek authored
The C++ kernels can be written as C kernels, which simplifies the build procedure
-
Andreas Marek authored
In order to increase type safety, all ELPA2 kernels now provide an interface. The interfaces for the C/C++ kernels are automatically generated during the configure step
-
- 08 Apr, 2016 1 commit
-
-
Lorenz Hüdepohl authored
-
- 06 Apr, 2016 1 commit
-
-
Andreas Marek authored
-
- 05 Apr, 2016 1 commit
-
-
Andreas Marek authored
The SSE kernels with blocking of 2, 4, 6 (real case) and 1, 2 (complex case) are now available by default. To this end the following changes have been made: introduce new macros in configure.ac and Makefile.am; rename the AVX kernels to AVX_AVX2 (they also support AVX2); introduce new files with the SSE kernels; introduce new kernel parameters (!); make the SSE kernels callable. The results are identical to those of the previous kernels
-
- 04 Apr, 2016 1 commit
-
-
Andreas Marek authored
- The SSE part will be available in different files. - Specify whether AVX or AVX2 was used to build
-
- 01 Apr, 2016 1 commit
-
-
Andreas Marek authored
The single precision version of the SSE assembly kernel is about 1.8 times faster than the double precision version
-
- 18 Mar, 2016 1 commit
-
-
Andreas Marek authored
library If the configure option "--enable-single-precision" is specified, ELPA will also be built for single precision usage. The double precision and single precision versions will be available at the same time under names such as "solve_evp_real_1stage_double" or "solve_evp_real_1stage_single" and so on. This change implied some major refactoring of the ELPA code: 1.) functions/procedures had to be renamed with the suffix "_double" 2.) If necessary, the same functions have to be available with the suffix "_single" 3.) Variable kind definitions have to be consistent with the intended use. To avoid unnecessary code duplication this is done (most of the time) with preprocessor string substitution. The documentation has been updated. NOT SUPPORTED at the moment are: - single precision usage of ELPA2 with kernels other than "generic" and "generic_simple" - single precision usage of GPU
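The naming scheme described above can be illustrated with a small sketch. ELPA itself does the substitution with the preprocessor on its own sources; the Python below is only an analogy, and `NAME_TEMPLATE` and `exported_names` are hypothetical names. The two resulting routine names are the ones quoted in the commit message.

```python
from string import Template

# Hypothetical template; one routine body, precision-specific name suffix
NAME_TEMPLATE = Template("solve_evp_real_1stage_${precision}")


def exported_names(enable_single_precision):
    """List the routine names a build would provide for one templated routine."""
    # The double precision variant is always built
    precisions = ["double"]
    if enable_single_precision:
        # --enable-single-precision adds the single variant alongside it
        precisions.append("single")
    return [NAME_TEMPLATE.substitute(precision=p) for p in precisions]


if __name__ == "__main__":
    for name in exported_names(enable_single_precision=True):
        print(name)
```

The point of the substitution approach is that the routine is written once and only the suffix (and, in the real code, the variable kinds) differs between the two variants.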
-
- 24 Feb, 2016 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
The configure flag "--enable-shared-memory-only" triggers a build of ELPA without MPI support: - all MPI calls are skipped (or overloaded) - all calls to scalapack functions are replaced by the corresponding lapack calls - all calls to blacs are skipped. Using ELPA without MPI gives the same results as using ELPA with 1 MPI task! This version is not yet optimized for performance; here and there some unnecessary copies are made. This version is intended for users who do not have MPI in their application but would still like to use ELPA on one compute node
-
- 11 Feb, 2016 1 commit
-
-
Andreas Marek authored
With the configure option "--enable-single-precision" ELPA1 is built with single-precision (half-words) only. The best precision in single-precision (float or complex) is 2^-23 ~ 1.2e-7. The accuracy of the error residual of ELPA1 in single-precision mode is of the order 1e-4 to 1e-5. The orthogonality of the EVs is fulfilled up to about 1e-6. Thus the precision of ELPA1 in single-precision mode is roughly 100 - 1000 times lower than the best achievable precision. This is consistent with the double-precision mode, where a factor of 100 - 1000 less precision than the theoretical best is also found. The float EVs are identical to the double EVs to at least 1e-2; the precision of the EVs is thus about 1e-2/1e-7 = 1e5 times lower than the best theoretical precision. If the same holds for the double-precision calculations, this implies that the double-precision results can also only be trusted at the level of 1e-11 (5 orders of magnitude larger than the best theoretical precision). The best possible speed-up compared to the double-precision calculation is a factor of two. This is far from achieved yet, since the single-precision version is not at all optimized at the moment
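The orders of magnitude quoted in this message can be checked with a few lines of arithmetic. This is a sketch that uses only the figures stated above and the IEEE 754 epsilon values; `loss_factor` is a hypothetical helper, and the step to double precision is the same extrapolation the message makes, not a measurement.

```python
# IEEE 754 machine epsilon: 23 fraction bits (single), 52 fraction bits (double)
EPS_SINGLE = 2.0 ** -23
EPS_DOUBLE = 2.0 ** -52


def loss_factor(observed_error, eps):
    """How many times larger an observed error is than machine epsilon."""
    return observed_error / eps


# Figures quoted in the commit message for the single-precision build
residual_loss = [loss_factor(r, EPS_SINGLE) for r in (1e-5, 1e-4)]
ev_loss = loss_factor(1e-2, EPS_SINGLE)

# Apply the same relative loss to double precision (extrapolation only)
double_trust_level = ev_loss * EPS_DOUBLE

if __name__ == "__main__":
    print(f"single epsilon:        {EPS_SINGLE:.2e}")
    print(f"residual loss factors: {residual_loss[0]:.0f} to {residual_loss[1]:.0f}")
    print(f"EV loss factor:        {ev_loss:.1e}")
    print(f"double trust level:    {double_trust_level:.1e}")
```

With the exact epsilon of 2^-23 instead of the rounded 1e-7, the residual loss comes out slightly below the quoted 100 - 1000 range and the double-precision trust level slightly above 1e-11, which is consistent with the order-of-magnitude statements in the message.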
-
- 02 Feb, 2016 1 commit
-
-
Andreas Marek authored
The generic real kernel is now contained in a module, which allows strict interface checking! It also no longer uses assumed size arrays. Both points make it easier to debug and find errors. However, this might be performance critical! It is possible to switch back to the old implementation if that turns out to be beneficial w.r.t. performance. Timings with gfortran 4.9 on Intel Haswell showed that the new implementation is about 30 percent faster than the previous one
-
- 22 Dec, 2015 1 commit
-
-
Andreas Marek authored
-
- 16 Dec, 2015 1 commit
-
-
Andreas Marek authored
This commit does not change the interfaces defined in ELPA_2015.11.001! All functionality is available via the interface names and definitions as in ELPA_2015.11.001. But some new interfaces have been added in order to unify the references from C and Fortran codes: - The procedures to create the ELPA (row/column) communicators are now available from C _and_ Fortran with the name "get_elpa_communicators". The old Fortran name "get_elpa_row_col_comms" and the old C name "elpa_get_communicators" are from now on deprecated but still available - The 1-stage solver routines are available from C _and_ Fortran via the names "solve_evp_real_1stage" and "solve_evp_complex_1stage". The old Fortran names "solve_evp_real" and "solve_evp_complex" are from now on deprecated but still functional. All documentation (man pages, doxygen, and example test programs) has been changed accordingly. This commit implies a change in the API versioning number, but no changes to codes calling ELPA (if they have already been updated to the API of ELPA_2015.11.001)
-
- 11 Dec, 2015 1 commit
-
-
Andreas Marek authored
- the contact email is now: elpa-library@mpcdf.mpg.de - the official website is now hosted at http://elpa.mpcdf.mpg.de
-
- 10 Dec, 2015 1 commit
-
-
Andreas Marek authored
The user functions of ELPA are now documented with doxygen tags. At the moment the interface of ELPA 2015.11.001 is described. The documentation still has to be added step by step for all functions and test programs.
-
- 09 Dec, 2015 1 commit
-
-
Andreas Marek authored
These variables do not have to be global; they can be passed along internally in ELPA. Removing them makes debugging easier and the public interface leaner
-
- 26 Nov, 2015 1 commit
-
-
Andreas Marek authored
The API versioning number was not updated correctly at the release. This led to a wrong soname. This is fixed now
-
- 16 Nov, 2015 1 commit
-
-
Andreas Marek authored
Due to the efforts of Intel, ELPA now features built-in support of AVX2 and FMA for the latest Intel processors
-
- 05 Nov, 2015 1 commit
-
-
Andreas Marek authored
-