- 08 Apr, 2016 1 commit
-
-
Lorenz Hüdepohl authored
For the Intel compiler, this was assured with the pragma !DEC$ ATTRIBUTES ALIGN: 64:: a; however, other compilers such as gcc did not honour this, which could result in SIGSEGVs if the variable happened not to be aligned to 32 bytes. This fixes issue #11; thanks to Nico Holmberg for reporting it.
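The alignment problem described above can be illustrated with a minimal C sketch. This is not the fix applied in ELPA (the commit concerns a Fortran variable); it only shows one compiler-independent way to request and verify alignment, using the C11 function `aligned_alloc`. The helper names `is_aligned` and `alloc_aligned_doubles` are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if ptr is aligned to `alignment` bytes.
 * `alignment` is assumed to be a power of two. */
int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Hypothetical helper: allocates room for n doubles, aligned to
 * `alignment` bytes, without relying on compiler-specific pragmas.
 * The result must be released with free(). */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    /* The alignment must be a non-zero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    size_t bytes = n * sizeof(double);

    /* C11 requires the size to be a multiple of the alignment,
     * so round the request up to the next multiple. */
    size_t padded = (bytes + alignment - 1) / alignment * alignment;
    if (padded == 0) {
        padded = alignment;
    }
    return aligned_alloc(alignment, padded);
}
```

A buffer obtained with `alloc_aligned_doubles(n, 64)` is aligned to 64 bytes and therefore also to 32 bytes, independent of which compiler built the code.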
-
- 05 Apr, 2016 1 commit
-
-
Andreas Marek authored
The SSE kernels with blocking of 2, 4, 6 (real case) and 1, 2 (complex case) are now available by default. To this end, the following changes have been made:
  - introduce new macros in configure.ac and Makefile.am
  - rename the AVX kernels to AVX_AVX2 (they also support AVX2)
  - introduce new files with the SSE kernels
  - introduce new kernel parameters!
  - make the SSE kernels callable
The results are identical to those of the previous kernels.
-
- 04 Apr, 2016 1 commit
-
-
Andreas Marek authored
  - The SSE part will be available in different files.
  - Specify whether AVX or AVX2 was used to build.
-
- 26 Feb, 2016 1 commit
-
-
Andreas Marek authored
-
- 24 Feb, 2016 3 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
The test programs now include the same template, so the printed messages are unified.
-
Andreas Marek authored
The configure flag "--enable-shared-memory-only" triggers a build of ELPA without MPI support:
  - all MPI calls are skipped (or overloaded)
  - all calls to scalapack functions are replaced by the corresponding lapack calls
  - all calls to blacs are skipped
Using ELPA without MPI gives the same results as using ELPA with 1 MPI task! This version is not yet optimized for performance; here and there some unnecessary copies are made. This version is intended for users who do not have MPI in their application but would still like to use ELPA on one compute node.
-
- 02 Feb, 2016 5 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
The generic real kernel is now contained in a module, which allows strict interface checking! It also no longer uses assumed-size arrays. Both points make it easier to debug and find errors. However, this might be performance critical! It is possible to switch back to the old implementation if that turns out to be beneficial w.r.t. performance. Timings with gfortran 4.9 on Intel Haswell showed that the new implementation is about 30 percent faster than the previous one.
-
- 19 Jan, 2016 2 commits
-
-
Andreas Marek authored
All functions that were "contained" in another one have now been moved to separate modules. This allows for strict interface checking and debugging.
-
Andreas Marek authored
This routine was previously contained in a subroutine. It has been moved to a module and renamed to "single_hh_trafo_real" to make its intention clearer.
-
- 11 Jan, 2016 1 commit
-
-
Andreas Marek authored
-
- 04 Jan, 2016 1 commit
-
-
Andreas Marek authored
The Fortran variable declaration "variable type*[4,8,16]" is not standard Fortran. It might cause problems in the future. Furthermore, using Fortran and C together is cleaner if variables are defined according to C variable types. This has now been done for all the test programs.
-
- 16 Dec, 2015 1 commit
-
-
Andreas Marek authored
This commit does not change the interfaces defined in ELPA_2015.11.001! All functionality is available via the interface names and definitions as in ELPA_2015.11.001. However, some new interfaces have been added in order to unify the references from C and Fortran codes:
  - The procedures to create the ELPA (row/column) communicators are now available from C _and_ Fortran under the name "get_elpa_communicators". The old Fortran name "get_elpa_row_col_comms" and the old C name "elpa_get_communicators" are from now on deprecated but still available.
  - The 1-stage solver routines are available from C _and_ Fortran via the names "solve_evp_real_1stage" and "solve_evp_complex_1stage". The old Fortran names "solve_evp_real" and "solve_evp_complex" are from now on deprecated but still functional.
All documentation (man pages, doxygen, and example test programs) has been changed accordingly. This commit implies a change in the API versioning number, but no changes to codes calling ELPA (if they have already been updated to the API of ELPA_2015.11.001).
-
- 15 Dec, 2015 1 commit
-
-
Andreas Marek authored
Man pages describing the Fortran and C interfaces exist for the library functions that are accessible to the user:
  - get_elpa_row_comms
  - solve_evp_real, solve_evp_complex
  - solve_evp_real_2stage, solve_evp_complex_2stage
A man page also exists for the "service binary" print_available_elpa2_kernels. TODO: extend man pages to test-binaries, or do not install test-binaries.
-
- 10 Dec, 2015 3 commits
-
-
Andreas Marek authored
The user functions of ELPA are now documented with doxygen tags. At the moment the interface of ELPA 2015.11.001 is described. The documentation still has to be added step by step for all functions and test programs.
-
Andreas Marek authored
As in a previous commit for elpa1.F90, elpa2.F90 has been split into two files for automatic generation of documentation, in order to have a lean, easy-to-understand user interface:
  elpa2.F90: the visible user functions, which provide the library calls. The usage is the same as before.
  elpa2_compute.F90: all internal routines, which are used by ELPA2 but are never called by a user from outside the library. These functions are now "hidden" in the module elpa2_compute, which is used by ELPA2.
The procedures in elpa2_compute.F90 are identical to the ones in elpa2.F90 before this split commit. The only changes, although there are many of them, are indentation changes.
-
Andreas Marek authored
For automatic generation of documentation, the file elpa1.F90 has been split into two files, in order to have a lean, easy-to-understand user interface:
  elpa1.F90: the visible user functions, which provide the library calls. The usage is the same as always.
  elpa1_compute.F90: all internal routines, which are used by ELPA1 and ELPA2 but are never called by the user. These functions are now "hidden" in the module elpa1_compute, which is used by ELPA1 and ELPA2.
The procedures in elpa1_compute.F90 are identical to the ones in elpa1.F90 before this split commit. The only changes, although there are many of them, are indentation changes.
-
- 16 Nov, 2015 2 commits
-
-
Andreas Marek authored
Thanks to the efforts of Intel, ELPA now features built-in support for AVX2 and FMA on the latest Intel processors.
-
Lorenz Huedepohl authored
-
- 05 Nov, 2015 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 03 Nov, 2015 1 commit
-
-
Andreas Marek authored
The examples of how to invoke ELPA from a C program have been updated. There are now examples for ELPA1 and ELPA2, in both the real and the complex case. The test cases still have less functionality than their Fortran counterparts; they are just meant as a "proof-of-concept".
-
- 23 Mar, 2015 1 commit
-
-
Lorenz Huedepohl authored
-
- 16 Mar, 2015 3 commits
-
-
Lorenz Huedepohl authored
-
Lorenz Huedepohl authored
-
Lorenz Huedepohl authored
-
- 11 Mar, 2015 1 commit
-
-
Andreas Marek authored
C interfaces are now available and defined in the header elpa.h.
-
- 11 Feb, 2015 1 commit
-
-
Andreas Marek authored
If the QR decomposition is used incorrectly (the matrix size is not a multiple of the block size), execution will abort in order to prevent the wrong results discussed in a previous commit. Debug messages are now available by setting the environment variable "ELPA_DEBUG_MESSAGES" to "yes".
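The two checks described above can be sketched in C as follows. This is an illustration under stated assumptions, not ELPA's implementation: the function names `debug_messages_enabled` and `qr_size_is_valid` are hypothetical, and the exact string comparison against "yes" is an assumption about how the environment variable is interpreted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the environment variable
 * ELPA_DEBUG_MESSAGES is set to exactly "yes", 0 otherwise.
 * Assumption: the comparison is exact and case-sensitive. */
int debug_messages_enabled(void)
{
    const char *value = getenv("ELPA_DEBUG_MESSAGES");
    return value != NULL && strcmp(value, "yes") == 0;
}

/* Hypothetical helper: returns 1 if the matrix size is a multiple of
 * the block size, which is the precondition named in the commit
 * message for using the QR decomposition. */
int qr_size_is_valid(int matrix_size, int block_size)
{
    return block_size > 0 && matrix_size % block_size == 0;
}
```

A caller would test `qr_size_is_valid(na, nblk)` before selecting the QR path and abort otherwise, printing a diagnostic only when `debug_messages_enabled()` returns 1; `na` and `nblk` are placeholder names here.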
-
- 03 Feb, 2015 3 commits
-
-
Lorenz Huedepohl authored
It contains preprocessor directives which produce warnings or errors otherwise.
-
Lorenz Huedepohl authored
-
Andreas Marek authored
-
- 29 Jan, 2015 1 commit
-
-
Andreas Marek authored
The QR decomposition is now available as a runtime choice. Some testing still has to be done.
-
- 28 Jan, 2015 1 commit
-
-
Andreas Marek authored
-
- 27 Jan, 2015 1 commit
-
-
Lorenz Huedepohl authored
-
- 25 Aug, 2014 2 commits
-
-
Andreas Marek authored
At build time it can be specified that the ELPA test programs give more detailed timing information, which is useful for performance measurements.
-
Andreas Marek authored
If specified in the configure step, the test programs redirect the stdout and stderr output of each MPI task to a separate file, which is stored in a subdirectory "mpi_stdout". This is only done if the environment variable "REDIRECT_ELPA_TEST_OUTPUT" is set to "true".
-