- 19 Apr, 2016 2 commits
-
-
Andreas Marek authored
The utility binary printed the available kernels to stderr. This has been changed. The ELPA library itself still prints all of its messages to stderr
-
Andreas Marek authored
The test programs are only needed at the build step (make check); they are of no use to users and are no longer installed
-
- 08 Apr, 2016 4 commits
-
-
Andreas Marek authored
-
Lorenz Hüdepohl authored
-
Lorenz Hüdepohl authored
For the Intel compiler, this was assured with the pragma !DEC$ ATTRIBUTES ALIGN: 64:: a; however, other compilers such as gcc of course did not honour this, which could result in SIGSEGVs if the variable was not aligned to 32 bytes (by chance!). This fixes issue #11; thanks to Nico Holmberg for reporting this.
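To illustrate the alignment requirement described in this message, here is a minimal C sketch. It is not the code of this commit. The helper names `is_aligned` and `alloc_aligned_doubles` are hypothetical, and the sketch assumes a C11 standard library that provides `aligned_alloc`. It requests the alignment explicitly at allocation time instead of relying on a compiler-specific directive, and it checks the resulting address.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns nonzero if ptr is aligned to `alignment`
 * bytes. `alignment` is assumed to be a power of two. */
int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Hypothetical helper: allocates room for n doubles at an address that is
 * a multiple of `alignment` bytes. Release the buffer with free(). */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    size_t bytes = n * sizeof(double);
    /* aligned_alloc expects the size to be a multiple of the alignment,
     * so round the request up. */
    size_t padded = (bytes + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, padded);
}
```

A buffer obtained with `alloc_aligned_doubles(n, 64)` is also 32-byte aligned, so it satisfies both the 64-byte request made by the Intel directive and the 32-byte requirement mentioned above, independent of the compiler.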
-
Lorenz Hüdepohl authored
-
- 06 Apr, 2016 1 commit
-
-
Andreas Marek authored
-
- 05 Apr, 2016 2 commits
-
-
Andreas Marek authored
The SSE kernels with blocking of 2, 4, 6 (real case) and 1, 2 (complex case) are now available by default. To this end, the following changes have been made:
- introduce new macros in configure.ac and Makefile.am
- rename the AVX kernels to AVX_AVX2 (they also support AVX2)
- introduce new files with the SSE kernels
- introduce new kernel parameters!
- make the SSE kernels callable
The results are identical to those of the previous kernels
-
Andreas Marek authored
-
- 04 Apr, 2016 3 commits
-
-
Andreas Marek authored
- The SSE part will be available in different files.
- Specify whether AVX or AVX2 was used to build
-
Andreas Marek authored
-
Andreas Marek authored
From now on, a Changelog will be updated before an ELPA release
-
- 04 Mar, 2016 1 commit
-
-
Andreas Marek authored
files
-
- 26 Feb, 2016 1 commit
-
-
Andreas Marek authored
-
- 24 Feb, 2016 4 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
The test programs now include the same template, so the printed messages are unified
-
Andreas Marek authored
The configure flag "--enable-shared-memory-only" triggers a build of ELPA without MPI support:
- all MPI calls are skipped (or overloaded)
- all calls to scalapack functions are replaced by the corresponding lapack calls
- all calls to blacs are skipped
Using ELPA without MPI gives the same results as using ELPA with 1 MPI task! This version is not yet optimized for performance; here and there some unnecessary copies are made. This version is intended for users who do not have MPI in their application but would still like to use ELPA on one compute node
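The idea of skipping MPI calls while behaving like a run with 1 MPI task can be sketched in C as follows. This is an illustration under stated assumptions, not ELPA's actual source: the preprocessor macro `WITH_MPI` and the function `get_task_layout` are hypothetical names chosen for this example. Only the MPI branch uses real MPI calls (`MPI_Comm_rank`, `MPI_Comm_size`).

```c
#ifdef WITH_MPI
#include <mpi.h>
#endif

/* Hypothetical helper: reports the rank of the calling task and the total
 * number of tasks. WITH_MPI is an assumed build-time macro. */
void get_task_layout(int *rank, int *ntasks)
{
#ifdef WITH_MPI
    /* MPI build: ask the MPI library. */
    MPI_Comm_rank(MPI_COMM_WORLD, rank);
    MPI_Comm_size(MPI_COMM_WORLD, ntasks);
#else
    /* Build without MPI: behave like a run with a single task. */
    *rank = 0;
    *ntasks = 1;
#endif
}
```

Compiled without `WITH_MPI`, callers see rank 0 of 1 task and need neither the MPI headers nor the MPI library, which matches the statement that the MPI-free build gives the same results as a run with 1 MPI task.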
-
- 18 Feb, 2016 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 17 Feb, 2016 1 commit
-
-
Andreas Marek authored
-
- 03 Feb, 2016 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 02 Feb, 2016 15 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
This commit is performance-critical and has to be timed carefully. It is therefore possible to switch back to the old implementation. The new one, however, is safer and easier to debug
-
Andreas Marek authored
-
Andreas Marek authored
This commit might be performance-critical and has to be timed carefully. It is therefore possible to switch back to the old implementation. The new one, however, is safer and easier to debug
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
This change might be performance-critical and has to be timed carefully. It is therefore possible to switch back to the old implementation. The new one, however, can actually be debugged
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
The generic real kernel is now contained in a module, which allows strict interface checking! It also no longer uses assumed-size arrays. Both points make it easier to debug and find errors. However, this might be performance-critical! It is possible to switch back to the old implementation if that turns out to be beneficial w.r.t. performance. Timings with gfortran 4.9 on Intel Haswell showed that the new implementation is about 30 percent faster than the previous one
-
- 19 Jan, 2016 2 commits
-
-
Andreas Marek authored
All functions that were "contained" in another one have now been moved to separate modules. This allows for strict interface checking and debugging
-
Andreas Marek authored
This routine was contained in a subroutine. It has been moved to a module and renamed to "single_hh_trafo_real" to make its intention clearer
-