- 22 Dec, 2015 2 commits
-
-
Andreas Marek authored
Similar to commits 2998fac3 and 9710bf08, split elpa1.F90 and elpa2.F90 to make merging easier
-
Andreas Marek authored
-
- 11 Dec, 2015 2 commits
-
-
Andreas Marek authored
- the contact email is now: elpa-library@mpcdf.mpg.de
- the official website is now hosted at http://elpa.mpcdf.mpg.de
-
Andreas Marek authored
The Rechenzentrum Garching (RZG) has been renamed to the Max Planck Computing and Data Facility (MPCDF). This is now reflected in the adapted headers. In the near future, all references to the ELPA website and the ELPA email will also be adapted.
-
- 10 Dec, 2015 3 commits
-
-
Andreas Marek authored
The user functions of ELPA are now documented with doxygen tags. At the moment the interface of ELPA 2015.11.001 is described. The documentation still has to be added step by step for all functions and test programs.
-
Andreas Marek authored
As in a previous commit for elpa1.F90, elpa2.F90 has been split into two files for automatic generation of documentation, in order to have a lean, easy-to-understand user interface. elpa2.F90 contains the visible user functions, which provide the library calls; the usage is the same as before. elpa2_compute.F90 contains all internal routines, which are used by ELPA2 but are never called from outside the library by a user. These functions are now "hidden" in the module elpa2_compute, which is used by ELPA2. The procedures in elpa2_compute.F90 are identical to the ones in elpa2.F90 before this split commit. The only changes, although there are quite a lot of them, are indentation changes.
-
Andreas Marek authored
For automatic generation of documentation, the file elpa1.F90 has been split into two files, in order to have a lean, easy-to-understand user interface. elpa1.F90 contains the visible user functions, which provide the library calls; the usage is the same as always. elpa1_compute.F90 contains all internal routines, which are used by ELPA1 and ELPA2 but are never called by the user. These functions are now "hidden" in the module elpa1_compute, which is used by ELPA1 and ELPA2. The procedures in elpa1_compute.F90 are identical to the ones in elpa1.F90 before this split commit. The only changes, although there are a lot of them, are indentation changes.
-
- 09 Dec, 2015 1 commit
-
-
Andreas Marek authored
These variables do not have to be global; they can be passed along internally in ELPA. Removing them makes debugging easier and the public interface leaner.
-
- 08 Dec, 2015 1 commit
-
-
Alexander Heinecke authored
The current fix does as much blocking as possible, which should be beneficial from both a compute and a communication point of view. Additionally, a second possible fix was added which just calls the blocked version if the local matrix is sufficiently large. This might create more, smaller messages at scale.
-
- 07 Dec, 2015 1 commit
-
-
Andreas Marek authored
For some matrix/block size combinations the real case of ELPA2 crashes, e.g. "mpiexec -n 1 ./elpa2_test_real 50 50 32" leads to the error message "** On entry to DGEMM parameter number 3 had an illegal value" and a crash. This only seems to happen with matrix sizes smaller than 64*64. The code path responsible for this has been identified, but the problem itself is not yet solved! The part of the code which causes these crashes was switched on by default by Intel in commit fe63372d. The rest of commit fe63372d seems to be fine, and is performance critical. As an intermediate step, the responsible code path is switched off again by default; this will be changed again once the underlying root cause has been solved.
-
- 11 Nov, 2015 1 commit
-
-
Andreas Marek authored
-
- 05 Nov, 2015 1 commit
-
-
Andreas Marek authored
-
- 03 Nov, 2015 1 commit
-
-
Andreas Marek authored
The examples showing how to invoke ELPA from a C program have been updated. There are now examples for ELPA1 and ELPA2, for both the real and the complex case. The test cases still have less functionality than their Fortran counterparts; they are just meant as a "proof of concept".
-
- 28 Oct, 2015 1 commit
-
-
Alexander Heinecke authored
- enabling fusing iterations of stage 5 in ELPA2 for every configuration
- changed reduction bandwidth in ELPA2 to be at least 64
- partial OpenMP parallelization of the QR factorization in bandred_real
- OpenMP parallelization of SYMM
- OpenMP parallelization of SYR2K in bandred_real
- OpenMP parallelization for elpa1_reduce_add_vectors and elpa1_transpose_vectors
- AVX2 support in backtransformation elpa2_kernels (FMA3 instructions introduced with Haswell microarchitecture)
-
- 17 Jun, 2015 1 commit
-
-
Andreas Marek authored
"Merging" the NVIDIA code by hand introduced errors.
-
- 16 Jun, 2015 4 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
complex cases Automatically create two independent routines for real and complex valued matrices
-
Andreas Marek authored
This commit is not ABI compatible
-
Andreas Marek authored
This commit is not ABI compatible, since it changes the interfaces of some routines. Also, introduce type checking for the transpose and reduce_add routines.
-
- 02 Jun, 2015 1 commit
-
-
Andreas Marek authored
-
- 01 Jun, 2015 1 commit
-
-
Andreas Marek authored
-
- 28 May, 2015 1 commit
-
-
Lorenz Huedepohl authored
-
- 26 May, 2015 3 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
Andreas Gloess informed us about a memory leak in ELPA, which was introduced in version 2013.11.008. This memory leak has now been removed. Note that older versions of ELPA will not be fixed right now.
-
- 21 May, 2015 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
1. The dimensions of an array were wrong in CUDA calls.
2. Start to get rid of "assumed-size" arrays in the real case. They are a nightmare to debug and easily lead to a conceptual error as in 1. Furthermore, the compiler can generally optimize code better if "assumed-shape" arrays are used, since more information is available at compile time.
-
- 28 Apr, 2015 1 commit
-
-
Andreas Marek authored
-
- 24 Mar, 2015 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 09 Mar, 2015 2 commits
-
-
Andreas Marek authored
In the test programs, "MPI_ABORT" was used incorrectly.
-
Andreas Marek authored
The blocksize should be intent(in) and not intent(inout), since it should not be changed by solve_evp_real_2stage.
-
- 06 Mar, 2015 2 commits
-
-
Lorenz Huedepohl authored
-
Lorenz Huedepohl authored
-
- 03 Mar, 2015 1 commit
-
-
Andreas Marek authored
If the user chose parameters for the QR decomposition which are not allowed, an error was produced. This error is now caught, and the library aborts with a message. It is now possible to switch on more debug messages via the environment variable "ELPA_DEBUG_MESSAGES=yes".
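A switch of this kind can be sketched in C as follows. This is only an illustration of reading such an environment variable, not ELPA's actual implementation; the function names debug_value_enables and debug_messages_enabled are hypothetical, and the assumption that only the exact value "yes" enables the messages is taken from the wording of the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of ELPA): decides whether a given value of
 * the environment variable switches the debug messages on.
 * Assumption: only the exact string "yes" enables them; an unset variable
 * (NULL) or any other value leaves them off. */
int debug_value_enables(const char *value)
{
    return value != NULL && strcmp(value, "yes") == 0;
}

/* Hypothetical wrapper: reads ELPA_DEBUG_MESSAGES from the environment of
 * the running process and applies the rule above. */
int debug_messages_enabled(void)
{
    return debug_value_enables(getenv("ELPA_DEBUG_MESSAGES"));
}
```

Under these assumptions, a program started with ELPA_DEBUG_MESSAGES=yes in its environment would see debug_messages_enabled() return 1, and 0 otherwise.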
-
- 11 Feb, 2015 2 commits
-
-
Andreas Marek authored
Error in configure test program fixed
-
Andreas Marek authored
If the QR decomposition is used incorrectly (matrix size is not a multiple of the block size), the execution will abort in order to prevent the wrong results discussed in a previous commit. Debug messages are now available by setting the environment variable "ELPA_DEBUG_MESSAGES" to "yes".
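The condition described here (matrix size must be a multiple of the block size) can be sketched as a small check in C. This is a minimal illustration, not the code from this commit; the function names qr_params_ok and check_qr_or_report, the parameter names na (matrix size) and nblk (block size), and the wording of the message are all assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical helper (not part of ELPA): returns 1 if the matrix size na
 * is a positive multiple of the block size nblk, 0 otherwise. */
int qr_params_ok(int na, int nblk)
{
    if (na <= 0 || nblk <= 0) {
        return 0;
    }
    return (na % nblk) == 0;
}

/* Hypothetical caller: reports the problem on stderr and returns -1 so the
 * caller can abort, instead of continuing with parameters that would give
 * wrong results. Returns 0 if the parameters are acceptable. */
int check_qr_or_report(int na, int nblk)
{
    if (!qr_params_ok(na, nblk)) {
        fprintf(stderr,
                "QR decomposition: matrix size %d is not a multiple of block size %d\n",
                na, nblk);
        return -1;
    }
    return 0;
}
```

With this sketch, a 50x50 matrix with block size 32 would be rejected, while a 64x64 matrix with block size 32 would pass.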
-
- 03 Feb, 2015 1 commit
-
-
Andreas Marek authored
We found a bug in the QR decomposition, which appears for some matrix sizes and produces wrong results! If the QR decomposition is switched on, an appropriate warning is shown. This bug is still under investigation.
-
- 02 Feb, 2015 1 commit
-
-
Andreas Marek authored
- cleanup of the file
- add more (optional) timing information
-
- 29 Jan, 2015 1 commit
-
-
Andreas Marek authored
The QR decomposition is now available as a runtime choice. Some testing still has to be done.
-