Changelog for next release

- not yet decided

Changelog for ELPA 2020.05.001

- improved documentation, including fixing of typos and errors in markdown
- fix a bug in the call to Cannon's algorithm which might lead to crashes
  for a square process grid
- improvements and bugfixes of the ELPA 2stage GPU version, see
   https://arxiv.org/abs/2002.10991
- bugfix for the build of AVX-512 KNL kernels
- clean separation of SIMD instructions for AVX and AVX2 kernels
- better error checking for allocations / deallocations of CPU and GPU memory
- experimental feature of matrix redistribution
- bugfix in the cpuid tests
- bugfix in elpa2_print_kernels
- bugfix when configuring --with-gpu-support-only

Changelog for ELPA 2019.11.001

- fix a bug when using parallel make builds
- check the cpuid set during build time
- add experimental feature "heterogenous-cluster-support"
- add experimental feature for 64bit integer BLAS/LAPACK/SCALAPACK support
- add experimental feature for 64bit integer MPI support
- support of ELPA for real valued skew-symmetric matrices, please cite:
  https://arxiv.org/abs/1912.04062 
- cleanup of the GPU version
- bugfix in the OpenMP version
- bugfix on the Power8/9 kernels
- bugfix on ARM aarch64 FMA kernels

Changelog for ELPA 2019.05.002

- repackaging of the source, since the legacy interface was omitted from the
  2019.05.001 release

Changelog for ELPA 2019.05.001

- elpa_print_kernels supports GPU usage
- fix an error if PAPI measurements are activated
- new simple real kernels: block4 and block6
- C functions can be built with optional arguments if the compiler supports it
  (configure option)
- allow measurements with the likwid tool
- users can define the default-kernel at build time
- ELPA versioning number is provided in the C header files
- as announced a year ago, the following deprecated routines have finally been
  removed; see DEPRECATED_FEATURES for the replacement routines, which were
  introduced a year ago. Removed routines:
  -> mult_at_b_real
  -> mult_ah_b_complex
  -> invert_trm_real
  -> invert_trm_complex
  -> cholesky_real
  -> cholesky_complex
  -> solve_tridi
- new kernels for ARM aarch64 added
- fix an out-of-bounds error in elpa2

Changelog for ELPA 2018.11.001

- improved autotuning
- improved performance of generalized problem via Cannon's algorithm
- checkpointing functionality for ELPA objects
- store/read/resume of autotuning
- Python interface for ELPA
- more ELPA functions have an optional error argument (Fortran) or required
error argument (C) => ABI and API change

Changelog for ELPA 2018.05.001

- significantly improved performance on the K computer
- added interface for the generalized eigenvalue problem
- extended autotuning functionality

Changelog for ELPA 2017.11.001

- significant improvement of performance of GPU version
- added new compute kernels for IBM Power8 and Fujitsu SPARC64
  processors
- a first implementation of autotuning capability
- correct some type statements in Fortran
- correct detection of PAPI in configure step

Changelog for ELPA 2017.05.003

- remove bug in invert_triangular, which had been introduced
  in ELPA 2017.05.002

Changelog for ELPA 2017.05.002

Mainly bugfixes for ELPA 2017.05.001:
- fix memory leak of MPI communicators
- tests for hermitian_multiply, cholesky decomposition and
- deal with a problem on Debian (mawk)

Changelog for ELPA 2017.05.001

Final release of ELPA 2017.05.001
Since rc2 the following changes have been made
- more extensive tests during "make check"
- distribute missing C headers
- introduce analytic tests
- Fix stack overflow in some kernels

Changelog for ELPA 2017.05.001.rc2

This is the release candidate 2 for the ELPA 2017.05.001 version.
In addition to the changes from rc1, it fixes some smaller issues:
- add missing script "manual_cpp"
- cleanup of code

Changelog for ELPA 2017.05.001.rc1

This is the release candidate 1 for the ELPA 2017.05.001 version.
It provides a first version of the new, more generic API of the ELPA library.
Smaller changes to the API might be possible in the upcoming release
candidates. For users who would like to use the older API of the ELPA
library, the API as defined with release 2016.11.001.pre is frozen and
still supported.

Apart from the API change, which makes the library more flexible for the
future, this release offers the following changes:

- faster GPU implementation, especially for ELPA 1stage
- the restriction of the block-cyclic distribution blocksize = 128 in the GPU
  case is relaxed
- Faster CPU implementation due to better blocking
- support of already banded matrices (new API only!)
- improved KNL support

Changelog for pre-release ELPA 2016.11.001.pre

This pre-release contains an experimental API which will most likely
change in the next stable release

- support of single-precision (real and complex case) eigenvalue problems
- GPU support in ELPA 1stage and 2stage (real and complex case)
- change of API (w.r.t. ELPA 2016.05.004) to support runtime-choice of GPU usage

Changelog for release ELPA 2016.05.004

- fix a problem with the private state of module precision
- distribute test_project with dist tarball
- generic driver routine for ELPA 1stage and 2stage
- test case for elpa_mult_at_b_real
- test case for elpa_mult_ah_b_complex
- test case for elpa_cholesky_real
- test case for elpa_cholesky_complex
- test case for elpa_invert_trm_real
- test case for elpa_invert_trm_complex
- fix building of static library
- better choice of AVX, AVX2, AVX512 kernels
- make assumed size Fortran arrays default

Changelog for release ELPA 2016.05.003

- fix a problem with the build of SSE kernels
- make some (internal) functions public, such that they
  can be used outside of ELPA
- add documentation and interfaces for new public functions
- shorten file names and directory names for test programs
  in order to bypass the "make argument list too long" error

Changelog for release ELPA 2016.05.002

- fix problem with generated *.sh check scripts
- name library differently if built without MPI support
- install only public modules

Changelog for release ELPA 2016.05.001

- support building without MPI for one node usage
- doxygen and man pages documentation for ELPA
- cleanup of documentation
- introduction of SSE gcc intrinsic kernels
- Remove errors due to unaligned memory
- removal of Fortran "contains functions"
- Fortran interfaces for assembly and C kernels