# Installation guide for the *ELPA* library #

## Preamble ##

This file provides documentation on how to build the *ELPA* library in **version ELPA-2020.11.001.rc1**.
With the release of **version ELPA-2017.05.001** the build process has been significantly simplified,
which makes it easier to install the *ELPA* library.

The release ELPA 2018.11.001 was the last release in which the legacy API was
enabled by default (and could be disabled at build time).
With the release ELPA 2019.11.001, the legacy API has been deprecated and support for it has ended.

The release of ELPA 2020.11.001.rc1 changes the API and ABI compared to the release 2019.11.001, since
the legacy API has been dropped.

## How to install *ELPA* ##

First of all, if you do not want to build *ELPA* yourself, and you run Linux,
it is worth having a look at the [*ELPA* webpage](http://elpa.mpcdf.mpg.de)
and/or the repositories of your Linux distribution: there exist
pre-built packages for a number of Linux distributions like Fedora,
Debian, and openSUSE. More will hopefully follow in the future.

If you want to build *ELPA* yourself (or have to, since no packages are available),
please note that *ELPA* is shipped with a typical `configure` and `make`
autotools procedure. This is the **only supported way** to build and install *ELPA*.

If you obtained *ELPA* from the official git repository, you will not find
the needed configure script! You will have to create the configure script with autoconf.
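
A minimal sketch of this step, assuming the autotools (autoconf, automake, libtool) are installed. The function name `bootstrap_configure` is hypothetical, and the presence of an `autogen.sh` helper in the checkout is an assumption; if it is missing, the sketch falls back to the generic `autoreconf --install`:

```shell
# Create ./configure in a git checkout if it does not exist yet.
# Usage: bootstrap_configure /path/to/elpa-checkout
bootstrap_configure() {
    (
        # Work in a subshell so the caller's working directory is untouched.
        cd "$1" || exit 1
        # Nothing to do if a configure script is already present.
        [ -x ./configure ] && exit 0
        if [ -x ./autogen.sh ]; then
            # Assumption: the checkout ships an autogen.sh helper.
            ./autogen.sh
        else
            # Generic autotools fallback.
            autoreconf --install
        fi
    )
}
```

Release tarballs already contain `configure`, in which case the function returns without doing anything.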


## (A): Installing *ELPA* as library with configure ##

*ELPA* can be installed with the build steps
- `configure`
- `make`
- `make check` (or `make check CHECK_LEVEL=extended`)
- `make install`

Please look at `configure --help` for all available options.

An excerpt of the most important (*ELPA* specific) options reads as follows:

| configure option                         | description                                           |
|:---------------------------------------- |:----------------------------------------------------- |
|  `--enable-legacy-interface`             | build legacy API, not built by default                |
|  `--enable-optional-argument-in-C-API`   | treat error arguments in C-API as optional            |
|  `--enable-openmp`                       | use OpenMP threading, default no.                     |
|  `--enable-redirect`                     | for ELPA test programs, allow redirection of <br> stdout/stderr per MPI task in a file <br> (useful for timing), default no. |
|  `--enable-single-precision`             | build with single precision version                   |
|  `--disable-timings`                     | more detailed timing, default yes <br> **If disabled some features like autotune will <br> not work anymore!** |
|  `--disable-band-to-full-blocking`       | build ELPA2 with blocking in band_to_full <br> (default: enabled) |
|  `--disable-mpi-module`                  | do not use the Fortran MPI module, <br> get interfaces by `include "mpif.h"` |
|  `--disable-generic`                     | do not build GENERIC kernels, default: enabled        |
|  `--enable-sparc64`                      | build SPARC64 kernels, default: disabled              |
|  `--disable-sse`                         | do not build SSE kernels, default: enabled            |
|  `--disable-sse-assembly`                | do not build SSE_ASSEMBLY kernels, default: enabled   |
|  `--disable-avx`                         | do not build AVX kernels, default: enabled            |
|  `--disable-avx2`                        | do not build AVX2 kernels, default: enabled           |
|  `--enable-avx512`                       | build AVX512 kernels, default: disabled               |
|  `--enable-gpu`                          | build GPU kernels, default: disabled                  |
|  `--enable-bgp`                          | build BGP kernels, default: disabled                  |
|  `--enable-bgq`                          | build BGQ kernels, default: disabled                  |
|  `--with-mpi=[yes\|no]`                  | compile with MPI. Default: yes                        |
|  `--with-cuda-path=PATH`                 | prefix where CUDA is installed [default=auto]         |
|  `--with-cuda-sdk-path=PATH`             | prefix where CUDA SDK is installed [default=auto]     |
|  `--with-GPU-compute-capability=VALUE`   | use compute capability VALUE for GPU version, <br> default: "sm_35" |
|  `--with-fixed-real-kernel=KERNEL`       | compile with only a single specific real kernel.      |
|  `--with-fixed-complex-kernel=KERNEL`    | compile with only a single specific complex kernel.   |
|  `--with-gpu-support-only`               | compile and always use the GPU version                |
|  `--with-likwid=[yes\|no\|PATH]`         | use the likwid tool to measure performance (has a performance impact!), default: no |
|  `--with-default-real-kernel=KERNEL`     | set the real kernel KERNEL as default                 |
|  `--with-default-complex-kernel=KERNEL`  | set the complex kernel KERNEL as default              |
|  `--enable-scalapack-tests`              | build SCALAPACK test cases for performance <br> comparison, needs MPI, default no. |
|  `--enable-autotune-redistribute-matrix` | EXPERIMENTAL FEATURE; NOT FULLY SUPPORTED YET: Allows ELPA during autotuning to re-distribute the matrix to find the best (ELPA internal) block size for block-cyclic distribution (needs ScaLAPACK functionality) |
|  `--enable-autotuning`                   | enables autotuning functionality, default yes         |
|  `--enable-c-tests`                      | enables the C tests for elpa, default yes             |
|  `--disable-assumed-size`                | do NOT use assumed-size Fortran arrays, default: use them |
|  `--disable-Fortran2008-features`        | disable Fortran 2008 if compiler does not support it  |
|  `--enable-python`                       | build and install python wrapper, default no          |
|  `--enable-python-tests`                 | enable python tests, default no.                      |
|  `--enable-skew-symmetric-support`       | enable support for real valued skew-symmetric matrices |
|  `--enable-store-build-config`           | stores the build config in the library object         |
|  `--enable-64bit-integer-math-support`   | assumes that BLAS/LAPACK/SCALAPACK use 64bit integers (experimental) |
|  `--enable-64bit-integer-mpi-support`    | assumes that MPI uses 64bit integers (experimental)   |
|  `--enable-heterogenous-cluster-support` | allows ELPA to run on clusters of nodes with different Intel CPUs (experimental) |

We recommend that you do not build *ELPA* in its main directory, but
in a sub-directory:

```
mkdir build
cd build

../configure [with all options needed for your system, see below]
```

In this way, you have a clean separation between the original *ELPA* source files and the compiled
object files.
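
The out-of-tree configure step can be wrapped in a small helper. This is only a sketch: the function name `configure_out_of_tree` is hypothetical, and any configure options are simply passed through unchanged:

```shell
# Configure ELPA in a separate build directory.
# Usage: configure_out_of_tree SRCDIR BUILDDIR [configure options...]
configure_out_of_tree() {
    srcdir=$(cd "$1" && pwd) || return 1   # absolute path to the sources
    builddir=$2
    shift 2                                # remaining arguments go to configure
    mkdir -p "$builddir" || return 1
    # Run configure from inside the build directory so that all generated
    # files and object files stay out of the source tree.
    ( cd "$builddir" && "$srcdir/configure" "$@" )
}
```

A possible call from the top of the source tree is `configure_out_of_tree . build FCFLAGS="-O2 -mavx" CFLAGS="-O2 -mavx"`, followed by `make -C build`, `make -C build check` and `make -C build install`.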

Please note that it is necessary to set the **compiler options**, like optimisation flags etc.,
for the Fortran and C part.
For example, something like this is a usual way: `./configure FCFLAGS="-O2 -mavx" CFLAGS="-O2 -mavx"`
For details, please have a look at the documentation for the compilers of your choice.

**Note** that most kernels can only be built if the correct compiler flags for this kernel (e.g. AVX-512)
have been enabled.

### Choice of building with or without MPI ###

It is possible to build the *ELPA* library with or without MPI support.

Normally *ELPA* is built with MPI, in order to speed up calculations by using distributed
parallelisation over several nodes. This is, however, only reasonable if the programs
calling the *ELPA* library are already MPI parallelized, and *ELPA* can use the same
block-cyclic distribution of data as in the calling program.

Programs which do not support MPI parallelisation can still make use of the *ELPA* library if it
has also been built without MPI support.

If you want to build *ELPA* with MPI support, please have a look at "A) Setting of MPI compiler and libraries".
For builds without MPI support, please have a look at "B) Building *ELPA* without MPI support".
**NOTE** that if *ELPA* is built without MPI support, it will be serial unless the OpenMP parallelization is
explicitly enabled.

Please note that it is fully supported to build both versions of the *ELPA* library
and to install them in the same directory.

#### A) Setting of MPI compiler and libraries ####

In the standard case *ELPA* needs an MPI compiler and MPI libraries. The configure script
will try to set this by itself. If, however, the compiler wrapper cannot be found
automatically on the build system, it is recommended to set it by hand with a variable, e.g.

```
configure FC=mpif90
```
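
If you are unsure which wrapper is available, a helper like the following can look it up. This is a hedged sketch: the function name `find_mpi_fortran_wrapper` is hypothetical, and the candidate names `mpifort`, `mpif90` and `mpiifort` are common wrapper names that may differ for your MPI installation:

```shell
# Print the name of the first MPI Fortran compiler wrapper found in PATH.
# The candidate list is an assumption; extend it for your MPI installation.
find_mpi_fortran_wrapper() {
    for w in mpifort mpif90 mpiifort; do
        if command -v "$w" >/dev/null 2>&1; then
            printf '%s\n' "$w"
            return 0
        fi
    done
    echo "no MPI Fortran wrapper found in PATH" >&2
    return 1
}
```

The result can then be handed to configure, e.g. `configure FC="$(find_mpi_fortran_wrapper)"`.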

In some cases, different MPI libraries and compilers are installed on your system. Then it might happen
that during the build step an error like "no module mpi" or "cannot open module mpi" is given.
You can prevent the *ELPA* library from using the MPI module (it then uses the MPI header files instead) by
adding

```
--disable-mpi-module
```

to the configure call.

Please continue reading at "Enabling GPU support".


#### B) Building *ELPA* without MPI support ####

If you want to build *ELPA* without MPI support, add

```
--with-mpi=no
```

to your configure call.

You have to specify which compilers should be used, e.g.

```
configure FC=gfortran --with-mpi=no
```

**DO NOT specify an MPI compiler here!**

Note that the installed *ELPA* library files will be suffixed with
`_onenode`, in order to distinguish this build from possible ones with MPI.


Please continue reading at "Enabling GPU support".

### Enabling GPU support ###

The *ELPA* library can be built with GPU support. If *ELPA* is built with GPU
support, users can choose at runtime whether to use the GPU version or not.

For GPU support, NVIDIA GPUs with compute capability >= 3.5 are needed.

GPU support is set with

```
--enable-gpu
```

It might be necessary to also set the following options (please see `configure --help`):

```
--with-cuda-path
--with-cuda-sdk-path
--with-GPU-compute-capability
```
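
These options can be combined into one configure call. The helper below is a sketch and not part of *ELPA*: the name `gpu_configure_options` and the sanity check are illustrative assumptions; only the option names, the default `sm_35`, and the minimum compute capability of 3.5 are taken from this guide:

```shell
# Print the configure options for a GPU build.
# Usage: gpu_configure_options CUDA_PATH [COMPUTE_CAPABILITY]
gpu_configure_options() {
    cuda_path=$1
    cc=${2:-sm_35}        # default value of --with-GPU-compute-capability
    num=${cc#sm_}         # numeric part, e.g. sm_70 -> 70
    case $num in
        ''|*[!0-9]*) num=invalid ;;
    esac
    if [ "$num" = invalid ] || [ "sm_$num" != "$cc" ]; then
        echo "expected a value like sm_35, got: $cc" >&2
        return 1
    fi
    # GPUs with compute capability >= 3.5 are needed.
    if [ "$num" -lt 35 ]; then
        echo "compute capability >= 3.5 is needed, got: $cc" >&2
        return 1
    fi
    printf '%s\n' "--enable-gpu --with-cuda-path=$cuda_path --with-GPU-compute-capability=$cc"
}
```

For example, `../configure $(gpu_configure_options /usr/local/cuda sm_70)` passes all three options at once; the CUDA path shown here is a placeholder for your installation.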

Please continue reading at "Enabling OpenMP support".


### Enabling OpenMP support ###

The *ELPA* library can be built with OpenMP support. This can be hybrid
MPI/OpenMP parallelization, if *ELPA* is built with MPI support (see A), or only
shared-memory parallelization, if *ELPA* is built without MPI support (see B).

To enable OpenMP support, add

    --enable-openmp

as configure option.

Note that, as in the case with/without MPI, you can also build and install versions of *ELPA*
with/without OpenMP support at the same time.

However, the GPU choice at runtime is not compatible with OpenMP support.

Please continue reading at "Standard libraries in default installation paths".


### Standard libraries in default installation paths ###

In order to build the *ELPA* library, some libraries are needed (depending on the settings during the
configure step).

Typically these are:
  - Basic Linear Algebra Subroutines (BLAS)                   (always needed)
  - LAPACK routines                                           (always needed)
  - Basic Linear Algebra Communication Subroutines (BLACS)    (only needed if MPI support was set)
  - ScaLAPACK routines                                        (only needed if MPI support was set)
  - a working MPI library                                     (only needed if MPI support was set)
  - a working OpenMP library                                  (only needed if OpenMP support was set)
  - a working CUDA/cuBLAS library                             (only needed if GPU support was set)

If the needed libraries are installed on the build system in standard paths (e.g. /usr/lib64),
in most cases the *ELPA* configure step will recognize the needed libraries
automatically. No setting of any library paths should be necessary.

If your configure step finishes successfully, please continue at "Choice of ELPA2 compute kernels".
If your configure step aborts, or you want to use libraries in non-standard paths, please continue at
"Non standard paths or non standard libraries".

### Non standard paths or non standard libraries ###

If standard libraries are installed in non-standard paths on the build system, or
special non-standard libraries (e.g. *Intel's MKL*) should be used, it might be necessary
to specify the appropriate link-line with the **SCALAPACK_LDFLAGS** and **SCALAPACK_FCFLAGS**
variables.

For example, for performance reasons it might be beneficial to use the *BLAS*, *BLACS*, *LAPACK*,
and *SCALAPACK* implementation from *Intel's MKL* library.

Together with the Intel Fortran Compiler the call to configure might then look like:

```
configure SCALAPACK_LDFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential \
                             -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -Wl,-rpath,$MKL_HOME/lib/intel64" \
          SCALAPACK_FCFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential \
                             -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -I$MKL_HOME/include/intel64/lp64"
```

and for *Intel MKL* together with *GNU gfortran*:

```
configure SCALAPACK_LDFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential \
                             -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -Wl,-rpath,$MKL_HOME/lib/intel64" \
          SCALAPACK_FCFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential \
                             -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -I$MKL_HOME/include/intel64/lp64"
```

For the correct link-line, please refer to the documentation of the corresponding library. In case of *Intel's MKL* we
suggest the [Intel Math Kernel Library Link Line Advisor](https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor).
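
The two link-lines above differ only in the MKL interface layer (`mkl_intel_lp64` for the Intel Fortran compiler, `mkl_gf_lp64` for GNU gfortran). As a sketch, the value for `SCALAPACK_LDFLAGS` can be generated by a small helper; the function name `mkl_scalapack_ldflags` and its arguments are hypothetical, and the library list is copied from the examples above rather than checked against your MKL version:

```shell
# Print the SCALAPACK_LDFLAGS value for sequential MKL with Intel MPI BLACS.
# Usage: mkl_scalapack_ldflags intel|gnu [MKL_HOME]
mkl_scalapack_ldflags() {
    case $1 in
        intel) iface=mkl_intel_lp64 ;;   # interface layer for the Intel Fortran compiler
        gnu)   iface=mkl_gf_lp64 ;;      # interface layer for GNU gfortran
        *) echo "usage: mkl_scalapack_ldflags intel|gnu [MKL_HOME]" >&2
           return 1 ;;
    esac
    # Use the second argument if given, otherwise the MKL_HOME variable.
    libdir=${2:-$MKL_HOME}/lib/intel64
    printf '%s %s\n' \
        "-L$libdir -lmkl_scalapack_lp64 -l$iface -lmkl_sequential" \
        "-lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -Wl,-rpath,$libdir"
}
```

A call such as `configure SCALAPACK_LDFLAGS="$(mkl_scalapack_ldflags gnu)"`, with `SCALAPACK_FCFLAGS` set as in the examples above, then corresponds to the GNU gfortran case.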


### Choice of ELPA2 compute kernels ###

ELPA 2stage can be used with different implementations of compute-intensive kernels, which are architecture dependent.
Some kernels (all for x86_64 architectures) are enabled by default (and must be disabled if you do not want them),
others are disabled by default and must be enabled if they are wanted.

One can enable "kernel classes" by setting e.g.

```
--enable-avx2
```

This will try to build all the AVX2 kernels. Please see `configure --help` for all options.

With

```
--disable-avx2
```

one can choose not to build the AVX2 kernels.


During the configure step all possible kernels will be printed, together with whether they will be enabled or not.

It is possible to build *ELPA* with as many kernels as desired; the user can then choose at runtime which
kernels should be used.

If this is not desired, it is possible to build *ELPA* with only one (not necessarily the same) kernel for the
real and complex valued case, respectively. This can be done with the `--with-fixed-real-kernel=NAME` or
`--with-fixed-complex-kernel=NAME` configure options. For details please see `configure --help`.
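
To find the kernel-related options without reading the full help text, the output can be filtered. A small sketch (the helper name `list_kernel_options` is hypothetical, and the exact lines printed depend on your *ELPA* version):

```shell
# Print all kernel-related lines of the configure help text.
# Usage: list_kernel_options /path/to/configure
list_kernel_options() {
    # -i: match "kernel" and "KERNEL"; "--" ends grep's own option parsing.
    "$1" --help | grep -i -- 'kernel'
}
```

Called as `list_kernel_options ../configure` from a build sub-directory, this shows, among others, the `--with-fixed-real-kernel` and `--with-fixed-complex-kernel` options.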

#### Cross compilation ####

The ELPA library does _not_ support cross-compilation by itself, i.e. compilation of the ELPA library on an architecture which is not
identical to the architecture ELPA should be used on.

Whenever a cross-compilation situation might occur, great care has to be taken during the build process by the user.

At the moment we see two potential pitfalls:

1.) The "build architecture" is inferior to the "target" architecture (w.r.t. the instruction sets)

In this case, at the moment, the ELPA library can only be built with instruction sets supported on the build
system. All later instruction sets will _not_ be used in the compilation. This case might lead to less optimal
performance compared to the case that ELPA is built directly on the target system.

For example, if the "build architecture" consists of a Haswell node (supporting up to Intel's AVX2 instruction set) and the
"target architecture" is a Skylake node (supporting Intel's AVX-512 instruction set), then the AVX-512 kernels cannot be built.
This will lead to a performance degradation on the Skylake nodes, but is otherwise harmless (no crashes).


2.) The "build architecture" is superior to the "target" architecture (w.r.t. the instruction sets)

This case is a critical one, since ELPA will by default be built with instruction sets which are not supported on the target
system. This will lead to crashes, if during the build the user does not take care to solve this issue.

For example, if the "build architecture" supports Intel's AVX2 instruction set and the
"target architecture" only supports Intel's AVX instruction set, then by default ELPA will be built with the AVX2 instruction set,
and this will also be used at runtime (since it improves the performance). However, since the target system does not support
AVX2 instructions, this will lead to a crash.

One can avoid this unfortunate situation by disabling instruction sets which are _not_ supported on the target system.
In the case above, setting

```
--disable-avx2
```

during the build will remedy this problem.
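
One way to find out which kernel classes have to be disabled is to inspect the CPU flags of the target system. The following is only a sketch under assumptions: the helper name `kernel_disable_options` is hypothetical, the target is assumed to be a Linux x86_64 node whose `/proc/cpuinfo` has been copied to the build system, and only the AVX and AVX2 kernel classes are considered:

```shell
# Print a --disable-* option for every kernel class (AVX, AVX2) that the
# target CPU does not support, one option per line.
# Usage: kernel_disable_options CPUINFO_FILE
#   CPUINFO_FILE: a copy of /proc/cpuinfo taken on a target node
kernel_disable_options() {
    # Take the first "flags" line and pad it with blanks, so that every
    # flag is delimited by a space on both sides.
    flags=" $(grep '^flags' "$1" | head -n 1 | cut -d: -f2) "
    for isa in avx avx2; do
        case $flags in
            *" $isa "*) ;;                              # supported on the target
            *) printf '%s\n' "--disable-$isa" ;;        # not supported: do not build
        esac
    done
}
```

The output can be passed on directly, e.g. `../configure $(kernel_disable_options target_cpuinfo.txt)`, where `target_cpuinfo.txt` is a placeholder file name.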


### Doxygen documentation ###

Doxygen documentation can be created with the `--enable-doxygen-doc` configure option.

### Some examples ###

#### Intel cores supporting AVX2 (Haswell and newer) ####

We recommend that you build ELPA with the Intel compiler (if available) for the Fortran part, but
with the GNU compiler for the C part.

1. Building with Intel Fortran compiler and GNU C compiler:

Remarks:
  - you have to know the name of the Intel Fortran compiler wrapper
  - you do not have to specify a C compiler (with CC); the GNU C compiler is recognized automatically
  - you should specify compiler flags for the Intel Fortran compiler; in the example only `-O3 -xAVX2` is set
  - you should be careful with the CFLAGS; the example shows typical flags

```
FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -xAVX2" CFLAGS="-O3 -march=native -mavx2 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"
```

2. Building with GNU Fortran compiler and GNU C compiler:

Remarks:
  - you have to know the name of the GNU Fortran compiler wrapper
  - you DO have to specify a C compiler (with CC); the GNU C compiler is recognized automatically
  - you should specify compiler flags for the GNU Fortran compiler; in the example only `-O3 -march=native -mavx2 -mfma` is set
  - you should be careful with the CFLAGS; the example shows typical flags

```
FC=mpi_wrapper_for_gnu_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -march=native -mavx2 -mfma" CFLAGS="-O3 -march=native -mavx2 -mfma  -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"
```

3. Building with Intel Fortran compiler and Intel C compiler:

Remarks:
  - you have to know the name of the Intel Fortran compiler wrapper
  - you have to specify the Intel C compiler
  - you should specify compiler flags for the Intel Fortran compiler; in the example only `-O3 -xAVX2` is set
  - you should be careful with the CFLAGS; the example shows typical flags

    FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_intel_C_compiler ./configure FCFLAGS="-O3 -xAVX2" CFLAGS="-O3 -xAVX2" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"

