... | ... | @@ -2,110 +2,107 @@ |
|
|
|
|
|
## Preamble ##
|
|
|
|
|
|
This file provides documentation on how to build the *ELPA* library in **version ELPA-2019.11.001.rc1**.
|
|
|
This file provides documentation on how to build the *ELPA* library in **version ELPA-2020.05.001**.
|
|
|
With release of **version ELPA-2017.05.001** the build process has been significantly simplified,
|
|
|
which makes it easier to install the *ELPA* library.
|
|
|
|
|
|
The old, obsolete legacy API will be deprecated in the future!
|
|
|
Already now, all new features of ELPA are available only with the new API. Thus, there
|
|
|
is no reason to keep the legacy API around for too long.
|
|
|
|
|
|
The release ELPA 2018.11.001 was the last release, where the legacy API has been
|
|
|
enabled by default (and can be disabled at build time).
|
|
|
With release ELPA 2019.05.001 the legacy API is disabled by default, however,
|
|
|
can be still switched on at build time.
|
|
|
With this release ELPA 2019.11.001, the legacy API will be deprecated and not supported anymore.
|
|
|
With the release ELPA 2019.11.001, the legacy API has been deprecated and the support has been closed.
|
|
|
|
|
|
The release of ELPA 2019.11.001.rc1 does change the API and ABI compared to the release 2019.05.002, since
|
|
|
the legacy API has been droped.
|
|
|
The release of ELPA 2020.05.001 does change the API and ABI compared to the release 2019.11.001, since
|
|
|
the legacy API has been dropped.
|
|
|
|
|
|
## How to install *ELPA* ##
|
|
|
|
|
|
First of all, if you do not want to build *ELPA* yourself, and you run Linux,
|
|
|
it is worth having a look at the [*ELPA* webpage*] (http://elpa.mpcdf.mpg.de)
|
|
|
it is worth having a look at the [*ELPA* webpage](http://elpa.mpcdf.mpg.de)
|
|
|
and/or the repositories of your Linux distribution: there exist
|
|
|
pre-built packages for a number of Linux distributions like Fedora,
|
|
|
Debian, and OpenSuse. More will hopefully follow in the future.
|
|
|
|
|
|
If you want to build (or have to since no packages are available) *ELPA* yourself,
|
|
|
please note that *ELPA* is shipped with a typical "configure" and "make"
|
|
|
please note that *ELPA* is shipped with a typical `configure` and `make`
|
|
|
autotools procedure. This is the **only supported way** to build and install *ELPA*.
|
|
|
|
|
|
|
|
|
If you obtained *ELPA* from the official git repository, you will not find
|
|
|
the needed configure script! You will have to create the configure scipt with autoconf.
|
|
|
the needed configure script! You will have to create the configure script with autoconf.
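As a minimal sketch of that step: the helper below regenerates `configure` only when it is missing. The function name `ensure_configure` and the example path are hypothetical, and the sketch assumes that `autoreconf` (part of autoconf) together with automake and libtool is installed.

```shell
# Sketch: make sure a configure script exists in an ELPA source tree.
# Assumption: autoreconf (from autoconf), automake and libtool are installed.
ensure_configure() {
    src_dir="$1"
    if [ -x "$src_dir/configure" ]; then
        echo "configure already present in $src_dir"
    else
        # Regenerate configure and the auxiliary autotools files.
        (cd "$src_dir" && autoreconf --install) || return 1
        echo "configure generated in $src_dir"
    fi
}

# Usage (hypothetical path): ensure_configure "$HOME/src/elpa"
```

Release tarballs normally ship a ready-made `configure`, in which case the helper does nothing.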
|
|
|
|
|
|
|
|
|
## (A): Installing *ELPA* as library with configure ##
|
|
|
|
|
|
*ELPA* can be installed with the build steps
|
|
|
- configure
|
|
|
- make
|
|
|
- make check | or make check CHECK_LEVEL=extended
|
|
|
- make install
|
|
|
- `configure`
|
|
|
- `make`
|
|
|
- `make check` | or `make check CHECK_LEVEL=extended`
|
|
|
- `make install`
|
|
|
|
|
|
Please look at configure --help for all available options.
|
|
|
Please look at `configure --help` for all available options.
|
|
|
|
|
|
An excerpt of the most important (*ELPA* specific) options reads as follows:
|
|
|
|
|
|
| configure option | description |
|
|
|
|:------------------------------------ |:----------------------------------------------------- |
|
|
|
| --enable-legacy-interface | build legacy API, will not be build as default |
|
|
|
| --enable-optional-argument-in-C-API | treat error arguments in C-API as optional |
|
|
|
| --enable-openmp | use OpenMP threading, default no. |
|
|
|
| --enable-redirect | for ELPA test programs, allow redirection of <br> stdout/stderr per MPI taks in a file <br> (useful for timing), default no. |
|
|
|
| --enable-single-precision | build with single precision version |
|
|
|
| --disable-timings | more detailed timing, default yes <br> **If disabled some features like autotune will <br> not work anymmore !** |
|
|
|
| --disable-band-to-full-blocking | build ELPA2 with blocking in band_to_full <br> (default:enabled) |
|
|
|
| --disable-mpi-module | do not use the Fortran MPI module, <br> get interfaces by 'include "mpif.h') |
|
|
|
| --disable-generic | do not build GENERIC kernels, default: enabled |
|
|
|
| --enable-sparc64 | do not build SPARC64 kernels, default: disabled |
|
|
|
| --disable-sse | do not build SSE kernels, default: enabled |
|
|
|
| --disable-sse-assembly | do not build SSE_ASSEMBLY kernels, default: enabled |
|
|
|
| --disable-avx | do not build AVX kernels, default: enabled |
|
|
|
| --disable-avx2 | do not build AVX2 kernels, default: enabled |
|
|
|
| --enable-avx512 | build AVX512 kernels, default: disabled |
|
|
|
| --enable-gpu | build GPU kernels, default: disabled |
|
|
|
| --enable-bgp | build BGP kernels, default: disabled |
|
|
|
| --enable-bgq | build BGQ kernels, default: disabled |
|
|
|
| --with-mpi=[yes|no] | compile with MPI. Default: yes |
|
|
|
| --with-cuda-path=PATH | prefix where CUDA is installed [default=auto] |
|
|
|
| --with-cuda-sdk-path=PATH | prefix where CUDA SDK is installed [default=auto] |
|
|
|
| --with-GPU-compute-capability=VALUE | use compute capability VALUE for GPU version, <br> default: "sm_35" |
|
|
|
| --with-fixed-real-kernel=KERNEL | compile with only a single specific real kernel. |
|
|
|
| --with-fixed-complex-kernel=KERNEL | compile with only a single specific complex kernel. |
|
|
|
| --with-gpu-support-only | Compile and always use the GPU version |
|
|
|
| --with-likwid=[yes|no|PATH] | use the likwid tool to measure performance (has an performance impact!), default: no |
|
|
|
| --with-default-real-kernel=KERNEL | set the real kernel KERNEL as default |
|
|
|
| --with-default-complex-kernel=KERNEL| set the compplex kernel KERNEL as default |
|
|
|
| --enable-scalapack-tests | build SCALAPACK test cases for performance <br> omparison, needs MPI, default no. |
|
|
|
| --enable-autotuning | enables autotuning functionality, default yes |
|
|
|
| --enable-c-tests | enables the C tests for elpa, default yes |
|
|
|
| --disable-assumed-size | do NOT use assumed-size Fortran arrays. default use |
|
|
|
| --enable-scalapack-tests | build also ScalaPack tests for performance comparison; needs MPI |
|
|
|
| --disable-Fortran2008-features | disable Fortran 2008 if compiler does not support it |
|
|
|
| --enable-pyhton | build and install python wrapper, default no |
|
|
|
| --enable-python-tests | enable python tests, default no. |
|
|
|
| --enable-skew-symmetric-support | enable support for real valued skew-symmetric matrices |
|
|
|
| --enable-store-build-config | stores the build config in the library object |
|
|
|
| --64bit-integer-math-support | assumes that BLAS/LAPACK/SCALAPACK use 64bit integers (experimentatl) |
|
|
|
| --64bit-integer-mpi-support | assumes that MPI uses 64bit integers (experimental) |
|
|
|
| --heterogenous-cluster-support | allows ELPA to run on clusters of nodes with different Intel CPUs (experimental) |
|
|
|
| `--enable-legacy-interface` | build legacy API, not built by default |
|
|
|
| `--enable-optional-argument-in-C-API` | treat error arguments in C-API as optional |
|
|
|
| `--enable-openmp` | use OpenMP threading, default no. |
|
|
|
| `--enable-redirect` | for ELPA test programs, allow redirection of <br> stdout/stderr per MPI task to a file <br> (useful for timing), default no. |
|
|
|
| `--enable-single-precision` | build with single precision version |
|
|
|
| `--disable-timings` | more detailed timing, default yes <br> **If disabled, some features like autotune will <br> not work anymore!** |
|
|
|
| `--disable-band-to-full-blocking` | build ELPA2 with blocking in band_to_full <br> (default:enabled) |
|
|
|
| `--disable-mpi-module` | do not use the Fortran MPI module, <br> get interfaces by `include "mpif.h"` |
|
|
|
| `--disable-generic` | do not build GENERIC kernels, default: enabled |
|
|
|
| `--enable-sparc64` | build SPARC64 kernels, default: disabled |
|
|
|
| `--disable-sse` | do not build SSE kernels, default: enabled |
|
|
|
| `--disable-sse-assembly` | do not build SSE_ASSEMBLY kernels, default: enabled |
|
|
|
| `--disable-avx` | do not build AVX kernels, default: enabled |
|
|
|
| `--disable-avx2` | do not build AVX2 kernels, default: enabled |
|
|
|
| `--enable-avx512` | build AVX512 kernels, default: disabled |
|
|
|
| `--enable-gpu` | build GPU kernels, default: disabled |
|
|
|
| `--enable-bgp` | build BGP kernels, default: disabled |
|
|
|
| `--enable-bgq` | build BGQ kernels, default: disabled |
|
|
|
| `--with-mpi=[yes|no]` | compile with MPI. Default: yes |
|
|
|
| `--with-cuda-path=PATH` | prefix where CUDA is installed [default=auto] |
|
|
|
| `--with-cuda-sdk-path=PATH` | prefix where CUDA SDK is installed [default=auto] |
|
|
|
| `--with-GPU-compute-capability=VALUE` | use compute capability VALUE for GPU version, <br> default: "sm_35" |
|
|
|
| `--with-fixed-real-kernel=KERNEL` | compile with only a single specific real kernel. |
|
|
|
| `--with-fixed-complex-kernel=KERNEL` | compile with only a single specific complex kernel. |
|
|
|
| `--with-gpu-support-only` | Compile and always use the GPU version |
|
|
|
| `--with-likwid=[yes|no|PATH]` | use the likwid tool to measure performance (has a performance impact!), default: no |
|
|
|
| `--with-default-real-kernel=KERNEL` | set the real kernel KERNEL as default |
|
|
|
| `--with-default-complex-kernel=KERNEL` | set the complex kernel KERNEL as default |
|
|
|
| `--enable-scalapack-tests` | build SCALAPACK test cases for performance <br> comparison, needs MPI, default no. |
|
|
|
| `--enable-autotune-redistribute-matrix` | EXPERIMENTAL FEATURE; NOT FULLY SUPPORTED YET: Allows ELPA during autotuning to re-distribute the matrix to find the best (ELPA internal) block size for block-cyclic distribution (needs Scalapack functionality) |
|
|
|
| `--enable-autotuning` | enables autotuning functionality, default yes |
|
|
|
| `--enable-c-tests` | enables the C tests for elpa, default yes |
|
|
|
| `--disable-assumed-size` | do NOT use assumed-size Fortran arrays. default use |
|
|
|
| `--enable-scalapack-tests` | build also ScalaPack tests for performance comparison; needs MPI |
|
|
|
| `--disable-Fortran2008-features` | disable Fortran 2008 if compiler does not support it |
|
|
|
| `--enable-python` | build and install python wrapper, default no |
|
|
|
| `--enable-python-tests` | enable python tests, default no. |
|
|
|
| `--enable-skew-symmetric-support` | enable support for real valued skew-symmetric matrices |
|
|
|
| `--enable-store-build-config` | stores the build config in the library object |
|
|
|
| `--64bit-integer-math-support` | assumes that BLAS/LAPACK/SCALAPACK use 64bit integers (experimental) |
|
|
|
| `--64bit-integer-mpi-support` | assumes that MPI uses 64bit integers (experimental) |
|
|
|
| `--heterogenous-cluster-support` | allows ELPA to run on clusters of nodes with different Intel CPUs (experimental) |
|
|
|
|
|
|
We recommend that you do not build ELPA in its main directory but that you build it
in a sub-directory:
|
|
|
|
|
|
```
|
|
|
mkdir build
|
|
|
cd build
|
|
|
|
|
|
../configure [with all options needed for your system, see below]
|
|
|
```
|
|
|
|
|
|
In this way, you have a clean separation between original *ELPA* source files and the compiled
|
|
|
object files.
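Putting the build steps and the sub-directory recommendation together, a complete out-of-tree build might be sketched as follows. The helper names `run` and `build_elpa`, the install prefix and the compiler flags are placeholders rather than part of ELPA. With `DRY_RUN=1` the commands are only printed.

```shell
# Sketch of a complete out-of-tree build; run it from the top of the ELPA source tree.
# Assumptions: the install prefix and the compiler flags are placeholders.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# The body runs in a subshell, so the "cd build" does not leak to the caller.
build_elpa() (
    prefix="$1"
    run mkdir -p build || exit 1
    [ "${DRY_RUN:-0}" = "1" ] || cd build || exit 1
    run ../configure --prefix="$prefix" FCFLAGS="-O2 -mavx" CFLAGS="-O2 -mavx" &&
    run make &&
    run make check &&
    run make install
)

# Usage (hypothetical prefix): build_elpa "$HOME/opt/elpa"
```

Chaining the steps with `&&` stops the sequence at the first failing step, so a failed `make check` never leads to an installation.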
|
|
|
|
|
|
Please note that it is necessary to set the **compiler options**, such as optimisation flags,
for the Fortran and C part.
|
|
|
For example sth. like this is a usual way: ./configure FCFLAGS="-O2 -mavx" CFLAGS="-O2 -mavx"
|
|
|
For example, a typical call looks like this: `./configure FCFLAGS="-O2 -mavx" CFLAGS="-O2 -mavx"`
|
|
|
For details, please have a look at the documentation for the compilers of your choice.
|
|
|
|
|
|
**Note** that most kernels can only be built if the correct compiler flags for this kernel (e.g. AVX-512)
|
... | ... | @@ -118,7 +115,7 @@ It is possible to build the *ELPA* library with or without MPI support. |
|
|
|
|
|
Normally *ELPA* is built with MPI, in order to speed up calculations by using distributed
|
|
|
parallelisation over several nodes. This is, however, only reasonable if the programs
|
|
|
calling the *ELPA* library are already MPI parallized, and *ELPA* can use the same
|
|
|
calling the *ELPA* library are already MPI parallelized, and *ELPA* can use the same
|
|
|
block-cyclic distribution of data as in the calling program.
|
|
|
|
|
|
Programs which do not support MPI parallelisation can still make use of the *ELPA* library if it
|
... | ... | @@ -126,7 +123,7 @@ has also been build without MPI support. |
|
|
|
|
|
If you want to build *ELPA* with MPI support, please have a look at "A) Setting of MPI compiler and libraries".
|
|
|
For builds without MPI support, please have a look at "B) Building *ELPA* without MPI support".
|
|
|
**NOTE** that if *ELPA* is build without MPI support, it will be serial unless the OpenMP parallization is
|
|
|
**NOTE** that if *ELPA* is built without MPI support, it will be serial unless the OpenMP parallelization is
|
|
|
explicitly enabled.
|
|
|
|
|
|
Please note that it is fully supported that both versions of the *ELPA* library are built
|
... | ... | @@ -138,14 +135,18 @@ In the standard case *ELPA* needs a MPI compiler and MPI libraries. The configur |
|
|
will try to set this by itself. If, however, on the build system the compiler wrapper
|
|
|
cannot be found automatically, it is recommended to set it by hand with a variable, e.g.
|
|
|
|
|
|
```
|
|
|
configure FC=mpif90
|
|
|
```
|
|
|
|
|
|
In some cases, on your system different MPI libraries and compilers are installed. Then it might happen
|
|
|
that during the build step an error like "no module mpi" or "cannot open module mpi" is given.
|
|
|
You can prevent the *ELPA* library from using the MPI module (so that it uses MPI header files instead) by
|
|
|
adding
|
|
|
|
|
|
```
|
|
|
--disable-mpi-module
|
|
|
```
|
|
|
|
|
|
to the configure call.
|
|
|
|
... | ... | @@ -156,18 +157,22 @@ Please continue reading at "C) Enabling GPU support" |
|
|
|
|
|
If you want to build *ELPA* without MPI support, add
|
|
|
|
|
|
```
|
|
|
--with-mpi=no
|
|
|
```
|
|
|
|
|
|
to your configure call.
|
|
|
|
|
|
You have to specify which compilers should be used with e.g.,
|
|
|
|
|
|
```
|
|
|
configure FC=gfortran --with-mpi=no
|
|
|
```
|
|
|
|
|
|
**DO NOT specify an MPI compiler here!**
|
|
|
|
|
|
Note that the installed *ELPA* library files will be suffixed with
|
|
|
"_onenode", in order to discriminate this build from possible ones with MPI.
|
|
|
`_onenode`, in order to distinguish this build from possible ones with MPI.
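Since both variants can be built, a side-by-side setup might look like the sketch below. The compiler names `mpif90` and `gfortran` and the directory names are assumptions; replace them with what is available on your system.

```shell
# Sketch: build the MPI and the non-MPI variant side by side.
# Assumptions: mpif90 and gfortran are the compilers on your system;
# the directory names are arbitrary. Run from the top of the ELPA source tree.
mkdir -p build-mpi build-onenode

# Variant with MPI support (uses the MPI compiler wrapper).
(cd build-mpi && ../configure FC=mpif90 && make)

# Variant without MPI support (plain Fortran compiler, no MPI wrapper).
(cd build-onenode && ../configure FC=gfortran --with-mpi=no && make)
```

Each variant gets its own build directory, so the object files of the two builds never mix.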
|
|
|
|
|
|
|
|
|
Please continue reading at "C) Enabling GPU support"
|
... | ... | @@ -181,13 +186,17 @@ For GPU support, NVIDIA GPUs with compute capability >= 3.5 are needed. |
|
|
|
|
|
GPU support is set with
|
|
|
|
|
|
```
|
|
|
--enable-gpu
|
|
|
```
|
|
|
|
|
|
It might be necessary to also set the options (please see configure --help)
|
|
|
|
|
|
```
|
|
|
--with-cuda-path
|
|
|
--with-cuda-sdk-path
|
|
|
--with-GPU-compute-capability
|
|
|
```
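A GPU-enabled configure call combining these options could be sketched as follows. The CUDA path and the compute capability value are assumptions for illustration; pick the values that match your installation and GPU.

```shell
# Sketch: GPU build with the CUDA location and compute capability set explicitly.
# Assumptions: CUDA lives under /usr/local/cuda and the GPU supports sm_70;
# adjust both to your system.
../configure --enable-gpu \
    --with-cuda-path=/usr/local/cuda \
    --with-GPU-compute-capability=sm_70
```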
|
|
|
|
|
|
Please continue reading at "D) Enabling OpenMP support".
|
|
|
|
... | ... | @@ -246,21 +255,24 @@ and *SCALAPACK* implementation from *Intel's MKL* library. |
|
|
|
|
|
Together with the Intel Fortran Compiler the call to configure might then look like:
|
|
|
|
|
|
```
|
|
|
configure SCALAPACK_LDFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential \
|
|
|
-lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -Wl,-rpath,$MKL_HOME/lib/intel64" \
|
|
|
SCALAPACK_FCFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential \
|
|
|
-lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -I$MKL_HOME/include/intel64/lp64"
|
|
|
```
|
|
|
|
|
|
and for *INTEL MKL* together with *GNU GFORTRAN*:
|
|
|
|
|
|
```
|
|
|
configure SCALAPACK_LDFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential \
|
|
|
-lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -Wl,-rpath,$MKL_HOME/lib/intel64" \
|
|
|
SCALAPACK_FCFLAGS="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential \
|
|
|
-lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -I$MKL_HOME/include/intel64/lp64"
|
|
|
|
|
|
```
|
|
|
|
|
|
For the correct link line, please refer to the documentation of the corresponding library. In the case of *Intel's MKL* we
|
|
|
suggest the [Intel Math Kernel Library Link Line Advisor] (https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor).
|
|
|
suggest the [Intel Math Kernel Library Link Line Advisor](https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor).
|
|
|
|
|
|
|
|
|
### Choice of ELPA2 compute kernels ###
|
... | ... | @@ -271,14 +283,17 @@ others are disabled by default and must be enabled if they are wanted. |
|
|
|
|
|
One can enable "kernel classes" by setting e.g.
|
|
|
|
|
|
```
|
|
|
--enable-avx2
|
|
|
|
|
|
```
|
|
|
|
|
|
This will try to build all the AVX2 kernels. Please see `configure --help` for all options.
|
|
|
|
|
|
With
|
|
|
|
|
|
```
|
|
|
--disable-avx2
|
|
|
```
|
|
|
|
|
|
one can choose not to build the AVX2 kernels.
|
|
|
|
... | ... | @@ -289,8 +304,8 @@ It is possible to build *ELPA* with as many kernels as desired, the user can the |
|
|
kernels should be used.
|
|
|
|
|
|
If this is not desired, it is possible to build *ELPA* with only one (not necessarily the same) kernel for the
|
|
|
real and complex valued case, respectively. This can be done with the "--with-fixed-real-kernel=NAME" or
|
|
|
"--with-fixed-complex-kernel=NAME" configure options. For details please do a "configure --help"
|
|
|
real and complex valued case, respectively. This can be done with the `--with-fixed-real-kernel=NAME` or
|
|
|
`--with-fixed-complex-kernel=NAME` configure options. For details, please run `configure --help`.
|
|
|
|
|
|
#### Cross compilation ####
|
|
|
|
... | ... | @@ -325,13 +340,15 @@ AVX-2 instructions this will lead to a crash. |
|
|
One can avoid this unfortunate situation by disabling instruction sets which are _not_ supported on the target system.
|
|
|
In the case above, setting
|
|
|
|
|
|
```
|
|
|
--disable-avx2
|
|
|
```
|
|
|
|
|
|
during the build will remedy this problem.
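One way to choose these switches is to look at the CPU feature flags of the target machine. The sketch below is a hypothetical helper (`elpa_simd_flags` is not part of ELPA) and assumes a Linux x86 target, where the feature list appears on the `flags` line of `/proc/cpuinfo`.

```shell
# Sketch: derive ELPA configure switches from a CPU flag list.
# Assumption: the list comes from the "flags" line of /proc/cpuinfo on the
# TARGET machine (Linux, x86). elpa_simd_flags is a hypothetical helper.
elpa_simd_flags() {
    cpu_flags=" $1 "
    opts=""
    # AVX and AVX2 kernels are enabled by default: switch them off if unsupported.
    case "$cpu_flags" in *" avx "*) ;; *) opts="$opts --disable-avx" ;; esac
    case "$cpu_flags" in *" avx2 "*) ;; *) opts="$opts --disable-avx2" ;; esac
    # AVX-512 kernels are disabled by default: switch them on if supported.
    case "$cpu_flags" in *" avx512f "*) opts="$opts --enable-avx512" ;; esac
    # Strip the leading blank before printing.
    echo "${opts# }"
}
```

On the target machine, the flag list can be obtained with `grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2`, and the helper's output is then appended to the configure call on the build machine.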
|
|
|
|
|
|
|
|
|
### Doxygen documentation ###
|
|
|
A doxygen documentation can be created with the "--enable-doxygen-doc" configure option
|
|
|
Doxygen documentation can be created with the `--enable-doxygen-doc` configure option.
|
|
|
|
|
|
### Some examples ###
|
|
|
|
... | ... | @@ -345,22 +362,24 @@ with GNU compiler for the C part. |
|
|
Remarks:
|
|
|
- you have to know the name of the Intel Fortran compiler wrapper
|
|
|
- you do not have to specify a C compiler (with CC); GNU C compiler is recognized automatically
|
|
|
- you should specify compiler flags for Intel Fortran compiler; in the example only "-O3 -xAVX2" is set
|
|
|
- you should specify compiler flags for Intel Fortran compiler; in the example only `-O3 -xAVX2` is set
|
|
|
- you should be careful with the CFLAGS, the example shows typical flags
|
|
|
|
|
|
```
|
|
|
FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -xAVX2" CFLAGS="-O3 -march=native -mavx2 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"
|
|
|
|
|
|
```
|
|
|
|
|
|
2. Building with GNU Fortran compiler and GNU C compiler:
|
|
|
|
|
|
Remarks:
|
|
|
- you have to know the name of the GNU Fortran compiler wrapper
|
|
|
- you DO have to specify a C compiler (with CC); GNU C compiler is recognized automatically
|
|
|
- you should specify compiler flags for GNU Fortran compiler; in the example only "-O3 -march=native -mavx2 -mfma" is set
|
|
|
- you should specify compiler flags for GNU Fortran compiler; in the example only `-O3 -march=native -mavx2 -mfma` is set
|
|
|
- you should be careful with the CFLAGS, the example shows typical flags
|
|
|
|
|
|
```
|
|
|
FC=mpi_wrapper_for_gnu_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -march=native -mavx2 -mfma" CFLAGS="-O3 -march=native -mavx2 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"
|
|
|
|
|
|
```
|
|
|
|
|
|
3. Building with Intel Fortran compiler and Intel C compiler:
|
|
|
|
... | ... | |