Update wiki authored by Andreas Marek
...@@ -2,7 +2,7 @@
## Preamble ##
This file provides documentation on how to build the *ELPA* library in **version ELPA-2021.05.001.rc1**.
With the release of **version ELPA-2017.05.001**, the build process was significantly simplified,
which makes it easier to install the *ELPA* library.
...@@ -10,7 +10,7 @@ The release ELPA 2018.11.001 was the last release, where the legacy API has been
enabled by default (and can be disabled at build time).
With the release ELPA 2019.11.001, the legacy API has been deprecated and the support has been closed.
The release of ELPA 2021.05.001.rc1 does change the API and ABI compared to the release 2019.11.001, since
the legacy API has been dropped.
## How to install *ELPA* ##
...@@ -27,7 +27,7 @@ autotools procedure. This is the **only supported way** how to build and install
If you obtained *ELPA* from the official git repository, you will not find
the needed configure script! You will have to create the configure script with autoconf. You can also run the `autogen.sh` script that does this step for you.
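
The check described above can be sketched as a small shell helper. This is only an illustration: the function name `ensure_configure` is hypothetical and not part of *ELPA*, and it assumes that the autotools are installed and that `autogen.sh` sits in the top-level directory of the git checkout.

```shell
# Sketch: make sure a configure script exists before starting the build.
# ensure_configure is a hypothetical helper name, not part of ELPA.
# Argument 1: top-level directory of the ELPA sources (default: current dir).
ensure_configure() {
    src_dir="${1:-.}"
    if [ -x "$src_dir/configure" ]; then
        # release tarballs already ship the configure script
        echo "configure already present"
    elif [ -x "$src_dir/autogen.sh" ]; then
        # git checkouts only ship autogen.sh, which creates configure
        (cd "$src_dir" && ./autogen.sh) >/dev/null || return 1
        echo "configure generated"
    else
        echo "neither configure nor autogen.sh found in $src_dir" >&2
        return 1
    fi
}
```

Calling `ensure_configure .` from the top-level source directory before `./configure` keeps the same build script usable for both release tarballs and git checkouts.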
## (A): Installing *ELPA* as library with configure ##
...@@ -62,7 +62,10 @@ An excerpt of the most important (*ELPA* specific) options reads as follows:
| `--enable-sve128` | Experimental feature: build ARM SVE128 kernels, default: disabled |
| `--enable-sve256` | Experimental feature: build ARM SVE256 kernels, default: disabled |
| `--enable-sve512` | Experimental feature: build ARM SVE512 kernels, default: disabled |
| `--enable-nvidia-gpu` | build NVIDIA GPU kernels, default: disabled |
| `--enable-gpu` | same as `--enable-nvidia-gpu` |
| `--enable-amd-gpu` | EXPERIMENTAL: build AMD GPU kernels, default: disabled |
| `--enable-intel-gpu` | VERY EXPERIMENTAL: build INTEL GPU kernels, default: disabled |
| `--enable-bgp` | build BGP kernels, default: disabled |
| `--enable-bgq` | build BGQ kernels, default: disabled |
| `--with-mpi=[yes|no]` | compile with MPI. Default: yes |
...@@ -71,7 +74,9 @@ An excerpt of the most important (*ELPA* specific) options reads as follows:
| `--with-GPU-compute-capability=VALUE` | use compute capability VALUE for GPU version, <br> default: "sm_35" |
| `--with-fixed-real-kernel=KERNEL` | compile with only a single specific real kernel. |
| `--with-fixed-complex-kernel=KERNEL` | compile with only a single specific complex kernel. |
| `--with-nvidia-gpu-support-only` | Compile and always use the NVIDIA GPU version |
| `--with-amd-gpu-support-only` | EXPERIMENTAL: Compile and always use the AMD GPU version |
| `--with-intel-gpu-support-only` | EXPERIMENTAL: Compile and always use the INTEL GPU version |
| `--with-likwid=[yes|no|PATH]` | use the likwid tool to measure performance (has a performance impact!), default: no |
| `--with-default-real-kernel=KERNEL` | set the real kernel KERNEL as default |
| `--with-default-complex-kernel=KERNEL`| set the complex kernel KERNEL as default |
...@@ -384,7 +389,7 @@ Remarks:
FC=mpi_wrapper_for_gnu_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -march=native -mavx2 -mfma" CFLAGS="-O3 -march=native -mavx2 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"
```
3. Building with Intel Fortran compiler and Intel C compiler:
Remarks:
- you have to know the name of the Intel Fortran compiler wrapper
...@@ -392,13 +397,117 @@ Remarks:
- you should specify compiler flags for Intel Fortran compiler; in the example only "-O3 -xAVX2" is set
- you should be careful with the CFLAGS, the example shows typical flags
```
FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_intel_C_compiler ./configure FCFLAGS="-O3 -xAVX2" CFLAGS="-O3 -xAVX2" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64"
```
#### Intel cores supporting AVX-512 (Skylake and newer) ####
We recommend that you build *ELPA* with the Intel compiler (if available) for the Fortran part, but
with the GNU compiler for the C part.
1. Building with Intel Fortran compiler and GNU C compiler:
Remarks:
- you have to know the name of the Intel Fortran compiler wrapper
- you do not have to specify a C compiler (with CC); GNU C compiler is recognized automatically
- you should specify compiler flags for Intel Fortran compiler; in the example only `-O3 -xCORE-AVX512` is set
- you should be careful with the CFLAGS, the example shows typical flags
```
FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -xCORE-AVX512" CFLAGS="-O3 -march=skylake-avx512 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64" --enable-avx2 --enable-avx512
```
2. Building with GNU Fortran compiler and GNU C compiler:
Remarks:
- you have to know the name of the GNU Fortran compiler wrapper
- you DO have to specify the C compiler explicitly (with CC), here the MPI wrapper for the GNU C compiler
- you should specify compiler flags for GNU Fortran compiler; in the example only `-O3 -march=skylake-avx512 -mfma` is set
- you should be careful with the CFLAGS, the example shows typical flags
```
FC=mpi_wrapper_for_gnu_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -march=skylake-avx512 -mfma" CFLAGS="-O3 -march=skylake-avx512 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64" --enable-avx2 --enable-avx512
```
3. Building with Intel Fortran compiler and Intel C compiler:
Remarks:
- you have to know the name of the Intel Fortran compiler wrapper
- you have to specify the Intel C compiler
- you should specify compiler flags for Intel Fortran compiler; in the example only "-O3 -xCORE-AVX512" is set
- you should be careful with the CFLAGS, the example shows typical flags
```
FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_intel_C_compiler ./configure FCFLAGS="-O3 -xCORE-AVX512" CFLAGS="-O3 -xCORE-AVX512" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64" --enable-avx2 --enable-avx512
```
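
Before choosing between the AVX2 and the AVX-512 examples, it can help to check what the build host actually reports. The following is a minimal sketch, assuming a Linux x86 system where the CPU flags are listed in `/proc/cpuinfo`; the helper name `has_cpu_flag` is hypothetical, and the optional second argument exists only so that the function can be tried on a saved cpuinfo file.

```shell
# Sketch: check whether the CPU advertises a given instruction-set flag.
# has_cpu_flag is a hypothetical helper name, not part of ELPA.
# Assumes Linux on x86; /proc/cpuinfo does not exist on other platforms.
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # look only at the first "flags" line; -w matches whole words,
    # so asking for "avx" does not match "avx2" or "avx512f"
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

# avx512f is the flag name the Linux kernel uses for the AVX-512 foundation set
if has_cpu_flag avx512f; then
    echo "AVX-512 reported by this host"
else
    echo "AVX-512 not reported by this host"
fi
```

Note that on a cluster the login node and the compute nodes may have different CPUs, so the check is only meaningful on the machine where *ELPA* will run.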
#### Building for NVIDIA A100 GPUs (and Intel Icelake CPUs) ####
For the GPU builds of *ELPA* it is mandatory that you choose a GNU compiler for the C part. The Fortran part can be compiled with any compiler, for example with the Intel Fortran compiler.
1. Building with Intel Fortran compiler and GNU C compiler:
Remarks:
- you have to know the name of the Intel Fortran compiler wrapper
- you do not have to specify a C compiler (with CC); GNU C compiler is recognized automatically
- you should specify compiler flags for Intel Fortran compiler; in the example only `-O3 -xCORE-AVX512` is set
- you should be careful with the CFLAGS, the example shows typical flags
```
FC=mpi_wrapper_for_intel_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -xCORE-AVX512" CFLAGS="-O3 -march=skylake-avx512 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64" --enable-avx2 --enable-avx512 --enable-nvidia-gpu --with-cuda-path=PATH_TO_YOUR_CUDA_INSTALLATION --with-NVIDIA-GPU-compute-capability=sm_80
```
2. Building with GNU Fortran compiler and GNU C compiler:
Remarks:
- you have to know the name of the GNU Fortran compiler wrapper
- you DO have to specify the C compiler explicitly (with CC), here the MPI wrapper for the GNU C compiler
- you should specify compiler flags for GNU Fortran compiler; in the example only `-O3 -march=skylake-avx512 -mfma` is set
- you should be careful with the CFLAGS, the example shows typical flags
```
FC=mpi_wrapper_for_gnu_Fortran_compiler CC=mpi_wrapper_for_gnu_C_compiler ./configure FCFLAGS="-O3 -march=skylake-avx512 -mfma" CFLAGS="-O3 -march=skylake-avx512 -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize" --enable-option-checking=fatal SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " SCALAPACK_FCFLAGS="-I$MKL_HOME/include/intel64/lp64" --enable-avx2 --enable-avx512 --enable-nvidia-gpu --with-cuda-path=PATH_TO_YOUR_CUDA_INSTALLATION --with-NVIDIA-GPU-compute-capability=sm_80
```
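
The examples above pass `sm_80` for the A100. For other NVIDIA GPUs the value follows from the compute capability of the device (for instance 7.0 becomes `sm_70`). The conversion can be sketched as follows; the helper name `cc_to_sm` is hypothetical, and the commented `nvidia-smi` call assumes a driver recent enough to support the `compute_cap` query field.

```shell
# Sketch: turn a compute capability such as "8.0" into the "sm_80" form
# expected by the compute-capability configure option.
# cc_to_sm is a hypothetical helper name, not part of ELPA.
cc_to_sm() {
    # drop the dot and any stray blanks: "8.0" -> "80"
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '. ')"
}

# Possible usage on a node with a GPU (assumption: nvidia-smi knows compute_cap):
#   cc_to_sm "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
```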
#### Building for IBM SUMMIT HPC system ####
For more information please have a look at the [ELSI wiki](https://git.elsi-interchange.org/elsi-devel/elsi-interface/-/wikis/install-elpa).
1. Building with GNU Fortran compiler and GNU C compiler:
```
FC=mpif90 CC=mpicc ./configure --prefix=$(pwd) FCFLAGS="-O2 -mcpu=power9" CFLAGS="-O2 -mcpu=power9" CPP="cpp -E" LDFLAGS="-L${OLCF_NETLIB_SCALAPACK_ROOT}/lib -lscalapack -L${OLCF_ESSL_ROOT}/lib64 -lessl -L${OLCF_NETLIB_LAPACK_ROOT}/lib64 -llapack" --enable-gpu --with-cuda-path=${OLCF_CUDA_ROOT} --with-GPU-compute-capability=sm_70 --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-avx512 --enable-c-tests=no
```
2. Building with PGI Fortran compiler and PGI C compiler:
```
FC=mpif90 CC=mpicc ./configure --prefix=$(pwd) FCFLAGS="-fast -tp=pwr9" CFLAGS="-fast -tp=pwr9" CPP="cpp -E" LDFLAGS="-L${OLCF_NETLIB_SCALAPACK_ROOT}/lib -lscalapack -L${OLCF_ESSL_ROOT}/lib64 -lessl -L${OLCF_NETLIB_LAPACK_ROOT}/lib64 -llapack" --enable-gpu --with-cuda-path=${OLCF_CUDA_ROOT} --with-GPU-compute-capability=sm_70 --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-avx512 --enable-c-tests=no
```
3. Building with IBM Fortran compiler and IBM C compiler:
```
FC=mpixlf CC=mpixlc ../configure --prefix=$(pwd) FCFLAGS="-O2 -qarch=pwr9 -qstrict -WF,-qfpp=linecont" CFLAGS="-O2 -qarch=pwr9 -qstrict" CPP="cpp -E" LDFLAGS="-L${OLCF_NETLIB_SCALAPACK_ROOT}/lib -lscalapack -L${OLCF_ESSL_ROOT}/lib64 -lessl -L${OLCF_NETLIB_LAPACK_ROOT}/lib64 -llapack" --enable-gpu --with-cuda-path=${OLCF_CUDA_ROOT} --with-GPU-compute-capability=sm_70 --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-avx512 --enable-c-tests=no
```
#### EXPERIMENTAL: Building for AMD GPUs (currently tested only with --with-mpi=0) ####
In order to build *ELPA* for AMD GPUs, please ensure that you have a working installation of HIP, ROCm, BLAS, and LAPACK.
```
./configure CXX=hipcc CXXFLAGS="-I/opt/rocm-4.0.0/hip/include/ -I/opt/rocm-4.0.0/rocblas/include -g" CC=hipcc CFLAGS="-I/opt/rocm-4.0.0/hip/include/ -I/opt/rocm-4.0.0/rocblas/include -g" LIBS="-L/opt/rocm-4.0.0/rocblas/lib" --enable-option-checking=fatal --with-mpi=0 FC=gfortran FCFLAGS="-g -LPATH_TO_YOUR_LAPACK_INSTALLATION -lopenblas -llapack" --disable-sse --disable-sse-assembly --disable-avx --disable-avx2 --disable-avx512 --enable-AMD-gpu --enable-single-precision
```
#### Problems of building with clang-12.0 ####
The libtool tool adds some flags to the compiler commands (to be used for linking by ld) which are not known
by the clang-12 compiler. One way to solve this issue is to run the following commands directly after the configure step:
```
sed -i 's/\\$wl-soname \\$wl\\$soname/-fuse-ld=ld -Wl,-soname,\\$soname/g' libtool
sed -i 's/\\$wl--whole-archive\\$convenience \\$wl--no-whole-archive//g' libtool
```
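
Since the two `sed` calls above edit the generated `libtool` script in place, it can be useful to keep a copy and to look at what was changed. The following sketch wraps the same two substitutions in a function; the function name `patch_libtool_for_clang` and the `.orig` backup are additions for illustration and not part of the *ELPA* build system.

```shell
# Sketch: apply the two libtool substitutions while keeping a backup.
# patch_libtool_for_clang is a hypothetical helper name, not part of ELPA.
# Argument 1: path to the generated libtool script (default: ./libtool).
patch_libtool_for_clang() {
    lt="${1:-libtool}"
    # -i.orig edits in place and saves the untouched file as <name>.orig
    sed -i.orig \
        -e 's/\\$wl-soname \\$wl\\$soname/-fuse-ld=ld -Wl,-soname,\\$soname/g' \
        -e 's/\\$wl--whole-archive\\$convenience \\$wl--no-whole-archive//g' \
        "$lt"
    # show the changed lines; diff exits non-zero when the files differ
    diff "$lt.orig" "$lt" || true
}
```

If the `diff` output is empty, neither pattern matched, which suggests that the generated `libtool` script differs from the one these substitutions were written for.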