Commit 75ca6504 authored by Andreas Marek

update USERS_GUIDE and other documentation

parent 7c928b19
@@ -118,6 +118,13 @@ option, otherwise no parallelisation whatsoever will be present.
It is possible to install the different flavours of *ELPA* (with/without MPI, with/without OpenMP) in the same
directory, since the library is named differently for each build.
### GPU support as runtime option ###
If the target system has NVIDIA GPUs as accelerator cards, *ELPA* can be built with GPU support.
The configure option "--enable-gpu-support" will trigger a build of the GPU version of *ELPA*. It might also be necessary
to set the option "--with-cuda-path=[your path to the local cuda installation]".
For GPU support, only NVIDIA GPUs with "compute capability" > 3.5 are supported.
### Doxygen documentation ###
Doxygen documentation can be created with the "--enable-doxygen-doc" configure option.
@@ -3,6 +3,9 @@
For more details and recent updates please visit the online [issue system](https://gitlab.mpcdf.mpg.de/elpa/elpa/issues)
Issues which are not mentioned in a newer release are considered solved.
### ELPA 2016.11.001 release ###
- at the moment no issues are known
### ELPA 2016.05.003 release ###
- the Fortran module file "precision" was installed as private, which led to linking problems
@@ -2,7 +2,7 @@
## Current Release ##
The current release is ELPA 2016.05.004
The current release is ELPA 2016.11.001
## About *ELPA*
This file contains the release notes for the ELPA 2016.05.004 version
This file contains the release notes for the ELPA 2016.11.001 version
What is new?
-------------
For changes since release ELPA 2016.05.003 please have a look at the Changelog file
For detailed information about changes since release ELPA 2016.05.004 please have a look at the Changelog file
Highlights are:
- support of single and double precision (real and complex) calculations
- GPU support for ELPA 1stage and ELPA 2stage
a) ABI change
ABI change
---------------------
Since release 2016.05.001 the ABI has not changed. Only additional interfaces have been added
Since release 2016.05.001 the ABI has not changed. Only additional interfaces have been added; please also have a look
at the DEPRECATED_FEATURES file
Any incompatibilities with the previous version?
@@ -16,3 +20,4 @@ Any incompatibilities with the previous version?
As mentioned before, the ABI of ELPA was not changed; there is no
incompatibility with the previous version ELPA 2016.05.002. There is, however, an incompatibility with versions older than ELPA 2015.11.001!
@@ -18,22 +18,20 @@ The *ELPA* library consists of two main parts:
- *ELPA 1stage* solver
- *ELPA 2stage* solver
Both variants of the *ELPA* solvers are available for real or complex valued matrices.
Both variants of the *ELPA* solvers are available for real- or complex-valued matrices, in single and double precision.
Thus *ELPA* provides the following user functions (see man pages or [online](http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2016.05.004/html/index.html) for details):
- elpa_get_communicators : set the row / column communicators for *ELPA*
- elpa_solve_evp_complex_1stage : solve a complex valued eigenvalue problem with the *ELPA 1stage* solver
- elpa_solve_evp_real_1stage : solve a real valued eigenvalue problem with the *ELPA 1stage* solver
- elpa_solve_evp_complex_2stage : solve a complex valued eigenvalue problem with the *ELPA 2stage* solver
- elpa_solve_evp_real_2stage : solve a real valued eigenvalue problem with the *ELPA 2stage* solver
- elpa_get_communicators : set the row / column communicators for *ELPA*
- elpa_solve_evp_complex_1stage_{single|double} : solve a {single|double} precision complex eigenvalue problem with the *ELPA 1stage* solver
- elpa_solve_evp_real_1stage_{single|double} : solve a {single|double} precision real eigenvalue problem with the *ELPA 1stage* solver
- elpa_solve_evp_complex_2stage_{single|double} : solve a {single|double} precision complex eigenvalue problem with the *ELPA 2stage* solver
- elpa_solve_evp_real_2stage_{single|double} : solve a {single|double} precision real eigenvalue problem with the *ELPA 2stage* solver
*NEW*
- elpa_solve_evp_real : driver for *ELPA 1stage* or *ELPA 2stage* solver
- elpa_solve_evp_complex : driver for *ELPA 1stage* or *ELPA 2stage* solver
- elpa_solve_evp_real_{single|double} : driver for the {single|double} precision real *ELPA 1stage* or *ELPA 2stage* solver
- elpa_solve_evp_complex_{single|double} : driver for the {single|double} precision complex *ELPA 1stage* or *ELPA 2stage* solver
@@ -84,7 +82,7 @@ of a simple example program can be found in ./test_project/src.
print *
end if
success = elpa_solve_evp_real_1stage(na, nev, a, na_rows, ev, z, na_rows, nblk, &
success = elpa_solve_evp_real_1stage_{single|double}(na, nev, a, na_rows, ev, z, na_rows, nblk, &
matrixCols, mpi_comm_rows, mpi_comm_cols)
if (myid==0) then
@@ -132,23 +130,23 @@ SYNOPSIS
#### Using *ELPA 1stage* ####
After setting up the *ELPA* row and column communicators (by calling get_elpa_communicators),
After setting up the *ELPA* row and column communicators (by calling elpa_get_communicators),
only the real or complex valued solver has to be called:
SYNOPSIS
FORTRAN INTERFACE
use elpa1
success = elpa_solve_evp_real_1stage (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
success = elpa_solve_evp_real_1stage_{single|double} (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
mpi_comm_cols)
With the definitions of the input and output variables:
integer, intent(in) na: global dimension of quadratic matrix a to solve
integer, intent(in) nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
real*8, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
real*{4|8}, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
integer, intent(in) lda: leading dimension of locally distributed matrix a
real*8, intent(inout) ev: on output the first nev computed eigenvalues
real*8, intent(inout) q: on output the first nev computed eigenvectors
real*{4|8}, intent(inout) ev: on output the first nev computed eigenvalues
real*{4|8}, intent(inout) q: on output the first nev computed eigenvectors
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
@@ -160,17 +158,17 @@ SYNOPSIS
C INTERFACE
#include "elpa.h"
success = elpa_solve_evp_real_1stage (int na, int nev, double *a, int lda, double *ev, double *q, int ldq, int nblk, int matrixCols, int
success = elpa_solve_evp_real_1stage_{single|double} (int na, int nev, {float|double} *a, int lda, {float|double} *ev, {float|double} *q, int ldq, int nblk, int matrixCols, int
mpi_comm_rows, int mpi_comm_cols);
With the definitions of the input and output variables:
int na: global dimension of quadratic matrix a to solve
int nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
double *a: pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
{float|double} *a: pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
int lda: leading dimension of locally distributed matrix a
double *ev: pointer to memory containing on output the first nev computed eigenvalues
double *q: pointer to memory containing on output the first nev computed eigenvectors
{float|double} *ev: pointer to memory containing on output the first nev computed eigenvalues
{float|double} *q: pointer to memory containing on output the first nev computed eigenvectors
int ldq: leading dimension of matrix q which stores the eigenvectors
int nblk: blocksize of block cyclic distribution, must be the same in both directions
int matrixCols: number of columns of locally distributed matrices a and q
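For orientation, a minimal C sketch of a double-precision call follows. It is an illustration, not a verbatim excerpt: the function name solve_real_1stage and the grid coordinates my_prow/my_pcol are hypothetical, the block-cyclic filling of a is omitted, and elpa_get_communicators is assumed to take the (Fortran-converted) global communicator plus the process coordinates, as in the man pages referenced above.

```c
#include <mpi.h>
#include <stdlib.h>
#include "elpa.h"

/* Illustrative sketch only: compute the first nev eigenpairs of a
   distributed real symmetric matrix with the double-precision
   1stage solver. */
int solve_real_1stage(int na, int nev, int nblk, int lda, int ldq,
                      int matrixCols, int my_prow, int my_pcol)
{
   int mpi_comm_rows, mpi_comm_cols;
   /* derive the ELPA row/column communicators from the global communicator */
   elpa_get_communicators((int) MPI_Comm_c2f(MPI_COMM_WORLD), my_prow, my_pcol,
                          &mpi_comm_rows, &mpi_comm_cols);

   double *a  = malloc((size_t) lda * matrixCols * sizeof *a);
   double *q  = malloc((size_t) ldq * matrixCols * sizeof *q);
   double *ev = malloc((size_t) na * sizeof *ev);
   /* ... fill the local block-cyclic part of a here ... */

   int success = elpa_solve_evp_real_1stage_double(na, nev, a, lda, ev, q,
                                                   ldq, nblk, matrixCols,
                                                   mpi_comm_rows, mpi_comm_cols);
   free(a); free(q); free(ev);
   return success;   /* 1 on success, 0 on failure */
}
```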
@@ -187,17 +185,17 @@ DESCRIPTION
FORTRAN INTERFACE
use elpa1
success = elpa_solve_evp_complex_1stage (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
success = elpa_solve_evp_complex_1stage_{single|double} (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
mpi_comm_cols)
With the definitions of the input and output variables:
integer, intent(in) na: global dimension of quadratic matrix a to solve
integer, intent(in) nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
complex*16, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
complex*{8|16}, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
integer, intent(in) lda: leading dimension of locally distributed matrix a
real*8, intent(inout) ev: on output the first nev computed eigenvalues
complex*16, intent(inout) q: on output the first nev computed eigenvectors
real*{4|8}, intent(inout) ev: on output the first nev computed eigenvalues
complex*{8|16}, intent(inout) q: on output the first nev computed eigenvectors
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
@@ -210,17 +208,17 @@ DESCRIPTION
#include "elpa.h"
#include <complex.h>
success = elpa_solve_evp_complex_1stage (int na, int nev, double complex *a, int lda, double *ev, double complex*q, int ldq, int nblk, int
success = elpa_solve_evp_complex_1stage_{single|double} (int na, int nev, {float|double} complex *a, int lda, {float|double} *ev, {float|double} complex *q, int ldq, int nblk, int
matrixCols, int mpi_comm_rows, int mpi_comm_cols);
With the definitions of the input and output variables:
int na: global dimension of quadratic matrix a to solve
int nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
double complex *a: pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
{float|double} complex *a: pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
int lda: leading dimension of locally distributed matrix a
double *ev: pointer to memory containing on output the first nev computed eigenvalues
double complex *q: pointer to memory containing on output the first nev computed eigenvectors
{float|double} *ev: pointer to memory containing on output the first nev computed eigenvalues
{float|double} complex *q: pointer to memory containing on output the first nev computed eigenvectors
int ldq: leading dimension of matrix q which stores the eigenvectors
int nblk: blocksize of block cyclic distribution, must be the same in both directions
int matrixCols: number of columns of locally distributed matrices a and q
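By analogy, a hedged C sketch of the double-precision complex call (the wrapper name solve_complex_1stage is hypothetical; the communicator and matrix setup are identical to the real case above):

```c
#include <complex.h>
#include "elpa.h"

/* Illustrative sketch: the matrix and eigenvectors are complex, but the
   eigenvalues of the Hermitian problem are real, hence double *ev. */
int solve_complex_1stage(int na, int nev, double complex *a, int lda,
                         double *ev, double complex *q, int ldq, int nblk,
                         int matrixCols, int mpi_comm_rows, int mpi_comm_cols)
{
   return elpa_solve_evp_complex_1stage_double(na, nev, a, lda, ev, q, ldq,
                                               nblk, matrixCols,
                                               mpi_comm_rows, mpi_comm_cols);
}
```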
@@ -259,27 +257,37 @@ the default kernels will be set.
##### Setting the *ELPA 2stage* compute kernels #####
##### Setting the *ELPA 2stage* compute kernels with environment variables #####
If the *ELPA* installation allows setting the compute kernels with environment variables,
setting the variables "REAL_ELPA_KERNEL" and "COMPLEX_ELPA_KERNEL" will set the compute
kernels. The environment variable setting will take precedence over all other settings!
The utility program "elpa2_print_kernels" can list which kernels are available and which
would be chosen. This also reflects the setting of the default kernel and the settings
made with the environment variables.
##### Setting the *ELPA 2stage* compute kernels with API calls #####
It is also possible to set the *ELPA 2stage* compute kernels via the API.
As an example the API for ELPA real double-precision 2stage is shown:
SYNOPSIS
FORTRAN INTERFACE
use elpa1
use elpa2
success = elpa_solve_evp_real_2stage (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL, useQr=useQR)
success = elpa_solve_evp_real_2stage_double (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL, useQR, useGPU)
With the definitions of the input and output variables:
integer, intent(in) na: global dimension of quadratic matrix a to solve
integer, intent(in) nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
real*8, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
real*{4|8}, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
integer, intent(in) lda: leading dimension of locally distributed matrix a
real*8, intent(inout) ev: on output the first nev computed eigenvalues
real*8, intent(inout) q: on output the first nev computed eigenvectors
real*{4|8}, intent(inout) ev: on output the first nev computed eigenvalues
real*{4|8}, intent(inout) q: on output the first nev computed eigenvectors
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
@@ -287,14 +295,15 @@ SYNOPSIS
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_all: communicator for all processes in the processor set involved in ELPA
logical, intent(in), optional: useQR: optional argument; switches to QR-decomposition if set to .true.
logical, intent(in), optional: useGPU: decide whether GPUs should be used or not
logical success: return value indicating success or failure
C INTERFACE
#include "elpa.h"
success = elpa_solve_evp_real_2stage (int na, int nev, double *a, int lda, double *ev, double *q, int ldq, int nblk, int matrixCols, int
mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL, int useQr);
success = elpa_solve_evp_real_2stage_double (int na, int nev, double *a, int lda, double *ev, double *q, int ldq, int nblk, int matrixCols, int
mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL, int useQR, int useGPU);
With the definintions of the input and output variables:
@@ -311,6 +320,7 @@ SYNOPSIS
int mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
int mpi_comm_all: communicator for all processes in the processor set involved in ELPA
int useQR: if set to 1 switch to QR-decomposition
int useGPU: decide whether the GPU version should be used or not
int success: return value indicating success (1) or failure (0)
@@ -321,63 +331,11 @@ DESCRIPTION
The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
will be stored in q. All memory of the arguments must be allocated outside the call to the solver.
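As a hedged illustration of the 2stage API above (the wrapper name solve_real_2stage is hypothetical; the kernel argument holds whichever constant the installation provides, e.g. one of those listed by elpa2_print_kernels):

```c
#include "elpa.h"

/* Illustrative sketch: double-precision 2stage call with an explicit kernel,
   QR decomposition switched off and the GPU path requested. With useGPU = 1
   only the GPU kernel may be set explicitly (see the GPU section below). */
int solve_real_2stage(int na, int nev, double *a, int lda, double *ev,
                      double *q, int ldq, int nblk, int matrixCols,
                      int mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all,
                      int THIS_ELPA_REAL_KERNEL)
{
   int useQR  = 0;   /* 1 would switch to QR decomposition */
   int useGPU = 1;   /* request the GPU version */
   return elpa_solve_evp_real_2stage_double(na, nev, a, lda, ev, q, ldq, nblk,
                                            matrixCols, mpi_comm_rows,
                                            mpi_comm_cols, mpi_comm_all,
                                            THIS_ELPA_REAL_KERNEL,
                                            useQR, useGPU);
}
```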
SYNOPSIS
FORTRAN INTERFACE
use elpa1
use elpa2
success = elpa_solve_evp_complex_2stage (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL)
With the definitions of the input and output variables:
integer, intent(in) na: global dimension of quadratic matrix a to solve
integer, intent(in) nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
complex*16, intent(inout) a: locally distributed part of the matrix a. The local dimensions are lda x matrixCols
integer, intent(in) lda: leading dimension of locally distributed matrix a
real*8, intent(inout) ev: on output the first nev computed eigenvalues
complex*16, intent(inout) q: on output the first nev computed eigenvectors
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_all: communicator for all processes in the processor set involved in ELPA
logical success: return value indicating success or failure
C INTERFACE
#include "elpa.h"
#include <complex.h>
success = elpa_solve_evp_complex_2stage (int na, int nev, double complex *a, int lda, double *ev, double complex *q, int ldq, int nblk, int
matrixCols, int mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_COMPLEX_KERNEL);
With the definitions of the input and output variables:
int na: global dimension of quadratic matrix a to solve
int nev: number of eigenvalues to be computed; the first nev eigenvalues are calculated
double complex *a: pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
int lda: leading dimension of locally distributed matrix a
double *ev: pointer to memory containing on output the first nev computed eigenvalues
double complex *q: pointer to memory containing on output the first nev computed eigenvectors
int ldq: leading dimension of matrix q which stores the eigenvectors
int nblk: blocksize of block cyclic distribution, must be the same in both directions
int matrixCols: number of columns of locally distributed matrices a and q
int mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
int mpi_comm_all: communicator for all processes in the processor set involved in ELPA
int success: return value indicating success (1) or failure (0)
DESCRIPTION
Solve the complex eigenvalue problem with the 2-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
get_elpa_communicators(3) function. The distributed quadratic matrix a has global dimensions na x na, and a local size lda x matrixCols.
The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
will be stored in q. All memory of the arguments must be allocated outside the call to the solver.
##### Setting up *ELPA 1stage* or *ELPA 2stage* with the *ELPA driver interface* #####
Since release ELPA 2016.05.004 a driver routine allows to choose more easily which solver (1stage or 2stage) will be used
Since release ELPA 2016.05.004 a driver routine allows choosing more easily which solver (1stage or 2stage) will be used.
As an example the real double-precision case is explained:
SYNOPSIS
@@ -385,7 +343,7 @@ Since release ELPA 2016.05.004 a driver routine allows choosing more easily wh
use elpa
success = elpa_solve_evp_real (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows, mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL=THIS_REAL_ELPA_KERNEL, useQr=useQR, method=method)
success = elpa_solve_evp_real_double (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows, mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL=THIS_REAL_ELPA_KERNEL, useQR=useQR, useGPU=useGPU, method=method)
Generalized interface to the ELPA 1stage and 2stage solver for real-valued problems
@@ -420,6 +378,8 @@ Since release ELPA 2016.05.004 a driver routine allows choosing more easily wh
logical, intent(in), optional: useQR: optional argument; switches to QR-decomposition if set to .true.
logical, intent(in), optional: useGPU: decide whether the GPU version should be used or not
character(*), optional method: use 1stage solver if "1stage", use 2stage solver if "2stage", (at the moment) use 2stage solver if "auto"
logical success: return value indicating success or failure
@@ -429,7 +389,7 @@ Since release ELPA 2016.05.004 a driver routine allows choosing more easily wh
#include "elpa.h"
success = elpa_solve_evp_real (int na, int nev, double *a, int lda, double *ev, double *q, int ldq, int nblk, int matrixCols, int mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL, int useQr, char *method);
success = elpa_solve_evp_real_double (int na, int nev, double *a, int lda, double *ev, double *q, int ldq, int nblk, int matrixCols, int mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL, int useQR, int useGPU, char *method);

With the definitions of the input and output variables:
@@ -463,6 +423,8 @@ Since release ELPA 2016.05.004 a driver routine allows choosing more easily wh
int useQR: if set to 1 switch to QR-decomposition
int useGPU: decide whether the GPU version should be used or not
char *method: use 1stage solver if "1stage", use 2stage solver if "2stage", (at the moment) use 2stage solver if "auto"
int success: return value indicating success (1) or failure (0)
@@ -470,3 +432,17 @@ Since release ELPA 2016.05.004 a driver routine allows choosing more easily wh
DESCRIPTION
Solve the real eigenvalue problem. The value of method decides whether the 1stage or 2stage solver is used. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the elpa_get_communicators function. The distributed quadratic matrix a has global dimensions na x na, and a local size lda x matrixCols. The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues will be stored in q. All memory of the arguments must be allocated outside the call to the solver.
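A hedged C sketch of a driver call (the wrapper name solve_driver is hypothetical); with method = "auto" the library at the moment picks the 2stage solver, and the kernel, QR and GPU settings are only relevant if the 2stage path is taken:

```c
#include "elpa.h"

/* Illustrative sketch: let the driver choose the solver. */
int solve_driver(int na, int nev, double *a, int lda, double *ev, double *q,
                 int ldq, int nblk, int matrixCols, int mpi_comm_rows,
                 int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL)
{
   return elpa_solve_evp_real_double(na, nev, a, lda, ev, q, ldq, nblk,
                                     matrixCols, mpi_comm_rows, mpi_comm_cols,
                                     mpi_comm_all, THIS_ELPA_REAL_KERNEL,
                                     /* useQR  */ 0,
                                     /* useGPU */ 0,
                                     "auto");
}
```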
##### Setting up the GPU version of *ELPA* 1stage and 2stage #####
Since release ELPA 2016.11.001 *ELPA* offers GPU support, if *ELPA* has been built with the configure option "--enable-gpu-support".
At run-time the GPU version can be used by setting the environment variable "ELPA_USE_GPU" to "yes", or by calling the *ELPA* functions
(elpa_solve_evp_real_{double|single}, elpa_solve_evp_real_1stage_{double|single}, elpa_solve_evp_real_2stage_{double|single}) with the
argument "useGPU = .true." or "useGPU = 1" for the Fortran and C case, respectively. Please note that, similar to the choice of the
*ELPA 2stage* compute kernels, the environment variable takes precedence over the setting in the API call.
Further note that it is NOT allowed to request the usage of GPUs AND to EXPLICITLY set an *ELPA 2stage* compute kernel other than
"REAL_ELPA_KERNEL_GPU" or "COMPLEX_ELPA_KERNEL_GPU".