## How to switch from the legacy API to the new API of the *ELPA* library ##
### Using *ELPA* from a Fortran code ###
Up to now, if you used the legacy API of the *ELPA* library, you had to perform the following
steps (we assume that MPI is initialized and that a distributed matrix in block-cyclic ScaLAPACK layout has already been created in
the user application):
1. include the *ELPA* modules
use elpa1
use elpa2 ! this step was only needed if you wanted to use the ELPA 2stage solver
2. call the "elpa_get_communicators" routine, in order to obtain the row/column MPI communicators needed by *ELPA*
mpierr = elpa_get_communicators(mpi_comm_world, my_prow, my_pcol, &
mpi_comm_rows, mpi_comm_cols)
3. do the desired task with the *ELPA* library, which could be
a) elpa_solve_[real|complex]_1stage_[double|single] ! solve EV problem with ELPA 1stage solver
b) elpa_solve_[real|complex]_2stage_[double|single] ! solve EV problem with ELPA 2stage solver
c) elpa_solve_tridi_[double|single] ! solve the eigenvalue problem for a tridiagonal matrix
d) elpa_cholesky_[real|complex]_[double|single] ! Cholesky decomposition
e) elpa_invert_trm_[real|complex]_[double|single] ! invert triangular matrix
f) elpa_mult_at_b_real_[double|single] ! multiply a**T * b
g) elpa_mult_ah_b_complex_[double|single] ! multiply a**H * b
For each of these function calls you had to set some parameters (see the man pages) to control the execution, such as
useGPU=[.false.|.true.] or the choice of the ELPA 2stage kernel. New parameters were likely to be added with each new release of
the *ELPA* library to reflect the growing functionality.
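The block-cyclic ScaLAPACK layout assumed above determines how many rows and columns of the matrix each process stores locally. The following is a minimal sketch of that calculation, assuming the distribution starts at process coordinate 0 (it follows the same rule as ScaLAPACK's NUMROC for that case). The module name `block_cyclic_sketch` and the function `local_size` are hypothetical and not part of *ELPA*.

```fortran
! Hypothetical helper, not part of ELPA: number of rows (or columns) of one
! block-cyclically distributed dimension that are stored on a given process.
! Assumes the distribution starts at process coordinate 0.
module block_cyclic_sketch
  implicit none
contains
  pure integer function local_size(n, nblk, my_coord, n_procs) result(n_local)
    integer, intent(in) :: n         ! global number of rows (or columns)
    integer, intent(in) :: nblk      ! block size of the distribution
    integer, intent(in) :: my_coord  ! 0-based process coordinate in this dimension
    integer, intent(in) :: n_procs   ! number of processes in this dimension
    integer :: n_blocks, extra_blocks

    n_blocks     = n / nblk                     ! number of complete blocks
    n_local      = (n_blocks / n_procs) * nblk  ! blocks every process receives
    extra_blocks = mod(n_blocks, n_procs)       ! complete blocks left over

    if (my_coord < extra_blocks) then
      n_local = n_local + nblk                  ! one additional complete block
    else if (my_coord == extra_blocks) then
      n_local = n_local + mod(n, nblk)          ! the trailing partial block, if any
    end if
  end function local_size
end module block_cyclic_sketch
```

For example, with a global dimension of 10, a block size of 2 and two processes along that dimension, `local_size(10, 2, 0, 2)` gives 6 and `local_size(10, 2, 1, 2)` gives 4.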
The new interface of *ELPA* is more generic; switching to it requires a one-time adaptation of the user code.
These are the new steps (again, it is assumed that MPI is initialized and that a distributed matrix in block-cyclic ScaLAPACK layout has already been created in
the user application):
1. include the correct *ELPA* module and define a name for the ELPA instance
use elpa ! this is the only module needed for ELPA
class(elpa_t), pointer :: e ! name the ELPA instance "e"
2. initialize ELPA and create the instance
if (elpa_init(20170403) /= ELPA_OK) then
error stop "ELPA API version not supported"
endif
e => elpa_allocate()
3. set the parameters which describe the matrix and the MPI setup
call e%set("na", na,success) ! size of matrix
call e%set("local_nrows", na_rows,success) ! MPI process local rows of the distributed matrix
call e%set("local_ncols", na_cols,success) ! MPI process local cols of the distributed matrix
call e%set("nblk", nblk, success) ! size of block-cyclic distribution
call e%set("mpi_comm_parent", MPI_COMM_WORLD,success) ! global communicator for all processes which have parts of
! the distributed matrix
call e%set("process_row", my_prow, success) ! row coordinate of MPI task
call e%set("process_col", my_pcol, success) ! column coordinate of MPI task
4. set up the ELPA instance
success = e%setup()
5. set/get any possible option (see man pages)
call e%get("qr", qr, success) ! query whether QR-decomposition is set
if (success .ne. ELPA_OK) stop
print *, "qr =", qr
call e%set("solver", ELPA_SOLVER_2STAGE, success) ! set solver to 2stage
if (success .ne. ELPA_OK) stop
call e%set("real_kernel", ELPA_2STAGE_REAL_GENERIC, success) ! set kernel of ELPA 2stage solver for
!real case to the generic kernel
....
At the moment, the following configurable runtime options are supported:
"solver" can be one of {ELPA_SOLVER_1STAGE | ELPA_SOLVER_2STAGE }
"real_kernel" can be one of { [real,complex]_generic | [real,complex]_generic_simple |
complex_sse_block1 | [real,complex]_sse_block2 |
real_sse_block4 | real_sse_block6 | [real,complex]_sse_assembly |
complex_avx_block1 | [real,complex]_avx_block2 |
real_avx_block4 | real_avx_block6 |
complex_avx2_block1 | [real,complex]_avx2_block2 |
real_avx2_block4 | real_avx2_block6 |
complex_avx512_block1 | [real,complex]_avx512_block2 |
real_avx512_block4 | real_avx512_block6 |
[real,complex]_bgp | [real,complex]_bgq }
depending on your system and the installed kernels. This can be queried with the
helper binary "elpa2_print_kernels"
"qr" can be one of { 0 | 1 }, depending on whether you want to use QR decomposition in the REAL
ELPA_SOLVER_2STAGE
"gpu" can be one of { 0 | 1 }, depending on whether you want to use GPU acceleration (assuming your
ELPA installation has been built with GPU support)
"timings" can be one of { 0 | 1 }, depending on whether you want to measure times within the library calls
"debug" can be one of { 0 | 1 }; if set to 1, more information is given in case of an error
6. do the desired task with the *ELPA* library, which could be
a) e%eigenvectors ! solve EV problem with solver as set by "set" method; computes eigenvalues AND eigenvectors
! (replaces a) and b) from legacy API)
b) e%eigenvalues ! solve EV problem with solver as set by "set" method; computes eigenvalues only
c) e%cholesky ! do a Cholesky decomposition (replaces d) from legacy API)
d) e%invert_triangular ! invert triangular matrix (replaces e) from legacy API)
e) e%hermitian_multiply ! multiply a**T *b or a**H *b (replaces f) and g) from legacy API)
f) e%solve_tridiagonal ! solves the eigenvalue problem for a tridiagonal matrix (replaces c) from legacy
! API)
7. when not needed anymore, destroy the instance
call elpa_deallocate(e)
8. when *ELPA* is not needed anymore, uninitialize the *ELPA* library
call elpa_uninit()
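
Put together, the steps above might look like the following sketch. It assumes that MPI is already initialized and that `a` holds the process-local part of a real symmetric matrix in block-cyclic layout. The names `solve_with_elpa` and `check` are hypothetical helpers, not part of *ELPA*. The `"nev"` parameter (number of eigenvectors to compute) and the argument list of `e%eigenvectors` are not shown in the steps above and are assumptions here; please verify them against the man pages of your *ELPA* release.

```fortran
! Sketch only: solve a real symmetric eigenvalue problem with the new interface.
! solve_with_elpa and check are hypothetical names, not part of ELPA.
subroutine solve_with_elpa(na, nev, na_rows, na_cols, nblk, my_prow, my_pcol, &
                           a, ev, z)
  use, intrinsic :: iso_c_binding, only: c_double
  use mpi
  use elpa                                  ! step 1: the only ELPA module needed
  implicit none
  integer, intent(in) :: na, nev, na_rows, na_cols, nblk, my_prow, my_pcol
  real(kind=c_double), intent(inout) :: a(na_rows, na_cols)  ! local part of the matrix
  real(kind=c_double), intent(out)   :: ev(na)               ! eigenvalues
  real(kind=c_double), intent(out)   :: z(na_rows, na_cols)  ! local part of the eigenvectors
  class(elpa_t), pointer :: e
  integer :: success

  ! step 2: initialize ELPA and create the instance
  if (elpa_init(20170403) /= ELPA_OK) then
    error stop "ELPA API version not supported"
  endif
  e => elpa_allocate()

  ! step 3: describe the matrix and the MPI setup
  call e%set("na", na, success)
  call check(success, "na")
  call e%set("nev", nev, success)           ! assumption: not listed in the steps above
  call check(success, "nev")
  call e%set("local_nrows", na_rows, success)
  call check(success, "local_nrows")
  call e%set("local_ncols", na_cols, success)
  call check(success, "local_ncols")
  call e%set("nblk", nblk, success)
  call check(success, "nblk")
  call e%set("mpi_comm_parent", MPI_COMM_WORLD, success)
  call check(success, "mpi_comm_parent")
  call e%set("process_row", my_prow, success)
  call check(success, "process_row")
  call e%set("process_col", my_pcol, success)
  call check(success, "process_col")

  ! step 4: set up the instance
  success = e%setup()
  call check(success, "setup")

  ! step 5: choose runtime options
  call e%set("solver", ELPA_SOLVER_2STAGE, success)
  call check(success, "solver")

  ! step 6: compute eigenvalues and eigenvectors
  call e%eigenvectors(a, ev, z, success)
  call check(success, "eigenvectors")

  ! steps 7 and 8: destroy the instance and uninitialize the library
  ! (the instance is passed explicitly to elpa_deallocate here)
  call elpa_deallocate(e)
  call elpa_uninit()

contains

  ! Stop with a message if an ELPA call did not return ELPA_OK
  subroutine check(status, what)
    integer, intent(in) :: status
    character(len=*), intent(in) :: what
    if (status /= ELPA_OK) then
      print *, "ELPA call failed: ", what
      error stop
    endif
  end subroutine check

end subroutine solve_with_elpa
```

In an application that solves several problems, `elpa_init` and `elpa_uninit` would more naturally be called once at program start and end rather than inside such a routine.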
### Online and local documentation ###
Local documentation (via man pages) should be available if *ELPA* has been installed with the documentation.
For example, "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels.
An [online doxygen documentation](http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.05.001.rc1/html/index.html)
for each *ELPA* release is also available.