## Documentation how to switch from the legacy API to the new API of the *ELPA* library ##


### Using *ELPA* from a Fortran code ###

Up to now, if you have been using the (legacy API of the) *ELPA* library, you had to perform the following
steps (we assume that MPI and a distributed matrix in block-cyclic scalapack layout have already been created in
the user application):

1. include the *ELPA* modules

   use elpa1
   use elpa2   ! this step was only needed if you wanted to use the ELPA 2stage solver

2. call the "elpa_get_communicators" routine, in order to obtain the row/column MPI communicators needed by *ELPA*

   mpierr = elpa_get_communicators(mpi_comm_world, my_prow, my_pcol, &
                                   mpi_comm_rows, mpi_comm_cols)

3. do the desired task with the *ELPA* library, which could be
  a) elpa_solve_[real|complex]_1stage_[double|single]     ! solve EV problem with ELPA 1stage solver
  b) elpa_solve_[real|complex]_2stage_[double|single]     ! solve EV problem with ELPA 2stage solver
  c) elpa_solve_tridi_[double|single]                     ! solve the eigenvalue problem for a tri-diagonal matrix
  d) elpa_cholesky_[real|complex]_[double|single]         ! Cholesky decomposition
  e) elpa_invert_trm_[real|complex]_[double|single]       ! invert triangular matrix
  f) elpa_mult_at_b_real_[double|single]                  ! multiply a**T * b
  g) elpa_mult_ah_b_complex_[double|single]               ! multiply a**H * b

For each of these function calls you had to set parameters (see the man pages) controlling the execution,
e.g. useGPU=[.false.|.true.] or the choice of the ELPA 2stage kernel. New parameters were likely added with each
release of the *ELPA* library to reflect the growing functionality.
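
For illustration, a call to the legacy 1stage real double-precision solver might have looked like the sketch
below. The exact argument list differed between *ELPA* releases, so the argument names here are purely
illustrative; consult the man pages of your installed version:

   use elpa1
   logical :: success
   ! na: matrix size, nev: number of eigenpairs wanted, a: local part of the matrix,
   ! ev: eigenvalues, q: local part of the eigenvectors, nblk: block size
   success = elpa_solve_real_1stage_double(na, nev, a, lda, ev, q, ldq, nblk, &
                                           matrixCols, mpi_comm_rows, mpi_comm_cols)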


The new interface of *ELPA* is more generic. Using it, however, requires a one-time adaptation of the user code.

These are the new steps (again it is assumed that MPI and a distributed matrix in block-cyclic scalapack layout
have already been created in the user application):

1. include the correct *ELPA* module and define a name for the ELPA instance

   use elpa   ! this is the only module needed for ELPA

   class(elpa_t), pointer :: e   ! name the ELPA instance "e"

2. initialize ELPA and create the instance

   if (elpa_init(20170403) /= ELPA_OK) then
     error stop "ELPA API version not supported"
   endif

   e => elpa_allocate()

3. set the parameters which describe the matrix setup and the MPI configuration

   call e%set("na", na, success)                         ! size of matrix
   call e%set("local_nrows", na_rows, success)           ! MPI process local rows of the distributed matrix
   call e%set("local_ncols", na_cols, success)           ! MPI process local cols of the distributed matrix
   call e%set("nblk", nblk, success)                     ! size of block-cyclic distribution

   call e%set("mpi_comm_parent", MPI_COMM_WORLD, success) ! global communicator for all processes which have parts of
                                                          ! the distributed matrix
   call e%set("process_row", my_prow, success)           ! row coordinate of MPI task
   call e%set("process_col", my_pcol, success)           ! column coordinate of MPI task

4. set up the ELPA instance

   success = e%setup()

5. set/get any possible option (see man pages)

   call e%get("qr", qr, success)                        ! query whether QR-decomposition is set
   if (success .ne. ELPA_OK) stop
   print *, "qr =", qr

   call e%set("solver", ELPA_SOLVER_2STAGE, success)    ! set solver to 2stage
80
   if (success .ne. ELPA_OK) stop
81

   call e%set("real_kernel", ELPA_2STAGE_REAL_GENERIC, success) ! set the kernel of the ELPA 2stage solver for the
                                                                ! real case to the generic kernel

   ....

   At the moment, the following configurable runtime options are supported (a short example of setting
   options follows the list):

   "solver"       can be one of {ELPA_SOLVER_1STAGE | ELPA_SOLVER_2STAGE }
   "real_kernel"  can be one of { [real,complex]_generic | [real,complex]_generic_simple |
                                  complex_sse_block1 | [real,complex]_sse_block2 |
				  real_sse_block4 | real_sse_block6 | [real,complex]_sse_assembly |
				  complex_avx_block1 | [real,complex]_avx_block2 |
				  real_avx_block4 | real_avx_block6 |
  				  complex_avx2_block1 | [real,complex]_avx2_block2 |
				  real_avx2_block4 | real_avx2_block6 |
				  complex_avx512_block1 | [real,complex]_avx512_block2 |
				  real_avx512_block4 | real_avx512_block6 |
				  [real,complex]_bgp | [real,complex]_bgq }
		 depending on your system and the installed kernels. This can be queried with the
		 helper binary "elpa2_print_kernels"

   "qr"       can be one of { 0 | 1 }, depending whether you want to use QR decomposition in the REAL
              ELPA_SOLVER_2STAGE
   "gpu"      can be one of { 0 | 1 }, depending whether you want to use GPU acceleration (assuming your
              ELPA installation has ben build with GPU support

   "timings"  can be one of { 0 | 1 }, depending whether you want to measure times within the library calls

   "debug"    can be one of { 0 | 1 }, will give more information case of an error if set to 1


6. do the desired task with the *ELPA* library, which could be one of the following (a call sketch follows the list)
   a) e%eigenvectors                  ! solve EV problem with solver as set by "set" method; computes
                                      ! eigenvalues AND eigenvectors (replaces a) and b) from legacy API)
   b) e%eigenvalues                   ! solve EV problem with solver as set by "set" method; computes
                                      ! eigenvalues only
   c) e%cholesky                      ! do a Cholesky decomposition (replaces d) from legacy API)
   d) e%invert_triangular             ! invert a triangular matrix (replaces e) from legacy API)
   e) e%hermitian_multiply            ! multiply a**T * b or a**H * b (replaces f) and g) from legacy API)
   f) e%solve_tridiagonal             ! solve the eigenvalue problem for a tridiagonal matrix (replaces c)
                                      ! from legacy API)
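
   For instance, solving the eigenvalue problem with the solver configured in step 5 might look like the
   sketch below; the array names are illustrative (a: local part of the matrix, ev: the eigenvalues,
   z: local part of the eigenvectors):

   call e%eigenvectors(a, ev, z, success)   ! compute eigenvalues and eigenvectors of a
   if (success .ne. ELPA_OK) stop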

7. when not needed anymore, destroy the instance
   call elpa_deallocate(e)

8. when *ELPA* is not needed anymore, uninitialize the *ELPA* library
   call elpa_uninit()
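
Putting the steps together, a schematic (not complete) driver fragment could look as follows; MPI setup, the
distributed matrix a, and the arrays ev and z are assumed to exist already, and error handling is reduced to a
minimum for brevity:

   use elpa
   class(elpa_t), pointer :: e
   integer :: success

   if (elpa_init(20170403) /= ELPA_OK) then
     error stop "ELPA API version not supported"
   endif
   e => elpa_allocate()

   call e%set("na", na, success)                         ! matrix setup as in step 3
   call e%set("local_nrows", na_rows, success)
   call e%set("local_ncols", na_cols, success)
   call e%set("nblk", nblk, success)
   call e%set("mpi_comm_parent", MPI_COMM_WORLD, success)
   call e%set("process_row", my_prow, success)
   call e%set("process_col", my_pcol, success)

   success = e%setup()                                   ! step 4

   call e%set("solver", ELPA_SOLVER_2STAGE, success)     ! step 5
   call e%eigenvectors(a, ev, z, success)                ! step 6

   call elpa_deallocate(e)                               ! step 7
   call elpa_uninit()                                    ! step 8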


### Online and local documentation ###

Local documentation (via man pages) should be available (if *ELPA* has been installed with the documentation):

For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels.

An [online doxygen documentation](http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.05.001.rc1/html/index.html)
is also available for each *ELPA* release.