## Documentation on how to switch from the legacy API to the new API of the *ELPA* library ##


### Using *ELPA* from a Fortran code ###

Up to now, if you have been using the (legacy API of the) *ELPA* library, you had to do the following
steps (we assume that MPI and a distributed matrix in a block-cyclic ScaLAPACK layout have already been created in
the user application):

1. include the *ELPA* modules

   use elpa1
   use elpa2   ! this step was only needed if you wanted to use the ELPA 2stage solver

2. call the "elpa_get_communicators" routine to obtain the row/column MPI communicators needed by *ELPA*

   mpierr = elpa_get_communicators(mpi_comm_world, my_prow, my_pcol, &
                                   mpi_comm_rows, mpi_comm_cols)

3. do the desired task with the *ELPA* library, which could be
  a) elpa_solve_[real|complex]_1stage_[double|single]     ! solve EV problem with ELPA 1stage solver
  b) elpa_solve_[real|complex]_2stage_[double|single]     ! solve EV problem with ELPA 2stage solver
  c) elpa_solve_tridi_[double|single]                     ! solve the eigenvalue problem for a tridiagonal matrix
  d) elpa_cholesky_[real|complex]_[double|single]         ! Cholesky decomposition
  e) elpa_invert_trm_[real|complex]_[double|single]       ! invert triangular matrix
  f) elpa_mult_at_b_real_[double|single]                  ! multiply a**T * b
  g) elpa_mult_ah_b_complex_[double|single]               ! multiply a**H * b

For each of these function calls you had to set some parameters (see man pages) to control the execution, such as
useGPU=[.false.|.true.] or the choice of the ELPA 2stage kernel. New parameters were likely added with each new release of
the *ELPA* library to reflect the growing functionality.
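
As an illustration, a call to the legacy 1stage real double-precision solver might have looked roughly like the
following sketch; the argument names and their order here are assumptions for illustration only, the authoritative
interface is documented in the man pages:

   ! hedged sketch of a legacy call; the argument list is an assumption, see the man pages
   success = elpa_solve_real_1stage_double(na, nev, a, lda, ev, z, ldz, nblk, &
                                           na_cols, mpi_comm_rows, mpi_comm_cols)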


The new interface of *ELPA* is more generic; it does, however, require a one-time adaptation of the user code if the new
interface is to be used.

These are the new steps (again, it is assumed that MPI and a distributed matrix in a block-cyclic ScaLAPACK layout have
already been created in the user application):

1. include the correct *ELPA* module and define a name for the ELPA instance

   use elpa   ! this is the only module needed for ELPA

   class(elpa_t), pointer :: e   ! name the ELPA instance "e"

2. initialize ELPA and create the instance

   if (elpa_init(20170403) /= ELPA_OK) then
     error stop "ELPA API version not supported"
   endif

   e => elpa_allocate()


3. set the parameters which describe the matrix setup and the MPI setup

   call e%set("na", na, success)                           ! size of matrix
   call e%set("local_nrows", na_rows, success)             ! MPI process local rows of the distributed matrix
   call e%set("local_ncols", na_cols, success)             ! MPI process local cols of the distributed matrix
   call e%set("nblk", nblk, success)                       ! size of block-cyclic distribution

   call e%set("mpi_comm_parent", MPI_COMM_WORLD, success)  ! global communicator for all processes which have parts of
                                                           ! the distributed matrix
   call e%set("process_row", my_prow, success)             ! row coordinate of MPI task
   call e%set("process_col", my_pcol, success)             ! column coordinate of MPI task

4. setup the ELPA instance

   success = e%setup()
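
It is good practice to also check the return value of the setup call; a minimal sketch, using the ELPA_OK error code:

   if (success /= ELPA_OK) then
     error stop "ELPA setup failed"
   endif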

5. set/get any possible option (see man pages)

   call e%get("qr", qr, success)                        ! query whether QR-decomposition is set
   print *, "qr =", qr
   if (success .ne. ELPA_OK) stop

   call e%set("solver", ELPA_SOLVER_2STAGE, success)    ! set solver to 2stage
   if (success .ne. ELPA_OK) stop

   call e%set("real_kernel", ELPA_2STAGE_REAL_GENERIC, success) ! set kernel of ELPA 2stage solver for
                                                                ! real case to the generic kernel

   ....

   At the moment, the following configurable runtime options are supported:

   "solver"       can be one of { ELPA_SOLVER_1STAGE | ELPA_SOLVER_2STAGE }
   "real_kernel"  can be one of { [real,complex]_generic | [real,complex]_generic_simple |
                                  complex_sse_block1 | [real,complex]_sse_block2 |
                                  real_sse_block4 | real_sse_block6 | [real,complex]_sse_assembly |
                                  complex_avx_block1 | [real,complex]_avx_block2 |
                                  real_avx_block4 | real_avx_block6 |
                                  complex_avx2_block1 | [real,complex]_avx2_block2 |
                                  real_avx2_block4 | real_avx2_block6 |
                                  complex_avx512_block1 | [real,complex]_avx512_block2 |
                                  real_avx512_block4 | real_avx512_block6 |
                                  [real,complex]_bgp | [real,complex]_bgq }
                  depending on your system and the installed kernels. This can be queried with the
                  helper binary "elpa2_print_kernels"

   "qr"       can be one of { 0 | 1 }, depending on whether you want to use QR decomposition in the REAL
              ELPA_SOLVER_2STAGE
   "gpu"      can be one of { 0 | 1 }, depending on whether you want to use GPU acceleration (assuming your
              ELPA installation has been built with GPU support)

   "timings"  can be one of { 0 | 1 }, depending on whether you want to measure times within the library calls

   "debug"    can be one of { 0 | 1 }; if set to 1, more information will be given in case of an error


6. do the desired task with the *ELPA* library (an example call is sketched below the list), which could be

   a) e%eigenvectors                  ! solve EV problem with solver as set by "set" method; computes eigenvalues AND eigenvectors
                                      ! (replaces a) and b) from legacy API)
   b) e%eigenvalues                   ! solve EV problem with solver as set by "set" method; computes eigenvalues only
   c) e%cholesky                      ! do a Cholesky decomposition (replaces d) from legacy API)
   d) e%invert_triangular             ! invert triangular matrix (replaces e) from legacy API)
   e) e%hermitian_multiply            ! multiply a**T * b or a**H * b (replaces f) and g) from legacy API)
   f) e%solve_tridiagonal             ! solve the eigenvalue problem for a tridiagonal matrix (replaces c) from legacy
                                      ! API)
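
For example, solving the eigenvalue problem with the solver selected in step 5 could look like the following
sketch, where a is the distributed matrix, ev the array for the eigenvalues, and z the matrix for the
eigenvectors, all set up by the user application:

   call e%eigenvectors(a, ev, z, success)   ! computes eigenvalues AND eigenvectors
   if (success .ne. ELPA_OK) stop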

7. when not needed anymore, destroy the instance
   call elpa_deallocate(e)

8. when *ELPA* is not needed anymore, uninitialize the *ELPA* library

   call elpa_uninit()
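
Putting steps 1 to 8 together, a minimal use of the new interface could look like the following sketch; the
declarations of the matrix a, the arrays ev and z, and the block-cyclic distribution (na, na_rows, na_cols,
nblk, my_prow, my_pcol) are assumed to exist in the user application as before:

   use elpa
   class(elpa_t), pointer :: e
   integer :: success

   if (elpa_init(20170403) /= ELPA_OK) then     ! step 2: initialize ELPA and create the instance
     error stop "ELPA API version not supported"
   endif
   e => elpa_allocate()

   call e%set("na", na, success)                ! step 3: describe the matrix and MPI setup
   call e%set("local_nrows", na_rows, success)
   call e%set("local_ncols", na_cols, success)
   call e%set("nblk", nblk, success)
   call e%set("mpi_comm_parent", MPI_COMM_WORLD, success)
   call e%set("process_row", my_prow, success)
   call e%set("process_col", my_pcol, success)

   success = e%setup()                          ! step 4: setup the instance

   call e%set("solver", ELPA_SOLVER_2STAGE, success)   ! step 5: choose the solver
   call e%eigenvectors(a, ev, z, success)              ! step 6: solve the eigenvalue problem

   call elpa_deallocate(e)                      ! step 7: destroy the instance
   call elpa_uninit()                           ! step 8: uninitialize the library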


### Online and local documentation ###

Local documentation (via man pages) should be available (if *ELPA* has been installed with the documentation):

For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels.

Also an [online doxygen documentation](http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2017.05.002/html/index.html)
for each *ELPA* release is available.