## Documentation on how to switch from the legacy API to the new API of the *ELPA* library ##


### Using *ELPA* from a Fortran code ###

Up to now, if you have been using the legacy API of the *ELPA* library, you had to do the following
steps (we assume that MPI and a distributed matrix in block-cyclic ScaLAPACK layout have already been created in
the user application):

1. include the *ELPA* modules

   use elpa1
   use elpa2   ! this step was only needed if you wanted to use the ELPA 2stage solver

2. call the "elpa_get_communicators" routine in order to obtain the row/column MPI communicators needed by *ELPA*

   mpierr = elpa_get_communicators(mpi_comm_world, my_prow, my_pcol, &
                                   mpi_comm_rows, mpi_comm_cols)

3. do the desired task with the *ELPA* library, which could be
  a) elpa_solve_[real|complex]_1stage_[double|single]     ! solve EV problem with ELPA 1stage solver
  b) elpa_solve_[real|complex]_2stage_[double|single]     ! solve EV problem with ELPA 2stage solver
  c) elpa_solve_tridi_[double|single]                     ! solve the eigenvalue problem for a tridiagonal matrix
  d) elpa_cholesky_[real|complex]_[double|single]         ! Cholesky decomposition
  e) elpa_invert_trm_[real|complex]_[double|single]       ! invert triangular matrix
  f) elpa_mult_at_b_real_[double|single]                  ! multiply a**T * b
  g) elpa_mult_ah_b_complex_[double|single]               ! multiply a**H * b

For each of these function calls you had to set some parameters (see man pages) to control the execution, such as
useGPU=[.false.|.true.] or the choice of the ELPA 2stage kernel. New parameters were typically added with each new
release of the *ELPA* library to reflect the growing functionality.


The new interface of *ELPA* is more generic; it does, however, require a one-time adaptation of the user code if the new
interface is to be used.

These are the new steps to do (again it is assumed that MPI and a distributed matrix in block-cyclic ScaLAPACK layout have
already been created in the user application):

1. include the correct *ELPA* module and define a name for the ELPA instance

   use elpa   ! this is the only module needed for ELPA

   class(elpa_t), pointer :: e   ! name the ELPA instance "e"

2. initialize ELPA and create the instance

   if (elpa_init(20170403) /= ELPA_OK) then
     error stop "ELPA API version not supported"
   endif

   e => elpa_allocate()

54
55
56

3. set the parameters which describe the matrix setup and its MPI distribution

   call e%set("na", na, success)                          ! size of matrix
   call e%set("local_nrows", na_rows, success)            ! MPI process local rows of the distributed matrix
   call e%set("local_ncols", na_cols, success)            ! MPI process local cols of the distributed matrix
   call e%set("nblk", nblk, success)                      ! size of block-cyclic distribution

   call e%set("mpi_comm_parent", MPI_COMM_WORLD, success) ! global communicator for all processes which have parts of
                                                          ! the distributed matrix
   call e%set("process_row", my_prow, success)            ! row coordinate of MPI task
   call e%set("process_col", my_pcol, success)            ! column coordinate of MPI task

4. setup the ELPA instance

   success = e%setup()
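
   The return value can be used to verify that the setup succeeded, for example:

      if (success .ne. ELPA_OK) then
        print *, "ELPA setup failed"   ! hypothetical error message
        stop
      endif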

5. set/get any possible option (see man pages)

   call e%get("qr", qr, success)                        ! query whether QR-decomposition is set
   print *, "qr =", qr
   if (success .ne. ELPA_OK) stop

   call e%set("solver", ELPA_SOLVER_2STAGE, success)    ! set solver to 2stage
   if (success .ne. ELPA_OK) stop

   call e%set("real_kernel", ELPA_2STAGE_REAL_GENERIC, success) ! set kernel of ELPA 2stage solver for
                                                                ! real case to the generic kernel

   ....

   At the moment, the following configurable runtime options are supported:

   "solver"          can be one of { ELPA_SOLVER_1STAGE | ELPA_SOLVER_2STAGE }
   "real_kernel" /
   "complex_kernel"  can be one of { [real,complex]_generic | [real,complex]_generic_simple |
                                     complex_sse_block1 | [real,complex]_sse_block2 |
                                     real_sse_block4 | real_sse_block6 | [real,complex]_sse_assembly |
                                     complex_avx_block1 | [real,complex]_avx_block2 |
                                     real_avx_block4 | real_avx_block6 |
                                     complex_avx2_block1 | [real,complex]_avx2_block2 |
                                     real_avx2_block4 | real_avx2_block6 |
                                     complex_avx512_block1 | [real,complex]_avx512_block2 |
                                     real_avx512_block4 | real_avx512_block6 |
                                     [real,complex]_bgp | [real,complex]_bgq }
                     depending on your system and the installed kernels. This can be queried with the
                     helper binary "elpa2_print_kernels"

   "qr"       can be one of { 0 | 1 }, depending on whether you want to use QR decomposition in the REAL
              ELPA_SOLVER_2STAGE

   "gpu"      can be one of { 0 | 1 }, depending on whether you want to use GPU acceleration (assuming your
              ELPA installation has been built with GPU support)

   "timings"  can be one of { 0 | 1 }, depending on whether you want to measure times within the library calls

   "debug"    can be one of { 0 | 1 }; if set to 1, more information will be given in case of an error


6. do the desired task with the *ELPA* library, which could be
   a) e%eigenvectors                  ! solve EV problem with solver as set by "set" method; computes eigenvalues AND eigenvectors
                                      ! (replaces a) and b) from legacy API)
   b) e%eigenvalues                   ! solve EV problem with solver as set by "set" method; computes eigenvalues only
   c) e%cholesky                      ! do a Cholesky decomposition (replaces d) from legacy API)
   d) e%invert_triangular             ! invert triangular matrix (replaces e) from legacy API)
   e) e%hermitian_multiply            ! multiply a**T * b or a**H * b (replaces f) and g) from legacy API)
   f) e%solve_tridiagonal             ! solve the eigenvalue problem for a tridiagonal matrix (replaces c) from legacy
                                      ! API)
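
   For example, solving for all eigenvalues and eigenvectors could look like the following sketch, where
   the matrix a(na_rows, na_cols), the eigenvalue array ev(na), and the eigenvector matrix
   z(na_rows, na_cols) are assumed to have been allocated by the calling code:

      call e%eigenvectors(a, ev, z, success)   ! eigenvalues are returned in ev, eigenvectors in z
      if (success .ne. ELPA_OK) stop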

7. when not needed anymore, destroy the instance
   call elpa_deallocate(e)

8. when *ELPA* is not needed anymore, uninitialize the *ELPA* library
   call elpa_uninit()
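
A complete minimal sequence combining steps 1 to 8 could thus look like the following sketch. It assumes
that MPI has been initialized and that the matrix a(na_rows, na_cols), the eigenvalue array ev(na), and
the eigenvector matrix z(na_rows, na_cols) in block-cyclic ScaLAPACK layout have already been created by
the calling code:

   use elpa
   class(elpa_t), pointer :: e
   integer :: success

   if (elpa_init(20170403) /= ELPA_OK) then
     error stop "ELPA API version not supported"
   endif
   e => elpa_allocate()

   call e%set("na", na, success)                          ! size of matrix
   call e%set("local_nrows", na_rows, success)            ! MPI process local rows
   call e%set("local_ncols", na_cols, success)            ! MPI process local cols
   call e%set("nblk", nblk, success)                      ! size of block-cyclic distribution
   call e%set("mpi_comm_parent", MPI_COMM_WORLD, success) ! global communicator
   call e%set("process_row", my_prow, success)            ! row coordinate of MPI task
   call e%set("process_col", my_pcol, success)            ! column coordinate of MPI task

   success = e%setup()

   call e%set("solver", ELPA_SOLVER_2STAGE, success)      ! use the 2stage solver

   call e%eigenvectors(a, ev, z, success)                 ! compute eigenvalues and eigenvectors
   if (success .ne. ELPA_OK) stop

   call elpa_deallocate(e)
   call elpa_uninit()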


### Online and local documentation ###

Local documentation (via man pages) should be available (if *ELPA* has been installed with the documentation):

For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels.

An [online doxygen documentation](http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.05.001.rc1/html/index.html)
for each *ELPA* release is also available.