USERS_GUIDE.md 26.9 KB
Newer Older
Andreas Marek's avatar
Andreas Marek committed
1
## Users guide for the *ELPA* library ##
Andreas Marek's avatar
Andreas Marek committed
2
3
4
5
6
7
8

This document provides the guide for using the *ELPA* library in user applications.

### Online and local documentation ###

Local documentation (via man pages) should be available (if *ELPA* has been installed with the documentation):

Andreas Marek's avatar
Andreas Marek committed
9
10
For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels.
Andreas Marek's avatar
Andreas Marek committed
11

12
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2017.05.001.rc1/html/index.html)
Andreas Marek's avatar
Andreas Marek committed
13
14
for each *ELPA* release is available.

Andreas Marek's avatar
Andreas Marek committed
15
16
17

## API of the *ELPA* library ##

18
With release 2017.05.001.rc1 of the *ELPA* library the interface has been rewritten substantially, in order to have a more
Andreas Marek's avatar
Andreas Marek committed
19
20
21
22
23
24
generic interface and avoid future interface changes.

For compatibility reasons the interface defined in the previous release 2016.11.001 is also still available
IF AND ONLY IF *ELPA* has been build with support of this legacy interface. If you want to use the legacy
interface, please look to section "B) Using the legacy API of the *ELPA* library.

25
26
The legacy API defines all the functionallity as it has been defined in *ELPA* release 2016.11.011. Note, however,
that all future features of *ELPA* will only be accessible via the new API defined in release 2017.05.001.rc1 or later.
Andreas Marek's avatar
Andreas Marek committed
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48

## A) Using the final API definition of the *ELPA* library ##

Using *ELPA* with the latest API is done in the following steps

- include elpa headers (C-Case) or use the Fortran module
- define a instance of the elpa type
- call elpa_init
- call elpa_allocate to allocate an instance of *ELPA*
- use ELPA-type function "set" to set matrix parameters
- call ELPA-type function "setup"
- set or get all possible ELPA options with ELPA-type functions get/set
- call ELPA-type function solve or others
- call ELPA-type function destroy
- call elpa_uninit



## B) Using the legacy API of the *ELPA* library ##

The following description describes the usage of the *ELPA* library with the legacy interface.

Andreas Marek's avatar
Andreas Marek committed
49
### General concept of the *ELPA* library ###
Andreas Marek's avatar
Andreas Marek committed
50

Andreas Marek's avatar
Andreas Marek committed
51
52
53
54
The *ELPA* library consists of two main parts:
- *ELPA 1stage* solver
- *ELPA 2stage* solver

55
Both variants of the *ELPA* solvers are available for real or complex singe and double precision valued matrices.
Andreas Marek's avatar
Andreas Marek committed
56

Andreas Marek's avatar
Andreas Marek committed
57
Thus *ELPA* provides the following user functions (see man pages or [online] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2016.11.001/html/index.html) for details):
Andreas Marek's avatar
Andreas Marek committed
58

59
60
61
62
63
- elpa_get_communicators                        : set the row / column communicators for *ELPA*
- elpa_solve_evp_complex_1stage_{single|double} : solve a {single|double} precision complex eigenvalue proplem with the *ELPA 1stage* solver
- elpa_solve_evp_real_1stage_{single|double}    : solve a {single|double} precision real eigenvalue proplem with the *ELPA 1stage* solver
- elpa_solve_evp_complex_2stage_{single|double} : solve a {single|double} precision complex eigenvalue proplem with the *ELPA 2stage* solver
- elpa_solve_evp_real_2stage_{single|double}    : solve a {single|double} precision real eigenvalue proplem with the *ELPA 2stage* solver
Andreas Marek's avatar
Andreas Marek committed
64
65
- elpa_solve_evp_real_{single|double}           : driver for the {single|double} precision real *ELPA 1stage* or *ELPA 2stage* solver
- elpa_solve_evp_complex_{single|double}        : driver for the {single|double} precision complex *ELPA 1stage* or *ELPA 2stage* solver
66
67
68



Andreas Marek's avatar
Andreas Marek committed
69
Furthermore *ELPA* provides the utility binary "elpa2_print_available_kernels": it tells the user
Andreas Marek's avatar
Andreas Marek committed
70
71
72
73
74
75
76
77
78
which *ELPA 2stage* compute kernels have been installed and which default kernels are set

If you want to solve an eigenvalue problem with *ELPA*, you have to decide whether you
want to use *ELPA 1stage* or *ELPA 2stage* solver. Normally, *ELPA 2stage* is the better
choice since it is faster, but there a matrix dimensions where *ELPA 1stage* is supperior.

Independent of the choice of the solver, the concept of calling *ELPA* is always the same:

#### MPI version of *ELPA* ####
Andreas Marek's avatar
Andreas Marek committed
79
80
81
82
83

In this case, *ELPA* relies on a BLACS distributed matrix.
To solve a Eigenvalue problem of this matrix with *ELPA*, one has

1. to include the *ELPA* header (C case) or module (Fortran)
84
85
2. to create row and column MPI communicators for ELPA (with "elpa_get_communicators")
3. to call to the *ELPA driver* or directly call *ELPA 1stage* or *ELPA 2stage* for the matrix.
Andreas Marek's avatar
Andreas Marek committed
86

Andreas Marek's avatar
Andreas Marek committed
87
Here is a very simple MPI code snippet for using *ELPA 1stage*: For the definition of all variables
Andreas Marek's avatar
Andreas Marek committed
88
89
please have a look at the man pages and/or the online documentation (see above). A full version
of a simple example program can be found in ./test_project/src.
Andreas Marek's avatar
Andreas Marek committed
90
91
92
93
94


   ! All ELPA routines need MPI communicators for communicating within
   ! rows or columns of processes, these are set in get_elpa_communicators

Andreas Marek's avatar
Andreas Marek committed
95
   success = elpa_get_communicators(mpi_comm_world, my_prow, my_pcol, &
Andreas Marek's avatar
Andreas Marek committed
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
                                    mpi_comm_rows, mpi_comm_cols)

   if (myid==0) then
     print '(a)','| Past split communicator setup for rows and columns.'
   end if

   ! Determine the necessary size of the distributed matrices,
   ! we use the Scalapack tools routine NUMROC for that.

   na_rows = numroc(na, nblk, my_prow, 0, np_rows)
   na_cols = numroc(na, nblk, my_pcol, 0, np_cols)

   !-------------------------------------------------------------------------------
   ! Calculate eigenvalues/eigenvectors

   if (myid==0) then
     print '(a)','| Entering one-step ELPA solver ... '
     print *
   end if

116
   success = elpa_solve_evp_real_1stage_{single|double} (na, nev, a, na_rows, ev, z, na_rows, nblk, &
Andreas Marek's avatar
Andreas Marek committed
117
                                   matrixCols, mpi_comm_rows, mpi_comm_cols)
Andreas Marek's avatar
Andreas Marek committed
118
119
120
121
122
123
124

   if (myid==0) then
     print '(a)','| One-step ELPA solver complete.'
     print *
   end if


Andreas Marek's avatar
Andreas Marek committed
125
#### Shared-memory version of *ELPA* ####
Andreas Marek's avatar
Andreas Marek committed
126

Andreas Marek's avatar
Andreas Marek committed
127
If the *ELPA* library has been compiled with the configure option "--with-mpi=0",
Andreas Marek's avatar
Andreas Marek committed
128
129
130
131
no MPI will be used.

Still the **same** call sequence as in the MPI case can be used (see above).

Andreas Marek's avatar
Andreas Marek committed
132
#### Setting the row and column communicators ####
Andreas Marek's avatar
Andreas Marek committed
133

Andreas Marek's avatar
Andreas Marek committed
134
135
136
SYNOPSIS
   FORTRAN INTERFACE
       use elpa1
137

138
       success = elpa_get_communicators (mpi_comm_global, my_prow, my_pcol, mpi_comm_rows, mpi_comm_cols)
139

Andreas Marek's avatar
Andreas Marek committed
140
141
142
143
144
145
146
147
148
149
150
       integer, intent(in)   mpi_comm_global:  global communicator for the calculation
       integer, intent(in)   my_prow:          row coordinate of the calling process in the process grid
       integer, intent(in)   my_pcol:          column coordinate of the calling process in the process grid
       integer, intent(out)  mpi_comm_row:     communicator for communication within rows of processes
       integer, intent(out)  mpi_comm_row:     communicator for communication within columns of processes

       integer               success:          return value indicating success or failure of the underlying MPI_COMM_SPLIT function

   C INTERFACE
       #include "elpa_generated.h"

151
       success = elpa_get_communicators (int mpi_comm_world, int my_prow, my_pcol, int *mpi_comm_rows, int *Pmpi_comm_cols);
Andreas Marek's avatar
Andreas Marek committed
152
153
154
155
156
157
158
159
160
161
162
163

       int mpi_comm_global:  global communicator for the calculation
       int my_prow:          row coordinate of the calling process in the process grid
       int my_pcol:          column coordinate of the calling process in the process grid
       int *mpi_comm_row:    pointer to the communicator for communication within rows of processes
       int *mpi_comm_row:    pointer to the communicator for communication within columns of processes

       int  success:         return value indicating success or failure of the underlying MPI_COMM_SPLIT function


#### Using *ELPA 1stage* ####

164
After setting up the *ELPA* row and column communicators (by calling elpa_get_communicators),
Andreas Marek's avatar
Andreas Marek committed
165
166
167
168
169
only the real or complex valued solver has to be called:

SYNOPSIS
   FORTRAN INTERFACE
       use elpa1
170
       success = elpa_solve_evp_real_1stage_{single|double} (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
Andreas Marek's avatar
Andreas Marek committed
171
172
173
174
175
176
       mpi_comm_cols)

       With the definintions of the input and output variables:

       integer, intent(in)    na:            global dimension of quadratic matrix a to solve
       integer, intent(in)    nev:           number of eigenvalues to be computed; the first nev eigenvalules are calculated
177
       real*{4|8},  intent(inout) a:         locally distributed part of the matrix a. The local dimensions are lda x matrixCols
Andreas Marek's avatar
Andreas Marek committed
178
       integer, intent(in)    lda:           leading dimension of locally distributed matrix a
179
180
       real*{4|8},  intent(inout) ev:        on output the first nev computed eigenvalues
       real*{4|8},  intent(inout) q:         on output the first nev computed eigenvectors
Andreas Marek's avatar
Andreas Marek committed
181
182
183
184
185
186
187
188
189
190
191
       integer, intent(in)    ldq:           leading dimension of matrix q which stores the eigenvectors
       integer, intent(in)    nblk:          blocksize of block cyclic distributin, must be the same in both directions
       integer, intent(in)    matrixCols:    number of columns of locally distributed matrices a and q
       integer, intent(in)    mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
       integer, intent(in)    mpi_comm_cols: communicator for communication in colums. Constructed with get_elpa_communicators(3)

       logical                success:       return value indicating success or failure

   C INTERFACE
       #include "elpa.h"

192
       success = elpa_solve_evp_real_1stage_{single|double} (int na, int nev,  double *a, int lda,  double *ev, double *q, int ldq, int nblk, int matrixCols, int
Andreas Marek's avatar
Andreas Marek committed
193
194
195
196
197
198
       mpi_comm_rows, int mpi_comm_cols);

       With the definintions of the input and output variables:

       int     na:            global dimension of quadratic matrix a to solve
       int     nev:           number of eigenvalues to be computed; the first nev eigenvalules are calculated
199
       {float|double} *a:     pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
Andreas Marek's avatar
Andreas Marek committed
200
       int     lda:           leading dimension of locally distributed matrix a
201
202
       {float|double} *ev:    pointer to memory containing on output the first nev computed eigenvalues
       {float|double} *q:     pointer to memory containing on output the first nev computed eigenvectors
Andreas Marek's avatar
Andreas Marek committed
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
       int     ldq:           leading dimension of matrix q which stores the eigenvectors
       int     nblk:          blocksize of block cyclic distributin, must be the same in both directions
       int     matrixCols:    number of columns of locally distributed matrices a and q
       int     mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
       int     mpi_comm_cols: communicator for communication in colums. Constructed with get_elpa_communicators(3)

       int     success:       return value indicating success (1) or failure (0)

DESCRIPTION
       Solve the real eigenvalue problem with the 1-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
       get_elpa_communicators(3) function. The distributed quadratic marix a has global dimensions na x na, and a local size lda x matrixCols.
       The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
       will be stored in q. All memory of the arguments must be allocated outside the call to the solver.

   FORTRAN INTERFACE
       use elpa1
219
       success = elpa_solve_evp_complex_1stage_{single|double} (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
Andreas Marek's avatar
Andreas Marek committed
220
221
222
223
224
225
       mpi_comm_cols)

       With the definintions of the input and output variables:

       integer,     intent(in)    na:            global dimension of quadratic matrix a to solve
       integer,     intent(in)    nev:           number of eigenvalues to be computed; the first nev eigenvalules are calculated
226
       complex*{8|16},  intent(inout) a:         locally distributed part of the matrix a. The local dimensions are lda x matrixCols
Andreas Marek's avatar
Andreas Marek committed
227
       integer,     intent(in)    lda:           leading dimension of locally distributed matrix a
228
229
       real*{4|8},      intent(inout) ev:        on output the first nev computed eigenvalues
       complex*{8|16},  intent(inout) q:         on output the first nev computed eigenvectors
Andreas Marek's avatar
Andreas Marek committed
230
231
232
233
234
235
236
237
238
239
240
241
       integer,     intent(in)    ldq:           leading dimension of matrix q which stores the eigenvectors
       integer,     intent(in)    nblk:          blocksize of block cyclic distributin, must be the same in both directions
       integer,     intent(in)    matrixCols:    number of columns of locally distributed matrices a and q
       integer,     intent(in)    mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
       integer, intent(in)        mpi_comm_cols: communicator for communication in colums. Constructed with get_elpa_communicators(3)

       logical                    success:       return value indicating success or failure

   C INTERFACE
       #include "elpa.h"
       #include <complex.h>

242
       success = elpa_solve_evp_complex_1stage_{single|double} (int na, int nev,  double complex *a, int lda,  double *ev, double complex*q, int ldq, int nblk, int
Andreas Marek's avatar
Andreas Marek committed
243
244
245
246
247
248
       matrixCols, int mpi_comm_rows, int mpi_comm_cols);

       With the definintions of the input and output variables:

       int             na:            global dimension of quadratic matrix a to solve
       int             nev:           number of eigenvalues to be computed; the first nev eigenvalules are calculated
249
       {float|double} complex *a:     pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
Andreas Marek's avatar
Andreas Marek committed
250
       int             lda:           leading dimension of locally distributed matrix a
251
252
       {float|double}         *ev:    pointer to memory containing on output the first nev computed eigenvalues
       {float|double} complex *q:     pointer to memory containing on output the first nev computed eigenvectors
Andreas Marek's avatar
Andreas Marek committed
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
       int             ldq:           leading dimension of matrix q which stores the eigenvectors
       int             nblk:          blocksize of block cyclic distributin, must be the same in both directions
       int             matrixCols:    number of columns of locally distributed matrices a and q
       int             mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
       int             mpi_comm_cols: communicator for communication in colums. Constructed with get_elpa_communicators(3)

       int             success:       return value indicating success (1) or failure (0)

DESCRIPTION
       Solve the complex eigenvalue problem with the 1-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
       get_elpa_communicators(3) function. The distributed quadratic marix a has global dimensions na x na, and a local size lda x matrixCols.
       The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
       will be stored in q. All memory of the arguments must be allocated outside the call to the solver.


The *ELPA 1stage* solver, does not need or accept any other parameters than in the above
specification.

#### Using *ELPA 2stage* ####

The *ELPA 2stage* solver can be used in the same manner, as the *ELPA 1stage* solver.
However, the 2 stage solver, can be used with different compute kernels, which offers
more possibilities for configuration.

It is recommended to first call the utillity program

elpa2_print_kernels

which will tell all the compute kernels that can be used with *ELPA 2stage*". It will
also give information, whether a kernel can be set via environment variables.

##### Using the default kernels #####

If no kernel is set either via an environment variable or the *ELPA 2stage API* then
the default kernels will be set.

##### Setting the *ELPA 2stage* compute kernels #####

291
292
##### Setting the *ELPA 2stage* compute kernels with environment variables#####

Andreas Marek's avatar
Andreas Marek committed
293
294
295
296
If the *ELPA* installation allows setting ther compute kernels with enviroment variables,
setting the variables "REAL_ELPA_KERNEL" and "COMPLEX_ELPA_KERNEL" will set the compute
kernels. The environment variable setting will take precedence over all other settings!

297
298
299
300
301
302
The utility program "elpa2_print_kernels" can list which kernels are available and which
would be choosen. This reflects, as well the setting of the default kernel or the settings
with the environment variables

##### Setting the *ELPA 2stage* compute kernels with API calls#####

Andreas Marek's avatar
Andreas Marek committed
303
304
It is also possible to set the *ELPA 2stage* compute kernels via the API.

305
306
As an example the API for ELPA real double-precision 2stage is shown:

Andreas Marek's avatar
Andreas Marek committed
307
308
SYNOPSIS
   FORTRAN INTERFACE
Andreas Marek's avatar
Andreas Marek committed
309
310
       use elpa1
       use elpa2
311
312
       success = elpa_solve_evp_real_2stage_double (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows,
       mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL, useQR, useGPU)
Andreas Marek's avatar
Andreas Marek committed
313
314
315
316
317

       With the definintions of the input and output variables:

       integer, intent(in)            na:            global dimension of quadratic matrix a to solve
       integer, intent(in)            nev:           number of eigenvalues to be computed; the first nev eigenvalules are calculated
318
       real*{4|8},  intent(inout)         a:         locally distributed part of the matrix a. The local dimensions are lda x matrixCols
Andreas Marek's avatar
Andreas Marek committed
319
       integer, intent(in)            lda:           leading dimension of locally distributed matrix a
320
321
       real*{4|8},  intent(inout)         ev:        on output the first nev computed eigenvalues
       real*{4|8},  intent(inout)         q:         on output the first nev computed eigenvectors
Andreas Marek's avatar
Andreas Marek committed
322
323
324
325
326
327
328
       integer, intent(in)            ldq:           leading dimension of matrix q which stores the eigenvectors
       integer, intent(in)            nblk:          blocksize of block cyclic distributin, must be the same in both directions
       integer, intent(in)            matrixCols:    number of columns of locally distributed matrices a and q
       integer, intent(in)            mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
       integer, intent(in)            mpi_comm_cols: communicator for communication in colums. Constructed with get_elpa_communicators(3)
       integer, intent(in)            mpi_comm_all:  communicator for all processes in the processor set involved in ELPA
       logical, intent(in), optional: useQR:         optional argument; switches to QR-decomposition if set to .true.
329
       logical, intent(in), optional: useGPU:        decide whether GPUs should be used ore not
Andreas Marek's avatar
Andreas Marek committed
330
331
332
333
334
335

      logical                        success:       return value indicating success or failure

   C INTERFACE
       #include "elpa.h"

336
337
       success = elpa_solve_evp_real_2stage_double (int na, int nev,  double *a, int lda,  double *ev, double *q, int ldq, int nblk, int matrixCols, int
       mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL, int useQR, int useGPU);
Andreas Marek's avatar
Andreas Marek committed
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353

       With the definintions of the input and output variables:

       int     na:            global dimension of quadratic matrix a to solve
       int     nev:           number of eigenvalues to be computed; the first nev eigenvalules are calculated
       double *a:             pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols
       int     lda:           leading dimension of locally distributed matrix a
       double *ev:            pointer to memory containing on output the first nev computed eigenvalues
       double *q:             pointer to memory containing on output the first nev computed eigenvectors
       int     ldq:           leading dimension of matrix q which stores the eigenvectors
       int     nblk:          blocksize of block cyclic distributin, must be the same in both directions
       int     matrixCols:    number of columns of locally distributed matrices a and q
       int     mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
       int     mpi_comm_cols: communicator for communication in colums. Constructed with get_elpa_communicators(3)
       int     mpi_comm_all:  communicator for all processes in the processor set involved in ELPA
       int     useQR:         if set to 1 switch to QR-decomposition
354
       int     useGPU:        decide whether the GPU version should be used or not
Andreas Marek's avatar
Andreas Marek committed
355
356
357
358
359
360
361
362
363
364

       int     success:       return value indicating success (1) or failure (0)


DESCRIPTION
       Solve the real eigenvalue problem with the 2-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
       get_elpa_communicators(3) function. The distributed quadratic marix a has global dimensions na x na, and a local size lda x matrixCols.
       The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
       will be stored in q. All memory of the arguments must be allocated outside the call to the solver.

365
366
##### Setting up *ELPA 1stage* or *ELPA 2stage* with the *ELPA driver interface* #####

367
368
369
Since release ELPA 2016.005.004 a driver routine allows to choose more easily which solver (1stage or 2stage) will be used.

As an exmple the real double-precision case is explained:
370
371
372
373
374

 SYNOPSIS

 FORTRAN INTERFACE

Andreas Marek's avatar
Andreas Marek committed
375
  use elpa_driver
376

377
  success = elpa_solve_evp_real_double (na, nev, a(lda,matrixCols), ev(nev), q(ldq, matrixCols), ldq, nblk, matrixCols, mpi_comm_rows, mpi_comm_cols, mpi_comm_all, THIS_REAL_ELPA_KERNEL=THIS_REAL_ELPA_KERNEL, useQR, useGPU, method=method)
378
379
380
381
382
383

  Generalized interface to the ELPA 1stage and 2stage solver for real-valued problems

  With the definintions of the input and output variables:


Andreas Marek's avatar
Andreas Marek committed
384
  integer, intent(in)            na:                    global dimension of quadratic matrix a to solve
385

Andreas Marek's avatar
Andreas Marek committed
386
  integer, intent(in)            nev:                   number of eigenvalues to be computed; the first nev eigenvalules are calculated
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411

  real*8,  intent(inout)         a:                     locally distributed part of the matrix a. The local dimensions are lda x matrixCols

  integer, intent(in)            lda:                   leading dimension of locally distributed matrix a

  real*8,  intent(inout)         ev:                    on output the first nev computed eigenvalues"

  real*8,  intent(inout)         q:                     on output the first nev computed eigenvectors"

  integer, intent(in)            ldq:                   leading dimension of matrix q which stores the eigenvectors

  integer, intent(in)            nblk:                  blocksize of block cyclic distributin, must be the same in both directions

  integer, intent(in)            matrixCols:            number of columns of locally distributed matrices a and q

  integer, intent(in)            mpi_comm_rows:         communicator for communication in rows. Constructed with elpa_get_communicators

  integer, intent(in)            mpi_comm_cols:         communicator for communication in colums. Constructed with elpa_get_communicators

  integer, intent(in)            mpi_comm_all:          communicator for all processes in the processor set involved in ELPA

  integer, intent(in), optional: THIS_REAL_ELPA_KERNEL: optional argument, choose the compute kernel for 2-stage solver

  logical, intent(in), optional: useQR:                 optional argument; switches to QR-decomposition if set to .true.

412
413
  logical, intent(in), optional: useQPU:                decide whether the GPU version should be used or not

414
415
416
417
418
419
420
421
422
  character(*), optional         method:                use 1stage solver if "1stage", use 2stage solver if "2stage", (at the moment) use 2stage solver if "auto"

  logical                        success:               return value indicating success or failure


 C INTERFACE

 #include "elpa.h"

423
 success = elpa_solve_evp_real_double (int na, int nev, double *a, int lda, double *ev, double *q, int ldq, int nblk, int matrixCols, int mpi_comm_rows, int mpi_comm_cols, int mpi_comm_all, int THIS_ELPA_REAL_KERNEL, int useQR, int useGPU, char *method);"
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456


 With the definintions of the input and output variables:"


 int     na:                    global dimension of quadratic matrix a to solve

 int     nev:                   number of eigenvalues to be computed; the first nev eigenvalules are calculated

 double *a:                     pointer to locally distributed part of the matrix a. The local dimensions are lda x matrixCols

 int     lda:                   leading dimension of locally distributed matrix a

 double *ev:                    pointer to memory containing on output the first nev computed eigenvalues

 double *q:                     pointer to memory containing on output the first nev computed eigenvectors

 int     ldq:                   leading dimension of matrix q which stores the eigenvectors

 int     nblk:                  blocksize of block cyclic distributin, must be the same in both directions

 int     matrixCols:            number of columns of locally distributed matrices a and q

 int     mpi_comm_rows:         communicator for communication in rows. Constructed with elpa_get_communicators

 int     mpi_comm_cols:         communicator for communication in colums. Constructed with elpa_get_communicators

 int     mpi_comm_all:          communicator for all processes in the processor set involved in ELPA

 int     THIS_ELPA_REAL_KERNEL: choose the compute kernel for 2-stage solver

 int     useQR:                 if set to 1 switch to QR-decomposition

457
458
 int     useGPU:                decide whether the GPU version should be used or not

459
460
461
462
463
464
465
 char   *method:                use 1stage solver if "1stage", use 2stage solver if "2stage", (at the moment) use 2stage solver if "auto"

 int     success:               return value indicating success (1) or failure (0)

 DESCRIPTION
 Solve the real eigenvalue problem. The value of method desides whether the 1stage or 2stage solver is used. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the elpa_get_communicators function. The distributed quadratic marix a has global dimensions na x na, and a local size lda x matrixCols. The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues will be stored in q. All memory of the arguments must be allocated outside the call to the solver.

466
467
468
469
470
471
472
473
474
475
476
477
478
479
##### Setting up the GPU version of *ELPA* 1 and 2 stage #####

Since release ELPA 2016.011.001.pre *ELPA* offers GPU support, IF *ELPA* has been build with the configure option "--enabble-gpu-support".

At run-time the GPU version can be used by setting the environment variable "ELPA_USE_GPU" to "yes", or by calling the *ELPA* functions
(elpa_solve_evp_real_{double|single}, elpa_solve_evp_real_1stage_{double|single}, elpa_solve_evp_real_2stage_{double|single}) with the
argument "useGPU = .true." or "useGPU = 1" for the Fortran and C case, respectively. Please, not that similiar to the choice of the
*ELPA* 2stage compute kernels, the enviroment variable takes precendence over the setting in the API call.

Further note that it is NOT allowed to define the usage of GPUs AND to EXPLICITLY set an ELPA 2stage compute kernel other than
"REAL_ELPA_KERNEL_GPU" or "COMPLEX_ELPA_KERNEL_GPU".