Commit 0891ae1d authored by Andreas Marek

Merge branch 'master_pre_stage' into loh/autotuning

parents 5dff56a7 4e7b3126
......@@ -17,12 +17,12 @@ following listed interfaces will be removed at some time.
In order to unify the namespace of the *ELPA* public interfaces, several interfaces
have been replaced by new names. The old interfaces will be removed
Deprecated interface      Replacement
===================================================
get_elpa_row_col_coms     elpa_get_communicators
get_elpa_communicators    elpa_get_communicators
solve_evp_real            elpa_solve_evp_real_1stage_double
solve_evp_complex         elpa_solve_evp_complex_1stage_double
Deprecated interface      Replacement                           Comment
================================================================================
get_elpa_row_col_coms     elpa_get_communicators                (removed)
get_elpa_communicators    elpa_get_communicators                (removed)
solve_evp_real            elpa_solve_evp_real_1stage_double     (removed)
solve_evp_complex         elpa_solve_evp_complex_1stage_double  (removed)
solve_evp_real_1stage     elpa_solve_evp_real_1stage_double
solve_evp_complex_1stage  elpa_solve_evp_complex_1stage_double
solve_evp_real_2stage     elpa_solve_evp_real_2stage_double
......
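As an illustration of how the rename table above could be applied to calling code, here is a minimal Python sketch. It is not part of the commit: the `REPLACEMENTS` dictionary only copies the rows visible in this hunk (the table is truncated in the diff), and the helper name `modernize` is hypothetical.

```python
import re

# Rows copied from the visible part of the deprecation table; the real table
# continues beyond this hunk, so this mapping is incomplete by construction.
REPLACEMENTS = {
    "get_elpa_row_col_coms": "elpa_get_communicators",
    "get_elpa_communicators": "elpa_get_communicators",
    "solve_evp_real": "elpa_solve_evp_real_1stage_double",
    "solve_evp_complex": "elpa_solve_evp_complex_1stage_double",
    "solve_evp_real_1stage": "elpa_solve_evp_real_1stage_double",
    "solve_evp_complex_1stage": "elpa_solve_evp_complex_1stage_double",
    "solve_evp_real_2stage": "elpa_solve_evp_real_2stage_double",
}

# Match whole identifiers only, so "solve_evp_real" inside
# "elpa_solve_evp_real_1stage_double" is left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in REPLACEMENTS) + r")\b"
)


def modernize(source: str) -> str:
    """Replace deprecated interface names by their documented replacements."""
    return _PATTERN.sub(lambda match: REPLACEMENTS[match.group(1)], source)
```

Because underscores count as word characters, the `\b` anchors keep already-renamed identifiers untouched, so running the helper twice gives the same result as running it once.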
......@@ -456,9 +456,7 @@ dist_man_MANS = \
if ENABLE_LEGACY
dist_man_MANS += \
man/solve_evp_real.3 \
man/solve_evp_real_1stage_double.3 \
man/solve_evp_complex.3 \
man/solve_evp_complex_1stage_double.3 \
man/solve_evp_real_2stage_double.3 \
man/solve_evp_complex_2stage_double.3 \
......@@ -466,8 +464,7 @@ dist_man_MANS += \
man/elpa_solve_evp_complex_1stage_double.3 \
man/elpa_solve_evp_real_2stage_double.3 \
man/elpa_solve_evp_complex_2stage_double.3 \
man/get_elpa_row_col_comms.3 \
man/get_elpa_communicators.3 \
man/elpa_get_communicators.3 \
man/elpa_mult_at_b_real_double.3 \
man/elpa_mult_at_b_real_single.3 \
man/elpa_mult_ah_b_complex_double.3 \
......@@ -479,7 +476,11 @@ dist_man_MANS += \
man/elpa_solve_evp_real_double.3 \
man/elpa_solve_evp_real_single.3 \
man/elpa_solve_evp_complex_double.3 \
man/elpa_solve_evp_complex_single.3
man/elpa_solve_evp_complex_single.3 \
man/elpa_autotune_setup.3 \
man/elpa_autotune_step.3 \
man/elpa_autotune_set_best.3 \
man/elpa_autotune_deallocate.3
endif
......
......@@ -125,7 +125,7 @@ of a simple example program can be found in ./test_project/src.
! All ELPA routines need MPI communicators for communicating within
! rows or columns of processes, these are set in get_elpa_communicators
! rows or columns of processes, these are set in elpa_get_communicators
success = elpa_get_communicators(mpi_comm_world, my_prow, my_pcol, &
mpi_comm_rows, mpi_comm_cols)
......@@ -216,8 +216,8 @@ SYNOPSIS
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with elpa_get_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with elpa_get_communicators(3)
logical success: return value indicating success or failure
......@@ -238,14 +238,14 @@ SYNOPSIS
int ldq: leading dimension of matrix q which stores the eigenvectors
int nblk: blocksize of block cyclic distribution, must be the same in both directions
int matrixCols: number of columns of locally distributed matrices a and q
int mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
int mpi_comm_rows: communicator for communication in rows. Constructed with elpa_get_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with elpa_get_communicators(3)
int success: return value indicating success (1) or failure (0)
DESCRIPTION
Solve the real eigenvalue problem with the 1-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
get_elpa_communicators(3) function. The distributed square matrix a has global dimensions na x na, and a local size lda x matrixCols.
elpa_get_communicators(3) function. The distributed square matrix a has global dimensions na x na, and a local size lda x matrixCols.
The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
will be stored in q. All memory of the arguments must be allocated outside the call to the solver.
......@@ -265,8 +265,8 @@ DESCRIPTION
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with elpa_get_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with elpa_get_communicators(3)
logical success: return value indicating success or failure
......@@ -288,14 +288,14 @@ DESCRIPTION
int ldq: leading dimension of matrix q which stores the eigenvectors
int nblk: blocksize of block cyclic distribution, must be the same in both directions
int matrixCols: number of columns of locally distributed matrices a and q
int mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
int mpi_comm_rows: communicator for communication in rows. Constructed with elpa_get_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with elpa_get_communicators(3)
int success: return value indicating success (1) or failure (0)
DESCRIPTION
Solve the complex eigenvalue problem with the 1-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
get_elpa_communicators(3) function. The distributed square matrix a has global dimensions na x na, and a local size lda x matrixCols.
elpa_get_communicators(3) function. The distributed square matrix a has global dimensions na x na, and a local size lda x matrixCols.
The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
will be stored in q. All memory of the arguments must be allocated outside the call to the solver.
......@@ -357,8 +357,8 @@ SYNOPSIS
integer, intent(in) ldq: leading dimension of matrix q which stores the eigenvectors
integer, intent(in) nblk: blocksize of block cyclic distribution, must be the same in both directions
integer, intent(in) matrixCols: number of columns of locally distributed matrices a and q
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
integer, intent(in) mpi_comm_rows: communicator for communication in rows. Constructed with elpa_get_communicators(3)
integer, intent(in) mpi_comm_cols: communicator for communication in columns. Constructed with elpa_get_communicators(3)
integer, intent(in) mpi_comm_all: communicator for all processes in the processor set involved in ELPA
logical, intent(in), optional: useQR: optional argument; switches to QR-decomposition if set to .true.
logical, intent(in), optional: useGPU: decide whether GPUs should be used or not
......@@ -382,8 +382,8 @@ SYNOPSIS
int ldq: leading dimension of matrix q which stores the eigenvectors
int nblk: blocksize of block cyclic distribution, must be the same in both directions
int matrixCols: number of columns of locally distributed matrices a and q
int mpi_comm_rows: communicator for communication in rows. Constructed with get_elpa_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with get_elpa_communicators(3)
int mpi_comm_rows: communicator for communication in rows. Constructed with elpa_get_communicators(3)
int mpi_comm_cols: communicator for communication in columns. Constructed with elpa_get_communicators(3)
int mpi_comm_all: communicator for all processes in the processor set involved in ELPA
int useQR: if set to 1 switch to QR-decomposition
int useGPU: decide whether the GPU version should be used or not
......@@ -393,7 +393,7 @@ SYNOPSIS
DESCRIPTION
Solve the real eigenvalue problem with the 2-stage solver. The ELPA communicators mpi_comm_rows and mpi_comm_cols are obtained with the
get_elpa_communicators(3) function. The distributed square matrix a has global dimensions na x na, and a local size lda x matrixCols.
elpa_get_communicators(3) function. The distributed square matrix a has global dimensions na x na, and a local size lda x matrixCols.
The solver will compute the first nev eigenvalues, which will be stored on exit in ev. The eigenvectors corresponding to the eigenvalues
will be stored in q. All memory of the arguments must be allocated outside the call to the solver.
......
......@@ -971,6 +971,49 @@ if test x"${USE_ASSUMED_SIZE}" = x"yes" ; then
AC_DEFINE([USE_ASSUMED_SIZE],[1],[for performance reasons use assumed size Fortran arrays, even if not debuggable])
fi
enable_fortran2008_features=yes
AC_MSG_CHECKING(whether Fortran2008 features should be enabled)
AC_ARG_ENABLE([Fortran2008-features],
AS_HELP_STRING([--enable-Fortran2008-features],
[enables some Fortran 2008 features, default yes.]),
[
if test x"$enableval" = x"yes"; then
enable_fortran2008_features=yes
else
enable_fortran2008_features=no
fi
],
[])
AC_MSG_RESULT([${enable_fortran2008_features}])
AM_CONDITIONAL([USE_FORTRAN2008],[test x"$enable_fortran2008_features" = x"yes"])
if test x"${enable_fortran2008_features}" = x"yes"; then
AC_DEFINE([USE_FORTRAN2008], [1], [use some Fortran 2008 features])
fi
enable_kcomputer=no
AC_MSG_CHECKING(whether we build for K-computer)
AC_ARG_ENABLE([K-computer],
AS_HELP_STRING([--enable-K-computer],
[enable builds on K-Computer, default no.]),
[if test x"$enableval" = x"yes"; then
enable_kcomputer=yes
else
enable_kcomputer=no
fi],
[enable_kcomputer=no])
AC_MSG_RESULT([${enable_kcomputer}])
AM_CONDITIONAL([BUILD_KCOMPUTER],[test x"$enable_kcomputer" = x"yes"])
if test x"${enable_kcomputer}" = x"yes"; then
AC_DEFINE([BUILD_KCOMPUTER], [1], [build for K-Computer])
FC_MODINC="-I"
if test x"${USE_ASSUMED_SIZE}" = x"yes" ; then
AC_MSG_ERROR(on K-computer you have to switch off assumed-size arrays!)
fi
if test x"${enable_fortran2008_features}" = x"yes" ; then
AC_MSG_ERROR(on K-computer you have to switch off Fortran 2008 features!)
fi
fi
if test x"${want_single_precision}" = x"yes" ; then
AC_DEFINE([WANT_SINGLE_PRECISION_REAL],[1],[build also single-precision for real calculation])
AC_DEFINE([WANT_SINGLE_PRECISION_COMPLEX],[1],[build also single-precision for complex calculation])
......@@ -1101,4 +1144,14 @@ m4_foreach_w([elpa_m4_kind],[real complex],[
#echo "* It mainly contains bugfixes to ELPA 2017.05.001 *"
#echo "***********************************************************************"
#echo " "
make -f $srcdir/generated_headers.am generated-headers top_srcdir="$srcdir"
if test x"$enable_kcomputer" = x"yes" ; then
echo " "
echo "Important message:"
echo "On K-computer (at the moment) the automatic creation of the generated"
echo "headers does not work."
echo "call: make -f ../generated_headers.am generated-headers top_srcdir=.."
echo "BEFORE triggering the build with make!"
else
make -f $srcdir/generated_headers.am generated-headers top_srcdir="$srcdir"
fi
......@@ -28,6 +28,13 @@ gpu_flag = {
matrix_flag = {
"random" : "-DTEST_MATRIX_RANDOM",
"analytic" : "-DTEST_MATRIX_ANALYTIC",
"toeplitz" : "-DTEST_MATRIX_TOEPLITZ",
"frank" : "-DTEST_MATRIX_FRANK",
}
qr_flag = {
0 : "-DTEST_QR_DECOMPOSITION=0",
1 : "-DTEST_QR_DECOMPOSITION=1",
}
test_type_flag = {
......@@ -36,7 +43,6 @@ test_type_flag = {
"solve_tridiagonal" : "-DTEST_SOLVE_TRIDIAGONAL",
"cholesky" : "-DTEST_CHOLESKY",
"hermitian_multiply" : "-DTEST_HERMITIAN_MULTIPLY",
"qr" : "-DTEST_QR_DECOMPOSITION",
}
layout_flag = {
......@@ -44,10 +50,11 @@ layout_flag = {
"square" : ""
}
for lang, m, g, t, p, d, s, l in product(
for lang, m, g, q, t, p, d, s, l in product(
sorted(language_flag.keys()),
sorted(matrix_flag.keys()),
sorted(gpu_flag.keys()),
sorted(qr_flag.keys()),
sorted(test_type_flag.keys()),
sorted(prec_flag.keys()),
sorted(domain_flag.keys()),
......@@ -59,22 +66,39 @@ for lang, m, g, t, p, d, s, l in product(
if (lang == "C" and ( m == "analytic" or l == "all_layouts")):
continue
# exclude some test combinations
# analytic tests only for "eigenvectors" and not on GPU
if(m == "analytic" and (g == 1 or t != "eigenvectors")):
continue
# Frank tests only for "eigenvectors" and "eigenvalues" in the real double-precision case
if(m == "frank" and ((t != "eigenvectors" and t != "eigenvalues") or d != "real" or p != "double")):
continue
if(s in ["scalapack_all", "scalapack_part"] and (g == 1 or t != "eigenvectors" or m != "analytic")):
continue
if (t == "solve_tridiagonal" and (s == "2stage" or d == "complex")):
# solve tridiagonal only for real toeplitz matrix in 1stage
if (t == "solve_tridiagonal" and (s != "1stage" or d !="real" or m != "toeplitz")):
continue
if (t == "cholesky" and (s == "2stage")):
# cholesky tests only 1stage and toeplitz matrix
if (t == "cholesky" and (m != "toeplitz" or s == "2stage")):
continue
if (t == "eigenvalues" and (m == "random")):
continue
if (t == "hermitian_multiply" and (s == "2stage")):
continue
if (t == "qr" and (s == "1stage" or d == "complex")):
if (t == "hermitian_multiply" and (m == "toeplitz")):
continue
# qr only for 2stage real
if (q == 1 and (s != "2stage" or d != "real" or t != "eigenvectors" or g == 1 or m != "random")):
continue
for kernel in ["all_kernels", "default_kernel"] if s == "2stage" else ["nokernel"]:
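The exclusion rules in this hunk all follow one pattern: enumerate the full Cartesian product and `continue` past unwanted combinations. The sketch below, which is not taken from the generator script, shows that pattern for the Frank-matrix rule as its comment describes it. The reduced option lists and the helper names `frank_allowed` and `always_true_clause` are assumptions made for illustration.

```python
from itertools import product

# Reduced, illustrative option lists (the real script derives them from
# the *_flag dictionaries).
test_types = ["cholesky", "eigenvalues", "eigenvectors"]
domains = ["complex", "real"]
precisions = ["double", "single"]


def frank_allowed(t, d, p):
    """Keep Frank-matrix tests only for eigenvalues/eigenvectors in real double precision."""
    return t in ("eigenvectors", "eigenvalues") and d == "real" and p == "double"


def always_true_clause(t):
    # Pitfall: no value of t equals both strings at once, so this "or" of two
    # inequalities holds for every t and filters nothing by itself.
    return t != "eigenvectors" or t != "eigenvalues"


kept = [
    (t, d, p)
    for t, d, p in product(test_types, domains, precisions)
    if frank_allowed(t, d, p)
]
```

Writing the rule as a positive "allowed" predicate and skipping on its negation tends to be easier to check against the comment than a hand-negated condition, since negating "A and B" requires flipping every `and` to `or` (De Morgan).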
......@@ -115,13 +139,18 @@ for lang, m, g, t, p, d, s, l in product(
if (lang == "Fortran"):
name = "test_{0}_{1}_{2}_{3}{4}{5}{6}{7}".format(
name = "test_{0}_{1}_{2}_{3}{4}_{5}{6}{7}{8}".format(
d, p, t, s,
"" if kernel == "nokernel" else "_" + kernel,
"_gpu" if g else "",
"_analytic" if m == "analytic" else "",
"gpu_" if g else "",
"qr_" if q else "",
m,
"_all_layouts" if l == "all_layouts" else "")
print("if BUILD_KCOMPUTER")
print("bin_PROGRAMS += " + name)
print("else")
print("noinst_PROGRAMS += " + name)
print("endif")
print("check_SCRIPTS += " + name + ".sh")
print(name + "_SOURCES = test/Fortran/test.F90")
print(name + "_LDADD = $(test_program_ldadd)")
......@@ -133,19 +162,25 @@ for lang, m, g, t, p, d, s, l in product(
test_type_flag[t],
solver_flag[s],
gpu_flag[g],
qr_flag[q],
matrix_flag[m]] + extra_flags))
print("endif\n" * endifs)
if (lang == "C"):
name = "test_c_version_{0}_{1}_{2}_{3}{4}{5}{6}{7}".format(
name = "test_c_version_{0}_{1}_{2}_{3}{4}_{5}{6}{7}{8}".format(
d, p, t, s,
"" if kernel == "nokernel" else "_" + kernel,
"_gpu" if g else "",
"_analytic" if m == "analytic" else "",
"gpu_" if g else "",
"qr_" if q else "",
m,
"_all_layouts" if l == "all_layouts" else "")
print("if BUILD_KCOMPUTER")
print("bin_PROGRAMS += " + name)
print("else")
print("noinst_PROGRAMS += " + name)
print("endif")
print("check_SCRIPTS += " + name + ".sh")
print(name + "_SOURCES = test/C/test.c")
print(name + "_LDADD = $(test_program_ldadd) $(FCLIBS)")
......@@ -157,6 +192,7 @@ for lang, m, g, t, p, d, s, l in product(
test_type_flag[t],
solver_flag[s],
gpu_flag[g],
qr_flag[q],
matrix_flag[m]] + extra_flags))
print("endif\n" * endifs)
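To make the new naming scheme concrete, the following sketch wraps the Fortran format string from the hunks above in a small function. The wrapper name `fortran_test_name` and the sample arguments are hypothetical; only the format string and the conditional fragments come from the diff.

```python
def fortran_test_name(d, p, t, s, kernel, g, q, m, l):
    """Compose a test program name the way the updated format string does."""
    return "test_{0}_{1}_{2}_{3}{4}_{5}{6}{7}{8}".format(
        d, p, t, s,
        "" if kernel == "nokernel" else "_" + kernel,  # kernel only for 2stage
        "gpu_" if g else "",                           # GPU marker now precedes the matrix type
        "qr_" if q else "",                            # new QR marker
        m,                                             # matrix type is always part of the name
        "_all_layouts" if l == "all_layouts" else "")


qr_name = fortran_test_name(
    "real", "double", "eigenvectors", "2stage", "default_kernel", 0, 1, "random", "square")
gpu_name = fortran_test_name(
    "real", "double", "eigenvectors", "1stage", "nokernel", 1, 0, "random", "all_layouts")
```

With these inputs `qr_name` is `test_real_double_eigenvectors_2stage_default_kernel_qr_random` and `gpu_name` is `test_real_double_eigenvectors_1stage_gpu_random_all_layouts`. Compared with the old scheme, the matrix type now appears in every name rather than only for analytic matrices.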
......@@ -183,5 +219,4 @@ for p, d in product(sorted(prec_flag.keys()), sorted(domain_flag.keys())):
print(" " + " \\\n ".join([
domain_flag[d],
prec_flag[p]]))
print("endif\n" * endifs)
.TH "elpa_autotune_deallocate" 3 "Tue Nov 28 2017" "ELPA" \" -*- nroff -*-
.ad l
.nh
.SH NAME
elpa_autotune_deallocate \- Deallocates an ELPA autotuning instance
.br
.SH SYNOPSIS
.br
.SS FORTRAN INTERFACE
use elpa
.br
class(elpa_t), pointer :: elpa
class(elpa_autotune_t), pointer :: tune_state
.br
.RI "call \fBelpa%autotune_deallocate\fP (tune_state)"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.RI "type(elpa_autotune_t) :: \fBtune_state\fP ! the ELPA autotuning object, created with \fBelpa_autotune_setup\fP(3)"
.br
.SS C INTERFACE
#include <elpa/elpa.h>
.br
elpa_t handle;
elpa_autotune_t autotune_handle;
.br
.RI "void \fBelpa_autotune_deallocate\fP (\fBelpa_t\fP handle, \fBelpa_autotune_t\fP autotune_handle);"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.br
.RI "elpa_t \fBhandle\fP; // the handle of an ELPA object, obtained before with \fBelpa_allocate\fP(3)"
.br
.RI "elpa_autotune_t \fBautotune_handle\fP; // the handle of the ELPA autotuning object, obtained before with \fBelpa_autotune_setup\fP(3)"
.SH DESCRIPTION
Deallocates an ELPA autotuning instance. \fIPrior\fP to calling the elpa_autotune_deallocate method, an ELPA autotuning object must have been created. See \fBelpa_autotune_setup\fP(3).
.SH "SEE ALSO"
.br
\fBelpa_autotune_step\fP(3) \fBelpa_autotune_setup\fP(3) \fBelpa_autotune_set_best\fP(3)
.TH "elpa_autotune_set_best" 3 "Tue Nov 28 2017" "ELPA" \" -*- nroff -*-
.ad l
.nh
.SH NAME
elpa_autotune_set_best \- Sets the tunable parameters to the best solution found so far
.br
Before the autotuning options can be set, an autotuning step has to be done, see \fBelpa_autotune_step\fP(3)
.SH SYNOPSIS
.br
.SS FORTRAN INTERFACE
use elpa
.br
class(elpa_t), pointer :: elpa
class(elpa_autotune_t), pointer :: tune_state
.br
.RI "call \fBelpa%autotune_set_best\fP (tune_state)"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.RI "type(elpa_autotune_t) :: \fBtune_state\fP ! the ELPA autotuning object, created with \fBelpa_autotune_setup\fP(3)"
.br
.SS C INTERFACE
#include <elpa/elpa.h>
.br
elpa_t handle;
elpa_autotune_t autotune_handle;
.br
.RI "void \fBelpa_autotune_set_best\fP (\fBelpa_t\fP handle, \fBelpa_autotune_t\fP autotune_handle);"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.br
.RI "elpa_t \fBhandle\fP; // the handle of an ELPA object, obtained before with \fBelpa_allocate\fP(3)"
.br
.RI "elpa_autotune_t \fBautotune_handle\fP; // the handle of the ELPA autotuning object, obtained before with \fBelpa_autotune_setup\fP(3)"
.SH DESCRIPTION
Sets the ELPA tunable parameters to the best options found so far. \fIPrior\fP to calling the elpa_autotune_set_best method, an ELPA autotuning step must have been performed. See \fBelpa_autotune_step\fP(3).
.SH "SEE ALSO"
.br
\fBelpa_autotune_step\fP(3) \fBelpa_autotune_setup\fP(3) \fBelpa_autotune_deallocate\fP(3)
.TH "elpa_autotune_setup" 3 "Tue Nov 28 2017" "ELPA" \" -*- nroff -*-
.ad l
.nh
.SH NAME
elpa_autotune_setup \- create an instance for autotuning of the ELPA library
.br
Before the autotuning object can be created, an instance of the ELPA library has to be set up, see e.g. \fBelpa_setup\fP(3)
.SH SYNOPSIS
.br
.SS FORTRAN INTERFACE
use elpa
.br
class(elpa_t), pointer :: elpa
class(elpa_autotune_t), pointer :: tune_state
.br
.RI "tune_state= \fBelpa%autotune_setup\fP (level, domain)"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.RI "integer :: \fBlevel\fP ! the level of the autotuning, at the moment ELPA_AUTOTUNE_FAST is supported"
.br
.RI "integer :: \fBdomain\fP ! the domain (real or complex) of the autotuning, can be either ELPA_AUTOTUNE_DOMAIN_REAL or ELPA_AUTOTUNE_DOMAIN_COMPLEX"
.br
.SS C INTERFACE
#include <elpa/elpa.h>
.br
elpa_t handle;
elpa_autotune_t autotune_handle;
.br
.RI "\fBelpa_autotune_t\fP autotune_handle = \fBelpa_autotune_setup\fP (\fBelpa_t\fP handle, int level, int domain);"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.br
.RI "elpa_t \fBhandle\fP; // the handle of an ELPA object, obtained before with \fBelpa_allocate\fP(3)"
.br
.RI "int \fBlevel\fP; // the level of the autotuning, at the moment ELPA_AUTOTUNE_FAST is supported"
.br
.RI "int \fBdomain\fP; // the domain (real or complex) of the autotuning, can be either ELPA_AUTOTUNE_DOMAIN_REAL or ELPA_AUTOTUNE_DOMAIN_COMPLEX"
.br
.RI "elpa_autotune_t \fBautotune_handle\fP; // the created handle of the autotune object"
.SH DESCRIPTION
Creates an ELPA autotuning object. \fIPrior\fP to calling autotune_setup, an ELPA object must have been created. See \fBelpa_setup\fP(3).
.SH "SEE ALSO"
.br
\fBelpa_autotune_step\fP(3) \fBelpa_autotune_set_best\fP(3) \fBelpa_autotune_deallocate\fP(3)
.TH "elpa_autotune_step" 3 "Tue Nov 28 2017" "ELPA" \" -*- nroff -*-
.ad l
.nh
.SH NAME
elpa_autotune_step \- do one ELPA autotuning step
.br
Before the autotuning step can be done, an instance of the ELPA autotune object has to be created, see \fBelpa_autotune_setup\fP(3)
.SH SYNOPSIS
.br
.SS FORTRAN INTERFACE
use elpa
.br
class(elpa_t), pointer :: elpa
class(elpa_autotune_t), pointer :: tune_state
.br
.RI "unfinished = \fBelpa%autotune_step\fP (tune_state)"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.RI "type(elpa_autotune_t) :: \fBtune_state\fP ! the ELPA autotuning object, created with \fBelpa_autotune_setup\fP(3)"
.br
.RI "logical :: \fBunfinished\fP ! logical, specifying whether autotuning has finished (.false.) or not (.true.)"
.br
.SS C INTERFACE
#include <elpa/elpa.h>
.br
elpa_t handle;
elpa_autotune_t autotune_handle;
.br
.RI "\fBint\fP unfinished = \fBelpa_autotune_step\fP (\fBelpa_t\fP handle, \fBelpa_autotune_t\fP autotune_handle);"
.br
.RI " "
.br
.RI "With the definitions of the input and output variables:"
.br
.br
.RI "elpa_t \fBhandle\fP; // the handle of an ELPA object, obtained before with \fBelpa_allocate\fP(3)"
.br
.RI "elpa_autotune_t \fBautotune_handle\fP; // the handle of the autotuning object, created with \fBelpa_autotune_setup\fP(3)"
.br
.RI "int \fBunfinished\fP; // int, specifying whether autotuning has finished (0) or not (1)"
.SH DESCRIPTION
Does an ELPA autotuning step. \fIPrior\fP to calling autotune_step, an ELPA autotune object must have been created. See \fBelpa_autotune_setup\fP(3).
.SH "SEE ALSO"
.br
\fBelpa_autotune_setup\fP(3) \fBelpa_autotune_set_best\fP(3) \fBelpa_autotune_deallocate\fP(3)
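The four autotuning man pages above imply a call order: setup, repeated steps until the step reports that tuning has finished, set_best, then deallocate. The sketch below mimics only that control flow with a pure-Python stand-in. `MockElpa`, its `solve` method, the candidate names, and the timings are all hypothetical and are not the ELPA API; that a solver call runs between steps is likewise an assumption not stated in these pages.

```python
class MockElpa:
    """Pure-Python stand-in that mirrors the documented call order only."""

    def __init__(self, candidates):
        self.candidates = candidates  # hypothetical: setting name -> measured cost
        self.active = None            # setting chosen by autotune_set_best
        self.log = []                 # records the order of calls

    def autotune_setup(self, level, domain):
        self.log.append("setup")
        return {"todo": list(self.candidates), "current": None, "best": None}

    def autotune_step(self, state):
        """Select the next candidate; return False once none are left."""
        self.log.append("step")
        if not state["todo"]:
            return False
        state["current"] = state["todo"].pop(0)
        return True

    def solve(self, state):
        """Stand-in for a timed solver run with the current candidate."""
        cost = self.candidates[state["current"]]
        if state["best"] is None or cost < self.candidates[state["best"]]:
            state["best"] = state["current"]

    def autotune_set_best(self, state):
        self.log.append("set_best")
        self.active = state["best"]

    def autotune_deallocate(self, state):
        self.log.append("deallocate")
        state.clear()


elpa = MockElpa({"setting_a": 3.0, "setting_b": 1.5, "setting_c": 2.0})
tune_state = elpa.autotune_setup(level="FAST", domain="REAL")
while elpa.autotune_step(tune_state):   # "unfinished" in the man page's terms
    elpa.solve(tune_state)
elpa.autotune_set_best(tune_state)
elpa.autotune_deallocate(tune_state)
```

The loop condition plays the role of the `unfinished` return value: the loop body runs as long as the step reports that tuning is not yet done, and the best setting is applied only after the loop ends.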
.TH "get_elpa_row_col_comms" 3 "Wed Dec 2 2015" "ELPA" \" -*- nroff -*-
.TH "elpa_get_communicators" 3 "Tue Nov 28 2017" "ELPA" \" -*- nroff -*-
.ad l
.nh
.SH NAME
get_elpa_row_col_comms \- old, deprecated interface to get the MPI row and column communicators needed in ELPA.
It is recommended to use \fBelpa_get_communicators\fP(3)
elpa_get_communicators
.br
.SH SYNOPSIS
......@@ -12,7 +11,7 @@ It is recommended to use \fBelpa_get_communicators\fP(3)
use elpa1
.br
.RI "success = \fBget_elpa_row_col_comms\fP (mpi_comm_global, my_prow, my_pcol, mpi_comm_rows, mpi_comm_cols)"
.RI "success = \fBelpa_get_communicators\fP (mpi_comm_global, my_prow, my_pcol, mpi_comm_rows, mpi_comm_cols)"
.br
.br
......@@ -53,9 +52,7 @@ use elpa1
.SH DESCRIPTION
Old, deprecated interface, which will be deleted at some point: Please use \fBelpa_get_communicators\fP(3) !
All ELPA routines need MPI communicators for communicating within rows or columns of processes. These communicators are created from the \fBmpi_comm_global\fP communicator. It is assumed that the matrix used in ELPA is distributed with \fBmy_prow\fP rows and \fBmy_pcol\fP columns on the calling process. This function has to be invoked by all involved processes before any other calls to ELPA routines.
.br
.SH "SEE ALSO"
......
......@@ -58,9 +58,9 @@ use elpa1
.br
.RI "int \fBmatrixCols\fP: number of columns of locally distributed matrices \fBa\fP"
.br
.RI "int \fBmpi_comm_rows\fP: communicator for communication in rows. Constructed with \fBget_elpa_communicators\fP(3)"
.RI "int \fBmpi_comm_rows\fP: communicator for communication in rows. Constructed with \fBelpa_get_communicators\fP(3)"
.br
.RI "int \fBmpi_comm_cols\fP: communicator for communication in columns. Constructed with \fBget_elpa_communicators\fP(3)"
.RI "int \fBmpi_comm_cols\fP: communicator for communication in columns. Constructed with \fBelpa_get_communicators\fP(3)"
.br
.RI "int \fBwantDebug\fP: give more debugging information"
.br
......
......@@ -58,9 +58,9 @@ use elpa1
.br
.RI "int \fBmatrixCols\fP: number of columns of locally distributed matrices \fBa\fP"
.br
.RI "int \fBmpi_comm_rows\fP: communicator for communication in rows. Constructed with \fBget_elpa_communicators\fP(3)"
.RI "int \fBmpi_comm_rows\fP: communicator for communication in rows. Constructed with \fBelpa_get_communicators\fP(3)"
.br
.RI "int \fBmpi_comm_cols\fP: communicator for communication in columns. Constructed with \fBget_elpa_communicators\fP(3)"
.RI "int \fBmpi_comm_cols\fP: communicator for communication in columns. Constructed with \fBelpa_get_communicators\fP(3)"
.br
.RI "int \fBwantDebug\fP: give more debugging information"
.br
......
......@@ -28,9 +28,9 @@ use elpa1
.br
.RI "integer, intent(in) \fBmatrixCols\fP: number of columns of locally distributed matrices \fBa\fP"
.br
.RI "integer, intent(in) \fBmpi_comm_rows\fP: communicator for communication in rows. Constructed with \fBget_elpa_communicators\fP(3)"
.RI "integer, intent(in) \fBmpi_comm_rows\fP: communicator for communication in rows. Constructed with \fBelpa_get_communicators\fP(3)"
.br
.RI "integer, intent(in) \fBmpi_comm_cols\fP: communicator for communication in columns. Constructed with \fBget_elpa_communicators\fP(3)"
.RI "integer, intent(in) \fBmpi_comm_cols\fP: communicator for communication in columns. Constructed with \fBelpa_get_communicators\fP(3)"
.br
.RI "logical, intent(in) \fBwantDebug\fP: if .true. , print more debug information in case of an error"
......
.TH "get_elpa_communicators" 3 "Wed Dec 2 2015" "ELPA" \" -*- nroff -*-
.ad l
.nh
.SH NAME
get_elpa_communicators \- Old, deprecated interface; better use \fBelpa_get_communicators\fP(3)
.br
.SH SYNOPSIS
.br
.SS FORTRAN INTERFACE
use elpa1
.br