Changelog for ELPA 2018.11.001.rc1 Changelog for upcoming release
- user can define the default kernels
- simple block4 and block6 real kernel
- ELPA versioning number is provided in the C header files
Changelog for ELPA 2018.11.001
- improved autotuning - improved autotuning
- improved performance of generalized problem via Cannon's algorithm - improved performance of generalized problem via Cannon's algorithm
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
## Preamble ## ## Preamble ##
This file provides documentation on how to build the *ELPA* library in **version ELPA-2018.11.001.rc1**. This file provides documentation on how to build the *ELPA* library in **version ELPA-2018.11.001**.
With release of **version ELPA-2017.05.001** the build process has been significantly simplified, With release of **version ELPA-2017.05.001** the build process has been significantly simplified,
which makes it easier to install the *ELPA* library. which makes it easier to install the *ELPA* library.
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
For more details and recent updates please visit the online [issue system] (https://gitlab.mpcdf.mpg.de/elpa/elpa/issues) For more details and recent updates please visit the online [issue system] (https://gitlab.mpcdf.mpg.de/elpa/elpa/issues)
Issues which are not mentioned in a newer release are (considered as) solved. Issues which are not mentioned in a newer release are (considered as) solved.
### ELPA 2018.11.001.rc1 release ### ### ELPA 2018.11.001 release ###
- same issues as in ELPA 2017.11.001 - same issues as in ELPA 2017.11.001
### ELPA 2018.05.001 release ### ### ELPA 2018.05.001 release ###
......
...@@ -78,7 +78,7 @@ https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html ...@@ -78,7 +78,7 @@ https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
"legacy interface", since as announced some deprecated function aliases have been "legacy interface", since as announced some deprecated function aliases have been
removed). For the current interface all changes since 2017.05.001 are removed). For the current interface all changes since 2017.05.001 are
compatible, since only some functions have been added. compatible, since only some functions have been added.
The state of release 2017.11.001.(rc1) defines this interface The state of release 2017.11.001 defines this interface
- 12 - 12
No incompatible API changes w.r.t. the previous version. Some functions have been No incompatible API changes w.r.t. the previous version. Some functions have been
......
...@@ -108,6 +108,7 @@ EXTRA_libelpa@SUFFIX@_private_la_DEPENDENCIES = \ ...@@ -108,6 +108,7 @@ EXTRA_libelpa@SUFFIX@_private_la_DEPENDENCIES = \
src/elpa2/kernels/real_template.F90 \ src/elpa2/kernels/real_template.F90 \
src/elpa2/kernels/complex_template.F90 \ src/elpa2/kernels/complex_template.F90 \
src/elpa2/kernels/simple_template.F90 \ src/elpa2/kernels/simple_template.F90 \
src/elpa2/kernels/simple_block4_template.F90 \
src/elpa2/pack_unpack_cpu.F90 \ src/elpa2/pack_unpack_cpu.F90 \
src/elpa2/pack_unpack_gpu.F90 \ src/elpa2/pack_unpack_gpu.F90 \
src/elpa2/compute_hh_trafo.F90 \ src/elpa2/compute_hh_trafo.F90 \
...@@ -188,6 +189,13 @@ if WITH_COMPLEX_GENERIC_SIMPLE_KERNEL ...@@ -188,6 +189,13 @@ if WITH_COMPLEX_GENERIC_SIMPLE_KERNEL
libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/complex_simple.F90 libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/complex_simple.F90
endif endif
if WITH_REAL_GENERIC_SIMPLE_BLOCK4_KERNEL
libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_simple_block4.F90
endif
#if WITH_REAL_GENERIC_SIMPLE_BLOCK6_KERNEL
# libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_simple_block6.F90
#endif
if WITH_REAL_BGP_KERNEL if WITH_REAL_BGP_KERNEL
libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_bgp.f90 libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_bgp.f90
endif endif
...@@ -443,6 +451,7 @@ nobase_elpa_include_HEADERS = \ ...@@ -443,6 +451,7 @@ nobase_elpa_include_HEADERS = \
elpa/elpa_legacy.h elpa/elpa_legacy.h
nobase_nodist_elpa_include_HEADERS = \ nobase_nodist_elpa_include_HEADERS = \
elpa/elpa_version.h \
elpa/elpa_constants.h \ elpa/elpa_constants.h \
elpa/elpa_generated.h \ elpa/elpa_generated.h \
elpa/elpa_generated_legacy.h elpa/elpa_generated_legacy.h
...@@ -779,6 +788,7 @@ EXTRA_DIST = \ ...@@ -779,6 +788,7 @@ EXTRA_DIST = \
src/elpa2/kernels/real_sse_6hv_template.c \ src/elpa2/kernels/real_sse_6hv_template.c \
src/elpa2/kernels/real_template.F90 \ src/elpa2/kernels/real_template.F90 \
src/elpa2/kernels/simple_template.F90 \ src/elpa2/kernels/simple_template.F90 \
src/elpa2/kernels/simple_block4_template.F90 \
src/elpa2/pack_unpack_cpu.F90 \ src/elpa2/pack_unpack_cpu.F90 \
src/elpa2/pack_unpack_gpu.F90 \ src/elpa2/pack_unpack_gpu.F90 \
src/elpa2/qr/elpa_pdgeqrf_template.F90 \ src/elpa2/qr/elpa_pdgeqrf_template.F90 \
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
## Current Release ## ## Current Release ##
The current release is ELPA 2018.11.001.rc1 The current supported API version The current release is ELPA 2018.11.001 The current supported API version
is 20181113. This release supports the earliest API version 20170403. is 20181113. This release supports the earliest API version 20170403.
The old, obsolete legacy API will be deprecated in the future ! The old, obsolete legacy API will be deprecated in the future !
...@@ -110,7 +110,7 @@ the possible configure options. ...@@ -110,7 +110,7 @@ the possible configure options.
## Using *ELPA* ## Using *ELPA*
Please have a look at the "**USERS_GUIDE**" file, to get a documentation or at the [online] Please have a look at the "**USERS_GUIDE**" file, to get a documentation or at the [online]
(http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) doxygen (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html) doxygen
documentation, where you find the definition of the interfaces. documentation, where you find the definition of the interfaces.
## Contributing to *ELPA* ## Contributing to *ELPA*
......
This file contains the release notes for the ELPA 2018.11.001.rc1 version This file contains the release notes for the ELPA 2018.11.001 version
What is new? What is new?
------------- -------------
......
...@@ -146,7 +146,7 @@ Local documentation (via man pages) should be available (if *ELPA* has been inst ...@@ -146,7 +146,7 @@ Local documentation (via man pages) should be available (if *ELPA* has been inst
For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels. the available kernels.
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html)
for each *ELPA* release is available. for each *ELPA* release is available.
...@@ -13,7 +13,7 @@ Local documentation (via man pages) should be available (if *ELPA* has been inst ...@@ -13,7 +13,7 @@ Local documentation (via man pages) should be available (if *ELPA* has been inst
For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program, which prints all For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program, which prints all
the available kernels. the available kernels.
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html)
for each *ELPA* release is available. for each *ELPA* release is available.
...@@ -200,7 +200,7 @@ The following table gives a list of all supported parameters which can be used t ...@@ -200,7 +200,7 @@ The following table gives a list of all supported parameters which can be used t
## III) List of computational routines ## ## III) List of computational routines ##
The following compute routines are available in *ELPA*: Please have a look at the man pages or [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) for details. The following compute routines are available in *ELPA*: Please have a look at the man pages or [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html) for details.
| Name | Purpose | since API version | | Name | Purpose | since API version |
......
...@@ -22,7 +22,7 @@ The *ELPA* library consists of two main parts: ...@@ -22,7 +22,7 @@ The *ELPA* library consists of two main parts:
Both variants of the *ELPA* solvers are available for real or complex single and double precision valued matrices. Both variants of the *ELPA* solvers are available for real or complex single and double precision valued matrices.
Thus *ELPA* provides the following user functions (see man pages or [online] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) for details): Thus *ELPA* provides the following user functions (see man pages or [online] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html) for details):
- elpa_get_communicators : set the row / column communicators for *ELPA* - elpa_get_communicators : set the row / column communicators for *ELPA*
- elpa_solve_evp_complex_1stage_{single|double} : solve a {single|double} precision complex eigenvalue problem with the *ELPA 1stage* solver - elpa_solve_evp_complex_1stage_{single|double} : solve a {single|double} precision complex eigenvalue problem with the *ELPA 1stage* solver
......
if [ "$(hostname)" == "freya01" ]; then module purge && source /mpcdf/soft/try_new_modules.sh && module load git intel/17.0 gcc/7 impi/2017.3 mkl/2017.3 autoconf automake libtool pkg-config anaconda/3 && unset SLURM_MPI_TYPE I_MPI_SLURM_EXT I_MPI_PMI_LIBRARY I_MPI_PMI2 I_MPI_HYDRA_BOOTSTRAP; fi if [ "$(hostname)" == "freya01" ]; then module purge && source /mpcdf/soft/obs_modules.sh && module load git intel/18.0.3 impi/2018.3 mkl/2018.4 anaconda/3/5.1 mpi4py/3.0.0 gcc/8 autoconf automake libtool pkg-config && unset SLURM_MPI_TYPE I_MPI_SLURM_EXT I_MPI_PMI_LIBRARY I_MPI_PMI2 I_MPI_HYDRA_BOOTSTRAP; fi
if [ "$(hostname)" == "buildtest-rzg" ]; then module load impi/5.1.3 intel/16.0 gcc/6.3 mkl/11.3 autotools pkg-config; fi if [ "$(hostname)" == "buildtest-rzg" ]; then module load impi/5.1.3 intel/16.0 gcc/6.3 mkl/11.3 autotools pkg-config; fi
...@@ -14,8 +14,8 @@ if [ "$(hostname)" == "amarek-elpa-gitlab-runner-2" ]; then module load intel/16 ...@@ -14,8 +14,8 @@ if [ "$(hostname)" == "amarek-elpa-gitlab-runner-2" ]; then module load intel/16
if [ "$(hostname)" == "amarek-elpa-gitlab-runner-3" ]; then module load intel/16.0 gcc mkl/11.3 autoconf automake libtool impi/5.1.3; fi if [ "$(hostname)" == "amarek-elpa-gitlab-runner-3" ]; then module load intel/16.0 gcc mkl/11.3 autoconf automake libtool impi/5.1.3; fi
if [ "$(hostname)" == "amarek-elpa-gitlab-runner-4" ]; then module load intel/16.0 gcc mkl/11.3 autoconf automake libtool impi/5.1.3; fi if [ "$(hostname)" == "amarek-elpa-gitlab-runner-4" ]; then module load intel/16.0 gcc mkl/11.3 autoconf automake libtool impi/5.1.3; fi
if [ "$(hostname)" == "dvl01" ]; then module load intel/17.0 gcc/5.4 mkl/2017 impi/2017.2 gcc/5.4 cuda/8.0; fi if [ "$(hostname)" == "dvl01" ]; then module load intel/17.0 gcc/6.4 mkl/2017 impi/2017.4 cuda/9.2; fi
if [ "$(hostname)" == "dvl02" ]; then module load intel/17.0 gcc/5.4 mkl/2017 impi/2017.2 gcc/5.4 cuda/8.0; fi if [ "$(hostname)" == "dvl02" ]; then module load intel/17.0 gcc/6.4 mkl/2017 impi/2017.4 cuda/9.2; fi
if [ "$(hostname)" == "miy01" ]; then module purge && module load gcc/5.4 smpi essl/5.5 cuda pgi/17.9 && export LD_LIBRARY_PATH=/opt/ibm/spectrum_mpi/lib:/opt/ibm/spectrum_mpi/profilesupport/lib:$LD_LIBRARY_PATH && export PATH=/opt/ibm/spectrum_mpi/bin:$PATH && export OMPI_CC=gcc && export OMPI_FC=gfortran; fi if [ "$(hostname)" == "miy01" ]; then module purge && module load gcc/5.4 smpi essl/5.5 cuda pgi/17.9 && export LD_LIBRARY_PATH=/opt/ibm/spectrum_mpi/lib:/opt/ibm/spectrum_mpi/profilesupport/lib:$LD_LIBRARY_PATH && export PATH=/opt/ibm/spectrum_mpi/bin:$PATH && export OMPI_CC=gcc && export OMPI_FC=gfortran; fi
if [ "$(hostname)" == "miy02" ]; then module load gcc/5.4 pgi/17.9 ompi/pgi/17.9/1.10.2 essl/5.5 cuda && export LD_LIBRARY_PATH=/opt/ibm/spectrum_mpi/lib:/opt/ibm/spectrum_mpi/profilesupport/lib:$LD_LIBRARY_PATH && export PATH=/opt/ibm/spectrum_mpi/bin:$PATH; fi if [ "$(hostname)" == "miy02" ]; then module load gcc/5.4 pgi/17.9 ompi/pgi/17.9/1.10.2 essl/5.5 cuda && export LD_LIBRARY_PATH=/opt/ibm/spectrum_mpi/lib:/opt/ibm/spectrum_mpi/profilesupport/lib:$LD_LIBRARY_PATH && export PATH=/opt/ibm/spectrum_mpi/bin:$PATH; fi
......
#!/bin/bash #!/bin/bash
source /etc/profile.d/modules.sh #source /etc/profile.d/modules.sh
if [ -f /etc/profile.d/modules.sh ]; then source /etc/profile.d/modules.sh ; else source /etc/profile.d/mpcdf_modules.sh; fi
set -ex set -ex
source ./ci_test_scripts/.ci-env-vars source ./ci_test_scripts/.ci-env-vars
......
#!/bin/bash #!/bin/bash
source /etc/profile.d/modules.sh
#source /etc/profile.d/modules.sh
if [ -f /etc/profile.d/modules.sh ]; then source /etc/profile.d/modules.sh ; else source /etc/profile.d/mpcdf_modules.sh; fi
set -ex set -ex
source ./ci_test_scripts/.ci-env-vars source ./ci_test_scripts/.ci-env-vars
......
...@@ -336,6 +336,19 @@ print(" # stupid 'make distcheck' leaves behind write-protected files that th ...@@ -336,6 +336,19 @@ print(" # stupid 'make distcheck' leaves behind write-protected files that th
print(' - make distcheck DISTCHECK_CONFIGURE_FLAGS="FC=mpiifort FCFLAGS=\\"-xHost\\" CFLAGS=\\"-march=native\\" SCALAPACK_LDFLAGS=\\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\\" SCALAPACK_FCFLAGS=\\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\\" --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2" TASKS=2 TEST_FLAGS="150 50 16" || { chmod u+rwX -R . ; exit 1 ; }') print(' - make distcheck DISTCHECK_CONFIGURE_FLAGS="FC=mpiifort FCFLAGS=\\"-xHost\\" CFLAGS=\\"-march=native\\" SCALAPACK_LDFLAGS=\\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\\" SCALAPACK_FCFLAGS=\\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\\" --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2" TASKS=2 TEST_FLAGS="150 50 16" || { chmod u+rwX -R . ; exit 1 ; }')
print("\n\n") print("\n\n")
print("distcheck-no-autotune:")
print(" tags:")
print(" - buildtest")
print(" script:")
print(" - ./configure FC=mpiifort FCFLAGS=\"-xHost\" CFLAGS=\"-march=native\" SCALAPACK_LDFLAGS=\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\" SCALAPACK_FCFLAGS=\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\" --enable-option-checking=fatal --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-autotuning || { cat config.log; exit 1; }")
print(" # stupid 'make distcheck' leaves behind write-protected files that the stupid gitlab runner cannot remove")
print(' - make distcheck DISTCHECK_CONFIGURE_FLAGS="FC=mpiifort FCFLAGS=\\"-xHost\\" CFLAGS=\\"-march=native\\" SCALAPACK_LDFLAGS=\\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\\" SCALAPACK_FCFLAGS=\\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\\" --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-autotuning " TASKS=2 TEST_FLAGS="150 50 16" || { chmod u+rwX -R . ; exit 1 ; }')
print("\n\n")
# add python tests # add python tests
python_ci_tests = [ python_ci_tests = [
"# python tests", "# python tests",
......
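The new `distcheck-no-autotune` job above repeats the existing `distcheck` job and appends `--disable-autotuning` to both the `./configure` call and `DISTCHECK_CONFIGURE_FLAGS`. A minimal sketch of how such variants could be produced from one helper instead of duplicated `print` calls is given here. The function `distcheck_job` and the shortened `BASE_FLAGS` list are hypothetical and not part of the ELPA CI generator; compiler and ScaLAPACK settings are left out for brevity.

```python
# Hypothetical helper; not part of ELPA's CI generator script.
# The flag list is abbreviated for illustration.
BASE_FLAGS = [
    "--with-mpi=yes",
    "--disable-sse-assembly",
    "--disable-sse",
    "--disable-avx",
    "--disable-avx2",
]


def distcheck_job(name, extra_flags=()):
    """Return the YAML text of one distcheck job as a string."""
    flags = " ".join(BASE_FLAGS + list(extra_flags))
    lines = [
        f"{name}:",
        "  tags:",
        "    - buildtest",
        "  script:",
        # Doubled braces produce literal { } in an f-string.
        f"    - ./configure {flags} || {{ cat config.log; exit 1; }}",
        f'    - make distcheck DISTCHECK_CONFIGURE_FLAGS="{flags}"'
        " || { chmod u+rwX -R . ; exit 1 ; }",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(distcheck_job("distcheck"))
    print(distcheck_job("distcheck-no-autotune", ["--disable-autotuning"]))
```

With this shape, the extra flag is stated once per variant and cannot drift between the configure line and the distcheck line.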
...@@ -13,6 +13,7 @@ configueArg="" ...@@ -13,6 +13,7 @@ configueArg=""
skipStep=0 skipStep=0
batchCommand="" batchCommand=""
interactiveRun="yes" interactiveRun="yes"
SLURMBATCH="no"
function usage() { function usage() {
cat >&2 <<-EOF cat >&2 <<-EOF
...@@ -58,7 +59,7 @@ function usage() { ...@@ -58,7 +59,7 @@ function usage() {
} }
while getopts "c:t:j:m:n:b:o:s:q:i:h" opt; do while getopts "c:t:j:m:n:b:o:s:q:S:i:h" opt; do
case $opt in case $opt in
j) j)
makeTasks=$OPTARG;; makeTasks=$OPTARG;;
...@@ -80,6 +81,8 @@ while getopts "c:t:j:m:n:b:o:s:q:i:h" opt; do ...@@ -80,6 +81,8 @@ while getopts "c:t:j:m:n:b:o:s:q:i:h" opt; do
batchCommand=$OPTARG;; batchCommand=$OPTARG;;
i) i)
interactiveRun=$OPTARG;; interactiveRun=$OPTARG;;
S)
SLURMBATCH=$OPTARG;;
:) :)
echo "Option -$OPTARG requires an argument" >&2;; echo "Option -$OPTARG requires an argument" >&2;;
h) h)
......
#!/bin/bash #!/bin/bash
source /etc/profile.d/modules.sh #source /etc/profile.d/modules.sh
if [ -f /etc/profile.d/modules.sh ]; then source /etc/profile.d/modules.sh ; else source /etc/profile.d/mpcdf_modules.sh; fi
set -ex set -ex
source ./ci_test_scripts/.ci-env-vars source ./ci_test_scripts/.ci-env-vars
......
...@@ -29,12 +29,21 @@ AM_SILENT_RULES([yes]) ...@@ -29,12 +29,21 @@ AM_SILENT_RULES([yes])
# #
AC_SUBST([ELPA_SO_VERSION], [13:0:0]) AC_SUBST([ELPA_SO_VERSION], [13:0:0])
# AC_DEFINE_SUBST(NAME, VALUE, DESCRIPTION)
# -----------------------------------------
AC_DEFUN([AC_DEFINE_SUBST], [
AC_DEFINE([$1], [$2], [$3])
AC_SUBST([$1], ['$2'])
])
# API Version # API Version
AC_DEFINE([EARLIEST_API_VERSION], [20170403], [Earliest supported ELPA API version]) AC_DEFINE([EARLIEST_API_VERSION], [20170403], [Earliest supported ELPA API version])
AC_DEFINE([CURRENT_API_VERSION], [20181113], [Current ELPA API version])
AC_DEFINE_SUBST(CURRENT_API_VERSION, 20181113, "Current ELPA API version")
# Autotune Version # Autotune Version
AC_DEFINE([EARLIEST_AUTOTUNE_VERSION], [20171201], [Earliest ELPA API version, which supports autotuning]) AC_DEFINE([EARLIEST_AUTOTUNE_VERSION], [20171201], [Earliest ELPA API version, which supports autotuning])
AC_DEFINE([CURRENT_AUTOTUNE_VERSION], [20181113], [Current ELPA autotune version]) AC_DEFINE([CURRENT_AUTOTUNE_VERSION], [20181113], [Current ELPA autotune version])
AC_DEFINE_SUBST(CURRENT_AUTOTUNE_VERSION, 20181113, "Current ELPA autotune version")
AX_CHECK_GNU_MAKE() AX_CHECK_GNU_MAKE()
if test x$_cv_gnu_make_command = x ; then if test x$_cv_gnu_make_command = x ; then
...@@ -540,6 +549,7 @@ m4_pattern_forbid([elpa_m4]) ...@@ -540,6 +549,7 @@ m4_pattern_forbid([elpa_m4])
m4_define(elpa_m4_generic_kernels, [ m4_define(elpa_m4_generic_kernels, [
real_generic real_generic
real_generic_simple real_generic_simple
real_generic_simple_block4
complex_generic complex_generic
complex_generic_simple complex_generic_simple
]) ])
...@@ -748,6 +758,30 @@ m4_foreach_w([elpa_m4_type],elpa_m4_kernel_types,[ ...@@ -748,6 +758,30 @@ m4_foreach_w([elpa_m4_type],elpa_m4_kernel_types,[
dnl the list of kernels is now assembled dnl the list of kernels is now assembled
dnl choosing a default kernel dnl choosing a default kernel
m4_foreach_w([elpa_m4_kind],[real complex],[
AC_ARG_WITH([default-]elpa_m4_kind[-kernel], m4_expand([AS_HELP_STRING([--with-default-]elpa_m4_kind[-kernel]=KERNEL,
[set a specific ]elpa_m4_kind[ kernel as default kernel. Available kernels are:]
m4_foreach_w([elpa_m4_kernel],m4_expand(elpa_m4_[]elpa_m4_kind[]_kernels),[m4_bpatsubst(elpa_m4_kernel,elpa_m4_kind[]_,[]) ]))]),
[default_]elpa_m4_kind[_kernel="]elpa_m4_kind[_$withval"],[default_]elpa_m4_kind[_kernel=""])
#if test -n "$default_[]elpa_m4_kind[]_kernel" ; then
# found="no"
# m4_foreach_w([elpa_m4_otherkernel],m4_expand(elpa_m4_[]elpa_m4_kind[]_kernels),[
# if test "$default_]elpa_m4_kind[_kernel" = "]elpa_m4_otherkernel[" ; then
# use_[]elpa_m4_otherkernel[]=yes
# found="yes"
# else
# use_[]elpa_m4_otherkernel[]=no
# fi
# ])
# if test x"$found" = x"no" ; then
# AC_MSG_ERROR([Invalid kernel "$default_]elpa_m4_kind[_kernel" specified for --with-default-]elpa_m4_kind[-kernel])
# fi
# AC_DEFINE([WITH_DEFAULT_]m4_toupper(elpa_m4_kind)[_KERNEL],[1],[use specific ]elpa_m4_kind[ default kernel (set at compile time)])
#fi
])
m4_foreach_w([elpa_m4_kind],[real complex],[ m4_foreach_w([elpa_m4_kind],[real complex],[
m4_foreach_w([elpa_m4_kernel], m4_foreach_w([elpa_m4_kernel],
m4_foreach_w([elpa_m4_cand_kernel], m4_foreach_w([elpa_m4_cand_kernel],
...@@ -1257,6 +1291,7 @@ AC_CONFIG_FILES([ ...@@ -1257,6 +1291,7 @@ AC_CONFIG_FILES([
Doxyfile Doxyfile
${PKG_CONFIG_FILE}:elpa.pc.in ${PKG_CONFIG_FILE}:elpa.pc.in
elpa/elpa_constants.h elpa/elpa_constants.h
elpa/elpa_version.h
]) ])
m4_include([m4/ax_fc_check_define.m4]) m4_include([m4/ax_fc_check_define.m4])
...@@ -1404,12 +1439,12 @@ echo "* off). With the 2019.11.001 release it will be abolished! *" ...@@ -1404,12 +1439,12 @@ echo "* off). With the 2019.11.001 release it will be abolished! *"
echo "***********************************************************************" echo "***********************************************************************"
echo " " echo " "
echo " " echo " "
echo "***********************************************************************" #echo "***********************************************************************"
echo "* This is a the first release candidate of ELPA 2018.11.001.rc1 *" #echo "* This is a the first release candidate of ELPA 2018.11.001.rc1 *"
echo "* There might be still some changes until the final release of *" #echo "* There might be still some changes until the final release of *"
echo "* ELPA 2018.11.001 *" #echo "* ELPA 2018.11.001 *"
echo "***********************************************************************" #echo "***********************************************************************"
echo " " #echo " "
if test x"$enable_kcomputer" = x"yes" ; then if test x"$enable_kcomputer" = x"yes" ; then
echo " " echo " "
......
...@@ -19,7 +19,7 @@ ...@@ -19,7 +19,7 @@
%define with_openmp 0 %define with_openmp 0
Name: elpa Name: elpa
Version: 2018.11.001.rc1 Version: 2018.11.001
Release: 1 Release: 1
Summary: A massively parallel eigenvector solver Summary: A massively parallel eigenvector solver
License: LGPL-3.0 License: LGPL-3.0
......
...@@ -4,6 +4,8 @@ ...@@ -4,6 +4,8 @@
#include <limits.h> #include <limits.h>
#include <complex.h> #include <complex.h>
#include <elpa/elpa_version.h>
struct elpa_struct; struct elpa_struct;
typedef struct elpa_struct *elpa_t; typedef struct elpa_struct *elpa_t;
......
...@@ -46,7 +46,8 @@ enum ELPA_SOLVERS { ...@@ -46,7 +46,8 @@ enum ELPA_SOLVERS {
X(ELPA_2STAGE_REAL_SPARC64_BLOCK6, 21, @ELPA_2STAGE_REAL_SPARC64_BLOCK6_COMPILED@, __VA_ARGS__) \ X(ELPA_2STAGE_REAL_SPARC64_BLOCK6, 21, @ELPA_2STAGE_REAL_SPARC64_BLOCK6_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_VSX_BLOCK2, 22, @ELPA_2STAGE_REAL_VSX_BLOCK2_COMPILED@, __VA_ARGS__) \ X(ELPA_2STAGE_REAL_VSX_BLOCK2, 22, @ELPA_2STAGE_REAL_VSX_BLOCK2_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_VSX_BLOCK4, 23, @ELPA_2STAGE_REAL_VSX_BLOCK4_COMPILED@, __VA_ARGS__) \ X(ELPA_2STAGE_REAL_VSX_BLOCK4, 23, @ELPA_2STAGE_REAL_VSX_BLOCK4_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_VSX_BLOCK6, 24, @ELPA_2STAGE_REAL_VSX_BLOCK6_COMPILED@, __VA_ARGS__) X(ELPA_2STAGE_REAL_VSX_BLOCK6, 24, @ELPA_2STAGE_REAL_VSX_BLOCK6_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_GENERIC_SIMPLE_BLOCK4, 25, @ELPA_2STAGE_REAL_GENERIC_SIMPLE_BLOCK4_COMPILED@, __VA_ARGS__)
#define ELPA_FOR_ALL_2STAGE_REAL_KERNELS_AND_DEFAULT(X) \ #define ELPA_FOR_ALL_2STAGE_REAL_KERNELS_AND_DEFAULT(X) \
ELPA_FOR_ALL_2STAGE_REAL_KERNELS(X) \ ELPA_FOR_ALL_2STAGE_REAL_KERNELS(X) \
......
#define ELPA_API_VERSION @CURRENT_API_VERSION@
#define ELPA_AUTOTUNE_API_VERSION @CURRENT_AUTOTUNE_VERSION@
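The two lines above are a template (presumably `elpa/elpa_version.h.in`, since `AC_CONFIG_FILES` lists `elpa/elpa_version.h`). At configure time each `@NAME@` placeholder is replaced by the value registered through `AC_SUBST`, which the new `AC_DEFINE_SUBST` macro calls. The sketch below only imitates that replacement step so the generated header is easy to picture. The function `substitute` and the `VALUES` dictionary are illustrative assumptions, not autoconf or ELPA code.

```python
import re

# Template text as shown in the diff.
TEMPLATE = (
    "#define ELPA_API_VERSION @CURRENT_API_VERSION@\n"
    "#define ELPA_AUTOTUNE_API_VERSION @CURRENT_AUTOTUNE_VERSION@\n"
)

# Values taken from the AC_DEFINE_SUBST calls in the configure.ac hunk.
VALUES = {
    "CURRENT_API_VERSION": "20181113",
    "CURRENT_AUTOTUNE_VERSION": "20181113",
}


def substitute(template, values):
    """Replace every @NAME@ placeholder that has a known value."""
    def repl(match):
        name = match.group(1)
        # Unknown placeholders are left untouched.
        return values.get(name, match.group(0))
    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)@", repl, template)


if __name__ == "__main__":
    print(substitute(TEMPLATE, VALUES), end="")
```

Because `elpa.h` now includes `elpa/elpa_version.h`, C code that includes `elpa.h` can read `ELPA_API_VERSION` and `ELPA_AUTOTUNE_API_VERSION` directly.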
...@@ -62,15 +62,15 @@ module mod_check_for_gpu ...@@ -62,15 +62,15 @@ module mod_check_for_gpu
gpuAvailable = .false. gpuAvailable = .false.
if(cublasHandle .ne. -1) then if (cublasHandle .ne. -1) then
gpuAvailable = .true. gpuAvailable = .true.
numberOfDevices = -1 numberOfDevices = -1
if(myid == 0) then if (myid == 0) then
print *, "Skipping GPU init, should have already been initialized " print *, "Skipping GPU init, should have already been initialized "
endif endif
return return
else else
if(myid == 0) then if (myid == 0) then
print *, "Initializing the GPU devices" print *, "Initializing the GPU devices"
endif endif
endif endif
......
...@@ -138,6 +138,13 @@ program print_available_elpa2_kernels ...@@ -138,6 +138,13 @@ program print_available_elpa2_kernels
do i = 0, elpa_option_cardinality(KERNEL_KEY) do i = 0, elpa_option_cardinality(KERNEL_KEY)
kernel = elpa_option_enumerate(KERNEL_KEY, i) kernel = elpa_option_enumerate(KERNEL_KEY, i)
if (elpa_int_value_to_string(KERNEL_KEY, i) .eq. "ELPA_2STAGE_COMPLEX_GPU" .or. &
elpa_int_value_to_string(KERNEL_KEY, i) .eq. "ELPA_2STAGE_REAL_GPU") then
if (e%can_set("use_gpu",1) == ELPA_OK) then
call e%set("use_gpu",1)
endif
endif
if (e%can_set(KERNEL_KEY, kernel) == ELPA_OK) then if (e%can_set(KERNEL_KEY, kernel) == ELPA_OK) then
print *, " ", elpa_int_value_to_string(KERNEL_KEY, kernel) print *, " ", elpa_int_value_to_string(KERNEL_KEY, kernel)
endif endif
......
#if 0
! This file is part of ELPA.
!
! The ELPA library was originally created by the ELPA consortium,
! consisting of the following organizations:
!
! - Max Planck Computing and Data Facility (MPCDF), formerly known as
! Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
! - Bergische Universität Wuppertal, Lehrstuhl für angewandte
! Informatik,
! - Technische Universität München, Lehrstuhl für Informatik mit
! Schwerpunkt Wissenschaftliches Rechnen ,
! - Fritz-Haber-Institut, Berlin, Abt. Theorie,
! - Max-Plack-Institut für Mathematik in den Naturwissenschaften,
! Leipzig, Abt. Komplexe Strukutren in Biologie und Kognition,
! and
! - IBM Deutschland GmbH
!
!
! More information can be found here:
! http://elpa.mpcdf.mpg.de/
!
! ELPA is free software: you can redistribute it and/or modify
! it under the terms of the version 3 of the license of the
! GNU Lesser General Public License as published by the Free
! Software Foundation.
!
! ELPA is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! GNU Lesser General Public License for more details.
!
! You should have received a copy of the GNU Lesser General Public License
! along with ELPA. If not, see <http://www.gnu.org/licenses/>
!
! ELPA reflects a substantial effort on the part of the original
! ELPA consortium, and we ask you to respect the spirit of the
! license that we chose: i.e., please contribute any changes you
! may have back to the original ELPA library distribution, and keep
! any derivatives of ELPA under the same license that we chose for
! the original distribution, the GNU Lesser General Public License.
!
!
! --------------------------------------------------------------------------------------------------
!
! This file contains the compute intensive kernels for the Householder transformations.
!
! This is the small and simple version (no hand unrolling of loops etc.) but for some
! compilers this performs better than a sophisticated version with transformed and unrolled loops.
!
! It should be compiled with the highest possible optimization level.
!
! Copyright of the original code rests with the authors inside the ELPA
! consortium. The copyright of any additional modifications shall rest
! with their original authors, but shall adhere to the licensing terms
! distributed along with the original code in the file "COPYING".
!
! --------------------------------------------------------------------------------------------------
#endif
#include "config-f90.h"
!#ifndef USE_ASSUMED_SIZE
!module real_generic_simple_block4_kernel
!
! private
! public quad_hh_trafo_real_generic_simple_4hv_double
!
!#ifdef WANT_SINGLE_PRECISION_REAL
! public quad_hh_trafo_real_generic_simple_4hv_single
!#endif
!
! contains
!#endif
#define REALCASE 1
#define DOUBLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block4_template.F90"
#undef REALCASE
#undef DOUBLE_PRECISION
#ifdef WANT_SINGLE_PRECISION_REAL
#define REALCASE 1
#define SINGLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block4_template.F90"
#undef REALCASE
#undef SINGLE_PRECISION
#endif
!#ifndef USE_ASSUMED_SIZE
!end module real_generic_simple_block4_kernel
!#endif
! --------------------------------------------------------------------------------------------------
#if 0
! This file is part of ELPA.
!
! The ELPA library was originally created by the ELPA consortium,
! consisting of the following organizations:
!
! - Max Planck Computing and Data Facility (MPCDF), formerly known as
! Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
! - Bergische Universität Wuppertal, Lehrstuhl für angewandte
! Informatik,
! - Technische Universität München, Lehrstuhl für Informatik mit
! Schwerpunkt Wissenschaftliches Rechnen ,
! - Fritz-Haber-Institut, Berlin, Abt. Theorie,
! - Max-Plack-Institut für Mathematik in den Naturwissenschaften,
! Leipzig, Abt. Komplexe Strukutren in Biologie und Kognition,
! and
! - IBM Deutschland GmbH
!
!
! More information can be found here:
! http://elpa.mpcdf.mpg.de/
!
! ELPA is free software: you can redistribute it and/or modify
! it under the terms of the version 3 of the license of the
! GNU Lesser General Public License as published by the Free
! Software Foundation.
!
! ELPA is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! GNU Lesser General Public License for more details.
!
! You should have received a copy of the GNU Lesser General Public License
! along with ELPA. If not, see <http://www.gnu.org/licenses/>
!
! ELPA reflects a substantial effort on the part of the original
! ELPA consortium, and we ask you to respect the spirit of the
! license that we chose: i.e., please contribute any changes you
! may have back to the original ELPA library distribution, and keep
! any derivatives of ELPA under the same license that we chose for
! the original distribution, the GNU Lesser General Public License.
!
!
! --------------------------------------------------------------------------------------------------
!
! This file contains the compute intensive kernels for the Householder transformations.
!
! This is the small and simple version (no hand unrolling of loops etc.) but for some
! compilers this performs better than a sophisticated version with transformed and unrolled loops.
!
! It should be compiled with the highest possible optimization level.
!
! Copyright of the original code rests with the authors inside the ELPA
! consortium. The copyright of any additional modifications shall rest
! with their original authors, but shall adhere to the licensing terms
! distributed along with the original code in the file "COPYING".
!
! Author: A. Marek, MPCDF
! --------------------------------------------------------------------------------------------------
#endif
#include "config-f90.h"
!#ifndef USE_ASSUMED_SIZE
!module real_generic_simple_block6_kernel
!
! private
! public hexa_hh_trafo_real_generic_simple_6hv_double
!
!#ifdef WANT_SINGLE_PRECISION_REAL
! public hexa_hh_trafo_real_generic_simple_6hv_single
!#endif
!
! contains
!#endif
#define REALCASE 1
#define DOUBLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block6_template.F90"
#undef REALCASE
#undef DOUBLE_PRECISION
#ifdef WANT_SINGLE_PRECISION_REAL
#define REALCASE 1
#define SINGLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block6_template.F90"
#undef REALCASE
#undef SINGLE_PRECISION
#endif
!#ifndef USE_ASSUMED_SIZE
!end module real_generic_simple_block6_kernel
!#endif
! --------------------------------------------------------------------------------------------------
#if 0
! This file is part of ELPA.
!
! The ELPA library was originally created by the ELPA consortium,
! consisting of the following organizations:
!
! - Max Planck Computing and Data Facility (MPCDF), formerly known as
! Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
! - Bergische Universität Wuppertal, Lehrstuhl für angewandte
! Informatik,
! - Technische Universität München, Lehrstuhl für Informatik mit
! Schwerpunkt Wissenschaftliches Rechnen,
! - Fritz-Haber-Institut, Berlin, Abt. Theorie,
! - Max-Planck-Institut für Mathematik in den Naturwissenschaften,
! Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition,
! and
! - IBM Deutschland GmbH
!
!
! More information can be found here:
! http://elpa.mpcdf.mpg.de/
!
! ELPA is free software: you can redistribute it and/or modify
! it under the terms of the version 3 of the license of the
! GNU Lesser General Public License as published by the Free
! Software Foundation.
!
! ELPA is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! GNU Lesser General Public License for more details.
!
! You should have received a copy of the GNU Lesser General Public License
! along with ELPA. If not, see <http://www.gnu.org/licenses/>
!
! ELPA reflects a substantial effort on the part of the original
! ELPA consortium, and we ask you to respect the spirit of the
! license that we chose: i.e., please contribute any changes you
! may have back to the original ELPA library distribution, and keep
! any derivatives of ELPA under the same license that we chose for
! the original distribution, the GNU Lesser General Public License.
!
!
! --------------------------------------------------------------------------------------------------
!
! This file contains the compute-intensive kernels for the Householder transformations.
!
! This is the small and simple version (no hand-unrolled loops etc.); with some
! compilers it performs better than a more sophisticated version with transformed and unrolled loops.
!
! It should be compiled with the highest possible optimization level.
!
! Copyright of the original code rests with the authors inside the ELPA
! consortium. The copyright of any additional modifications shall rest
! with their original authors, but shall adhere to the licensing terms
! distributed along with the original code in the file "COPYING".
!
! --------------------------------------------------------------------------------------------------
#endif
subroutine quad_hh_trafo_&
&MATH_DATATYPE&
&_generic_simple_4hv_&
&PRECISION&
& (q, hh, nb, nq, ldq, ldh)
use precision
use elpa_abstract_impl
implicit none
!class(elpa_abstract_impl_t), intent(inout) :: obj
integer(kind=ik), intent(in) :: nb, nq, ldq, ldh
#if REALCASE==1
#ifdef USE_ASSUMED_SIZE
real(kind=C_DATATYPE_KIND), intent(inout) :: q(ldq,*)
real(kind=C_DATATYPE_KIND), intent(in) :: hh(ldh,*)
#else
real(kind=C_DATATYPE_KIND), intent(inout) :: q(1:ldq,1:nb+3)
real(kind=C_DATATYPE_KIND), intent(in) :: hh(1:ldh,1:6)
#endif
real(kind=C_DATATYPE_KIND) :: s_1_2, s_1_3, s_2_3, s_1_4, s_2_4, s_3_4
real(kind=C_DATATYPE_KIND) :: vs_1_2, vs_1_3, vs_2_3, vs_1_4, vs_2_4, vs_3_4
real(kind=C_DATATYPE_KIND) :: h_2_1, h_3_2, h_3_1, h_4_3, h_4_2, h_4_1
real(kind=C_DATATYPE_KIND) :: a_1_1(nq), a_2_1(nq), a_3_1(nq), a_4_1(nq)
real(kind=C_DATATYPE_KIND) :: h1, h2, h3, h4
real(kind=C_DATATYPE_KIND) :: w(nq), z(nq), x(nq), y(nq)
real(kind=C_DATATYPE_KIND) :: tau1, tau2, tau3, tau4
#endif /* REALCASE==1 */
#if COMPLEXCASE==1
#ifdef USE_ASSUMED_SIZE
complex(kind=C_DATATYPE_KIND), intent(inout) :: q(ldq,*)
complex(kind=C_DATATYPE_KIND), intent(in) :: hh(ldh,*)
#else
complex(kind=C_DATATYPE_KIND), intent(inout) :: q(1:ldq,1:nb+3)
complex(kind=C_DATATYPE_KIND), intent(in) :: hh(1:ldh,1:6)
#endif
complex(kind=C_DATATYPE_KIND) :: s_1_2, s_1_3, s_2_3, s_1_4, s_2_4, s_3_4
complex(kind=C_DATATYPE_KIND) :: vs_1_2, vs_1_3, vs_2_3, vs_1_4, vs_2_4, vs_3_4
complex(kind=C_DATATYPE_KIND) :: h_2_1, h_3_2, h_3_1, h_4_3, h_4_2, h_4_1
complex(kind=C_DATATYPE_KIND) :: a_1_1(nq), a_2_1(nq), a_3_1(nq), a_4_1(nq)
complex(kind=C_DATATYPE_KIND) :: w(nq), z(nq), x(nq), y(nq)
complex(kind=C_DATATYPE_KIND) :: h1, h2, h3, h4
complex(kind=C_DATATYPE_KIND) :: tau1, tau2, tau3, tau4
#endif /* COMPLEXCASE==1 */
integer(kind=ik) :: i
! Calculate the pairwise dot products of the four Householder vectors
#if REALCASE==1
s_1_2 = hh(2,2)
s_1_3 = hh(3,3)
s_2_3 = hh(2,3)
s_1_4 = hh(4,4)
s_2_4 = hh(3,4)
s_3_4 = hh(2,4)
s_1_2 = s_1_2 + hh(2,1) * hh(3,2)
s_2_3 = s_2_3 + hh(2,2) * hh(3,3)
s_3_4 = s_3_4 + hh(2,3) * hh(3,4)
s_1_2 = s_1_2 + hh(3,1) * hh(4,2)
s_2_3 = s_2_3 + hh(3,2) * hh(4,3)
s_3_4 = s_3_4 + hh(3,3) * hh(4,4)
s_1_3 = s_1_3 + hh(2,1) * hh(4,3)
s_2_4 = s_2_4 + hh(2,2) * hh(4,4)
!DIR$ IVDEP
do i=5,nb
s_1_2 = s_1_2 + hh(i-1,1) * hh(i,2)
s_2_3 = s_2_3 + hh(i-1,2) * hh(i,3)
s_3_4 = s_3_4 + hh(i-1,3) * hh(i,4)
s_1_3 = s_1_3 + hh(i-2,1) * hh(i,3)
s_2_4 = s_2_4 + hh(i-2,2) * hh(i,4)
s_1_4 = s_1_4 + hh(i-3,1) * hh(i,4)
enddo
#endif
#if COMPLEXCASE==1
stop ! the complex case is not implemented in this kernel
!s = conjg(hh(2,2))*1.0
!do i=3,nb
! s = s+(conjg(hh(i,2))*hh(i-1,1))
!enddo
#endif
! Do the Householder transformations
a_1_1(1:nq) = q(1:nq,4)
a_2_1(1:nq) = q(1:nq,3)
a_3_1(1:nq) = q(1:nq,2)
a_4_1(1:nq) = q(1:nq,1)
h_2_1 = hh(2,2)
h_3_2 = hh(2,3)
h_3_1 = hh(3,3)
h_4_3 = hh(2,4)
h_4_2 = hh(3,4)
h_4_1 = hh(4,4)
#if REALCASE == 1
w(1:nq) = a_3_1(1:nq) * h_4_3 + a_4_1(1:nq)
w(1:nq) = a_2_1(1:nq) * h_4_2 + w(1:nq)
w(1:nq) = a_1_1(1:nq) * h_4_1 + w(1:nq)
z(1:nq) = a_2_1(1:nq) * h_3_2 + a_3_1(1:nq)
z(1:nq) = a_1_1(1:nq) * h_3_1 + z(1:nq)
y(1:nq) = a_1_1(1:nq) * h_2_1 + a_2_1(1:nq)
x(1:nq) = a_1_1(1:nq)
#endif
#if COMPLEXCASE==1
stop ! the complex case is not implemented in this kernel
!y(1:nq) = q(1:nq,1) + q(1:nq,2)*conjg(hh(2,2))
#endif
do i=5,nb
#if REALCASE == 1
h1 = hh(i-3,1)
h2 = hh(i-2,2)
h3 = hh(i-1,3)
h4 = hh(i ,4)
#endif
#if COMPLEXCASE==1
stop ! the complex case is not implemented in this kernel
! h1 = conjg(hh(i-1,1))
! h2 = conjg(hh(i,2))
#endif
x(1:nq) = x(1:nq) + q(1:nq,i) * h1
y(1:nq) = y(1:nq) + q(1:nq,i) * h2
z(1:nq) = z(1:nq) + q(1:nq,i) * h3
w(1:nq) = w(1:nq) + q(1:nq,i) * h4
enddo
h1 = hh(nb-2,1)
h2 = hh(nb-1,2)
h3 = hh(nb ,3)
#if REALCASE==1
x(1:nq) = x(1:nq) + q(1:nq,nb+1) * h1
y(1:nq) = y(1:nq) + q(1:nq,nb+1) * h2
z(1:nq) = z(1:nq) + q(1:nq,nb+1) * h3
#endif
#if COMPLEXCASE==1
stop ! the complex case is not implemented in this kernel
!x(1:nq) = x(1:nq) + q(1:nq,nb+1)*conjg(hh(nb,1))
#endif
h1 = hh(nb-1,1)
h2 = hh(nb ,2)
x(1:nq) = x(1:nq) + q(1:nq,nb+2) * h1
y(1:nq) = y(1:nq) + q(1:nq,nb+2) * h2
h1 = hh(nb,1)
x(1:nq) = x(1:nq) + q(1:nq,nb+3) * h1
! Rank-1 updates (one per Householder vector)
tau1 = hh(1,1)
tau2 = hh(1,2)
tau3 = hh(1,3)
tau4 = hh(1,4)
vs_1_2 = s_1_2
vs_1_3 = s_1_3
vs_2_3 = s_2_3
vs_1_4 = s_1_4
vs_2_4 = s_2_4
vs_3_4 = s_3_4
h1 = tau1
x(1:nq) = x(1:nq) * h1
h1 = tau2
h2 = tau2 * vs_1_2
y(1:nq) = y(1:nq) * h1 - x(1:nq) * h2
h1 = tau3
h2 = tau3 * vs_1_3
h3 = tau3 * vs_2_3
z(1:nq) = z(1:nq) * h1 - (y(1:nq) * h3 + x(1:nq) * h2)
h1 = tau4
h2 = tau4 * vs_1_4
h3 = tau4 * vs_2_4
h4 = tau4 * vs_3_4
w(1:nq) = w(1:nq) * h1 - ( z(1:nq) * h4 + y(1:nq) * h3 + x(1:nq) * h2)
q(1:nq,1) = q(1:nq,1) - w(1:nq)
h4 = hh(2,4)
q(1:nq,2) = q(1:nq,2) - (w(1:nq) * h4 + z(1:nq))
h3 = hh(2,3)
h4 = hh(3,4)
q(1:nq,3) = q(1:nq,3) - y(1:nq)
q(1:nq,3) = -( z(1:nq) * h3) + q(1:nq,3)
q(1:nq,3) = -( w(1:nq) * h4) + q(1:nq,3)
h2 = hh(2,2)
h3 = hh(3,3)
h4 = hh(4,4)
q(1:nq,4) = q(1:nq,4) - x(1:nq)
q(1:nq,4) = -(y(1:nq) * h2) + q(1:nq,4)
q(1:nq,4) = -(z(1:nq) * h3) + q(1:nq,4)
q(1:nq,4) = -(w(1:nq) * h4) + q(1:nq,4)
do i=5,nb
h1 = hh(i-3,1)
h2 = hh(i-2,2)
h3 = hh(i-1,3)
h4 = hh(i ,4)
q(1:nq,i) = -(x(1:nq) * h1) + q(1:nq,i)
q(1:nq,i) = -(y(1:nq) * h2) + q(1:nq,i)
q(1:nq,i) = -(z(1:nq) * h3) + q(1:nq,i)
q(1:nq,i) = -(w(1:nq) * h4) + q(1:nq,i)
enddo
h1 = hh(nb-2,1)
h2 = hh(nb-1,2)
h3 = hh(nb ,3)
q(1:nq,nb+1) = -(x(1:nq) * h1) + q(1:nq,nb+1)
q(1:nq,nb+1) = -(y(1:nq) * h2) + q(1:nq,nb+1)
q(1:nq,nb+1) = -(z(1:nq) * h3) + q(1:nq,nb+1)
h1 = hh(nb-1,1)
h2 = hh(nb ,2)
q(1:nq,nb+2) = - (x(1:nq) * h1) + q(1:nq,nb+2)
q(1:nq,nb+2) = - (y(1:nq) * h2) + q(1:nq,nb+2)
h1 = hh(nb,1)
q(1:nq,nb+3) = - (x(1:nq) * h1) + q(1:nq,nb+3)
end subroutine
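Read as plain linear algebra, the index pattern of the routine above suggests that it applies four Householder reflectors H_j = I - tau_j v_j v_j^T to every row of `q` in a single pass, with tau_j = hh(1,j), v_j consisting of an implicit leading 1 followed by hh(2:nb,j), and v_j starting at column 5-j of `q`. Under those assumptions, a minimal unblocked reference in C that a blocked kernel could be checked against might look like the sketch below. It is not ELPA code, and `apply_reflector` and `apply_four_reflectors` are hypothetical names.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Apply H = I - tau * v * v^T from the right to each of the nq rows of q.
 * q is column-major with leading dimension ldq (Fortran layout).
 * h[0] holds tau; v has an implicit 1 followed by h[1..nb-1].
 * v acts on the 0-based columns col0 .. col0+nb-1 of q. */
static void apply_reflector(double *q, size_t ldq, size_t nq, size_t col0,
                            const double *h, size_t nb)
{
    const double tau = h[0];
    for (size_t r = 0; r < nq; ++r) {
        double x = q[r + col0 * ldq];              /* contribution of v[0] = 1 */
        for (size_t k = 1; k < nb; ++k)
            x += q[r + (col0 + k) * ldq] * h[k];   /* x = (row r) . v */
        x *= tau;
        q[r + col0 * ldq] -= x;                    /* rank-1 update of row r */
        for (size_t k = 1; k < nb; ++k)
            q[r + (col0 + k) * ldq] -= x * h[k];
    }
}

/* Apply reflectors 1..4 one after another. hh is column-major with leading
 * dimension ldh and four columns; q needs at least nb+3 columns.
 * Reflector j (1-based) starts at Fortran column 5-j, i.e. 0-based 4-j. */
static void apply_four_reflectors(double *q, size_t ldq, size_t nq,
                                  const double *hh, size_t ldh, size_t nb)
{
    for (size_t j = 0; j < 4; ++j)
        apply_reflector(q, ldq, nq, 3 - j, hh + j * ldh, nb);
}
```

The blocked routine precomputes the dot products between the shifted vectors so that `q` is read and written only once; the reference above trades that efficiency for readability. Results of the two would be expected to agree only up to rounding.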
@@ -56,7 +56,7 @@ module elpa2_utilities
 implicit none
 public
-integer(kind=c_int), parameter :: number_of_real_kernels = ELPA_2STAGE_NUMBER_OF_REAL_KERNELS - 6
+integer(kind=c_int), parameter :: number_of_real_kernels = ELPA_2STAGE_NUMBER_OF_REAL_KERNELS - 7
 integer(kind=c_int), parameter :: number_of_complex_kernels = ELPA_2STAGE_NUMBER_OF_COMPLEX_KERNELS
 #ifdef WITH_REAL_GENERIC_KERNEL
......
// This file is part of ELPA.
//
// The ELPA library was originally created by the ELPA consortium,
// consisting of the following organizations:
//
// - Max Planck Computing and Data Facility (MPCDF), formerly known as
// Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
// - Bergische Universität Wuppertal, Lehrstuhl für angewandte
// Informatik,
// - Technische Universität München, Lehrstuhl für Informatik mit
// Schwerpunkt Wissenschaftliches Rechnen,
// - Fritz-Haber-Institut, Berlin, Abt. Theorie,
// - Max-Planck-Institut für Mathematik in den Naturwissenschaften,
// Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition,
// and
// - IBM Deutschland GmbH
//
// This particular source code file has been developed within the ELPA-AEO
// project, which has been a joint effort of
//
// - Max Planck Computing and Data Facility (MPCDF), formerly known as
// Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
// - Bergische Universität Wuppertal, Lehrstuhl für angewandte
// Informatik,
// - Technische Universität München, Lehrstuhl für Informatik mit
// Schwerpunkt Wissenschaftliches Rechnen ,
// - Technische Universität München, Lehrstuhl für Theoretische Chemie,
// - Fritz-Haber-Institut, Berlin, Abt. Theorie,
// More information can be found here:
// http://elpa.mpcdf.mpg.de/ and
// http://elpa-aeo.mpcdf.mpg.de
//
// ELPA is free software: you can redistribute it and/or modify
// it under the terms of the version 3 of the license of the
// GNU Lesser General Public License as published by the Free
// Software Foundation.
//
// ELPA is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with ELPA. If not, see <http://www.gnu.org/licenses/>
//
// ELPA reflects a substantial effort on the part of the original
// ELPA consortium, and we ask you to respect the spirit of the
// license that we chose: i.e., please contribute any changes you
// may have back to the original ELPA library distribution, and keep
// any derivatives of ELPA under the same license that we chose for
// the original distribution, the GNU Lesser General Public License.
//
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
#include "config-f90.h"
#include <stdio.h>
#include <stdlib.h>
......
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
// it seems, that we need those two levels of indirection to correctly expand macros
#define cannons_triang_rectangular_impl_expand2(SUFFIX) cannons_triang_rectangular_##SUFFIX
#define cannons_triang_rectangular_impl_expand1(SUFFIX) cannons_triang_rectangular_impl_expand2(SUFFIX)
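The two levels of indirection are needed because the preprocessor does not macro-expand an argument that is an operand of `##`; it pastes the argument's spelling as written. Routing the argument through an outer macro first lets it expand before the paste happens. A small self-contained sketch of the effect follows; `SUFFIX_VALUE`, the `PASTE_*` macros, and the `*_name` helpers are illustrative names, not taken from ELPA.

```c
#include <assert.h>
#include <string.h>

#define SUFFIX_VALUE d                 /* stands in for a precision suffix */

/* One level: S is an operand of ##, so it is pasted without expansion. */
#define PASTE_DIRECT(S) demo_##S

/* Two levels: the outer macro expands S first, the inner one pastes. */
#define PASTE_EXPAND2(S) demo_##S
#define PASTE_EXPAND1(S) PASTE_EXPAND2(S)

/* Stringify the fully expanded result so it can be inspected at run time. */
#define STR2(x) #x
#define STR1(x) STR2(x)

/* Yields the name produced by the one-level macro. */
static const char *direct_name(void)
{
    return STR1(PASTE_DIRECT(SUFFIX_VALUE));
}

/* Yields the name produced by the two-level macro. */
static const char *indirect_name(void)
{
    return STR1(PASTE_EXPAND1(SUFFIX_VALUE));
}
```

With these definitions, `direct_name()` returns `"demo_SUFFIX_VALUE"` while `indirect_name()` returns `"demo_d"`, which is the behaviour the `expand1`/`expand2` pair relies on.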
......
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
// it seems, that we need those two levels of indirection to correctly expand macros
#define cannons_reduction_impl_expand2(SUFFIX) cannons_reduction_##SUFFIX
......
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
#include <stdio.h>
#include <stdlib.h>
#ifdef WITH_MPI
......
@@ -498,3 +498,5 @@ int elpa_index_print_autotune_state(elpa_index_t index, int autotune_level, int
 */
 int elpa_index_load_autotune_state(elpa_index_t index, int* autotune_level, int* autotune_domain, int* min_loc,
 double* min_val, int* current, int* cardinality, char* filename);
+int elpa_index_is_printing_mpi_rank(elpa_index_t index);
@@ -76,6 +76,8 @@ int ftimings_papi_init(void) {
 flops_available = 1;
 }
+ldst_available = 0;
+#if 0
 /* Loads + Stores */
 if ((ret = PAPI_query_event(PAPI_LD_INS)) < 0) {
 fprintf(stderr, "ftimings: %s:%d: PAPI_query_event(PAPI_LD_INS): %s\n",
@@ -96,7 +98,7 @@ int ftimings_papi_init(void) {
 } else {
 ldst_available = 1;
 }
+#endif
 /* Start */
 if ((ret = PAPI_start(event_set)) < 0) {
 fprintf(stderr, "ftimings: %s:%d PAPI_start(): %s\n",
......
 AC_PREREQ([2.69])
-AC_INIT([elpa_test_project],[2018.11.001.rc1], elpa-library@rzg.mpg.de)
+AC_INIT([elpa_test_project],[2018.11.001], elpa-library@rzg.mpg.de)
-elpaversion="2018.11.001.rc1"
+elpaversion="2018.11.001"
 AC_CONFIG_SRCDIR([src/test_real.F90])
 AM_INIT_AUTOMAKE([foreign -Wall subdir-objects])
......
 AC_PREREQ([2.69])
-AC_INIT([elpa_test_project],[2018.11.001.rc1], elpa-library@rzg.mpg.de)
+AC_INIT([elpa_test_project],[2018.11.001], elpa-library@rzg.mpg.de)
-elpaversion="2018.11.001.rc1"
+elpaversion="2018.11.001"
 AC_CONFIG_SRCDIR([src/test_real.F90])
 AM_INIT_AUTOMAKE([foreign -Wall subdir-objects])
......