Commits (41)
Changelog for ELPA 2018.11.001.rc1
Changelog for upcoming release
- user can define the default kernels
- simple block4 and block6 real kernel
- ELPA versioning number is provided in the C header files
Changelog for ELPA 2018.11.001
- improved autotuning
- improved performance of generalized problem via Cannon's algorithm
......
......@@ -2,7 +2,7 @@
## Preamble ##
This file provides documentation on how to build the *ELPA* library in **version ELPA-2018.11.001.rc1**.
This file provides documentation on how to build the *ELPA* library in **version ELPA-2018.11.001**.
With release of **version ELPA-2017.05.001** the build process has been significantly simplified,
which makes it easier to install the *ELPA* library.
......
......@@ -3,7 +3,7 @@
For more details and recent updates please visit the online [issue system] (https://gitlab.mpcdf.mpg.de/elpa/elpa/issues)
Issues which are not mentioned in a newer release are (considered as) solved.
### ELPA 2018.11.001.rc1 release ###
### ELPA 2018.11.001 release ###
- same issues as in ELPA 2017.11.001
### ELPA 2018.05.001 release ###
......
......@@ -78,7 +78,7 @@ https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
"legacy interface", since as announced some deprecated function aliases have been
removed). For the current interface all changes since 2017.05.001 are
compatible, since only some functions have been added.
The state of release 2017.11.001.(rc1) defines this interface
The state of release 2017.11.001 defines this interface
- 12
No incompatible API changes w.r.t. the previous version. Some functions have been
......
......@@ -108,6 +108,7 @@ EXTRA_libelpa@SUFFIX@_private_la_DEPENDENCIES = \
src/elpa2/kernels/real_template.F90 \
src/elpa2/kernels/complex_template.F90 \
src/elpa2/kernels/simple_template.F90 \
src/elpa2/kernels/simple_block4_template.F90 \
src/elpa2/pack_unpack_cpu.F90 \
src/elpa2/pack_unpack_gpu.F90 \
src/elpa2/compute_hh_trafo.F90 \
......@@ -188,6 +189,13 @@ if WITH_COMPLEX_GENERIC_SIMPLE_KERNEL
libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/complex_simple.F90
endif
if WITH_REAL_GENERIC_SIMPLE_BLOCK4_KERNEL
libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_simple_block4.F90
endif
#if WITH_REAL_GENERIC_SIMPLE_BLOCK6_KERNEL
# libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_simple_block6.F90
#endif
if WITH_REAL_BGP_KERNEL
libelpa@SUFFIX@_private_la_SOURCES += src/elpa2/kernels/real_bgp.f90
endif
......@@ -443,6 +451,7 @@ nobase_elpa_include_HEADERS = \
elpa/elpa_legacy.h
nobase_nodist_elpa_include_HEADERS = \
elpa/elpa_version.h \
elpa/elpa_constants.h \
elpa/elpa_generated.h \
elpa/elpa_generated_legacy.h
......@@ -779,6 +788,7 @@ EXTRA_DIST = \
src/elpa2/kernels/real_sse_6hv_template.c \
src/elpa2/kernels/real_template.F90 \
src/elpa2/kernels/simple_template.F90 \
src/elpa2/kernels/simple_block4_template.F90 \
src/elpa2/pack_unpack_cpu.F90 \
src/elpa2/pack_unpack_gpu.F90 \
src/elpa2/qr/elpa_pdgeqrf_template.F90 \
......
......@@ -2,7 +2,7 @@
## Current Release ##
The current release is ELPA 2018.11.001.rc1. The current supported API version
The current release is ELPA 2018.11.001. The current supported API version
is 20181113. This release supports the earliest API version 20170403.
The old, obsolete legacy API will be deprecated in the future !
......@@ -110,7 +110,7 @@ the possible configure options.
## Using *ELPA*
Please have a look at the "**USERS_GUIDE**" file, to get a documentation or at the [online]
(http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) doxygen
(http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html) doxygen
documentation, where you find the definition of the interfaces.
## Contributing to *ELPA*
......
This file contains the release notes for the ELPA 2018.11.001.rc1 version
This file contains the release notes for the ELPA 2018.11.001 version
What is new?
-------------
......
......@@ -146,7 +146,7 @@ Local documentation (via man pages) should be available (if *ELPA* has been inst
For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program which prints all
the available kernels.
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html)
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html)
for each *ELPA* release is available.
......@@ -13,7 +13,7 @@ Local documentation (via man pages) should be available (if *ELPA* has been inst
For example "man elpa2_print_kernels" should provide the documentation for the *ELPA* program, which prints all
the available kernels.
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html)
Also a [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html)
for each *ELPA* release is available.
......@@ -200,7 +200,7 @@ The following table gives a list of all supported parameters which can be used t
## III) List of computational routines ##
The following compute routines are available in *ELPA*: Please have a look at the man pages or [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) for details.
The following compute routines are available in *ELPA*: Please have a look at the man pages or [online doxygen documentation] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html) for details.
| Name | Purpose | since API version |
......
......@@ -22,7 +22,7 @@ The *ELPA* library consists of two main parts:
Both variants of the *ELPA* solvers are available for real or complex single and double precision valued matrices.
Thus *ELPA* provides the following user functions (see man pages or [online] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001.rc1/html/index.html) for details):
Thus *ELPA* provides the following user functions (see man pages or [online] (http://elpa.mpcdf.mpg.de/html/Documentation/ELPA-2018.11.001/html/index.html) for details):
- elpa_get_communicators : set the row / column communicators for *ELPA*
- elpa_solve_evp_complex_1stage_{single|double} : solve a {single|double} precision complex eigenvalue problem with the *ELPA 1stage* solver
......
if [ "$(hostname)" == "freya01" ]; then module purge && source /mpcdf/soft/try_new_modules.sh && module load git intel/17.0 gcc/7 impi/2017.3 mkl/2017.3 autoconf automake libtool pkg-config anaconda/3 && unset SLURM_MPI_TYPE I_MPI_SLURM_EXT I_MPI_PMI_LIBRARY I_MPI_PMI2 I_MPI_HYDRA_BOOTSTRAP; fi
if [ "$(hostname)" == "freya01" ]; then module purge && source /mpcdf/soft/obs_modules.sh && module load git intel/18.0.3 impi/2018.3 mkl/2018.4 anaconda/3/5.1 mpi4py/3.0.0 gcc/8 autoconf automake libtool pkg-config && unset SLURM_MPI_TYPE I_MPI_SLURM_EXT I_MPI_PMI_LIBRARY I_MPI_PMI2 I_MPI_HYDRA_BOOTSTRAP; fi
if [ "$(hostname)" == "buildtest-rzg" ]; then module load impi/5.1.3 intel/16.0 gcc/6.3 mkl/11.3 autotools pkg-config; fi
......@@ -14,8 +14,8 @@ if [ "$(hostname)" == "amarek-elpa-gitlab-runner-2" ]; then module load intel/16
if [ "$(hostname)" == "amarek-elpa-gitlab-runner-3" ]; then module load intel/16.0 gcc mkl/11.3 autoconf automake libtool impi/5.1.3; fi
if [ "$(hostname)" == "amarek-elpa-gitlab-runner-4" ]; then module load intel/16.0 gcc mkl/11.3 autoconf automake libtool impi/5.1.3; fi
if [ "$(hostname)" == "dvl01" ]; then module load intel/17.0 gcc/5.4 mkl/2017 impi/2017.2 gcc/5.4 cuda/8.0; fi
if [ "$(hostname)" == "dvl02" ]; then module load intel/17.0 gcc/5.4 mkl/2017 impi/2017.2 gcc/5.4 cuda/8.0; fi
if [ "$(hostname)" == "dvl01" ]; then module load intel/17.0 gcc/6.4 mkl/2017 impi/2017.4 cuda/9.2; fi
if [ "$(hostname)" == "dvl02" ]; then module load intel/17.0 gcc/6.4 mkl/2017 impi/2017.4 cuda/9.2; fi
if [ "$(hostname)" == "miy01" ]; then module purge && module load gcc/5.4 smpi essl/5.5 cuda pgi/17.9 && export LD_LIBRARY_PATH=/opt/ibm/spectrum_mpi/lib:/opt/ibm/spectrum_mpi/profilesupport/lib:$LD_LIBRARY_PATH && export PATH=/opt/ibm/spectrum_mpi/bin:$PATH && export OMPI_CC=gcc && export OMPI_FC=gfortran; fi
if [ "$(hostname)" == "miy02" ]; then module load gcc/5.4 pgi/17.9 ompi/pgi/17.9/1.10.2 essl/5.5 cuda && export LD_LIBRARY_PATH=/opt/ibm/spectrum_mpi/lib:/opt/ibm/spectrum_mpi/profilesupport/lib:$LD_LIBRARY_PATH && export PATH=/opt/ibm/spectrum_mpi/bin:$PATH; fi
......
#!/bin/bash
source /etc/profile.d/modules.sh
#source /etc/profile.d/modules.sh
if [ -f /etc/profile.d/modules.sh ]; then source /etc/profile.d/modules.sh ; else source /etc/profile.d/mpcdf_modules.sh; fi
set -ex
source ./ci_test_scripts/.ci-env-vars
......
#!/bin/bash
source /etc/profile.d/modules.sh
#source /etc/profile.d/modules.sh
if [ -f /etc/profile.d/modules.sh ]; then source /etc/profile.d/modules.sh ; else source /etc/profile.d/mpcdf_modules.sh; fi
set -ex
source ./ci_test_scripts/.ci-env-vars
......
......@@ -336,6 +336,19 @@ print(" # stupid 'make distcheck' leaves behind write-protected files that th
print(' - make distcheck DISTCHECK_CONFIGURE_FLAGS="FC=mpiifort FCFLAGS=\\"-xHost\\" CFLAGS=\\"-march=native\\" SCALAPACK_LDFLAGS=\\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\\" SCALAPACK_FCFLAGS=\\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\\" --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2" TASKS=2 TEST_FLAGS="150 50 16" || { chmod u+rwX -R . ; exit 1 ; }')
print("\n\n")
print("distcheck-no-autotune:")
print(" tags:")
print(" - buildtest")
print(" script:")
print(" - ./configure FC=mpiifort FCFLAGS=\"-xHost\" CFLAGS=\"-march=native\" SCALAPACK_LDFLAGS=\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\" SCALAPACK_FCFLAGS=\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\" --enable-option-checking=fatal --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-autotuning || { cat config.log; exit 1; }")
print(" # stupid 'make distcheck' leaves behind write-protected files that the stupid gitlab runner cannot remove")
print(' - make distcheck DISTCHECK_CONFIGURE_FLAGS="FC=mpiifort FCFLAGS=\\"-xHost\\" CFLAGS=\\"-march=native\\" SCALAPACK_LDFLAGS=\\"$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP\\" SCALAPACK_FCFLAGS=\\"$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP\\" --with-mpi=yes --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-autotuning " TASKS=2 TEST_FLAGS="150 50 16" || { chmod u+rwX -R . ; exit 1 ; }')
print("\n\n")
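The new `distcheck-no-autotune` job repeats the existing distcheck job almost verbatim and differs only in the added `--disable-autotuning` flag. A minimal sketch of how the generator could build both jobs from one helper follows. The helper name `distcheck_job`, the two flag constants, and the YAML indentation are assumptions for illustration; the flags themselves are copied from the `print` calls above.

```python
# Flags shared by both distcheck jobs, copied from the print() calls in the diff.
COMPILER_FLAGS = (
    'FC=mpiifort FCFLAGS="-xHost" CFLAGS="-march=native" '
    'SCALAPACK_LDFLAGS="$MKL_INTEL_SCALAPACK_LDFLAGS_MPI_NO_OMP" '
    'SCALAPACK_FCFLAGS="$MKL_INTEL_SCALAPACK_FCFLAGS_MPI_NO_OMP"'
)
FEATURE_FLAGS = (
    "--with-mpi=yes --disable-sse-assembly --disable-sse "
    "--disable-avx --disable-avx2"
)


def distcheck_job(name, extra_flags=""):
    """Return the YAML lines of one distcheck job (hypothetical helper).

    The indentation of the emitted YAML is an assumption; the scraped diff
    does not preserve the original whitespace.
    """
    features = (FEATURE_FLAGS + " " + extra_flags).strip()
    # Inside DISTCHECK_CONFIGURE_FLAGS="..." the inner quotes must be escaped.
    nested = COMPILER_FLAGS.replace('"', '\\"')
    return [
        name + ":",
        "  tags:",
        "    - buildtest",
        "  script:",
        "    - ./configure " + COMPILER_FLAGS
        + " --enable-option-checking=fatal " + features
        + " || { cat config.log; exit 1; }",
        '    - make distcheck DISTCHECK_CONFIGURE_FLAGS="' + nested + " "
        + features + '" TASKS=2 TEST_FLAGS="150 50 16"'
        + " || { chmod u+rwX -R . ; exit 1 ; }",
    ]


job = distcheck_job("distcheck-no-autotune", "--disable-autotuning")
print("\n".join(job))
print("\n\n")
```

Calling `distcheck_job("distcheck")` without extra flags would reproduce the shape of the original job, so the two definitions cannot drift apart.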
# add python tests
python_ci_tests = [
"# python tests",
......
......@@ -13,6 +13,7 @@ configueArg=""
skipStep=0
batchCommand=""
interactiveRun="yes"
SLURMBATCH="no"
function usage() {
cat >&2 <<-EOF
......@@ -58,7 +59,7 @@ function usage() {
}
while getopts "c:t:j:m:n:b:o:s:q:i:h" opt; do
while getopts "c:t:j:m:n:b:o:s:q:S:i:h" opt; do
case $opt in
j)
makeTasks=$OPTARG;;
......@@ -80,6 +81,8 @@ while getopts "c:t:j:m:n:b:o:s:q:i:h" opt; do
batchCommand=$OPTARG;;
i)
interactiveRun=$OPTARG;;
S)
SLURMBATCH=$OPTARG;;
:)
echo "Option -$OPTARG requires an argument" >&2;;
h)
......
#!/bin/bash
source /etc/profile.d/modules.sh
#source /etc/profile.d/modules.sh
if [ -f /etc/profile.d/modules.sh ]; then source /etc/profile.d/modules.sh ; else source /etc/profile.d/mpcdf_modules.sh; fi
set -ex
source ./ci_test_scripts/.ci-env-vars
......
......@@ -29,12 +29,21 @@ AM_SILENT_RULES([yes])
#
AC_SUBST([ELPA_SO_VERSION], [13:0:0])
# AC_DEFINE_SUBST(NAME, VALUE, DESCRIPTION)
# -----------------------------------------
AC_DEFUN([AC_DEFINE_SUBST], [
AC_DEFINE([$1], [$2], [$3])
AC_SUBST([$1], ['$2'])
])
# API Version
AC_DEFINE([EARLIEST_API_VERSION], [20170403], [Earliest supported ELPA API version])
AC_DEFINE([CURRENT_API_VERSION], [20181113], [Current ELPA API version])
AC_DEFINE_SUBST(CURRENT_API_VERSION, 20181113, "Current ELPA API version")
# Autotune Version
AC_DEFINE([EARLIEST_AUTOTUNE_VERSION], [20171201], [Earliest ELPA API version, which supports autotuning])
AC_DEFINE([CURRENT_AUTOTUNE_VERSION], [20181113], [Current ELPA autotune version])
AC_DEFINE_SUBST(CURRENT_AUTOTUNE_VERSION, 20181113, "Current ELPA autotune version")
AX_CHECK_GNU_MAKE()
if test x$_cv_gnu_make_command = x ; then
......@@ -540,6 +549,7 @@ m4_pattern_forbid([elpa_m4])
m4_define(elpa_m4_generic_kernels, [
real_generic
real_generic_simple
real_generic_simple_block4
complex_generic
complex_generic_simple
])
......@@ -748,6 +758,30 @@ m4_foreach_w([elpa_m4_type],elpa_m4_kernel_types,[
dnl the list of kernels is now assembled
dnl choosing a default kernel
m4_foreach_w([elpa_m4_kind],[real complex],[
AC_ARG_WITH([default-]elpa_m4_kind[-kernel], m4_expand([AS_HELP_STRING([--with-default-]elpa_m4_kind[-kernel]=KERNEL,
[set a specific ]elpa_m4_kind[ kernel as default kernel. Available kernels are:]
m4_foreach_w([elpa_m4_kernel],m4_expand(elpa_m4_[]elpa_m4_kind[]_kernels),[m4_bpatsubst(elpa_m4_kernel,elpa_m4_kind[]_,[]) ]))]),
[default_]elpa_m4_kind[_kernel="]elpa_m4_kind[_$withval"],[default_]elpa_m4_kind[_kernel=""])
#if test -n "$default_[]elpa_m4_kind[]_kernel" ; then
# found="no"
# m4_foreach_w([elpa_m4_otherkernel],m4_expand(elpa_m4_[]elpa_m4_kind[]_kernels),[
# if test "$default_]elpa_m4_kind[_kernel" = "]elpa_m4_otherkernel[" ; then
# use_[]elpa_m4_otherkernel[]=yes
# found="yes"
# else
# use_[]elpa_m4_otherkernel[]=no
# fi
# ])
# if test x"$found" = x"no" ; then
# AC_MSG_ERROR([Invalid kernel "$default_]elpa_m4_kind[_kernel" specified for --with-default-]elpa_m4_kind[-kernel])
# fi
# AC_DEFINE([WITH_DEFAULT_]m4_toupper(elpa_m4_kind)[_KERNEL],[1],[use specific ]elpa_m4_kind[ default kernel (set at compile time)])
#fi
])
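The name handling in the new `--with-default-<kind>-kernel` option can be sketched in Python as below. This is only an illustration, not part of the build: the help text lists kernel names with the `<kind>_` prefix stripped (the `m4_bpatsubst` call), and the value the user passes is prefixed again to form `default_<kind>_kernel`. The validity check mirrors the block that is still commented out in this hunk, so configure does not currently perform it. Only the generic kernels visible in this diff are listed, and the function names are hypothetical.

```python
# Generic kernels from elpa_m4_generic_kernels as shown in this diff.
# The full kernel lists in configure.ac are longer; they are omitted here.
KERNELS = {
    "real": ["real_generic", "real_generic_simple", "real_generic_simple_block4"],
    "complex": ["complex_generic", "complex_generic_simple"],
}


def help_names(kind):
    """Kernel names as listed in --help: the '<kind>_' prefix is removed."""
    prefix = kind + "_"
    return [name[len(prefix):] for name in KERNELS[kind] if name.startswith(prefix)]


def default_kernel(kind, withval):
    """Mimic default_<kind>_kernel="<kind>_$withval".

    An empty withval stands for "option not given". The membership check
    corresponds to the commented-out AC_MSG_ERROR branch (an assumption
    about how it would behave once enabled).
    """
    if not withval:
        return ""
    name = kind + "_" + withval
    if name not in KERNELS[kind]:
        raise ValueError(
            'Invalid kernel "%s" specified for --with-default-%s-kernel' % (name, kind)
        )
    return name


print(help_names("real"))
print(default_kernel("real", "generic_simple_block4"))
```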
m4_foreach_w([elpa_m4_kind],[real complex],[
m4_foreach_w([elpa_m4_kernel],
m4_foreach_w([elpa_m4_cand_kernel],
......@@ -1257,6 +1291,7 @@ AC_CONFIG_FILES([
Doxyfile
${PKG_CONFIG_FILE}:elpa.pc.in
elpa/elpa_constants.h
elpa/elpa_version.h
])
m4_include([m4/ax_fc_check_define.m4])
......@@ -1404,12 +1439,12 @@ echo "* off). With the 2019.11.001 release it will be abolished! *"
echo "***********************************************************************"
echo " "
echo " "
echo "***********************************************************************"
echo "* This is a the first release candidate of ELPA 2018.11.001.rc1 *"
echo "* There might be still some changes until the final release of *"
echo "* ELPA 2018.11.001 *"
echo "***********************************************************************"
echo " "
#echo "***********************************************************************"
#echo "* This is a the first release candidate of ELPA 2018.11.001.rc1 *"
#echo "* There might be still some changes until the final release of *"
#echo "* ELPA 2018.11.001 *"
#echo "***********************************************************************"
#echo " "
if test x"$enable_kcomputer" = x"yes" ; then
echo " "
......
......@@ -19,7 +19,7 @@
%define with_openmp 0
Name: elpa
Version: 2018.11.001.rc1
Version: 2018.11.001
Release: 1
Summary: A massively parallel eigenvector solver
License: LGPL-3.0
......
......@@ -4,6 +4,8 @@
#include <limits.h>
#include <complex.h>
#include <elpa/elpa_version.h>
struct elpa_struct;
typedef struct elpa_struct *elpa_t;
......
......@@ -46,7 +46,8 @@ enum ELPA_SOLVERS {
X(ELPA_2STAGE_REAL_SPARC64_BLOCK6, 21, @ELPA_2STAGE_REAL_SPARC64_BLOCK6_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_VSX_BLOCK2, 22, @ELPA_2STAGE_REAL_VSX_BLOCK2_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_VSX_BLOCK4, 23, @ELPA_2STAGE_REAL_VSX_BLOCK4_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_VSX_BLOCK6, 24, @ELPA_2STAGE_REAL_VSX_BLOCK6_COMPILED@, __VA_ARGS__)
X(ELPA_2STAGE_REAL_VSX_BLOCK6, 24, @ELPA_2STAGE_REAL_VSX_BLOCK6_COMPILED@, __VA_ARGS__) \
X(ELPA_2STAGE_REAL_GENERIC_SIMPLE_BLOCK4, 25, @ELPA_2STAGE_REAL_GENERIC_SIMPLE_BLOCK4_COMPILED@, __VA_ARGS__)
#define ELPA_FOR_ALL_2STAGE_REAL_KERNELS_AND_DEFAULT(X) \
ELPA_FOR_ALL_2STAGE_REAL_KERNELS(X) \
......
#define ELPA_API_VERSION @CURRENT_API_VERSION@
#define ELPA_AUTOTUNE_API_VERSION @CURRENT_AUTOTUNE_VERSION@
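These two lines are presumably the template from which configure generates `elpa/elpa_version.h`: `AC_DEFINE_SUBST` makes `CURRENT_API_VERSION` and `CURRENT_AUTOTUNE_VERSION` available as substitution variables, and the `@NAME@` placeholders are replaced when the file is written. The sketch below imitates that substitution in Python and then reads the resulting macros back, roughly what an external build script could do to check the installed API version. The function names are hypothetical, and the real substitution is done by `config.status`, not by this code.

```python
import re

# Template text as shown in the diff.
TEMPLATE = (
    "#define ELPA_API_VERSION @CURRENT_API_VERSION@\n"
    "#define ELPA_AUTOTUNE_API_VERSION @CURRENT_AUTOTUNE_VERSION@\n"
)

# Values passed to AC_DEFINE_SUBST in configure.ac.
SUBSTITUTIONS = {
    "CURRENT_API_VERSION": "20181113",
    "CURRENT_AUTOTUNE_VERSION": "20181113",
}


def substitute(template, values):
    """Replace each @NAME@ placeholder that has a known value; keep the rest."""
    return re.sub(
        r"@(\w+)@",
        lambda match: values.get(match.group(1), match.group(0)),
        template,
    )


def parse_defines(header_text):
    """Collect '#define NAME INTEGER' lines into a dict of ints."""
    pattern = re.compile(r"^#define\s+(\w+)\s+(\d+)\s*$", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(header_text)}


header = substitute(TEMPLATE, SUBSTITUTIONS)
versions = parse_defines(header)
print(header)
print(versions)
```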
......@@ -62,15 +62,15 @@ module mod_check_for_gpu
gpuAvailable = .false.
if(cublasHandle .ne. -1) then
if (cublasHandle .ne. -1) then
gpuAvailable = .true.
numberOfDevices = -1
if(myid == 0) then
if (myid == 0) then
print *, "Skipping GPU init, should have already been initialized "
endif
return
else
if(myid == 0) then
if (myid == 0) then
print *, "Initializing the GPU devices"
endif
endif
......
......@@ -138,6 +138,13 @@ program print_available_elpa2_kernels
do i = 0, elpa_option_cardinality(KERNEL_KEY)
kernel = elpa_option_enumerate(KERNEL_KEY, i)
if (elpa_int_value_to_string(KERNEL_KEY, i) .eq. "ELPA_2STAGE_COMPLEX_GPU" .or. &
elpa_int_value_to_string(KERNEL_KEY, i) .eq. "ELPA_2STAGE_REAL_GPU") then
if (e%can_set("use_gpu",1) == ELPA_OK) then
call e%set("use_gpu",1)
endif
endif
if (e%can_set(KERNEL_KEY, kernel) == ELPA_OK) then
print *, " ", elpa_int_value_to_string(KERNEL_KEY, kernel)
endif
......
#if 0
! This file is part of ELPA.
!
! The ELPA library was originally created by the ELPA consortium,
! consisting of the following organizations:
!
! - Max Planck Computing and Data Facility (MPCDF), formerly known as
! Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
! - Bergische Universität Wuppertal, Lehrstuhl für angewandte
! Informatik,
! - Technische Universität München, Lehrstuhl für Informatik mit
! Schwerpunkt Wissenschaftliches Rechnen ,
! - Fritz-Haber-Institut, Berlin, Abt. Theorie,
! - Max-Planck-Institut für Mathematik in den Naturwissenschaften,
! Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition,
! and
! - IBM Deutschland GmbH
!
!
! More information can be found here:
! http://elpa.mpcdf.mpg.de/
!
! ELPA is free software: you can redistribute it and/or modify
! it under the terms of the version 3 of the license of the
! GNU Lesser General Public License as published by the Free
! Software Foundation.
!
! ELPA is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! GNU Lesser General Public License for more details.
!
! You should have received a copy of the GNU Lesser General Public License
! along with ELPA. If not, see <http://www.gnu.org/licenses/>
!
! ELPA reflects a substantial effort on the part of the original
! ELPA consortium, and we ask you to respect the spirit of the
! license that we chose: i.e., please contribute any changes you
! may have back to the original ELPA library distribution, and keep
! any derivatives of ELPA under the same license that we chose for
! the original distribution, the GNU Lesser General Public License.
!
!
! --------------------------------------------------------------------------------------------------
!
! This file contains the compute intensive kernels for the Householder transformations.
!
! This is the small and simple version (no hand unrolling of loops etc.) but for some
! compilers this performs better than a sophisticated version with transformed and unrolled loops.
!
! It should be compiled with the highest possible optimization level.
!
! Copyright of the original code rests with the authors inside the ELPA
! consortium. The copyright of any additional modifications shall rest
! with their original authors, but shall adhere to the licensing terms
! distributed along with the original code in the file "COPYING".
!
! --------------------------------------------------------------------------------------------------
#endif
#include "config-f90.h"
!#ifndef USE_ASSUMED_SIZE
!module real_generic_simple_block4_kernel
!
! private
! public quad_hh_trafo_real_generic_simple_4hv_double
!
!#ifdef WANT_SINGLE_PRECISION_REAL
! public quad_hh_trafo_real_generic_simple_4hv_single
!#endif
!
! contains
!#endif
#define REALCASE 1
#define DOUBLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block4_template.F90"
#undef REALCASE
#undef DOUBLE_PRECISION
#ifdef WANT_SINGLE_PRECISION_REAL
#define REALCASE 1
#define SINGLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block4_template.F90"
#undef REALCASE
#undef SINGLE_PRECISION
#endif
!#ifndef USE_ASSUMED_SIZE
!end module real_generic_simple_block4_kernel
!#endif
! --------------------------------------------------------------------------------------------------
#if 0
! This file is part of ELPA.
!
! The ELPA library was originally created by the ELPA consortium,
! consisting of the following organizations:
!
! - Max Planck Computing and Data Facility (MPCDF), formerly known as
! Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
! - Bergische Universität Wuppertal, Lehrstuhl für angewandte
! Informatik,
! - Technische Universität München, Lehrstuhl für Informatik mit
! Schwerpunkt Wissenschaftliches Rechnen ,
! - Fritz-Haber-Institut, Berlin, Abt. Theorie,
! - Max-Planck-Institut für Mathematik in den Naturwissenschaften,
! Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition,
! and
! - IBM Deutschland GmbH
!
!
! More information can be found here:
! http://elpa.mpcdf.mpg.de/
!
! ELPA is free software: you can redistribute it and/or modify
! it under the terms of the version 3 of the license of the
! GNU Lesser General Public License as published by the Free
! Software Foundation.
!
! ELPA is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! GNU Lesser General Public License for more details.
!
! You should have received a copy of the GNU Lesser General Public License
! along with ELPA. If not, see <http://www.gnu.org/licenses/>
!
! ELPA reflects a substantial effort on the part of the original
! ELPA consortium, and we ask you to respect the spirit of the
! license that we chose: i.e., please contribute any changes you
! may have back to the original ELPA library distribution, and keep
! any derivatives of ELPA under the same license that we chose for
! the original distribution, the GNU Lesser General Public License.
!
!
! --------------------------------------------------------------------------------------------------
!
! This file contains the compute intensive kernels for the Householder transformations.
!
! This is the small and simple version (no hand unrolling of loops etc.) but for some
! compilers this performs better than a sophisticated version with transformed and unrolled loops.
!
! It should be compiled with the highest possible optimization level.
!
! Copyright of the original code rests with the authors inside the ELPA
! consortium. The copyright of any additional modifications shall rest
! with their original authors, but shall adhere to the licensing terms
! distributed along with the original code in the file "COPYING".
!
! Author: A. Marek, MPCDF
! --------------------------------------------------------------------------------------------------
#endif
#include "config-f90.h"
!#ifndef USE_ASSUMED_SIZE
!module real_generic_simple_block6_kernel
!
! private
! public hexa_hh_trafo_real_generic_simple_6hv_double
!
!#ifdef WANT_SINGLE_PRECISION_REAL
! public hexa_hh_trafo_real_generic_simple_6hv_single
!#endif
!
! contains
!#endif
#define REALCASE 1
#define DOUBLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block6_template.F90"
#undef REALCASE
#undef DOUBLE_PRECISION
#ifdef WANT_SINGLE_PRECISION_REAL
#define REALCASE 1
#define SINGLE_PRECISION 1
#include "../../general/precision_macros.h"
#include "simple_block6_template.F90"
#undef REALCASE
#undef SINGLE_PRECISION
#endif
!#ifndef USE_ASSUMED_SIZE
!end module real_generic_simple_block6_kernel
!#endif
! --------------------------------------------------------------------------------------------------
#if 0
! This file is part of ELPA.
!
! The ELPA library was originally created by the ELPA consortium,
! consisting of the following organizations:
!
! - Max Planck Computing and Data Facility (MPCDF), formerly known as
! Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
! - Bergische Universität Wuppertal, Lehrstuhl für angewandte
! Informatik,
! - Technische Universität München, Lehrstuhl für Informatik mit
! Schwerpunkt Wissenschaftliches Rechnen ,
! - Fritz-Haber-Institut, Berlin, Abt. Theorie,
! - Max-Planck-Institut für Mathematik in den Naturwissenschaften,
! Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition,
! and
! - IBM Deutschland GmbH
!
!
! More information can be found here:
! http://elpa.mpcdf.mpg.de/
!
! ELPA is free software: you can redistribute it and/or modify
! it under the terms of the version 3 of the license of the
! GNU Lesser General Public License as published by the Free
! Software Foundation.
!
! ELPA is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! GNU Lesser General Public License for more details.
!
! You should have received a copy of the GNU Lesser General Public License
! along with ELPA. If not, see <http://www.gnu.org/licenses/>
!
! ELPA reflects a substantial effort on the part of the original
! ELPA consortium, and we ask you to respect the spirit of the
! license that we chose: i.e., please contribute any changes you
! may have back to the original ELPA library distribution, and keep
! any derivatives of ELPA under the same license that we chose for
! the original distribution, the GNU Lesser General Public License.
!
!
! --------------------------------------------------------------------------------------------------
!
! This file contains the compute intensive kernels for the Householder transformations.
!
! This is the small and simple version (no hand unrolling of loops etc.) but for some
! compilers this performs better than a sophisticated version with transformed and unrolled loops.
!
! It should be compiled with the highest possible optimization level.
!
! Copyright of the original code rests with the authors inside the ELPA
! consortium. The copyright of any additional modifications shall rest
! with their original authors, but shall adhere to the licensing terms
! distributed along with the original code in the file "COPYING".
!
! --------------------------------------------------------------------------------------------------
#endif
subroutine quad_hh_trafo_&
&MATH_DATATYPE&
&_generic_simple_4hv_&
&PRECISION&
& (q, hh, nb, nq, ldq, ldh)
use precision
use elpa_abstract_impl
implicit none
!class(elpa_abstract_impl_t), intent(inout) :: obj
integer(kind=ik), intent(in) :: nb, nq, ldq, ldh
#if REALCASE==1
#ifdef USE_ASSUMED_SIZE
real(kind=C_DATATYPE_KIND), intent(inout) :: q(ldq,*)
real(kind=C_DATATYPE_KIND), intent(in) :: hh(ldh,*)
#else
real(kind=C_DATATYPE_KIND), intent(inout) :: q(1:ldq,1:nb+3)
real(kind=C_DATATYPE_KIND), intent(in) :: hh(1:ldh,1:6)
#endif
real(kind=C_DATATYPE_KIND) :: s_1_2, s_1_3, s_2_3, s_1_4, s_2_4, s_3_4
real(kind=C_DATATYPE_KIND) :: vs_1_2, vs_1_3, vs_2_3, vs_1_4, vs_2_4, vs_3_4
real(kind=C_DATATYPE_KIND) :: h_2_1, h_3_2, h_3_1, h_4_3, h_4_2, h_4_1
real(kind=C_DATATYPE_KIND) :: a_1_1(nq), a_2_1(nq), a_3_1(nq), a_4_1(nq)
real(kind=C_DATATYPE_KIND) :: h1, h2, h3, h4
real(kind=C_DATATYPE_KIND) :: w(nq), z(nq), x(nq), y(nq)
real(kind=C_DATATYPE_KIND) :: tau1, tau2, tau3, tau4
#endif /* REALCASE==1 */
#if COMPLEXCASE==1
#ifdef USE_ASSUMED_SIZE
complex(kind=C_DATATYPE_KIND), intent(inout) :: q(ldq,*)
complex(kind=C_DATATYPE_KIND), intent(in) :: hh(ldh,*)
#else
complex(kind=C_DATATYPE_KIND), intent(inout) :: q(1:ldq,1:nb+3)
complex(kind=C_DATATYPE_KIND), intent(in) :: hh(1:ldh,1:6)
#endif
complex(kind=C_DATATYPE_KIND) :: s_1_2, s_1_3, s_2_3, s_1_4, s_2_4, s_3_4
complex(kind=C_DATATYPE_KIND) :: vs_1_2, vs_1_3, vs_2_3, vs_1_4, vs_2_4, vs_3_4
complex(kind=C_DATATYPE_KIND) :: h_2_1, h_3_2, h_3_1, h_4_3, h_4_2, h_4_1
complex(kind=C_DATATYPE_KIND) :: a_1_1(nq), a_2_1(nq), a_3_1(nq), a_4_1(nq)
complex(kind=C_DATATYPE_KIND) :: w(nq), z(nq), x(nq), y(nq)
complex(kind=C_DATATYPE_KIND) :: h1, h2, h3, h4
complex(kind=C_DATATYPE_KIND) :: tau1, tau2, tau3, tau4
#endif /* COMPLEXCASE==1 */
integer(kind=ik) :: i
! Calculate the pairwise dot products of the four Householder vectors
#if REALCASE==1
s_1_2 = hh(2,2)
s_1_3 = hh(3,3)
s_2_3 = hh(2,3)
s_1_4 = hh(4,4)
s_2_4 = hh(3,4)
s_3_4 = hh(2,4)
s_1_2 = s_1_2 + hh(2,1) * hh(3,2)
s_2_3 = s_2_3 + hh(2,2) * hh(3,3)
s_3_4 = s_3_4 + hh(2,3) * hh(3,4)
s_1_2 = s_1_2 + hh(3,1) * hh(4,2)
s_2_3 = s_2_3 + hh(3,2) * hh(4,3)
s_3_4 = s_3_4 + hh(3,3) * hh(4,4)
s_1_3 = s_1_3 + hh(2,1) * hh(4,3)
s_2_4 = s_2_4 + hh(2,2) * hh(4,4)
!DIR$ IVDEP
do i=5,nb
s_1_2 = s_1_2 + hh(i-1,1) * hh(i,2)
s_2_3 = s_2_3 + hh(i-1,2) * hh(i,3)
s_3_4 = s_3_4 + hh(i-1,3) * hh(i,4)
s_1_3 = s_1_3 + hh(i-2,1) * hh(i,3)
s_2_4 = s_2_4 + hh(i-2,2) * hh(i,4)
s_1_4 = s_1_4 + hh(i-3,1) * hh(i,4)
enddo
#endif
#if COMPLEXCASE==1
stop
!s = conjg(hh(2,2))*1.0
!do i=3,nb
! s = s+(conjg(hh(i,2))*hh(i-1,1))
!enddo
#endif
! Do the Householder transformations
a_1_1(1:nq) = q(1:nq,4)
a_2_1(1:nq) = q(1:nq,3)
a_3_1(1:nq) = q(1:nq,2)
a_4_1(1:nq) = q(1:nq,1)
h_2_1 = hh(2,2)
h_3_2 = hh(2,3)
h_3_1 = hh(3,3)
h_4_3 = hh(2,4)
h_4_2 = hh(3,4)
h_4_1 = hh(4,4)
#if REALCASE == 1
w(1:nq) = a_3_1(1:nq) * h_4_3 + a_4_1(1:nq)
w(1:nq) = a_2_1(1:nq) * h_4_2 + w(1:nq)
w(1:nq) = a_1_1(1:nq) * h_4_1 + w(1:nq)
z(1:nq) = a_2_1(1:nq) * h_3_2 + a_3_1(1:nq)
z(1:nq) = a_1_1(1:nq) * h_3_1 + z(1:nq)
y(1:nq) = a_1_1(1:nq) * h_2_1 + a_2_1(1:nq)
x(1:nq) = a_1_1(1:nq)
#endif
#if COMPLEXCASE==1
stop
!y(1:nq) = q(1:nq,1) + q(1:nq,2)*conjg(hh(2,2))
#endif
do i=5,nb
#if REALCASE == 1
h1 = hh(i-3,1)
h2 = hh(i-2,2)
h3 = hh(i-1,3)
h4 = hh(i ,4)
#endif
#if COMPLEXCASE==1
stop
! h1 = conjg(hh(i-1,1))
! h2 = conjg(hh(i,2))
#endif
x(1:nq) = x(1:nq) + q(1:nq,i) * h1
y(1:nq) = y(1:nq) + q(1:nq,i) * h2
z(1:nq) = z(1:nq) + q(1:nq,i) * h3
w(1:nq) = w(1:nq) + q(1:nq,i) * h4
enddo
h1 = hh(nb-2,1)
h2 = hh(nb-1,2)
h3 = hh(nb ,3)
#if REALCASE==1
x(1:nq) = x(1:nq) + q(1:nq,nb+1) * h1
y(1:nq) = y(1:nq) + q(1:nq,nb+1) * h2
z(1:nq) = z(1:nq) + q(1:nq,nb+1) * h3
#endif
#if COMPLEXCASE==1
stop
!x(1:nq) = x(1:nq) + q(1:nq,nb+1)*conjg(hh(nb,1))
#endif
h1 = hh(nb-1,1)
h2 = hh(nb ,2)
x(1:nq) = x(1:nq) + q(1:nq,nb+2) * h1
y(1:nq) = y(1:nq) + q(1:nq,nb+2) * h2
h1 = hh(nb,1)
x(1:nq) = x(1:nq) + q(1:nq,nb+3) * h1
! Rank-4 update (sum of four rank-1 updates)
tau1 = hh(1,1)
tau2 = hh(1,2)
tau3 = hh(1,3)
tau4 = hh(1,4)
vs_1_2 = s_1_2
vs_1_3 = s_1_3
vs_2_3 = s_2_3
vs_1_4 = s_1_4
vs_2_4 = s_2_4
vs_3_4 = s_3_4
h1 = tau1
x(1:nq) = x(1:nq) * h1
h1 = tau2
h2 = tau2 * vs_1_2
y(1:nq) = y(1:nq) * h1 - x(1:nq) * h2
h1 = tau3
h2 = tau3 * vs_1_3
h3 = tau3 * vs_2_3
z(1:nq) = z(1:nq) * h1 - (y(1:nq) * h3 + x(1:nq) * h2)
h1 = tau4
h2 = tau4 * vs_1_4
h3 = tau4 * vs_2_4
h4 = tau4 * vs_3_4
w(1:nq) = w(1:nq) * h1 - ( z(1:nq) * h4 + y(1:nq) * h3 + x(1:nq) * h2)
q(1:nq,1) = q(1:nq,1) - w(1:nq)
h4 = hh(2,4)
q(1:nq,2) = q(1:nq,2) - (w(1:nq) * h4 + z(1:nq))
h3 = hh(2,3)
h4 = hh(3,4)
q(1:nq,3) = q(1:nq,3) - y(1:nq)
q(1:nq,3) = -( z(1:nq) * h3) + q(1:nq,3)
q(1:nq,3) = -( w(1:nq) * h4) + q(1:nq,3)
h2 = hh(2,2)
h3 = hh(3,3)
h4 = hh(4,4)
q(1:nq,4) = q(1:nq,4) - x(1:nq)
q(1:nq,4) = -(y(1:nq) * h2) + q(1:nq,4)
q(1:nq,4) = -(z(1:nq) * h3) + q(1:nq,4)
q(1:nq,4) = -(w(1:nq) * h4) + q(1:nq,4)
do i=5,nb
h1 = hh(i-3,1)
h2 = hh(i-2,2)
h3 = hh(i-1,3)
h4 = hh(i ,4)
q(1:nq,i) = -(x(1:nq) * h1) + q(1:nq,i)
q(1:nq,i) = -(y(1:nq) * h2) + q(1:nq,i)
q(1:nq,i) = -(z(1:nq) * h3) + q(1:nq,i)
q(1:nq,i) = -(w(1:nq) * h4) + q(1:nq,i)
enddo
h1 = hh(nb-2,1)
h2 = hh(nb-1,2)
h3 = hh(nb ,3)
q(1:nq,nb+1) = -(x(1:nq) * h1) + q(1:nq,nb+1)
q(1:nq,nb+1) = -(y(1:nq) * h2) + q(1:nq,nb+1)
q(1:nq,nb+1) = -(z(1:nq) * h3) + q(1:nq,nb+1)
h1 = hh(nb-1,1)
h2 = hh(nb ,2)
q(1:nq,nb+2) = - (x(1:nq) * h1) + q(1:nq,nb+2)
q(1:nq,nb+2) = - (y(1:nq) * h2) + q(1:nq,nb+2)
h1 = hh(nb,1)
q(1:nq,nb+3) = - (x(1:nq) * h1) + q(1:nq,nb+3)
end subroutine
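Reviewer note: the block4 kernel above fuses four Householder reflectors into one pass over `q`. As a reading aid, the single-reflector building block can be sketched in C as below. This is a minimal sketch, not the ELPA implementation. It assumes the storage convention that the kernel code suggests (tau in the first entry of `hh`, an implicit leading 1 in the reflector vector); the names `apply_single_reflector` and `demo_roundtrip_error` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Apply H = I - tau * v * v^T from the right to an nq x nb column-major
 * block q with leading dimension ldq.
 * Assumed layout: hh[0] holds tau, v[0] is an implicit 1, v[j] = hh[j], j >= 1.
 * x is caller-provided scratch space of length nq. */
static void apply_single_reflector(double *q, size_t ldq, size_t nq, size_t nb,
                                   const double *hh, double *x)
{
    assert(nb >= 1 && ldq >= nq);
    const double tau = hh[0];

    /* x = q * v (first column enters with the implicit weight 1) */
    for (size_t r = 0; r < nq; ++r) x[r] = q[r];
    for (size_t j = 1; j < nb; ++j)
        for (size_t r = 0; r < nq; ++r) x[r] += q[r + j * ldq] * hh[j];

    /* scale by tau, then rank-1 update q := q - x * v^T */
    for (size_t r = 0; r < nq; ++r) x[r] *= tau;
    for (size_t r = 0; r < nq; ++r) q[r] -= x[r];
    for (size_t j = 1; j < nb; ++j)
        for (size_t r = 0; r < nq; ++r) q[r + j * ldq] -= x[r] * hh[j];
}

/* With tau = 2 / (v^T v) the reflector is its own inverse, so applying it
 * twice should restore the input. Returns the largest deviation seen. */
static double demo_roundtrip_error(void)
{
    const double hh[3] = { 2.0 / 9.0, 2.0, 2.0 };   /* v = (1, 2, 2) */
    const double ref[6] = { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 };
    double q[6], x[2], err = 0.0;

    for (size_t i = 0; i < 6; ++i) q[i] = ref[i];
    apply_single_reflector(q, 2, 2, 3, hh, x);
    apply_single_reflector(q, 2, 2, 3, hh, x);
    for (size_t i = 0; i < 6; ++i) {
        double d = q[i] - ref[i];
        if (d < 0.0) d = -d;
        if (d > err) err = d;
    }
    return err;
}
```

The blocked kernel accumulates `x`, `y`, `z`, `w` for four shifted reflectors in one sweep and then corrects them with the cross terms `s_1_2` ... `s_3_4`, which is what lets it touch each column of `q` only once per phase.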
......@@ -56,7 +56,7 @@ module elpa2_utilities
implicit none
public
integer(kind=c_int), parameter :: number_of_real_kernels = ELPA_2STAGE_NUMBER_OF_REAL_KERNELS - 6
integer(kind=c_int), parameter :: number_of_real_kernels = ELPA_2STAGE_NUMBER_OF_REAL_KERNELS - 7
integer(kind=c_int), parameter :: number_of_complex_kernels = ELPA_2STAGE_NUMBER_OF_COMPLEX_KERNELS
#ifdef WITH_REAL_GENERIC_KERNEL
......
// This file is part of ELPA.
//
// The ELPA library was originally created by the ELPA consortium,
// consisting of the following organizations:
//
// - Max Planck Computing and Data Facility (MPCDF), formerly known as
// Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
// - Bergische Universität Wuppertal, Lehrstuhl für angewandte
// Informatik,
// - Technische Universität München, Lehrstuhl für Informatik mit
// Schwerpunkt Wissenschaftliches Rechnen ,
// - Fritz-Haber-Institut, Berlin, Abt. Theorie,
// - Max-Planck-Institut für Mathematik in den Naturwissenschaften,
// Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition,
// and
// - IBM Deutschland GmbH
//
// This particular source code file has been developed within the ELPA-AEO //
// project, which has been a joint effort of
//
// - Max Planck Computing and Data Facility (MPCDF), formerly known as
// Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
// - Bergische Universität Wuppertal, Lehrstuhl für angewandte
// Informatik,
// - Technische Universität München, Lehrstuhl für Informatik mit
// Schwerpunkt Wissenschaftliches Rechnen ,
// - Technische Universität München, Lehrstuhl für Theoretische Chemie,
// - Fritz-Haber-Institut, Berlin, Abt. Theorie,
// More information can be found here:
// http://elpa.mpcdf.mpg.de/ and
// http://elpa-aeo.mpcdf.mpg.de
//
// ELPA is free software: you can redistribute it and/or modify
// it under the terms of the version 3 of the license of the
// GNU Lesser General Public License as published by the Free
// Software Foundation.
//
// ELPA is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with ELPA. If not, see <http://www.gnu.org/licenses/>
//
// ELPA reflects a substantial effort on the part of the original
// ELPA consortium, and we ask you to respect the spirit of the
// license that we chose: i.e., please contribute any changes you
// may have back to the original ELPA library distribution, and keep
// any derivatives of ELPA under the same license that we chose for
// the original distribution, the GNU Lesser General Public License.
//
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
#include "config-f90.h"
#include <stdio.h>
#include <stdlib.h>
......
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
// it seems, that we need those two levels of indirection to correctly expand macros
#define cannons_triang_rectangular_impl_expand2(SUFFIX) cannons_triang_rectangular_##SUFFIX
#define cannons_triang_rectangular_impl_expand1(SUFFIX) cannons_triang_rectangular_impl_expand2(SUFFIX)
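Reviewer note on the comment above: the two levels are needed because the operands of `##` are not macro-expanded before pasting. The following is a small, self-contained sketch of that behaviour; `SUFFIX` being defined as `d` and all `demo_`/`NAME_` names are assumptions made for illustration, not names from ELPA.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a suffix that the build would define. */
#define SUFFIX d

/* One level: ## pastes the literal token SUFFIX without expanding it. */
#define NAME_DIRECT(S) demo_##S

/* Two levels: the outer macro expands its argument first,
 * then the inner macro pastes the already expanded token. */
#define NAME_PASTE(S) demo_##S
#define NAME_EXPAND(S) NAME_PASTE(S)

/* Stringify after expansion so the resulting identifier can be inspected. */
#define STR2(x) #x
#define STR(x) STR2(x)

static const char *direct_name(void)   { return STR(NAME_DIRECT(SUFFIX)); }
static const char *expanded_name(void) { return STR(NAME_EXPAND(SUFFIX)); }
```

Here `direct_name()` yields `"demo_SUFFIX"` while `expanded_name()` yields `"demo_d"`, which is the effect the `_expand1`/`_expand2` pair relies on.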
......
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
// it seems, that we need those two levels of indirection to correctly expand macros
#define cannons_reduction_impl_expand2(SUFFIX) cannons_reduction_##SUFFIX
......
// Author: Valeriy Manin (Bergische Universität Wuppertal)
// integrated into the ELPA library by Pavel Kus, Andreas Marek (MPCDF)
#include <stdio.h>
#include <stdlib.h>
#ifdef WITH_MPI
......
......@@ -498,3 +498,5 @@ int elpa_index_print_autotune_state(elpa_index_t index, int autotune_level, int
*/
int elpa_index_load_autotune_state(elpa_index_t index, int* autotune_level, int* autotune_domain, int* min_loc,
double* min_val, int* current, int* cardinality, char* filename);
int elpa_index_is_printing_mpi_rank(elpa_index_t index);
......@@ -76,6 +76,8 @@ int ftimings_papi_init(void) {
flops_available = 1;
}
ldst_available = 0;
#if 0
/* Loads + Stores */
if ((ret = PAPI_query_event(PAPI_LD_INS)) < 0) {
fprintf(stderr, "ftimings: %s:%d: PAPI_query_event(PAPI_LD_INS): %s\n",
......@@ -96,7 +98,7 @@ int ftimings_papi_init(void) {
} else {
ldst_available = 1;
}
#endif
/* Start */
if ((ret = PAPI_start(event_set)) < 0) {
fprintf(stderr, "ftimings: %s:%d PAPI_start(): %s\n",
......
AC_PREREQ([2.69])
AC_INIT([elpa_test_project],[2018.11.001.rc1], elpa-library@rzg.mpg.de)
elpaversion="2018.11.001.rc1"
AC_INIT([elpa_test_project],[2018.11.001], elpa-library@rzg.mpg.de)
elpaversion="2018.11.001"
AC_CONFIG_SRCDIR([src/test_real.F90])
AM_INIT_AUTOMAKE([foreign -Wall subdir-objects])
......
AC_PREREQ([2.69])
AC_INIT([elpa_test_project],[2018.11.001.rc1], elpa-library@rzg.mpg.de)
elpaversion="2018.11.001.rc1"
AC_INIT([elpa_test_project],[2018.11.001], elpa-library@rzg.mpg.de)
elpaversion="2018.11.001"
AC_CONFIG_SRCDIR([src/test_real.F90])
AM_INIT_AUTOMAKE([foreign -Wall subdir-objects])
......