Commit ff7beab2 authored by Andreas Marek

Prepare release of ELPA_2015.02.001

The QR decomposition is now available as a run-time choice.
Some testing still has to be done.
parent 2d84e981
......@@ -2,19 +2,20 @@ How to install ELPA:
----------------------
First of all, if you do not want to build ELPA yourself, and you run Linux,
it is worth having a look at the ELPA webpage and/or the repositories of
your Linux distribution: there exist pre-built packages for a few
distributions like Fedora, Debian, and openSUSE. More will hopefully follow
in the future.
If you want to build (or have to since no packages are available) ELPA yourself,
please note that ELPA is shipped with a typical "configure" and "make"
procedure. It is recommended to use this way to install ELPA, see (A).
If you do not want to install ELPA as a library, but to include it in your
source code, please refer to point (B). Note that this is not recommended
and no support whatsoever can be given for this approach!
However, an example makefile "Makefile.example" can be found in ./test,
to give some hints on how this is done. Please then distribute all files of ELPA
with your code. Please note that usage of ELPA as described in Section (B)
requires advanced knowledge about compilers, preprocessor flags, and
optimizations. Please also note that we cannot give any official support if
ELPA is used as described in Section (B)!
......@@ -26,26 +27,26 @@ ELPA is used as described in Section (B)!
The configure installation is best done in four steps
1) run configure:
Check the available options with "configure --help".
ELPA is shipped with several different versions of the
elpa2-kernel, each optimized and tuned for a different
architecture.
1.1) Choice of ELPA2 kernels
With the release of ELPA (2014.06 or newer) it is _not_
mandatory anymore to define the (real and complex) kernels
at build time. The configure procedure will build all the
kernels which can be used on the build system. The choice of
the kernels is now a run-time option. This is the most
convenient and also recommended way. It is intended to augment
this with an auto-tuning feature.
Nevertheless, one can still define at build-time _one_
specific kernel (for the real and the complex case each).
Then, ELPA is configured only with this real (and complex)
kernel, and all run-time checking is disabled. Have a look
at the "configure --help" messages and please refer to the
file "./src/elpa2_kernels/README_elpa2_kernels.txt".
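As a rough illustration of the "configure --help" advice above, the following sketch filters the help text for kernel-related options. The helper name list_kernel_options is hypothetical and not part of ELPA, and the filter assumes the relevant help lines contain the word "kernel", which the text above does not guarantee.

```shell
# Hypothetical helper, not part of ELPA: print the kernel-related configure options.
list_kernel_options() {
    if [ -x ./configure ]; then
        # Assumption: the relevant help lines contain the word "kernel".
        ./configure --help | grep -i "kernel"
    else
        # Guard so the sketch is harmless outside the ELPA source tree.
        echo "no ./configure script in $(pwd)"
    fi
}

list_kernel_options
```

Run it from the top of the ELPA source tree; anywhere else it only reports that no configure script was found.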
......@@ -55,8 +56,8 @@ The configure installation is best done in four steps
The configure script tries to auto-detect an installed Blacs/Scalapack
library. If this is successful, you do not have to specify anything
in this regard. However, this will fail if you do not use Netlib
Blacs/Scalapack but vendor-specific implementations (e.g. Intel's MKL
library or the implementation of Cray).
Please then point to your Blacs/Scalapack installation and the
......@@ -79,7 +80,7 @@ The configure installation is best done in four steps
variable "FCFLAGS", "CFLAGS", and "CXXFLAGS", e.g. FCFLAGS="-O3 -xAVX",
please see "./src/elpa2_kernels/README_elpa2_kernels.txt".
Setting the optimization flags for the AVX kernels can be a hassle. If AVX
kernels are built for your system, you can set the configure option
"--with-avx-optimizations=yes". This will automatically set a few compiler
optimization flags which turned out to be beneficial for AVX support.
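The two ways of setting AVX flags described above can be written out as command lines. This is only a sketch: the flag values are the ones quoted in the text, the variable names manual_cmd and auto_cmd are illustrative, and the commands are printed rather than executed so that the snippet can be tried outside the ELPA source tree.

```shell
# Variant 1: set the Fortran optimization flags by hand (value taken from the text above).
manual_cmd='./configure FCFLAGS="-O3 -xAVX"'

# Variant 2: let configure pick the AVX-related optimization flags itself.
auto_cmd='./configure --with-avx-optimizations=yes'

# Only print the commands; run one of them from the ELPA source tree.
echo "manual flags:    $manual_cmd"
echo "automatic flags: $auto_cmd"
```

Which variant works better depends on the compiler; "-xAVX" in particular is a compiler-specific flag and may need replacing for other toolchains.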
......@@ -107,7 +108,7 @@ The configure installation is best done in four steps
3) run "make check"
a simple test of ELPA is done. At the moment the usage of "mpiexec"
is required. If this is not possible on your system, you can run the
binaries "elpa1_test_real", "elpa2_test_real",
"elpa1_test_complex", "elpa2_test_complex",
"elpa2_test_complex_default_kernel", "elpa2_test_complex_choose_kernel_with_api",
"elpa2_test_real_default_kernel", and "elpa2_test_real_choose_kernel_with_api"
......@@ -128,7 +129,7 @@ The configure installation is best done in four steps
B) Installing ELPA without the autotools procedure
===================================================
You can find an example makefile "Makefile.example" in "./test",
to see how you can use ELPA directly in your code, and not as a library.
If you do so, please then distribute all files of ELPA with your code.
However, this is not the recommended way for several reasons:
......@@ -138,21 +139,21 @@ B) Installing ELPA without the autotools procedure
- you still have to choose an elpa2-kernel (see (A)). Getting them
built by hand might be tedious.
- the file elpa2.F90 uses preprocessor defines for the different kernels.
You will have to set these by hand if you do not use the autotools
infrastructure.
- also the test programs now use preprocessor defines, discriminating
between versions with and without OpenMP
- it is entirely possible that, due to the ever-growing complexity of ELPA,
in future releases the build procedure without autotools will not be
supported anymore
Thus, if you really want to use ELPA this way and not with the autotools,
please ensure the following:
- make yourself familiar with the preprocessor flags you will need
for your configuration of ELPA and define them in a file "config-f90.h"
- adapt the Makefile.example according to your needs
Again, it is strongly encouraged to use the autotools build procedure!
How to use ELPA:
......
......@@ -10,7 +10,10 @@ lib_LTLIBRARIES = libelpa@SUFFIX@.la
libelpa@SUFFIX@_la_LINK = $(FCLINK) $(AM_LDFLAGS) -version-info $(ELPA_SO_VERSION) -lstdc++
libelpa@SUFFIX@_la_SOURCES = src/elpa1.F90 src/elpa2.F90
libelpa@SUFFIX@_la_SOURCES += src/elpa_qr/qr_utils.f90 \
src/elpa_qr/elpa_qrkernels.f90 \
src/elpa_qr/elpa_pdlarfb.f90 \
src/elpa_qr/elpa_pdgeqrf.f90
if HAVE_DETAILED_TIMINGS
libelpa@SUFFIX@_la_SOURCES += src/timer.F90 \
src/ftimings/ftimings.F90 \
......@@ -23,13 +26,6 @@ if HAVE_DETAILED_TIMINGS
src/ftimings/papi.c
endif
if WITH_QR
libelpa@SUFFIX@_la_SOURCES += src/elpa_qr/qr_utils.f90 \
src/elpa_qr/elpa_qrkernels.f90 \
src/elpa_qr/elpa_pdlarfb.f90 \
src/elpa_qr/elpa_pdgeqrf.f90
endif
if WITH_REAL_GENERIC_KERNEL
libelpa@SUFFIX@_la_SOURCES += src/elpa2_kernels/elpa2_kernels_real.f90
endif
......@@ -103,6 +99,7 @@ dist_files_DATA = \
test/test_complex_gen.F90 \
test/test_real2.F90 \
test/test_real2_default_kernel.F90 \
test/test_real2_default_kernel_qr_decomposition.F90 \
test/test_real2_choose_kernel_with_api.F90 \
src/print_available_elpa2_kernels.F90 \
test/test_real.F90 \
......@@ -124,6 +121,7 @@ bin_PROGRAMS = \
noinst_PROGRAMS = \
elpa2_test_real_default_kernel@SUFFIX@ \
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@ \
elpa2_test_complex_default_kernel@SUFFIX@ \
elpa2_test_real_choose_kernel_with_api@SUFFIX@ \
elpa2_test_complex_choose_kernel_with_api@SUFFIX@
......@@ -147,6 +145,10 @@ elpa2_test_real_default_kernel@SUFFIX@_SOURCES = test/test_real2_default_kernel.
elpa2_test_real_default_kernel@SUFFIX@_LDADD = $(build_lib)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_SOURCES = test/test_real2_default_kernel_qr_decomposition.F90 test/util.F90 $(redirect_sources)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_LDADD = $(build_lib)
elpa2_test_real_choose_kernel_with_api@SUFFIX@_SOURCES = test/test_real2_choose_kernel_with_api.F90 test/util.F90 $(redirect_sources)
elpa2_test_real_choose_kernel_with_api@SUFFIX@_LDADD = $(build_lib)
......@@ -178,6 +180,7 @@ check_SCRIPTS = \
elpa1_test_complex.sh \
elpa2_test_complex.sh \
elpa2_test_complex_default_kernel.sh \
elpa2_test_complex_default_kernel_qr_decomposition.sh \
elpa2_test_real_choose_kernel_with_api.sh \
elpa2_test_complex_choose_kernel_with_api.sh \
elpa2_print_kernels@SUFFIX@
......@@ -196,6 +199,10 @@ elpa2_test_real_default_kernel.sh:
echo 'mpiexec -n 2 ./elpa2_test_real_default_kernel@SUFFIX@ $$TEST_FLAGS' > elpa2_test_real_default_kernel.sh
chmod +x elpa2_test_real_default_kernel.sh
elpa2_test_real_default_kernel_qr_decomposition.sh:
echo 'mpiexec -n 2 ./elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@ $$TEST_FLAGS' > elpa2_test_real_default_kernel_qr_decomposition.sh
chmod +x elpa2_test_real_default_kernel_qr_decomposition.sh
elpa2_test_real_choose_kernel_with_api.sh:
echo 'mpiexec -n 2 ./elpa2_test_real_choose_kernel_with_api@SUFFIX@ $$TEST_FLAGS' > elpa2_test_real_choose_kernel_with_api.sh
chmod +x elpa2_test_real_choose_kernel_with_api.sh
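The wrapper rules in this hunk all follow one pattern: write a one-line "mpiexec" command into a ".sh" file and make it executable. The sketch below performs the same two steps by hand for the new QR test. It makes two assumptions that are not part of the build: it works in a scratch directory created with mktemp, and it sets SUFFIX to an empty string, whereas in the real build @SUFFIX@ is substituted by configure.

```shell
# Assumption: empty suffix; the real value is substituted by configure.
SUFFIX=""
# Scratch directory so the sketch does not touch a build tree.
workdir=$(mktemp -d)
script="$workdir/elpa2_test_real_default_kernel_qr_decomposition.sh"

# The escaped \$ keeps TEST_FLAGS unexpanded here, so it is read when the wrapper runs
# (the Makefile achieves the same with single quotes and $$TEST_FLAGS).
echo "mpiexec -n 2 ./elpa2_test_real_default_kernel_qr_decomposition${SUFFIX} \$TEST_FLAGS" > "$script"
chmod +x "$script"

# Show the generated wrapper.
cat "$script"
```

Because TEST_FLAGS is expanded only when the wrapper is executed, extra arguments can be supplied through the environment at test time without regenerating the script.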
......@@ -227,6 +234,7 @@ CLEANFILES = \
elpa1_test_complex.sh \
elpa2_test_real.sh \
elpa2_test_real_default_kernel.sh \
elpa2_test_real_default_kernel_qr_decomposition.sh \
elpa2_test_complex.sh \
elpa2_test_complex_default_kernel.sh \
elpa2_test_real_choose_kernel_with_api.sh \
......
......@@ -91,30 +91,26 @@ host_triplet = @host@
@HAVE_DETAILED_TIMINGS_TRUE@ src/ftimings/virtual_memory.c \
@HAVE_DETAILED_TIMINGS_TRUE@ src/ftimings/papi.c
@WITH_QR_TRUE@am__append_2 = src/elpa_qr/qr_utils.f90 \
@WITH_QR_TRUE@ src/elpa_qr/elpa_qrkernels.f90 \
@WITH_QR_TRUE@ src/elpa_qr/elpa_pdlarfb.f90 \
@WITH_QR_TRUE@ src/elpa_qr/elpa_pdgeqrf.f90
@WITH_REAL_GENERIC_KERNEL_TRUE@am__append_3 = src/elpa2_kernels/elpa2_kernels_real.f90
@WITH_COMPLEX_GENERIC_KERNEL_TRUE@am__append_4 = src/elpa2_kernels/elpa2_kernels_complex.f90
@WITH_REAL_GENERIC_SIMPLE_KERNEL_TRUE@am__append_5 = src/elpa2_kernels/elpa2_kernels_real_simple.f90
@WITH_COMPLEX_GENERIC_SIMPLE_KERNEL_TRUE@am__append_6 = src/elpa2_kernels/elpa2_kernels_complex_simple.f90
@WITH_REAL_BGP_KERNEL_TRUE@am__append_7 = src/elpa2_kernels/elpa2_kernels_real_bgp.f90
@WITH_REAL_BGQ_KERNEL_TRUE@am__append_8 = src/elpa2_kernels/elpa2_kernels_real_bgq.f90
@WITH_REAL_SSE_KERNEL_TRUE@am__append_9 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.s
@WITH_COMPLEX_SSE_KERNEL_TRUE@@WITH_REAL_SSE_KERNEL_FALSE@am__append_10 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.s
@WITH_REAL_AVX_BLOCK2_KERNEL_TRUE@am__append_11 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_2hv.c
@WITH_REAL_AVX_BLOCK4_KERNEL_TRUE@am__append_12 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_4hv.c
@WITH_REAL_AVX_BLOCK6_KERNEL_TRUE@am__append_13 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_6hv.c
@WITH_COMPLEX_AVX_BLOCK1_KERNEL_TRUE@am__append_14 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_1hv.cpp
@WITH_COMPLEX_AVX_BLOCK2_KERNEL_TRUE@am__append_15 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_2hv.cpp
@WITH_REAL_GENERIC_KERNEL_TRUE@am__append_2 = src/elpa2_kernels/elpa2_kernels_real.f90
@WITH_COMPLEX_GENERIC_KERNEL_TRUE@am__append_3 = src/elpa2_kernels/elpa2_kernels_complex.f90
@WITH_REAL_GENERIC_SIMPLE_KERNEL_TRUE@am__append_4 = src/elpa2_kernels/elpa2_kernels_real_simple.f90
@WITH_COMPLEX_GENERIC_SIMPLE_KERNEL_TRUE@am__append_5 = src/elpa2_kernels/elpa2_kernels_complex_simple.f90
@WITH_REAL_BGP_KERNEL_TRUE@am__append_6 = src/elpa2_kernels/elpa2_kernels_real_bgp.f90
@WITH_REAL_BGQ_KERNEL_TRUE@am__append_7 = src/elpa2_kernels/elpa2_kernels_real_bgq.f90
@WITH_REAL_SSE_KERNEL_TRUE@am__append_8 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.s
@WITH_COMPLEX_SSE_KERNEL_TRUE@@WITH_REAL_SSE_KERNEL_FALSE@am__append_9 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.s
@WITH_REAL_AVX_BLOCK2_KERNEL_TRUE@am__append_10 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_2hv.c
@WITH_REAL_AVX_BLOCK4_KERNEL_TRUE@am__append_11 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_4hv.c
@WITH_REAL_AVX_BLOCK6_KERNEL_TRUE@am__append_12 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_6hv.c
@WITH_COMPLEX_AVX_BLOCK1_KERNEL_TRUE@am__append_13 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_1hv.cpp
@WITH_COMPLEX_AVX_BLOCK2_KERNEL_TRUE@am__append_14 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_2hv.cpp
bin_PROGRAMS = elpa1_test_real@SUFFIX@$(EXEEXT) \
elpa1_test_complex@SUFFIX@$(EXEEXT) \
elpa2_test_real@SUFFIX@$(EXEEXT) \
elpa2_test_complex@SUFFIX@$(EXEEXT) \
elpa2_print_kernels@SUFFIX@$(EXEEXT)
noinst_PROGRAMS = elpa2_test_real_default_kernel@SUFFIX@$(EXEEXT) \
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@$(EXEEXT) \
elpa2_test_complex_default_kernel@SUFFIX@$(EXEEXT) \
elpa2_test_real_choose_kernel_with_api@SUFFIX@$(EXEEXT) \
elpa2_test_complex_choose_kernel_with_api@SUFFIX@$(EXEEXT)
......@@ -175,14 +171,13 @@ am__installdirs = "$(DESTDIR)$(libdir)" "$(DESTDIR)$(bindir)" \
LTLIBRARIES = $(lib_LTLIBRARIES)
libelpa@SUFFIX@_la_LIBADD =
am__libelpa@SUFFIX@_la_SOURCES_DIST = src/elpa1.F90 src/elpa2.F90 \
src/elpa_qr/qr_utils.f90 src/elpa_qr/elpa_qrkernels.f90 \
src/elpa_qr/elpa_pdlarfb.f90 src/elpa_qr/elpa_pdgeqrf.f90 \
src/timer.F90 src/ftimings/ftimings.F90 \
src/ftimings/ftimings_type.F90 src/ftimings/ftimings_value.F90 \
src/ftimings/highwater_mark.c src/ftimings/resident_set_size.c \
src/ftimings/time.c src/ftimings/virtual_memory.c \
src/ftimings/papi.c src/elpa_qr/qr_utils.f90 \
src/elpa_qr/elpa_qrkernels.f90 src/elpa_qr/elpa_pdlarfb.f90 \
src/elpa_qr/elpa_pdgeqrf.f90 \
src/elpa2_kernels/elpa2_kernels_real.f90 \
src/ftimings/papi.c src/elpa2_kernels/elpa2_kernels_real.f90 \
src/elpa2_kernels/elpa2_kernels_complex.f90 \
src/elpa2_kernels/elpa2_kernels_real_simple.f90 \
src/elpa2_kernels/elpa2_kernels_complex_simple.f90 \
......@@ -204,37 +199,35 @@ am__dirstamp = $(am__leading_dot)dirstamp
@HAVE_DETAILED_TIMINGS_TRUE@ src/ftimings/time.lo \
@HAVE_DETAILED_TIMINGS_TRUE@ src/ftimings/virtual_memory.lo \
@HAVE_DETAILED_TIMINGS_TRUE@ src/ftimings/papi.lo
@WITH_QR_TRUE@am__objects_2 = src/elpa_qr/qr_utils.lo \
@WITH_QR_TRUE@ src/elpa_qr/elpa_qrkernels.lo \
@WITH_QR_TRUE@ src/elpa_qr/elpa_pdlarfb.lo \
@WITH_QR_TRUE@ src/elpa_qr/elpa_pdgeqrf.lo
@WITH_REAL_GENERIC_KERNEL_TRUE@am__objects_3 = src/elpa2_kernels/elpa2_kernels_real.lo
@WITH_COMPLEX_GENERIC_KERNEL_TRUE@am__objects_4 = src/elpa2_kernels/elpa2_kernels_complex.lo
@WITH_REAL_GENERIC_SIMPLE_KERNEL_TRUE@am__objects_5 = src/elpa2_kernels/elpa2_kernels_real_simple.lo
@WITH_COMPLEX_GENERIC_SIMPLE_KERNEL_TRUE@am__objects_6 = src/elpa2_kernels/elpa2_kernels_complex_simple.lo
@WITH_REAL_BGP_KERNEL_TRUE@am__objects_7 = src/elpa2_kernels/elpa2_kernels_real_bgp.lo
@WITH_REAL_BGQ_KERNEL_TRUE@am__objects_8 = src/elpa2_kernels/elpa2_kernels_real_bgq.lo
@WITH_REAL_SSE_KERNEL_TRUE@am__objects_9 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.lo
@WITH_COMPLEX_SSE_KERNEL_TRUE@@WITH_REAL_SSE_KERNEL_FALSE@am__objects_10 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.lo
@WITH_REAL_AVX_BLOCK2_KERNEL_TRUE@am__objects_11 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_2hv.lo
@WITH_REAL_AVX_BLOCK4_KERNEL_TRUE@am__objects_12 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_4hv.lo
@WITH_REAL_AVX_BLOCK6_KERNEL_TRUE@am__objects_13 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_6hv.lo
@WITH_COMPLEX_AVX_BLOCK1_KERNEL_TRUE@am__objects_14 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_1hv.lo
@WITH_COMPLEX_AVX_BLOCK2_KERNEL_TRUE@am__objects_15 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_2hv.lo
@WITH_REAL_GENERIC_KERNEL_TRUE@am__objects_2 = src/elpa2_kernels/elpa2_kernels_real.lo
@WITH_COMPLEX_GENERIC_KERNEL_TRUE@am__objects_3 = src/elpa2_kernels/elpa2_kernels_complex.lo
@WITH_REAL_GENERIC_SIMPLE_KERNEL_TRUE@am__objects_4 = src/elpa2_kernels/elpa2_kernels_real_simple.lo
@WITH_COMPLEX_GENERIC_SIMPLE_KERNEL_TRUE@am__objects_5 = src/elpa2_kernels/elpa2_kernels_complex_simple.lo
@WITH_REAL_BGP_KERNEL_TRUE@am__objects_6 = src/elpa2_kernels/elpa2_kernels_real_bgp.lo
@WITH_REAL_BGQ_KERNEL_TRUE@am__objects_7 = src/elpa2_kernels/elpa2_kernels_real_bgq.lo
@WITH_REAL_SSE_KERNEL_TRUE@am__objects_8 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.lo
@WITH_COMPLEX_SSE_KERNEL_TRUE@@WITH_REAL_SSE_KERNEL_FALSE@am__objects_9 = src/elpa2_kernels/elpa2_kernels_asm_x86_64.lo
@WITH_REAL_AVX_BLOCK2_KERNEL_TRUE@am__objects_10 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_2hv.lo
@WITH_REAL_AVX_BLOCK4_KERNEL_TRUE@am__objects_11 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_4hv.lo
@WITH_REAL_AVX_BLOCK6_KERNEL_TRUE@am__objects_12 = src/elpa2_kernels/elpa2_kernels_real_sse-avx_6hv.lo
@WITH_COMPLEX_AVX_BLOCK1_KERNEL_TRUE@am__objects_13 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_1hv.lo
@WITH_COMPLEX_AVX_BLOCK2_KERNEL_TRUE@am__objects_14 = src/elpa2_kernels/elpa2_kernels_complex_sse-avx_2hv.lo
am_libelpa@SUFFIX@_la_OBJECTS = src/elpa1.lo src/elpa2.lo \
src/elpa_qr/qr_utils.lo src/elpa_qr/elpa_qrkernels.lo \
src/elpa_qr/elpa_pdlarfb.lo src/elpa_qr/elpa_pdgeqrf.lo \
$(am__objects_1) $(am__objects_2) $(am__objects_3) \
$(am__objects_4) $(am__objects_5) $(am__objects_6) \
$(am__objects_7) $(am__objects_8) $(am__objects_9) \
$(am__objects_10) $(am__objects_11) $(am__objects_12) \
$(am__objects_13) $(am__objects_14) $(am__objects_15)
$(am__objects_13) $(am__objects_14)
libelpa@SUFFIX@_la_OBJECTS = $(am_libelpa@SUFFIX@_la_OBJECTS)
PROGRAMS = $(bin_PROGRAMS) $(noinst_PROGRAMS)
am__elpa1_test_complex@SUFFIX@_SOURCES_DIST = test/test_complex.F90 \
test/util.F90 test/redir.c test/redirect.F90
@HAVE_REDIRECT_TRUE@am__objects_16 = test/redir.$(OBJEXT) \
@HAVE_REDIRECT_TRUE@am__objects_15 = test/redir.$(OBJEXT) \
@HAVE_REDIRECT_TRUE@ test/redirect.$(OBJEXT)
am_elpa1_test_complex@SUFFIX@_OBJECTS = test/test_complex.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa1_test_complex@SUFFIX@_OBJECTS = \
$(am_elpa1_test_complex@SUFFIX@_OBJECTS)
elpa1_test_complex@SUFFIX@_DEPENDENCIES = $(build_lib)
......@@ -245,7 +238,7 @@ am__v_lt_1 =
am__elpa1_test_real@SUFFIX@_SOURCES_DIST = test/test_real.F90 \
test/util.F90 test/redir.c test/redirect.F90
am_elpa1_test_real@SUFFIX@_OBJECTS = test/test_real.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa1_test_real@SUFFIX@_OBJECTS = \
$(am_elpa1_test_real@SUFFIX@_OBJECTS)
elpa1_test_real@SUFFIX@_DEPENDENCIES = $(build_lib)
......@@ -254,14 +247,14 @@ am__elpa2_print_kernels@SUFFIX@_SOURCES_DIST = \
test/redir.c test/redirect.F90
am_elpa2_print_kernels@SUFFIX@_OBJECTS = \
src/print_available_elpa2_kernels.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa2_print_kernels@SUFFIX@_OBJECTS = \
$(am_elpa2_print_kernels@SUFFIX@_OBJECTS)
elpa2_print_kernels@SUFFIX@_DEPENDENCIES = $(build_lib)
am__elpa2_test_complex@SUFFIX@_SOURCES_DIST = test/test_complex2.F90 \
test/util.F90 test/redir.c test/redirect.F90
am_elpa2_test_complex@SUFFIX@_OBJECTS = test/test_complex2.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa2_test_complex@SUFFIX@_OBJECTS = \
$(am_elpa2_test_complex@SUFFIX@_OBJECTS)
elpa2_test_complex@SUFFIX@_DEPENDENCIES = $(build_lib)
......@@ -270,7 +263,7 @@ am__elpa2_test_complex_choose_kernel_with_api@SUFFIX@_SOURCES_DIST = \
test/redir.c test/redirect.F90
am_elpa2_test_complex_choose_kernel_with_api@SUFFIX@_OBJECTS = \
test/test_complex2_choose_kernel_with_api.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa2_test_complex_choose_kernel_with_api@SUFFIX@_OBJECTS = $(am_elpa2_test_complex_choose_kernel_with_api@SUFFIX@_OBJECTS)
elpa2_test_complex_choose_kernel_with_api@SUFFIX@_DEPENDENCIES = \
$(build_lib)
......@@ -279,14 +272,14 @@ am__elpa2_test_complex_default_kernel@SUFFIX@_SOURCES_DIST = \
test/redir.c test/redirect.F90
am_elpa2_test_complex_default_kernel@SUFFIX@_OBJECTS = \
test/test_complex2_default_kernel.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa2_test_complex_default_kernel@SUFFIX@_OBJECTS = \
$(am_elpa2_test_complex_default_kernel@SUFFIX@_OBJECTS)
elpa2_test_complex_default_kernel@SUFFIX@_DEPENDENCIES = $(build_lib)
am__elpa2_test_real@SUFFIX@_SOURCES_DIST = test/test_real2.F90 \
test/util.F90 test/redir.c test/redirect.F90
am_elpa2_test_real@SUFFIX@_OBJECTS = test/test_real2.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa2_test_real@SUFFIX@_OBJECTS = \
$(am_elpa2_test_real@SUFFIX@_OBJECTS)
elpa2_test_real@SUFFIX@_DEPENDENCIES = $(build_lib)
......@@ -295,7 +288,7 @@ am__elpa2_test_real_choose_kernel_with_api@SUFFIX@_SOURCES_DIST = \
test/redir.c test/redirect.F90
am_elpa2_test_real_choose_kernel_with_api@SUFFIX@_OBJECTS = \
test/test_real2_choose_kernel_with_api.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_16)
test/util.$(OBJEXT) $(am__objects_15)
elpa2_test_real_choose_kernel_with_api@SUFFIX@_OBJECTS = \
$(am_elpa2_test_real_choose_kernel_with_api@SUFFIX@_OBJECTS)
elpa2_test_real_choose_kernel_with_api@SUFFIX@_DEPENDENCIES = \
......@@ -305,10 +298,19 @@ am__elpa2_test_real_default_kernel@SUFFIX@_SOURCES_DIST = \
test/redirect.F90
am_elpa2_test_real_default_kernel@SUFFIX@_OBJECTS = \
test/test_real2_default_kernel.$(OBJEXT) test/util.$(OBJEXT) \
$(am__objects_16)
$(am__objects_15)
elpa2_test_real_default_kernel@SUFFIX@_OBJECTS = \
$(am_elpa2_test_real_default_kernel@SUFFIX@_OBJECTS)
elpa2_test_real_default_kernel@SUFFIX@_DEPENDENCIES = $(build_lib)
am__elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_SOURCES_DIST = \
test/test_real2_default_kernel_qr_decomposition.F90 \
test/util.F90 test/redir.c test/redirect.F90
am_elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_OBJECTS = \
test/test_real2_default_kernel_qr_decomposition.$(OBJEXT) \
test/util.$(OBJEXT) $(am__objects_15)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_OBJECTS = $(am_elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_OBJECTS)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_DEPENDENCIES = \
$(build_lib)
AM_V_P = $(am__v_P_@AM_V@)
am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
am__v_P_0 = false
......@@ -403,7 +405,8 @@ SOURCES = $(libelpa@SUFFIX@_la_SOURCES) \
$(elpa2_test_complex_default_kernel@SUFFIX@_SOURCES) \
$(elpa2_test_real@SUFFIX@_SOURCES) \
$(elpa2_test_real_choose_kernel_with_api@SUFFIX@_SOURCES) \
$(elpa2_test_real_default_kernel@SUFFIX@_SOURCES)
$(elpa2_test_real_default_kernel@SUFFIX@_SOURCES) \
$(elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_SOURCES)
DIST_SOURCES = $(am__libelpa@SUFFIX@_la_SOURCES_DIST) \
$(am__elpa1_test_complex@SUFFIX@_SOURCES_DIST) \
$(am__elpa1_test_real@SUFFIX@_SOURCES_DIST) \
......@@ -413,7 +416,8 @@ DIST_SOURCES = $(am__libelpa@SUFFIX@_la_SOURCES_DIST) \
$(am__elpa2_test_complex_default_kernel@SUFFIX@_SOURCES_DIST) \
$(am__elpa2_test_real@SUFFIX@_SOURCES_DIST) \
$(am__elpa2_test_real_choose_kernel_with_api@SUFFIX@_SOURCES_DIST) \
$(am__elpa2_test_real_default_kernel@SUFFIX@_SOURCES_DIST)
$(am__elpa2_test_real_default_kernel@SUFFIX@_SOURCES_DIST) \
$(am__elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_SOURCES_DIST)
am__can_run_installinfo = \
case $$AM_UPDATE_INFO_DIR in \
n|no|NO) false;; \
......@@ -602,6 +606,7 @@ RECHECK_LOGS = $(TEST_LOGS)
am__EXEEXT_1 = elpa1_test_real.sh elpa2_test_real.sh \
elpa2_test_real_default_kernel.sh elpa1_test_complex.sh \
elpa2_test_complex.sh elpa2_test_complex_default_kernel.sh \
elpa2_test_complex_default_kernel_qr_decomposition.sh \
elpa2_test_real_choose_kernel_with_api.sh \
elpa2_test_complex_choose_kernel_with_api.sh \
elpa2_print_kernels@SUFFIX@$(EXEEXT)
......@@ -793,11 +798,13 @@ AM_LDFLAGS = $(SCALAPACK_LDFLAGS)
lib_LTLIBRARIES = libelpa@SUFFIX@.la
libelpa@SUFFIX@_la_LINK = $(FCLINK) $(AM_LDFLAGS) -version-info $(ELPA_SO_VERSION) -lstdc++
libelpa@SUFFIX@_la_SOURCES = src/elpa1.F90 src/elpa2.F90 \
src/elpa_qr/qr_utils.f90 src/elpa_qr/elpa_qrkernels.f90 \
src/elpa_qr/elpa_pdlarfb.f90 src/elpa_qr/elpa_pdgeqrf.f90 \
$(am__append_1) $(am__append_2) $(am__append_3) \
$(am__append_4) $(am__append_5) $(am__append_6) \
$(am__append_7) $(am__append_8) $(am__append_9) \
$(am__append_10) $(am__append_11) $(am__append_12) \
$(am__append_13) $(am__append_14) $(am__append_15)
$(am__append_13) $(am__append_14)
#if WITH_AVX_SANDYBRIDGE
# libelpa@SUFFIX@_la_SOURCES += src/elpa2_kernels/elpa2_kernels_real_sse-avx_2hv.c \
......@@ -820,6 +827,7 @@ dist_files_DATA = \
test/test_complex_gen.F90 \
test/test_real2.F90 \
test/test_real2_default_kernel.F90 \
test/test_real2_default_kernel_qr_decomposition.F90 \
test/test_real2_choose_kernel_with_api.F90 \
src/print_available_elpa2_kernels.F90 \
test/test_real.F90 \
......@@ -839,6 +847,8 @@ elpa2_test_real@SUFFIX@_SOURCES = test/test_real2.F90 test/util.F90 $(redirect_s
elpa2_test_real@SUFFIX@_LDADD = $(build_lib)
elpa2_test_real_default_kernel@SUFFIX@_SOURCES = test/test_real2_default_kernel.F90 test/util.F90 $(redirect_sources)
elpa2_test_real_default_kernel@SUFFIX@_LDADD = $(build_lib)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_SOURCES = test/test_real2_default_kernel_qr_decomposition.F90 test/util.F90 $(redirect_sources)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_LDADD = $(build_lib)
elpa2_test_real_choose_kernel_with_api@SUFFIX@_SOURCES = test/test_real2_choose_kernel_with_api.F90 test/util.F90 $(redirect_sources)
elpa2_test_real_choose_kernel_with_api@SUFFIX@_LDADD = $(build_lib)
elpa1_test_complex@SUFFIX@_SOURCES = test/test_complex.F90 test/util.F90 $(redirect_sources)
......@@ -858,6 +868,7 @@ check_SCRIPTS = \
elpa1_test_complex.sh \
elpa2_test_complex.sh \
elpa2_test_complex_default_kernel.sh \
elpa2_test_complex_default_kernel_qr_decomposition.sh \
elpa2_test_real_choose_kernel_with_api.sh \
elpa2_test_complex_choose_kernel_with_api.sh \
elpa2_print_kernels@SUFFIX@
......@@ -867,6 +878,7 @@ CLEANFILES = \
elpa1_test_complex.sh \
elpa2_test_real.sh \
elpa2_test_real_default_kernel.sh \
elpa2_test_real_default_kernel_qr_decomposition.sh \
elpa2_test_complex.sh \
elpa2_test_complex_default_kernel.sh \
elpa2_test_real_choose_kernel_with_api.sh \
......@@ -974,6 +986,20 @@ src/$(DEPDIR)/$(am__dirstamp):
@: > src/$(DEPDIR)/$(am__dirstamp)
src/elpa1.lo: src/$(am__dirstamp) src/$(DEPDIR)/$(am__dirstamp)
src/elpa2.lo: src/$(am__dirstamp) src/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/$(am__dirstamp):
@$(MKDIR_P) src/elpa_qr
@: > src/elpa_qr/$(am__dirstamp)
src/elpa_qr/$(DEPDIR)/$(am__dirstamp):
@$(MKDIR_P) src/elpa_qr/$(DEPDIR)
@: > src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/qr_utils.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/elpa_qrkernels.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/elpa_pdlarfb.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/elpa_pdgeqrf.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/timer.lo: src/$(am__dirstamp) src/$(DEPDIR)/$(am__dirstamp)
src/ftimings/$(am__dirstamp):
@$(MKDIR_P) src/ftimings
......@@ -997,20 +1023,6 @@ src/ftimings/virtual_memory.lo: src/ftimings/$(am__dirstamp) \
src/ftimings/$(DEPDIR)/$(am__dirstamp)
src/ftimings/papi.lo: src/ftimings/$(am__dirstamp) \
src/ftimings/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/$(am__dirstamp):
@$(MKDIR_P) src/elpa_qr
@: > src/elpa_qr/$(am__dirstamp)
src/elpa_qr/$(DEPDIR)/$(am__dirstamp):
@$(MKDIR_P) src/elpa_qr/$(DEPDIR)
@: > src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/qr_utils.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/elpa_qrkernels.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/elpa_pdlarfb.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa_qr/elpa_pdgeqrf.lo: src/elpa_qr/$(am__dirstamp) \
src/elpa_qr/$(DEPDIR)/$(am__dirstamp)
src/elpa2_kernels/$(am__dirstamp):
@$(MKDIR_P) src/elpa2_kernels
@: > src/elpa2_kernels/$(am__dirstamp)
......@@ -1180,6 +1192,12 @@ test/test_real2_default_kernel.$(OBJEXT): test/$(am__dirstamp) \
elpa2_test_real_default_kernel@SUFFIX@$(EXEEXT): $(elpa2_test_real_default_kernel@SUFFIX@_OBJECTS) $(elpa2_test_real_default_kernel@SUFFIX@_DEPENDENCIES) $(EXTRA_elpa2_test_real_default_kernel@SUFFIX@_DEPENDENCIES)
@rm -f elpa2_test_real_default_kernel@SUFFIX@$(EXEEXT)
$(AM_V_FCLD)$(FCLINK) $(elpa2_test_real_default_kernel@SUFFIX@_OBJECTS) $(elpa2_test_real_default_kernel@SUFFIX@_LDADD) $(LIBS)
test/test_real2_default_kernel_qr_decomposition.$(OBJEXT): \
test/$(am__dirstamp) test/$(DEPDIR)/$(am__dirstamp)
elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@$(EXEEXT): $(elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_OBJECTS) $(elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_DEPENDENCIES) $(EXTRA_elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_DEPENDENCIES)
@rm -f elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@$(EXEEXT)
$(AM_V_FCLD)$(FCLINK) $(elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_OBJECTS) $(elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@_LDADD) $(LIBS)
mostlyclean-compile:
-rm -f *.$(OBJEXT)
......@@ -1625,6 +1643,13 @@ elpa2_test_complex_default_kernel.sh.log: elpa2_test_complex_default_kernel.sh
--log-file $$b.log --trs-file $$b.trs \
$(am__common_driver_flags) $(AM_LOG_DRIVER_FLAGS) $(LOG_DRIVER_FLAGS) -- $(LOG_COMPILE) \
"$$tst" $(AM_TESTS_FD_REDIRECT)
elpa2_test_complex_default_kernel_qr_decomposition.sh.log: elpa2_test_complex_default_kernel_qr_decomposition.sh
@p='elpa2_test_complex_default_kernel_qr_decomposition.sh'; \
b='elpa2_test_complex_default_kernel_qr_decomposition.sh'; \
$(am__check_pre) $(LOG_DRIVER) --test-name "$$f" \
--log-file $$b.log --trs-file $$b.trs \
$(am__common_driver_flags) $(AM_LOG_DRIVER_FLAGS) $(LOG_DRIVER_FLAGS) -- $(LOG_COMPILE) \
"$$tst" $(AM_TESTS_FD_REDIRECT)
elpa2_test_real_choose_kernel_with_api.sh.log: elpa2_test_real_choose_kernel_with_api.sh
@p='elpa2_test_real_choose_kernel_with_api.sh'; \
b='elpa2_test_real_choose_kernel_with_api.sh'; \
......@@ -1996,6 +2021,10 @@ elpa2_test_real_default_kernel.sh:
echo 'mpiexec -n 2 ./elpa2_test_real_default_kernel@SUFFIX@ $$TEST_FLAGS' > elpa2_test_real_default_kernel.sh
chmod +x elpa2_test_real_default_kernel.sh
elpa2_test_real_default_kernel_qr_decomposition.sh:
echo 'mpiexec -n 2 ./elpa2_test_real_default_kernel_qr_decomposition@SUFFIX@ $$TEST_FLAGS' > elpa2_test_real_default_kernel_qr_decomposition.sh
chmod +x elpa2_test_real_default_kernel_qr_decomposition.sh
elpa2_test_real_choose_kernel_with_api.sh:
echo 'mpiexec -n 2 ./elpa2_test_real_choose_kernel_with_api@SUFFIX@ $$TEST_FLAGS' > elpa2_test_real_choose_kernel_with_api.sh
chmod +x elpa2_test_real_choose_kernel_with_api.sh
......
......@@ -12,7 +12,7 @@ such improvements under the same exact terms of the (modified) LGPL v3
that we are using here. Please do not simply absorb ELPA into your own
project and then redistribute binary-only without making your exact
version of the ELPA source code (unmodified or MODIFIED) available as
well.
*** Citing:
......@@ -20,15 +20,15 @@ well.
A description of some algorithms present in ELPA can be found in:
T. Auckenthaler, V. Blum, H.-J. Bungartz, T. Huckle, R. Johanni,
L. Kr\"amer, B. Lang, H. Lederer, and P. R. Willems,
"Parallel solution of partial symmetric eigenvalue problems from
electronic structure calculations",
Parallel Computing 37, 783-794 (2011).
doi:10.1016/j.parco.2011.05.002.
Marek, A.; Blum, V.; Johanni, R.; Havu, V.; Lang, B.; Auckenthaler,
T.; Heinecke, A.; Bungartz, H.-J.; Lederer, H.
"The ELPA library: scalable parallel eigenvalue solutions for electronic
structure theory and computational science",
Journal of Physics Condensed Matter, 26 (2014)
doi:10.1088/0953-8984/26/21/213201
......@@ -38,10 +38,10 @@ well.
make appropriate reference to that as well, once it appears.
*** Copyright:
Copyright of the original code rests with the authors inside the ELPA
consortium. The code is distributed under the terms of the GNU Lesser General
Public License version 3 (LGPL).
Please also note the express "NO WARRANTY" disclaimers in the GPL.
......@@ -50,7 +50,7 @@ Please see the file "COPYING" for details, and the files "gpl.txt" and
"lgpl.txt" for further information.
*** Using ELPA:
ELPA is designed to be compiled (Fortran) on its own, to be later
linked to your own application. In order to use ELPA, you must still
......@@ -69,98 +69,24 @@ are usually available from any HPC proprietary compiler vendors.
For example, Intel's ifort compiler contains the "math kernel library"
(MKL), providing BLAS/Lapack/BLACS/Scalapack functionality (except on
Mac OS X, where the BLACS and Scalapack parts must still be obtained
and compiled separately).
A very usable general-purpose MPI library is OpenMPI (ELPA was tested
with OpenMPI 1.4.3 for example). Intel MPI seems to be a very well
performing option on Intel platforms.
Examples of how to use ELPA are included in the accompanying
test_*.f90 subroutines in the "test" directory. An example makefile
"Makefile.example" is also included as a minimal example of how to
build and link ELPA to any other piece of code. In general, however,
we suggest using the build environment to install ELPA
as a library on your system.
*** Structure of this repository:
* README file - this file. Please also consult the ELPA Wiki, and
consider adding any useful information that you may have.
* COPYING directory - the copyright and licensing information for ELPA.
* src directory - contains all the files that are needed for the
actual ELPA subroutines. If you are attempting to use ELPA in your
own application, these are the files which you need.
* test directory
- Contains the Makefile that demonstrates how to compile and link to
the ELPA routines
- All test programs solve an eigenvalue problem and check the correctness
of the result by evaluating || A*x - x*lambda || and checking the
orthogonality of the eigenvectors
elpa1_test_real Real eigenvalue problem, 1 stage solver
test_real_gen Real generalized eigenvalue problem, 1 stage solver
elpa1test_complex Complex eigenvalue problem, 1 stage solver
test_complex_gen Complex generalized eigenvalue problem, 1 stage solver
elpa2_test_real Real eigenvalue problem, 2 stage solver
elpa2test_complex Complex eigenvalue problem, 2 stage solver
- There are two programs which read matrices from a file, solve the
eigenvalue problem, print the eigenvalues and check the correctness
of the result (all using elpa1 only)
read_real for the real eigenvalue problem
read_real_gen for the real generalized eigenvalue problem
A*x - B*x*lambda = 0
read_real has to be called with 1 command line argument (the file
containing the matrix). The file must be in ASCII (formatted) form.
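The correctness check that the test programs above perform can be
illustrated outside of ELPA. The following is only a minimal sketch in
plain Python, not ELPA code: it uses a hypothetical 2x2 symmetric matrix
whose eigenpairs are known by hand, and the helper names
(residual_norm, orthogonality_error) as well as the tolerance are
assumptions made for illustration. It evaluates the largest
|| A*x - x*lambda || over all eigenpairs and the largest deviation of
X^T * X from the identity matrix.

```python
import math

def matvec(a, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def residual_norm(a, eigenvalues, eigenvectors):
    """Largest Euclidean norm of A*x - x*lambda over all eigenpairs."""
    worst = 0.0
    for lam, x in zip(eigenvalues, eigenvectors):
        ax = matvec(a, x)
        r = math.sqrt(sum((ax_i - lam * x_i) ** 2 for ax_i, x_i in zip(ax, x)))
        worst = max(worst, r)
    return worst

def orthogonality_error(eigenvectors):
    """Largest absolute entry of X^T * X - I."""
    worst = 0.0
    for i, xi in enumerate(eigenvectors):
        for j, xj in enumerate(eigenvectors):
            dot = sum(p * q for p, q in zip(xi, xj))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst

# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenvalues = [1.0, 3.0]
eigenvectors = [[s, -s], [s, s]]  # one normalized eigenvector per entry

TOLERANCE = 1e-12  # assumed threshold, not the one used by the ELPA tests
res = residual_norm(A, eigenvalues, eigenvectors)
orth = orthogonality_error(eigenvectors)
print("residual ok:", res < TOLERANCE)
print("orthogonality ok:", orth < TOLERANCE)
```

A wrong eigenvalue or a non-orthonormal set of vectors makes one of the
two numbers large, which is why both quantities are checked.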