Prepare ELPA 2014.06 release: update of documentation

- the ascii files "INSTALL", "README", and "RELEASE_NOTES" are
  updated
- the Makefile.example in the test directory is updated
parent 5545a83a
How to install ELPA:
----------------------
ELPA is shipped with a typical "configure" and "make" procedure. It is
recommended to use this way to install ELPA, see (A). If you do not want to
install ELPA as library, but to include it in your source code, please refer
to point (B). An example makefile "Makefile.example" can be found in ./test,
First of all, if you do not want to build ELPA yourself, and you run Linux,
it is worth having a look at the ELPA webpage and/or the repositories of
your Linux distribution: there exist pre-built packages for a few
distributions like Fedora, Debian, and openSUSE. More will hopefully follow
in the future.
If you want to build (or have to since no packages are available) ELPA yourself,
please note that ELPA is shipped with a typical "configure" and "make"
procedure. It is recommended to use this way to install ELPA, see (A).
If you do not want to install ELPA as library, but to include it in your
source code, please refer to point (B).
An example makefile "Makefile.example" can be found in ./test,
to give some hints on how this is done. Please then distribute all files of ELPA
with your code. Please note that using ELPA as described in Section (B)
requires advanced knowledge about compilers, preprocessor flags, and
......@@ -43,29 +51,31 @@ The configure installation is best done in four steps
file "./src/elpa2_kernels/README_elpa2_kernels.txt".
1.2 Setting up blacs/scalapack
1.2 Setting up Blacs/Scalapack
Please point to your blacs/scalapack installation and the
linkline with the variables "BLACS_LDFLAGS" and "BLACS_FCFLAGS".
"BLACS_LDFLAGS" should contain the correct linkline for your
blacs/scalapack installation and "BLACS_FCFLAGS" the include path
The configure script tries to auto-detect an installed Blacs/Scalapack
library. If this is successful, you do not have to specify anything
in this regard. However, auto-detection will fail if you do not use Netlib
Blacs/Scalapack but vendor-specific implementations (e.g. Intel's MKL
library or the implementation of Cray).
In that case, please point to your Blacs/Scalapack installation and the
linkline with the variables "SCALAPACK_LDFLAGS" and "SCALAPACK_FCFLAGS".
"SCALAPACK_LDFLAGS" should contain the correct linkline for your
Blacs/Scalapack installation and "SCALAPACK_FCFLAGS" the include path
and any other flags you need at compile time.
It is recommended that you use the "rpath functionality" in the linkline,
otherwise it will be necessary to update the LD_LIBRARY_PATH environment
variable.
You can either specify your own builds of lapack/blacs/scalapack
or use specialized Vendor packages, e.g. if available you can use
Intel's MKL.
The configure procedure will check whether blacs/scalapack is available
at build-time. If you do not set the variables "BLACS_LDFLAGS" and
"BLACS_FCFLAGS" the chances are high that ELPA will not build.
In either case, whether Blacs/Scalapack is auto-detected or specified manually,
the configure procedure will check whether Blacs/Scalapack is available
at build-time and try to link with it.
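As an illustration of the manual case, the two variables might be assembled as
in the following sketch. The install prefix "MKL_HOME", the helper variable
"MKL_LIBDIR", and the list of libraries are assumptions for one possible MKL
setup with Intel MPI; they are not taken from the ELPA documentation and must
be adapted to your system. The script only prints the configure command.

```shell
#!/bin/sh
# Sketch only: MKL_HOME and the library list are assumptions for an MKL
# installation used together with Intel MPI; adjust them to your system.
MKL_HOME="${MKL_HOME:-/opt/intel/mkl}"
MKL_LIBDIR="${MKL_HOME}/lib/intel64"

# Link line: library search path, an rpath entry (so that LD_LIBRARY_PATH
# does not have to be updated), and the Scalapack/Blacs/MKL libraries.
SCALAPACK_LDFLAGS="-L${MKL_LIBDIR} -Wl,-rpath,${MKL_LIBDIR} -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm"

# Compile-time flags: include path of the vendor installation.
SCALAPACK_FCFLAGS="-I${MKL_HOME}/include"

# Print the resulting command; run it from the ELPA source directory.
echo "./configure SCALAPACK_LDFLAGS=\"${SCALAPACK_LDFLAGS}\" SCALAPACK_FCFLAGS=\"${SCALAPACK_FCFLAGS}\""
```

The "-Wl,-rpath,..." entry is one way to use the "rpath functionality"
recommended above.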
1.3 Setting optimizations
Please set the optimisation that you would like with the
Please set the optimisation that you prefer with the
variables "FCFLAGS", "CFLAGS", and "CXXFLAGS", e.g. FCFLAGS="-O3 -xAVX",
please see "./src/elpa2_kernels/README_elpa2_kernels.txt".
......@@ -84,7 +94,8 @@ The configure installation is best done in four steps
1.5 Hybrid OpenMP support
If you want to use the hybrid MPI/OpenMP version of ELPA please specify
"--enable-openmp" or "--with-openmp".
"--with-openmp". Note that the ELPA library will then contain a "_mt" in
its name to indicate multi-threading support.
1.6 Other
......@@ -96,15 +107,16 @@ The configure installation is best done in four steps
3) run "make check"
a simple test of ELPA is done. At the moment the usage of "mpiexec"
is required. If this is not possible on your system, you can run the
binaries "test_real", "test_real2", "test_complex", "test_complex2",
"test_complex2_default_kernel", "test_complex2_choose_kernel_with_api",
"test_real2_default_kernel", and "test_real2_choose_kernel_with_api"
binaries "elpa1_test_real", "elpa2_test_real",
"elpa1_test_complex", "elpa2_test_complex",
"elpa2_test_complex_default_kernel", "elpa2_test_complex_choose_kernel_with_api",
"elpa2_test_real_default_kernel", and "elpa2_test_real_choose_kernel_with_api"
yourself. At the moment the tests check whether the residual and the
orthogonality of the found eigenvectors are lower than a threshold of
5e-12. If this test fails, or if you believe the threshold should be
even lower, please talk to us. Furthermore, your run-time choice of
ELPA kernels is tested. This is intended as a help to get used to this
new feature. With the same thought in mind a binary "print_available_elpa2_kernels"
new feature. With the same thought in mind, a binary "elpa2_print_kernels"
is provided, which is rather self-explanatory.
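If "mpiexec" is not available, running the test binaries yourself could be
scripted roughly as follows. This is only a sketch: the variable "LAUNCHER",
its default "srun -n 2", the assumption that the binaries are in the current
directory, and the assumption that a failed test returns a non-zero exit
status are not taken from the ELPA documentation.

```shell
#!/bin/sh
# Sketch only: the launcher and the process count are assumptions; set
# LAUNCHER to whatever parallel launcher your system provides.
LAUNCHER="${LAUNCHER:-srun -n 2}"

# Names of the test binaries as listed in step 3.
TESTS="elpa1_test_real elpa2_test_real elpa1_test_complex elpa2_test_complex
elpa2_test_complex_default_kernel elpa2_test_complex_choose_kernel_with_api
elpa2_test_real_default_kernel elpa2_test_real_choose_kernel_with_api"

failed=0
for t in $TESTS; do
    # Skip binaries that have not been built (or are located elsewhere).
    if [ ! -x "./$t" ]; then
        echo "SKIP $t (no executable found in the current directory)"
        continue
    fi
    # Assumption: a failing test exits with a non-zero status.
    if $LAUNCHER "./$t"; then
        echo "PASS $t"
    else
        echo "FAIL $t"
        failed=$((failed + 1))
    fi
done
echo "$failed test(s) failed"
```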
......
......@@ -26,6 +26,13 @@ well.
Parallel Computing 37, 783-794 (2011).
doi:10.1016/j.parco.2011.05.002.
Marek, A.; Blum, V.; Johanni, R.; Havu, V.; Lang, B.; Auckenthaler,
T.; Heinecke, A.; Bungartz, H.-J.; Lederer, H.
"The ELPA library: scalable parallel eigenvalue solutions for electronic
structure theory and computational science",
Journal of Physics Condensed Matter, 26 (2014)
doi:10.1088/0953-8984/26/21/213201
Please cite this paper when using ELPA. We also intend to publish an
overview description of the ELPA library as such, and ask you to
make appropriate reference to that as well, once it appears.
......@@ -87,51 +94,21 @@ as library to your system.
actual ELPA subroutines. If you are attempting to use ELPA in your
own application, these are the files which you need.
- elpa1.f90 contains routines for the one-stage solver,
The 1 stage solver (elpa1.f90) can be used standalone without elpa2.
- elpa2.f90 - ADDITIONAL routines needed for the two-stage solver
elpa2.f90 requires elpa1.f90 and a version of elpa2_kernels.f90, so
always compile them together.
- elpa2_kernels.f90 - optimized linear algebra kernels for ELPA.
This file is a generic version of optimized linear algebra kernels
for use with the ELPA library. The standard elpa2_kernels.f90 runs
on every platform but it is optimized for the Intel SSE instruction
set. Best performance is achieved with the Intel ifort compiler and
compile flags -O3 -xSSE4.2
For optimum performance on special architectures, you may wish to
investigate whether hand-tuned versions of this file give additional
gains. If so, simply remove elpa2_kernels.f90 from your compilation
and replace with the version of your choice. It would be great if
you could contribute such hand-tuned versions back to the
repository. (LGPL requirement for redistribution holds in any case)
- elpa2_kernels_bg.f90
Example of optimized ELPA kernels for the BlueGene/P
architecture. Use instead of the standard elpa2_kernels.f90
file. elpa2_kernels_bg.f90 contains assembler instructions for the
BlueGene/P processor which IBM's xlf Fortran compiler can handle.
* test directory
- Contains the Makefile that demonstrates how to compile and link to
the ELPA routines
- All files starting with test_... are for demonstrating the use
of the elpa library (but not needed for using it).
- All test programs solve an eigenvalue problem and check the correctness
of the result by evaluating || A*x - x*lambda || and checking the
orthogonality of the eigenvectors
test_real Real eigenvalue problem, 1 stage solver
elpa1_test_real Real eigenvalue problem, 1 stage solver
test_real_gen Real generalized eigenvalue problem, 1 stage solver
test_complex Complex eigenvalue problem, 1 stage solver
elpa1_test_complex Complex eigenvalue problem, 1 stage solver
test_complex_gen Complex generalized eigenvalue problem, 1 stage solver
test_real2 Real eigenvalue problem, 2 stage solver
test_complex2 Complex eigenvalue problem, 2 stage solver
elpa2_test_real Real eigenvalue problem, 2 stage solver
elpa2_test_complex Complex eigenvalue problem, 2 stage solver
- There are two programs which read matrices from a file, solve the
eigenvalue problem, print the eigenvalues and check the correctness
......
This file contains the release notes for the ELPA 2014.06.000 version
This file contains the release notes for the ELPA 2014.06.001 version
What is new?
-------------
a)
With this release (and newer) it is not mandatory anymore to specify the real
and complex kernels at build-time! Instead the choice of kernel is now a
run-time option
......@@ -17,6 +19,24 @@ It is still possible to build ELPA with a specific real and complex kernel, if
one wants to obtain the old behaviour (see configure --help for the exact
options)
b)
At build time, configure now expects variables "SCALAPACK_FCFLAGS" and
"SCALAPACK_LDFLAGS" to be set, which replace the previous "BLACS_FCFLAGS" and
"BLACS_LDFLAGS".
c)
The binaries of the test programs have been renamed: instead of
"test_real" (for ELPA 1) and "test_real2" (for ELPA 2) and so forth, the
binary names are now "elpa1_test_real" and "elpa2_test_real" ...
d)
The name of the installed library has changed: since this release changes the
ABI of ELPA, it is possible to have several versions of ELPA installed with
different ABIs. In order to have a unique identifier, the library will from
now on be called "libelpa_[package_version].so" (for the single-threaded version)
and "libelpa_[package_version]_mt.so" (for the multi-threaded version). In
this release this is "libelpa_2014.06{_mt}.so".
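The naming scheme "libelpa_[package_version]{_mt}.so" can be spelled out with a
small helper. The function name "elpa_libname" and its arguments are
hypothetical and only serve as an illustration; they are not part of ELPA.

```shell
#!/bin/sh
# elpa_libname VERSION [mt]
# Hypothetical helper: prints the shared-library file name that follows
# the "libelpa_[package_version]{_mt}.so" scheme.
elpa_libname() {
    version="$1"
    suffix=""
    # A second argument "mt" selects the multi-threaded (OpenMP) variant.
    if [ "${2:-}" = "mt" ]; then
        suffix="_mt"
    fi
    echo "libelpa_${version}${suffix}.so"
}

elpa_libname 2014.06       # single-threaded build
elpa_libname 2014.06 mt    # build configured with "--with-openmp"
```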
Any incompatibles to previous version?
---------------------------------------
......@@ -24,6 +44,9 @@ Any incompatibles to previous version?
The ABI of ELPA has changed! It will be necessary to rebuild the programs using
ELPA if this new version should be used. Beware that not rebuilding the user
programs will most likely lead to undefined behaviour!
Note also that the library names have changed, in order to reflect the new ABI
(see point d above).
......@@ -41,97 +41,57 @@
# ------------------------------------------------------------------------------
# Please set the variables below according to your system!
# ------------------------------------------------------------------------------
# Settings for Intel Fortran (Linux):
# It is strongly advised not to use this Makefile, but to build ELPA with the
# configure autotools. If you want to use it anyway, here are some hints.
# Settings for Intel Fortran (Linux) with OpenMP and GENERIC_KERNELS:
#
F90=mpif90 -O3 -traceback -g -fpe0
PREPROCESSORFLAGS=-DWITH_OPENMP -DWITH_REAL_GENERIC_KERNEL -DWITH_COMPLEX_GENERIC_KERNEL
F90=mpif90 -O3 -traceback -g -openmp $(PREPROCESSORFLAGS)
F90OPT=$(F90) -xSSE4.2
LIBS = -L/opt/intel/Compiler/11.0/069/mkl/lib/em64t -lmkl_lapack -lmkl -lguide -lpthread \
-lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64
#
# ------------------------------------------------------------------------------
# Settings for Intel Fortran on MacOSX (home-built BLACS and scalapack):
#
#F90=mpif90 -O3 -traceback -g -fpe0
#F90OPT=$(F90) # -xSSE4.2 ### on Mac OSX, the -xSSE4.2 option is possibly buggy in ifort!
#LIBS = -L/opt/intel/mkl/lib -I/opt/intel/mkl/include -lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
# /usr/local/BLACS/LIB/blacs_MPI-OSX-0.a /usr/local/BLACS/LIB/blacsF77init_MPI-OSX-0.a \
# /usr/local/SCALAPACK-1.8.0/libscalapack.a
#
# ------------------------------------------------------------------------------
# Settings for IBM AIX Power6
#
#F90 = mpxlf95_r -q64 -O2 -g -qarch=auto -qtune=auto
#F90OPT = mpxlf95_r -q64 -O4 -g -qarch=auto -qtune=auto
#LIBS = -L/usr/local/lib -lscalapack -llapack-essl -lessl -lblacsF77init -lblacs -lblacsF77init -lblacs -lc
#
# ------------------------------------------------------------------------------
# Settings for IBM BlueGene/P
#
#F90 = mpixlf95_r -O3 -g -qarch=auto -qtune=auto
#F90OPT = mpixlf95_r -O4 -g -qarch=auto -qtune=auto
#LIBS = -L/usr/local/lib -lscalapack -llapack -lblacsF77init -lblacs -lblacsF77init -lblacs \
#-L/opt/ibmmath/essl/4.4/lib -lesslbg -lc
#
# ------------------------------------------------------------------------------
all: test_real read_real test_complex test_real_gen read_real_gen test_complex_gen test_real2 test_complex2
all: elpa1_test_real elpa1_test_complex elpa2_test_real elpa2_test_complex
test_real: test_real.o elpa1.o
$(F90) -o $@ test_real.o elpa1.o $(LIBS)
elpa1_test_real: elpa1_test_real.o elpa1.o
$(F90) -o $@ elpa1_test_real.o elpa1.o $(LIBS)
read_real: read_real.o elpa1.o
$(F90) -o $@ read_real.o elpa1.o $(LIBS)
test_complex: test_complex.o elpa1.o
elpa1_test_complex: elpa1_test_complex.o elpa1.o
$(F90) -o $@ elpa1_test_complex.o elpa1.o $(LIBS)
test_real_gen: test_real_gen.o elpa1.o
$(F90) -o $@ test_real_gen.o elpa1.o $(LIBS)
read_real_gen: read_real_gen.o elpa1.o
$(F90) -o $@ read_real_gen.o elpa1.o $(LIBS)
test_complex_gen: test_complex_gen.o elpa1.o
$(F90) -o $@ test_complex_gen.o elpa1.o $(LIBS)
test_real2: test_real2.o elpa1.o elpa2.o elpa2_kernels.o
$(F90) -o $@ test_real2.o elpa1.o elpa2.o elpa2_kernels.o $(LIBS)
test_complex2: test_complex2.o elpa1.o elpa2.o elpa2_kernels.o
$(F90) -o $@ test_complex2.o elpa1.o elpa2.o elpa2_kernels.o $(LIBS)
elpa2_test_real: elpa2_test_real.o elpa1.o elpa2.o elpa2_kernels_real.o elpa2_kernels_complex.o
$(F90) -o $@ elpa2_test_real.o elpa1.o elpa2.o elpa2_kernels_real.o elpa2_kernels_complex.o $(LIBS)
test_real.o: test_real.f90 elpa1.o
$(F90) -c $<
read_real.o: read_real.f90 elpa1.o
$(F90) -c $<
elpa2_test_complex: elpa2_test_complex.o elpa1.o elpa2.o elpa2_kernels_real.o elpa2_kernels_complex.o
$(F90) -o $@ elpa2_test_complex.o elpa1.o elpa2.o elpa2_kernels_real.o elpa2_kernels_complex.o $(LIBS)
test_complex.o: test_complex.f90 elpa1.o
elpa1_test_real.o: elpa1_test_real.F90 elpa1.o
$(F90) -c $<
test_real_gen.o: test_real_gen.f90 elpa1.o
elpa1_test_complex.o: elpa1_test_complex.F90 elpa1.o
$(F90) -c $<
read_real_gen.o: read_real_gen.f90 elpa1.o
elpa2_test_real.o: elpa2_test_real.F90 elpa1.o elpa2.o
$(F90) -c $<
test_complex_gen.o: test_complex_gen.f90 elpa1.o
elpa2_test_complex.o: elpa2_test_complex.f90 elpa1.o elpa2.o
$(F90) -c $<
test_real2.o: test_real2.f90 elpa1.o elpa2.o
elpa1.o: ../src/elpa1.F90
$(F90) -c $<
test_complex2.o: test_complex2.f90 elpa1.o elpa2.o
$(F90) -c $<
elpa1.o: ../src/elpa1.f90
$(F90) -c $<
elpa2.o: ../src/elpa2.F90 elpa1.o
$(F90) -c ../src/elpa2.F90
elpa2.o: ../src/elpa2.f90 elpa1.o
$(F90) -c ../src/elpa2.f90
elpa2_kernels_real.o: ../src/elpa2_kernels_real.f90
$(F90OPT) -c ../src/elpa2_kernels_real.f90
elpa2_kernels.o: ../src/elpa2_kernels.f90
$(F90OPT) -c ../src/elpa2_kernels.f90
elpa2_kernels_complex.o: ../src/elpa2_kernels_complex.f90
$(F90OPT) -c ../src/elpa2_kernels_complex.f90
clean:
rm -f *.o *.mod test_real test_complex test_real_gen test_complex_gen test_real2 test_complex2 read_real read_real_gen
rm -f *.o *.mod elpa1_test_real elpa1_test_complex elpa2_test_real elpa2_test_complex