Commit 607a1166 authored by Andreas Marek

ELPA 2013.08.003

 - The INSTALL documentation was updated a bit
 - the documentation of the ELPA kernels was improved
 - the configure script was improved:

   if usage of an AVX kernel is specified, it is checked at configure
   time whether such a kernel can be built.
   If not, it is checked whether the kernel can be built if the
   option "-mavx" is added to the CFLAGS and CXXFLAGS. If this is
   still not possible, an error is thrown.

   if the option "--with-avx-optimization" is given, then the CFLAGS and
   CXXFLAGS are automatically updated with some necessary flags (which
   are described in the kernel documentation file)
parent db2e0fab
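The two-pass AVX check described in the commit message can be sketched in plain shell, outside of Autoconf. This is only an illustration under stated assumptions, not the code that configure runs: the function names `probe_with_cc` and `resolve_avx_cflags` are hypothetical, and the probe is passed in as a parameter so the fallback logic can be tried without a compiler.

```shell
#!/bin/sh
# Sketch of the two-pass AVX check (hypothetical helper names, not part of ELPA).

# Example probe: succeeds if a tiny AVX intrinsic program compiles with the
# flags given as arguments. Assumes a gcc-compatible compiler in $CC (or cc).
probe_with_cc() {
    tmp=$(mktemp -d) || return 1
    cat > "$tmp/probe.c" <<'EOF'
#include <x86intrin.h>
int main(void) {
    double q[4] __attribute__((aligned(32))) = {0.0, 0.0, 0.0, 0.0};
    __m256d a = _mm256_load_pd(q);
    (void)a;
    return 0;
}
EOF
    ${CC:-cc} "$@" -c "$tmp/probe.c" -o "$tmp/probe.o" >/dev/null 2>&1
    rc=$?
    rm -rf "$tmp"
    return $rc
}

# Usage: resolve_avx_cflags PROBE [FLAGS...]
# Prints the flags that make the probe succeed; returns 1 if neither the
# given flags nor the given flags plus -mavx work.
resolve_avx_cflags() {
    probe=$1
    shift
    base="$*"
    # First pass: the flags exactly as specified by the user.
    if "$probe" "$@"; then
        printf '%s\n' "$base"
        return 0
    fi
    # Second pass: retry with -mavx appended.
    if "$probe" "$@" -mavx; then
        if [ -n "$base" ]; then
            printf '%s -mavx\n' "$base"
        else
            printf '%s\n' "-mavx"
        fi
        return 0
    fi
    echo "could not compile with AVX intrinsics; maybe choose another kernel" >&2
    return 1
}
```

With a real compiler available, a call such as `resolve_avx_cflags probe_with_cc -O2` would print the flags to use. Passing the probe as an argument is a design choice made for this sketch only.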
......@@ -4,13 +4,10 @@ How to install ELPA:
ELPA is shipped with a typical "configure" and "make" procedure. It is
recommended to use this way to install ELPA, see (A). If you do not want to
install ELPA as library, but to include it in your source code, please refer
to point (B)
you can find a
"Makefile.example" in ./test, to see how this is done. Please distibute then
all files of ELPA with your code.
to point (B). An example makefile "Makefile.example" can be found in ./test,
to give some hints how this is done. Please distibute then all files of ELPA
with your code. Please note, that usage of ELPA as described in Section (B)
requires advanced knowledge about compilers, preprocessor flags, and optimizations.
(A): Installing ELPA as library with configure
......@@ -43,13 +40,12 @@ The configure installation is best done in four steps
You can either specify your own builds of lapack/blacs/scalapack
or use specialized Vendor packages, e.g. if available you can use
Intel's MKL. If you do not set these variables ELPA will not be
build!
Intel's MKL. If you do not set the variables "BLACS_LDFLAGS" and
"BLACS_FCFLAGS" ELPA will not be build!
Please set the optimisation that you would like with the
variable "FCFLAGS", "CFLAGS", and "CXXFLAGS", e.g. FCFLAGS="-O3 -xAVX".
For some elpa2-kernels, it is MANDATORY to set a few options,
variable "FCFLAGS", "CFLAGS", and "CXXFLAGS", e.g. FCFLAGS="-O3 -xAVX",
please see "./src/elpa2_kernels/README_elpa2_kernels.txt".
Set the "prefix" - flag, if you wish another installation location than
......@@ -82,15 +78,28 @@ B) Installing ELPA without the autotools procedure
If you do so, please distibute then all files of ELPA with your code.
However, this is not the recommended way for several reasons:
- from the last release, ELPA has grown substantially in performance
optimizations but also complexity. The simple "just use elpa source
files in your code" approach is becoming more and more difficult.
- from the last release, ELPA has grown substantially in performance
optimizations but also complexity. The simple "just use elpa source
files in your code" approach is becoming more and more difficult.
- you still have to choose an elpa2-kernel (see at (A)). Getting them
build from hand might be tedious.
- the file elpa2.F90 uses preprocessor defines for the different kernels.
you will have to do this by hand, if you do not use the autotools
infrastructure.
If the above warnings do not frighten you to build ELPA without the
"configure & make" procedure, here are some hints that might be
useful for you
- choose the appropriate compiler and compiler optimization flags,
take care that -- depending on the use case of ELPA -- you will need
a fortran, (GNU) c, and GNU C++ compiler.
- make yourself accquainted with the used preprocessor flags which are
used in ELPA. Write by hand a file "config-f90.h", which defines the
necessary preprocessor flags for your desired build
How to use ELPA:
......
This diff is collapsed.
......@@ -4,7 +4,7 @@
me=ar-lib
scriptversion=2012-03-01.08; # UTC
# Copyright (C) 2010-2012 Free Software Foundation, Inc.
# Copyright (C) 2010-2013 Free Software Foundation, Inc.
# Written by Peter Rosin <peda@lysator.liu.se>.
#
# This program is free software; you can redistribute it and/or modify
......
This diff is collapsed.
#! /bin/sh
# Wrapper for compilers which do not understand '-c -o'.
scriptversion=2012-03-05.13; # UTC
scriptversion=2012-10-14.11; # UTC
# Copyright (C) 1999-2012 Free Software Foundation, Inc.
# Copyright (C) 1999-2013 Free Software Foundation, Inc.
# Written by Tom Tromey <tromey@cygnus.com>.
#
# This program is free software; you can redistribute it and/or modify
......@@ -112,6 +112,11 @@ func_cl_dashl ()
lib=$dir/$lib.lib
break
fi
if test -f "$dir/lib$lib.a"; then
found=yes
lib=$dir/lib$lib.a
break
fi
done
IFS=$save_IFS
......
This diff is collapsed.
#! /bin/sh
# Configuration validation subroutine script.
# Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
# 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
# 2011, 2012 Free Software Foundation, Inc.
# Copyright 1992-2013 Free Software Foundation, Inc.
timestamp='2012-04-18'
timestamp='2013-04-24'
# This file is (in principle) common to ALL GNU software.
# The presence of a machine in this file suggests that SOME GNU software
# can handle that machine. It does not imply ALL GNU software can.
#
# This file is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, see <http://www.gnu.org/licenses/>.
......@@ -26,11 +20,12 @@ timestamp='2012-04-18'
# As a special exception to the GNU General Public License, if you
# distribute this file as part of a program that contains a
# configuration script generated by Autoconf, you may include it under
# the same distribution terms that you use for the rest of that program.
# the same distribution terms that you use for the rest of that
# program. This Exception is an additional permission under section 7
# of the GNU General Public License, version 3 ("GPLv3").
# Please send patches to <config-patches@gnu.org>. Submit a context
# diff and a properly formatted GNU ChangeLog entry.
# Please send patches with a ChangeLog entry to config-patches@gnu.org.
#
# Configuration subroutine to validate and canonicalize a configuration type.
# Supply the specified configuration type as an argument.
......@@ -73,9 +68,7 @@ Report bugs and patches to <config-patches@gnu.org>."
version="\
GNU config.sub ($timestamp)
Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
Free Software Foundation, Inc.
Copyright 1992-2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE."
......@@ -123,7 +116,7 @@ esac
maybe_os=`echo $1 | sed 's/^\(.*\)-\([^-]*-[^-]*\)$/\2/'`
case $maybe_os in
nto-qnx* | linux-gnu* | linux-android* | linux-dietlibc | linux-newlib* | \
linux-uclibc* | uclinux-uclibc* | uclinux-gnu* | kfreebsd*-gnu* | \
linux-musl* | linux-uclibc* | uclinux-uclibc* | uclinux-gnu* | kfreebsd*-gnu* | \
knetbsd*-gnu* | netbsd*-gnu* | \
kopensolaris*-gnu* | \
storm-chaos* | os2-emx* | rtmk-nova*)
......@@ -156,7 +149,7 @@ case $os in
-convergent* | -ncr* | -news | -32* | -3600* | -3100* | -hitachi* |\
-c[123]* | -convex* | -sun | -crds | -omron* | -dg | -ultra | -tti* | \
-harris | -dolphin | -highlevel | -gould | -cbm | -ns | -masscomp | \
-apple | -axis | -knuth | -cray | -microblaze)
-apple | -axis | -knuth | -cray | -microblaze*)
os=
basic_machine=$1
;;
......@@ -259,8 +252,10 @@ case $basic_machine in
| alpha | alphaev[4-8] | alphaev56 | alphaev6[78] | alphapca5[67] \
| alpha64 | alpha64ev[4-8] | alpha64ev56 | alpha64ev6[78] | alpha64pca5[67] \
| am33_2.0 \
| arc | arm | arm[bl]e | arme[lb] | armv[2345] | armv[345][lb] | avr | avr32 \
| be32 | be64 \
| arc | arceb \
| arm | arm[bl]e | arme[lb] | armv[2-8] | armv[3-8][lb] | armv7[arm] \
| avr | avr32 \
| be32 | be64 \
| bfin \
| c4x | clipper \
| d10v | d30v | dlx | dsp16xx \
......@@ -273,7 +268,7 @@ case $basic_machine in
| le32 | le64 \
| lm32 \
| m32c | m32r | m32rle | m68000 | m68k | m88k \
| maxq | mb | microblaze | mcore | mep | metag \
| maxq | mb | microblaze | microblazeel | mcore | mep | metag \
| mips | mipsbe | mipseb | mipsel | mipsle \
| mips16 \
| mips64 | mips64el \
......@@ -291,16 +286,17 @@ case $basic_machine in
| mipsisa64r2 | mipsisa64r2el \
| mipsisa64sb1 | mipsisa64sb1el \
| mipsisa64sr71k | mipsisa64sr71kel \
| mipsr5900 | mipsr5900el \
| mipstx39 | mipstx39el \
| mn10200 | mn10300 \
| moxie \
| mt \
| msp430 \
| nds32 | nds32le | nds32be \
| nios | nios2 \
| nios | nios2 | nios2eb | nios2el \
| ns16k | ns32k \
| open8 \
| or32 \
| or1k | or32 \
| pdp10 | pdp11 | pj | pjl \
| powerpc | powerpc64 | powerpc64le | powerpcle \
| pyramid \
......@@ -370,7 +366,7 @@ case $basic_machine in
| aarch64-* | aarch64_be-* \
| alpha-* | alphaev[4-8]-* | alphaev56-* | alphaev6[78]-* \
| alpha64-* | alpha64ev[4-8]-* | alpha64ev56-* | alpha64ev6[78]-* \
| alphapca5[67]-* | alpha64pca5[67]-* | arc-* \
| alphapca5[67]-* | alpha64pca5[67]-* | arc-* | arceb-* \
| arm-* | armbe-* | armle-* | armeb-* | armv*-* \
| avr-* | avr32-* \
| be32-* | be64-* \
......@@ -389,7 +385,8 @@ case $basic_machine in
| lm32-* \
| m32c-* | m32r-* | m32rle-* \
| m68000-* | m680[012346]0-* | m68360-* | m683?2-* | m68k-* \
| m88110-* | m88k-* | maxq-* | mcore-* | metag-* | microblaze-* \
| m88110-* | m88k-* | maxq-* | mcore-* | metag-* \
| microblaze-* | microblazeel-* \
| mips-* | mipsbe-* | mipseb-* | mipsel-* | mipsle-* \
| mips16-* \
| mips64-* | mips64el-* \
......@@ -407,12 +404,13 @@ case $basic_machine in
| mipsisa64r2-* | mipsisa64r2el-* \
| mipsisa64sb1-* | mipsisa64sb1el-* \
| mipsisa64sr71k-* | mipsisa64sr71kel-* \
| mipsr5900-* | mipsr5900el-* \
| mipstx39-* | mipstx39el-* \
| mmix-* \
| mt-* \
| msp430-* \
| nds32-* | nds32le-* | nds32be-* \
| nios-* | nios2-* \
| nios-* | nios2-* | nios2eb-* | nios2el-* \
| none-* | np1-* | ns16k-* | ns32k-* \
| open8-* \
| orion-* \
......@@ -788,9 +786,13 @@ case $basic_machine in
basic_machine=ns32k-utek
os=-sysv
;;
microblaze)
microblaze*)
basic_machine=microblaze-xilinx
;;
mingw64)
basic_machine=x86_64-pc
os=-mingw64
;;
mingw32)
basic_machine=i386-pc
os=-mingw32
......@@ -1019,7 +1021,11 @@ case $basic_machine in
basic_machine=i586-unknown
os=-pw32
;;
rdos)
rdos | rdos64)
basic_machine=x86_64-pc
os=-rdos
;;
rdos32)
basic_machine=i386-pc
os=-rdos
;;
......@@ -1346,21 +1352,21 @@ case $os in
-gnu* | -bsd* | -mach* | -minix* | -genix* | -ultrix* | -irix* \
| -*vms* | -sco* | -esix* | -isc* | -aix* | -cnk* | -sunos | -sunos[34]*\
| -hpux* | -unos* | -osf* | -luna* | -dgux* | -auroraux* | -solaris* \
| -sym* | -kopensolaris* \
| -sym* | -kopensolaris* | -plan9* \
| -amigaos* | -amigados* | -msdos* | -newsos* | -unicos* | -aof* \
| -aos* | -aros* \
| -nindy* | -vxsim* | -vxworks* | -ebmon* | -hms* | -mvs* \
| -clix* | -riscos* | -uniplus* | -iris* | -rtu* | -xenix* \
| -hiux* | -386bsd* | -knetbsd* | -mirbsd* | -netbsd* \
| -openbsd* | -solidbsd* \
| -bitrig* | -openbsd* | -solidbsd* \
| -ekkobsd* | -kfreebsd* | -freebsd* | -riscix* | -lynxos* \
| -bosx* | -nextstep* | -cxux* | -aout* | -elf* | -oabi* \
| -ptx* | -coff* | -ecoff* | -winnt* | -domain* | -vsta* \
| -udi* | -eabi* | -lites* | -ieee* | -go32* | -aux* \
| -chorusos* | -chorusrdb* | -cegcc* \
| -cygwin* | -msys* | -pe* | -psos* | -moss* | -proelf* | -rtems* \
| -mingw32* | -linux-gnu* | -linux-android* \
| -linux-newlib* | -linux-uclibc* \
| -mingw32* | -mingw64* | -linux-gnu* | -linux-android* \
| -linux-newlib* | -linux-musl* | -linux-uclibc* \
| -uxpv* | -beos* | -mpeix* | -udk* \
| -interix* | -uwin* | -mks* | -rhapsody* | -darwin* | -opened* \
| -openstep* | -oskit* | -conix* | -pw32* | -nonstopux* \
......@@ -1492,9 +1498,6 @@ case $os in
-aros*)
os=-aros
;;
-kaos*)
os=-kaos
;;
-zvmoe)
os=-zvmoe
;;
......@@ -1543,6 +1546,9 @@ case $basic_machine in
c4x-* | tic4x-*)
os=-coff
;;
hexagon-*)
os=-elf
;;
tic54x-*)
os=-coff
;;
......@@ -1583,6 +1589,9 @@ case $basic_machine in
mips*-*)
os=-elf
;;
or1k-*)
os=-elf
;;
or32-*)
os=-coff
;;
......
This diff is collapsed.
AC_PREREQ([2.69])
AC_INIT([elpa],[2013.08.002], elpa-library@rzg.mpg.de)
AC_INIT([elpa],[2013.08.003], elpa-library@rzg.mpg.de)
AC_CONFIG_SRCDIR([src/elpa1.f90])
AM_INIT_AUTOMAKE([foreign -Wall subdir-objects])
......@@ -7,6 +7,15 @@ AC_CONFIG_MACRO_DIR([m4])
AC_CONFIG_HEADERS([config.h])
#AM_SILENT_RULES([yes])
AX_CHECK_GNU_MAKE()
if test x$_cv_gnu_make_command = x ; then
AC_MSG_ERROR([Need GNU Make])
fi
AC_CHECK_PROG(CPP_FOUND,cpp,yes,no)
if test "x${CPP_FOUND}" = xno; then
AC_MSG_ERROR([no cpp found])
fi
AC_PROG_INSTALL
AM_PROG_CC_C_O
......@@ -14,27 +23,6 @@ AM_PROG_AR
AM_PROG_AS
AC_PROG_CXX
AC_LANG(Fortran)
m4_include([m4/ax_prog_fc_mpi.m4])
dnl check whether an mpi compiler is available;
dnl if not abort since it is mandatory
AX_PROG_FC_MPI([],[have_mpi=yes],[have_mpi=no
if test "x${have_mpi}" = xno; then
AC_MSG_ERROR([no mpi found])
fi])
AC_SUBST([ELPA_LIB_VERSION], [2013.08.002])
# this is the version of the API, should be changed in the major revision
# if and only if the actual API changes
AC_SUBST([ELPA_SO_VERSION], [0:0:0])
AC_FC_FREEFORM
AC_FC_MODULE_FLAG
AC_FC_MODULE_OUTPUT_FLAG
dnl macro for an --with-$2 switch that sets the
dnl preprocessor define $1, with description $3, default $4, possible values $5
AC_DEFUN([DEFINE_OPTION],[
......@@ -53,6 +41,7 @@ AC_DEFUN([DEFINE_OPTION],[
fi
])
DEFINE_OPTION([WITH_GENERIC], [generic],
[use generic kernel for all architectures (with some hand-coded optimizations)],
[no],[])
......@@ -101,6 +90,102 @@ DEFINE_OPTION([WITH_AVX_REAL_BLOCK6], [avx-real-block6],
[use AVX optimized real kernel with blocking 6 (written in gcc assembler)],
[no],[])
dnl check whether we better check if AVX compilation should work
if test "x${with_avx_sandybridge}" = xyes; then
check_avx_compilation=yes
fi
if test "x${with_amd_bulldozer}" = xyes; then
check_avx_compilation=yes
fi
if test "x${with_avx_complex_block1}" = xyes; then
check_avx_compilation=yes
fi
if test "x${with_avx_complex_block2}" = xyes; then
check_avx_compilation=yes
fi
if test "x${with_avx_real_block2}" = xyes; then
check_avx_compilation=yes
fi
if test "x${with_avx_real_block4}" = xyes; then
check_avx_compilation=yes
fi
if test "x${with_avx_real_block6}" = xyes; then
check_avx_compilation=yes
fi
dnl if necessary do the check for avx compilation
if test "x${check_avx_compilation}" = xyes; then
dnl check whether one can compile with avx - gcc intrinsics
AC_MSG_CHECKING([whether we can compile a gcc intrinsic AVX program])
dnl first pass: try with specified CFLAGS and CXXFLAGS
AC_COMPILE_IFELSE([AC_LANG_SOURCE([
#include <x86intrin.h>
void main(){
double* q;
__m256d a1_1 = _mm256_load_pd(q);
}
])],
[can_compile_avx_prog=yes],
[can_compile_avx_prog=no]
)
dnl first test failed: try again after updating CFLAGS and CXXFLAGS with -mavx
if test "x${can_compile_avx_prog}" = xno; then
CFLAGS="$CFLAGS -mavx"
CXXFLAGS="$CXXFLAGS -mavx"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([
#include <x86intrin.h>
void main(){
double* q;
__m256d a1_1 = _mm256_load_pd(q);
}
])],
[can_compile_avx_prog=yes],
[can_compile_avx_prog=no]
)
fi
AC_MSG_RESULT([${can_compile_avx_prog}])
if test "x${can_compile_avx_prog}" = xno; then
AC_MSG_ERROR([could not compile with gcc AVX intrinsic! Maybe choose another kernel])
fi
fi
dnl set the AVX optimization flags if this option is specified
AC_MSG_CHECKING(whether AVX optimization flags should be set automatically)
AC_ARG_WITH([avx-optimization],
AS_HELP_STRING([--with-avx-optimization],
[use AVX optimization, default no.]),
[with_avx_optimization=yes],
[with_avx_optimization=no])
AC_MSG_RESULT([${with_avx_optimization}])
if test "x${with_avx_optimization}" = xyes; then
CFLAGS="$CFLAGS -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize"
CXXFLAGS="$CXXFLAGS -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize"
fi
AC_LANG(Fortran)
m4_include([m4/ax_prog_fc_mpi.m4])
dnl check whether an mpi compiler is available;
dnl if not abort since it is mandatory
AX_PROG_FC_MPI([],[have_mpi=yes],[have_mpi=no
if test "x${have_mpi}" = xno; then
AC_MSG_ERROR([no mpi found])
fi])
AC_SUBST([ELPA_LIB_VERSION], [2013.08.003])
# this is the version of the API, should be changed in the major revision
# if and only if the actual API changes
AC_SUBST([ELPA_SO_VERSION], [0:0:0])
AC_FC_FREEFORM
AC_FC_MODULE_FLAG
AC_FC_MODULE_OUTPUT_FLAG
save_FCFLAGS=$FCFLAGS
......
This diff is collapsed.
......@@ -27,11 +27,15 @@ Currently we offer the following alternatives for the ELPA2 kernels:
in the hope to get optimal code from most FORTRAN
compilers. The configure option "--with-generic"
uses these kernels. They are propably a good
default if you do not know which kernel to use.
default if you do not know which kernel
to use. Note that in the real version,
there is used a complex variable in
order to enforce better compiler
optimizations. This produces correct
code, however, some compilers might
produce a warning.
* elpa2_kernels_{real|complex}_simple.f90
- Plain and simple version of elpa2_kernels.f90.
......@@ -67,7 +71,6 @@ Currently we offer the following alternatives for the ELPA2 kernels:
e.g. Intel Nehalem.
Several
* elpa2_kernels_{real|complex}_sse-avx_*.c(pp)
......@@ -88,6 +91,10 @@ Several
-ftree-vectorize"
for best performace results.
For convenience the flag
"--with-avx-optimization" sets these
CFLAGS and CXXFLAGS automatically.
On Intel Sandybridge architectures the
configure option "--with-intel-sandybride"
use the best combination.
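As a rough illustration of what "--with-avx-optimization" amounts to, the sketch below appends the optimization flags to an existing flag list by hand. The flag list is copied from the configure.ac change in this commit. The helper name `append_avx_flags` is hypothetical and not part of ELPA.

```shell
#!/bin/sh
# Flags that configure.ac adds when --with-avx-optimization is given.
AVX_OPT_FLAGS="-funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vect-loop-version -ftree-vectorize"

# Hypothetical helper: print the given flags with the AVX flags appended.
append_avx_flags() {
    if [ $# -gt 0 ]; then
        printf '%s %s\n' "$*" "$AVX_OPT_FLAGS"
    else
        printf '%s\n' "$AVX_OPT_FLAGS"
    fi
}

# Manual equivalent of the automatic update, applied to both variables.
# The unquoted expansion is intentional: each flag becomes one argument.
CFLAGS=$(append_avx_flags ${CFLAGS:-})
CXXFLAGS=$(append_avx_flags ${CXXFLAGS:-})
```

Setting the flags by hand like this should only be needed when the configure option is not used.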
......