  1. 24 Apr, 2016 1 commit
  2. 23 Apr, 2016 1 commit
  3. 22 Apr, 2016 2 commits
  4. 21 Apr, 2016 2 commits
  5. 20 Apr, 2016 3 commits
    • Error in Makefile.am · 5499416d
      Andreas Marek authored
    • Change of test programs · e18155f9
      Andreas Marek authored
    • Additional configure check for gcc SSE intrinsics · 896388e9
      Andreas Marek authored
      It turned out that, if a CPU supports SSE, the existing test for
      SSE assembly instructions always passes.
      However, compiling gcc SSE intrinsics may nevertheless fail
      if gcc is not called with one of the options
      "-msse3", "-msse4", "-msse4.1", "-msse4.2", "-mavx", or "-mavx2"!
      
      Evidently gcc still does not treat these SSE extensions as standard
      on x86_64 Intel CPUs.
      
      An additional configure test has been introduced, which tests for
      gcc SSE intrinsics. If this test fails, the corresponding
      kernels are switched off.
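
The dependence on compiler flags described above can be sketched in C. This is only an illustration, not the configure test from the commit: it relies on gcc defining the macro `__SSE3__` only when SSE3 code generation is enabled (for example with "-msse3" or "-mavx"). The names `HAVE_SSE_INTRINSICS`, `sse_intrinsics_enabled`, and `pair_sum` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* gcc defines __SSE3__ only when SSE3 code generation is enabled,
   for example via -msse3 or a stronger option such as -mavx. */
#if defined(__SSE3__)
#include <pmmintrin.h>          /* SSE3 intrinsics */
#define HAVE_SSE_INTRINSICS 1   /* hypothetical macro name */
#else
#define HAVE_SSE_INTRINSICS 0
#endif

/* Reports whether the intrinsic code path was compiled in. */
int sse_intrinsics_enabled(void)
{
    return HAVE_SSE_INTRINSICS;
}

/* Adds the two elements of v, using an SSE3 horizontal add when the
   intrinsics are available and plain scalar code otherwise. */
double pair_sum(const double *v)
{
    assert(v != NULL);
#if HAVE_SSE_INTRINSICS
    __m128d x = _mm_loadu_pd(v);     /* unaligned load of v[0], v[1] */
    __m128d s = _mm_hadd_pd(x, x);   /* both lanes hold v[0] + v[1] */
    return _mm_cvtsd_f64(s);         /* extract the low lane */
#else
    return v[0] + v[1];
#endif
}
```

Without the guard, the include and the intrinsic calls would fail to compile when no suitable option is given, which is the situation a separate configure probe has to detect.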
  6. 19 Apr, 2016 3 commits
  7. 12 Apr, 2016 1 commit
  8. 08 Apr, 2016 1 commit
    • AVX kernels need aligned memory · 59e405e0
      Lorenz Hüdepohl authored
      For the Intel compiler, this was ensured with the directive
      
        !DEC$ ATTRIBUTES ALIGN: 64:: a
      
      Other compilers such as gcc do not honour this directive, however,
      which could result in SIGSEGVs whenever the variable did not happen
      to be aligned to 32 bytes.
      
      This fixes issue #11. Thanks to Nico Holmberg for reporting it.
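
A compiler-independent way to obtain and verify aligned memory can be sketched in C. This is not necessarily how the commit fixes the problem. It assumes a C11 library providing `aligned_alloc`, and the helper names `is_aligned` and `alloc_aligned_doubles` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is aligned to `alignment` bytes (a power of two). */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocates room for n doubles, aligned to `alignment` bytes.
   aligned_alloc expects the size to be a multiple of the alignment,
   so the byte count is rounded up. Release the result with free(). */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    size_t bytes = n * sizeof(double);
    size_t padded = (bytes + alignment - 1) / alignment * alignment;
    if (padded == 0)
        padded = alignment;
    return aligned_alloc(alignment, padded);
}
```

A buffer aligned to 64 bytes is also aligned to 32 bytes, so `is_aligned(buffer, 32)` holds for anything returned by `alloc_aligned_doubles(n, 64)`.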
  9. 05 Apr, 2016 1 commit
    • Introduction of new SSE kernels with different blocking · 69792b15
      Andreas Marek authored
      The SSE kernels with blocking of 2, 4, 6 (real case) and 1, 2
      (complex case) are now available by default.
      
      The following changes have been made:
      - introduce new macros in configure.ac and Makefile.am
      - rename the AVX kernels to AVX_AVX2 (they also support AVX2)
      - introduce new files with the SSE kernels
      - introduce new kernel parameters!
      - make the SSE kernels callable
      
      The results are identical to those of the previous kernels.
  10. 04 Apr, 2016 1 commit
  11. 01 Apr, 2016 1 commit
  12. 18 Mar, 2016 1 commit
    • Allow ELPA to be built with single and double precision symbols in one library · 647aa5a8
      Andreas Marek authored
      If the configure option "--enable-single-precision" is specified,
      ELPA will also be built for single-precision usage. The double-precision
      and single-precision routines are then available at the same time under
      names such as "solve_evp_real_1stage_double" and
      "solve_evp_real_1stage_single", and so on.
      
      This change implied some major refactoring of the ELPA code:
      1.) Functions/procedures had to be renamed with the suffix "_double".
      
      2.) Where necessary, the same functions have to be available with the
      suffix "_single".
      
      3.) Variable kind definitions have to be consistent with the
      intended use.
      
      To avoid unnecessary code duplication, this is done (most of the time)
      with preprocessor string substitution.
      
      The documentation has been updated.
      
      NOT SUPPORTED at the moment:
      
      - single-precision usage of ELPA2 with kernels other than "generic"
        and "generic_simple"
      
      - single-precision usage of the GPU
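
The idea of generating "_double" and "_single" variants from one source via the preprocessor can be sketched as follows. ELPA itself is Fortran, so this C version with token pasting is only an analogy. The macro `DEFINE_DOT` and the functions `dot_double` and `dot_single` are hypothetical and not part of ELPA.

```c
#include <assert.h>
#include <stddef.h>

/* One function body, instantiated once per precision. SUFFIX becomes
   part of the symbol name and REAL is the floating-point type. */
#define DEFINE_DOT(SUFFIX, REAL)                                  \
    REAL dot_##SUFFIX(const REAL *x, const REAL *y, size_t n)     \
    {                                                             \
        REAL sum = 0;                                             \
        assert(x != NULL && y != NULL);                           \
        for (size_t i = 0; i < n; ++i)                            \
            sum += x[i] * y[i];                                   \
        return sum;                                               \
    }

DEFINE_DOT(double, double)  /* defines dot_double */
DEFINE_DOT(single, float)   /* defines dot_single */
```

Both symbols end up in the same object file, so a caller can pick the precision by name, for example `dot_double(x, y, n)` or `dot_single(xs, ys, n)`.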
  13. 26 Feb, 2016 1 commit
  14. 24 Feb, 2016 3 commits
    • Add migration notice · 31a03aa2
      Andreas Marek authored
    • Template for print messages of test programs · 296e4f48
      Andreas Marek authored
      The test programs now include the same template, so the
      printed messages are unified.
    • Optional build of ELPA without MPI · 49f119aa
      Andreas Marek authored
      The configure flag "--enable-shared-memory-only" triggers a build
      of ELPA without MPI support:
      
      - all MPI calls are skipped (or overloaded)
      - all calls to ScaLAPACK functions are replaced by the corresponding
        LAPACK calls
      - all calls to BLACS are skipped
      
      Using ELPA without MPI gives the same results as using ELPA with 1 MPI
      task!
      
      This version is not yet optimized for performance; here and there some
      unnecessary copies are made.
      
      This version is intended for users who do not have MPI in their
      application but would still like to use ELPA on one compute node.
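
How MPI calls can be "skipped (or overloaded)" at build time can be sketched with conditional compilation. This is an illustration in C, not ELPA's Fortran implementation. The macro `WITH_MPI` and the functions `task_count` and `global_sum` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef WITH_MPI
#include <mpi.h>
#endif

/* Number of tasks: queried from MPI, or fixed to 1 without MPI. */
int task_count(void)
{
#ifdef WITH_MPI
    int n;
    MPI_Comm_size(MPI_COMM_WORLD, &n);
    return n;
#else
    return 1;
#endif
}

/* Element-wise sum over all tasks. With a single task the reduction
   degenerates to a copy, which is one source of the extra copies a
   non-MPI build can contain. */
void global_sum(const double *local, double *global, size_t n)
{
    assert(local != NULL && global != NULL);
#ifdef WITH_MPI
    MPI_Allreduce(local, global, (int)n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
#else
    memcpy(global, local, n * sizeof(double));
#endif
}
```

Compiled without `WITH_MPI`, the code behaves like a run with one MPI task, which matches the equivalence stated in the commit message.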
  15. 11 Feb, 2016 1 commit
    • Enable single-precision calculations for ELPA1 · de6a4fde
      Andreas Marek authored
      With the configure option "--enable-single-precision", ELPA1 is built
      with single precision (32-bit words) only.
      
      The best precision in single precision (float or complex) is
      2^-23 ~ 1.2e-7. The error residual of ELPA1 in single-precision mode
      is of the order 1e-4 to 1e-5. The orthogonality of
      the EVs is fulfilled to about 1e-6.
      
      Thus the precision of ELPA1 in single-precision mode is roughly 100 to
      1000 times lower than the best achievable precision. This is consistent
      with the double-precision mode, where a factor of 100 to 1000 less
      precision than the theoretical best is also found.
      
      The float EVs are identical to the double EVs to at least 1e-2; the
      precision of the EVs is thus about 1e-2/1e-7 = 1e5 times lower than the
      best theoretical precision. If the same holds for the double-precision
      calculations, this implies that the double-precision results can also
      only be trusted at the level of 1e-11 (5 orders of magnitude larger
      than the best theoretical precision).
      
      The best achievable speed-up compared to the double-precision
      calculation is a factor of two. This is by far not achieved yet, since
      the single-precision version is not at all optimized at the moment.
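
The arithmetic in this message can be checked with a few lines of C. The sketch assumes IEEE 754 floating point, where `FLT_EPSILON` is 2^-23 and `DBL_EPSILON` is 2^-52. The helper names `pow2_neg` and `precision_loss` are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Computes 2^-k exactly by repeated halving (k >= 0). */
double pow2_neg(int k)
{
    double x = 1.0;
    assert(k >= 0);
    while (k-- > 0)
        x *= 0.5;
    return x;
}

/* Factor by which an observed error exceeds the machine epsilon. */
double precision_loss(double observed, double epsilon)
{
    assert(epsilon > 0.0);
    return observed / epsilon;
}
```

With an observed agreement of 1e-2, `precision_loss(1e-2, FLT_EPSILON)` is of the order 1e5, and multiplying that factor by `DBL_EPSILON` gives a value of the order 1e-11, in line with the estimate in the message.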
  16. 02 Feb, 2016 5 commits
  17. 22 Jan, 2016 1 commit
  18. 19 Jan, 2016 2 commits
  19. 11 Jan, 2016 1 commit
  20. 04 Jan, 2016 1 commit
    • Started to remove deprecated Fortran variable declarations · 0a05f7d3
      Andreas Marek authored
      Fortran variable declarations of the form "variable type*[4,8,16]" are
      not standard Fortran and might cause problems in the future.
      Furthermore, using Fortran and C together is cleaner
      if variables are defined according to C variable types.
      
      This has now been done for all the test programs.
  21. 22 Dec, 2015 2 commits
  22. 16 Dec, 2015 1 commit
    • Add interface to unify C and Fortran names · bb046d1c
      Andreas Marek authored
      This commit does not change the interfaces defined in ELPA_2015.11.001 !
      All functionality is available via the interface names and definitions
      as in ELPA_2015.11.001
      
      But some new interfaces have been added in order to unify the
      references from C and Fortran codes:
      
      - The procedures to create the ELPA (row/column) communicators are now
        available from C _and_ Fortran with the name "get_elpa_communicators".
        The old Fortran name "get_elpa_row_col_comms" and the old C name
        "elpa_get_communicators" are from now on deprecated but still available
      
      - The 1-stage solver routines are available from C _and_ Fortran via
        the names "solve_evp_real_1stage" and "solve_evp_complex_1stage".
        The old Fortran names "solve_evp_real" and "solve_evp_complex" are
        from now on deprecated but still functional.
      
      All documentation (man pages, doxygen, and example test programs) has
      been changed accordingly.
      
      This commit implies a change in the API versioning number, but no
      changes to codes calling ELPA (if they have been already updated to the
      API of ELPA_2015.11.001)
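
Keeping an old name available while introducing a unified one usually means the old name becomes a thin wrapper. The C sketch below only illustrates that pattern. `demo_grid_coords` and `demo_get_row_col` are hypothetical functions with made-up signatures, not ELPA interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* New, unified name (hypothetical): maps a task rank to its row and
   column in a row-major process grid with `ncols` columns.
   Returns 0 on success and 1 on invalid input. */
int demo_grid_coords(int rank, int ncols, int *row, int *col)
{
    assert(row != NULL && col != NULL);
    if (rank < 0 || ncols <= 0)
        return 1;
    *row = rank / ncols;
    *col = rank % ncols;
    return 0;
}

/* Old name (hypothetical), kept for existing callers. It is deprecated
   but still functional because it only forwards to the new name. */
int demo_get_row_col(int rank, int ncols, int *row, int *col)
{
    return demo_grid_coords(rank, ncols, row, col);
}
```

Callers of the old name keep working unchanged, and both names always return the same result because there is only one implementation.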
  23. 15 Dec, 2015 1 commit
    • Man pages for ELPA · b1df09cd
      Andreas Marek authored
      For the library functions which are accessible to the user,
      man pages describing the Fortran and C interfaces exist:
      
      - get_elpa_row_comms
      - solve_evp_real, solve_evp_complex
      - solve_evp_real_2stage, solve_evp_complex_2stage
      
      For the "service binary" print_available_elpa2_kernels,
      a man page also exists.
      
      TODO: extend man pages to test-binaries, or do not install test-binaries
  24. 10 Dec, 2015 3 commits
    • Create doxygen documentation for ELPA · 927f988a
      Andreas Marek authored
      The user functions of ELPA are now documented with doxygen tags.
      At the moment the interface of ELPA 2015.11.001 is described.
      
      The documentation still has to be added step by step for all functions
      and test programs.
    • Split file elpa2.F90 into elpa2.F90 and elpa2_compute.F90 · 2998fac3
      Andreas Marek authored
      As in a previous commit for elpa1.F90, elpa2.F90 has been split into
      two files for automatic generation of documentation, in order to
      have a lean, easy-to-understand user interface:
      
      elpa2.F90
      the visible user functions, which provide the library calls.
      The usage is the same as before.
      
      elpa2_compute.F90
      all internal routines, which are used by ELPA2 but are never
      called by a user from outside the library. These functions are now
      "hidden" in the module elpa2_compute, which is used by ELPA2.
      
      The procedures in elpa2_compute.F90 are identical to the ones in
      elpa2.F90 before this split commit. The only changes, although there
      are quite a lot of them, are indentation changes.
    • Split file elpa1.F90 into elpa1.F90 and elpa1_compute.F90 · 9710bf08
      Andreas Marek authored
      For automatic generation of documentation, the file elpa1.F90
      has been split into two files, in order to have a lean,
      easy-to-understand user interface:
      
      elpa1.F90
      the visible user functions, which provide the library calls.
      The usage is the same as always.
      
      elpa1_compute.F90
      all internal routines, which are used by ELPA1 and ELPA2 but
      are never called by the user. These functions are now "hidden"
      in the module elpa1_compute, which is used by ELPA1 and ELPA2.
      
      The procedures in elpa1_compute.F90 are identical to the ones in
      elpa1.F90 before this split commit. The only changes, although there
      are a lot of them, are indentation changes.