1. 07 Jun, 2016 2 commits
  2. 06 Jun, 2016 1 commit
  3. 02 May, 2016 1 commit
  4. 23 Apr, 2016 1 commit
  5. 22 Apr, 2016 2 commits
  6. 21 Apr, 2016 1 commit
  7. 20 Apr, 2016 2 commits
      a9d27681
      Additional configure check for gcc SSE intrinsics · 896388e9
      Andreas Marek authored
      It turned out that, if a CPU supports SSE, the existing test for
      SSE assembly instructions always passes.
      However, compiling gcc SSE intrinsics can nevertheless fail if
      gcc is not called with one of the options
      "-msse3", "-msse4", "-msse4.1", "-msse4.2", "-mavx", or "-mavx2"!
      
      Apparently gcc still does not consider SSE a standard on x86_64
      Intel CPUs.
      
      An additional configure test has been introduced which tests for
      gcc SSE intrinsics. If this test fails, the corresponding
      kernels are switched off.
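
To illustrate why such a check depends on the compiler flags, here is a minimal sketch in C. It is not the test added to configure in this commit; the function names `sse3_intrinsics_enabled` and `pair_sum` are hypothetical. It relies on the fact that gcc defines the macro `__SSE3__` only when SSE3 (or a newer instruction set) has been enabled, for example with "-msse3". A real configure probe would presumably drop the preprocessor guard so that compilation fails outright when the intrinsics are unavailable; the guard is kept here only so that the sketch builds on any platform.

```c
#ifdef __SSE3__
#include <pmmintrin.h>  /* SSE3 intrinsics such as _mm_hadd_pd */
#endif

/* Hypothetical helper: returns 1 if the compiler was invoked with
 * SSE3 (or a newer instruction set) enabled, 0 otherwise. */
int sse3_intrinsics_enabled(void)
{
#ifdef __SSE3__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: adds two doubles. Uses an SSE3 horizontal add
 * when the intrinsics are enabled, and plain scalar code otherwise. */
double pair_sum(double a, double b)
{
#ifdef __SSE3__
    __m128d v = _mm_set_pd(b, a);   /* low lane = a, high lane = b */
    __m128d s = _mm_hadd_pd(v, v);  /* both lanes = a + b */
    return _mm_cvtsd_f64(s);        /* extract the low lane */
#else
    return a + b;
#endif
}
```

Compiling this file with and without "-msse3" changes the value returned by `sse3_intrinsics_enabled`, which mirrors the situation described above: the CPU may support the instructions while the compiler still refuses the intrinsics.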
  8. 19 Apr, 2016 1 commit
  9. 15 Apr, 2016 1 commit
  10. 14 Apr, 2016 2 commits
  11. 13 Apr, 2016 1 commit
  12. 12 Apr, 2016 1 commit
  13. 06 Apr, 2016 1 commit
  14. 05 Apr, 2016 1 commit
      Introduction of new SSE kernels with different blocking · 69792b15
      Andreas Marek authored
      The SSE kernels with blocking of 2, 4, 6 (real case) and 1, 2
      (complex case) are now available by default.
      
      The following changes have therefore been made:
      - introduce new macros in configure.ac and Makefile.am
      - rename the AVX kernels to AVX_AVX2 (they also support AVX2)
      - introduce new files with the SSE kernels
      - introduce new kernel parameters!
      - make the SSE kernels callable
      
      The results are identical to those of the previous kernels.
  15. 04 Apr, 2016 1 commit
  16. 01 Apr, 2016 1 commit
  17. 18 Mar, 2016 1 commit
      Allow ELPA to be built with single and double precision symbols in one library · 647aa5a8
      Andreas Marek authored
      
      If the configure option "--enable-single-precision" is specified,
      ELPA will also be built for single precision usage. The double precision
      and single precision versions will be available at the same time under
      names such as "solve_evp_real_1stage_double" and
      "solve_evp_real_1stage_single".
      
      This change implied some major refactoring of the ELPA code:
      1.) functions/procedures had to be renamed with the suffix "_double"
      
      2.) If necessary, the same functions have to be available with the
      suffix "_single"
      
      3.) Variable kind definitions have to be consistent with the
      intended use
      
      To avoid unnecessary code duplication, this is done (most of the time)
      with preprocessor string substitution.
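
The idea of generating "_double" and "_single" variants from one body by preprocessor substitution can be sketched as follows. This is an illustration in C, not ELPA's actual source: the macros `PASTE` and `DEFINE_DOT` and the routine name `dot_product` are hypothetical and chosen only to show the technique.

```c
#include <stddef.h>

/* Hypothetical helper macros: paste a precision suffix onto a name. */
#define PASTE_(name, suffix) name##_##suffix
#define PASTE(name, suffix)  PASTE_(name, suffix)

/* One routine body, instantiated once per precision.
 * REAL is the floating-point type, SUFFIX the symbol suffix. */
#define DEFINE_DOT(REAL, SUFFIX)                                  \
    REAL PASTE(dot_product, SUFFIX)(const REAL *x, const REAL *y, \
                                    size_t n)                     \
    {                                                             \
        REAL sum = 0;                                             \
        for (size_t i = 0; i < n; ++i)                            \
            sum += x[i] * y[i];                                   \
        return sum;                                               \
    }

DEFINE_DOT(double, double)  /* defines dot_product_double */
DEFINE_DOT(float,  single)  /* defines dot_product_single */
```

Both symbols end up in the same object file, so a caller can pick the precision by name without the routine body being written twice.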
      
      The documentation has been updated.
      
      NOT SUPPORTED at the moment:
      
      - single precision usage of ELPA2 with kernels other than "generic"
        and "generic_simple"
      
      - single precision usage of GPU
  18. 24 Feb, 2016 2 commits
      Add migration notice · 31a03aa2
      Andreas Marek authored
      Optional build of ELPA without MPI · 49f119aa
      Andreas Marek authored
      The configure flag "--enable-shared-memory-only" triggers a build
      of ELPA without MPI support:
      
      - all MPI calls are skipped (or overloaded)
      - all calls to scalapack functions are replaced by the corresponding
        lapack calls
      - all calls to blacs are skipped
      
      Using ELPA without MPI gives the same results as using ELPA with 1 MPI
      task!
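
How MPI calls can be skipped or replaced in such a build can be sketched with conditional compilation. The following C fragment is only an illustration under assumptions: the macro `WITH_MPI` and the wrapper names `task_count` and `global_sum` are hypothetical and not taken from the ELPA sources. Without `WITH_MPI`, the wrappers behave like a run with exactly one MPI task.

```c
/* WITH_MPI is a hypothetical build macro used for this sketch. */
#ifdef WITH_MPI
#include <mpi.h>
#endif

/* Number of cooperating tasks; always 1 in the build without MPI. */
int task_count(void)
{
#ifdef WITH_MPI
    int n;
    MPI_Comm_size(MPI_COMM_WORLD, &n);
    return n;
#else
    return 1;
#endif
}

/* Sum of a value over all tasks; with one task the global sum
 * is simply the local value, so the MPI call is skipped. */
double global_sum(double local)
{
#ifdef WITH_MPI
    double total;
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    return total;
#else
    return local;
#endif
}
```

Keeping the wrappers' signatures identical in both builds means the calling code does not need to know whether MPI is present.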
      
      This version is not yet optimized for performance; here and there some
      unnecessary copies are made.
      
      This version is intended for users who do not have MPI in their
      application but would still like to use ELPA on one compute node.
  19. 17 Feb, 2016 1 commit
      Single precision support for ELPA2 · 940b8f26
      Andreas Marek authored
      ELPA2 can now be built (like ELPA1) for single precision calculations.
      The ELPA2 kernels which are implemented in assembler, C, or C++ have NOT
      yet been ported.
      
      Thus, at the moment, only the GENERIC and GENERIC_SIMPLE kernels support
      single precision calculations.
  20. 12 Feb, 2016 1 commit
      Single precision support for ELPA2 · 56043bdc
      Andreas Marek authored
      ELPA2 can now be built (like ELPA1) for single precision calculations.
      The ELPA2 kernels which are implemented in assembler, C, or C++ have NOT
      yet been ported.
      
      Thus, at the moment, only the GENERIC and GENERIC_SIMPLE kernels support
      single precision calculations.
  21. 04 Feb, 2016 1 commit
  22. 02 Feb, 2016 3 commits
      Remove assumed size arrays from real simple kernel · 7a564731
      Andreas Marek authored
      This commit might be performance critical; it has to be timed
      carefully. For that reason, one can switch back to the old implementation.
      The new one, however, is safer and easier to debug.
      Remove assumed size arrays from generic complex kernel · 1da1bd50
      Andreas Marek authored
      This change might be performance critical and has to be timed
      carefully. For that reason, it is possible to switch back to the old
      implementation. The new one, however, can actually be debugged.
      Remove assumed size from generic real kernel · cb4c4ae7
      Andreas Marek authored
      The generic real kernel is now contained in a module, which allows
      strict interface checking! It also no longer uses assumed size arrays.
      Both points make it easier to debug and find errors.
      
      However, this might be performance critical! It is possible to
      switch back to the old implementation if that turns out to
      be beneficial w.r.t. performance. Timings with gfortran 4.9 on Intel
      Haswell showed that the new implementation is about 30 percent faster
      than the previous one.
  23. 19 Jan, 2016 1 commit