1. 19 Apr, 2016 1 commit
  2. 18 Apr, 2016 3 commits
  3. 15 Apr, 2016 1 commit
  4. 14 Apr, 2016 2 commits
  5. 13 Apr, 2016 4 commits
  6. 12 Apr, 2016 1 commit
  7. 08 Apr, 2016 2 commits
  8. 05 Apr, 2016 1 commit
    • Introduction of new SSE kernels with different blocking · 69792b15
      Andreas Marek authored
      The SSE kernels with blocking of 2, 4, 6 (real case) and 1, 2 (complex
      case) are now available by default.
      
      The following changes have been made:
      - introduce new macros in configure.ac and Makefile.am
      - rename the AVX kernels to AVX_AVX2 (they also support AVX2)
      - introduce new files with the SSE kernels
      - introduce new kernel parameters!
      - make the SSE kernels callable
      
      The results are identical to those of the previous kernels.
      69792b15
  9. 04 Apr, 2016 2 commits
  10. 01 Apr, 2016 1 commit
  11. 18 Mar, 2016 1 commit
    • Allow ELPA to be built with single and double precision symbols in one library · 647aa5a8
      Andreas Marek authored
      If the configure option "--enable-single-precision" is specified,
      ELPA will also be built for single precision usage. The double precision
      and single precision versions will be available at the same time, under
      names such as "solve_evp_real_1stage_double" and
      "solve_evp_real_1stage_single".
      
      This change implied some major refactoring of the ELPA code:
      1.) functions/procedures had to be renamed with the suffix "_double"
      
      2.) where necessary, the same functions have to be available with the
      suffix "_single"
      
      3.) variable kind definitions have to be consistent with the
      intended use
      
      To avoid unnecessary code duplication, this is done (most of the time)
      with preprocessor string substitution.
      
      The documentation has been updated.
      
      NOT SUPPORTED at the moment:
      
      - single precision usage of ELPA2 with kernels other than "generic"
        and "generic_simple"
      
      - single precision usage of the GPU version
      647aa5a8
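The suffix scheme described in this commit can be illustrated with a minimal C sketch. It assumes nothing about ELPA's actual sources, which are largely Fortran: the macro `DEFINE_DOT` and the functions `dot_double` and `dot_single` are hypothetical names chosen only to show how one code template can yield a "_double" and a "_single" symbol through preprocessor substitution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical template: one body, instantiated once per precision.
 * SUFFIX becomes part of the function name via token pasting (##),
 * REAL is the floating-point type used inside the body. */
#define DEFINE_DOT(SUFFIX, REAL)                                  \
    REAL dot_##SUFFIX(const REAL *x, const REAL *y, size_t n)     \
    {                                                             \
        assert(x != NULL && y != NULL);                           \
        REAL sum = 0;                                             \
        for (size_t i = 0; i < n; ++i) {                          \
            sum += x[i] * y[i];                                   \
        }                                                         \
        return sum;                                               \
    }

/* Both symbols end up in the same object file / library. */
DEFINE_DOT(double, double) /* defines dot_double() */
DEFINE_DOT(single, float)  /* defines dot_single() */
```

A caller then picks the precision by name, for example `dot_double(xd, yd, n)` or `dot_single(xs, ys, n)`, which mirrors the way the commit exposes both variants side by side.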
  12. 24 Feb, 2016 2 commits
    • Add migration notice · 31a03aa2
      Andreas Marek authored
      31a03aa2
    • Optional build of ELPA without MPI · 49f119aa
      Andreas Marek authored
      The configure flag "--enable-shared-memory-only" triggers a build
      of ELPA without MPI support:
      
      - all MPI calls are skipped (or overloaded)
      - all calls to ScaLAPACK functions are replaced by the corresponding
        LAPACK calls
      - all calls to BLACS are skipped
      
      Using ELPA without MPI gives the same results as using ELPA with 1 MPI
      task!
      
      This version is not yet optimized for performance; in some places
      unnecessary copies are still made.
      
      This version is intended for users who do not have MPI in their
      application but would still like to use ELPA on one compute node.
      49f119aa
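The idea of skipping or overloading MPI calls can be sketched in C as follows. This is only an illustration under stated assumptions, not ELPA's implementation: the macro `WITH_MPI` and the wrapper `allreduce_sum` are hypothetical names. With a single task, a sum reduction returns the input unchanged, which is why the non-MPI branch is a plain copy and why the results match a run with 1 MPI task.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef WITH_MPI /* hypothetical build macro, set by the build system */
#include <mpi.h>
#endif

/* Sum `count` doubles element-wise over all tasks and store the result
 * in `recv`. Returns 0 on success. */
int allreduce_sum(const double *send, double *recv, int count)
{
    assert(count >= 0);
#ifdef WITH_MPI
    /* Real reduction across all MPI tasks. */
    return MPI_Allreduce(send, recv, count, MPI_DOUBLE, MPI_SUM,
                         MPI_COMM_WORLD) == MPI_SUCCESS ? 0 : 1;
#else
    /* One task only: the sum over tasks is the local data itself.
     * This copy is the kind of extra work the commit message mentions. */
    memcpy(recv, send, (size_t)count * sizeof(double));
    return 0;
#endif
}
```

Calling code stays identical in both builds; only the wrapper body changes with the build option.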
  13. 04 Feb, 2016 1 commit
  14. 02 Feb, 2016 10 commits
  15. 19 Jan, 2016 1 commit
  16. 05 Jan, 2016 1 commit
    • Updated all variable types · 62a29931
      Andreas Marek authored
      All variables (real, integer, complex) are now declared with the
      appropriate kind statement. The kind types are defined
      in src/mod_precision.f90.
      
      To ensure interoperability with C, the kind types are declared via
      iso_c_binding so that they match the corresponding C types.
      62a29931
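To show why kinds taken from iso_c_binding matter, here is a small C-side sketch. The function `scale_vector` is hypothetical and not part of ELPA; the Fortran interface in the comment is likewise only an assumed example of how such a routine could be declared. The standard correspondences it relies on are `c_double` with C `double` and `c_int` with C `int`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C routine. A Fortran caller could declare it as:
 *
 *   interface
 *     subroutine scale_vector(x, n, alpha) bind(C, name="scale_vector")
 *       use iso_c_binding, only: c_double, c_int
 *       real(kind=c_double), intent(inout) :: x(*)
 *       integer(kind=c_int), value         :: n
 *       real(kind=c_double), value         :: alpha
 *     end subroutine
 *   end interface
 *
 * Because the Fortran kinds come from iso_c_binding, the argument
 * sizes match the C types below on any conforming compiler pair. */
void scale_vector(double *x, int n, double alpha)
{
    assert(x != NULL && n >= 0);
    for (int i = 0; i < n; ++i) {
        x[i] *= alpha; /* scale each element in place */
    }
}
```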
  17. 11 Dec, 2015 2 commits
  18. 28 Oct, 2015 1 commit
    • This commit improves ELPA's performance on Intel(R) Xeon(R) E5v2 and E5v3 series CPUs by: · fe63372d
      Alexander Heinecke authored
      - enabling fusion of iterations of stage 5 in ELPA2 for every configuration
      - changing the reduction bandwidth in ELPA2 to be at least 64
      - partial OpenMP parallelization of the QR factorization in bandred_real
      - OpenMP parallelization of SYMM
      - OpenMP parallelization of SYR2K in bandred_real
      - OpenMP parallelization for elpa1_reduce_add_vectors and elpa1_transpose_vectors
      - AVX2 support in backtransformation elpa2_kernels (FMA3 instructions introduced with Haswell microarchitecture)
      fe63372d
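As a rough illustration of what an OpenMP parallelization of a SYR2K-type update looks like, the sketch below computes the lower triangle of C := alpha*(A*B^T + B*A^T) + beta*C with the row loop shared among threads. It is a naive reference loop written for this note, not the code from the commit: the function name `syr2k_lower`, the row-major layout, and the dynamic schedule are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Lower triangle of C := alpha*(A*B^T + B*A^T) + beta*C.
 * A and B are n-by-k, C is n-by-n, all stored row-major.
 * Entries above the diagonal of C are left untouched. */
void syr2k_lower(int n, int k, double alpha,
                 const double *A, const double *B,
                 double beta, double *C)
{
    assert(n >= 0 && k >= 0);
    /* Each row i writes only to C[i][0..i], so rows are independent.
     * Row i costs O(i*k), hence a dynamic schedule to balance load.
     * Without OpenMP enabled, the pragma is ignored and the loop runs
     * serially with the same result. */
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p) {
                sum += A[(size_t)i * k + p] * B[(size_t)j * k + p]
                     + B[(size_t)i * k + p] * A[(size_t)j * k + p];
            }
            C[(size_t)i * n + j] = alpha * sum + beta * C[(size_t)i * n + j];
        }
    }
}
```

A production code would normally call an optimized BLAS routine for the inner work; the loop form is used here only to make the thread decomposition visible.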
  19. 23 Mar, 2015 1 commit
  20. 11 Feb, 2015 1 commit
  21. 27 Jan, 2015 1 commit