    Andreas Marek
      Alignment error due to wrong stripe_width · f5feb969
      For single precision calculations the stripe_width must be a multiple
      of twice as many elements as for double precision, since 32-byte
      alignment is required and sizeof(float) is half of sizeof(double).
      This commit closes issue #18
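
The alignment arithmetic behind this fix can be sketched in C. This is only an illustration, not the ELPA code: the helper name `round_up_stripe_width` and the macro `ALIGNMENT_BYTES` are hypothetical, and the usual sizes of 4 bytes for float and 8 bytes for double are assumed.

```c
#include <stddef.h>

/* Alignment requirement in bytes, as stated in the commit message. */
#define ALIGNMENT_BYTES 32

/* Hypothetical helper: round `width` (counted in elements) up so that
 * width * elem_size is a multiple of ALIGNMENT_BYTES. */
static size_t round_up_stripe_width(size_t width, size_t elem_size)
{
    /* 32 / 8 = 4 elements for double, 32 / 4 = 8 elements for float */
    size_t elems_per_block = ALIGNMENT_BYTES / elem_size;
    return ((width + elems_per_block - 1) / elems_per_block) * elems_per_block;
}
```

For a requested width of 10, the double precision case rounds to 12 elements (96 bytes). Reusing 12 for single precision would give 48 bytes, which is not a multiple of 32; the single precision case has to round to 16 elements instead.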
    Andreas Marek
      Error in single precision SSE BLOCK 4 kernel · 789121d6
      The sub-kernels _8_ and _4_ were wrong
      This also solves problems with single precision SSE Block 6 kernel,
      since it also uses the Block 4 kernel
    Andreas Marek
      Additional configure check for gcc SSE intrinsics · 896388e9
      It turned out that if a CPU supports SSE, the already existing
      test for SSE assembly instructions always passes.
      However, the compilation of gcc SSE intrinsic instructions might
      nevertheless fail if gcc is not called with one of the options
      "-msse3", "-msse4", "-msse4.1", "-msse4.2", "-mavx", or "-mavx2"!
      Obviously gcc still does not consider SSE a standard on X86_64
      Intel CPUs.
      An additional configure test has been introduced, which tests for
      gcc SSE intrinsic instructions. If this test fails, the corresponding
      kernels are switched off.
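
Why the intrinsics depend on the compiler flags can be illustrated with a small C sketch. It is not the actual configure test: the macro `HAVE_SSE_INTRINSICS` and the function `sum_pair` are hypothetical. It relies on gcc defining `__SSE3__` only when SSE3 (or a superset such as AVX) is enabled on the command line, and it falls back to generic code otherwise, similar in spirit to switching the kernels off.

```c
#ifdef __SSE3__
#include <pmmintrin.h>           /* SSE3 intrinsics such as _mm_hadd_pd */
#define HAVE_SSE_INTRINSICS 1    /* hypothetical macro name */
#else
#define HAVE_SSE_INTRINSICS 0
#endif

/* Sum of two doubles: uses an SSE3 horizontal add if the compiler was
 * called with a suitable option, plain C otherwise. */
static double sum_pair(const double v[2])
{
#if HAVE_SSE_INTRINSICS
    __m128d x = _mm_loadu_pd(v);     /* x = [v0, v1] */
    __m128d s = _mm_hadd_pd(x, x);   /* s = [v0 + v1, v0 + v1] */
    return _mm_cvtsd_f64(s);         /* extract the lower element */
#else
    return v[0] + v[1];              /* generic fallback */
#endif
}
```

Compiled with plain `gcc`, the fallback branch is used; compiled with `gcc -msse3`, the intrinsic branch is used. Both return the same value.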
    Andreas Marek
      Introduction of new SSE kernels with different blocking · 69792b15
      The SSE kernels with blocking of 2,4,6 (real case) and 1,2 (complex
      case) are now available by default.
      Thus the following changes have been made:
      - introduce new macros in configure.ac and Makefile.am
      - rename the AVX kernels to AVX_AVX2 (they also support AVX2)
      - introduce new files with SSE kernels
      - introduce new kernel parameters!
      - make the SSE kernels callable
      The results are identical with previous kernels
    Andreas Marek
      Allow ELPA to be built with single and double precision symbols in one · 647aa5a8
      If the configure option "--enable-single-precision" is specified,
      ELPA will also be built for single precision usage. The double precision
      and single precision versions will be available at the same time with names
      "solve_evp_real_1stage_double" or "solve_evp_real_1stage_single" and
      so on...
      This change implied some major refactoring of the ELPA code:
      1.) functions/procedures had to be renamed with suffix "_double"
      2.) If necessary the same functions have to be available with suffix
      3.) Variable kind definitions have to be consistent with the
      intended use
      To avoid unnecessary code duplication this is done (most of the time)
      with preprocessor string substitution.
      The documentation has been updated.
      NOT SUPPORTED are at the moment:
      - single precision usage of ELPA2 with kernels other than "generic"
        and "generic_simple"
      - single precision usage of GPU
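
The preprocessor substitution mentioned above can be sketched as follows. ELPA applies it to Fortran sources; C is used here only to show the technique. The macro `DEFINE_SCALE_VECTOR` and the function names `scale_vector_double` / `scale_vector_single` are hypothetical and do not appear in ELPA.

```c
/* One generic body, instantiated once per precision. The suffix is
 * pasted onto the function name, so both symbols coexist in one build. */
#define DEFINE_SCALE_VECTOR(REAL, SUFFIX)                           \
    static void scale_vector##SUFFIX(REAL *x, int n, REAL alpha)    \
    {                                                               \
        for (int i = 0; i < n; ++i)                                 \
            x[i] *= alpha;                                          \
    }

DEFINE_SCALE_VECTOR(double, _double)   /* defines scale_vector_double */
DEFINE_SCALE_VECTOR(float,  _single)   /* defines scale_vector_single */
```

A caller picks the precision by name, for example `scale_vector_double(d, n, 2.0)` or `scale_vector_single(f, n, 2.0f)`, while the loop body is written only once.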
    Andreas Marek
      Add migration notice · 31a03aa2
    Andreas Marek
      Optional build of ELPA without MPI · 49f119aa
      The configure flag "--enable-shared-memory-only" triggers a build
      of ELPA without MPI support:
      - all MPI calls are skipped (or overloaded)
      - all calls to scalapack functions are replaced by the corresponding
        lapack calls
      - all calls to blacs are skipped
      Using ELPA without MPI gives the same results as using ELPA with 1 MPI
      This version is not yet optimized for performance; here and there some
      unnecessary copies are done.
      This version is intended for users who do not have MPI in their
      application but still would like to use ELPA on one compute node
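
One way to "overload" MPI calls for a build without MPI is to provide serial stand-ins behind a build macro. The sketch below assumes this approach and is not the ELPA implementation: the macro `WITH_MPI`, the stub definitions, and the function `local_rows` are hypothetical.

```c
#ifdef WITH_MPI                 /* hypothetical build macro */
#include <mpi.h>                /* real MPI; MPI_Init must be called first */
#else
/* Serial stand-ins: a single process with rank 0. */
typedef int MPI_Comm;
#define MPI_COMM_WORLD 0
#define MPI_SUCCESS 0
static int MPI_Comm_rank(MPI_Comm comm, int *rank)
{
    (void)comm;
    *rank = 0;
    return MPI_SUCCESS;
}
static int MPI_Comm_size(MPI_Comm comm, int *size)
{
    (void)comm;
    *size = 1;
    return MPI_SUCCESS;
}
#endif

/* Number of matrix rows owned by this process under a simple block
 * distribution; the calling code is identical in both builds. */
static int local_rows(int n_global)
{
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int base = n_global / size;
    int rem  = n_global % size;
    return base + (rank < rem ? 1 : 0);
}
```

Without `WITH_MPI`, `local_rows(100)` returns 100, because the single process owns every row.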
