1. 02 Jun, 2020 1 commit
  2. 20 Nov, 2019 1 commit
      Rewrite compute_hh_trafo CUDA kernels · 6cd5a4f1
      Wenzhe Yu authored
      * Switch to a simple non-WY algorithm
      * Unify real and complex cases
      * Update reduction kernel
      * Use __shfl_xor_sync for warp reduce (CUDA 9+)
      * Support 2^n block size, n = 1,2,...,10
      * Use templates when possible
      * Clean up unused CUDA functions
      * Increase default stripe width when using GPU
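The commit message mentions a warp reduce built on `__shfl_xor_sync` and support for block sizes 2^n, n = 1,...,10. The following is a minimal sketch of that general technique, not the code from this commit: the names `warp_reduce_sum`, `block_reduce_sum`, `sum_kernel`, and `run_block_sum` are hypothetical, and the two-stage layout (shuffle within each warp, then one shared-memory pass across warps) is an assumption about how such a reduction is commonly written.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Butterfly (XOR) sum over the first WIDTH lanes of a warp (WIDTH = 2..32, power of two).
// After the loop, every participating lane holds the sum of all WIDTH inputs.
template <typename T, int WIDTH>
__device__ T warp_reduce_sum(T val) {
    constexpr unsigned mask = 0xffffffffu >> (32 - WIDTH);  // lanes 0..WIDTH-1
    for (int offset = WIDTH / 2; offset > 0; offset >>= 1)
        val += __shfl_xor_sync(mask, val, offset);          // requires CUDA 9+
    return val;
}

// Block-wide sum for a 1D block of BLOCK threads. The result is valid in warp 0
// (in particular in thread 0); other warps only hold their own partial sums.
template <typename T, int BLOCK>
__device__ T block_reduce_sum(T val) {
    static_assert(BLOCK >= 2 && BLOCK <= 1024 && (BLOCK & (BLOCK - 1)) == 0,
                  "BLOCK must be 2^n with n = 1..10");
    constexpr int WIDTH = BLOCK < 32 ? BLOCK : 32;
    val = warp_reduce_sum<T, WIDTH>(val);                   // stage 1: within each warp

    if (BLOCK > 32) {                                       // stage 2: across warps
        constexpr int NWARPS = BLOCK / 32;
        __shared__ T partial[32];
        const int lane = threadIdx.x % 32;
        const int warp = threadIdx.x / 32;
        if (lane == 0) partial[warp] = val;                 // one partial sum per warp
        __syncthreads();
        if (warp == 0) {
            val = (lane < NWARPS) ? partial[lane] : T(0);   // pad unused lanes with zero
            val = warp_reduce_sum<T, 32>(val);
        }
    }
    return val;
}

template <typename T, int BLOCK>
__global__ void sum_kernel(const T* in, T* out) {
    T total = block_reduce_sum<T, BLOCK>(in[threadIdx.x]);
    if (threadIdx.x == 0) *out = total;
}

// Host helper (hypothetical): sums BLOCK ones on the GPU and returns the result.
template <int BLOCK>
float run_block_sum() {
    std::vector<float> host(BLOCK, 1.0f);
    float *d_in = nullptr, *d_out = nullptr, result = -1.0f;
    cudaMalloc(&d_in, BLOCK * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, host.data(), BLOCK * sizeof(float), cudaMemcpyHostToDevice);
    sum_kernel<float, BLOCK><<<1, BLOCK>>>(d_in, d_out);
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}

int main() {
    std::printf("block   2: %.0f\n", run_block_sum<2>());
    std::printf("block 256: %.0f\n", run_block_sum<256>());
    std::printf("block 1024: %.0f\n", run_block_sum<1024>());
    return 0;
}
```

Passing the block size as a template parameter lets the compiler unroll the shuffle loop and drop the shared-memory stage entirely for blocks of 32 threads or fewer, which is one plausible reading of "Use templates when possible"; the actual kernels may be organized differently.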
  3. 03 Aug, 2017 1 commit
  4. 18 Apr, 2017 1 commit
  5. 06 Apr, 2017 1 commit
  6. 21 Mar, 2017 2 commits
  7. 09 Feb, 2017 1 commit
  8. 07 Feb, 2017 1 commit