- 02 Jun, 2020 1 commit
-
-
Andreas Marek authored
-
- 20 Nov, 2019 1 commit
-
-
Wenzhe Yu authored
* Switch to a simple non-WY algorithm
* Unify real and complex cases
* Update reduction kernel
* Use __shfl_xor_sync for warp reduce (CUDA 9+)
* Support 2^n block size, n = 1,2,...,10
* Use templates when possible
* Clean up unused CUDA functions
* Increase default stripe width when using GPU
-
- 16 Jan, 2019 1 commit
-
-
Andreas Marek authored
-
- 11 Jan, 2019 1 commit
-
-
Andreas Marek authored
-
- 03 Sep, 2017 1 commit
-
-
Andreas Marek authored
This closes issue #51.
-
- 02 Sep, 2017 1 commit
-
-
Andreas Marek authored
This closes issue #51.
-
- 03 Aug, 2017 1 commit
-
-
Lorenz Huedepohl authored
Anything if it makes Andreas happy :)
-
- 29 May, 2017 1 commit
-
-
Andreas Marek authored
-
- 06 Apr, 2017 1 commit
-
-
Andreas Marek authored
-
- 03 Apr, 2017 1 commit
-
-
Lorenz Huedepohl authored
-
- 29 Mar, 2017 1 commit
-
-
Andreas Marek authored
- the functions now contain the appropriate real/complex in their names
- unused functions have been removed as cleanup
-
- 21 Mar, 2017 1 commit
-
-
Andreas Marek authored
-