- 16 Mar, 2020 1 commit
-
-
Andreas Marek authored
-
- 09 Mar, 2020 1 commit
-
-
Andreas Marek authored
-
- 06 Mar, 2020 1 commit
-
-
Andreas Marek authored
-
- 05 Mar, 2020 5 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
Redistribute
See merge request !27
-
Andreas Marek authored
-
Andreas Marek authored
-
- 04 Mar, 2020 5 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
- 03 Mar, 2020 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 02 Mar, 2020 3 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
- 29 Feb, 2020 1 commit
-
-
Andreas Marek authored
-
- 28 Feb, 2020 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 21 Feb, 2020 1 commit
-
-
Andreas Marek authored
-
- 19 Dec, 2019 3 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
- 13 Dec, 2019 1 commit
-
-
Andreas Marek authored
-
- 10 Dec, 2019 3 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Andreas Marek authored
-
- 06 Dec, 2019 1 commit
-
-
Andreas Marek authored
-
- 04 Dec, 2019 3 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
Wenzhe Yu authored
-
- 20 Nov, 2019 5 commits
-
-
Wenzhe Yu authored
-
Wenzhe Yu authored
* Removed redundant malloc, memset and memcpy
* Use pinned host memory
* Implemented blocking for GPU code path in step5
* Removed unused code
-
Wenzhe Yu authored
* cudaMallocHost
* cudaFreeHost
* cudaHostRegister
* cudaHostUnregister
-
Wenzhe Yu authored
* Switch to a simple non-WY algorithm
* Unify real and complex cases
* Update reduction kernel
* Use __shfl_xor_sync for warp reduce (CUDA 9+)
* Support 2^n block size, n = 1,2,...,10
* Use templates when possible
* Clean up unused CUDA functions
* Increase default stripe width when using GPU
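The warp reduce mentioned in this commit follows the usual XOR butterfly exchange pattern. The sketch below is not the code from the commit; it is a host-side C++ model of that pattern, written so it can be compiled without a GPU. The names `kWarpSize`, `butterfly_step` and `warp_allreduce_sum` are hypothetical, and the warp size of 32 is an assumption. On a device, each step would typically be a single `val += __shfl_xor_sync(0xffffffff, val, offset);` per thread.

```cpp
#include <array>

// Assumed warp size; 32 on current NVIDIA GPUs.
constexpr int kWarpSize = 32;

// Host-side model of one butterfly step: every "lane" adds the value held
// by its partner lane (lane XOR offset). On a GPU this exchange would be
// done with __shfl_xor_sync instead of an explicit loop over lanes.
std::array<double, kWarpSize> butterfly_step(
    const std::array<double, kWarpSize>& v, int offset) {
    std::array<double, kWarpSize> out{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        out[lane] = v[lane] + v[lane ^ offset];
    }
    return out;
}

// Full reduction: log2(kWarpSize) steps with offsets 16, 8, 4, 2, 1.
// After the last step every lane holds the sum over all lanes.
std::array<double, kWarpSize> warp_allreduce_sum(
    std::array<double, kWarpSize> v) {
    for (int offset = kWarpSize / 2; offset > 0; offset /= 2) {
        v = butterfly_step(v, offset);
    }
    return v;
}
```

Because the partner index is computed with XOR, every lane ends up with the full sum, so no separate broadcast step is needed afterwards. The halving offsets also suggest why power-of-two sizes, as in the 2^n block size item above, fit this pattern naturally.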
-
Andreas Marek authored
-
- 14 Nov, 2019 1 commit
-
-
Andreas Marek authored
-
- 11 Nov, 2019 1 commit
-
-
Andreas Marek authored
-