# Performance Variables

We provide a list of the performance variables, which change the code’s computational performance without modifying the numerical results (see chapter 3 of the user manual). They are passed as environment variables, set in bash with the **export** command.
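
As a minimal sketch, the variables described in this chapter could be set in a bash session as follows. The values shown are simply the documented defaults. The command that launches the program is left out because it depends on your installation.

```shell
# Set the performance variables for the current shell session and for
# any program started from it. The values are the documented defaults.
export FFTALGO=1   # use the Fourier-algorithm
export GPU=0       # use only the CPU
export GPUALGO=2   # only relevant if GPU=1 and FFTALGO=0

# Print the settings for a quick check before starting a run.
echo "FFTALGO=$FFTALGO GPU=$GPU GPUALGO=$GPUALGO"
```

Because the variables are exported, they stay in effect for every run started from the same shell until they are changed or the session ends.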
- **FFTALGO**=[x] (Default: 1) Set to 1 to enable the Fourier-algorithm or to 0 to use the non-Fourier-algorithm.
- **GPU**=[x] (Default: 0) Set to 1 to enable GPU usage, set to 0 to use only the CPU. The GPU will be used to calculate the probabilities for all particle-images. The preparation of projections and PSF convolutions will be processed by the CPU. This is arranged in a pipeline to ensure continuous GPU utilization.
- **GPUALGO**=[x] (Default: 2) This option is only relevant if GPU=1 and FFTALGO=0. Hence, it is commonly not used, since FFTALGO defaults to 1. For the non-Fourier-algorithm there are three GPUALGO implementations:
  - x=2: Parallelizes over the particle-images and over the center displacements. This approach requires less memory bandwidth than GPUALGO=0 or GPUALGO=1. However, it constrains the problem configuration: the number of center displacements per dimension must be i) a power of 2 and ii) a factor of the number of CUDA threads per block.
  - x=1: Parallelizes over the particle-images and then loops over the center displacements on the GPU. It is usually slower than GPUALGO=2, but it places no constraints on the problem configuration. The particle-images are not processed all at once but in chunks.
  - x=0: Like GPUALGO=1, except that all particle-images are processed at once. It is always slower than GPUALGO=1 and should no longer be used.
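
The two constraints on GPUALGO=2 can be checked before a run. The following is a sketch only: `gpualgo2_ok` is a hypothetical helper, not part of the code, and the values 8 (center displacements per dimension) and 256 (CUDA threads per block) are example numbers that you would replace with those of your own configuration and build.

```shell
# Hypothetical helper: succeeds (returns 0) if the GPUALGO=2 constraints hold.
#   $1 = number of center displacements per dimension
#   $2 = number of CUDA threads per block (assumed known for your build)
gpualgo2_ok() {
    n=$1
    threads=$2
    # Both values must be positive integers.
    [ "$n" -gt 0 ] && [ "$threads" -gt 0 ] || return 1
    # Power of 2: exactly one bit is set, so n & (n - 1) is 0.
    [ $(( n & (n - 1) )) -eq 0 ] || return 1
    # Factor of the thread count: threads is divisible by n.
    [ $(( threads % n )) -eq 0 ]
}

# Select GPUALGO=2 only if the constraints hold, otherwise fall back to 1,
# which has no constraints on the problem configuration.
if gpualgo2_ok 8 256; then
    export GPUALGO=2
else
    export GPUALGO=1
fi
echo "GPUALGO=$GPUALGO"
```

Falling back to GPUALGO=1 rather than GPUALGO=0 follows the descriptions above, since GPUALGO=0 is always the slower of the two.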