diff --git a/Manual/BioEM_Manual.pdf b/Manual/BioEM_Manual.pdf
index 6c301054c81627e636bbb091190494001f7dc6b0..1e9f7b5a4f685a070024ecb5af0c9cd072959134 100644
Binary files a/Manual/BioEM_Manual.pdf and b/Manual/BioEM_Manual.pdf differ
diff --git a/Manual/BioEM_Manual.tex b/Manual/BioEM_Manual.tex
index 75679a6977d748b157c3c46bb636dcaec7a1edcb..eeaaca2e4d94b877b02216acd3e04c24e4d55b84 100644
--- a/Manual/BioEM_Manual.tex
+++ b/Manual/BioEM_Manual.tex
@@ -1,8 +1,9 @@
 %++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 % < BioEM software Manual for Bayesian inference of Electron Microscopy images>
-% Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
-% Volker Lindenstruth and Gerhard Hummer.
+% Copyright (C) 2017 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
+% Luka Stanisic, Volker Lindenstruth and Gerhard Hummer.
 % Max Planck Institute of Biophysics, Frankfurt, Germany.
+% Max Planck Computing and Data Facility, Garching, Germany.
 %
 % See license statement for terms of distribution.
 %
@@ -1207,6 +1208,11 @@ for the current orientation and convolution.
 finished calculating the probabilities for the current orientation and PSF convolution.
 \end{itemize}
+\fbox{%
+\parbox{12cm}{
+{\texttt{export GPUWORKLOAD=-1}}}}
+The autotuning algorithm will automatically and empirically determine how to distribute the workload between CPU and GPU. This is the default behavior.
+
 \item {\it Multiple Projections at once via OpenMP:} BioEM can prepare the projection of multiple orientations at once using OpenMP. The benefit compared to the pure OpenMP parallelization over the particle images, however, is tiny. This is relevant if MPI is not used, OpenMP is used, GPU is used,
@@ -1266,7 +1272,7 @@ automatically.
 Naturally, different methods of parallelization can be combined:
 \begin{itemize}
 \item[--] As described in the GPU section, one can combine MPI with the GPU algorithm to use multiple GPUs at once.
-\item[--] One can use GPUs and CPU cores jointly to calculate the probabilities for all particle-images with the \texttt{GPUWORKLOAD=[x]} variable.
+\item[--] One can use GPUs and CPU cores jointly to calculate the probabilities for all particle-images.
 For more than one GPU, MPI must be employed. In this case, the number of MPI processes must match the number of GPUs. So it is important to combine MPI, and OpenMP inside one node in order to use all CPU cores.
 \end{itemize}
@@ -1315,10 +1321,10 @@ node1) must be placed as (node0, node0, node1, node1) and not as (node0, node1,
 because in the latter case only 1 GPU per node will be used (by two MPI processes each).
 \end{itemize}
 
-\item \texttt{GPUWORKLOAD=[x]} (Default: 100)
+\item \texttt{GPUWORKLOAD=[x]} (Default: -1)
 Only relevant if \texttt{GPU=1}.
-This defines the fraction of the workload in percent. To be precise: the fraction of the number
-of particle-images processed by the GPU. The remaining particle-images will be processed by the CPU. Preparation
+When set, this defines the fraction of the workload in percent. To be precise: the fraction of the number
+of particle-images processed by the GPU. The remaining particle-images will be processed by the CPU. If this variable is not set, or is set to -1, and there is a significant number of comparisons to perform ({\it i.e.} \texttt{BIOEM\_DEBUG\_BREAK>7}), the autotuning algorithm will automatically and empirically determine how to distribute the workload between CPU and GPU. Preparation
 of projection and convolution will be processed by the CPU in any case.
 
 \item \texttt{GPUASYNC=[x]} (Default: 1)
@@ -1385,17 +1391,9 @@ The memory footprint increases with \texttt{x}, so it should not be too large.
 For best performance, choose a multiple of the number of OpenMP threads.
 
 \item \texttt{GPU=1} should be used if a GPU is available.
-Performance wise, one Titan GPU corresponds roughly to 20 cores at 3 GHz.
-If the CPU has significant compute capabilities, one should use \texttt{GPUWORKLOAD=[x]} for combined
-CPU / GPU processing.
+ Performance wise, one Titan GPU corresponds roughly to 20 cores at 3 GHz. Combined CPU / GPU processing is used by default, and the autotuning algorithm automatically and empirically determines how to distribute the workload between CPU and GPU. If \texttt{BIOEM\_DEBUG\_OUTPUT=2}, the output message for each comparison contains the chosen workload balance. At the beginning of the execution, several different values are used until the workload balance converges to the optimal value.
 
-\item If one uses combined CPU / GPU processing, a good value for \texttt{GPUWORKLOAD=[x]} must be determined.
-A starting point is to measure the CPU and GPU individually with
-
-\texttt{BIOEM\_DEBUG\_OUTPUT=2; BIOEM\_DEBUG\_BREAK=4; GPU=0/1}
-
-and compare the time of CPU to that of the GPU. Assuming that the CPU takes \texttt{c} seconds for the comparison, and the
-GPU takes \texttt{g} second, a good starting point for \texttt{GPUWORKLOAD=[x]} is \texttt{x = 100 * c / (c + g)}.
+\item After a certain number of projections, autotuning is automatically repeated to recalibrate the optimal workload distribution. This corrects cases where the performance of the processing units changed during the run or where the initial workload balance was computed poorly due to experimental noise. The autotuning algorithm introduces a minor overhead to the whole run, which is typically less than 10 milliseconds per calibration. To avoid this overhead, or simply to have full manual control over the fraction of the workload executed on the GPU, one should use \texttt{GPUWORKLOAD=[x]}.
 
 \item On a single node, one should use OpenMP parallelization for many particle-images and few orientations; and MPI parallelization for few particle-images and many orientations.
 Assume a system with \texttt{N} CPU cores, the command
@@ -1448,9 +1446,7 @@ processes matches the number of GPUs:
 
 \texttt{BIOEM\_PROJECTIONS\_AT\_ONCE=[4*C/G] OMP\_NUM\_THREADS=[C/G] GPU=1}
 
-\texttt{GPUDEVICE=-1 GPUWORKLOAD=[x] mpirun -n [N*G]}
-
-Here, \texttt{GPUWORKLOAD} must be tuned as described before.
+\texttt{GPUDEVICE=-1 GPUWORKLOAD=-1 mpirun -n [N*G]}
 
 \end{itemize}
 
diff --git a/autotuner.cpp b/autotuner.cpp
index 125fc6c651579f8c0259a773a1326d410a716273..7fa0e09870791b3e6b0171482a153998917b59cb 100644
--- a/autotuner.cpp
+++ b/autotuner.cpp
@@ -1,3 +1,13 @@
+/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ < BioEM software for Bayesian inference of Electron Microscopy images>
+ Copyright (C) 2017 Pilar Cossio, Markus Rampp, Luka Stanisic and Gerhard Hummer.
+ Max Planck Institute of Biophysics, Frankfurt, Germany.
+ Max Planck Computing and Data Facility, Garching, Germany.
+
+ See license statement for terms of distribution.
+
+ ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*/
+
 #include "autotuner.h"
 
 void Autotuner::Reset()
diff --git a/bioem.cpp b/bioem.cpp
index ddaf52e6147bc2beff195fcbe8c16a90206a8898..c66439ae4aba630ac288b78ed404ee9906352abe 100644
--- a/bioem.cpp
+++ b/bioem.cpp
@@ -1,8 +1,7 @@
 /* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 < BioEM software for Bayesian inference of Electron Microscopy images>
- Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
- Volker Lindenstruth and Gerhard Hummer.
-
+ Copyright (C) 2017 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
+ Luka Stanisic, Volker Lindenstruth and Gerhard Hummer.
 Max Planck Institute of Biophysics, Frankfurt, Germany.
 Frankfurt Institute for Advanced Studies, Goethe University Frankfurt, Germany.
 Max Planck Computing and Data Facility, Garching, Germany.
diff --git a/bioem_cuda.cu b/bioem_cuda.cu
index 71a5c97ee2f30633ad69d64f139fc5cac4064798..08a10ab9c320a357f24e60b5742a89b636d34163 100644
--- a/bioem_cuda.cu
+++ b/bioem_cuda.cu
@@ -1,8 +1,9 @@
 /* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 < BioEM software for Bayesian inference of Electron Microscopy images>
- Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
- Volker Lindenstruth and Gerhard Hummer.
+ Copyright (C) 2017 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
+ Luka Stanisic, Volker Lindenstruth and Gerhard Hummer.
 Max Planck Institute of Biophysics, Frankfurt, Germany.
+ Max Planck Computing and Data Facility, Garching, Germany.
 
 See license statement for terms of distribution.
 
diff --git a/include/autotuner.h b/include/autotuner.h
index 10db9ca8d21810f883d4d3bbf74dd9895a9e1498..16cdb225d99ef9952aa66c7f669d02d90665463b 100644
--- a/include/autotuner.h
+++ b/include/autotuner.h
@@ -1,8 +1,8 @@
 /* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 < BioEM software for Bayesian inference of Electron Microscopy images>
- Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
- Volker Lindenstruth and Gerhard Hummer.
+ Copyright (C) 2017 Pilar Cossio, Markus Rampp, Luka Stanisic and Gerhard Hummer.
 Max Planck Institute of Biophysics, Frankfurt, Germany.
+ Max Planck Computing and Data Facility, Garching, Germany.
 
 See license statement for terms of distribution.
 
diff --git a/include/bioem.h b/include/bioem.h
index 98222791db36eb60d1061d63c74793719364cf24..ff6c253e9ba83d5114903387ba3b6106405a9438 100644
--- a/include/bioem.h
+++ b/include/bioem.h
@@ -1,7 +1,7 @@
 /* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 < BioEM software for Bayesian inference of Electron Microscopy images>
- Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
- Volker Lindenstruth and Gerhard Hummer.
+ Copyright (C) 2017 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
+ Luka Stanisic, Volker Lindenstruth and Gerhard Hummer.
 Max Planck Institute of Biophysics, Frankfurt, Germany.
 Frankfurt Institute for Advanced Studies, Goethe University Frankfurt, Germany.
 Max Planck Computing and Data Facility, Garching, Germany.
diff --git a/include/bioem_cuda_internal.h b/include/bioem_cuda_internal.h
index 708b40fb3e9a2a10b265965baaa224244159b975..3bbcfb0c15c04a5477cc53fa5e9600a4d9d81669 100644
--- a/include/bioem_cuda_internal.h
+++ b/include/bioem_cuda_internal.h
@@ -1,7 +1,7 @@
 /* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 < BioEM software for Bayesian inference of Electron Microscopy images>
- Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
- Volker Lindenstruth and Gerhard Hummer.
+ Copyright (C) 2017 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
+ Luka Stanisic, Volker Lindenstruth and Gerhard Hummer.
 Max Planck Institute of Biophysics, Frankfurt, Germany.
 Frankfurt Institute for Advanced Studies, Goethe University Frankfurt, Germany.
 Max Planck Computing and Data Facility, Garching, Germany.
diff --git a/include/defs.h b/include/defs.h
index b7338ca8246126becf8cebce6191b0fa7ca86ec6..80abbeb8f169b444447041aafda0f21f237392c0 100644
--- a/include/defs.h
+++ b/include/defs.h
@@ -1,7 +1,7 @@
 /* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 < BioEM software for Bayesian inference of Electron Microscopy images>
- Copyright (C) 2016 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
- Volker Lindenstruth and Gerhard Hummer.
+ Copyright (C) 2017 Pilar Cossio, David Rohr, Fabio Baruffa, Markus Rampp,
+ Luka Stanisic, Volker Lindenstruth and Gerhard Hummer.
 Max Planck Institute of Biophysics, Frankfurt, Germany.
 Frankfurt Institute for Advanced Studies, Goethe University Frankfurt, Germany.
 Max Planck Computing and Data Facility, Garching, Germany.
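Note on manual control: the manual text removed by this change gave a starting value for GPUWORKLOAD from separately measured timings, x = 100 * c / (c + g), where c is the CPU time and g the GPU time for the comparison. Anyone who sets GPUWORKLOAD by hand instead of relying on autotuning (GPUWORKLOAD=-1) can still use that rule. The sketch below only illustrates the arithmetic. The function name initial_gpu_workload is hypothetical and not part of BioEM, and rounding to a whole percent is an assumption made here for convenience.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of BioEM): starting value for GPUWORKLOAD,
// in percent, from separately measured comparison times.
//   c: seconds the CPU-only run (GPU=0) needs for the comparison
//   g: seconds the GPU-only run (GPU=1) needs for the comparison
// Implements x = 100 * c / (c + g); rounding to an integer is an assumption.
int initial_gpu_workload(double c, double g)
{
    assert(c > 0.0 && g > 0.0);  // both timings must be positive
    return static_cast<int>(std::lround(100.0 * c / (c + g)));
}
```

For example, if the CPU needs 30 s and the GPU 10 s, the GPU is three times faster, so it should receive three quarters of the particle-images: initial_gpu_workload(30.0, 10.0) evaluates to 75.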