++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

   < BioEM software for Bayesian inference of Electron Microscopy images>

   Copyright (C) 2014 Pilar Cossio, David Rohr and Gerhard Hummer.
   Max Planck Institute of Biophysics, Frankfurt, Germany.

   See license statement for terms of distribution.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

*************************************************************
    BioEM:  Bayesian inference of Electron Microscopy
*************************************************************

	     PRE-ALPHA VERSION: November, 2014

**************************************************************
Requisites: 

	**** FFTW libraries:   http://www.fftw.org/

        **** BOOST libraries:  http://www.boost.org/

        **** OpenMP:           http://openmp.org/

Optional: 
        **** CMake:            http://www.cmake.org/
	     for compilation with the CMakeLists.txt file.
 
        **** CUDA:             Parallel code for GPUs.
  
***************************************************************

DESCRIPTION:

 *** The main objective of the BioEM code is to compare one Model to multiple experimental
     EM images, obtaining a posterior probability using Bayesian analysis with the
     mathematical details explained in Ref. [1].

 *** Running the compiled executable ./bioEM prints the command line
     input options and help.


      ++++++++++++ FROM COMMAND LINE +++++++++++

	  --Modelfile arg       (Mandatory) Name of model file
	  --Particlesfile arg   (Mandatory for BioEM) Name of particles file
	  --Inputfile arg       (Mandatory for BioEM) Name of input parameter file
	  --PrintBestCalMap arg (Optional) Only print best calculated map (file needed).
                        No BioEM calculation is performed (!)
	  --ReadEulerAngles arg (Optional) Read Euler angle list instead of uniform 
                        grid (file nec.)
	  --ReadPDB             (Optional) If reading model file in PDB format
	  --ReadMRC             (Optional) If reading particle file in MRC format
	  --ReadMultipleMRC     (Optional) If reading Multiple MRCs
	  --DumpMaps            (Optional) Dump maps after they were read from maps file
	  --LoadMapDump         (Optional) Read Maps from dump instead of maps file
	  --help                (Optional) Produce help message


	BioEM has four main inputs:
	1) Command line, where the filenames of the Model, Parameter ranges and Particles
	should be provided (plus the extra options listed above).
	2) The Model file should contain the coordinates of the model, either in PDB or
	txt format (see below).
	3) The parameter file should contain all the parameter ranges; additional
	features can also be included (see below).
	4) The particle file should contain the EM images. It can be in text format
	or in MRC format; this should be specified in the command line (see below).
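
	As an illustration of how these inputs combine on the command line, the
	following Python sketch assembles the argument list for a standard run.
	Only the options listed above are used; the helper name
	build_bioem_command and the file names are hypothetical and not part of
	BioEM.

```python
import shlex


def build_bioem_command(model, particles, inputfile,
                        read_pdb=False, read_mrc=False,
                        read_multiple_mrc=False, executable="./bioEM"):
    """Assemble the argument list for a standard BioEM run.

    Hypothetical helper: only flags documented in this guide are used.
    """
    cmd = [executable,
           "--Modelfile", model,
           "--Particlesfile", particles,
           "--Inputfile", inputfile]
    if read_pdb:
        cmd.append("--ReadPDB")
    # The guide gives --ReadMultipleMRC together with --ReadMRC.
    if read_mrc or read_multiple_mrc:
        cmd.append("--ReadMRC")
    if read_multiple_mrc:
        cmd.append("--ReadMultipleMRC")
    return cmd


# Placeholder file names, for illustration only.
cmd = build_bioem_command("model.pdb", "particles.mrc", "param.txt",
                          read_pdb=True, read_mrc=True)
print(shlex.join(cmd))
```

	The returned list can be passed to subprocess.run once the executable
	has been compiled.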
	

  


 *** TUTORIAL DIRECTORY:
	
     A directory with example EM particles, a c-alpha PDB & simple Model, and
     the corresponding launch scripts is provided.
     -- Standard input file parameters are provided and recommended.








 ** EXPERIMENTAL IMAGE FORMAT:
     Two options are allowed for the map-particle files:
    	 A) Simple *.txt or *.dat file with data formatted as
	    printf("%8d%8d%16.8f\n"), where the first two columns are
	    the pixel indexes and the third column is the intensity.
	    Multiple particles are read from the same file, separated by
	    the separator "PARTICLE" & Number.
	    Pixel indexes should start at 0 and all pixels should be
	    in the file.
            -- For this case it is recommended that all particles
	    be normalized to zero average and unit standard deviation.
	 Example in::	 

         B) Standard MRC particle file. If reading multiple MRC files,
            provide in the command line
		  --Particlesfile FILE --ReadMRC --ReadMultipleMRC
	    where FILE contains the names of each mrc file to be read.
	    If reading only one MRC file, provide in the command line
                  --Particlesfile FILEMRC --ReadMRC
            where FILEMRC is the name of the single mrc file.
	    By default, particles read from MRC files are normalized
	    to zero average and unit standard deviation.
	    Each MRC file can contain multiple particles.
	 Example in::

	Note:: The .mrc extension is not mandatory for reading MRC files, but a
	warning is printed if it is missing.
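
	The text format of option A can be sketched in Python as follows. This is
	a minimal illustration, not code from BioEM: the exact layout of the
	separator line ("PARTICLE" followed by a number), the numbering starting
	at 0, the order of the two pixel indexes, and the use of the population
	standard deviation are all assumptions.

```python
import io
import math


def normalize(image):
    """Rescale a 2D list of intensities to zero average and unit standard deviation."""
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    # ASSUMPTION: population standard deviation (divide by N).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        raise ValueError("a constant image cannot be normalized")
    return [[(v - mean) / std for v in row] for row in image]


def write_particles(stream, images):
    """Write every pixel of every image in the "%8d%8d%16.8f" text layout."""
    for number, image in enumerate(images):
        # ASSUMPTION: separator written as "PARTICLE <number>", counted from 0.
        stream.write("PARTICLE %d\n" % number)
        for i, row in enumerate(normalize(image)):
            for j, value in enumerate(row):
                # Pixel indexes start at 0 and all pixels are written.
                stream.write("%8d%8d%16.8f\n" % (i, j, value))


# Tiny 2x2 synthetic image, for illustration only.
buf = io.StringIO()
write_particles(buf, [[[0.0, 1.0], [2.0, 3.0]]])
text = buf.getvalue()
print(text, end="")
```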

    Useful key words for processing multiple models:
	--DumpMaps
    writes the maps to the file maps.dump in XX format, so that it is faster to re-read them.
    To read the dump, use
	--LoadMapDump

 ** MODEL FORMAT:
     	
     A) Standard PDB file: Only the CA atoms and their corresponding
	residues are read, each with the proper density.
	The following key word is needed in the command line::
		--ReadPDB
	It is also recommended to include the key word "PROJECT_RADIUS" in the
	parameter file, so that the CA atoms are modeled as spheres
	with the proper number of electrons and van der Waals radius
	corresponding to each amino acid. If this key word is not given, the
	elements will be considered as points.

	Note:: The .pdb extension is not mandatory for reading PDB files, but a
	warning is printed if it is missing.
	
     B) *.txt or *.dat file: With format printf("%f %f %f %f %f\n"),
     	where the first three columns are the coordinates of the atoms or
	voxels, the fourth column is the radius (in Å) and the
     	last column is the corresponding density::
	----  x   y   z  radius  density -------
     	(Useful for all-atom representations or 3D EM density maps).
      	The key word "PROJECT_RADIUS" is needed to consider
      	the elements in the coordinate file as spheres and project their radius.
      	If this key word is not given, the elements will be considered as points.
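
	As a minimal sketch of the text model format of option B, the Python
	functions below write and re-read a five-column model. The function names
	and the numerical values are hypothetical, and the radius is assumed to
	be given in Å as stated above.

```python
import io


def write_model(stream, elements):
    """Write (x, y, z, radius, density) tuples in the "%f %f %f %f %f" layout."""
    for x, y, z, radius, density in elements:
        stream.write("%f %f %f %f %f\n" % (x, y, z, radius, density))


def read_model(stream):
    """Read the five-column text model back into a list of tuples of floats."""
    elements = []
    for line in stream:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError("expected 5 columns, got %d" % len(fields))
        elements.append(tuple(float(f) for f in fields))
    return elements


# Two made-up elements: coordinates, radius (Å) and density.
model = [(0.0, 0.0, 0.0, 2.5, 6.0), (3.8, 0.0, 0.0, 2.0, 7.0)]
buf = io.StringIO()
write_model(buf, model)
model_text = buf.getvalue()
print(model_text, end="")
```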


		

 ** PARAMETER FILE FORMAT:


	Additional:
	Print CTF maximizing parameters

 ** STANDARD CALCULATION:
	In a standard BioEM calculation the goal is to obtain the posterior
	probability of a Model given a set of images. In this case
	a Model file, a Parameter file and a Particle file should be provided.
	Example in::



 ** Optional Calculations
	Several additional options are available in this program:

	A) Euler Angle Probabilities: This option prints out the
	posterior probabilities of the model as a function of the Euler angles.
	In this case no integration over the angles is performed, so one
	can view the probability distribution as a function of the angles more directly.
	Input needed:

	Example in:
	
	B) Cross Correlation Calculation: This option prints the best* cross correlation
	of the model as a function of the pixels in the micrograph (*see the Manual
	for the mathematical formulation of how the "best" cross correlation is obtained).
	This can be useful in the preliminary steps of particle picking (identification).
	Input needed::

	Example in:

	C) Print map from Model: This option is completely independent of the BioEM calculation.
	It can be useful for constructing synthetic images from a model, given a fixed set of parameters.
	Noise can also be included in the artificial image.
	Input needed::
	

	Example in:




 *** OUTPUT:
     -- Main output file: "Output_Probabilities", with lines of the form
     RefMap #(number Particle Map) Probability  #(log(P))
     RefMap #(number Particle Map) Maximizing Param: #(Euler Angles) #(PSF parameters) #(center displacement)

     **Important: It is recommended to compare log(P) with respect to other Models or to Noise as in [1].
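
     A minimal Python sketch of such a comparison is given below. It assumes
     that the log(P) values of each particle have already been extracted from
     the output files of two runs (for example a Model and Noise), and that
     the images are independent so that their log(P) values add up. The
     function name and the numbers are hypothetical.

```python
def log_probability_difference(logp_a, logp_b):
    """Compare log(P) of two models over the same set of particle maps.

    Returns the per-particle differences log(P_a) - log(P_b) and their sum.
    ASSUMPTION: images are independent, so per-particle log(P) values add.
    """
    if len(logp_a) != len(logp_b):
        raise ValueError("both models need one log(P) per particle")
    per_particle = [a - b for a, b in zip(logp_a, logp_b)]
    return per_particle, sum(per_particle)


# Made-up log(P) values for two particles, for illustration only.
per_particle, total = log_probability_difference([-10.0, -12.5], [-11.0, -12.0])
print(per_particle, total)
```

     A positive total favors the first model over the second for this set of
     particles.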

    ** Optional OUTPUTS:

    -- Write the probabilities for each triplet of Euler Angles (key word: WRITE_PROB_ANGLES in InputFile).
       Key word in 

    -- Write the cross-correlations of a full micrograph
       Key word in

    -- (Excluding BioEM calculation) Print a map given a set of parameters
       Key word in command line:: --PrintBestCalMap 


	
  [1] Cossio, P. and Hummer, G. J. Struct. Biol. 2013 Dec; 184(3):427-437. doi: 10.1016/j.jsb.2013.10.006.