Diff for "EMAN2/Programs/e2refine"

Differences between revisions 4 and 5

e2refine

This program is the heart of single particle reconstruction in EMAN2. It embodies the concept of an iterative 3-D single particle reconstruction in a single step, starting with a 3-D starting model and a set of preprocessed particle data. The overall strategy is similar to that used in EMAN1, with a number of improvements for speed and accuracy. The general idea is that the 3-D orientation of each particle is determined by comparison to a set of projections of the current 3-D model. Particles in near-identical orientations are then aligned and averaged in 2-D. These averages are then used to construct a new 3-D model, which is then reprojected for use in the next cycle of refinement. This process of reference-based classification is somewhat unique to EMAN, and is one reason why it can converge so rapidly to the correct answer even with a poor starting model.
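The classify-then-average part of this cycle can be illustrated with a deliberately simplified sketch. It is not EMAN2 code and makes strong assumptions: particles and references are short lists of numbers rather than images, similarity is a plain dot product, and the 3-D reconstruction and reprojection steps are collapsed into "the class averages become the next references". All function names here are hypothetical.

```python
def dot(a, b):
    """Similarity score: plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def classify(particles, references):
    """Assign each particle to the index of its best-scoring reference."""
    return [
        max(range(len(references)), key=lambda i: dot(p, references[i]))
        for p in particles
    ]


def average_classes(particles, labels, references):
    """Average the members of each class; empty classes keep their old reference."""
    averages = []
    for i, ref in enumerate(references):
        members = [p for p, lab in zip(particles, labels) if lab == i]
        if not members:
            averages.append(list(ref))
            continue
        n = len(members)
        averages.append([sum(vals) / n for vals in zip(*members)])
    return averages


def refine(particles, references, iterations):
    """Repeat classification and averaging for a fixed number of cycles."""
    for _ in range(iterations):
        labels = classify(particles, references)
        references = average_classes(particles, labels, references)
    return references, classify(particles, references)


# Toy data: two clusters of "particles" and two poor starting references.
toy_particles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_references = [[0.6, 0.4], [0.4, 0.6]]
final_refs, final_labels = refine(toy_particles, toy_references, iterations=3)
```

Even with starting references that only weakly favour one cluster, the averages move toward the data after one cycle, which is the intuition behind the rapid convergence mentioned above.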

EMAN2 refinement has many more options than EMAN1, and permits much more precise control over the refinement process. This can be both a blessing and a curse. We suggest launching your refinements from the workflow interface which simplifies specifying all of the necessary options. For those in search of detail, we document everything here.

Command Line Arguments

General Options

	--version	bool	show program's version number and exit
-h	--help	bool	show this help message and exit
-c	--check	bool	Checks the contents of the current directory to verify that e2refine.py command will work - checks for the existence of the necessary starting files and checks their dimensions.
-v	--verbose	int	verbose level [0-9], higher number means more verbose output

Options impacting the overall refinement

--iter	int	The total number of refinement iterations to perform
--startiter	int	If a refinement crashes, this can be used to pick it up where it left off. This should NOT be used to change parameters, but only to resume an incomplete run.
--model	string	The name of the 3D image that will seed the refinement
--input	string	The name of the image containing the particle data
--usefilt	string	Note: some unresolved bugs may exist with this option (6/2011) Specify a particle data file that has been low pass or Wiener filtered. Has a one to one correspondence with your particle data. If specified will be used in projection matching routines, and elsewhere.
--path	string	The name of a directory where results are placed. If not specified (suggested), will use a path of the form refine_xx.
--mass	float	The mass of the particle in kilodaltons, used to run normalize.bymass. If unspecified nothing happens. Requires the --apix argument.
--apix	float	The angstrom per pixel of the input particles. This argument is required if you specify the --mass argument. If unspecified, the convergence plot is generated using either the project apix, or an apix of 1.
--sym	string	Specify symmetry - choices are: c<n>, d<n>, h<n>, tet, oct, icos. Omit this option or specify 'c1' for asymmetric reconstructions.
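As a rough illustration of how these options fit together, the following sketch assembles an e2refine.py command line from a dictionary. The helper name and the file names are hypothetical, and the `--name=value` form is assumed from the examples on this page. The only rule it enforces is the one stated above: --mass requires --apix.

```python
import shlex


def build_e2refine_command(options):
    """Turn a dict of option names and values into an argument list.

    True becomes a bare flag; None and False are skipped.
    """
    if options.get("mass") is not None and options.get("apix") is None:
        raise ValueError("--mass requires --apix")
    args = ["e2refine.py"]
    for name, value in options.items():
        if value is True:
            args.append(f"--{name}")
        elif value is not None and value is not False:
            args.append(f"--{name}={value}")
    return args


# Hypothetical file names; substitute your own particle stack and starting model.
cmd = build_e2refine_command({
    "input": "particles.hdf",
    "model": "initial_model.hdf",
    "iter": 3,
    "sym": "c1",
    "mass": 800,
    "apix": 1.5,
    "check": True,
})
print(shlex.join(cmd))
```

Passing `check` first is a cheap way to validate the inputs before committing to a long run.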

Options related to making projections

See also EMAN2/Programs/e2project3d

	--projector	string	Projector to use. 'standard' is the default
	--orientgen	string	The orientation generation argument for e2project3d.py. Typically something like: --orientgen=eman:delta=2.0:inc_mirror=0

	--automask3d	string	The parameters of the mask.auto3d processor, applied after 3D reconstruction. These parameters are, in order, isosurface threshold, radius, nshells and ngaussshells. From e2proc3d.py you could achieve the same thing using --process=mask.auto3d:threshold=1.1:radius=30:nshells=5:ngaussshells=5.
	--simalign	string	The name of an 'aligner' to use prior to comparing the images
	--simaligncmp	string	The name and parameters of the comparator used by the first stage aligner
	--simralign	string	The name and parameters of the second stage aligner which refines the results of the first alignment
	--simraligncmp	string	The name and parameters of the comparator used by the second stage aligner. Default is dot.
	--simcmp	string	The name of a 'cmp' to be used in comparing the aligned images
	--simmask	string	A file containing a single 0/1 image to apply as a mask before comparison but after alignment
	--shrink	int	Optionally shrink the input particles by an integer amount prior to computing similarity scores. For speed purposes.
	--twostage	int	Optionally run a faster 2-stage similarity matrix, ~5-10x faster, generally same accuracy. Value specifies shrink factor for first stage, typ 1-3
	--prefilt	bool	Filter each reference (c) to match the power spectrum of each particle (r) before alignment and comparison
	--sep	int	The number of classes a particle can contribute towards (default is 1)
	--classkeep	float	The fraction of particles to keep in each class, based on the similarity score generated by the --classcmp argument.
	--classkeepsig	bool	Change the keep ('--classkeep') criterion from fraction-based to sigma-based.
	--classiter	int	The number of iterations to perform. Default is 1.
	--classalign	string	If doing more than one iteration, this is the name and parameters of the 'aligner' used to align particles to the previous class average.
	--classaligncmp	string	The name and parameters of the comparator used by the first stage aligner. Default is dot.
	--classralign	string	The second stage aligner which refines the results of the first alignment in class averaging. Default is None.
	--classraligncmp	string	The comparator used by the second stage aligner in class averaging. Default is dot:normalize=1.
	--classaverager	string	The averager used to generate the class averages. Default is 'mean'.
	--classcmp	string	The name and parameters of the comparator used to generate similarity scores when class averaging. Default is 'dot:normalize=1'
	--classnormproc	string	Normalization applied during class averaging
	--classrefsf	bool	Use the setsfref option in class averaging to produce better filtered averages.
	--classautomask	bool	This will apply an automask to the class-average during iterative alignment for better accuracy. The final class averages are unmasked.
	--pad	int	To reduce Fourier artifacts, the model is typically padded by ~25% - only applies to Fourier reconstruction
	--recon	string	Reconstructor to use; see e2help.py reconstructors -v
	--m3dkeep	float	The percentage of slices to keep in e2make3d.py
	--m3dkeepsig	bool	The standard deviation alternative to the --m3dkeep argument
	--m3dsetsf	bool	The standard deviation alternative to the --m3dkeep argument
	--m3diter	int	The number of times the 3D reconstruction should be iterated
	--m3dpreprocess	string	Normalization processor applied before 3D reconstruction
	--m3dpostprocess	string	Post processor to be applied to the 3D volume once the reconstruction is completed
	--lowmem	bool	Make limited use of memory when possible - useful on lower end machines
-P	--parallel	string	Run in parallel, specify type:<option>=<value>:<option>=<value>
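Many of the string options above share one shape: a name followed by colon-separated key=value pairs, as in eman:delta=2.0:inc_mirror=0 or dot:normalize=1. The sketch below shows one way to split such a string. It is an illustration only, not EMAN2's own parser; the function name is hypothetical and values are left as strings.

```python
def parse_modular_option(text):
    """Split 'name:key=value:key=value' into (name, {key: value}).

    Values are returned as strings; no type conversion is attempted.
    """
    name, *pairs = text.split(":")
    if not name:
        raise ValueError("missing option name")
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key] = value
    return name, params


orientgen = parse_modular_option("eman:delta=2.0:inc_mirror=0")
automask = parse_modular_option(
    "mask.auto3d:threshold=1.1:radius=30:nshells=5:ngaussshells=5"
)
plain = parse_modular_option("dot")
```

Checking strings this way before launching a run can catch a missing `=` early, which is easy to miss in long option lists.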

The refinement process produces a large number of different output files in databases within directories named refine_xx. The easiest way to browse these files is with EMAN2/Programs/e2display, the file browser. For documentation of the file contents, please see the items towards the bottom of this page.

-  ⇤ ← Revision 4 as of 2010-01-06 20:15:03 → 
  Size: 9833
  Editor: SteveLudtke
  Comment:
+   ← Revision 5 as of 2011-07-11 18:45:56 → ⇥
  Size: 7899
  Editor: SteveLudtke
  Comment:
 Line 1:
-| [[#args|Command line arguments]] |  [[#checkfunc|Check functionality]] | [[EMAN2/e2refinefaq|e2refine FAQ]] |
-Line 5:
+Line 3:
+This program is the heart of single particle reconstruction in EMAN2. It embodies the concept of an iterative 3-D single particle reconstruction in a single step, starting with a [[EMAN2/Programs/e2initialmodel|3-D starting model]] and a set of preprocessed particle data. The overall strategy is similar to that used in [[EMAN1]], with a number of improvements for speed and accuracy. The general idea is that the 3-D orientation of each particle is determined by comparison to a set of projections of the current 3-D model. Particles in near-identical orientations are then aligned and averaged in 2-D. These averages are then used to construct a new 3-D model, which is then reprojected for use in the next cycle of refinement. This process of reference-based classification is somewhat unique to EMAN, and is one reason why it can converge so rapidly to the correct answer even with a poor starting model.
-Line 6:
+Line 5:
-||<35%><<TableOfContents>>||
+EMAN2 refinement has many more options than EMAN1, and permits much more precise control over the refinement process. This can be both a blessing and a curse. We suggest launching your refinements from the [[EMAN2/Programs/e2workflow|workflow interface]] which simplifies specifying all of the necessary options. For those in search of detail, we document everything here.
-Line 8:
+Line 7:
-|| {{attachment:e2refine.png}} ||
+=== Command Line Arguments ===

==== General Options ====
|| ||--version||bool||show program's version number and exit||
||-h||--help||bool||show this help message and exit||
||-c||--check||bool||Checks the contents of the current directory to verify that e2refine.py command will work - checks for the existence of the necessary starting files and checks their dimensions. ||
||-v||--verbose||int||verbose level [0-9], higher number means more verbose output||

==== Options impacting the overall refinement ====
|| ||--iter||int||The total number of refinement iterations to perform||
|| ||--startiter||int||If a refinement crashes, this can be used to pick it up where it left off. This should NOT be used to change parameters, but only to resume an incomplete run.||
|| ||--model||string||The name of the 3D image that will seed the refinement||
|| ||--input||string||The name of the image containing the particle data||
|| ||--usefilt||string||''Note: some unresolved bugs may exist with this option (6/2011)'' Specify a particle data file that has been low pass or Wiener filtered. Has a one to one correspondence with your particle data. If specified will be used in projection matching routines, and elsewhere.||
|| ||--path||string||The name of a directory where results are placed. If not specified (suggested), will use a path of the form ''refine_xx''.||
|| ||--mass||float||The mass of the particle in kilodaltons, used to run normalize.bymass. If unspecified nothing happens. Requires the --apix argument.||
|| ||--apix||float||The angstrom per pixel of the input particles. This argument is required if you specify the --mass argument. If unspecified, the convergence plot is generated using either the project apix, or an apix of 1.||
|| ||--sym||string||Specify symmetry - choices are: c<n>, d<n>, h<n>, tet, oct, icos. Omit this option or specify 'c1' for asymmetric reconstructions. ||

==== Options related to making projections ====
See also [[EMAN2/Programs/e2project3d]]
|| ||--projector||string||[[EMAN2/Modular/Projectors|Projector]] to use. 'standard' is the default||
|| ||--orientgen||string||The [[EMAN2/Modular/OrientGens|orientation generation]] argument for e2project3d.py. Typically something like: ''--orientgen=eman:delta=2.0:inc_mirror=0''||
-Line 11:
+Line 32:
-e2refine.py runs in much the same way as [[EMAN1/Programs/Refine|refine]] in [[EMAN1]]

This program oversees iterative single particle reconstruction. The overall process is to take a pre-existing 3D image and a set of 2D images and to run a variety of (often intensive) image processing applications which produce a refined 3D model. In particular, the program iteratively executes a sequence of python scripts which perform specific tasks, starting with 3D projection (e2project3d.py), comparison of particle data to projections (e2simmx.py), classification (e2classify.py), the generation of class averages (e2classaverage.py), and finally the generation of a new 3D model (e2make3d.py). This pipeline is depicted graphically in '''Figure 1''' below, along with accompanying data inputs and outputs.
|| {{attachment:refinepipeline_small.png}} ||
+|| ||--automask3d||string||The parameters of the mask.auto3d processor, applied after 3D reconstruction. These parameters are, in order, isosurface threshold, radius, nshells and ngaussshells. From e2proc3d.py you could achieve the same thing using --process=mask.auto3d:threshold=1.1:radius=30:nshells=5:ngaussshells=5.||
|| ||--simalign||string||The name of an 'aligner' to use prior to comparing the images||
|| ||--simaligncmp||string||The name and parameters of the comparator used by the first stage aligner||
|| ||--simralign||string||The name and parameters of the second stage aligner which refines the results of the first alignment||
|| ||--simraligncmp||string||The name and parameters of the comparator used by the second stage aligner. Default is dot.||
|| ||--simcmp||string||The name of a 'cmp' to be used in comparing the aligned images||
|| ||--simmask||string||A file containing a single 0/1 image to apply as a mask before comparison but after alignment||
|| ||--shrink||int||Optionally shrink the input particles by an integer amount prior to computing similarity scores. For speed purposes.||
|| ||--twostage||int||Optionally run a faster 2-stage similarity matrix, ~5-10x faster, generally same accuracy. Value specifies shrink factor for first stage, typ 1-3||
|| ||--prefilt||bool||Filter each reference (c) to match the power spectrum of each particle (r) before alignment and comparison||
|| ||--sep||int||The number of classes a particle can contribute towards (default is 1)||
|| ||--classkeep||float||The fraction of particles to keep in each class, based on the similarity score generated by the --classcmp argument.||
|| ||--classkeepsig||bool||Change the keep ('--classkeep') criterion from fraction-based to sigma-based.||
|| ||--classiter||int||The number of iterations to perform. Default is 1.||
|| ||--classalign||string||If doing more than one iteration, this is the name and parameters of the 'aligner' used to align particles to the previous class average.||
|| ||--classaligncmp||string||The name and parameters of the comparator used by the first stage aligner. Default is dot.||
|| ||--classralign||string||The second stage aligner which refines the results of the first alignment in class averaging. Default is None.||
|| ||--classraligncmp||string||The comparator used by the second stage aligner in class averaging. Default is dot:normalize=1.||
|| ||--classaverager||string||The averager used to generate the class averages. Default is 'mean'.||
|| ||--classcmp||string||The name and parameters of the comparator used to generate similarity scores when class averaging. Default is 'dot:normalize=1'||
|| ||--classnormproc||string||Normalization applied during class averaging||
|| ||--classrefsf||bool||Use the setsfref option in class averaging to produce better filtered averages.||
|| ||--classautomask||bool||This will apply an automask to the class-average during iterative alignment for better accuracy. The final class averages are unmasked.||
|| ||--pad||int||To reduce Fourier artifacts, the model is typically padded by ~25% - only applies to Fourier reconstruction||
|| ||--recon||string||Reconstructor to use; see e2help.py reconstructors -v||
|| ||--m3dkeep||float||The percentage of slices to keep in e2make3d.py||
|| ||--m3dkeepsig||bool||The standard deviation alternative to the --m3dkeep argument||
|| ||--m3dsetsf||bool||The standard deviation alternative to the --m3dkeep argument||
|| ||--m3diter||int||The number of times the 3D reconstruction should be iterated||
|| ||--m3dpreprocess||string||Normalization processor applied before 3D reconstruction||
|| ||--m3dpostprocess||string||Post processor to be applied to the 3D volume once the reconstruction is completed||
|| ||--lowmem||bool||Make limited use of memory when possible - useful on lower end machines||
||-P||--parallel||string||Run in parallel, specify type:<option>=<value>:<option>=<value>||
-Line 17:
+Line 67:
-'''Fig. 1. Overview of data inputs and outputs in the EMAN2 refinement pipeline. Pink objects are data supplied by the user, blue objects are programs, and green objects are data output by EMAN2 programs.'''



<<Anchor(args)>>

=== Command Line Arguments ===
Most of the command line arguments have defaults. The user must at least specify the total number of iterations, the symmetry and the proportional distribution or number of projections.

==== General parameters ====

|| version || bool || Show program's version number and exit ||
|| h, help || bool || Show help ||
|| c, check || bool || Checks the contents of the current directory to verify that e2refine.py will work ||
|| v, verbose || int || Toggle verbose mode - prints extra information to the command line while executing ||
|| input|| string || The input image stack of 2D particles||
|| iter|| int|| The number of refinement iterations ||
|| lowmem|| boolean || A low memory flag used to indicate memory should be used as sparingly as possible ||
|| model|| string || The seeding 3D model ||
|| path || string || The directory where output will be stored ||
|| sym|| string|| The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Symmetry3D.html|symmetry]] imposed on output 3D models, which also limits the range of generated projections ||

==== Arguments used to execute e2project3d.py ====

See [[e2project3d|e2project3d.py.]]

|| orientgen|| string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1OrientationGenerator.html|OrientationGenerator]] and parameters used to generate orientations in the asymmetric unit of the 3D model ||
|| projector || string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Projector.html|projector]] used for generating projections ||


==== Arguments used to execute e2simmx.py ====

See [[e2simmx|e2simmx.py.]]

|| shrink || int || The shrink factor applied to particles prior to generation of the similarity matrix (e2simmx.py) ||
|| simalign || string:args || The main [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Aligner.html|aligner]] used during similarity matrix generation  ||
|| simaligncmp || string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Cmp.html|comparator]] used by the main aligner during similarity matrix generation ||
|| simcmp|| string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Cmp.html|comparator]] used to generate the final score which is stored in the similarity matrix ||
|| simralign|| string:args || The refinement [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Aligner.html|aligner]] used during similarity matrix generation ||
|| simraligncmp || string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Cmp.html|comparator]] used by the refine align in similarity matrix generation ||
|| twostage || bool || Performs particle classification in two stages, first coarsely, then locally. Generally gives 5-10x speedups with no accuracy penalty. See [[EMAN2/Programs/e2simmx2stage]] for details.||
==== Arguments used to execute e2classify.py ====


See [[e2classify|e2classify.py]]

|| sep || int || The number of classes each particle can be associated with ||

==== Arguments used to execute e2classaverage.py ====

See [[e2classaverage|e2classaverage.py]].

|| classalign || string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Aligner.html|aligner]] used for alignment during iterative class averaging ||
|| classaligncmp|| string:args || [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Cmp.html|Comparator]] used by the main aligner during iterative class averaging ||
|| classaverager|| string:args || [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Averager.html|Averager]] used for averaging the images in each class ||
|| classcmp || string:args || The main [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Cmp.html|comparator]] used to assess quality and exclude bad particles in iterative class averaging ||
|| classiter || int || The number of class averaging iterations ||
|| classkeep|| float || The keep threshold used for excluding bad particles in iterative class averaging ||
|| classnormproc|| string:args || The normalization [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Processor.html|processor]] used in class averaging ||
|| classralign|| string:args || The refinement [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Aligner.html|aligner]] used in iterative class averaging ||
|| classraligncmp || string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Cmp.html|comparator]] used by the refinement aligner in iterative class averaging ||

==== Arguments used to execute e2make3d.py  ====

See [[e2make3d|e2make3d.py.]]

|| m3diter|| int || The number of iterations used by e2make3d when performing the Fourier inversion method of 3D reconstruction ||
|| m3dkeep|| float || The keep threshold used by e2make3d for the purpose of slice exclusion during 3D reconstruction ||
|| m3dpreprocess|| string:args || The normalization [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Processor.html|processor]] applied prior to slice insertion during 3D reconstruction ||
|| pad || int || The amount of padding used by the Fourier inversion 3D reconstruction technique ||
|| recon|| string:args || The [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1Reconstructor.html|reconstructor]] used for performing 3D reconstruction ||

==== Arguments used to post process the 3D reconstruction  ====

The ByMass links will resolve on January 22

|| mass || float || The estimated mass of the particle in kilodalton that, along with the apix argument, is used to run the  [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1NormalizeByMassProcessor.html| normalize.bymass processor]] immediately after 3D reconstruction ||
|| apix || float || The physical distance represented by a single pixel. This parameter, along with the mass argument, is used to run the  [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1NormalizeByMassProcessor.html| normalize.bymass processor]] immediately after 3D reconstruction. The apix argument is also used for generating the x-axis of the automatically generated convergence plots. ||
|| automask3d || float,int,int,int || The threshold, radius, nshells and nshellsgauss parameters, respectively, of the  [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1AutoMask3D2Processor.html| mask.auto3d processor]], which is applied directly after the application of the [[http://blake.bcm.edu/eman2/doxygen_html/classEMAN_1_1NormalizeByMassProcessor.html| normalize.bymass processor]]. ||



<<Anchor(checkfunc)>>

=== Check functionality ===
By specifying the --check flag, e2refine.py checks only whether the specified parameters are valid, and nothing more. Example output is shown below.

{{{
[someone@localhost]$ e2refine.py --check
#### Testing directory contents and command line arguments for e2refine.py
Error: you must specify the --it argument
start.img contains 2498 images of dimensions 100x100
threed.0a.mrc has dimensions 100x100x100
e2refine.py test.... FAILED
#### Test executing projection command: e2project3d.py threed.0a.mrc -f --sym=None --projector=standard --out=e2proj.img --check
Error: you must specify one of either --prop or --numproj
Error: none is an invalid symmetry type. You must specify the --sym argument
e2project3d.py command line arguments test.... FAILED
#### Test executing simmx command: e2simmx.py e2proj.img start.img e2simmx.img -f --saveali --cmp=dot:normalize=1 --align=rotate_translate --aligncmp=dot --check --nofilecheck
e2simmx.py command line arguments test.... PASSED
#### Test executing classify command: e2classify.py e2simmx.img e2classify.img --sep=2 -f --check --nofilecheck
e2classify.py command line arguments test.... PASSED
#### Test executing classaverage command: e2classaverage.py start.img e2classify.img e2classes.1.img --ref=e2proj.img --iter=3 -f --keepsig=1.000000 --cmp=dot:normalize=1 --align=rotate_translate --aligncmp=phase --check --nofilecheck
e2classaverage.py command line arguments test.... PASSED
#### Test executing make3d command: e2make3d.py e2classes.1.img --sym=None --iter=4 -f --recon=fourier --out=threed.0a.mrc --keepsig=1.000000 --check --nofilecheck
Error: none is an invalid symmetry type. You must specify the --sym argument
e2make3d.py command line arguments test.... FAILED
}}}
This functionality will be useful for people who have to submit their jobs to queues - being able to check that the script will work will ensure its successful execution.
+The refinement process produces a large number of different output files in databases within directories named ''refine_xx''. The easiest way to browse these files is with ''[[EMAN2/Programs/e2display]]'', the file browser. For documentation of the file contents, please see the items towards the bottom of [[EMAN2/Concepts|this page]].