e2refine

This program is the heart of single particle reconstruction in EMAN2. It runs a complete iterative 3-D single particle refinement as a single command, starting from a 3-D starting model and a set of preprocessed particle data. The overall strategy is similar to that used in EMAN1, with a number of improvements for speed and accuracy. In each cycle, the 3-D orientation of each particle is determined by comparison to a set of projections of the current 3-D model. Particles in near-identical orientations are then aligned and averaged in 2-D. These averages are used to construct a new 3-D model, which is reprojected for use in the next cycle of refinement. This process of reference-based classification is fairly distinctive to EMAN, and is one reason why it can converge so rapidly to the correct answer even with a poor starting model.

EMAN2 refinement has many more options than EMAN1, and permits much more precise control over the refinement process. This can be both a blessing and a curse. We suggest launching your refinements from the workflow interface, which simplifies specifying all of the necessary options. For those in search of detail, we document everything here.

Command Line Arguments

General Options

	--version	bool	show program's version number and exit
-h	--help	bool	show this help message and exit
-c	--check	bool	Checks the contents of the current directory to verify that the e2refine.py command will work: checks for the existence of the necessary starting files and checks their dimensions.
-v	--verbose	int	Verbosity level [0-9]; a higher number means more verbose output

Options impacting the overall refinement

	--iter	int	The total number of refinement iterations to perform
	--startiter	int	If a refinement crashes, this can be used to pick it up where it left off. This should NOT be used to change parameters, but only to resume an incomplete run.
	--model	string	The name of the 3D image that will seed the refinement
	--input	string	The name of the image containing the particle data
	--usefilt	string	Specify a particle data file that has been low-pass or Wiener filtered, with a one-to-one correspondence to your particle data. If specified, it is used in the projection matching routines and elsewhere. Note: some unresolved bugs may exist with this option (6/2011).
	--path	string	The name of a directory where results are placed. If not specified (suggested), will use a path of the form refine_xx.
	--mass	float	The mass of the particle in kilodaltons, used to run normalize.bymass. If unspecified nothing happens. Requires the --apix argument.
	--apix	float	The angstrom per pixel of the input particles. This argument is required if you specify the --mass argument. If unspecified, the convergence plot is generated using either the project apix, or an apix of 1.
	--sym	string	Symmetry to be imposed throughout: c<n>, d<n>, h<n>, tet, oct, icos. Omit this option or specify 'c1' for asymmetric reconstructions.
	--lowmem	bool	Make limited use of memory when possible. Slight speed penalty.
-P	--parallel	string	Run in parallel, specify type:<option>=<value>:<option>=<value>. See EMAN2/Parallel
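
As an illustration of how the options above combine, here is a minimal Python sketch that assembles an e2refine.py command line from a dictionary. The helper build_e2refine_command and all file names and values are hypothetical; only the flag names come from the tables above. The one rule it enforces is the documented requirement that --mass be accompanied by --apix.

```python
import shlex

def build_e2refine_command(options):
    """Turn a dict of documented e2refine.py options into an argv list."""
    # --mass is documented as requiring --apix, so fail early if it is missing.
    if "mass" in options and "apix" not in options:
        raise ValueError("--mass requires --apix")
    argv = ["e2refine.py"]
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")          # bool options are bare flags
        elif value is False or value is None:
            continue                          # unset options are omitted
        else:
            argv.append(f"--{name}={value}")
    return argv

# Hypothetical file names and values, for illustration only.
example = build_e2refine_command({
    "input": "particles.hdf",
    "model": "initial_model.hdf",
    "iter": 5,
    "sym": "c1",
    "mass": 800.0,
    "apix": 2.1,
    "lowmem": True,
})
print(shlex.join(example))
```

Keeping the options in a dictionary makes it easy to record exactly which parameters a run used, which matters because --startiter should only resume a run with unchanged parameters.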

Options related to making projections

Options for comparing particles to projections

See also EMAN2/Programs/e2simmx

--twostage	int	Optionally run a faster 2-stage similarity matrix, ~5-30x faster, almost identical results. Value specifies shrink factor for first stage, typ 1-3
--shrink	int	Optionally shrink the input particles by an integer amount prior to computing similarity scores. For speed. If used with --twostage, this specifies the second stage shrink factor.
--simcmp	string	The name of a comparator to be used in comparing the aligned images
--simalign	string	The name of an aligner to use prior to comparing the images
--simaligncmp	string	Name and options for a comparator to use in first stage alignment for classification
--simralign	string	The name and parameters of the second stage aligner which refines the results of the first alignment. Currently this is either not specified or is 'refine'.
--simraligncmp	string	The name and parameters of the comparator used by the second stage aligner. Default is dot.
--simmask	string	A file containing a single 0/1 image to apply as a mask before comparison but after alignment
--prefilt	bool	Filter each reference (c) to match the power spectrum of each particle (r) before alignment and comparison

These parameters are used by e2simmx, a program that compares each particle to each projection and records quality scores. To do this, the particles must first be aligned to the projections using the aligners you specify. Once aligned, the 'Main comparator' is used to compute the quality score. These quality values are recorded in an image matrix and handed on to the next stage in the refinement process.
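
The idea of a similarity matrix can be sketched in a few lines of Python. This is a toy illustration, not the e2simmx implementation: images are flattened to short lists, the alignment step is skipped entirely, a plain dot product stands in for the comparator, and a higher score is assumed to mean a better match (real comparators may use a different sign convention). All function names are hypothetical.

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def similarity_matrix(particles, references):
    """One row per particle, one column per reference projection."""
    return [[dot(p, r) for r in references] for p in particles]

def best_reference(simmx):
    """Index of the highest-scoring reference for each particle."""
    return [max(range(len(row)), key=row.__getitem__) for row in simmx]

# Toy data: two references and three particles, three "pixels" each.
references = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
particles = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.7, 0.3, 0.0]]

simmx = similarity_matrix(particles, references)
print(best_reference(simmx))  # [0, 1, 0]
```

Each row of the matrix holds one particle's scores against every reference, so the later classification stage only needs the matrix, not the images themselves.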

The shrink parameter causes all projections and particles to be shrunken by the given amount prior to comparison. This can provide a significant time advantage, though at the expense of resolution. Note however that the class averaging stage, which can involve iterative alignment, does not use shrunken data.
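
The trade-off can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions: the helper name is hypothetical, and it assumes the box size is divided by the shrink factor using integer division. It relies on the standard sampling relation that the finest resolution representable in an image (the Nyquist limit) is twice the pixel size.

```python
def shrunken_sampling(box_size, apix, shrink):
    """Return (new box size, new A/pixel, Nyquist limit in A, pixel-count ratio)."""
    if shrink < 1:
        raise ValueError("shrink must be a positive integer")
    new_box = box_size // shrink      # assumption: integer division of the box
    new_apix = apix * shrink          # each shrunken pixel covers more of the sample
    nyquist = 2.0 * new_apix          # finest resolution the shrunken image can carry
    pixel_ratio = (box_size * box_size) / (new_box * new_box)
    return new_box, new_apix, nyquist, pixel_ratio

# Hypothetical data set: 128-pixel box at 2.0 A/pixel, shrunk by 2.
print(shrunken_sampling(128, 2.0, 2))  # (64, 4.0, 8.0, 4.0)
```

In this example each comparison touches four times fewer pixels, but no detail finer than 8 A can influence the similarity score.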

Two-stage simmx (--twostage) is still experimental. If set to 2 instead of zero, classification is performed in two stages, giving a 5-25x speedup but with a potential decrease in accuracy.
PS match ref (--prefilt) forces the power spectra of the particle and reference to be the same before comparison. This is necessary for some comparators.
Main comparator (--simcmp) decides which reference a particle most closely resembles (run e2help.py cmps -v2 for the available comparators).
Aligner (--simalign) - use the default.
Align comparator (--simaligncmp) and refine align comparator (--simraligncmp) select which comparators are used for particle->reference alignment. In most cases ccc is adequate, but sometimes you may wish to match the main comparator.
Refine align (--simralign) - if set to 'refine', alignments will be more accurate, and thus classification will be more accurate, at a severe speed penalty.

For comparators here are some possible choices:

ccc (no options) - Simple dot product. Fast, can work well, but in some situations will cause a deterministic orientation bias (like GroEL side views which end up tilted). Works poorly for very noisy data unless usefilt particles are used for alignment.

frc zeromask=1:snrweight=1 - Fourier Ring Correlation with signal to noise ratio weighting and reference based masks. Works poorly without SNR weighting. Masking is optional, but a good idea.

phase zeromask=1:snrweight=1 - Mean phase error. Same options as for frc. Do NOT use phase without snrweight=1

sqeuclidean normto=1:zeromask=1 - similar to ccc, but with additional options to better match densities. Only works well in conjunction with PS match ref, and usefilt with Wiener filtered particles.
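
Comparator choices such as these consist of a name plus key=value options. Elsewhere on this page the name and options are joined with colons (for example 'dot:normalize=1'), and the sketch below assumes the same convention applies here. The parser is a hypothetical helper written for illustration, not part of EMAN2; it keeps every value as a string.

```python
def parse_modopt(spec):
    """Split 'name:key=value:key=value' into (name, {key: value})."""
    name, _, rest = spec.partition(":")
    options = {}
    if rest:
        for item in rest.split(":"):
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"expected key=value, got {item!r}")
            options[key] = value      # values are left as strings
    return name, options

print(parse_modopt("frc:zeromask=1:snrweight=1"))
# ('frc', {'zeromask': '1', 'snrweight': '1'})
```

A check like this can catch a comparator string with a missing '=' before a long refinement is launched.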

Options for classifying particles based on similarity matrix

Options for generating class-averages

Options related to 3-D Reconstruction of Class-averages and post-processing

--pad	int	To reduce Fourier artifacts, the model is typically padded by ~25% - only applies to Fourier reconstruction. Please read EMAN2/BoxSize
--recon	string	reconstructor to use. Main choices are 'fourier' or 'wiener_fourier'
--m3dkeep	float	The percentage of slices to keep in e2make3d.py
--m3dkeepsig	bool	Similar to --classkeepsig; changes the meaning of --m3dkeep to be in terms of standard deviations
--m3dsetsf	bool	Filters the final 3-D map to match the precomputed structure factor (stored in the project database). Normally used with a --m3dpostprocess=filter.lowpass.* option
--m3diter	int	The number of times the 3D reconstruction should be iterated. 2 and 3 are the only valid values. 2 is faster and normally has sufficient accuracy.
--m3dpreprocess	string	Normalization processor applied before 3D reconstruction
--m3dpostprocess	string	Post processor to be applied to the 3D volume once the reconstruction is completed
--automask3d	string	The parameters of the mask.auto3d processor, applied after 3D reconstruction. These parameters are, in order: isosurface threshold, radius, nshells and ngaussshells. From e2proc3d.py you could achieve the same thing using --process=mask.auto3d:threshold=1.1:radius=30:nshells=5:ngaussshells=5. Run e2help.py processors -v2 for more information on mask.auto3d.

Parameters for 3D reconstruction:

Use the 'fourier' or 'wiener_fourier' reconstructor
pad - should be some number a bit larger than your box size. This should be a 'good' box size as well (see wiki)
m3dkeep - similar to keep in class-averaging, but determines how many class averages are excluded from the reconstruction
m3dsetsf - This forces the reconstruction to be filtered to match the structure factor determined during CTF correction. If used, it should be combined with a Gaussian lowpass filter at the targeted resolution
m3dpostprocess - This is an optional filter applied to the model as a final step. filter.lowpass.gauss with 'cutoff_freq=<1/resolution>' works well with set SF. If set SF is not used, note that the model will already be somewhat filtered even without this.
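
The cutoff_freq value is simply the reciprocal of the target resolution in angstroms. A minimal sketch follows; the helper name is hypothetical, and the colon between the processor name and its option is assumed from the mask.auto3d example above.

```python
def lowpass_postprocess(resolution_angstrom):
    """Build a filter.lowpass.gauss option string for a target resolution in A."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be positive")
    cutoff = 1.0 / resolution_angstrom        # cutoff_freq = 1/resolution
    return f"filter.lowpass.gauss:cutoff_freq={cutoff:.4f}"

# Hypothetical target resolution of 10 A.
print("--m3dpostprocess=" + lowpass_postprocess(10.0))
# --m3dpostprocess=filter.lowpass.gauss:cutoff_freq=0.1000
```

A higher target resolution (smaller number of angstroms) gives a larger cutoff frequency, so less of the map is filtered away.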

Please note that the box sizes suggested in EMAN2 are larger than those in EMAN1 for improved CTF correction. For details on box size selection, see EMAN2/BoxSize.

The refinement process produces a large number of different output files in databases within directories named refine_xx. The easiest way to browse these files is with EMAN2/Programs/e2display, the file browser. For documentation of the file contents, please see the items towards the bottom of this page.

	--projector	string	Projector to use. 'standard' is the default
	--orientgen	string	The orientation generation argument for e2project3d.py. Typically something like: --orientgen=eman:delta=2.0:inc_mirror=0

--classiter	int	The number of iterations to perform. Default is 1. Larger values reduce model/noise bias, but slightly decrease resolution.
--classcmp	string	The name and parameters of the comparator used to generate similarity scores when class averaging. Default is 'dot:normalize=1'
--classalign	string	If doing more than one iteration, this is the name and parameters of the aligner used to align particles to the previous class average.
--classaligncmp	string	The name and parameters of the comparator used by the first stage aligner. Default is dot.
--classralign	string	The second stage aligner which refines the results of the first alignment in class averaging. Currently this is either not specified or is 'refine'.
--classraligncmp	string	The comparator used by the second stage aligner in class averaging. Default is dot:normalize=1.
--classaverager	string	The averager used to generate the class averages. Default is 'mean'.
--classkeep	float	The fraction of particles to keep in each class, based on the similarity score generated by the --classcmp argument (see also --classkeepsig).
--classkeepsig	bool	Change the keep (--classkeep) criterion from fraction-based to sigma-based. E.g., with this set, 1.0 would correspond to discarding particles more than 1 standard deviation from the mean.
--classnormproc	string	Normalization processor and options applied during class averaging. Typically 'normalize.edgemean'
--classrefsf	bool	This will impose the 1-D structure factor of each model projection onto the corresponding class-average to improve its filtration. This is an alternative to Wiener filtration if the map resolution is regulated.
--classautomask	bool	Experimental. This will apply a 2-D automask to the class-average during iterative alignment for better accuracy. The final class averages are unmasked.
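
The difference between the fraction-based and sigma-based meanings of --classkeep can be sketched as follows. This is an illustration under assumptions, not the EMAN2 code: it assumes a lower similarity score means a better match, and it reads "more than 1 standard deviation from the mean" as a one-sided cut on the worse side only. Both function names are hypothetical.

```python
import statistics

def keep_by_fraction(scores, keep):
    """Indices of the best-scoring fraction of particles (lower = better, assumed)."""
    n_keep = max(1, round(len(scores) * keep))
    order = sorted(range(len(scores)), key=scores.__getitem__)
    return sorted(order[:n_keep])

def keep_by_sigma(scores, nsigma):
    """Indices of particles no worse than mean + nsigma * standard deviation."""
    mean = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    limit = mean + nsigma * sigma     # one-sided cut, an assumption
    return [i for i, s in enumerate(scores) if s <= limit]

# Toy scores for five particles in one class; particle 3 is a clear outlier.
scores = [-0.9, -0.8, -0.85, -0.2, -0.88]
print(keep_by_fraction(scores, 0.8))  # [0, 1, 2, 4]
print(keep_by_sigma(scores, 1.0))     # [0, 1, 2, 4]
```

The fraction-based rule always discards the same proportion of each class, while the sigma-based rule discards only particles that stand out from the rest of that class, so a class with no outliers keeps everything.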