Diff for "EMAN2/Programs/e2refine_easy"

Differences between revisions 3 and 4

e2refine_easy

This is the primary single particle refinement program in EMAN2.1+. It replaces earlier programs such as e2refine.py and e2refine_evenodd.py. Major features of this program:

While a range of command-line options still exists, you should not normally specify more than a few basic requirements. The rest will be auto-selected for you.
This program will split your data in half and automatically refine the halves independently to produce a gold standard resolution curve for every step in the refinement.
An HTML report file is generated as this program runs (in report/index.html). It is updated on-the-fly and is the best way to monitor the progress of a running job.
The gold standard FSC also permits us to automatically filter the structure at each refinement step. The resolution you specify is a target, NOT the filter resolution.
Many of the 'advanced' options are hidden in the e2projectmanager.py GUI, because most users should never need to specify them.
When continuing a successful refinement, trying to push resolution, etc., using 'startfrom'

	input	string	The name of the image file containing the particle data
	model	string	The map to use as a starting point for refinement

	startfrom	string	Path to an existing refine_xx directory to continue refining from. Alternative to --input and --model.

Standard Options:

Short	Name	Type	Description
	targetres	float	Target resolution in A of this refinement run. Usually works best in at least two steps (low/medium resolution, then final resolution) when starting with a poor starting model. Usually 3-4 iterations is sufficient.
	speed	int	(1-7) Balances speed vs precision. Larger values sacrifice a bit of potential resolution for significant speed increases. 1 may yield slightly better results but comes with a significant performance penalty. default=5
	sym	string	Specify symmetry - choices are: c<n>, d<n>, tet, oct, icos.
	breaksym	bool	If selected, reconstruction will be asymmetric with sym= specifying a known pseudosymmetry, not an imposed symmetry.
	iter	int	The total number of refinement iterations to perform. Default=auto
	mass	float	The ~mass of the particle in kilodaltons, used to run normalize.bymass. Due to resolution effects, not always the true mass.
	apix	float	The angstrom per pixel of the input particles. This argument is required if you specify the --mass argument. If unspecified (set to 0), the convergence plot is generated using the project apix if one is available, otherwise an apix of 1.
	classkeep	float	The fraction of particles to keep in each class, based on the similarity score. (default=0.9 -> 90%)
	classautomask	bool	This will apply an automask to the class-average during iterative alignment for better accuracy. The final class averages are unmasked.
	prethreshold	bool	Applies a threshold to the volume just before generating projections. A sort of aggressive solvent flattening for the reference.
	m3dkeep	float	The fraction of slices to keep in e2make3d.py. Default=0.8 -> 80%
	m3dpostprocess	string	Default=none. An arbitrary post-processor to run after all other automatic processing. Maps are autofiltered, so a low-pass filter should not normally be used here.
-P	parallel	string	Run in parallel, specify type:<option>=<value>:<option>=<value>. See http://blake.bcm.edu/emanwiki/EMAN2/Parallel
	threads	int	Number of threads to run in parallel on a single computer when multi-computer parallelism isn't useful
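
The --classkeep fraction in the table above can be illustrated with a short sketch. This is not the EMAN2 implementation; it only shows what keeping a fraction of particles by similarity score means. The function name keep_best is hypothetical, and the sketch assumes that a lower score means a better match.

```python
def keep_best(scores, keep=0.9):
    """Return the indices of the particles retained in one class.

    Assumption (not confirmed by the text above): lower score = better match.
    """
    if not 0 < keep <= 1:
        raise ValueError("keep must be a fraction in (0, 1]")
    # Number of particles to retain, never fewer than one.
    n_keep = max(1, round(keep * len(scores)))
    # Particle indices ordered from best (lowest) to worst (highest) score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    # Report the survivors in their original order.
    return sorted(order[:n_keep])


scores = [0.2, -0.5, 0.9, -0.1, 0.0, 0.3, -0.7, 0.1, 0.4, -0.3]
kept = keep_best(scores, keep=0.9)
print(kept)
```

With ten particles and keep=0.9, the single worst-scoring particle is excluded from the class average.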

Complete Options, including advanced options:

Short	Name	Type	Description
	input	string	Image stack containing phase-flipped particles used for alignment
	inputavg	string	Optional file containing alternate version of the particles to use for reconstruction after alignment
	model	string	The map to use as a starting point for refinement
	startfrom	string	Path to an existing refine_xx directory to continue refining from. Alternative to --input and --model.
	targetres	float	Target resolution in A of this refinement run. Usually works best in at least two steps (low/medium resolution, then final resolution) when starting with a poor starting model. Usually 3-4 iterations is sufficient.
	speed	int	(1-7) Balances speed vs precision. Larger values sacrifice a bit of potential resolution for significant speed increases. Set to 1 when really pushing resolution. Set to 7 for initial refinements. default=5
	sym	string	Specify symmetry - choices are: c<n>, d<n>, tet, oct, icos.
	breaksym	bool	If selected, reconstruction will be asymmetric with sym= specifying a known pseudosymmetry, not an imposed symmetry.
	tophat	bool	Instead of imposing a final Wiener filter, use a tophat filter (similar to Relion). Sharper features, but may exaggerate.
	treeclassify	bool	Classify using a binary tree.
	m3dold	bool	Use the traditional e2make3d program instead of the new e2make3dpar program
	iter	int	The total number of refinement iterations to perform. Default=auto
	mass	float	The ~mass of the particle in kilodaltons, used to run normalize.bymass. Due to resolution effects, not always the true mass.
	apix	float	The angstrom per pixel of the input particles. Normally set to 0, which will read the value from the header of the input file
	sep	int	The number of classes each particle can contribute towards (normally 1). Increasing will improve SNR, but produce rotational blurring.
	classkeep	float	The fraction of particles to keep in each class, based on the similarity score. (default=0.9 -> 90%)
	classautomask	bool	This will apply an automask to the class-average during iterative alignment for better accuracy. The final class averages are unmasked.
	prethreshold	bool	Applies a threshold to the volume just before generating projections. A sort of aggressive solvent flattening for the reference.
	eulerrefine	bool	Refines Euler angles of class-averages before reconstruction
	m3dkeep	float	The fraction of slices to keep in e2make3d.py. Default=0.8 -> 80%
	m3dpostprocess	string	Default=none. An arbitrary post-processor to run after all other automatic processing. Maps are autofiltered, so a low-pass filter should not normally be used here.
-P	parallel	string	Run in parallel, specify type:<option>=<value>:<option>=<value>. See http://blake.bcm.edu/emanwiki/EMAN2/Parallel
	threads	int	Number of threads to run in parallel on a single computer when multi-computer parallelism isn't useful
	path	string	The name of a directory where results are placed. Default = create new refine_xx
-v	verbose	int	verbose level [0-9]; a higher number means more verbose output
	usefilt	string	Specify a particle data file that has been low pass or Wiener filtered. Has a one to one correspondence with your particle data. If specified will be used in projection matching routines, and elsewhere.
	automaskexpand	int	Default=boxsize/20. Specify number of voxels to expand mask before soft edge. Use this if low density peripheral features are cut off by the mask.
	automask3d	string	Default=auto. Specify as a processor, eg - mask.auto3d:threshold=1.1:radius=30:nshells=5:nshellsgauss=5.
	automask3d2	string	Default=none. If specified, this mask will be multiplied by the result of the first mask, eg - using mask.soft to mask out the center of a virus.
	projector	string	Default=standard. Projector to use with parameters.
	orientgen	string	Default=auto. Orientation generator for projections, eg - eman:delta=5.0:inc_mirror=0:perturb=1
	simalign	string	Default=auto. The name of an 'aligner' to use prior to comparing the images
	simaligncmp	string	Default=auto. Name of the aligner along with its construction arguments
	simralign	string	Default=auto. The name and parameters of the second stage aligner which refines the results of the first alignment
	simraligncmp	string	Default=auto. The name and parameters of the comparator used by the second stage aligner.
	simcmp	string	Default=auto. The name of a 'cmp' to be used in comparing the aligned images
	simmask	string	Default=auto. A file containing a single 0/1 image to apply as a mask before comparison but after alignment
	shrink	int	Default=auto. Optionally shrink the input particles by an integer amount prior to computing similarity scores. For speed purposes. 0 -> no shrinking
	shrinks1	int	The level of shrinking to apply in the first stage of the two-stage classification process. Default=0 (autoselect)
	prefilt	bool	Default=auto. Filter each reference (c) to match the power spectrum of each particle (r) before alignment and comparison. Applies both to classification and class-averaging.
	cmpdiff	bool	Used only in binary tree classification. Use a mask that focuses on the difference between the two children.
	treeincomplete	int	Used only in binary tree classification. Incompleteness of the tree on each level. Default=0
	classkeepsig	bool	Change the keep ('--keep') criterion from fraction-based to sigma-based.
	classiter	int	Default=auto. The number of iterations to perform.
	classalign	string	Default=auto. If doing more than one iteration, this is the name and parameters of the 'aligner' used to align particles to the previous class average.
	classaligncmp	string	Default=auto. This is the name and parameters of the comparator used by the first stage aligner.
	classralign	string	Default=auto. The second stage aligner which refines the results of the first alignment in class averaging.
	classraligncmp	string	Default=auto. The comparator used by the second stage aligner in class averaging.
	classaverager	string	Default=auto. The averager used to generate the class averages.
	classcmp	string	Default=auto. The name and parameters of the comparator used to generate similarity scores, when class averaging.
	classnormproc	string	Default=auto. Normalization applied during class averaging
	classrefsf	bool	Use the setsfref option in class averaging. This matches the filtration of the class-averages to the projections for easier comparison. Disabled when ampcorrect=flatten is used.
	pad	int	Default=auto. To reduce Fourier artifacts, the model is typically padded by ~25 percent - only applies to Fourier reconstruction
	recon	string	Default=auto. Reconstructor to use; see e2help.py reconstructors -v
	m3dkeepsig	bool	Default=auto. The standard deviation alternative to the --m3dkeep argument
	m3dsetsf	string	Default=auto. Name of a file containing a structure factor to apply after refinement
	m3dpreprocess	string	Default=auto. Normalization processor applied before 3D reconstruction
	ampcorrect	string	Will perform amplitude correction via the specified method. 'flatten' requires a target resolution better than 8 angstroms (experimental). 'none' will disable amplitude correction (experimental).
	classweight	bool	Alter the weight of each class in the reconstruction (experimental).
	sqrtnorm	bool	If set, the sqrt of the number of particles in each class will be used to weight the direct Fourier inversion.
	lowmem	bool	Default=auto. Make limited use of memory when possible - useful on lower end machines
	ppid	int	Set the PID of the parent process, used for cross platform PPID

To run this program, you would normally specify only the following options.

Use these 2 when starting a new "gold standard" refinement from scratch:

--input=<lst file referencing phase-flipped particles in HDF format> --model=<starting map to seed refinement>

Use this, when you have already achieved sufficient resolution to validate the gold-standard, and you are trying to improve resolution/quality:

--startfrom=<path to existing refine_xx directory to continue from>
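
The two ways of starting a run described above can be sketched as a small helper that assembles the command line. It only builds an argument list and executes nothing. The helper name build_refine_command and all file and directory names in the example are hypothetical; the option names are those documented on this page.

```python
import shlex


def build_refine_command(input_file=None, model=None, startfrom=None, **options):
    """Return an argv list for e2refine_easy.py (nothing is executed)."""
    if startfrom is not None:
        # --startfrom is documented as an alternative to --input and --model.
        if input_file is not None or model is not None:
            raise ValueError("--startfrom is an alternative to --input and --model")
        argv = ["e2refine_easy.py", f"--startfrom={startfrom}"]
    else:
        if input_file is None or model is None:
            raise ValueError("a new refinement needs both --input and --model")
        argv = ["e2refine_easy.py", f"--input={input_file}", f"--model={model}"]
    # Remaining options are passed through as --name=value, in sorted order.
    for name, value in sorted(options.items()):
        argv.append(f"--{name}={value}")
    return argv


# Hypothetical file names, for illustration only.
new_run = build_refine_command(
    input_file="sets/particles_flip.lst",
    model="initial_models/model_00.hdf",
    targetres=25,
    speed=7,
    sym="c1",
)
# Hypothetical existing refinement directory.
continued = build_refine_command(startfrom="refine_01", targetres=12)

print(shlex.join(new_run))
print(shlex.join(continued))
```

Rejecting a mix of --startfrom with --input or --model is a choice made for this sketch; the page only states that the two forms are alternatives.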

Then a subset of these:

--targetres=<in A> Resolution to target in Angstroms in this refinement run. Do not be overoptimistic!
- Generally begin with something conservative like 25, then use --startfrom and reduce to ~12, only after that try for high (3-8 A). Data permitting, of course. Low resolution attempts will run MUCH faster due to more efficient parameters.
--speed=<1-7> Default=5. Larger values will run faster, with a coarser angular step. Smaller values will
- sample the angular step more finely than strictly required and increase sep=. A larger value is good for early refinement. A smaller value will push "gold standard" resolution towards its limit.
--sym=<symmetry> Symmetry to enforce during refinement (Cn, Dn, icos, oct, cub).
- Default=c1 (no symmetry)
--mass=<in kDa> Putative mass of object in kDa, but as desired volume varies with resolution
- actual number may vary by as much as ~2x from the true value. The goal is to have a good isosurface in the final map with a threshold of 1.0.
--parallel=<par spec> While not strictly required, without this option the refinement will run on a single CPU
- and you will likely wait a very long time. To use more than one core on a single computer, just say thread:N (eg - thread:4). For other options, like MPI, see:
  http://blake.bcm.edu/emanwiki/EMAN2/Parallel for details.
--threads=<ncpu> For some algorithms, processing in parallel over the network (MPI) works poorly.
- Running on multiple processors on a single machine may still be worthwhile. If you specify this option, in specific cases it will replace your specified --parallel option. Specify the number of cores that can be used on a single machine.
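
The staged approach recommended for --targetres (a conservative first run, then --startfrom with a tighter target) can be sketched as a planner that lists the commands in order. This is only an illustration: the function names, the file names and the refine_01, refine_02, ... directory names are assumptions, and --path is set explicitly here so that each stage knows where the previous one wrote its results.

```python
def staged_targets(final_res, stages=(25.0, 12.0)):
    """Conservative targets coarser than final_res, followed by final_res."""
    targets = [res for res in stages if res > final_res]
    targets.append(final_res)
    return targets


def staged_commands(input_file, model, final_res, run_dirs):
    """Return one argv list per stage; nothing is executed."""
    targets = staged_targets(final_res)
    if len(run_dirs) < len(targets):
        raise ValueError("need one output directory per stage")
    commands = []
    for i, res in enumerate(targets):
        if i == 0:
            # First stage: new refinement from the particles and a starting map.
            source = [f"--input={input_file}", f"--model={model}"]
        else:
            # Later stages continue from the previous stage's directory.
            source = [f"--startfrom={run_dirs[i - 1]}"]
        commands.append(
            ["e2refine_easy.py", *source, f"--targetres={res:g}", f"--path={run_dirs[i]}"]
        )
    return commands


# Hypothetical inputs and directory names.
plan = staged_commands(
    "sets/particles_flip.lst",
    "initial_models/model_00.hdf",
    final_res=4.0,
    run_dirs=["refine_01", "refine_02", "refine_03"],
)
for argv in plan:
    print(" ".join(argv))
```

Each stage would normally be inspected through its report before the next one is launched, so a plan like this is a starting point rather than something to run unattended.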

Optional:

--apix=<A/pix> The value will normally come from the particle data if present. You can override with this.
--sep=<classes/ptcl> each particle will be put into N classes. Improves contrast at cost of rotational blur.
--classkeep=<frac> fraction of particles to use in final average. Default 90%. Should be >50%
--m3dkeep=<frac> fraction of class-averages to use in 3-D map. Default=auto
--classautomask applies an automask when aligning particles for improved alignment
--m3dpostprocess <name>:<parm>=<value>:... An arbitrary processor
- (e2help.py processors -v2) to apply to the 3-D map after each iteration. Default=none
--path=<path> Normally the new directory will be named automatically. If you prefer your own convention
- you can override, but it may cause minor GUI problems if you break the standard naming convention.
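
Several options on this page (--parallel, --m3dpostprocess, --automask3d) take a value of the form <name>:<parm>=<value>:... The sketch below shows how such a string breaks down into a name and its parameters. It is an illustration of the syntax only, not the parser EMAN2 itself uses; the function name parse_spec and the "_positional" key are hypothetical.

```python
def parse_spec(spec):
    """Split 'name:key=value:key=value' into (name, options).

    Values are kept as strings. Tokens without '=' (such as the 4 in
    thread:4) are collected under the hypothetical key '_positional'.
    """
    name, *tokens = spec.split(":")
    options = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            options[key] = value
        else:
            options.setdefault("_positional", []).append(key)
    return name, options


print(parse_spec("thread:4"))
print(parse_spec("mask.auto3d:threshold=1.1:radius=30:nshells=5:nshellsgauss=5"))
```

Because the sketch splits on every colon, a value that itself contains a colon would be split incorrectly.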

Details about the refinement, and parameters which have automatically been selected are discussed in report/index.html

- Revision 3 as of 2016-10-20 14:22:54
  Size: 21054
  Editor: SteveLudtke
  Comment:
+ Revision 4 as of 2016-10-20 14:23:33
  Size: 15536
  Editor: SteveLudtke
  Comment:
 Line 36:
-|| ||medianshrink||bool||Downsamples the volume by a factor of n by computing the local median||
|| ||meanshrink||bool||Downsamples the volume by a factor of n by computing the local average||
|| ||meanshrinkbig||bool||Downsamples the volume by a factor of n without reading the entire volume into RAM. The output file (after shrinking) must fit into RAM. If specified, this must be the ONLY option on the command line. Any other options will be ignored. Output data type will match input data type. Works only on single image files, not stack files.||
|| ||scale||bool||Rescales the image by 'n', generally used with clip option.||
|| ||sym||bool||Symmetry to impose - choices are: c<n>, d<n>, h<n>, tet, oct, icos||
|| ||averager||bool||Averager used for --average and --sym options||
|| ||clip||bool||Make the output have this size by padding/clipping. 1, 3 or 6 arguments. ||
|| ||fftclip||bool||Make the output have this size, rescaling by padding FFT.||
|| ||process||bool||apply a processor named 'processorname' with all its parameters/values.||
|| ||apix||bool||A/pixel for S scaling||
|| ||origin||bool||Set the coordinates for the pixel (0,0,0) for Chimera. THIS HAS NO IMPACT ON IMAGE PROCESSING !||
|| ||mult||bool||Scales the densities by a fixed number in the output||
|| ||multfile||bool||Multiplies the volume by another volume of identical size. This can be used to apply masks, etc.||
|| ||matchto||bool||Match filtration of input volume to this specified volume.||
|| ||outmode||bool||All EMAN2 programs write images with 4-byte floating point values when possible by default. This allows specifying an alternate format when supported (int8, int16, int32, uint8, uint16, uint32). Values are rescaled to fill MIN-MAX range.||
|| ||outnorescale||bool||If specified, floating point values will not be rescaled when writing data as integers. Values outside of range are truncated.||
|| ||mrc16bit||bool||(deprecated, use --outmode instead) output as 16 bit MRC file||
|| ||mrc8bit||bool||(deprecated, use --outmode instead) output as 8 bit MRC file||
|| ||add||bool||Adds a constant 'f' to the densities||
|| ||addfile||bool||Adds the volume to another volume of identical size||
|| ||calcfsc||bool||Calculate a FSC curve between two models. Output is a txt file. This option is the name of the second volume.||
|| ||filtertable||bool||Applies a 2 column (S,amp) file as a filter in Fourier space, assumed 0 outside the defined range.||
|| ||calcsf||bool||Calculate a radial structure factor. Must specify apix.||
|| ||calcradial||bool||Calculate the radial density by shell. Output file becomes a text file. 0 - mean amp, 2 - min, 3 - max, 4 - sigma||
|| ||setsf||bool||Set the radial structure factor. Must specify apix.||
|| ||tophalf||bool||The output only keeps the top half map||
|| ||inputto1||bool||All voxels in the input file are set to 1 after reading. This can be used with mask.* processors to produce a mask file of the correct size.||
|| ||icos5fhalfmap||bool||The input is the icos 5f top half map generated by the 'tophalf' option||
|| ||outtype||bool||Set output image format, mrc, imagic, hdf, etc||
|| ||first||bool||the first image in the input to process [0 - n-1])||
|| ||trans||bool||Translate map by dx,dy,dz ||
|| ||resetxf||bool||Reset an existing transform matrix to the identity matrix||
|| ||align||bool||Align input map to reference specified with --alignref. As with processors, a sequence of aligners is permitted||
|| ||ralignzphi||string||Refine Z alignment within +-10 pixels  and phi +-15 degrees (for C symmetries), specify name of alignment reference here not with --alignref||
|| ||alignref||bool||Alignment reference volume. May only be specified once.||
|| ||alignctod||string||Rotates a map already aligned for C symmetry so the best 2-fold is positioned for specified D symmetry. Does not impose specified symmetry.||
|| ||rot||string||Rotate map. Specify az,alt,phi or convention:par=val:par=val:...  eg - mrc:psi=22:theta=15:omega=7||
|| ||icos5to2||bool||Rotate an icosahedral map from 5-fold on Z (EMAN standard) to 2-fold on Z (MRC standard) orientation||
|| ||icos2to5||bool||Rotate an icosahedral map from 2-fold on Z (MRC standard) to 5-fold on Z (EMAN standard)  orientation||
|| ||last||bool||the last image in the input to process||
|| ||swap||bool||Swap the byte order||
|| ||average||bool||Computes the average of a stack of 3D volumes||
|| ||append||bool||Append output image, i.e., do not write inplace.||
|| ||ppid||int||Set the PID of the parent process, used for cross platform PPID||
|| ||unstacking||bool||Process a stack of 3D images, then output a a series of numbered single image files||
|| ||tomoprep||bool||Produces a special HDF file designed for rapid interactive tomography annotation. This option should be used alone.||
||-v||verbose||bool||verbose level [0-9], higner number means higher level of verboseness||
|| ||step||string||Specify <init>,<step>. Processes only a subset of the input data. For example, 0,2 would process only the even numbered particles||
id> examples/extracthelp.py programs/e2refine_easy.py