Differences between revisions 1 and 2

refine2d.py

This program performs 2-D refinement of particle sets. No 3-D references or models are used.

Usage

refine2d.py [--iter=<iterations>] [--ninitcls=<# initial classes>] [--finalsep=<# split each class>] [--minptcl=<min ptcl/class>] [--proc=<# processors>] [--ctfcw=<SF file for ctf cor>] [--nosvd] [--nbasis=<# basis images>] [--logptcl]
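As a minimal sketch of how the options above combine into a command line, the helper below assembles an argument list in Python. The function name `build_refine2d_command` and the file name `particles.hed` are hypothetical, and the position of the input file (first argument) is an assumption, since the usage line does not show it.

```python
def build_refine2d_command(input_file, iterations=None, ninitcls=None,
                           finalsep=None, minptcl=None, proc=None,
                           ctfcw=None, nbasis=None, nosvd=False,
                           logptcl=False):
    """Return an argument list for refine2d.py built from the documented options."""
    # Assumption: the input particle stack is passed as the first argument.
    cmd = ["refine2d.py", input_file]
    valued = [("iter", iterations), ("ninitcls", ninitcls),
              ("finalsep", finalsep), ("minptcl", minptcl),
              ("proc", proc), ("ctfcw", ctfcw), ("nbasis", nbasis)]
    for name, value in valued:
        if value is not None:
            cmd.append("--%s=%s" % (name, value))
    # Flags without a value are appended only when requested.
    if nosvd:
        cmd.append("--nosvd")
    if logptcl:
        cmd.append("--logptcl")
    return cmd


cmd = build_refine2d_command("particles.hed", iterations=10, ninitcls=100, proc=4)
print(" ".join(cmd))
# prints: refine2d.py particles.hed --iter=10 --ninitcls=100 --proc=4
```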

Parameters

--version	show program's version number and exit
-h, --help	show this help message and exit
--debug	debugging output

--iter=ITER	Number of refinement iterations
-I NINITCLS, --ninitcls=NINITCLS	Number of initial classes for alignment iterations
-F FINALSEP, --finalsep=FINALSEP	number of additional class subsplits in final iteration
--minptcl=MINPTCL	Minimum number of particles in a final class-average
-P PROC, --proc=PROC	Processors to use
-C CTFCW, --ctfcw=CTFCW	Structure factor file for full CTF correction
-B NBASIS, --nbasis=NBASIS	Number of basis vectors to use in classification
--nosvd	Use straight k-means for classification instead of SVD based vectorization
--nofinalsort	Do not sort the final class-averages (this can be very slow)
--logptcl	Makes a logfile containing the identity of the class-average for each particle

Description

refine2d.py performs 2-D refinement of a stack of particles with no reference to 3-D models, with or without CTF correction. The overall process is:

1. make a small set of initial rough class-averages using startnrclasses (these are not intended to be good)
2. align each particle to each class-average, and keep the alignment from the best match (particles are aligned, not classified)
3. perform SVD (http://en.wikipedia.org/wiki/Singular_value_decomposition) on the set of particles (for this purpose, equivalent to MSA)
4. project each particle into the SVD basis, and perform k-means classification on the result
5. make new averages, and sort/align them
6. iterate (to step 2)
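The SVD and k-means steps can be illustrated with a toy sketch in plain Python. This is not the EMAN implementation: particles are assumed to be already aligned and flattened to short vectors, the leading basis vectors are approximated by power iteration rather than a full SVD, and the names `leading_basis` and `kmeans` are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leading_basis(particles, nbasis, sweeps=200, seed=0):
    """Approximate the nbasis leading right-singular vectors of the particle
    matrix (one flattened particle per row) by power iteration with deflation."""
    rng = random.Random(seed)
    dim = len(particles[0])
    basis = []
    for _ in range(nbasis):
        v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(sweeps):
            # Apply X^T X to v without forming the matrix explicitly.
            coeffs = [dot(row, v) for row in particles]
            w = [sum(c * row[j] for c, row in zip(coeffs, particles))
                 for j in range(dim)]
            # Deflate: remove components along basis vectors already found.
            for b in basis:
                c = dot(w, b)
                w = [x - c * y for x, y in zip(w, b)]
            norm = dot(w, w) ** 0.5
            if norm < 1e-12:
                return basis  # the data has lower rank than nbasis
            v = [x / norm for x in w]
        basis.append(v)
    return basis


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: two groups of "particles" that differ in which pixel is bright.
particles = [
    [1.0, 0.1, 0.0, 0.0], [0.9, 0.0, 0.1, 0.0], [1.1, 0.0, 0.0, 0.1],
    [0.0, 0.1, 1.0, 0.0], [0.1, 0.0, 0.9, 0.0], [0.0, 0.0, 1.1, 0.1],
]
basis = leading_basis(particles, nbasis=2)
# Step 4: project each particle into the basis, then classify the coefficients.
coords = [[dot(p, b) for b in basis] for p in particles]
labels = kmeans(coords, k=2)
print(labels)
```

Classifying the projected coefficients instead of the raw pixels keeps the k-means input small (nbasis values per particle), which is the point of the vectorization step; the --nosvd option corresponds to running k-means on the particles directly.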

After several iterations this will produce a very robust set of class-averages without any requirement that they form a consistent 3-D model. This is a very good way to test for heterogeneity among your particles, and it also serves as a cross-check to ensure that the results of a 3-D refinement agree with the original data (projections of the 3-D model should look like the class-averages from refine2d).
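One simple way to quantify "should look like" in the cross-check is a correlation score between a model projection and each class-average. The sketch below is only an illustration under strong assumptions: images are already aligned and flattened to equal-length lists, and the names `correlation` and `best_match` are hypothetical rather than part of EMAN.

```python
def correlation(a, b):
    """Pearson correlation between two equally sized, flattened images."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = (sum(x * x for x in da) * sum(y * y for y in db)) ** 0.5
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def best_match(projection, class_averages):
    """Return (index, score) of the class-average most similar to the projection."""
    scores = [correlation(projection, avg) for avg in class_averages]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy 1-D "images": the projection resembles the first class-average.
class_averages = [[0, 1, 2, 1, 0], [2, 1, 0, 1, 2]]
projection = [0.1, 0.9, 2.1, 1.0, 0.0]
index, score = best_match(projection, class_averages)
print(index, round(score, 3))
```

A projection whose best score stays low against every class-average would be a hint that the 3-D model and the 2-D averages disagree.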

Running refine2d.py

refine2d.py is best run in an empty directory, as many intermediate files are created. Unlike refine, refine2d cannot resume an interrupted refinement in the middle. Each time you run the command, it starts from scratch. The input file may have any name. The program works well with phase-flipped images, with or without CTF correction enabled. If CTF correction is used, it is applied only at the very end of processing, and both corrected and uncorrected averages are produced. Output files are as follows:

iter.final.hed, iter.final.ctfc.hed, iter.final.sort.hed - The final results of the refinement, with or without ctf correction, and with or without a final sort
iter.*.hed - The results after each iteration
basis.*.hed - The basis sets (SVD results) after each iteration
cls???? - directories used for 'finalsep' if present
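To sort through the files left in the working directory, the patterns above can be applied with Python's `fnmatch` module. This is a minimal sketch: the grouping keys are made up for illustration, and the example names `iter.1.hed`, `basis.1.hed` and `cls0000` are assumptions about what the wildcards expand to.

```python
import fnmatch

# Patterns taken from the list above; more specific patterns come first.
OUTPUT_PATTERNS = {
    "final": "iter.final*.hed",
    "per_iteration": "iter.*.hed",
    "basis": "basis.*.hed",
    "finalsep_dirs": "cls????",
}


def group_outputs(names):
    """Group directory entries by the first output pattern they match."""
    groups = {key: [] for key in OUTPUT_PATTERNS}
    for name in sorted(names):
        for key, pattern in OUTPUT_PATTERNS.items():
            if fnmatch.fnmatchcase(name, pattern):
                groups[key].append(name)
                break
    return groups


# Hypothetical directory listing after a run.
names = ["iter.final.hed", "iter.final.ctfc.hed", "iter.final.sort.hed",
         "iter.1.hed", "basis.1.hed", "cls0000", "notes.txt"]
groups = group_outputs(names)
for key, members in groups.items():
    print(key, members)
```

In practice the list of names could come from `os.listdir()` on the directory where refine2d.py was run.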
--iter=<n> : The number of iterations to use depends on the data. For less noisy or more homogeneous data, 4-5 iterations are likely sufficient. A typical value is 10.
--ninitcls=<n> : The number of classes to generate in each iteration, until the very last iteration. Typically you want to have at least 10-20 particles per class. Much larger numbers are also fine; for example, if you have 100,000 particles, making 100 classes is fine. Large numbers of classes will, of course, slow the refinement down proportionally.
--finalsep=<nsplit> : In the final iteration, each class can (optionally) be split into several subclasses. The final result will be (ninitcls * finalsep) classes, minus any bad classes. A 'bad class' is one with too few particles in it.
--minptcl=<n> : If a class has fewer than n particles it won't be included in the final results.
--proc=<n> : Number of processors to use during processing. Parallelism works the same way it does for all other EMAN1 programs.
--ctfcw=<sffile> : Enables CTF correction on the final results, using the structure factor file (sffile) for filtration, as in the refine program.
--nbasis=<n> : Number of basis images to use for classification. Normally the default value is fine. In older versions of refine2d.py this option was broken. If you want to experiment with this, please use a post-1.8 snapshot version of EMAN.
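The arithmetic behind --ninitcls, --finalsep and --minptcl can be sketched as follows. The helper names and the example class sizes are hypothetical; only the relationships (particles per class, ninitcls * finalsep, and dropping classes with fewer than minptcl particles) come from the descriptions above.

```python
def particles_per_class(nptcl, ninitcls):
    """Average number of particles per class for a given class count."""
    return nptcl / ninitcls


def max_final_classes(ninitcls, finalsep):
    """Upper bound on final classes before bad classes are removed."""
    return ninitcls * finalsep


def surviving_classes(class_sizes, minptcl):
    """Keep classes with at least minptcl particles; the rest are 'bad classes'."""
    return [size for size in class_sizes if size >= minptcl]


# 100,000 particles in 100 classes is well above 10-20 particles per class.
print(particles_per_class(100000, 100))   # 1000.0
print(max_final_classes(100, 3))          # 300

# Hypothetical sizes of five final classes with --minptcl=10.
sizes = [40, 3, 25, 9, 12]
kept = surviving_classes(sizes, minptcl=10)
print(len(kept), len(sizes) - len(kept))  # 3 2
```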

Revision 1 as of 2007-08-20 18:03:31 (Size: 4672, Editor: SteveLudtke)
Revision 2 as of 2008-11-26 04:42:29 (Size: 4674, Editor: localhost, Comment: converted to 1.6 markup)