e2refine2d

e2refine2d.py runs in much the same way as EMAN1's refine2d.py, though it has been improved in a number of subtle ways.

This program takes a set of boxed-out particle images and performs iterative reference-free classification to produce a set of representative class-averages. The point of this process is to reduce noise levels, so that the overall shape of the particle views present in the data can be better observed. Generally, cryo-EM single particles are noisy enough that it is difficult to distinguish subtle, or even not-so-subtle, differences between particle images. By aligning and averaging similar particles together, less noisy versions of representative views are created. The class-averages produced by this program are typically used for:

This last point can be used to produce 'population-dynamics' movies of a particle in very close to the same orientation.

This program is quite fast for as many as a few thousand particles and ~100 classes. For most purposes, if your data set is large (>10,000 particles), you might consider using only a subset of the data for speed, though this clearly isn't appropriate for the 3rd use above.
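One way to run on a subset is to list the unwanted particles in a file and pass it through the --exclude option described under Options. The following is a minimal sketch of generating such a file for a random subset. The function name, the output filename, the 0-based numbering, and the one-integer-per-line layout are all assumptions made for illustration; check them against your EMAN2 installation before relying on them.

```python
import random

def write_exclude_file(n_particles, n_keep, path, seed=0):
    """Write the indices of particles NOT in a random subset of size n_keep.

    Assumptions (not confirmed by this page): indices are 0-based and
    the file holds one integer per line.
    """
    if not 0 <= n_keep <= n_particles:
        raise ValueError("n_keep must be between 0 and n_particles")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep = set(rng.sample(range(n_particles), n_keep))
    excluded = [i for i in range(n_particles) if i not in keep]
    with open(path, "w") as fh:
        for i in excluded:
            fh.write(f"{i}\n")
    return excluded
```

For example, write_exclude_file(25000, 5000, "exclude.txt") would leave a random 5,000 of 25,000 particles in play; the resulting file could then be named in --exclude.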

Options:

--path

string

Path for the refinement, default=auto

--iter

int

The total number of refinement iterations to perform

--automask

bool

This will perform a 2-D automask on class-averages to help with centering. May be particularly useful for negative-stain data.

--input

string

The name of the file containing the particle data

--ncls

int

Number of classes to generate

--maxshift

int

Maximum particle translation in x and y

--naliref

int

Number of alignment references to use when determining particle orientations

--exclude

string

The named file should contain a set of integers, each representing an image from the input file to exclude.

--resume

bool

This will cause a check of the files in the current directory, and the refinement will resume after the last completed iteration. It's ok to alter other parameters.

--initial

string

File containing starting class-averages. If not specified, starting averages will be generated automatically.

--nbasisfp

int

Number of MSA basis vectors to use when classifying particles

--minchange

int

Minimum number of particles that change group before deciding to terminate. Default = -1 (auto)

--fastseed

bool

Will seed the k-means loop quickly, but may produce less consistent results.
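To make the role of --ncls and --minchange in a k-means loop concrete, here is a toy sketch of k-means with a "too few particles changed group" stopping rule. It is not e2refine2d.py's implementation: the function name, the max_iter cap, the random seeding, and the reading of --minchange as "stop when fewer than this many particles switch class in one pass" are assumptions for illustration only. In particular, the seeding shown here says nothing about what --fastseed actually does.

```python
import random

def kmeans(vectors, ncls, minchange=1, max_iter=50, seed=0):
    """Toy k-means over lists of floats.

    Stops when fewer than `minchange` vectors switch class in one pass,
    or after `max_iter` passes.
    """
    rng = random.Random(seed)
    # Seed the centers with randomly chosen input vectors
    centers = [list(v) for v in rng.sample(vectors, ncls)]
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        changed = 0
        for i, v in enumerate(vectors):
            # Assign each vector to the nearest center (squared Euclidean distance)
            best = min(
                range(ncls),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centers[k])),
            )
            if best != labels[i]:
                labels[i] = best
                changed += 1
        # Recompute each center as the mean of its members;
        # a class with no members keeps its previous center
        for k in range(ncls):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
        if changed < minchange:
            break
    return labels, centers
```

In this sketch the vectors stand in for the per-particle coefficients on the MSA basis (see --nbasisfp), so each vector would have one entry per basis vector.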

--simalign

string

The name of an 'aligner' to use prior to comparing the images (default=rotate_translate_flip)

--simaligncmp

string

The name and parameters of the comparator used by the first stage aligner (default=frc)

--simralign

string

The name and parameters of the second stage aligner which refines the results of the first alignment

--simraligncmp

string

The name and parameters of the comparator used by the second stage aligner (default=dot).

--simcmp

string

The name of a 'cmp' to be used in comparing the aligned images (default=frc:nweight=1)
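Several options on this page take a name followed by parameters, as in the default frc:nweight=1. The sketch below shows how such a string can be read as a name plus key=value pairs separated by colons. It is a hypothetical helper written for illustration, not the parser EMAN2 itself uses, and it keeps every value as a string.

```python
def parse_spec(spec):
    """Split 'name:key=value:key=value' into (name, {key: value}).

    Values are kept as strings; no type conversion is attempted.
    """
    name, *pairs = spec.split(":")
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key] = value
    return name, params

print(parse_spec("frc:nweight=1"))  # ('frc', {'nweight': '1'})
```

A bare name such as dot simply yields an empty parameter dictionary.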

--shrink

int

Optionally shrink the input particles by an integer factor prior to computing similarity scores. This is done purely for speed.

--classkeep

float

The fraction of particles to keep in each class, based on the similarity score generated by the --classcmp argument (default=0.85).

--classkeepsig

bool

Change the keep ('--classkeep') criterion from fraction-based to sigma-based.
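The difference between a fraction-based and a sigma-based keep criterion can be sketched as follows. Both functions are hypothetical illustrations, not EMAN2 code. They assume that a lower similarity score means a better match, and the sigma rule shown (keep scores no worse than the mean plus a multiple of the standard deviation) is one plausible reading of "sigma-based" rather than a statement of what --classkeepsig does internally.

```python
import statistics

def keep_by_fraction(scores, fraction=0.85):
    """Indices of the best `fraction` of particles (lower score = better, assumed)."""
    n_keep = int(round(fraction * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:n_keep])

def keep_by_sigma(scores, nsigma=1.0):
    """Indices whose score is at most mean + nsigma * standard deviation."""
    mean = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # population standard deviation
    cutoff = mean + nsigma * sigma
    return [i for i, s in enumerate(scores) if s <= cutoff]
```

The practical difference is that a fraction always discards the same share of each class, while a sigma cutoff discards only outliers, so a clean class may keep every particle.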

--classiter

int

Number of iterations to use when making class-averages (default=5)

--classalign

string

If doing more than one iteration, this is the name and parameters of the 'aligner' used to align particles to the previous class average.

--classaligncmp

string

This is the name and parameters of the comparator used by the first stage aligner. Default is dot.

--classralign

string

The second stage aligner which refines the results of the first alignment in class averaging. Default is None.

--classraligncmp

string

The comparator used by the second stage aligner in class averaging. Default is dot:normalize=1.

--classaverager

string

The averager used to generate the class averages. Default is 'mean'.

--classcmp

string

The name and parameters of the comparator used to generate similarity scores when class averaging. Default is frc.

--classnormproc

string

Normalization applied during class averaging

--classrefsf

bool

Use the setsfref option in class averaging to produce better filtered averages.

--normproj

bool

Normalizes each vector after it has been projected into the MSA subspace. Note that this is different from normalizing the input images, since the subspace is not expected to fully span the image.
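The distinction drawn for --normproj can be seen in a tiny numerical sketch: a unit-length image projected onto a basis that does not span it yields a coefficient vector shorter than 1, so normalizing the projection is a separate step from normalizing the image. The helper names, the 3-pixel "image", and the orthonormal 2-vector basis are all made up for illustration.

```python
import math

def project(image, basis):
    """Coefficients of `image` on each basis vector (basis assumed orthonormal)."""
    return [sum(p * b for p, b in zip(image, vec)) for vec in basis]

def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]

# A unit-length 3-pixel "image" and a 2-vector basis that does not span it
image = normalize([1.0, 2.0, 2.0])
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

coeffs = project(image, basis)    # shorter than 1: part of the image lies outside the subspace
unit_coeffs = normalize(coeffs)   # rescaled to unit length within the subspace
```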

-P

--parallel

string

Run in parallel, specify type:<option>=<value>:<option>=<value>

--dbls

string

Database list storage, used by the workflow. You can ignore this argument.

-v

--verbose

int

Verbosity level [0-9]; a higher number means more verbose output
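As a usage illustration, the options above can be assembled into a command line from a script. The sketch below only builds and prints the command; it does not run anything. The helper name and the filename particles.hdf are hypothetical, the parameter values are arbitrary, and the --name=value spelling is an assumption about how the options are passed.

```python
import shlex

def build_command(input_file, ncls, iterations, **extra):
    """Assemble an e2refine2d.py argument list from keyword options.

    A value of True becomes a bare flag; False and None are skipped;
    anything else becomes --name=value.
    """
    argv = [
        "e2refine2d.py",
        f"--input={input_file}",
        f"--ncls={ncls}",
        f"--iter={iterations}",
    ]
    for name, value in extra.items():
        if value is True:
            argv.append(f"--{name}")
        elif value is not False and value is not None:
            argv.append(f"--{name}={value}")
    return argv

argv = build_command("particles.hdf", 32, 8, shrink=2, automask=True)
print(shlex.join(argv))
# e2refine2d.py --input=particles.hdf --ncls=32 --iter=8 --shrink=2 --automask
```

Keeping the command as a list makes it straightforward to hand to subprocess.run later without worrying about shell quoting.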