EMAN2 Tomography Workflow Tutorial

Prepare input files (~2 minutes)

Project Manager

Tiltseries Alignment and Tomogram Reconstruction (20 min)

Alignment of the tilt-series is performed iteratively in conjunction with tomogram reconstruction. Tomograms are not normally reconstructed at full resolution; they are generally limited to 1k x 1k or 2k x 2k, while the tilt-series are aligned at full resolution. For high resolution subtomogram averaging, the raw tilt-series data is used, based on coordinates from particle picking in the downsampled tomograms. On a typical workstation, reconstruction takes about 4-5 minutes per tomogram.
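Because particles are picked in the downsampled tomogram but extracted from the full-resolution tilt-series, picked coordinates have to be scaled back up. The snippet below is only a minimal sketch of that arithmetic, not EMAN2 code: the function name to_full_res and the shrink argument are hypothetical, and it assumes both coordinate systems share the same origin so that a pure scaling is sufficient.

```python
def to_full_res(coord_binned, shrink):
    """Scale an (x, y, z) coordinate picked in a downsampled tomogram
    to full-resolution pixel units.

    Assumption: both coordinate systems share the same origin, so the
    conversion is a plain multiplication by the shrink factor.
    """
    return tuple(c * shrink for c in coord_binned)


# Example: a 4k x 4k tilt-series reconstructed as a 1k x 1k tomogram
# corresponds to a shrink factor of 4.
print(to_full_res((100, 250, 32), 4))
```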

For the tutorial tilt-series:

Tomogram reconstruction

When working with your own data:

CTF Estimation (10 min)

For the tutorial tilt-series:

When working with your own data:

Note that this program is only estimating CTF parameters, taking tilt into account. It is not performing any phase-flipping corrections on whole tomograms. CTF correction is performed later as a per-particle process. This process requires metadata determined during tilt-series alignment, so it cannot be used with tomograms reconstructed using other software packages.

Tomogram annotation (optional)

2D particle picking

This section is brief and is only an update to the more detailed tutorial: TomoSeg. Some of the directory structure and user interfaces have changed in the latest version to match the new tomogram workflow described here:

Particle picking (10-15 min)

3D particle picking

Particle extraction (2 min)

In this pipeline, the full 1k or 2k tomograms are used only as a reference to identify the location of the objects to be averaged. Now that we have particle locations, the software returns to the original tilt-series, extracts a per-particle tilt-series, and reconstructs each particle in 3-D independently.
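To illustrate what "returning to the original tilt-series" involves, here is a rough geometric sketch of where a picked 3-D position would fall in each tilt image. It is not the EMAN2 implementation: the function project_to_tilt is hypothetical, it assumes an ideal single-axis geometry (rotation about the y-axis only, projection along z, coordinates measured from the tomogram center), and the sign convention of the tilt angle is an assumption. The real program uses the full per-tilt alignment parameters.

```python
import math


def project_to_tilt(x, y, z, tilt_deg):
    """Project a 3-D point onto the image plane of one tilt.

    Assumptions (illustrative only): the specimen is rotated about the
    y-axis by tilt_deg, the projection is along z, and (x, y, z) is
    measured from the tomogram center.
    """
    t = math.radians(tilt_deg)
    x_img = x * math.cos(t) + z * math.sin(t)
    return (x_img, y)


# The same particle lands at a different x position in every tilt image.
for angle in (-60, 0, 60):
    print(angle, project_to_tilt(100.0, 50.0, 20.0, angle))
```

A small box cut out around each projected position, one per tilt image, would form the per-particle tilt-series that is then reconstructed in 3-D.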

For the tutorial tilt-series:

For your own data:

Initial model generation (10 - 60 min)

Intuitively, since the particles are already in 3-D, the concept of an "initial model" might seem unnecessary. Unfortunately, due to the missing wedge and the low resolution of an individual particle (particularly from cells), it is actually critical to make a good starting average, and historically this has been challenging, depending on the shape of the molecule. This new procedure, based on stochastic gradient descent, has proven to be quite robust, but it is difficult for the computer to tell when it has converged sufficiently. For this reason, the default behavior is to run much longer than is normally required and to have a human decide when the result is "good enough" and terminate the process. If you use a small shrink value and let it run to completion, it can take some time, but this is normally a waste.

For the tutorial tilt-series:

For your own data:

Template matching (5 min)

In this step, we will use the initial model you just produced as a template for finding all of the ribosomes in all 4 tomograms. If you completed the Tomogram Annotation step above, and have already extracted a full set of 1000+ particles, then you can skip this step, as we already have all of the particles.

Particle extraction (~1 hour)

Again, if you already did Tomogram Annotation above, this step isn't necessary. It is only required if you just did Template Matching.

Since this involves several thousand particles instead of 30-50, it will take quite a lot longer to run. The actual time will depend partially on the speed of your storage.

For the tutorial tilt-series:

Subtomogram refinement

3D refinement

Click 3D refinement from the left panel, and input both the particle set and the initial model generated in the last step as a reference. If the protein has symmetry, make sure the reference is aligned to the symmetry axis before specifying the correct symmetry.

If you are willing to split the particles into even/odd sets and do a “gold-standard” refinement, specify a resolution number (usually 30-50) in goldstandard, so that information beyond that resolution is randomized independently in the references for the even and odd sets. It is good to set a reasonable mass (the molecular weight of the protein in kDa) and tarres (the target resolution), but leaving them at their defaults usually does not hurt. If you have a known structure factor in a .txt file (you can compute it from a known structure via e2proc3d.py), specify it in setsf.

localfilter will filter the averaged map by local resolution, which is especially useful when looking at things in cells, where parts of proteins can be very flexible. This is almost always good to check when you want to push toward high resolution. pkeep controls the fraction of particles that go into the final average. If you know there are many bad particles in the dataset, setting it to a smaller number may help. Enter the number of threads you want to use in the thread option.

Finally, click Launch and wait. For this dataset, it can take a few hours on a decent workstation. The results can be seen in the spt_XX folder. In the folder, threed_XX.hdf files are the main output map after each iteration, and fsc_masked/unmasked/masktight_XX.txt files are the FSC curves between the even and odd half sets under different masking. You should be able to get to 12-15Å resolution (cutoff 0.143) at this step using this dataset.
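The idea behind the goldstandard option, randomizing information beyond a chosen resolution independently for each half set, can be sketched with NumPy as below. This is an illustration of the general phase-randomization technique and not EMAN2's own code: the function name randomize_phases_beyond and the apix and res_cutoff parameters are hypothetical, and the cutoff is assumed to be given in Å.

```python
import numpy as np


def randomize_phases_beyond(volume, apix, res_cutoff, seed=0):
    """Return a copy of `volume` whose Fourier phases are random at
    resolutions finer than `res_cutoff` (Å); amplitudes are unchanged.

    apix is the voxel size in Å. All names here are illustrative.
    """
    rng = np.random.default_rng(seed)
    ft = np.fft.fftn(volume)

    # Spatial frequency (1/Å) of every Fourier voxel.
    freqs = [np.fft.fftfreq(n, d=apix) for n in volume.shape]
    k0, k1, k2 = np.meshgrid(*freqs, indexing="ij")
    radius = np.sqrt(k0**2 + k1**2 + k2**2)

    # Phases taken from the transform of real-valued noise keep the
    # Hermitian symmetry needed for a real-valued result.
    noise_phase = np.angle(np.fft.fftn(rng.standard_normal(volume.shape)))

    beyond = radius > 1.0 / res_cutoff
    ft[beyond] = np.abs(ft[beyond]) * np.exp(1j * noise_phase[beyond])
    return np.fft.ifftn(ft).real


# Two different seeds give independently randomized references, one
# for the even half set and one for the odd half set.
vol = np.random.default_rng(1).standard_normal((16, 16, 16))
ref_even = randomize_phases_beyond(vol, apix=4.0, res_cutoff=30.0, seed=10)
ref_odd = randomize_phases_beyond(vol, apix=4.0, res_cutoff=30.0, seed=11)
```

Because the two references share information only below the cutoff, any agreement between the even and odd maps beyond that resolution has to come from the data rather than from the starting model.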

Subtilt refinement

Subtilt refinement directory

Once the subtomogram refinement finishes, check the final map and FSC curves. In this dataset, you should be able to achieve a resolution of 13-15Å. Now we can refine the orientation of each individual subtilt, i.e. the 2D particles from the raw tilt series that are reconstructed into the 3D particles, and push the resolution of the averaged map.
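When checking the FSC curves, the resolution at the 0.143 cutoff can be read off by finding where the curve first drops below that value. The following is a minimal sketch under stated assumptions: the function resolution_at_threshold is hypothetical, and it assumes you have already loaded the curve as two lists, spatial frequency in 1/Å and the FSC value at each frequency (the column layout of the FSC text files is assumed here, not confirmed by this tutorial).

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution in Å at which the FSC curve first drops
    below `threshold`, or None if it never does.

    freqs: spatial frequencies in 1/Å, increasing.
    fsc:   FSC value at each frequency.
    """
    for i in range(1, len(fsc)):
        if fsc[i] < threshold <= fsc[i - 1]:
            # Linear interpolation between the two bracketing points.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            s = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / s
    return None


# Synthetic curve for illustration only (not real tutorial data).
freqs = [0.02, 0.04, 0.06, 0.08, 0.10]
fsc = [0.99, 0.90, 0.50, 0.10, 0.02]
print(resolution_at_threshold(freqs, fsc))
```

Interpolating between the two bracketing points avoids reporting a resolution that is quantized to the sampling of the curve.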

Click Sub-tilt refinement, choose the folder of the last subtomogram refinement and launch the program. You will need to specify the path to the spt_XX directory containing the last completed subtomogram refinement (typically just “spt_00” for example). Additionally, specify the iter you want to use as a starting point for sub-tilt refinement. If “-1” is specified, the program will attempt to locate the last complete iteration.

The default parameters should generally be fine for this dataset, though you may need to alter the parallel and threads options to use the number of CPU threads available on your computer. The niters value corresponds to the number of iterations of sub-tilt refinement you wish to perform. keep controls the fraction of particles that go into the final map. If you are certain that tilt images beyond a certain angle (for example, 45 degrees) are radiation damaged, you can put 45 in maxalt and specify a larger keep number. Otherwise, just use keep 0.5, so the program will judge the quality of subtilt images by their correlation to the averaged map and exclude the worst 50% of 2D particles.
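The interplay of keep and maxalt described above can be pictured with a small selection routine. This is a hedged sketch rather than the program's actual logic: select_subtilts is a hypothetical name, it assumes a higher score means a better match to the averaged map, and it assumes the angle limit is applied before the keep fraction.

```python
def select_subtilts(subtilts, keep=0.5, maxalt=None):
    """Return the indices of the sub-tilt images to retain.

    subtilts: list of (tilt_angle_deg, score) pairs.
    Assumptions (illustrative): a higher score is better, and the
    maxalt angle limit is applied before the keep fraction.
    """
    candidates = [
        i for i, (angle, _score) in enumerate(subtilts)
        if maxalt is None or abs(angle) <= maxalt
    ]
    # Best-scoring images first.
    candidates.sort(key=lambda i: subtilts[i][1], reverse=True)
    n_keep = int(keep * len(candidates))
    return sorted(candidates[:n_keep])


subtilts = [(-60, 0.1), (-30, 0.5), (0, 0.9), (30, 0.6), (60, 0.2)]
# Score-based exclusion only: keep the better half.
print(select_subtilts(subtilts, keep=0.5))
# Exclude high tilts explicitly, then keep everything that remains.
print(select_subtilts(subtilts, keep=1.0, maxalt=45))
```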

Tomogram evaluation

This is a tool that helps you visualize your tomograms with their corresponding metadata, and launch other programs from it. It can be found via Analysis and visualization -> Evaluate tomograms. This can be used at any point of the workflow after tomogram reconstruction.

On the left is a list of tomograms in the project. Clicking the header of each column will sort the table by that attribute. #box is the number of boxes in the tomogram, loss is the average fiducial error in nm, and defocus is the average defocus of the tilt series. Do not be scared by large loss values here. Although the relative values for different tomograms (aligned with the same parameters) in the same project are correlated with tilt-series quality, the exact value is not as meaningful. You can still get a subnanometer resolution subtomogram average from tilt series with a loss larger than 5 nm.

On the right, the image at the top shows the center slice of the tomogram. The Show2D button shows the selected tomogram in slices, ShowTilts shows the corresponding raw tilt series, and Boxer calls the 3D boxer. PlotLoss will plot the fiducial error for each tilt, and PlotCtf will plot the defocus and phase shift at the center of each tilt image. Tiltparams is a bit more complicated. It plots a point list with 6 columns and a number of rows corresponding to the images in the selected tilt series. These are the alignment parameters for the tilt series. The columns represent the tilt ID, the translation along the x- and y-axes, and the tilt angles around the y-, x- and z-axes, respectively. You can adjust X Col and Y Col in the plot control panel (middle click the plot) to change the display. The first panel below the buttons lists the types of particles and their numbers in the dataset. Checking and unchecking the boxes will affect the number displayed in the #box column on the left. The last box is reserved for comments on each tomogram. You can fill in any comments you have for the selected tomogram, and they will be saved with the other metadata of the tomogram for future reference.

Refinement evaluation

This tool helps visualize and compare results from multiple subtomogram refinement runs. Launch it from Analysis and visualization -> Evaluate SPT refinement. In the GUI, you can look at all spt_XX and sptsgd_XX folders and compare their options and resulting maps. Switch between folder types using the menu at the top right. Click the header of a column to sort the table by its content. Uncheck items in the list at the bottom right to hide the corresponding columns. Clicking ShowBrowser will bring up the e2display.py browser in the folder of the selected row. PlotParams will plot the Euler angle distribution and other alignment parameters. The 8 columns in the plot are the three Euler angles (az, alt, phi), the translation in x, y, z, the alignment score, and the missing wedge coverage score. PlotFSCs will plot the FSC curve under the tight mask from each iteration.