Question: What sort of problems will EMAN have with initial model bias? What if I can't get my structure to converge to something sensible in EMAN, but I can in another program?
Answer: This is largely answered in the question above relating to the resolution of reconstructions. If you are not using optimal options with the refine command, you may indeed not get optimal results. The basic options given in the EMAN tutorial do not tell the whole story. For example, if you continue to use classiter=8 for your high resolution work, you will never reach the highest possible resolution. Similarly, if you don't use the refine option to the refine command, you will end up slightly blurring your model in most cases. The bottom line is that to get optimal results, you must specify the optimal options.
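As a concrete illustration, a refine command using the options discussed on this page might look like the line below. The values are placeholders rather than recommendations, and options you would normally also need (masking, symmetry, CTF handling, etc.) are left out, so treat this as a sketch to be adapted to your own data and EMAN version:

   refine 10 ang=3 pad=96 classkeep=0.8 classiter=3 hard=25 refine

Here 10 is the number of refinement iterations, and the trailing refine keyword is the refine option mentioned in the answer above.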
Another Answer:
- The primary parameter affecting initial model bias is classiter. Generally, if you use at least 3 for classiter, your model bias problems should be quite limited. You can use larger values in the early rounds of refinement to eliminate any bias from the initial model (something in the range of 6-8 for 3 or 4 iterations is common), and as long as you use 3 after that, you should not see much noise bias creeping into your models (see the example command lines after this list).
- classkeep really only affects model/noise bias if you use very small values. Normally a classkeep value of 0.5 - 3 is appropriate, and in this range it should not contribute to model bias. If you make it much smaller than this (throwing away more particles), you will A) see the resolution of the structure get worse, since it will tend to throw away the higher resolution (lower contrast) data, and B) see noise bias begin creeping in.
- The hard limit (hard=) shouldn't have much of an impact on model bias.
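Putting the classiter advice into practice, the usual pattern is a few early iterations with a large classiter to remove any memory of the starting model, followed by further iterations with classiter=3. The two command lines below sketch that schedule; the values are illustrative, and whether a second refine run resumes from the maps produced by the first (and how its iteration count is interpreted) depends on your EMAN version, so check the refine documentation rather than copying these verbatim:

   refine 4 ang=6 classkeep=1.0 classiter=8 hard=25 pad=96 refine
   refine 8 ang=3 classkeep=0.8 classiter=3 hard=25 pad=96 refine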
A few other points if you have noise bias issues:
- Make sure your pad= parameter is large enough to minimize the Fourier reflections coming back into your model. This is a common source of the 'seeds' for noise bias.
- Use an appropriate angular step (ang=).
- classkeep is really only designed to eliminate 'obviously' bad particles. If you try to use it aggressively, it will get rid of particles you should be keeping.
- Don't use TOO much data. I know this sounds like an odd statement. Let me explain:
- Real features in your model will improve linearly with the number of particles included in a class-average.
- Noise-bias induced features will become stronger as the sqrt(number of particles).
- There is a limit to the SSNR at which the alignment routines will do something useful. Once the SSNR falls below that level, the alignments are likely to be driven by noise rather than signal, and noise bias will become a factor.
- Regardless of whether there is good signal present in your raw particle data, you can ALWAYS continue to improve the final resolution of a structure if you add enough particles. You can make a curve, based on the SSNR of the particle data, of how many particles will be required to achieve a given resolution (assuming perfect alignments, etc.). As long as your alignment routines are working well, your actual data should more or less follow this predicted behavior. When you find you have reached the point where suddenly a lot more than the predicted number of particles is required to achieve a resolution improvement, you have probably reached the noise bias limit (ie - sqrt(n) scaling), and any further improvements you get are probably unreliable. The question is how you can detect that this is the case (ie - whether you can trust the new high resolution features you see). Unfortunately there aren't any really good answers to this question. Maximum likelihood methods can provide one answer, but they are so sensitive to the noise model you use that it's hard to make quantitative arguments based on them.
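To put rough numbers on the scaling argument above (these are made up purely for illustration): if a class-average built from 1,000 particles is just good enough for reliable alignment, then quadrupling it to 4,000 particles strengthens the real features by a factor of 4 but the noise-bias induced features by only a factor of 2 (the square root of 4), so adding data is still a clear win. Under the perfect-alignment assumption, the SSNR of the average simply scales with the number of particles, so the predicted particle count for a given resolution is roughly the SSNR you need in that resolution shell divided by the per-particle SSNR there. Once the alignments start being driven by noise, however, additional particles mostly feed the sqrt(n) term, which is why the measured particles-versus-resolution curve bends away from that prediction.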