Making Simulated Single Particle Data

If your goal is to develop software and have a good test data set where you know the ground truth, just give up now. All simulated data sets I have ever seen fall far short of real data. That is, any reasonable algorithm will perform extremely well even with very noisy simulated data, far better than it would with real data.

However, if your goal is to better understand how the software works by using artificial but somewhat realistic simulated data, or if you need to perform initial testing on algorithms to make sure they at least work with simulated data, then this is the page to read.

Making simulated data isn't all that difficult. If you wish to start with a structure from the EMDB instead of the PDB, you can jump directly to the second step. Otherwise, the first step is to convert the PDB model into a density map:

e2pdb2mrc.py <input.pdb> <output.hdf> --center --apix <apix> --res <resolution> --box <boxsize>

You will need to pick a good box size and A/pix value. Note that resolution here refers to the half-width of a Gaussian blurring operation. This is not at all what resolution is in CryoEM, where it is a measure of noise level. Generally, the resolution value should be a larger number than 2*A/pix.
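
As a rough illustration of the 2*A/pix rule, here is a minimal Python sketch that checks a chosen resolution value against the sampling and then assembles the e2pdb2mrc.py command line shown above. The function names (min_res, build_pdb2mrc_command), the file names and the numeric values are made up for this example and are not part of EMAN2. Only the flags already listed above are used.

```python
def min_res(apix):
    """Return 2*apix, the value the --res argument should exceed (in Angstroms)."""
    return 2.0 * apix


def build_pdb2mrc_command(pdb_path, out_path, apix, res, box):
    """Assemble the e2pdb2mrc.py argument list (hypothetical helper, not an EMAN2 function).

    Raises ValueError if res is not larger than 2*apix.
    """
    limit = min_res(apix)
    if res <= limit:
        raise ValueError(
            "res=%g should be larger than 2*apix=%g" % (res, limit)
        )
    return [
        "e2pdb2mrc.py", pdb_path, out_path,
        "--center",
        "--apix", str(apix),
        "--res", str(res),
        "--box", str(box),
    ]


# Example values are assumptions chosen only for illustration.
cmd = build_pdb2mrc_command("input.pdb", "output.hdf", apix=1.5, res=6.0, box=128)
print(" ".join(cmd))
```

With 1.5 A/pix the limit is 3 A, so a value of 6 passes the check, while a value of 2.5 would be rejected. The resulting list can be passed to subprocess.run if you want to launch the program from a script.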

e2project3d.py <input.hdf> --output