Particle Box Size and Speed

For those who don't like to read, here is the list of good box sizes for traditional single particle analysis:

32, 33, 36, 40, 42, 44, 48, 50, 52, 54, 56, 60, 64, 66, 70, 72, 81, 84, 96, 98, 100, 104, 105, 112, 120, 128, 130, 132, 140, 150, 154, 168, 180, 182, 192, 196, 208, 210, 220, 224, 240, 250, 256, 260, 288, 300, 330, 352, 360, 384, 416, 440, 448, 450, 480, 512

The following sizes from this list also work well with shrinking:

64, 72, 96, 120, 128, 154, 168, 196, 208, 224, 240, 256, 384, 448, 480, 512

For single particle tomography, the list of "good boxes" is a bit different:

12, 13, 14, 15, 16, 17, 20, 21, 22, 25, 26, 28, 32, 33, 35, 36, 40, 42, 44, 45, 48, 49, 50, 52, 54, 56, 60, 64, 65, 66, 70, 72, 75, 77, 78, 80, 81, 84, 88, 91, 96, 98, 100

Note that if a number is on the list, then 2x the number also tends to be on the list. Since you often use 'shrink=2' when processing, it's a good idea to pick a value that is twice one of the numbers on the above list.
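As a minimal sketch of that advice, the snippet below filters the tomography list down to sizes that are on the list themselves and whose shrunk size is on it too. The names SPT_GOOD_SIZES and good_with_shrink are made up for this illustration and are not part of EMAN2.

```python
# Single particle tomography "good box" sizes, copied from the list above.
SPT_GOOD_SIZES = [
    12, 13, 14, 15, 16, 17, 20, 21, 22, 25, 26, 28, 32, 33, 35, 36, 40, 42,
    44, 45, 48, 49, 50, 52, 54, 56, 60, 64, 65, 66, 70, 72, 75, 77, 78, 80,
    81, 84, 88, 91, 96, 98, 100,
]


def good_with_shrink(sizes, shrink=2):
    """Return the sizes that are listed and whose size after shrinking is also listed.

    Hypothetical helper for illustration only; not an EMAN2 function.
    """
    known = set(sizes)
    return [n for n in sizes if n % shrink == 0 and n // shrink in known]


print(good_with_shrink(SPT_GOOD_SIZES))
```

A size such as 64 passes because 32 is also listed, while 12 does not because 6 is not on the list.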

These sizes are less well tested, but also probably good:

540, 576, 600, 625, 640, 648, 675, 720, 729, 750, 768, 800, 810, 864, 900, 960, 972, 1000, 1024, 1080, 1125, 1152, 1200, 1215, 1250, 1280, 1296, 1350, 1440, 1458, 1500, 1536, 1600, 1620, 1728, 1800, 1875, 1920, 1944, 2000, 2025, 2048, 2160, 2187, 2250, 2304, 2400, 2430, 2500, 2560, 2592, 2700, 2880, 2916, 3000, 3072, 3125, 3200, 3240, 3375, 3456, 3600, 3645, 3750, 3840, 3888, 4000, 4050, 4320, 4374, 4500, 4608, 4800, 4860, 5000, 5120, 5184, 5400, 5625, 5760, 5832, 6000, 6075, 6144, 6250, 6400, 6480, 6750, 6912, 7200, 7290, 7500, 7680, 7776, 8000, 8100


Various algorithms in EMAN2 depend non-linearly on the box size of the particle. Sometimes (as is the case with FFTs), this behavior will appear bizarre. For example, refinements with a box size of 45 pixels will run roughly twice as fast as those with a box size of 47, and 44 is about 20% faster than 45.
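Part of the reason is that FFT libraries are generally fastest when the transform length factors into small primes. The sketch below simply prints the prime factors of the three sizes mentioned above; prime_factors is a helper written for this illustration, not an EMAN2 function.

```python
def prime_factors(n):
    """Return the prime factors of n (with repetition) using trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever remains after dividing out the small factors is itself prime.
        factors.append(n)
    return factors


for box in (44, 45, 47):
    print(box, prime_factors(box))
```

47 is prime, which fits its poor timing. Factorization alone does not explain every difference, though (44 contains a factor of 11 yet is faster than 45), so the measured timings remain the thing to trust.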

Please also remember that for accurate CTF correction, the box size should be 1.5 - 2x the smallest box that will just contain your particle.
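The arithmetic behind that rule can be sketched as follows. The function name ctf_box_range and its parameters are assumptions made for this example; particle_size is the edge length in pixels of the smallest box that just contains the particle.

```python
import math


def ctf_box_range(particle_size, low=1.5, high=2.0):
    """Return (min_box, max_box) in pixels for the 1.5x to 2x rule of thumb.

    Hypothetical helper for illustration only; not an EMAN2 function.
    """
    return math.ceil(particle_size * low), math.floor(particle_size * high)


print(ctf_box_range(100))
```

For a particle that just fits in 100 pixels this gives a range of 150 to 200, within which a listed size such as 168 or 192 would be a reasonable pick.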

The following plot shows how long it takes to compute one similarity matrix element for a noisy particle aligned to a noise-free reference with the rotate-translate-flip aligner, refine alignment enabled with the dot comparator, and a phase residual for a similarity metric, i.e., typical options for a real refinement:

[Figure: rel_time.jpg]

Clearly there are some good box sizes, and some very bad box sizes.

A better way to plot this is with respect to the anticipated speed for an N^2 algorithm. This is the reciprocal of the timing in the same plot after dividing by the box size squared, normalized so that 512 is 1. That is, larger values indicate better relative speeds. Of course, 103 is still faster than 512 in absolute terms, but if you look in a local neighborhood for a peak, that will correspond to a good box size to use.
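One reading of that description, written out as a formula. The symbols are assumptions introduced here: t(N) is the measured time for box size N and r(N) is the plotted relative speed.

```latex
r(N) = \frac{N^{2} / t(N)}{512^{2} / t(512)}
```

Under this reading r(512) = 1 by construction, and a box size whose time grows more slowly than N^2 relative to 512 gets a value above 1.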

Of course, it is very difficult to read actual values off that plot. The original timing data can be downloaded as profile.txt

From this plot, we can compute when using a larger box size is better. For example, if you have a box size of 482, your refinement would actually run faster with a box size of 512, even though it's larger. So, when picking a box size, you can optimize your speed by rounding up to a value from this list:

32, 33, 36, 40, 42, 44, 48, 50, 52, 54, 56, 60, 64, 66, 70, 72, 81, 84, 96, 98, 100, 104, 105, 112, 120, 128, 130, 132, 140, 150, 154, 168, 180, 182, 192, 196, 208, 210, 220, 224, 240, 250, 256, 260, 288, 300, 330, 352, 360, 384, 416, 440, 448, 450, 480, 512

Also note that if you are using shrink=, it's a good idea to confirm that your box size divided by the shrink value is also in this list.
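Both checks are easy to script. The following is a minimal sketch, assuming the list above is copied in as GOOD_SIZES; next_good_box and shrink_is_good are hypothetical helper names, not EMAN2 functions.

```python
import bisect

# Good box sizes for single particle analysis, copied from the list above.
GOOD_SIZES = [
    32, 33, 36, 40, 42, 44, 48, 50, 52, 54, 56, 60, 64, 66, 70, 72, 81, 84,
    96, 98, 100, 104, 105, 112, 120, 128, 130, 132, 140, 150, 154, 168, 180,
    182, 192, 196, 208, 210, 220, 224, 240, 250, 256, 260, 288, 300, 330,
    352, 360, 384, 416, 440, 448, 450, 480, 512,
]


def next_good_box(size, good_sizes=GOOD_SIZES):
    """Round size up to the nearest listed box size (unchanged if already listed)."""
    index = bisect.bisect_left(good_sizes, size)
    if index == len(good_sizes):
        raise ValueError(f"{size} is larger than every size on the list")
    return good_sizes[index]


def shrink_is_good(box, shrink, good_sizes=GOOD_SIZES):
    """Check that box divides evenly by shrink and the shrunk size is listed."""
    return box % shrink == 0 and box // shrink in good_sizes


print(next_good_box(482))      # 512, matching the example above
print(shrink_is_good(512, 2))  # True, since 256 is on the list
```

Sizes above 512 raise an error here because this sketch only knows the well-tested list; the less well tested sizes could be appended to GOOD_SIZES if needed.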

EMAN2/BoxSize (last edited 2021-10-15 17:30:25 by SteveLudtke)