What sort of desktop computer should I get for EMAN2 reconstructions

People frequently ask what sort of computer they should get for EMAN2 (or other image processing work). This isn't a page of "minimum required specifications", but a page of recommendations for a high-end data processing workstation. EMAN2 will run even on a fairly basic 4 core laptop, and the tutorials are designed so they can run on most typical laptop computers. However, if you are doing a typical single particle reconstruction project with 300,000 particles and a 300x300x300 reconstruction, you aren't going to be using a laptop. The better the computer you buy, the faster your work will go. You could process even a large data set on a typical desktop computer _eventually_, but this could easily take 10-50x longer than on a proper workstation, and 100-1000x longer than on a Linux cluster. Expect to spend $3-5k for a minimal effective data processing workstation. If you plan to do a lot of work in the field, I would suggest getting a proper ~$5-15k workstation with a lot of RAM, many cores, at least one good Nvidia GPU and a lot of high performance storage.
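To give a feel for why a project of that size outgrows a laptop, here is a rough back-of-the-envelope sketch of the raw data volume. It assumes each pixel is stored as an uncompressed 32-bit float (4 bytes), which is an assumption made for illustration and not something stated on this page; the function names are hypothetical.

```python
# Rough data-size estimate for the example project in the text:
# 300,000 particles in 300x300 boxes and one 300x300x300 map.
# ASSUMPTION: 4 bytes per pixel (uncompressed 32-bit float).
BYTES_PER_PIXEL = 4


def particle_stack_bytes(n_particles, box_size, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of a stack of square 2-D particle images."""
    return n_particles * box_size * box_size * bytes_per_pixel


def volume_bytes(box_size, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one cubic 3-D reconstruction."""
    return box_size ** 3 * bytes_per_pixel


if __name__ == "__main__":
    stack = particle_stack_bytes(300_000, 300)
    volume = volume_bytes(300)
    print(f"particle stack: {stack / 1e9:.0f} GB")
    print(f"single volume:  {volume / 1e6:.0f} MB")
```

Under that assumption the particle stack alone is on the order of 100 GB, and it is read repeatedly during refinement, which is why storage speed and RAM matter as much as core count.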

(Almost) Timeless Recommendations

Since I won't always update this page every few months, let me give a few general tips:

Storage:

Update 2023

The 4090 cards can be purchased at sane prices again, and have really outstanding performance. I haven't benchmarked any of the higher-end multi-chip solutions. In my testing on real-world sorts of problems, FFTs of 256x256 images are about 4x faster on the 4090 than on the RTX Titan (2080 generation).

CPU suggestions haven't changed much; the high-end Threadrippers are still the leaders right now in my book.

NVMe storage has gotten a lot larger and faster. For $1000 you can get an 8 TB NVMe drive which will do >4 GB/s. Combined with some large drives for long-term storage, this is a great solution and really helps when dealing with large files like tomograms.

Update mid pandemic (2020)

64 Core AMD Threadrippers are now available, and are a great option if you can afford one.

The 3000 series Nvidia RTX graphics cards are a great option (particularly the 3090) if you can find them in stock. At present you can only get them on the secondary market (extremely overpriced).

If you are using the new subtomogram averaging pipeline heavily, make sure you get enough RAM. For a 64 core machine, 256 GB of RAM is generally a good idea (this meshes with the general advice above).

Update mid 2019

The new(ish) AMD Ryzen Threadripper 2990WX has 32 cores running at an unboosted 3 GHz. I have one and it is an excellent choice for CryoEM image processing: equivalent in performance to an equal number of Intel cores, but substantially cheaper. Other AMD chips are also good. I'm using it with a Gigabyte X399 DESIGNARE EX, which, aside from needing a BIOS update, has been doing well.

Make sure you get enough RAM. With 32 cores, target at least 64 GB of RAM, and if you plan to do serious tomography, at least 128 GB.
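As a quick illustration, the RAM figures above can be turned into a per-core rule of thumb. The two ratios below (2 GB/core in general, 4 GB/core for serious tomography) are inferred from the 32-core numbers in this update and are an assumption for illustration, not an official formula; the function name is hypothetical.

```python
# Per-core RAM rule of thumb, inferred from "32 cores -> at least 64 GB,
# or at least 128 GB for serious tomography".
# ASSUMPTION: the recommendation scales linearly with core count.
GB_PER_CORE_GENERAL = 2
GB_PER_CORE_TOMOGRAPHY = 4


def suggested_min_ram_gb(cores, tomography=False):
    """Return a minimum RAM target in GB for a machine with `cores` cores."""
    per_core = GB_PER_CORE_TOMOGRAPHY if tomography else GB_PER_CORE_GENERAL
    return cores * per_core


if __name__ == "__main__":
    for cores in (16, 32, 64):
        general = suggested_min_ram_gb(cores)
        tomo = suggested_min_ram_gb(cores, tomography=True)
        print(f"{cores} cores: {general} GB general, {tomo} GB tomography")
```

Treat the result as a floor rather than a target: large tomograms or large viruses at high resolution can need more memory regardless of core count.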

Hard drive advice hasn't really changed, though larger drives are now readily available.

Update mid 2017

For about $10k, I specced this Workform 3000.v6 at Silicon Mechanics (I'm not promoting a specific manufacturer):

Update late 2014

Earlier this year I updated my workstation to get something for <$10k that would be optimal for processing movies as well as single particle reconstruction for small-medium projects. We have purchased a couple of machines like this, and they are quite cost effective and perform very well. Here are the basic specs:

This was earlier in the year, so there are probably better processor choices now, but this machine performs very well. Disk performance is ~1.2 - 1.5 GB/sec with ~24 TB of storage. 16 cores (32 threads) with threading which actually works (~30% speed boost over using 16 threads). Total cost (self-assembled) was ~$8000. Some of the prices will have fallen since then.

Of course this is just representative, and you can likely get a vendor to build you one for about the same price.

Update late 2013

In the Intel lineup today, for a basic machine (<$2000), I would probably lean towards a single 6-core i7-3930K Sandy Bridge-E 3.2GHz (3.8GHz Turbo). If you want to go all-out, a machine with dual 8-core Xeon (E5-2690 Sandy Bridge-EP 2.90GHz) processors is currently at the high end of the lineup, but this will run you about $4000 just for the two CPUs, so you could easily hit $5000-7000 for a machine like this with a decent amount of RAM. My normal RAM recommendations haven't changed much: 2 GB/core is enough for most applications, but if you want to do tomography or deal with large viruses at high resolution, you may want 64+ GB (regardless of the number of cores). RAM is cheap enough that 64 GB isn't all that expensive.

For many projects, you can get away with relatively little in modern computer terms. My quad-core Mac laptop, for example, can refine a ribosome to ~12 Å resolution overnight very easily. It's when you start pushing for higher resolutions or larger structures that the computing needs really increase, and in such situations you are probably better off getting some time on a cluster instead of paying $10k for a super-duper workstation. My recommendation would be to get a high clock speed single CPU computer with 4 or 6 cores for desktop use.

SSD hard drives are (each) ~4x faster than traditional spinning drives. They have improved dramatically in recent years (as shown by their use in all current Mac laptops). They are still expensive, but much less so than they used to be. Many tasks in cryo-EM data processing, particularly with DDD movie data, are disk-limited, so you can improve the interactivity of your computer dramatically by at least supplementing your regular hard-drives with SSDs.

Two options for SSD use:

Regular spinning hard drives: Note that you can also get ~1GB/sec performance out of spinning drives if you get a large enough RAID array. For example if you get an 8 drive RAID and put high speed regular hard drives in it, and have a high speed interconnect (NOT SAN), you can get pretty good performance.

A note on GPU computing: EMAN2 does have support for GPUs available, however, there are many caveats:

Suggestion as of 3/20/2012

Sandy Bridge Xeons are now available, and I've been getting questions about which computer to get again. Note that Macs are still using the earlier Westmere technology. Anyway, here's a quick analysis:

Sandy-bridge Xeons (E5-2600 series) have finally become available, but aren't available in Macs yet. Certainly the Mac Pro loaded with 12 cores will give you the best available performance on a Mac right now. However, it is very far from the most cost-effective solution. So, it really depends on your budget and goals. Westmere still offers a decent price-performance ratio if you want dual CPUs. If you are happy with a single CPU, I'd say Core-i5's are actually the way to go (this is what I just set up in my home PC).

Here is a rough comparison of 3 machines I use:
Linux - 12 core Xeon X5675 (3.07 GHz, Westmere): Speedtest = 4100/core -> ~50,000 total (2 CPU ~$2880 total)
Mac - 12 core Xeon (2.66 GHz): Speedtest = 3000 -> ~36,000 total
Linux - 4 core i5-2500 (3.3 GHz+turbo): Speedtest = 6400 (turbo), 5600 (sustained) -> ~22,000 total (1 CPU ~$210)

The Sandy Bridge Xeons have only just been released, but as an example, here is an estimate for a dual 8 core system:
16 core E5-2690 (2.9 GHz): Speedtest (Estimated) = 5650 (turbo), 4950 (sustained) -> ~80,000 total (2 CPU ~$4050)

Now, the costs I gave above are just for the CPUs. If you wanted to build, for example, several of the Core i5 systems and use them in parallel, you'd need a motherboard, case, memory, etc. for each of them as well. A barebones Core i5 PC with 8 GB of RAM and a 2TB drive would run you ~$650.

If you built a 16 core system around the E5-2690:
$4050 - CPU
$600 - motherboard
$200 - case
$150 - power supply
$300 - 32 GB RAM
$500 - 4x 2TB drives (equivalent)

So ~$5800 for the (almost) equivalent 16 core machine vs $2600 for 4 of the 4-core i5 systems.

i.e., you pay ~2x for the privilege of having it all integrated into a single box. Of course, that buys you a bit of flexibility as well, and saves you a lot of effort in configuration, running in parallel, etc. It also gives you 32 GB of RAM on one machine, which can be useful for dealing with large volume data, visualization, etc.

On the Mac side, a 12-core 2.93 GHz Westmere system with 2 GB/core of RAM costs $8000 and would give a speedtest score of ~45,000, i.e. ~40% more expensive and 1/2 the speed of a single Linux box with the 16 core config, and 3x as expensive and 1/2 the speed of the Core i5 solution.
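The arithmetic behind this comparison can be reproduced with a short sketch. All prices and speedtest scores are the 2012 figures quoted above; the variable and function names are hypothetical, and treating four separate i5 boxes as 4x the score of one assumes perfect parallel scaling, which real jobs will not quite reach.

```python
# Price/performance comparison using the 2012 numbers quoted in the text.
e5_parts = {
    "CPUs (2x E5-2690)": 4050,
    "motherboard": 600,
    "case": 200,
    "power supply": 150,
    "32 GB RAM": 300,
    "4x 2TB drives": 500,
}
e5_total = sum(e5_parts.values())  # ~$5800, as stated in the text

# name -> (total cost in dollars, estimated total speedtest score)
# ASSUMPTION: four i5 boxes scale perfectly, so 4 x ~22,000.
systems = {
    "16-core E5-2690 box": (e5_total, 80_000),
    "4x Core i5 boxes": (4 * 650, 4 * 22_000),
    "12-core Mac": (8000, 45_000),
}


def dollars_per_kilopoint(cost, score):
    """Dollars spent per 1000 speedtest points (lower is better)."""
    return cost / (score / 1000)


if __name__ == "__main__":
    for name, (cost, score) in systems.items():
        ratio = dollars_per_kilopoint(cost, score)
        print(f"{name}: ${cost} for ~{score} points, ${ratio:.0f} per 1000 points")
```

The ratio ignores everything the single box buys you (shared memory, simpler setup), so it is only one input to the decision.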

Please keep in mind that this is just a quick estimate, and that actual prices can vary considerably, but as you can see, the decision you make will depend a lot on your goals and your budget.

Suggestion as of 12/1/2011

Obviously for large jobs you're going to need access to a Linux cluster, but regardless you will still need a desktop workstation.

A complete answer to the question depends a bit upon your budgetary constraints, or lack thereof. As you are probably aware, at the 'high end', computers become rapidly more expensive for marginal gains in performance. Generally speaking, we tend to build our own Linux boxes in-house rather than purchasing prebuilt ones, both as a cost-saving measure and to ensure future upgradability. Then again, there is nothing wrong with most available commercial prebuilt PCs as long as you get the correct components. For a minimal cost-effective workstation, I would suggest:

Hope that helps.

Note that these are just my own personal opinions, and do not represent an official recommendation from anyone other than myself. Your mileage may vary.

EMAN2/FAQ/Computer (last edited 2023-08-31 17:58:11 by SteveLudtke)