The shape of things to come

Detailed image of a bacterial enzyme used to degrade a major organic pollutant.

Imagine you’re a long-suffering biologist, and imagine that your problem is figuring out the three-dimensional shape of a very important molecule. The solution could lead to (a) new insights into disease and potential therapies, and (b) career advancement. What if someone gave you virtually unlimited computer power that could crack that problem overnight?

A team at Children’s Hospital Boston has created a super-charged way of solving molecule shapes, harnessing idle scientific computer time across the country and around the world to survey vast reference databases – a “Google Shape” if you will.

It’s a brute-force solution in a field noted for its elegant findings. “Sometimes that’s the only way,” says Axel Brunger PhD, a Stanford structural biologist who edited the paper.

“It means you don’t have to be a rock-star scientist who has millions of dollars for computers,” adds computational scientist Ian Stokes-Rees PhD, who describes the souped-up system with structural biologist Piotr Sliz PhD in a paper published online Nov. 22 in PNAS.

Function follows form

To understand why this is a big deal, it helps to know a little more about structural biology. The intricate folds and twists in worker-bee proteins are precisely what keep us alive and well, make us sick and unhappy, and offer a possible template to design new therapies and vaccines.

In a recent example, another Children’s team found a weak link in HIV infection: during a fleeting period of time, just before the virus latches onto a cell, a protein on its surface assumes a certain shape that can be effectively targeted by neutralizing antibodies. Further research aims to exploit this vulnerability as a potential vaccine mechanism – keeping this shape around a bit longer to strengthen the antibody response.

Unfortunately, a molecule’s architecture can’t be viewed directly under a microscope, so researchers are forced to make large numbers of measurements with x-rays or other methods. Visually, the resulting x-ray diffraction data set looks like a highly ordered splat of paint. Through the magic of math, x-ray crystallographers eventually transform the spots into coordinates and components of a three-dimensional structure.

Here’s the tricky part: They have to anchor their data to a few known reference points to orient the molecule correctly in space. There are several ways to achieve this, but it takes time. “You typically expect to spend three to four months, if everything works properly,” says Sliz, also an assistant professor at Harvard Medical School.

Most often, researchers compare their data to a dozen or so handpicked molecules likely to have similar sequences of amino acids and similar structural components. Now, using the newly expanded computing power at their fingertips, researchers can compare their splats to the tens of thousands of molecules in the growing and publicly available databases of protein structures and folds.

“Google Shape”

This technique, called Wide Search Molecular Replacement, does much more than search: it helps make sense of the results, Stokes-Rees says.

Overnight service. For one enzyme, it took 14 hours to search all 95,000 different known protein folds (gray) to find 12 good comparisons (green) and 200 close mismatches (red).

The massive new molecular search-and-solve ability taps into an advanced computational infrastructure stretching from Caltech to São Paulo, developed by the same particle physics community that first established the World Wide Web for simple information sharing.

How fast is it? A single project might consume 20,000 to 60,000 grid computing hours that would occupy a desktop computer for three to five years. “We can cut that down to less than a day,” says Peter Doherty, manager of the Grid interface. Moreover, “we’re typically running 1,000 to 6,000 concurrent jobs.” The system is freely available to non-commercial researchers.
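For readers who want to check the arithmetic, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are not in the article: that the work divides perfectly evenly across jobs, that a "desktop computer" means a single core running around the clock, and that the low and high ends of the two quoted ranges pair up. The function names are made up for illustration. Under those assumptions the desktop figure comes out at a few years, in the same ballpark as the quoted estimate, and the grid figure lands under a day.

```python
# Back-of-the-envelope check of the quoted speed-up.
# Assumptions (not from the article): perfectly even division of work,
# one desktop core running non-stop, and low/high ends of the ranges paired.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year


def desktop_years(cpu_hours):
    """Years a single always-on desktop core would need for the workload."""
    return cpu_hours / HOURS_PER_YEAR


def grid_wall_clock_hours(cpu_hours, concurrent_jobs):
    """Elapsed hours if the workload splits evenly across concurrent jobs."""
    return cpu_hours / concurrent_jobs


if __name__ == "__main__":
    # (grid computing hours, concurrent jobs) taken from the quoted ranges
    scenarios = [(20_000, 1_000), (60_000, 6_000)]
    for cpu_hours, jobs in scenarios:
        years = desktop_years(cpu_hours)
        hours = grid_wall_clock_hours(cpu_hours, jobs)
        print(f"{cpu_hours:,} CPU hours: about {years:.1f} desktop-years, "
              f"or about {hours:.0f} hours on {jobs:,} concurrent jobs")
```

Real runs would be slower than this idealized split, since jobs wait in queues and finish at different times, but the order of magnitude is what matters here.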

As a bonus, it’s relatively green. A typical cluster of scientific computers spends half its time running at less than 10 percent capacity, but at 80 percent of the maximum power consumption. The extra molecular computing jobs take little additional electricity.

“This will change the way we do structural biology,” raves Jawdat Al-Bassam PhD, a postdoctoral fellow in a neighboring Children’s lab who is joining the faculty of the University of California, Davis, where he expects to tap into the new system. “This may lower the time it takes by months or years, in some cases.”