What is Folding and Why Does it Matter?

Author: Ben Hardwidge

June 15, 2009 | 10:07

Tags: #alzheimers #distributed-computing #folding #foldinghome #supercomputer

Companies: #bit-tech #research

What is Folding?

Studying how and why proteins misfold has been a major challenge for medical research scientists for decades, and Scheraga used some of the early forms of computational modelling of the process. "I was among those who started the field," says Scheraga. "We had limited computer time, and we could work with small chains. What is small? Five amino acids, and then we go up to ten and so on."

Scheraga now has his own supercomputer in his lab, featuring 800 CPUs, and his colleagues have developed the code to calculate the total energy of a string of up to 1,000 amino acids. That is a lot of computing power, but it’s still not enough: he has to apply for additional CPU time at national computing centres in the USA and even in Germany.

The term ‘distributed computing’ is used because rather than running the project on a giant supercomputer at Stanford University, the simulations are divided into chunks known as work units (WUs), which are sent over the Internet to the PCs of people running the Folding@home software. This software uses your PC’s CPU(s) and GPU(s) to process the work.

Once the processing is complete, the finished WUs are sent back to Stanford and, in return, you’re credited with a number of points (an arbitrary scoring system), so you can check how much you have contributed to the project. Unlike a pharmaceutical company’s privately run research project, once Stanford has received enough processed WUs to complete a project, it publishes its findings on its website and in medical journals.
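To make the work unit idea concrete, here is a minimal toy sketch in Python of dividing a job into chunks and crediting contributors for the chunks they return. It is not Stanford's actual server code: the names `WorkUnit`, `split_into_work_units` and `credit_points`, the step counts and the flat points-per-WU value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    """One chunk of a larger simulation (hypothetical structure)."""
    project_id: int
    index: int
    steps: int  # number of simulation steps in this chunk


def split_into_work_units(project_id, total_steps, steps_per_wu):
    """Divide total_steps into chunks of at most steps_per_wu steps."""
    units = []
    start = 0
    index = 0
    while start < total_steps:
        steps = min(steps_per_wu, total_steps - start)
        units.append(WorkUnit(project_id, index, steps))
        start += steps
        index += 1
    return units


def credit_points(completed, points_per_wu):
    """Tally points per contributor from (contributor, WorkUnit) pairs.

    A flat points value per WU is an assumption; the real scoring differs.
    """
    totals = {}
    for contributor, _wu in completed:
        totals[contributor] = totals.get(contributor, 0) + points_per_wu
    return totals


# Example: a 1,000-step job split into chunks of up to 300 steps.
units = split_into_work_units(project_id=1, total_steps=1000, steps_per_wu=300)
returned = [("alice", units[0]), ("bob", units[1]), ("alice", units[2])]
print([u.steps for u in units])      # [300, 300, 300, 100]
print(credit_points(returned, 10))   # {'alice': 20, 'bob': 10}
```

The sketch only shows the bookkeeping: chunks go out, results come back, and each returned chunk adds to a contributor's score.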

Simulating protein folding demands a colossal amount of computer time, even on large-scale supercomputers. However, Scheraga points out that this is "the problem that Vijay Pande has solved" with the Folding@home project. By using the spare clock cycles of CPUs and GPUs on Internet-connected computers all over the world, Folding@home now has a tremendous amount of processing power. It’s a concept that’s commonly known as distributed computing, and it’s perhaps best known from SETI@home, which uses distributed computing power to look for aliens.

Scheraga doesn’t use Folding@home for his own work, but this is mainly because he likes to have control over the process. "I just don’t know how my code will function on other computers," he says. "I’ve collaborated with lots of people in my years in science, and I’ve always found that such collaborations don’t lead to results in a reasonable amount of time."

What’s in a work unit?

So what exactly makes up a Folding@home work unit? The work varies massively between work units, but your computer will usually carry out only a very small part of a folding process.

In fact, the founder and director of Stanford’s Folding@home project, Vijay Pande, told us that a CPU client with a slower protein might only calculate "one millionth of the process". When we talk about the speed of a protein, we’re simply talking about the time that a protein takes to fold.

Pande points out that proteins fold on a timescale measured in microseconds or milliseconds, and that a Folding@home work unit could represent "somewhere between a nanosecond and a microsecond of overall dynamics". As such, Pande says that a work unit processed by the GPU client "could get close to folding a whole process on a very fast protein".
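The figures quoted here can be checked with some rough arithmetic. The Python sketch below assumes a work unit covers a fixed slice of simulated time and counts how many such slices span one complete fold. The function name `work_units_needed` is hypothetical, and the timescales are simply the ends of the ranges Pande mentions, expressed in whole nanoseconds to keep the arithmetic exact.

```python
# Timescales in whole nanoseconds, so integer arithmetic stays exact.
NANOSECOND = 1
MICROSECOND = 1_000
MILLISECOND = 1_000_000


def work_units_needed(fold_time_ns, wu_time_ns):
    """Return how many work units cover one fold, rounding up."""
    return -(-fold_time_ns // wu_time_ns)  # ceiling division on integers


# Slow protein (millisecond fold), small work unit (one nanosecond):
print(work_units_needed(MILLISECOND, NANOSECOND))    # 1000000

# Very fast protein (microsecond fold), large work unit (one microsecond):
print(work_units_needed(MICROSECOND, MICROSECOND))   # 1
```

At one extreme a single work unit is about one millionth of the fold, which matches the figure quoted for a CPU client on a slower protein. At the other, one work unit covers roughly the whole fold, as Pande suggests for the GPU client on a very fast protein.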