Paris

On the board: IBM senior vice-president Paul Horn with a mock-up of a Blue Gene array. Credit: AP

Biologists are hailing IBM's US$100 million project to build a ‘petaflop’ computer as likely to revolutionize our understanding of cellular biology. The computer, nicknamed ‘Blue Gene’, would be around 500 times faster than today's most powerful supercomputer.

Computer scientists say that the planned machine, details of which were revealed last week, is the first large leap in computer architecture in decades.

IBM will build the programme around the challenge of modelling protein folding (see below), with much of the research budget going on designing software. It will involve 50 scientists from IBM Research's Deep Computing Institute and Computational Biology Group, and unnamed outside academics.

But Blue Gene's hardware will not be customized to the problem and, if IBM's blueprint works, it will offer all scientific disciplines petaflop computers. These will be capable of more than one quadrillion floating point operations (‘flops’) per second — around two million times more powerful than today's top desktops. Most experts have predicted that fundamental technological difficulties would prevent a petaflop computer being built before around 2015 (see Nature 402 supp., C67).
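As a rough check on these ratios, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative, and the implied desktop and supercomputer speeds are derived purely from the ratios quoted in this article, not from measured benchmarks.

```python
# Back-of-envelope check of the speed ratios quoted in the article.
PETAFLOP = 1e15  # one quadrillion floating point operations per second

desktop_ratio = 2e6        # "around two million times" a top desktop
supercomputer_ratio = 500  # "around 500 times" today's fastest supercomputer

# Speeds implied by those ratios (derived, not measured).
implied_desktop_flops = PETAFLOP / desktop_ratio
implied_supercomputer_flops = PETAFLOP / supercomputer_ratio

print(f"Implied desktop speed: {implied_desktop_flops / 1e6:.0f} megaflops")
print(f"Implied supercomputer speed: {implied_supercomputer_flops / 1e12:.0f} teraflops")
```

The implied supercomputer speed of two teraflops agrees with the figure the article gives for Asci Red.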

“It is fantastic that IBM is doing this,” says George Lake, a scientist at the University of Washington and NASA project scientist for high-performance computing in Earth and space science. IBM is showing leadership by ushering in a new generation of supercomputers, he says.

The biggest technological obstacles to building a petaflop machine have been cutting latency (the time a chip takes to address the memory) and reducing power consumption. A petaflop computer built using conventional chips would consume almost one billion watts of power. IBM reckons Blue Gene will use just one million.
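The scale of the power saving can be made concrete with a short calculation. This is only a sketch using the rounded figures quoted above; the variable names are illustrative, and "almost one billion watts" is treated as exactly one billion.

```python
# Power efficiency implied by the article's rounded figures.
PETAFLOP = 1e15  # floating point operations per second

conventional_watts = 1e9  # "almost one billion watts", rounded up
blue_gene_watts = 1e6     # IBM's estimate for Blue Gene

power_saving = conventional_watts / blue_gene_watts
conventional_flops_per_watt = PETAFLOP / conventional_watts
blue_gene_flops_per_watt = PETAFLOP / blue_gene_watts

print(f"Power reduction: {power_saving:.0f}x")
print(f"Conventional chips: {conventional_flops_per_watt:.0e} flops per watt")
print(f"Blue Gene estimate: {blue_gene_flops_per_watt:.0e} flops per watt")
```

On these numbers, Blue Gene would need to deliver about a thousand times more computation per watt than a machine built from conventional chips.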

Chip ahoy: IBM vice-president Ambuj Goyal.

Although processor speeds have increased exponentially, the time to fetch data from the memory of a supercomputer, 300 nanoseconds, has fallen to only slightly less than half of what it was 20 years ago. Putting more and more transistors on a chip is therefore unlikely to lead to much greater speed.

“We set out from scratch, completely ignoring history, and thought how can we get the highest performance out of silicon,” says Monty Denneau, a scientist at IBM's Thomas J. Watson Research Center in Yorktown Heights, New York, who is assistant architect of Blue Gene.

Arvind, a professor of computer science at MIT who is considered one of the top authorities on computer architecture, applauds IBM's approach. “It has made very big steps in rethinking computer architecture to try to do without the components that consume power; it has taken all these research ideas and pulled them together.”

One innovation is IBM's use of a prototype technology that combines processor and memory on the same chip. This radically reduces the access time and increases the bandwidth at which the processor can address the memory, as well as reducing power consumption.

Denneau claims that Blue Gene's chips get latency down to just ten nanoseconds. Blue Gene's bandwidth is such that it could download the entire Internet — around 100 terabytes — in less than one second.

Arvind points out that, although a substantial improvement, this is not enough to allow uninterrupted traffic between the processor and the memory, and bottlenecks in processing would still occur. The conventional solution is to put a special high-speed memory, or cache, in front of the processor, to supply it with the most frequently requested instructions and data. These can then be accessed much faster than instructions and data in the main memory.

The problem facing designers of petaflop computers is that caches consume vast amounts of power. “IBM's real innovation is to have done away with the cache,” says Arvind. To do so, it has turned to another prototype technology: multi-threading.

Each of Blue Gene's one million processors can do eight tasks, or threads, simultaneously. If one set of threads is busy, the next instantly takes up the relay. “No-one has attempted multi-threading at this level,” says Arvind. “Current supercomputers just don't have this feature.”
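How extra threads can hide memory latency can be illustrated with a simple, idealised model. It is a textbook-style sketch, not IBM's design: each thread is assumed to compute for a fixed number of cycles and then wait for memory, and the processor is assumed to switch to another ready thread at no cost. The function name and the cycle counts are hypothetical; only the figure of eight threads per processor comes from the article.

```python
def utilization(threads: int, compute_cycles: int, latency_cycles: int) -> float:
    """Fraction of time the processor does useful work in the idealised model.

    Each thread alternates compute_cycles of work with latency_cycles of
    waiting on memory. Switching between threads is assumed to be free.
    """
    if threads < 1 or compute_cycles < 1 or latency_cycles < 0:
        raise ValueError("need threads >= 1, compute_cycles >= 1, latency_cycles >= 0")
    busy = threads * compute_cycles            # work available per period
    period = compute_cycles + latency_cycles   # one thread's work-plus-wait cycle
    return min(1.0, busy / period)


# Hypothetical workload: 2 cycles of work, then a 10-cycle memory wait.
for n in (1, 4, 8):
    print(f"{n} thread(s): {utilization(n, 2, 10):.0%} busy")
```

In this model a single thread leaves the processor idle most of the time, whereas eight threads supply enough ready work to keep it fully busy without a cache. Real workloads are less regular, so the figures indicate the trend only.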

When built, Blue Gene will have one million processors, each capable of one billion operations per second (one gigaflop), with 32 placed on each chip. A board containing 64 of these chips would be capable of two teraflops, equivalent to Asci Red, the world's most powerful supercomputer, at Sandia National Laboratories in New Mexico. Asci Red distributes tasks across an array of about 10,000 Pentium Pro chips, and cost almost $60 million to build. Eight of the Blue Gene boards will be placed in each two-metre-high rack, and 64 racks will combine to give one-petaflop performance.
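The packaging hierarchy described above can be multiplied out to check that it reaches a petaflop. The sketch below uses only the counts given in the article; the constant names are illustrative.

```python
# Packaging hierarchy as described in the article.
FLOPS_PER_PROCESSOR = 1e9  # one gigaflop each
PROCESSORS_PER_CHIP = 32
CHIPS_PER_BOARD = 64
BOARDS_PER_RACK = 8
RACKS = 64

processors_per_board = PROCESSORS_PER_CHIP * CHIPS_PER_BOARD
total_processors = processors_per_board * BOARDS_PER_RACK * RACKS

board_flops = processors_per_board * FLOPS_PER_PROCESSOR
rack_flops = board_flops * BOARDS_PER_RACK
machine_flops = rack_flops * RACKS

print(f"Processors in total: {total_processors:,}")
print(f"Per board: {board_flops / 1e12:.2f} teraflops")
print(f"Per rack:  {rack_flops / 1e12:.2f} teraflops")
print(f"Machine:   {machine_flops / 1e15:.2f} petaflops")
```

Because every count is a power of two, the totals come out slightly above the round numbers quoted: 1,048,576 processors and about 1.05 petaflops.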

Denneau admits that supercomputers are facing stiff competition from clusters of PCs and workstations, which can deliver supercomputing power at a fraction of the cost (op. cit.). But he says that these are still too slow or unreliable for applications requiring tight coupling and low latency.

The cost of building Blue Gene will be a fraction of that of Asci Red, adds Denneau, thanks to a secret proprietary technology used to etch the silicon in the bulky chips. Most of the costs will be in devising parallel computing models and programming methodologies.

Blue Gene's main drawback, says one scientist, is that it may be relatively short of memory. Arvind says the biggest unknown will be reliability. “To put together a petaflop computer in five years is a very big deal,” he says. “It may simply not be possible.”