Washington

Shape of things to come? A graphic of the protein myoglobin, carrying oxygen (red). Credit: KEN EWARD/SCIENCE PHOTO LIBRARY

US scientists have set out to compile a comprehensive catalogue of protein structures. The idea takes its cue from the Human Genome Project's bid to decipher the entire human genetic code.

The undertaking, dubbed structural genomics, has been given a boost by the announcement that the National Institute of General Medical Sciences (NIGMS) is to launch a Protein Structure Initiative. The initiative will spend up to $3 million a year at each of three to six pilot research centres.

These centres will explore the best ways to determine protein structure quickly, accurately and at high volume. The ultimate goal is to deduce at least 10,000 protein structures within the next five years, at the lowest possible cost. At present it costs about $200,000 to determine a structure. It is hoped that technical improvements and economies of scale could bring this down to $20,000.
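A back-of-envelope calculation shows why the cost target matters. The sketch below uses only the figures quoted in this article and assumes, for illustration, that total cost is simply the number of structures multiplied by a flat cost per structure. That ignores start-up costs and any gradual fall in cost over the five years. The function names are hypothetical.

```python
# Figures as quoted in the article; only the arithmetic is added here.
TARGET_STRUCTURES = 10_000             # structures hoped for within five years
COST_NOW = 200_000                     # dollars per structure at present
COST_HOPED = 20_000                    # dollars per structure after improvements
PILOT_BUDGET_PER_CENTRE = 3_000_000    # dollars per year, stated upper limit
CENTRE_RANGE = (3, 6)                  # "three to six" pilot centres


def total_cost(n_structures, cost_per_structure):
    """Total cost assuming a flat cost per structure (a simplification)."""
    return n_structures * cost_per_structure


def pilot_budget_range(per_centre, centre_range):
    """Lowest and highest annual pilot spending, at the per-centre upper limit."""
    fewest, most = centre_range
    return fewest * per_centre, most * per_centre


if __name__ == "__main__":
    print("At today's cost:", total_cost(TARGET_STRUCTURES, COST_NOW))
    print("At the hoped-for cost:", total_cost(TARGET_STRUCTURES, COST_HOPED))
    print("Annual pilot spending:", pilot_budget_range(PILOT_BUDGET_PER_CENTRE, CENTRE_RANGE))
```

On these assumptions, 10,000 structures would cost $2 billion at today's rate and $200 million at the hoped-for rate, while the pilot phase would spend at most $9 million to $18 million a year.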

It would be impossible to catalogue every protein. But, by choosing those that do not show sequence similarity to proteins of known structure, structural biologists hope to find novel forms. The aim is to describe comprehensively the finite set of basic protein shapes, or folds, which are thought to number several thousand, together with variations on them. Currently, only a few hundred are known.

Once this task is complete, it should be possible to assign any protein to a shape family simply by knowing its genetic sequence. Once the detailed structure of one member of a family is known, computer modelling will make it possible to generate a reasonably good structure for any other member.

It is hoped that the project will be a starting point for understanding how virtually any protein works. It could yield information on protein evolution, the basic physics of the relationship between protein sequence and structure, and structure-based drug design and discovery.

This would have a tremendous impact on pharmacology. At the moment, only about one per cent of all protein families are targeted by drugs. But drug development is greatly helped by a knowledge of protein structure. The selection of targets for drugs can become more rational, and drug candidates can be tailored for maximum effect on their targets.

The NIGMS-funded effort was hatched after meetings over the past year between structural biologists and officials at the National Institutes of Health. They have sought to capitalize on two developments. One is the explosion of sequence data being generated by the Human Genome Project. The other is the improvement in technologies for studying protein structures. For instance, the highly focused, bright X-ray beams produced by undulator sources at synchrotrons allow data to be collected from smaller crystals, and in minutes rather than days.

It is hoped that the pilot centres will become expert in deducing protein structures. This would involve selecting and producing target proteins, crystallizing them, determining their structures and analysing the data generated. The information would be stored in a common database.

After three years of developing and refining technologies and testing strategies for producing speedy, high-volume results, the less productive centres might be culled. Those remaining would go into high-volume structure determination, producing most of the structures in the following few years.

Ultimately, large, integrated centres are planned, and NIGMS seeks collaborations with Europe and Japan. Marvin Cassman, NIGMS director, says the focus is on scale and speed. The key is “rapid structure determination at high resolution and high throughput,” he says. “This is intended ultimately to be a kind of assembly-line process.”

The plan is winning plaudits from many researchers. They say that a comprehensive compendium of protein structures will be a boon to biology, and that a large-scale, systematic approach is the way forward.

“It will be great,” says Andrej Sali, an assistant professor in the molecular biophysics laboratories at Rockefeller University in New York. “It will have an impact at least as large as the optimists are predicting, just as the Human Genome Project did.”

But critics echo early objections to the genome project, complaining that the plan will channel valuable resources to an exercise of dubious value. “Trying to figure out function from structure is one of the most difficult enterprises in molecular biology,” wrote Thomas Steitz, a Howard Hughes Medical Institute investigator at Yale University, in a letter to Cassman.

Steitz is a professor of molecular biophysics and biochemistry. He says that structural biologists ought to be left to do what they are already doing — determining the structures of proteins known to be biologically important. “Most of us could come up with a long list of far more important and interesting projects,” he wrote to Cassman.

But others are more impressed. It “could give databases that could be extraordinarily valuable and time saving in research efforts,” says Wayne Hendrickson, a Howard Hughes investigator in the Department of Biochemistry and Molecular Biophysics at Columbia University, New York.