In less than a year, an Anglo-Canadian group has worked out the structures of 50 complex proteins that are relevant to human disease.

The recent explosion of genome sequence data led scientists to hope that working out protein structures could become automated and accelerated. To date, projects attempting this have faced criticism for taking too long to deliver. But the results from the Structural Genomics Consortium (SGC), which specializes in human proteins, mark the coming of age of the field. The group, which is based in Toronto and Oxford, says it will add a further 100 structures to its free-access database in the coming year.

Knowing the shape of a protein is crucial for understanding its biological function and for designing drugs to interact with it. For example, the SGC has just released the three-dimensional structure of a human enzyme (pictured) that converts cortisone to the metabolic hormone cortisol.

“Mice that lack the gene for this enzyme do not develop diabetes, whatever their diet,” says Aled Edwards, the consortium's executive director. “So this target is exciting the pharmaceutical industry at the moment.”

The field of structural genomics aims to reveal the shapes of proteins by using the information obtained from genome-sequencing projects in automated, high-throughput schemes. These would synthesize thousands of proteins in vitro so that they could be examined by X-ray crystallography and nuclear magnetic resonance techniques, which are used to image molecules in three dimensions.

Techniques have come a long way since Max Perutz took 22 years to determine the first structure of a protein, haemoglobin (M. F. Perutz et al. Nature 185, 416–422; 1960). And the increasing availability of genome sequence information in the past few years has caused structural genomics initiatives to pop up all around the world.

Mortal coils: knowing the shape of enzymes is vital — this one could guide research in diabetes. Credit: SGC

Besides the Anglo-Canadian effort, major structural genomics projects have been started in Germany, Japan and the United States. The field got off to a slow start, and the ambitious, Berlin-based Protein Structure Factory, which was investigating human proteins, closed after it failed to secure more funding in 2004. But the rate at which structures are being published is starting to accelerate.

The US National Institute of General Medical Sciences, based in Bethesda, Maryland, started its Protein Structure Initiative (PSI) in 2000. By this summer, the PSI's nine 5-year projects will have delivered 1,100 structures — nearly half of them produced in the past year.

And in a few weeks' time, the institute will announce the names of labs that have been selected to participate in the PSI's second phase. This will aim to deposit hundreds of protein structures in public databases each year.

But despite the impressive number of structures produced, the PSI has focused mainly on bacterial proteins. These are simpler than mammalian proteins and are therefore easier to make and purify. The SGC project is the first to solve the structures of a large number of human proteins.

The consortium's achievement is saluted by the PSI's coordinator, John Norvell. “Every step in the process is harder for human proteins,” says Norvell. “Cracking 50 structures in less than a year is very impressive.”

Norvell says that the US National Institutes of Health is planning pilot projects that will work out faster ways of capturing similarly difficult proteins.

He adds that after several years of building technologies, structural genomics initiatives are ready to roll out protein structures in large numbers.