Structural genomics, or structural proteomics, aims to provide three-dimensional information for all proteins. This information can be used to ascribe function to a protein and to reveal or invalidate drug targets. Structural proteomics projects are under way in both the public and private sectors around the world, including in Germany, Japan and the United States.

In 2000, the US National Institute of General Medical Sciences in Bethesda, Maryland, launched an initiative to determine the structures of about 10,000 proteins representing different structural (‘fold’) families over the next decade. The studies will result in a public resource linking amino-acid sequence, structural and functional information, eventually making the three-dimensional atomic structures of most proteins easily obtainable from their corresponding sequences.

One of several participating centres is the Genomics Institute of the Novartis Research Foundation in San Diego, a member of the Joint Center for Structural Genomics (JCSG). Scott Lesley (pictured below, left), who heads the institute's proteomics unit, says his group has processed 50,000 protein samples from more than 20,000 unique expression clones. The pipeline starts with a small-scale screening protocol that uses samples in a 96-well format to evaluate how the targets behave at different steps in the procedure. “Those that are not behaving well after expression, purification or crystallization are set aside and then run through one of several salvage pathways,” says Lesley. Such pathways include using different expression systems (vectors and cells), making point mutations in the cDNA to increase the likelihood that it will be expressed, and trying different tags. “All the bacterial expression and purification is done in parallel and the majority is automated by custom instrumentation,” says Lesley.

Credit: GNF

The targets that perform well are put through a large-scale purification protocol to yield 10–20 mg of protein. The JCSG has already deposited 239 structures in the Protein Data Bank (mainly from the bacterium Thermotoga maritima, the mouse, and the yeast Saccharomyces cerevisiae), representing at least 19 new fold families. “Most of the tools are in place, and we can increase both output and success rate by implementing salvage approaches in high throughput,” says Lesley.

Similar efforts are underway in other countries. The Protein Structure Factory in Germany, an initiative of the German Human Genome Project and structural biologists from the Berlin area, targets human proteins for structure determination. “Our current focus is to determine the structures of particular protein–protein or protein–ligand complexes,” says Christoph Scheich of the Max Planck Institute for Molecular Genetics in Berlin.

L.B.