Emboldened by their growing success at using computer models to predict protein structures, computational structural biologists have put out a community-wide call to find the “Ten Most Wanted” proteins whose structures they will try to solve.
The plan to offer modelling services to the wider biological community was hatched at the fourth biennial meeting of the Critical Assessment of Techniques for Protein Structure Prediction (CASP4), held in Asilomar, California, last month.
The computational biologists searching for the top ten believe that computer modelling may become a viable alternative to the current practice of using X-ray crystallography and nuclear magnetic resonance techniques to 'solve' the structure of a protein.
Any biologist can suggest a candidate protein for the Ten Most Wanted list, provided its structure is currently unknown. A website has been created at which candidate proteins can be registered, and where the case for their biological importance can be made.
The ten proteins considered to be the most significant will be modelled. “A protein might make our list if, for example, it is believed to play a key role in cell signalling,” says one of the organizers, structural biologist David Baker of the University of Washington.
Protein modellers traditionally pit their prediction skills against each other at CASP meetings. In the year leading up to a meeting, participants are challenged to model proteins in various categories; the structures of these proteins are determined experimentally before the meeting, so that the predictions can be checked against them.
Although protein-structure prediction is not yet accurate enough to help in drug design, “it has matured to a point where models produced by prediction algorithms can be used to understand and test hypotheses about biological function”, Baker says.
Marked improvements were seen in cases where an amino-acid sequence of interest did not correspond to any known structure. “CASP3 competitors were able to describe 40–60 [amino-acid] residues approximately correctly, whereas at CASP4 even whole small proteins were roughly correct,” says one of the meeting's organizers, John Moult, of the Center for Advanced Research in Biotechnology at Rockville, Maryland.
Fully automated modelling, in which protein structures are predicted by computer programs without any human input, also improved at CASP4. Predictions obtained by combining all fully automated approaches ranked among the top six in the meeting's “fold recognition” category. There are believed to be about 13,000 different ways a protein can fold, and most structural biologists believe that, once the structures of all the folds are understood, models will be able to predict the structure of every protein.
Proponents of fully automated modelling, such as Dani Fischer of Ben-Gurion University, Israel, believe that the pace of improvement is such that humans will soon be beaten by computers “in the way that Big Blue beat chess world champion Kasparov”.
Others are more cautious. Moult notes that the importance of automated modelling is “that it is scalable, which is important if you want to go for thousands of proteins”, a prospect genomics makes possible. But he says automatic methods are not yet as effective as a combination of human and machine.
The big question, researchers say, is whether modelling will continue to improve fast enough to make an impact. There is a relatively short window of time for structure prediction to be useful before structural genomics — factory-style protein-structure generation using genomic information as starting material — generates all the results needed.