As any management guide will tell you, when faced with a large and complex problem, it can often help to step back and look at the big picture before focusing on the key points. Writing in the Journal of the American Chemical Society, Graham Richards and colleagues describe how such a strategy could be just as applicable when trying to locate ligand-binding sites in protein structures — a problem that is becoming increasingly important for drug discovery in the post-genomic era.

If you have a three-dimensional protein structure, potential binding sites for a ligand could in theory be identified computationally by calculating the energy of interaction between the ligand and the protein in all possible ligand configurations around the protein. But, in reality, the number of calculations needed makes such a strategy impractical in terms of computational time for all but the simplest ligands.

So, could the number of calculations be reduced without sacrificing predictive ability? Richards and colleagues devised an elegant approach to this problem. Initially, a very simple model of the ligand (just a single feature point) is generated using an algorithm and used to evaluate the energy of interaction with the protein. After removal of any ligand configurations that score poorly in terms of potential for binding, a new round of calculations is carried out using a ligand model generated to have two feature points. Repeating this evaluation–removal process, while increasing the complexity of the ligand model at each step, allows computational time to be distributed efficiently, as only the most relevant configurations are considered in any detail.
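The evaluation–removal loop can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scoring function `toy_score`, the `keep_fraction` cut-off, the translation-only configurations and the way coarser models are formed (by taking the first k feature points of the full ligand) are all assumptions made purely for illustration.

```python
import itertools
import math


def toy_score(model, offset, pocket):
    """Hypothetical stand-in for an interaction energy (higher is better).

    Each feature point of the ligand model is shifted by `offset` and
    penalised by its distance to the nearest pocket point.
    """
    total = 0.0
    for x, y, z in model:
        p = (x + offset[0], y + offset[1], z + offset[2])
        total -= min(math.dist(p, q) for q in pocket)
    return total


def coarse_to_fine_search(ligand_features, pocket, candidates, keep_fraction=0.25):
    """Score with 1, 2, ... feature points, pruning poor configurations each round.

    Returns the best surviving configuration and the total number of
    feature-point evaluations spent (a rough proxy for compute time).
    """
    survivors = list(candidates)
    cost = 0
    for k in range(1, len(ligand_features) + 1):
        model = ligand_features[:k]  # assumed way of building a k-point model
        scored = [(toy_score(model, c, pocket), c) for c in survivors]
        cost += len(scored) * k
        scored.sort(key=lambda pair: pair[0], reverse=True)
        n_keep = max(1, int(len(scored) * keep_fraction))
        survivors = [c for _, c in scored[:n_keep]]
    return survivors[0], cost


# Toy data: a three-point ligand and a pocket that matches it at offset (5, 5, 5).
ligand = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
pocket = [(5, 5, 5), (6, 5, 5), (5, 6, 5)]
grid = list(itertools.product(range(10), repeat=3))  # 1,000 candidate translations

best, cost = coarse_to_fine_search(ligand, pocket, grid)
exhaustive_cost = len(grid) * len(ligand)
print(best, cost, exhaustive_cost)
```

In this toy setting the search recovers the planted offset while spending fewer feature-point evaluations than scoring the full three-point model at every grid position. A real application would also sample ligand orientations and use a physically meaningful energy function.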

To validate the strategy, the authors took seven known ligand–protein structures, including complexes as diverse as HIV reverse transcriptase with a non-nucleoside inhibitor and basic fibroblast growth factor with heparin. They removed the ligand from each and then attempted to locate the binding site using stepwise algorithmic models of the ligand. In all cases, both the binding site and the orientation of the ligand were correctly identified. So, it seems that this approach could become a valuable tool for analysing the wealth of data that is emerging from structural genomics projects.