Modular Protein Domains

Edited by: G. Cesareni, M. Gimona, M. Sudol & M. Yaffe
Wiley-VCH • 2005 $195/£105

Biological research has reached levels of productivity that are higher than ever before. The conjunction of new techniques that allow high-throughput studies with ambitious goals, such as the sequencing of the human genome, provides us for the first time with a global view of cells and organisms at different levels. The availability of many genome sequences gives us the opportunity to see how evolution creates, discards and tries out new combinations of genes and their products. In the 1960s, when the first protein structures were being determined, the scientific community was fascinated to discover how proteins were made of smaller secondary structure elements. Later it was realized that proteins with different functions and almost no sequence homology could nevertheless have the same fold. However, the big surprise came when researchers found that many proteins were made out of autonomous folds that could occur in different combinations and in different proteins. Thus nature not only varied the sequence of a particular globular fold, but by combining many small autonomous folding pieces it could create completely new functions. These autonomous units are named modular protein domains.

Modular Protein Domains, a new book edited by G. Cesareni, M. Gimona, M. Sudol and M. Yaffe, summarizes the current state of knowledge on most of the identified protein domains. We now have a comprehensive list of globular domains found in living organisms, and although some might still be awaiting discovery, it is likely that most of these domains have been identified. So what do we consider to be a modular protein domain? The general and simple view is that these domains are small autonomous globular folding regions that are found in many different protein sequences in combination with various other modules, as exemplified by SH3 or SH2 domains. Applied strictly, however, the same criteria also cover the kinase domain (discussed in Chapter 9) or, for example, the myosin/kinesin motor head (not discussed), because these are autonomous folding units that exist in various combinations with other protein domains or protein structures. With so inclusive a definition, a complete and detailed analysis of all protein domains would need several books. Thus the authors have mainly concentrated on protein domains that are involved in recognizing peptides or small organic molecules such as lipids; the two exceptions are Chapters 9 and 15, in which the authors discuss the eukaryotic kinase domain and ubiquitin-binding modules, respectively.

The discussion of each domain is organized as follows: a brief history of how the domain was discovered; its functional role in the cell; a description of its structure; and the types of ligands it recognizes. Regarding ligand recognition, a common theme that emerges upon detailed analysis of different modular protein domains is that for every rule concerning consensus binding sequences, there are many exceptions. So although most SH3 domains will recognize the PXXP sequence and most SH2 domains will prefer a phosphorylated tyrosine residue to a non-phosphorylated tyrosine, there are frequent exceptions to these rules. PDZ domains are another example; these normally bind to the carboxy terminus of proteins, but in some cases can also recognize an internal β-hairpin or β-strand. For most protein domains there is an odd case, suggesting that we should be cautious when trying to guess what the target of a particular domain will be. Obviously, this makes structure- or sequence-based prediction methods for domain targets difficult (discussed in Chapter 20), and also casts some doubt on massive screenings that use biased peptide libraries (described in Chapter 21). Despite these caveats, Chapters 20 and 21 provide an extremely useful and insightful discussion of the development of computational and experimental tools, respectively, for ligand prediction and screening.

This book is likely to appeal more to those working with eukaryotes than prokaryotes. For example, an unfortunate omission is a chapter devoted to the two-component system modules of bacteria. Also missing is coverage of DNA- and RNA-binding domains, and other common domains such as zinc fingers and homeodomains; at least one chapter should have been devoted to these widespread and important domains. For the domains that are analysed, structure, target preferences and function are treated in great detail, and these chapters constitute a unique source of information for the reader. A chapter devoted to the methods used in screening for targets for specific domains, as well as one on nomenclature, would have been welcome additions. Similarly, the appendix that deals with systems biology and the role of protein modules in cell circuitry could have been expanded. In addition, a chapter on the evolution of these domains and their targets would have been useful. Nonetheless, the book is well written and provides clear, concise and informative reviews on some of the most common domains involved in protein–peptide interactions.