Many of the properties that characterize living organisms are also exhibited by individual cells. These include communication, homeostasis, spatial and temporal organization, reproduction, and adaptation to external stimuli. Biological explanations of these complex phenomena are often based on the logical and informational processes that underpin the mechanisms involved. Two examples of this are the significance of the structure of DNA, and the mechanisms that control gene expression. DNA's structure relates to heredity because of the coding and replicative capacity of its polynucleotide sequence, whereas the interactions of activators and repressors with promoter regions are best understood in terms of the feedback loops that regulate gene expression.

Most experimental investigations of cells, however, do not readily yield such explanations, because they usually put greater emphasis on molecular and biochemical descriptions of phenomena. To explain logical and informational processes on a cellular level, therefore, we need to devise new ways to obtain and analyse data, particularly those generated by genomic and post-genomic studies.

An important part of the search for such explanations is the identification, characterization and classification of the logical and informational modules that operate in cells. For example, the types of modules that may be involved in the dynamics of intracellular communication include feedback loops, switches, timers, oscillators and amplifiers. Many of these could be similar in formal structure to those already studied in the development of machine theory, computing and electronic circuitry. When these modules are coupled in space by processes such as reaction–diffusion and regulated cytoskeletal transport, they help to provide a basis for the spatial organization of the cell. The identification and characterization of these modules will require extensive experimental investigation, followed by realistic modelling of the processes involved. Such analyses would allow a catalogue of the module types that operate in cells to be assembled.
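As an illustration of what the formal structure of one such module might look like, the following minimal sketch models a switch built from a single positive feedback loop. It is not taken from any particular cellular system: the rate law (basal synthesis, self-activation described by a Hill function, first-order decay) and every parameter value are assumptions chosen only to make the behaviour visible.

```python
def simulate_switch(x0, basal=0.05, vmax=1.0, K=0.5, n=4, decay=1.0,
                    dt=0.01, t_end=50.0):
    """Euler integration of dx/dt = basal + vmax*x^n/(K^n + x^n) - decay*x.

    x is the concentration of a hypothetical self-activating regulator.
    All parameter values are illustrative assumptions, not measurements.
    """
    x = x0
    for _ in range(int(t_end / dt)):
        production = basal + vmax * x**n / (K**n + x**n)
        x += dt * (production - decay * x)
    return x


# The same module settles into different states depending on its history.
low = simulate_switch(0.1)    # starts below the threshold, relaxes to the "off" state
high = simulate_switch(0.8)   # starts above the threshold, relaxes to the "on" state
```

With these assumed parameters the two runs end in clearly separated states, which is the defining property of a switch: the outcome records which side of a threshold the system started on.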

The next task is to identify the interacting molecules and biochemical activities that generate the characteristic behaviours of particular modules. This will be more difficult, involving systematic mapping of particular module types to specific molecules, biochemical activities and molecular interactions, and assembling the information into databases. A few examples illustrate the idea. Proteins that can be attacked by proteases are likely to be found in switching modules, as destruction of a protein can bring about an irreversible switch in the behaviour of a regulatory network. Small G-protein GTPases, which bind GTP and then hydrolyse it to GDP, could act as timers because of the time this conversion takes. In turn, GTPase timers can act as amplifiers if signals are sent continually while they are in the GTP-bound state, or as components of proofreading modules, as seen during protein translation. Antagonistically acting protein kinases and phosphatases could also act as amplifiers and switches.
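The timer and amplifier roles suggested for GTPases can be made concrete with a rough calculation. The sketch below assumes that GTP hydrolysis is a simple first-order process with rate constant k_hydrolysis, and that a GTP-bound molecule emits downstream signals at a constant rate; both are simplifying assumptions, and the function names and numbers are hypothetical.

```python
import math


def fraction_gtp_bound(t, k_hydrolysis):
    """Fraction of molecules still GTP-bound at time t (first-order decay)."""
    return math.exp(-k_hydrolysis * t)


def gtp_state_half_life(k_hydrolysis):
    """Time after which half of the molecules have hydrolysed their GTP."""
    return math.log(2) / k_hydrolysis


def expected_signals(signal_rate, k_hydrolysis):
    """Mean number of signals sent per activation event.

    The mean lifetime of the GTP-bound state is 1/k_hydrolysis, so a molecule
    signalling at signal_rate sends signal_rate/k_hydrolysis signals on average.
    """
    return signal_rate / k_hydrolysis


# Hypothetical values: hydrolysis at 0.5 per second, 10 signals per second.
half_life = gtp_state_half_life(0.5)
gain = expected_signals(10.0, 0.5)
```

Under these assumptions the hydrolysis rate sets both the duration of the timer and the gain of the amplifier: a slower GTPase stays on for longer and passes on more signals per activation.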

If sufficient regularity can be found between molecular entities and logical and informational outcomes to allow appropriate databases to be built, then genomic and post-genomic data could be interrogated more effectively. Generally, three types of information are found in genomic and post-genomic databases: sequence data that define molecules and biochemical activities; interaction data based on co-localization, co-expression and analyses of interacting pairs of proteins (two-hybrid analyses); and functional data based on gene deletions and RNA-interference studies. These data allow specific molecules to be linked to particular biological phenomena.

The objective is to use this information to identify the molecular assemblies involved in a biological phenomenon of interest, and then to determine the likelihood of that assembly being part of a particular logical and informational module. If successful, this approach would not require detailed kinetic analyses of all processes within cells, but rather would rely on more cursory calculations to study the phenomena of interest. The rough nature of the calculations would be justified by comparison with well-understood examples of particular modules.
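One way to picture such a cursory calculation is as a lookup against a catalogue of module signatures. The sketch below is purely illustrative: the catalogue entries paraphrase the examples given earlier (protease targets in switches, GTPases in timers, amplifiers and proofreading, kinase and phosphatase pairs in amplifiers and switches), while the annotation labels, the scoring rule and the function names are all assumptions introduced here, not an established method.

```python
# Hypothetical catalogue: each module type lists alternative molecular
# signatures, each signature being a set of annotation labels.
MODULE_SIGNATURES = {
    "switch": [{"protease_target"}, {"kinase", "phosphatase"}],
    "timer": [{"gtpase"}],
    "amplifier": [{"gtpase"}, {"kinase", "phosphatase"}],
    "proofreading": [{"gtpase"}],
}


def module_score(annotations, signatures):
    """Best fraction of any one signature that the assembly's annotations cover."""
    return max(len(sig & annotations) / len(sig) for sig in signatures)


def rank_modules(annotations):
    """Return (module, score) pairs, highest score first, ties ordered by name."""
    scores = [(name, module_score(annotations, sigs))
              for name, sigs in MODULE_SIGNATURES.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# A hypothetical assembly annotated from sequence and interaction data.
ranking = rank_modules({"kinase", "phosphatase"})
```

A score of this kind says nothing about dynamics. It only narrows down which module types are worth comparing with well-understood examples, which is the role the rough calculations are meant to play.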

A useful analogy for what is being proposed is the analysis of an electronic circuit. Once the detailed operations of different types of electronic components have been identified, it is possible to gain insight into what an electronic circuit can do simply by knowing what components are present and how they are connected, even if their precise dynamic behaviour has not been determined. In the same way, the various logical and informational modules implicated in a biological phenomenon of interest have to be integrated in order to generate a better understanding of how cells work. One process that has been analysed comprehensively in this way is bacterial chemotaxis.
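The circuit analogy can be sketched in code: given only the components and the sign of each connection, one can already list the feedback loops and say whether each is positive or negative, without any kinetic detail. The wiring diagrams below are hypothetical, and the interpretation in the comments (positive loops permit switch-like behaviour, negative loops permit homeostasis or oscillation) is a general rule of thumb rather than a guarantee.

```python
def find_feedback_loops(edges):
    """List simple cycles in a signed wiring diagram.

    edges maps (source, target) to +1 for activation or -1 for repression.
    Each loop is reported once, starting from its alphabetically first node.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    loops = []

    def walk(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                loops.append(path[:])
            elif nxt not in path and nxt > start:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return loops


def loop_sign(loop, edges):
    """Product of the edge signs around a loop: +1 positive, -1 negative."""
    sign = 1
    for i, node in enumerate(loop):
        sign *= edges[(node, loop[(i + 1) % len(loop)])]
    return sign


# Hypothetical three-component ring of repressors: a negative loop,
# the kind of wiring that can support oscillation.
ring = {("a", "b"): -1, ("b", "c"): -1, ("c", "a"): -1}
# Hypothetical pair of mutual repressors: a positive loop,
# the kind of wiring that can support a switch.
toggle = {("x", "y"): -1, ("y", "x"): -1}

ring_signs = [loop_sign(loop, ring) for loop in find_feedback_loops(ring)]
toggle_signs = [loop_sign(loop, toggle) for loop in find_feedback_loops(toggle)]
```

Knowing the loop signs does not predict the precise dynamics, but it does indicate what a wiring diagram is capable of, much as a component list does for an electronic circuit.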

The success of this general approach depends on there being a limited set of biochemical activities and molecular interactions that together can solve the myriad logical and informational problems found in biological systems. If there is only a restricted set of processes that are efficient and stable in operation and which have been exploited by evolution, then there should be only a limited set of possible solutions to real biological problems. Of course, if nature shows no such restraint, then we must go back to the drawing-board if we are ever to understand its complexity.

Further reading:

Bourne, H. R., Sanders, D. A. & McCormick, F. Nature 348, 125–132 (1990).

Bray, D. Proc. Natl Acad. Sci. USA 99, 7–9 (2002).

Guet, C. C., Elowitz, M. B., Hsing, W. & Leibler, S. Science 296, 1466–1470 (2002).

Kauffman, S. A. The Origins of Order: Self-Organization and Selection in Evolution (Oxford Univ. Press, 1993).

Novák, B. & Tyson, J. J. J. Cell Sci. 106, 1153–1168 (1993).

Tyson, J. J., Chen, K. C. & Novák, B. Curr. Opin. Cell Biol. 15, 221–231 (2003).