The seventeenth-century Enlightenment generated two very different approaches to thinking about the natural world. The first sought mathematical models to account for observations, beginning with the motion of the planets in the night sky. This approach generated 'physical science', a web of mathematical and physical models describing and predicting the behaviour of units of matter smaller than the atom, larger than the planets, and much in between.

The second approach emphasized collection and classification. The archetypal practitioner was the English gentleman, who named plants, rocks and fossils on his estate, then in his empire. From this enterprise came 'natural history', a web of classification systems, non-reductive models and historical statements about the planet and the life it carries. The positions of continents came to be understood as outcomes of historical movements of plates on the globe, obeying physical laws but too complicated to be predicted by them. Life is also understood as the consequence of historical events, occurring within the context of Darwinian theory, again consistent with physical laws but not predicted by them.

It is curious that these two very different activities receive the same name: 'science'. This has generated much confusion, and some rancour, with each group feeling itself to be the more 'relevant'. But the disjunction between the two traditions is (I believe) about to end, and in a largely unheralded way.

Many chemical biologists and biophysicists view the future of biology as a metamorphosis in which 'biological' phenomena will be replaced by their underlying 'physical chemical' components. This metamorphosis is ongoing, and very productive, but is unlikely to be the entire story. The surprise will come when biophysicists and chemical biologists discover that they need to research the history of biomolecules if they are truly to understand the physical behaviours that they have worked so hard to characterize.

An early sign of the need for natural history in the physical sciences was the struggle to predict how proteins fold. This seemed to be a good area for applying the physical-science paradigm to biology, so physical chemists mounted a frontal assault, using huge computers to build physical models of proteins in water and making guesses about how atoms interact. The assault failed. Making the computation even vaguely tractable, it turned out, required considerable abstraction of the physical model for the protein. The same physical theory that inspired the computation suggested that these abstractions must compromise the computation's value as a predictive tool.

Natural history offered an entirely different approach. Divergent evolution creates families of proteins that have descended from common ancestors. As proteins evolve from those ancestors, natural selection requires them to remain 'fit'. The principal prerequisite for fitness in a protein is a fold. So proteins diverging from a common ancestor generally conserve their folds. This means that during the evolution of protein sequences, mutations do not accumulate as they would if proteins were formless, functionless organic molecules. Instead, amino acids that are important to the fold suffer substitution differently from those that are not. A signal should lie in the pattern of protein-sequence divergence — the difference between how proteins have divergently evolved in their past, and how they would have evolved had they been formless, functionless molecules.
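The idea of a signal in the pattern of divergence can be made concrete with a small sketch. The Python below is an illustration only, not the method behind the predictions discussed in this essay: it scores each column of an aligned protein family by its Shannon entropy, a crude stand-in for how freely that position has accepted substitutions. The alignment, the function names and the threshold are all hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # Sum p * log2(1/p) over the residues observed in the column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(alignment):
    """Entropy of every column in a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*alignment)]


def conserved_positions(profile, threshold=0.5):
    """Indices of columns whose entropy falls below an arbitrary threshold."""
    return [i for i, h in enumerate(profile) if h < threshold]


# Hypothetical toy alignment of four homologous sequences (not real data).
toy_alignment = ["MKVLA", "MRVIA", "MKVLG", "MQVMA"]

profile = conservation_profile(toy_alignment)
conserved = conserved_positions(profile)
```

In this toy family the first and third columns never vary, so they stand out as candidates for positions constrained by the fold. A real analysis would also need to correct for how the sequences are related on the evolutionary tree, which this sketch ignores.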

Natural history did not overcome the challenges obstructing the frontal assault of the physical scientists; it went around them. Today, the secondary and tertiary structure of proteins can reliably be predicted by exploiting the historical signal embedded in a set of protein sequences related by common ancestry. Since 1990, about 30 protein folds have been predicted using the history of protein families. In many cases, the prediction provided information about function as well as form.

[Figure: Interweaving the histories of biomolecules should deliver a comprehensive model for life. Credit: Ken Eward/SPL]

Genomics is driving the use of history in physical science. Genomic-sequence databases contain historical information about genes in an easy-to-use form. This information can be used to build evolutionary trees and reconstruct the sequences of ancestral genes and proteins. Through recombinant-DNA technology, ancient proteins from extinct organisms can be resurrected in the laboratory and studied biochemically to test hypotheses about form and function. These techniques promise to deliver a comprehensive model for life, combining the physical and structural behaviour of biomolecules with two histories: the first told by palaeontological and geological records, the second by molecular sequences.
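As a rough illustration of what reconstructing an ancestral sequence involves, the Python sketch below applies the bottom-up pass of Fitch parsimony, one simple reconstruction technique among several, to a hypothetical four-taxon tree. The tree shape, the taxon names and the sequences are invented for the example, and the essay does not say which reconstruction method any particular study used.

```python
def fitch_root_states(tree, states):
    """Candidate root states for one site, from the bottom-up Fitch pass.

    `tree` is a leaf name (str) or a pair (left_subtree, right_subtree);
    `states` maps each leaf name to its residue at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}
    left, right = tree
    a = fitch_root_states(left, states)
    b = fitch_root_states(right, states)
    # Fitch rule: keep the intersection if non-empty, otherwise the union.
    return (a & b) or (a | b)


def reconstruct_root(tree, sequences):
    """Root sequence as a string; ambiguous sites are written like [KR]."""
    length = len(next(iter(sequences.values())))
    parts = []
    for site in range(length):
        states = {name: seq[site] for name, seq in sequences.items()}
        candidates = sorted(fitch_root_states(tree, states))
        if len(candidates) == 1:
            parts.append(candidates[0])
        else:
            parts.append("[" + "".join(candidates) + "]")
    return "".join(parts)


# Hypothetical tree and aligned sequences (illustrative, not real data).
toy_tree = (("taxon_a", "taxon_b"), ("taxon_c", "taxon_d"))
toy_sequences = {
    "taxon_a": "MKVLA",
    "taxon_b": "MKVIA",
    "taxon_c": "MRVLA",
    "taxon_d": "MRVLG",
}

ancestor = reconstruct_root(toy_tree, toy_sequences)
```

Where the descendants disagree evenly, as at the second site here, parsimony leaves the ancestral residue ambiguous. In practice such sites might be resolved with a probabilistic model or by resurrecting and testing more than one candidate protein.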

I believe that this historical model is as important for understanding the behaviour of biomolecules as the powerful instruments used for their physical characterization. Indeed, the human genome, now no more than a set of chemical structures of the organic molecules involved in inheritance, will need natural history to give it value.

Palaeogenomics models have already suggested how this unification will look when it is complete. For example, a historical analysis of resurrected proteases ancestral to those regulating blood pressure indicated which animal models were appropriate for studying the pharmacology of drugs targeted against these enzymes.

Combining natural history and physical science involves huge collaborations and faces many obstacles, not least of which is training. Compartmentalization of academic departments and funding bodies ensures that natural history is only rarely incorporated into biophysics curricula and that the converse is true for natural historians. But the potential rewards are immense. The first to reconcile the two traditions will be the first to glimpse the 'how' and 'why' of life from the molecule to the planet.
