  • Review Article

Resurrecting ancient genes: experimental analysis of extinct molecules

Key Points

  • Hypotheses about molecular evolution can be experimentally tested by resurrecting ancient genes and characterizing their functions.

  • An ancient gene is resurrected by phylogenetically inferring its sequence, synthesizing and subcloning it into an expression vector and expressing it in cell culture.

  • Maximum-likelihood methods for ancestral sequence reconstruction are an advance over previous methods because they are more accurate for very ancient sequences and they allow statistical confidence in the inference to be calculated at each sequence site.

  • Recent studies using likelihood-based phylogenetics have resurrected genes that are far more ancient — up to one billion years old — than was previously possible.

  • Ancestral sequence inference can be compromised by erroneous assumptions about the evolutionary process or the phylogenetic tree.

  • Studies that use resurrected genes should critically evaluate statistical confidence in ancestral state inferences, with a particular focus on sites that are known to be functionally important.

  • Errors in ancestral sequence reconstruction will usually — but not always — bias resurrected genes towards non-functionality.

  • In the future, ancestral gene resurrection will be combined with site-directed mutagenesis and experimental evolution systems to determine the specific mechanisms and dynamics by which new protein functions have evolved.
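The per-site statistical confidence mentioned in the key points can be illustrated with a minimal sketch. It assumes a star-shaped tree (every sampled sequence attached directly to the ancestor), the Jukes-Cantor nucleotide model and a uniform prior on the ancestral base. These are simplifications chosen for illustration, not the models used in the studies discussed, and the function names are hypothetical.

```python
import math

BASES = "ACGT"

def jc69_prob(start, end, t):
    """Jukes-Cantor probability of observing `end` after branch length t, starting from `start`."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if start == end else 0.25 - 0.25 * decay

def root_posteriors(leaf_bases, branch_lengths):
    """Posterior probability of each base at the ancestor for a single site (star tree assumed)."""
    weights = {}
    for x in BASES:
        w = 0.25  # uniform prior on the ancestral base (an assumption)
        for b, t in zip(leaf_bases, branch_lengths):
            w *= jc69_prob(x, b, t)
        weights[x] = w
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

def reconstruct(alignment, branch_lengths):
    """Return, for each site, the most probable ancestral base and its posterior probability."""
    result = []
    for column in zip(*alignment):
        post = root_posteriors(column, branch_lengths)
        best = max(post, key=post.get)
        result.append((best, post[best]))
    return result

if __name__ == "__main__":
    # Three aligned toy sequences, each 0.1 substitutions per site from the ancestor.
    for base, prob in reconstruct(["AAG", "AAT", "AGT"], [0.1, 0.1, 0.1]):
        print(base, round(prob, 4))
```

Sites where the best state has a low posterior probability are the ones that would deserve closer scrutiny, particularly if they are functionally important.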


There are few molecular fossils: with the rare exception of DNA fragments preserved in amber, ice or peat, no physical remnants preserve the intermediate forms that existed during the evolution of today's genes. But ancient genes can now be reconstructed, expressed and functionally characterized, thanks to improved techniques for inferring and synthesizing ancestral sequences. This approach, known as 'ancestral gene resurrection', offers a powerful new way to empirically test hypotheses about the function of genes from the deep evolutionary past.


Figure 1: The ancestral gene resurrection strategy.





I thank E. Gaucher, T. Dean, D. Hillis, J. Bull and L. Ancel Meyers for enlightening discussions and ideas. Two anonymous reviewers made very helpful suggestions on the manuscript. Supported in part by the National Institute of General Medical Sciences.

Ethics declarations

Competing interests

The author declares no competing financial interests.

Related links





oestrogen receptors





Introduction to phylogenetic inference

Joe Thornton's laboratory

Linus Pauling

MrBayes software

PAML software

PAUP software

Peter Wilson's introduction to likelihood-based phylogenetics

Steroid hormone receptors

Steve Benner's laboratory

Willi Hennig



An animal that shows bilateral symmetry across a body axis. Bilaterians include chordates, arthropods, nematodes, annelids and molluscs, among other groups.


The 'same' gene in more than one species. Orthologues descend from a speciation event.


In phylogenetics, sequences that are known a priori to be more distantly related to the other sequences in the analysis (the ingroup sequences) than the ingroup sequences are to each other.


Preferential use of certain DNA codons over others that code for the same amino acid.


A family of biochemical procedures used to determine the affinity and specificity with which a protein binds a specific ligand or substrate.


The class of bony vertebrate fish with ray-like fins and symmetrical tails. It includes the vast majority of marine and freshwater bony fishes.


The principle that the best-supported evolutionary inference is the one that requires the fewest character changes. This criterion rests on the assumption that identical character states among closely related species are more likely to have descended from the same state in the species' common ancestor than to have evolved multiple times.
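As an illustration of the parsimony criterion for a single character, the following is a minimal sketch of the bottom-up pass of Fitch's algorithm. It assumes a rooted binary tree written as nested tuples with observed states at the tips; the representation and the function name are choices made here for illustration.

```python
def fitch(node):
    """Return (candidate ancestral states, minimum number of changes) for a nested-tuple tree.

    A tip is a string holding the observed state; an internal node is a (left, right) tuple.
    """
    if isinstance(node, str):
        return {node}, 0
    left, right = node
    left_states, left_cost = fitch(left)
    right_states, right_cost = fitch(right)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change is needed here.
        return shared, left_cost + right_cost
    # The children disagree: keep every candidate state and count one change.
    return left_states | right_states, left_cost + right_cost + 1

if __name__ == "__main__":
    tree = ((("A", "A"), "G"), ("G", "G"))
    print(fitch(tree))
```

The returned set lists the states at the root that are compatible with the minimum number of changes. When it holds more than one state, parsimony alone cannot decide between them.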


A member of the animal taxon that includes cows, sheep, pigs, giraffes, camels, oxen, whales, hippopotami and other two-toed hoofed mammals.


A method for hypothesis testing in a likelihood framework. A data set's fit to a more complex model is compared with its fit to a simpler model using the likelihood ratio statistic (twice the natural logarithm of the ratio of the likelihoods of the two models, or equivalently twice the difference in their log-likelihoods). The more complex model is adopted if it increases the likelihood more than expected by chance at some critical probability. If the simpler model is a restricted version of the more complex model, the improvement in fit can be evaluated using a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters.
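A small worked sketch of this test for nested models follows. It takes the statistic as twice the difference in log-likelihoods and compares it with a chi-square distribution. The helper `chi2_sf` is written here with the standard library only, and both function names and the example log-likelihoods are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    z = x / 2.0
    # Closed-form starting points for df = 1 and df = 2.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(lnl_simple, lnl_complex, extra_params):
    """Return (statistic, p-value) for a simpler model nested within a more complex one."""
    statistic = max(0.0, 2.0 * (lnl_complex - lnl_simple))
    return statistic, chi2_sf(statistic, extra_params)

if __name__ == "__main__":
    # Hypothetical log-likelihoods; the complex model has one extra free parameter.
    stat, p = likelihood_ratio_test(-1000.0, -995.0, 1)
    print(stat, p)
```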


In phylogenetics, a probabilistic technique for evaluating trees, evolutionary models and ancestral state assignments. Hypotheses are evaluated by their posterior probabilities.


In Bayesian statistics, the probability that a hypothesis is true after the data have been analysed. The posterior probability is defined as the likelihood of the hypothesis multiplied by its prior probability, divided by the sum of the likelihood multiplied by the prior for all hypotheses.
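Written as a formula, the verbal definition above corresponds to Bayes' theorem. The symbols are notation introduced here for illustration: D stands for the data and H_1, H_2, ... for the competing hypotheses (for example, the possible ancestral states at a site).

```latex
P(H_i \mid D) = \frac{P(D \mid H_i)\, P(H_i)}{\sum_{j} P(D \mid H_j)\, P(H_j)}
```

Here P(D | H_i) is the likelihood of hypothesis H_i and P(H_i) is its prior probability.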


A technique for efficient numerical calculation of Bayesian posterior probabilities.


A phylogenetically related family of intracellular transcription factors that mediate the effects of oestrogens, androgens, progestins, glucocorticoids and mineralocorticoids on physiology and development.


A measure of support in a parsimony context for individual nodes in a phylogeny. The branch support — also known as the decay index or Bremer support — is the number of extra evolutionary changes that are required for a clade not to occur in the most parsimonious phylogeny.


A measure of support for individual nodes in a phylogeny. Sequence sites are sampled randomly with replacement from the original data set, and the optimal tree is inferred. This process is repeated many times, and the bootstrap proportion for a clade is the frequency of bootstrap replicates in which it occurs. A high bootstrap proportion indicates that the clade is not likely to be the result of sampling error in the sequence data.
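The resampling procedure can be sketched as follows. Real analyses would infer a full tree from each replicate; here `ab_closest` is a toy stand-in, assumed purely for illustration, that reports whether taxa 'a' and 'b' are more similar to each other than either is to 'c'. All names are hypothetical.

```python
import random

def bootstrap_columns(alignment, rng):
    """Resample alignment sites (columns) with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}

def bootstrap_proportion(alignment, clade_found, replicates=200, seed=0):
    """Fraction of bootstrap replicates in which `clade_found` reports the clade."""
    rng = random.Random(seed)
    hits = sum(bool(clade_found(bootstrap_columns(alignment, rng)))
               for _ in range(replicates))
    return hits / replicates

def ab_closest(alignment):
    """Toy stand-in for tree inference: True if 'a' and 'b' differ at the fewest sites."""
    def diff(x, y):
        return sum(p != q for p, q in zip(alignment[x], alignment[y]))
    return diff("a", "b") < min(diff("a", "c"), diff("b", "c"))

if __name__ == "__main__":
    toy = {"a": "ACGTACGTAC", "b": "ACGTACGTTC", "c": "TCGAACCTAG"}
    print(bootstrap_proportion(toy, ab_closest))
```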


A family of statistical methods for comparing two phylogenies as explanations of a data set. The difference in the log-likelihoods of the two trees is calculated separately for each sequence site. If one tree is a better fit to the data than the other, the mean of these differences will be significantly different from zero.
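The core calculation, per-site differences in log-likelihood and a test of whether their mean departs from zero, can be sketched as below. This is a simplified paired comparison using a normal approximation, assumed here for illustration; published tests in this family differ in how they build the null distribution. The function name and input values are hypothetical.

```python
import math

def site_difference_test(site_lnl_tree1, site_lnl_tree2):
    """Return (mean per-site log-likelihood difference, z-score of that mean)."""
    diffs = [a - b for a, b in zip(site_lnl_tree1, site_lnl_tree2)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two sites")
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    std_err = math.sqrt(variance / n)
    if std_err == 0.0:
        raise ValueError("all per-site differences are identical")
    return mean, mean / std_err

if __name__ == "__main__":
    # Hypothetical per-site log-likelihoods of the same alignment under two trees.
    tree1 = [-1.0, -1.2, -0.9, -1.1]
    tree2 = [-1.5, -1.4, -1.6, -1.3]
    print(site_difference_test(tree1, tree2))
```

A z-score far from zero suggests that one tree fits the data better than the other across sites, rather than at a handful of sites only.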


A change among phylogenetic lineages in the substitution rate for a sequence site or set of sites.


A multidimensional plot that shows the fitness (on the vertical axis) for all possible variants of a sequence (occupying the horizontal axis, or sequence space).


About this article

Cite this article

Thornton, J. Resurrecting ancient genes: experimental analysis of extinct molecules. Nat Rev Genet 5, 366–375 (2004).
