Neutrality and Molecular Clocks

By: Soojin Yi (School of Biology, Georgia Institute of Technology, Atlanta, GA) © 2013 Nature Education 
Citation: Yi, S. (2013) Neutrality and Molecular Clocks. Nature Education Knowledge 4(2):3
In the early days of molecular evolution, one of the most tantalizing findings was the observation that protein sequence change seemed to occur at a rate proportional to the time since the evolutionary divergence of species. The term 'molecular clock' was coined to describe the phenomenon, comparing the manner in which sequence changes occur to regular ticks of a clock. Is there really such a molecular clock? If so, why? Can we use molecular clocks to date evolutionary events? After several decades of study, we have answers to some of these questions. More importantly, in the process we have learned a lot more about how changes at the molecular level accumulate throughout the genome.

The Beginning of the Molecular Evolutionary Clock

In the early 1960s, biologists began to investigate how proteins in different species evolve at the sequence level (Zuckerkandl & Pauling 1962, Margoliash 1963, Doolittle & Blombäck 1964). The proteins analyzed included hemoglobin (Zuckerkandl & Pauling 1962), cytochrome c (Margoliash 1963), and fibrinopeptides (Doolittle & Blombäck 1964). These early investigations led to a remarkable discovery: the number of differences between the protein sequences of different species appeared to be roughly proportional to the time since species divergence (Figure 1).

Figure 1
Rates of amino acid changes in fibrinopeptides, hemoglobin, and cytochrome c. The three proteins show different rates of change per unit time. However, for each protein, the rate of change per unit time appears to be approximately constant. If these trends hold for other proteins and for all species comparisons, we can easily use the molecular differences to date species divergence. For example, if we found that the fibrinopeptides of two species show 110 changes per 100 residues (shown as the red dot in the plot), we can infer that those two species diverged approximately 120 million years ago.
© 2013 Nature Education Modified from Dickerson (1971). All rights reserved.
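
The dating logic in the Figure 1 caption amounts to simple arithmetic, sketched here in Python. The function names are illustrative. The calibration point (110 changes per 100 residues at about 120 million years) is the example given in the caption, and the query value of 55 changes is a hypothetical input. The sketch assumes a strictly constant rate, which is the idealized case only.

```python
def calibrate_rate(changes_per_100_residues, divergence_time_my):
    """Changes per 100 residues per million years, from one calibration point."""
    return changes_per_100_residues / divergence_time_my


def estimate_divergence_time(changes_per_100_residues, rate):
    """Divergence time (million years) implied by a constant-rate clock."""
    return changes_per_100_residues / rate


if __name__ == "__main__":
    # Calibration taken from the Figure 1 caption example.
    rate = calibrate_rate(110, 120)
    # Hypothetical species pair with half as many changes.
    print(round(estimate_divergence_time(55, rate), 1))
```

Under a constant clock, half as many changes implies half the divergence time. Any departure from rate constancy translates directly into an error in the inferred date.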

Zuckerkandl & Pauling (1965) likened the constant accumulation of amino acid substitutions over time to the regular ‘ticks’ of a clock, and stated that ‘there may exist a molecular evolutionary clock’. Thus, the term ‘molecular clock’ was initially coined to describe changes in amino acids occurring in proportion to the time since species divergence.

Since its first use, the term ‘molecular clock’ has been used in many different contexts. Nowadays, it is often used simply to refer to the number of changes, or ‘substitutions’, accumulated in the DNA or protein sequence of a given lineage. The number of substitutions per defined unit of time can be described as the ‘rate’ of the molecular clock, which in this context is equivalent to the ‘evolutionary rate’. Note that the initial connotation still holds in some cases. For example, when people state ‘we assumed a molecular clock’ (particularly in phylogenetic analyses), they mean that substitutions were assumed to accumulate at a constant rate over time.

The concept of a constant molecular clock has extraordinary implications for evolutionary biologists. If a constant molecular clock as initially proposed truly existed, inferring the timing of evolutionary events would become a rather straightforward problem (Figure 1). However, it has become abundantly clear that substitutions do not accumulate at a constant rate over time across different lineages (Kumar 2005). Nevertheless, the concept of the molecular clock has been extremely influential in the field of molecular evolution. One of the most important ideas inspired by the concept of the molecular clock is the neutral theory of molecular evolution.

The Molecular Clock and Neutrality

When the idea of a constant molecular clock first emerged, it was thought that the predominant evolutionary force underlying amino acid or nucleotide substitutions was natural selection. Following this line of thinking, a constant molecular clock would indicate that adaptive substitutions in different species occur constantly over time. However, it is hard to explain how adaptive substitutions would occur in such a clock-like manner. Theoretically, the fates of adaptive mutations are determined by several evolutionary parameters, such as the strength of the selective advantage of that mutation, the size of the effective population, and adaptive mutation rates (Kimura 1983). These parameters are likely to differ between species, and even within a species, depending upon specific mutations and their interactions with environments.

Instead, Kimura (1968, 1969) proposed that most changes at the molecular level have little functional consequence, or are ‘neutral’. If a mutation has no fitness consequence, its fate in the population is determined entirely by chance. This means that we cannot predict whether a specific neutral mutation will eventually be fixed in the population. However, the rate at which neutral substitutions occur in the population can be predicted because it depends upon a single parameter, namely the mutation rate (Kimura 1968).

Let's imagine a population of N haploid individuals. If neutral mutations occur at rate u per individual per generation, the total number of new mutations in one generation will be N times u. Since all these new mutations are neutral, their fates are determined entirely by chance. In other words, all mutations have an equal chance of reaching fixation (which leads to a ‘substitution’). Because each new mutation starts as a single copy among N, the probability that it will eventually reach fixation is simply 1/N. The rate of substitution is calculated as the number of new mutations in each generation (Nu) multiplied by the probability that each new mutation reaches fixation (1/N), which equals u. In other words, for neutral mutations, the rate of substitution is equal to the rate of mutation!
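
The argument above can be checked with a small simulation, sketched here in Python. It assumes a haploid Wright-Fisher model (each offspring picks its parent at random with replacement), which is one common way to formalize ‘determined by chance’ and is an assumption of this sketch rather than something specified in the text. The function names, population size, and replicate count are illustrative.

```python
import random


def fixation_fraction(pop_size, replicates, seed=0):
    """Fraction of replicates in which one new neutral mutation reaches fixation."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        count = 1  # a single mutant copy among pop_size individuals
        while 0 < count < pop_size:
            freq = count / pop_size
            # Each of the pop_size offspring inherits the mutant with probability freq.
            count = sum(rng.random() < freq for _ in range(pop_size))
        if count == pop_size:
            fixed += 1
    return fixed / replicates


def neutral_substitution_rate(pop_size, mutation_rate):
    """New mutations per generation (N * u) times fixation probability (1 / N)."""
    return (pop_size * mutation_rate) * (1.0 / pop_size)


if __name__ == "__main__":
    n = 20
    print("simulated fixation fraction:", fixation_fraction(n, 4000))
    print("expected 1/N:", 1 / n)
    print("substitution rate for u = 1e-8:", neutral_substitution_rate(n, 1e-8))
```

The simulated fixation fraction should fall close to 1/N, and the population size cancels out of the substitution rate, leaving only u.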

Therefore, if most mutations are neutral (as proposed in the neutral theory) and if mutation rates are constant over time, substitutions should occur constantly over time as well. We should then observe clock-like regular rates of substitutions at the molecular level. Kimura (1969) thus considered the observation of a relatively constant molecular clock in protein sequences as strong support for the neutral theory of molecular evolution.

Testing How the Neutral Molecular Clock Runs

According to the neutral theory, the question of whether substitution rates are constant over time or not is equivalent to whether neutral mutation rates are constant over time. For this reason, many subsequent studies focused on analyzing data from neutral sites to determine whether neutral mutation rates are indeed constant over time. We will briefly review how these studies are implemented, before discussing theoretical debates over molecular clocks. Analyses of protein molecular clocks also continued, but the debates surrounding variation in protein molecular clocks are very different from those concerning neutral molecular clocks, and will not be included in this article. Interested readers should consult Gillespie (1991), Kumar (2005), Kim and Yi (2008), and Bedford et al. (2008).

Most empirical analyses of neutral molecular clocks rely on the theorem that neutral mutation rates can be inferred from neutral substitution rates (Kimura 1968, 1969). In practice, each study defines a certain type of sites in the genome as neutral sites, and compares substitution rates of those sites between lineages.

Which sites in the genome are truly neutral cannot be completely determined, but scientists have come up with several useful proxies. Before the era of genome sequencing, most available sequence data came from protein-coding DNA sequences. Studies often divide protein-coding DNA sequences into two types of sites (Wu & Li 1985). The first type, ‘nonsynonymous sites’, includes those for which any change would lead to an amino acid substitution. The second type, ‘synonymous sites’, includes those that encode ‘degenerate’ positions in the codon table, where a change does not lead to an amino acid substitution. For example, TCT and TCC both encode serine. If the third position of such a codon changes from T to C, the codon still encodes the same amino acid. Such substitutions would be less visible to natural selection. Consequently, molecular clocks at synonymous sites should be closer to the neutral molecular clock than nonsynonymous clocks.
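
The distinction between synonymous and nonsynonymous changes can be made concrete with a short Python sketch. It builds the standard genetic code from its conventional compact form (codons ordered over T, C, A, G) and classifies a change between two codons. The names CODON_TABLE and classify_change are illustrative, and stop codons are shown as '*'.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    same = CODON_TABLE[codon_before] == CODON_TABLE[codon_after]
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    # The serine example from the text: third-position change, same amino acid.
    print("TCT -> TCC:", classify_change("TCT", "TCC"))
    # A first-position change that swaps serine for proline.
    print("TCT -> CCT:", classify_change("TCT", "CCT"))
```

Real analyses estimate rates per synonymous and per nonsynonymous site and correct for multiple hits, which this sketch does not attempt.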

As sequencing techniques advanced, some studies used introns as proxies for neutral sites, since they are not incorporated into the mature mRNAs, and thus are more likely to be neutral (Yi et al. 2002). Sequences of inactive transposable elements that were inserted long before species divergence were also often employed (these are often referred to as ‘ancestral repeats', e.g., Thomas et al., 2003). Finally, some studies used non-coding DNA sequences (all sequences after removing protein-coding DNA sequences) extracted from whole genome alignments to test neutral molecular clocks (Elango et al. 2006, Huttley et al. 2007).

The most commonly used test is the so-called ‘relative rate test' (Sarich & Wilson 1973). Initially, substitution rates per unit time were estimated by dividing the total number of differences (substitutions) between proteins of different species by the divergence time, estimated from fossil records (Figure 1). However, fossil records are not available for many comparisons, and are associated with large error margins. The relative rate test overcomes the necessity for fossil records (Figure 2). As long as an outgroup sequence to the two lineages of interest exists, we can determine whether the two branches follow the same or different molecular clocks, without the knowledge of the absolute time of divergence (Figure 2).
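
The core arithmetic of the relative rate test can be sketched in Python as follows. Given pairwise distances among lineages A and B and an outgroup C, the substitutions on the branches leading to A and to B since their common ancestor follow from the standard three-point formula. The function name and the input distances are hypothetical, and a real test would also assess whether the difference is statistically significant, which this sketch omits.

```python
def relative_rate_test(k_ab, k_ac, k_bc):
    """Branch lengths to A and B since their split, using outgroup C.

    k_ab, k_ac, k_bc are pairwise distances (substitutions per site).
    Returns (branch_a, branch_b, difference), where difference is
    branch_a - branch_b, which equals k_ac - k_bc.
    """
    branch_a = (k_ab + k_ac - k_bc) / 2
    branch_b = (k_ab + k_bc - k_ac) / 2
    return branch_a, branch_b, branch_a - branch_b


if __name__ == "__main__":
    # Hypothetical distances, chosen only for illustration.
    branch_a, branch_b, diff = relative_rate_test(0.10, 0.30, 0.26)
    print("branch A:", round(branch_a, 3))
    print("branch B:", round(branch_b, 3))
    print("difference:", round(diff, 3))
```

Because A and B have had the same amount of time since their split, a difference near zero is consistent with a shared clock, while a clearly non-zero difference indicates that one lineage has accumulated substitutions faster. No divergence date enters the calculation.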

Figure 2
© 2013 Nature Education Modified from Dickerson (1971). All rights reserved.

What are the Determinants of Neutral Molecular Clocks?

Almost all the controversies at the heart of debates over neutral molecular clocks stem from the question of what the major sources of mutations are. This question is directly relevant to understanding patterns of mutation, which are the ultimate source of evolutionary change and genetic disease. Furthermore, understanding how mutation rates vary between lineages and within genomes is a fundamental question in comparative genomics, which aims to use sequence comparisons to identify genomic regions that are functionally important.

So what determines neutral mutation rates? One of the most important contributors to neutral molecular clocks is lineage-specific variation in generation times. From early on, the idea of a constant neutral molecular clock was perceived as being at odds with the molecular mechanisms of germline mutation. It has long been considered that most mutations arise from errors in DNA replication in germlines (Haldane 1947, Muller 1954). Since mutations occur when germline DNA is replicated for the next generation, they should accumulate in proportion to the number of generations, rather than the absolute amount of time. Therefore, if we compared the numbers of substitutions that have accumulated in two lineages since their divergence, the lineage with longer generation time, having undergone fewer DNA replication events, would harbor fewer substitutions compared to the lineage with the shorter generation time. Consequently, the molecular clock should run more slowly in species with longer generation times. This idea is referred to as the ‘generation-time effect'.
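
A minimal Python sketch of the generation-time effect follows. It assumes, as the argument above does, that all mutations arise at a fixed rate per generation, so that the expected number of neutral substitutions depends on how many generations fit into a given stretch of time. The function name and all numbers are hypothetical.

```python
def expected_substitutions_per_site(years, generation_time_years,
                                    mutation_rate_per_generation):
    """Expected neutral substitutions per site under a per-generation clock."""
    generations = years / generation_time_years
    return generations * mutation_rate_per_generation


if __name__ == "__main__":
    years = 10_000_000  # hypothetical time since divergence
    rate = 1e-8         # hypothetical per-site, per-generation mutation rate
    short_lived = expected_substitutions_per_site(years, 5, rate)
    long_lived = expected_substitutions_per_site(years, 20, rate)
    print("ratio, short vs. long generation time:", short_lived / long_lived)
```

With identical per-generation rates and identical elapsed time, the lineage with the four-fold shorter generation time is expected to accumulate four times as many substitutions.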

In fact, the generation-time effect was first observed in studies of primates, even before the debate on the molecular clock. Morris Goodman, who was using immunological methods to investigate species relatedness at the time, observed that the rate at which some proteins diverge appeared to be decreased in apes, in particular humans, compared to Old World monkeys (Goodman 1961, 1962, 1963). This effect is referred to as the ‘hominoid rate slowdown’. Since hominoids have longer generation times than Old World monkeys, this observation can be explained by the generation-time effect.

Wu & Li (1985) were the first to test the generation-time effect hypothesis using DNA sequence data. They used data from 11 genes of primates and rodents. Since primates have a much longer generation time than rodents do, the molecular clock should be faster in rodents compared to primates. Indeed, they found that for synonymous sites, rodents show approximately two times the rate of molecular evolution when compared to primates (Wu & Li 1985). For nonsynonymous sites, however, such an effect was not found. In other words, the neutral molecular clock, but not the amino acid molecular clock, ticks faster in the rodent lineage compared to the primate lineage, which fits well with the idea of a generation-time effect.

Subsequent studies provided further support for the hominoid rate slowdown (Li & Tanimura 1987, Bailey et al. 1991) and the rate difference between the rodent and the primate lineages (Gu & Li 1992, Huttley et al. 2007). Moreover, rate differences were observed at even smaller phylogenetic scales, especially in primates: for example, the human molecular clock runs slower than the chimpanzee molecular clock (Elango et al. 2006), and rates in New World monkeys are faster than the rates in hominoids and Old World monkeys (Steiper & Young 2006). The different rates of molecular clocks observed in these studies are qualitatively consistent with the generation-time effect.

However, the actual differences between lineages are not quantitatively consistent with the difference in generation times. For example, Kumar & Subramanian (2002) showed that even though the difference in generation times between primates and rodents is much bigger than that between humans and Old World monkeys, the observed differences in molecular clocks are similar in these two comparisons. It is worthwhile to note that Kumar & Subramanian (2002) used specific statistical filters to remove data that showed ‘heterogeneous' substitution patterns, which might have caused a bias towards slowly evolving sequences (Yi et al. 2002). Nevertheless, the difference in the molecular clocks of primates and rodents appears much less than originally proposed by Wu & Li (1985). For example, Huttley et al. (2007) analyzed whole genome alignments of several species including the opossum, and showed that the rate difference between eutherian lineages and the opossum lineage (~30%) is much greater than the rate difference between human and mouse lineages (~14%). These examples demonstrate that the degree of differences in molecular clocks varies significantly among different studies due to differences in data sets and statistical methods. They also show that rate differences between lineages cannot be completely accounted for by the difference in generation times alone. Clearly, there are other contributors to neutral molecular clocks.

Indeed, life-history traits other than generation times appear to co-vary with molecular clocks. Martin & Palumbi (1993) showed that DNA molecular clocks run faster in species with small body size. This observation led to the hypothesis that metabolic rates are important determinants of molecular clocks. A high metabolic rate produces large numbers of mutagenic oxygen radicals, which would increase mutation rates (Rand 1994). Because metabolic rates and body size generally co-vary with generation times, it has been difficult to distinguish which of these constitutes the main determinant of molecular clock rates. Tsantes & Steiper (2009) have proposed, based upon data from primates, that age at first reproduction, rather than body size, is the main determinant of molecular clocks. Since age at first reproduction reflects the generation-time effect, this study supports the idea that the generation-time effect is the main determinant of the molecular clock. However, this study is still based upon a limited number of lineages (four pairs of species were used). Thus, distinguishing between the effects of body size, generation time, and metabolic rate remains an important issue in generalizing and understanding neutral molecular clocks.

Furthermore, the importance of factors that do not co-vary with generation times, such as DNA methylation, has been increasingly appreciated. DNA methylation is a chemical modification of genomic DNA found in diverse taxa. In animal genomes, DNA methylation occurs almost exclusively at cytosines followed by guanines (so-called ‘CpG's). Methylated cytosines, in turn, tend to mutate rapidly to thymines due to chemical instability (Bird 1980). Indeed, in the human genome, mutations caused by DNA methylation occur more than an order of magnitude more frequently than other mutations (Nachman & Crowell 2000, Elango et al. 2008). Because mutations caused by DNA methylation occur largely independently of DNA replication, such mutations may follow different molecular clocks than others. Specifically, instead of generation-time dependency, mutations caused by DNA methylation may follow a time-dependent molecular clock, which is similar to what was initially proposed by Zuckerkandl & Pauling (Kim et al. 2006)!

To test this hypothesis, Kim et al. (2006) compared human-chimpanzee divergence to macaque-baboon divergence, two species pairs that share similar divergence times but differ in generation time (Steiper et al. 2004). The human-chimpanzee pair (the hominoid pair) has much longer generation times compared to the macaque-baboon pair (the Old World monkey pair). This study showed that for non-CpG sites, the Old World monkey pair accumulated approximately 30% more substitutions, which can be explained by the aforementioned hominoid rate slowdown. In contrast, molecular clocks at CpG sites showed similar numbers of substitutions in the hominoid and Old World monkey pairs (Figure 3). Thus, time-dependent and generation time-dependent molecular clocks co-exist within the same genomes. The assumption that a single molecular clock may exist for a given lineage is no longer valid, because the predominant mutational forces vary among genomic regions.
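
The contrast between the two kinds of clock can be illustrated with a toy model in Python. The model is an assumption of this sketch and is not taken from Kim et al. (2006): it treats substitutions as the sum of a replication-dependent component (a rate per generation) and a replication-independent component (a rate per year), and compares two species pairs with equal divergence times. The generation times of 10 and 13 years and all rates are hypothetical values chosen only so that the purely replication-dependent case gives a ratio near 1.3.

```python
def expected_substitutions(years, generation_time_years,
                           rate_per_generation, rate_per_year):
    """Substitutions per site from replication-dependent plus time-dependent input."""
    return years * (rate_per_year + rate_per_generation / generation_time_years)


def pair_ratio(gen_time_short, gen_time_long,
               rate_per_generation, rate_per_year, years=1.0):
    """Ratio of substitutions: short-generation pair over long-generation pair."""
    k_short = expected_substitutions(years, gen_time_short,
                                     rate_per_generation, rate_per_year)
    k_long = expected_substitutions(years, gen_time_long,
                                    rate_per_generation, rate_per_year)
    return k_short / k_long


if __name__ == "__main__":
    # Purely replication-dependent input (a stand-in for non-CpG sites).
    print("replication-dependent only:", round(pair_ratio(10, 13, 1e-8, 0.0), 2))
    # Purely time-dependent input (a stand-in for methylation-driven CpG changes).
    print("time-dependent only:", round(pair_ratio(10, 13, 0.0, 1e-9), 2))
```

In this toy model the ratio equals the ratio of generation times when all mutations are replication-dependent, and equals one when all are time-dependent. Intermediate mixtures give intermediate ratios.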

Figure 3
(A) Phylogeny of the four taxa analyzed in Kim et al. (2006). TO denotes the time since the split between the two Old World monkey species, and TH denotes the time since the split between the two hominoids. Fossil records suggest that TO and TH are approximately similar. X and Y denote the common ancestors of the human-chimpanzee pair and of the macaque-baboon pair, respectively. (B) Contrasting molecular clocks of transitions at CpG sites vs. those at non-CpG sites. The Y-axis shows the ratio of the numbers of substitutions accumulated in the baboon-macaque pair (KO) to that in the human-chimpanzee pair (KH). For non-CpG sites, this ratio is around 1.3, which is similar to the ratios observed in earlier studies of hominoid rate slowdown. In contrast, transitions at CpG sites, which are primarily of methylation origin, show no difference between the two pairs.
© 2013 Nature Education (A) and (B) are adapted from Kim et al. (2006). All rights reserved.

Conclusions

The concept of a constant molecular clock was initially proposed based upon a limited amount of protein sequence data. Even though subsequent studies showed that such an observation is not a general pattern in amino acids, it has had significant influence on the field of molecular evolution, in particular on the development of the neutral theory of molecular evolution for DNA sequence data. Following the neutral theory, studies focused on elucidating patterns of variation in neutral mutation rates. During the last several decades, we have observed that molecular clocks run at different rates between lineages. Moreover, the degree of variation could vary depending upon the different types of data and specific statistical methods used. The generation-time effect continues to hold at a qualitative level, but is insufficient to explain quantitative variation of neutral mutation rates among lineages. Life-history traits and non-replication dependent mutations, such as those caused by DNA methylation, are also important contributors to genomic molecular clocks. Indeed, different types of molecular clocks are observed even within a genome, because the predominant mutational inputs vary between different genomic regions. Thus, rather than assuming a single neutral molecular clock for each genome, future studies should aim to reveal the variation of genomic neutral molecular clocks, to learn about genomic mutational landscapes. Such information is not only useful for understanding the raw material governing molecular evolution and genetic disease, but also constitutes a critical component influencing comparative and functional genomic analyses to identify functional genomic regions.

References and Recommended Reading


Bailey, W. et al. Molecular evolution of the psi eta-globin gene locus: Gibbon phylogeny and the hominoid slowdown. Molecular Biology and Evolution 8, 155-184 (1991).

Bedford, T. et al. Overdispersion of the molecular clock varies between yeast, Drosophila and mammals. Genetics 179, 977-984 (2008).

Bird, A. DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Research 8, 1499-1504 (1980).

Dickerson, R. E. The structure of cytochrome c and the rates of molecular evolution. Journal of Molecular Evolution 1, 26-45 (1971).

Doolittle, R. F. & Blombäck, B. Amino-acid sequence investigations of fibrinopeptides from various mammals: Evolutionary implications. Nature 202, 147-152 (1964).

Easteal, S. The relative rate of DNA evolution in primates. Molecular Biology and Evolution 8, 115-127 (1991).

Elango, N. et al. Mutations of different molecular origins exhibit contrasting patterns of regional substitution rate variation. PLoS Computational Biology 4, e1000015 (2008).

Elango, N. et al. Variable molecular clocks in hominoids. Proceedings of the National Academy of Sciences (USA) 103, 1370-1375 (2006).

Gillespie, J. H. The Causes of Molecular Evolution. Oxford, UK: Oxford University Press, 1991.

Goodman, M. The role of immunologic differences in the phyletic development of human behavior. Human Biology 33, 131-162 (1961).

Goodman, M. Evolution of the immunologic species specificity of human serum proteins. Human Biology 34, 104-150 (1962).

Goodman, M. "Man's place in the phylogeny of the primates as reflected in serum proteins" in Classification and Human Evolution, ed. S. L. Washburn (Chicago: Aldine Press, 1963), 204-234.

Gu, X. & Li, W.-H. Higher rates of amino acid substitution in rodents than in humans. Molecular Phylogenetics and Evolution 1, 211-214 (1992).

Haldane, J. B. S. The mutation rate of the gene for hemophilia, and its segregation ratios in males and females. Annals of Eugenics 13, 262-272 (1947).

Huttley, G. A. et al. Rates of genome evolution and branching order from whole genome analysis. Molecular Biology and Evolution 24, 1722-1730 (2007).

Kim, S.-H. et al. Heterogeneous genomic molecular clocks in primates. PLoS Genetics 2, e163 (2006).

Kim, S.-H. & Yi, S. Mammalian nonsynonymous sites are not overdispersed: Comparative genomic analysis of index of dispersion of mammalian proteins. Molecular Biology and Evolution 25, 634-642 (2008).

Kimura, M. The rate of molecular evolution considered from the standpoint of population genetics. Proceedings of the National Academy of Sciences (USA) 63, 1181-1188 (1969).

Kimura, M. The Neutral Theory of Molecular Evolution. Cambridge, UK: Cambridge University Press, 1983.

Kimura, M. Evolutionary rate at the molecular level. Nature 217, 624-626 (1968).

Kumar, S. Molecular clocks: Four decades of evolution. Nature Reviews Genetics 6, 654-662 (2005).

Kumar, S. & Subramanian, S. Mutation rates in mammalian genomes. Proceedings of the National Academy of Sciences (USA) 99, 803-808 (2002).

Li, W.-H. & Tanimura, M. The molecular clock runs more slowly in man than in apes and monkeys. Nature 326, 93-96 (1987).

Margoliash, E. Primary structure and evolution of cytochrome C. Proceedings of the National Academy of Sciences (USA) 50, 672-679 (1963).

Martin, A. P. & Palumbi, S. R. Body size, metabolic rate, generation time, and the molecular clock. Proceedings of the National Academy of Sciences (USA) 90, 4087-4091 (1993).

Muller, H. J. "The nature of the genetic effects produced by radiation," in Radiation Biology, ed. A. Hollaender (New York: McGraw-Hill, 1954), 351-473.

Nachman, M. W. & Crowell, S. L. Estimate of the mutation rate per nucleotide in humans. Genetics 156, 297-304 (2000).

Rand, D. M. Thermal habit, metabolic rate and the evolution of mitochondrial DNA. Trends in Ecology and Evolution 9, 125-131 (1994).

Sarich, V. M. & Wilson, A. C. Generation time and genomic evolution in primates. Science 179, 1144-1147 (1973).

Steiper, M. E. et al. Genomic data support the hominoid slowdown and an early Oligocene estimate for the hominoid-cercopithecoid divergence. Proceedings of the National Academy of Sciences (USA) 101, 17021-17026 (2004).

Steiper, M. E. & Young, N. M. Primate molecular divergence dates. Molecular Phylogenetics and Evolution 41, 384-394 (2006).

Tsantes, C. & Steiper, M. E. Age at first reproduction explains rate variation in the strepsirrhine molecular clock. Proceedings of the National Academy of Sciences (USA) 106, 18165-18170 (2009).

Wu, C.-I. & Li, W.-H. Evidence for higher rates of nucleotide substitution in rodents than in man. Proceedings of the National Academy of Sciences (USA) 82, 1741-1745 (1985).

Yi, S. et al. Slow molecular clocks in Old World monkeys, apes, and humans. Molecular Biology and Evolution 19, 2191-2198 (2002).

Zuckerkandl, E. & Pauling, L. B. "Evolutionary divergence and convergence in proteins," in Evolving Genes and Proteins, eds. V. Bryson & H. J. Vogel (New York: Academic Press, 1965), 97-166.

Zuckerkandl, E. & Pauling, L.B. "Molecular disease, evolution, and genetic heterogeneity," in Horizons in Biochemistry, eds. M. Kasha & B. Pullman (New York: Academic Press, 1962), 189-225.