Key Points
- The germline mutation rate varies across the mammalian genome at several scales: between adjacent nucleotides; over hundreds of nucleotides; over hundreds of thousands to millions of nucleotides; and between whole chromosomes.
- The strongest patterns are observed at the smallest scales.
- Variation between adjacent nucleotides can be either dependent on or independent of context.
- Large-scale variation in the mutation rate is underestimated by between-species comparisons.
- Variation between chromosomes is most conspicuous for the sex chromosomes.
- The somatic mutation rate also varies across the genome, and this variation has both similarities to and differences from that observed in the germline.
Abstract
It has been known for many years that the mutation rate varies across the genome. However, only with the advent of large genomic data sets is the full extent of this variation becoming apparent. The mutation rate varies over many different scales, from adjacent sites to whole chromosomes, with the strongest variation seen at the smallest scales. Some of these patterns have clear mechanistic bases, but much of the rate variation remains unexplained, and some of it is deeply perplexing. Variation in the mutation rate has important implications in evolutionary biology and underexplored implications for our understanding of hereditary disease and cancer.
Acknowledgements
The authors are grateful to Y. Chen for help in compiling the data in figure 4 and to L. Hurst, C. Pink, N. Stoletzki and the four anonymous referees for comments on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Glossary
- Pseudogenes: Copies of a gene that are no longer functional, usually because of deactivating mutations, such as premature stop codons.
- Transition: A mutation that converts a pyrimidine into another pyrimidine (that is, C↔T) or a purine into another purine (that is, A↔G).
- Transversions: Mutations that convert a pyrimidine into a purine or vice versa (for example, C→G or C→A).
- Coincident SNPs (cSNPs): Orthologous sites that contain a SNP in two species.
- Coincident single nucleotide substitutions (cSNSs): Orthologous sites that have substitutions in two independent pairs of species.
- Site frequency spectra: The distributions of allele frequencies within a sample of sequences.
- DNase I hypersensitive sites: Sites that are digested by the endonuclease DNase I, which preferentially attacks exposed DNA, such as in open chromatin.
- Synonymous sites: Sites at which some or all of the mutations do not change the amino acid. The rate of substitution at synonymous sites refers only to the rate of synonymous substitution at such sites.
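Two of the glossary definitions can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the article: the function names `classify_substitution` and `site_frequency_spectrum` are hypothetical, and the spectrum is assumed to be the unfolded form, built from counts of a derived allele at each site in a sample of `sample_size` sequences.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-nucleotide change as a transition or a transversion.

    A transition stays within one chemical class (purine to purine or
    pyrimidine to pyrimidine); a transversion switches class.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical, so there is no substitution")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def site_frequency_spectrum(derived_counts, sample_size):
    """Return the unfolded site frequency spectrum as a list.

    Entry i - 1 is the number of sites at which the derived allele is
    seen exactly i times, for i = 1 .. sample_size - 1. Sites with a
    count of 0 or sample_size are not polymorphic and are ignored.
    """
    tally = Counter(derived_counts)
    return [tally.get(i, 0) for i in range(1, sample_size)]


if __name__ == "__main__":
    print(classify_substitution("C", "T"))  # same class: both pyrimidines
    print(classify_substitution("C", "G"))  # class switch: pyrimidine to purine
    print(site_frequency_spectrum([1, 1, 2, 3, 1], sample_size=4))
```

The spectrum is returned as a plain list so that it can be compared between data sets of equal sample size; a folded spectrum, which is often used when the ancestral state is unknown, would need an extra step that is not shown here.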
About this article
Cite this article
Hodgkinson, A., Eyre-Walker, A. Variation in the mutation rate across mammalian genomes. Nat Rev Genet 12, 756–766 (2011). https://doi.org/10.1038/nrg3098
DOI: https://doi.org/10.1038/nrg3098