Abstract

Mutations create variation in the population, fuel evolution and cause genetic diseases. Current knowledge about de novo mutations is incomplete and mostly indirect [1–10]. Here we analyze 11,020 de novo mutations from the whole genomes of 250 families. We show that de novo mutations in the offspring of older fathers are not only more numerous [11–13] but also occur more frequently in early-replicating, genic regions. Functional regions exhibit higher mutation rates due to CpG dinucleotides and show signatures of transcription-coupled repair, whereas mutation clusters with a unique signature point to a new mutational mechanism. Mutation and recombination rates independently associate with nucleotide diversity, and regional variation in human-chimpanzee divergence is only partly explained by heterogeneity in mutation rate. Finally, we provide a genome-wide mutation rate map for medical and population genetics applications. Our results provide new insights and refine long-standing hypotheses about human mutagenesis.

References

  1. Population genetics of polymorphism and divergence. Genetics 132, 1161–1176 (1992).
  2. A hidden Markov model approach to variation among sites in rate of evolution. Mol. Biol. Evol. 13, 93–104 (1996).
  3. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
  4. et al. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913 (2005).
  5. De novo mutations in human genetic disease. Nat. Rev. Genet. 13, 565–575 (2012).
  6. DNA Repair and Mutagenesis (ASM Press, 1995).
  7. Direct estimates of human per nucleotide mutation rates at 20 loci causing mendelian diseases. Hum. Mutat. 21, 12–27 (2003).
  8. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl. Acad. Sci. USA 107, 961–968 (2010).
  9. Variation in the mutation rate across mammalian genomes. Nat. Rev. Genet. 12, 756–766 (2011).
  10. et al. The influence of genomic context on mutation patterns in the human genome inferred from rare variants. Genome Res. 23, 1974–1984 (2013).
  11. et al. Whole-genome sequencing in autism identifies hot spots for de novo germline mutation. Cell 151, 1431–1442 (2012).
  12. et al. Rate of de novo mutations and the importance of father's age to disease risk. Nature 488, 471–475 (2012).
  13. Genomes of the Netherlands Consortium. Whole-genome sequence variation, population structure and demographic history of the Dutch population. Nat. Genet. 46, 818–825 (2014).
  14. Age-associated sperm DNA methylation alterations: possible implications in offspring disease susceptibility. PLoS Genet. 10, e1004458 (2014).
  15. DNA replication timing: coordinating genome stability with genome regulation on the X chromosome and beyond. Bioessays 36, 997–1004 (2014).
  16. Distinct changes of genomic biases in nucleotide substitution at the time of Mammalian radiation. Mol. Biol. Evol. 20, 1887–1896 (2003).
  17. et al. Hypermutable non-synonymous sites are under stronger negative selection. PLoS Genet. 4, e1000281 (2008).
  18. Neutral substitutions occur at a faster rate in exons than in noncoding DNA in primate genomes. Genome Res. 13, 838–844 (2003).
  19. et al. Reduced local mutation density in regulatory DNA of cancer genomes is linked to DNA repair. Nat. Biotechnol. 32, 71–75 (2014).
  20. et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190 (2010).
  21. et al. Estimating the human mutation rate using autozygosity in a founder population. Nat. Genet. 44, 1277–1281 (2012).
  22. et al. Clustered mutations in yeast and in human cancers can arise from damaged long single-strand DNA regions. Mol. Cell 46, 424–435 (2012).
  23. et al. Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012).
  24. The choice of nucleotide inserted opposite abasic sites formed within chromosomal DNA reveals the polymerase activities participating in translesion DNA synthesis. DNA Repair (Amst.) 12, 878–889 (2013).
  25. Substantial regional variation in substitution rates in the human genome: importance of GC content, gene density, and telomere-specific effects. J. Mol. Evol. 60, 748–763 (2005).
  26. A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 72, 1527–1535 (2003).
  27. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356, 519–520 (1992).
  28. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18, 337–340 (2002).
  29. et al. A high-resolution recombination map of the human genome. Nat. Genet. 31, 241–247 (2002).
  30. The impact of recombination on nucleotide substitutions in the human genome. PLoS Genet. 4, e1000071 (2008).
  31. et al. Fine-scale recombination rate differences between sexes, populations and individuals. Nature 467, 1099–1103 (2010).
  32. Widespread genomic signatures of natural selection in hominid evolution. PLoS Genet. 5, e1000471 (2009).
  33. Analysis of sequence conservation at nucleotide resolution. PLoS Comput. Biol. 3, e254 (2007).
  34. Interpreting the role of de novo protein-coding mutations in neuropsychiatric disease. Nat. Genet. 45, 234–238 (2013).
  35. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
  36. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
  37. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
  38. et al. Differential relationship of DNA replication timing to different forms of human mutation and variation. Am. J. Hum. Genet. 91, 1033–1040 (2012).
  39. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
  40. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  41. et al. Evolutionarily conserved replication timing profiles predict long-range chromatin interactions and distinguish closely related cell types. Genome Res. 20, 761–770 (2010).
  42. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
  43. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
  44. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18, 1814–1828 (2008).
  45. et al. Ensembl 2013. Nucleic Acids Res. 41, D48–D55 (2013).
  46. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2014).
  47. et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 14, 708–715 (2004).
  48. et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294, 2348–2351 (2001).

Acknowledgements

We thank D. Gordenin for very helpful comments. The Genome of the Netherlands (GoNL) Project is funded by the Biobanking and Biomolecular Research Infrastructure (BBMRI-NL), which is financed by the Netherlands Organization for Scientific Research (NWO project 184.021.007). S.R.S., P.P.P. and S.C. are funded by US National Institutes of Health grants 1 R01 MH101244 and 1 R01 GM078598.

Author information

Author notes

    Laurent C Francioli, Paz P Polak & Amnon Koren contributed equally to this work.

    Paul I W de Bakker & Shamil R Sunyaev jointly supervised this work.

Affiliations

  1. Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, the Netherlands.
     Laurent C Francioli, Androniki Menelaou, Ivo Renkens, Wigard P Kloosterman & Paul I W de Bakker
  2. Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.
     Paz P Polak, Sung Chun & Shamil R Sunyaev
  3. Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA.
     Amnon Koren
  4. Department of Epidemiology, Erasmus Medical Center, Rotterdam, the Netherlands.
     Cornelia M van Duijn
  5. Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.
     Morris Swertz & Cisca Wijmenga
  6. Genomics Coordination Center, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.
     Morris Swertz & Cisca Wijmenga
  7. Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
     Gertjan van Ommen
  8. Section of Molecular Epidemiology, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands.
     P Eline Slagboom & Kai Ye
  9. Department of Biological Psychology, VU University Amsterdam, Amsterdam, the Netherlands.
     Dorret I Boomsma
  10. Genome Institute, Washington University, St. Louis, Missouri, USA.
     Kai Ye
  11. European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.
     Victor Guryev
  12. Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
     Peter F Arndt
  13. Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands.
     Paul I W de Bakker

Consortia

  1. Genome of the Netherlands Consortium

    A full list of members and affiliations appears in the Supplementary Note.

Authors

  1. Laurent C Francioli
  2. Paz P Polak
  3. Amnon Koren
  4. Androniki Menelaou
  5. Sung Chun
  6. Ivo Renkens
  7. Cornelia M van Duijn
  8. Morris Swertz
  9. Cisca Wijmenga
  10. Gertjan van Ommen
  11. P Eline Slagboom
  12. Dorret I Boomsma
  13. Kai Ye
  14. Victor Guryev
  15. Peter F Arndt
  16. Wigard P Kloosterman
  17. Paul I W de Bakker
  18. Shamil R Sunyaev

Contributions

S.R.S. and P.I.W.d.B. planned and directed the research. L.C.F. called and filtered the mutations. W.P.K. and I.R. validated candidate mutations. L.C.F. designed and executed the simulations. A.K., L.C.F. and A.M. performed replication timing analyses. P.P.P., L.C.F. and A.M. analyzed factors influencing regional mutation rates and spectra. L.C.F. and P.P.P. analyzed mutation clusters. P.P.P. and P.F.A. computed the comparative genomics model and compared it against observed mutation rates. S.C. and P.P.P. created the mutation rate map. L.C.F., P.P.P., A.K., P.I.W.d.B. and S.R.S. wrote the manuscript. A.M., S.C., C.M.v.D., M.S., C.W., G.v.O., P.E.S., D.I.B., K.Y., V.G., P.F.A. and W.P.K. provided critical feedback on the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Paul I W de Bakker or Shamil R Sunyaev.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–7, Supplementary Tables 1–4 and Supplementary Note.

About this article

DOI

https://doi.org/10.1038/ng.3292