Article

De novo mutations in regulatory elements in neurodevelopmental disorders

Nature volume 555, pages 611–616 (29 March 2018)


We previously estimated that 42% of patients with severe developmental disorders carry pathogenic de novo mutations in coding sequences. The role of de novo mutations in regulatory elements affecting genes associated with developmental disorders, or other genes, has been essentially unexplored. We identified de novo mutations in three classes of putative regulatory elements in almost 8,000 patients with developmental disorders. Here we show that de novo mutations in highly evolutionarily conserved fetal brain-active elements are significantly and specifically enriched in neurodevelopmental disorders. We identified a significant twofold enrichment of recurrently mutated elements. We estimate that, genome-wide, 1–3% of patients without a diagnostic coding variant carry pathogenic de novo mutations in fetal brain-active regulatory elements and that only 0.15% of all possible mutations within highly conserved fetal brain-active elements cause neurodevelopmental disorders with a dominant mechanism. Our findings represent a robust estimate of the contribution of de novo mutations in regulatory elements to this genetically heterogeneous set of disorders, and emphasize the importance of combining functional and evolutionary evidence to identify regulatory causes of genetic disorders.
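The enrichment claims above rest on comparing an observed count of de novo mutations with the count expected under a null mutation model. The following is a minimal sketch of that kind of comparison, assuming a Poisson model for the number of mutations; the abstract does not name the test used, and the function names and the counts below are hypothetical and not taken from the study.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k = 0 .. observed - 1, then take the complement.
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k  # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def enrichment(observed: int, expected: float) -> tuple[float, float]:
    """Return (observed / expected ratio, one-sided Poisson p-value)."""
    return observed / expected, poisson_upper_tail(observed, expected)


# Hypothetical counts chosen only to illustrate a twofold excess.
ratio, p_value = enrichment(observed=30, expected=15.0)
print(f"enrichment ratio = {ratio:.2f}, one-sided p = {p_value:.2e}")
```

In practice the expected count would come from a sequence-context mutation rate model summed over the elements tested, which this sketch takes as a given input.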





We thank the families for their participation and patience; the DDD study clinicians, research nurses and clinical scientists in the recruiting centres for their hard work and perseverance on behalf of families; the Exome Aggregation Consortium and Genome Aggregation Database for making their data and code available; S. Gerety, G. Elgar, S. Aerts, and D. Svetlichnyy for discussions; H. Roest Crollius and L. Moyon for help with gene target prediction; J. Mudge and A. Frankish for help in annotating CNEs; and the Sanger HGI and DNA pipelines teams for their support in generating and processing the data. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund (grant HICF-1009-003), a parallel funding partnership between the Wellcome Trust and the UK Department of Health, and the Wellcome Trust Sanger Institute (grant WT098051). The views expressed in this publication are those of the author(s) and not necessarily those of the Wellcome Trust or the UK Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South Research Ethics Committee, and GEN/284/12, granted by the Republic of Ireland Research Ethics Committee). D.R.F. is funded through an MRC Human Genetics Unit program grant to the University of Edinburgh. D.H.G. is funded through 1U01 MH105666 and 1R01 MH110927 (psychENCODE consortium). A.S. is supported by the FWO (Postdoctoral Fellow number 12W7318N).

Author information


  1. Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK

    • Patrick J. Short
    • Jeremy F. McRae
    • Giuseppe Gallone
    • Alejandro Sifrim
    • Caroline F. Wright
    • Helen V. Firth
    • David R. FitzPatrick
    • Jeffrey C. Barrett
    • Matthew E. Hurles
  2. Department of Neurology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California 90095, USA

    • Hyejung Won
    • Daniel H. Geschwind
  3. Center for Autism Research and Treatment, Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California 90095, USA

    • Daniel H. Geschwind
  4. Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California 90095, USA

    • Daniel H. Geschwind
  5. Institute of Biomedical and Clinical Science, University of Exeter Medical School, RILD Level 4, Royal Devon & Exeter Hospital, Barrack Road, Exeter EX2 5DW, UK

    • Caroline F. Wright
  6. East Anglian Medical Genetics Service, Box 134, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK

    • Helen V. Firth
  7. MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK

    • David R. FitzPatrick



Study design: H.V.F., C.F.W., D.R.F., J.C.B. and M.E.H. Method development and data analysis: P.J.S., J.F.M., G.G., A.S., H.W., D.H.G., and M.E.H. Writing: P.J.S. and M.E.H. Experimental and analytical supervision: H.V.F., C.F.W., D.R.F., J.C.B. and M.E.H. Project Supervision: M.E.H.

Competing interests

M.E.H. is a co-founder of, consultant to, and holds shares in, Congenica Ltd, a genetics diagnostic company.

Corresponding author

Correspondence to Matthew E. Hurles.

Reviewer Information Nature thanks M. Daly and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

PDF files

  1. Life Sciences Reporting Summary

Excel files

  1. Supplementary Tables

    This file contains Supplementary Tables 1–3, comprising: (1) Median depth of coverage in 7,930 individuals for each targeted non-coding element and protein-coding exon. It includes the chromosome, start, end, and median coverage of each element. (2) Description of the recurrently mutated clusters of conserved non-coding elements. It includes a numerical ID for each cluster, the genomic coordinates of the elements in the cluster, the number of observed de novo mutations (DNMs), the genomic coordinates of the observed DNMs, and the p-value of the observation. (3) All of the individual elements identified as recurrently mutated. It includes the genomic coordinates of the element, annotation as ‘Conserved’ or ‘Enhancer’, the number of observed de novo mutations, the genomic location of the mutations observed, the p-value of the observation, the nearest gene, and any target genes identified by Hi-C in fetal brain.
