De novo mutations in regulatory elements in neurodevelopmental disorders

Abstract

We previously estimated that 42% of patients with severe developmental disorders carry pathogenic de novo mutations in coding sequences. The role of de novo mutations in regulatory elements affecting genes associated with developmental disorders, or other genes, has been essentially unexplored. We identified de novo mutations in three classes of putative regulatory elements in almost 8,000 patients with developmental disorders. Here we show that de novo mutations in highly evolutionarily conserved fetal brain-active elements are significantly and specifically enriched in neurodevelopmental disorders. We identified a significant twofold enrichment of recurrently mutated elements. We estimate that, genome-wide, 1–3% of patients without a diagnostic coding variant carry pathogenic de novo mutations in fetal brain-active regulatory elements and that only 0.15% of all possible mutations within highly conserved fetal brain-active elements cause neurodevelopmental disorders with a dominant mechanism. Our findings represent a robust estimate of the contribution of de novo mutations in regulatory elements to this genetically heterogeneous set of disorders, and emphasize the importance of combining functional and evolutionary evidence to identify regulatory causes of genetic disorders.

Figure 1: Selective constraint in targeted non-coding elements.
Figure 2: Enrichment of DNMs across element classes and functional annotations in exome-negative probands.
Figure 3: Recurrently mutated elements.
Figure 4: Modelling the proportion of DNMs in non-coding elements that are likely to be highly penetrant for dominant neurodevelopmental disorders.


Acknowledgements

We thank the families for their participation and patience; the DDD study clinicians, research nurses and clinical scientists in the recruiting centres for their hard work and perseverance on behalf of families; the Exome Aggregation Consortium and Genome Aggregation Database (http://gnomad.broadinstitute.org/) for making their data and code available; S. Gerety, G. Elgar, S. Aerts, and D. Svetlichnyy for discussions; H. Roest Crollius and L. Moyon for help with gene target prediction; J. Mudge and A. Frankish for help in annotating CNEs; and the Sanger HGI and DNA pipelines teams for their support in generating and processing the data. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund (grant HICF-1009-003), a parallel funding partnership between the Wellcome Trust and the UK Department of Health, and the Wellcome Trust Sanger Institute (grant WT098051). The views expressed in this publication are those of the author(s) and not necessarily those of the Wellcome Trust or the UK Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South Research Ethics Committee, and GEN/284/12, granted by the Republic of Ireland Research Ethics Committee). D.R.F. is funded through an MRC Human Genetics Unit program grant to the University of Edinburgh. D.H.G. is funded through 1U01 MH105666 and 1R01 MH110927 (psychENCODE consortium). A.S. is supported by the FWO (Postdoctoral Fellow number 12W7318N).

Author information

Contributions

Study design: H.V.F., C.F.W., D.R.F., J.C.B. and M.E.H. Method development and data analysis: P.J.S., J.F.M., G.G., A.S., H.W., D.H.G., and M.E.H. Writing: P.J.S. and M.E.H. Experimental and analytical supervision: H.V.F., C.F.W., D.R.F., J.C.B. and M.E.H. Project Supervision: M.E.H.

Corresponding author

Correspondence to Matthew E. Hurles.

Ethics declarations

Competing interests

M.E.H. is a co-founder of, consultant to, and holds shares in, Congenica Ltd, a genetics diagnostic company.

Additional information

Reviewer information: Nature thanks M. Daly and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Figure 1 Coverage in targeted non-coding elements.

Coverage in the targeted non-coding elements is comparable to the protein-coding exons (median 73× and 56×, respectively).

Extended Data Figure 2 Assessment of variant deleteriousness metrics and selective pressure in CNEs.

Dots and bars represent the point estimate and 95% CI, respectively, for MAPS and proportion singletons. a, b, Fathmm-MKL (a) and Genomiser (b) separate benign variation (low MAPS score) from likely damaging variation (high MAPS score), but do not identify any classes of variation under strong selective constraint. c, There was no significant difference in the strength of purifying selection measured by MAPS between sites predicted to result in loss, gain, or no change in transcription factor binding. d, Validation of Fig. 1c using whole-genome data from the UK10K project. While CADD can identify coding variation under strong selective constraint (as measured by the proportion of singletons), CADD is unable to identify strongly constrained non-coding variants. e, f, The subset of CNEs sequenced in the DDD cohort that are predicted to be inactive in all 111 Roadmap Tissues (n = 261) exhibit a similar degree of evolutionary conservation (e) but lower selective constraint (f) in a healthy population compared to CNEs active in at least one tissue (n = 4,046).

Extended Data Figure 3 Genomic factors that affect mutation rate in non-coding elements.

a, Aggregating CpG sites genome-wide into bins of methylation proportion from 0% (unmethylated in all cells) to 100% (methylated in all cells) and calculating the observed/expected ratio reveals differences in mutability not accounted for by a triplet model alone. b, A mutation rate model incorporating a correction for CpG methylation explains greater variance in rare variant counts in the DDD unaffected parents. c, Levels of rare variation in deep whole genomes (n = 7,509 non-Finnish Europeans) were used to estimate power to detect a hypermutability of 1.1×, 1.2×, or 1.3×. d, The level of rare variation in the fetal brain-active elements (n = 2,613, FB(+)) is slightly lower than in the fetal brain-inactive elements (n = 1,694, FB(-)), consistent with similar mutability between the two element sets with slightly stronger purifying selection in the fetal brain-active elements. e, f, Elements with DNMs observed in our study are not enriched in late-replicating regions (e) or in regions with higher recombination rate (f), which have been shown to be hypermutable.

Extended Data Figure 4 Non-coding mutations in exome-positive probands and poorly evolutionarily conserved sites make a minimal contribution to severe developmental disorders.

a, In the 1,691 ‘exome-positive’ probands, there is no evidence for a burden of DNMs in any of the non-coding element classes tested. Red diamonds indicate the observed counts, while black circles and bars indicate the expected count and 95% CI, respectively. b, DNMs in exome-negative probands show a greater degree of evolutionary conservation (measured by PhyloP score) than DNMs in exome-positive probands in two classes: fetal brain-active CNEs (median 1.57 exome-positive, 2.85 exome-negative, n = 368 mutations) and missense changes (median 3.43 exome-positive, 3.98 exome-negative, n = 6,244 mutations).

Extended Data Figure 5 Hypothesis test enumeration and enrichment for mutations in highly conserved fetal brain-active enhancers.

a, We corrected for thirteen tests in order to account for the nested hypotheses based on element class and phenotype in this analysis. b, Evolutionarily conserved fetal brain-active enhancers (n = 106) are enriched for DNMs in exome-negative probands.
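As a rough illustration of the multiple-testing correction described in panel a, the sketch below applies a Bonferroni adjustment across thirteen tests. The caption states only that thirteen tests were corrected for; the choice of Bonferroni, the 0.05 family-wise threshold, and the function names are assumptions made for illustration.

```python
# Hypothetical sketch: Bonferroni correction for a fixed number of tests.
# The Bonferroni method and alpha = 0.05 are assumptions, not taken from the caption.

N_TESTS = 13  # number of nested hypotheses stated in the caption


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = 0.05, n_tests: int = N_TESTS) -> bool:
    """True if a single test passes the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)
```

Under these assumptions, a test would need a P value below roughly 0.0038 to be called significant.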

Extended Data Figure 6 Gene target prediction for targeted non-coding elements.

Pairwise concordance between four different gene target prediction methods is low. Using predicted targets from fetal brain Hi-C data, elements with an observed DNM in exome-negative probands (n = 286) do not show any bias towards any of the gene sets consistently implicated in neurodevelopmental disorders. Dots and bars represent the point estimate and 95% confidence interval, respectively.

Extended Data Figure 7 Transcription factor binding disruption and transmission disequilibrium test.

a–d, Comparison of predicted change in transcription factor binding for observed DNMs compared to the null mutation model. Empirical P values were derived from comparison with mutations simulated from the null mutation model. e, None of the non-coding element classes tested show any evidence of overtransmission from parents to affected children. Dots and bars represent the point estimate and 95% confidence intervals of estimates of transmission proportions, respectively.
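The empirical P values mentioned in this caption can be sketched as the fraction of simulated statistics that are at least as extreme as the observed one. The one-sided direction, the add-one correction, and the function name below are assumptions following a common convention; the caption does not specify these details.

```python
from typing import Sequence


def empirical_p_value(observed: float, simulated: Sequence[float]) -> float:
    """One-sided empirical P value from null simulations.

    Counts simulated statistics >= observed and adds one to the numerator
    and denominator so the estimate is never exactly zero (assumed convention).
    """
    n_extreme = sum(1 for s in simulated if s >= observed)
    return (1 + n_extreme) / (1 + len(simulated))
```

Here `observed` would be a summary statistic computed on the observed DNMs, and `simulated` the same statistic computed on each set of mutations drawn from the null mutation model.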

Extended Data Figure 8 Predicted chromatin state for recurrently mutated elements.

chromHMM state of the n = 31 recurrently mutated elements shows enrichment for enhancers and transcribed elements. Elements that overlapped a high confidence DHS but were predicted as quiescent by chromHMM are classed as Overlaps DHS. P values derived from Poisson distribution with parameter lambda defined by the simulated data.
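One way to picture a Poisson-derived P value of the kind described here is sketched below: lambda is taken as the mean count across simulations, and the P value is the upper-tail probability of seeing at least the observed count. The use of the simulation mean for lambda, the upper-tail test, and the function names are assumptions; the caption says only that lambda was defined by the simulated data.

```python
import math
from typing import Sequence


def lambda_from_simulations(simulated_counts: Sequence[int]) -> float:
    """Estimate the Poisson rate as the mean count over simulations (assumed)."""
    return sum(simulated_counts) / len(simulated_counts)


def poisson_upper_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed from the cumulative sum."""
    if k <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-lam)  # P(X = 0)
    for i in range(k):
        cdf += term            # add P(X = i)
        term *= lam / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - cdf)
```

For a given chromHMM state, `k` would be the number of recurrently mutated elements observed in that state and `lam` the value returned by `lambda_from_simulations` for the same state.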

Extended Data Figure 9 Schematic describing each of the thirty-one recurrently mutated elements.

Element is in black, red lollipops denote observed DNMs, grey lollipops denote observed variation at MAF >0.1% in 7,080 unaffected parents, phastcons100 conservation score is shown in blue, and DHSs from the Roadmap Epigenome project are shown in blue/pink in the bottom track.

Extended Data Figure 10 Empirical and simulated power for disease association in targeted non-coding elements.

a, Estimation of the reduction in power due to size differences between non-coding elements and genes (median 600 bp versus 1,800 bp) and ignoring VEP annotations used to stratify benign from likely damaging variants. Dots and bars represent the point estimate and 95% confidence interval, respectively. b, Credible intervals for the proportion of fetal brain-active conserved elements and proportion of sites within those elements with a dominant mechanism for developmental disorders. c, Power calculations for disease-associated non-coding element discovery. Without annotation or tools to discriminate pathogenic from benign variants in non-coding elements (grey), more than 100,000 trios are required to achieve 40% power. With annotation or tools to fully discriminate likely pathogenic from benign variants (blue), 40% power is achieved with only 21,000 trios.

Supplementary information

Life Sciences Reporting Summary (PDF 98 kb)

Supplementary Tables

This file contains Supplementary Tables 1–3 comprising: (1) Median depth of coverage in 7,930 individuals for each targeted non-coding element and protein-coding exon. It includes chromosome, start, end, and median coverage of each element. (2) Description of the recurrently mutated clusters of conserved non-coding elements. It includes a numerical id for each cluster, the genomic coordinates of the elements in the cluster, the number of observed de novo mutations, the genomic coordinates of the observed DNMs, and the P value of the observation. (3) All of the individual elements identified as recurrently mutated. It includes the genomic coordinates of the element, annotation as ‘Conserved’ or ‘Enhancer’, the number of observed de novo mutations, the genomic location of the mutations observed, the P value of the observation, the nearest gene, and any target genes identified by Hi-C in fetal brain. (XLSX 6309 kb)

About this article

Cite this article

Short, P., McRae, J., Gallone, G. et al. De novo mutations in regulatory elements in neurodevelopmental disorders. Nature 555, 611–616 (2018). https://doi.org/10.1038/nature25983
