Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations


It is well established that autism spectrum disorders (ASD) have a strong genetic component; however, for at least 70% of cases, the underlying genetic cause is unknown (ref. 1). Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes (so-called sporadic or simplex families; refs 2, 3), we sequenced all coding regions of the genome (the exome) for parent–child trios exhibiting sporadic ASD, including 189 new trios and 20 that were previously reported (ref. 4). We also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported (n = 19; ref. 4) trios, for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modestly increased risk for children of older fathers to develop ASD (ref. 5). Moreover, 39% (49 of 126) of the most severe or disruptive de novo mutations map to a highly interconnected β-catenin/chromatin remodelling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes: CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3 and SCN1A. Combined with copy number variant (CNV) data, these results indicate extreme locus heterogeneity but also provide a target for future discovery, diagnostics and therapeutics.
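The cohort counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and it assumes that each of the 209 families contributes exactly one complete trio (father, mother and proband).

```python
# Cohort sizes as stated in the abstract.
new_trios = 189
previously_reported_trios = 20
unaffected_siblings = 31 + 19  # siblings of new and previously reported trios

families = new_trios + previously_reported_trios
# Assumption: every family contributes three exomes (father, mother, proband).
total_exomes = 3 * families + unaffected_siblings

# Share of severe de novo mutations that map to the connected network.
severe_mutations = 126
in_network = 49
network_fraction = in_network / severe_mutations

print(families, total_exomes, f"{network_fraction:.1%}")
```

Under that assumption the totals agree with the 677 exomes and 209 families reported, and the network share rounds to the quoted 39%.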

At a glance


  1. Figure 1: De novo mutation events in autism spectrum disorder.

    a, Haplotype phasing using informative markers shows a strong parent-of-origin bias, with 41 of 51 de novo events occurring on the paternally inherited haplotype. Arrows represent sequence reads from paternal (blue) or maternal (red) haplotypes. b, c, Box and whisker plots for 189 SSC probands. b, The estimated paternal age at conception versus the number of observed de novo point mutations (0, n = 53; 1, n = 65; 2, n = 44; 3+, n = 27). c, Decreased non-verbal IQ is significantly associated with an increasing number of extreme mutation events (0, n = 138; 1, n = 41; 2+, n = 10), both with and without CNVs (Supplementary Discussion). d, Browser images showing CNVs identified in the del(18)(q12.2q21.1) syndrome region. The truncating point mutation in SETBP1 occurs within the critical region, identifying the likely causative locus. Each red (deletion) and green (duplication) line represents a CNV identified in cases (solid lines) or controls (dashed lines), with arrowheads marking the point mutation.
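    To give a sense of how strong the parent-of-origin bias in panel a is, the counts can be compared against an even paternal/maternal split. The following is a minimal sketch, not the authors' analysis: the choice of an exact two-sided binomial test with a null probability of 0.5 is our assumption, as is the helper name `binomial_two_sided_p`.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed one under the null success probability p.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # A small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


paternal, phased = 41, 51        # counts from Fig. 1a
maternal = phased - paternal     # remaining phased events
ratio = paternal / maternal      # paternal-to-maternal ratio
p_value = binomial_two_sided_p(paternal, phased)  # null: no parental bias

print(f"paternal:maternal = {ratio:.1f}:1, p = {p_value:.2e}")
```

    The ratio of 41 to 10 corresponds to the roughly 4:1 bias quoted in the abstract.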

  2. Figure 2: Mutations identified in protein–protein interaction (PPI) networks.

    a, The 49-gene connected component of the PPI network formed from 126 genes with severe de novo mutations among the 209 probands. b, Proband 13844 inherits three rare gene-disruptive CNVs and carries two de novo truncating mutations. c, GeneMANIA (ref. 22) view of three of the affected genes from b (red labels), which encode proteins that are part of a β-catenin-linked network. This proband is macrocephalic, impaired cognitively, and has deficits in social behaviour and language development (Supplementary Discussion).
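    The idea of a connected component in panel a can be illustrated with a small graph search. This is a sketch under stated assumptions, not the authors' pipeline: the function name, the placeholder gene labels and the toy edges are all hypothetical, and no real interaction database is used.

```python
from collections import deque


def largest_connected_component(genes, interactions):
    """Return the largest set of genes mutually reachable via interactions."""
    adjacency = {g: set() for g in genes}
    for a, b in interactions:
        # Keep only edges whose endpoints are both in the mutated-gene list.
        if a in adjacency and b in adjacency and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, best = set(), set()
    for start in genes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy input: placeholder labels and edges, not data from the study.
toy_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G4", "G5"), ("G3", "GX")]
component = largest_connected_component(toy_genes, toy_edges)
print(sorted(component), f"{len(component) / len(toy_genes):.0%} of genes")
```

    In the study, the analogous quantity is the 49 of 126 severely mutated genes that fall into a single component.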


References

  1. Schaaf, C. P. & Zoghbi, H. Y. Solving the autism puzzle a few pieces at a time. Neuron 70, 806–808 (2011)
  2. Sanders, S. J. et al. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 70, 863–885 (2011)
  3. Levy, D. et al. Rare de novo and transmitted copy-number variation in autistic spectrum disorders. Neuron 70, 886–897 (2011)
  4. O’Roak, B. J. et al. Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nature Genet. 43, 585–589 (2011)
  5. Hultman, C. M., Sandin, S., Levine, S. Z., Lichtenstein, P. & Reichenberg, A. Advancing paternal age and risk of autism: new evidence from a population-based study and a meta-analysis of epidemiological studies. Mol. Psychiatry 16, 1203–1212 (2010)
  6. Fischbach, G. D. & Lord, C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. Neuron 68, 192–195 (2010)
  7. Lynch, M. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl Acad. Sci. USA 107, 961–968 (2010)
  8. Xu, B. et al. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nature Genet. 43, 864–868 (2011)
  9. Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature http://dx.doi.org/10.1038/nature10945 (this issue)
  10. Hehir-Kwa, J. Y. et al. De novo copy number variants associated with intellectual disability have a paternal origin and age bias. J. Med. Genet. 48, 776–778 (2011)
  11. O’Roak, B. J. & State, M. W. Autism genetics: strategies, challenges, and opportunities. Autism Res. 1, 4–17 (2008)
  12. Nishimura-Akiyoshi, S., Niimi, K., Nakashiba, T. & Itohara, S. Axonal netrin-Gs transneuronally determine lamina-specific subdendritic segments. Proc. Natl Acad. Sci. USA 104, 14801–14806 (2007)
  13. Borg, I. et al. Disruption of Netrin G1 by a balanced chromosome translocation in a girl with Rett syndrome. Eur. J. Hum. Genet. 13, 921–927 (2005)
  14. Nishiyama, M. et al. CHD8 suppresses p53-mediated apoptosis through histone H1 recruitment during early embryogenesis. Nature Cell Biol. 11, 172–182 (2009)
  15. Thompson, B. A., Tremblay, V., Lin, G. & Bochar, D. A. CHD8 is an ATP-dependent chromatin remodeling factor that regulates β-catenin target genes. Mol. Cell. Biol. 28, 3894–3904 (2008)
  16. Batsukh, T. et al. CHD8 interacts with CHD7, a protein which is mutated in CHARGE syndrome. Hum. Mol. Genet. 19, 2858–2866 (2010)
  17. Betancur, C. Etiological heterogeneity in autism spectrum disorders: more than 100 genetic and genomic disorders and still counting. Brain Res. 1380, 42–77 (2011)
  18. Moller, R. S. et al. Truncation of the Down syndrome candidate gene DYRK1A in two unrelated patients with microcephaly. Am. J. Hum. Genet. 82, 1165–1170 (2008)
  19. Cooper, G. M. et al. A copy number variation morbidity map of developmental delay. Nature Genet. 43, 838–846 (2011)
  20. Buysse, K. et al. Delineation of a critical region on chromosome 18 for the del(18)(q12.2q21.1) syndrome. Am. J. Med. Genet. A. 146A, 1330–1334 (2008)
  21. Hoischen, A. et al. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nature Genet. 42, 483–485 (2010)
  22. Warde-Farley, D. et al. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38, W214–W220 (2010)
  23. Erten, S., Bebek, G., Ewing, R. & Koyutürk, M. DADA: Degree-aware algorithms for network-based disease gene prioritization. BioData Mining 4, 19 (2011)
  24. De Ferrari, G. V. & Moon, R. T. The ups and downs of Wnt signaling in prevalent neurological disorders. Oncogene 25, 7545–7553 (2006)
  25. Bedogni, F. et al. Tbr1 regulates regional and laminar identity of postmitotic neurons in developing neocortex. Proc. Natl Acad. Sci. USA 107, 13129–13134 (2010)
  26. Turner, E. H., Lee, C., Ng, S. B., Nickerson, D. A. & Shendure, J. Massively parallel exon capture and library-free resequencing across 16 genomes. Nature Methods 6, 315–316 (2009)
  27. Voineagu, I. et al. Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature 474, 380–384 (2011)
  28. Sakai, Y. et al. Protein interactome reveals converging molecular pathways among autism disorders. Sci. Transl. Med. 3, 86ra49 (2011)
  29. Gilman, S. R. et al. Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron 70, 898–907 (2011)
  30. Ille, F. & Sommer, L. Wnt signaling: multiple functions in neural development. Cell. Mol. Life Sci. 62, 1100–1108 (2005)
  31. Tedeschi, A. & Di Giovanni, S. The non-apoptotic role of p53 in neuronal biology: enlightening the dark side of the moon. EMBO Rep. 10, 576–583 (2009)
  32. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
  33. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genet. 43, (2011)
  34. Hach, F. et al. mrsFAST: a cache-oblivious algorithm for short-read mapping. Nature Methods 7, 576–577 (2010)
  35. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008)
  36. Moldin, S. O. NIMH Human Genetics Initiative: 2003 update. Am. J. Psychiatry 160, 621–622 (2003)
  37. Kessler, R. C. & Ustun, T. B. The World Mental Health (WMH) survey initiative version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). Int. J. Methods Psychiatr. Res. 13, 93–121 (2004)
  38. Biesecker, L. G. et al. The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine. Genome Res. 19, 1665–1674 (2009)
  39. Talati, A., Fyer, A. J. & Weissman, M. M. A comparison between screened NIMH and clinically interviewed control samples on neuroticism and extraversion. Mol. Psychiatry 13, 122–130 (2008)
  40. Baum, A. E. et al. A genome-wide association study implicates diacylglycerol kinase eta (DGKH) and several other genes in the etiology of bipolar disorder. Mol. Psychiatry 13, 197–207 (2008)
  41. Itsara, A. et al. Population analysis of large copy number variants and hotspots of human genetic disease. Am. J. Hum. Genet. 84, 148–161 (2009)
  42. Craddock, N. et al. Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls. Nature 464, 713–720 (2010)
  43. Smoot, M. E., Ono, K., Ruscheinski, J., Wang, P. L. & Ideker, T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27, 431–432 (2011)
  44. Assenov, Y., Ramirez, F., Schelhorn, S. E., Lengauer, T. & Albrecht, M. Computing topological parameters of biological networks. Bioinformatics 24, 282–284 (2008)
  45. Mamanova, L. et al. Target-enrichment strategies for next-generation sequencing. Nature Methods 7, 111–118 (2010)
  46. Bunge, J. & Fitzpatrick, M. Estimating the number of species: a review. J. Am. Stat. Assoc. 88, 364–373 (1993)
  47. Chao, A. & Lee, S. M. Estimating the number of classes via sample coverage. J. Am. Stat. Assoc. 87, 210–217 (1992)


Author information


  1. Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA

    • Brian J. O’Roak,
    • Laura Vives,
    • Santhosh Girirajan,
    • Emre Karakoc,
    • Niklas Krumm,
    • Bradley P. Coe,
    • Roie Levy,
    • Arthur Ko,
    • Choli Lee,
    • Joshua D. Smith,
    • Emily H. Turner,
    • Ian B. Stanaway,
    • Benjamin Vernot,
    • Maika Malig,
    • Carl Baker,
    • Joshua M. Akey,
    • Elhanan Borenstein,
    • Mark J. Rieder,
    • Deborah A. Nickerson,
    • Jay Shendure &
    • Evan E. Eichler
  2. Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, Washington 98195, USA

    • Beau Reilly &
    • Raphael Bernier
  3. Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA

    • Elhanan Borenstein
  4. Santa Fe Institute, Santa Fe, New Mexico 87501, USA

    • Elhanan Borenstein
  5. Howard Hughes Medical Institute, Seattle, Washington 98195, USA

    • Evan E. Eichler


E.E.E., J.S. and B.J.O. designed the study and drafted the manuscript. E.E.E. and J.S. supervised the study. R.B., B.R. and B.J.O. analysed the clinical information. R.B., L.V., S.G., E.K., N.K. and B.P.C. contributed to the manuscript. S.G., N.K., B.P.C., A.K., C.B., M.M. and L.V. generated and analysed CNV data. B.J.O. and L.V. performed molecular inversion probe (MIP) resequencing and mutation validations. I.B.S., E.H.T., B.J.O. and J.S. developed the MIP protocol and analysis. B.V. and J.M.A. generated locus-specific mutation rate estimates. R.L. and E.B. performed PPI network analysis and simulations. E.K. performed DADA analysis. C.L. performed Illumina sequencing. J.D.S., I.B.S., E.H.T. and C.L. analysed sequence data. B.P.C. performed Ingenuity Pathway Analysis (IPA). B.J.O., E.K. and N.K. developed the de novo analysis pipelines and analysed sequence data. D.A.N., M.J.R., J.D.S. and E.H.T. supervised exome sequencing and primary analysis.

Competing financial interests

E.E.E. is on the scientific advisory boards for Pacific Biosciences, Inc and SynapDx Corp. J.S. is a member of the scientific advisory board or serves as a consultant for Aria Diagnostics, Stratos Genomics, Good Start Genetics, and Adaptive TCR. B.J.O. is an inventor on patent PCT/US2009/30620: mutations in contactin associated protein 2 are associated with increased risk for idiopathic autism.


The raw sequence reads are available from the NCBI database of Genotypes and Phenotypes (dbGaP) and the National Database for Autism Research under accession numbers phs000482.v1.p1 and NDARCOL0001878, respectively.


Supplementary information

PDF files

  1. Supplementary Information (2.1M)

    This file contains Supplementary Discussion; Supplementary Figures 1–13; Supplementary Tables 2, 4, 6–13; and Supplementary References.

Excel files

  1. Supplementary Tables (203K)

    This file contains Supplementary Tables 1, 3 and 5, which give detailed information on exome capture, sequence coverage, paternal age, de novo mutation sites, and functional annotations.
