We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible on a cloud platform and through a controlled-access internet portal. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertions and deletions or copy number variations per ASD subject. We identified 18 new candidate ASD-risk genes and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (P = 6 × 10⁻⁴). In 294 of 2,620 (11.2%) ASD cases, a molecular basis could be determined, and 7.2% of these carried copy number variations and/or chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD.
We thank the families for their participation in the study, The Centre for Applied Genomics and Google for their analytical and technical support, and staff at Autism Speaks for organizational and fundraising support. This work was funded by Autism Speaks, Autism Speaks Canada, the Canadian Institute for Advanced Research, the University of Toronto McLaughlin Centre, Genome Canada/Ontario Genomics Institute, the Government of Ontario, the Canadian Institutes of Health Research (CIHR), NeuroDevNet, Ontario Brain Institute, the Catherine and Maxwell Meighen Foundation and The Hospital for Sick Children Foundation. Special thanks to B. and (the late) S. Wright for their vision in helping to conceptualize and develop this project and to foundational philanthropic supporters C. Dolan, G. Gund, B. Marcus, V. and J. Morgan and S. Wise. R.K.C.Y. is funded by the CIHR Postdoctoral Fellowship, NARSAD Young Investigator award and Thrasher Early Career Award. R.W. is funded by the Ontario Brain Institute and NeuroDevNet. M.U. is funded by the Banting Postdoctoral Fellowship. M.W. is funded by a CIHR (Institute of Genetics) Clinical Investigatorship Award. L.Z. is funded by the Stollery Children's Hospital Foundation Chair in Autism Research. P.S. is funded by the Patsy and Jamie Anderson Chair in Child and Youth Mental Health. B.M.K. is funded by the Canada Research Chair in Law and Medicine. S.W.S. is funded by the GlaxoSmithKline-CIHR Chair in Genome Sciences at the University of Toronto and The Hospital for Sick Children.
The authors declare no competing financial interests.
Integrated supplementary information
(a) Principal component analysis with HapMap samples using PLINK in two dimensions. (b) Principal component analysis with HapMap samples using PLINK in three dimensions.
Families exhibiting loss-of-function or de novo missense mutations in (a) MED3, (b) PHF3, (c) PER2 and (d) HECTD4 are shown. The de novo or inherited variant alleles are shown below each family member. A "+" indicates the allele containing the reference (presumably wild-type) sequence. Males are denoted by squares and females by circles. Symbols with dashed outlines indicate samples that were not whole-genome sequenced. NA indicates that no DNA sample was available for testing. Black symbols indicate individuals diagnosed with ASD. Arrows pointing to the symbols indicate ASD probands in each family.
(a) Variant search interface. (b) Example of variant search results. (c) Example of gene info search results. (d) Example of read depth search results.
The fraction of samples at each number of de novo mutations detected by Complete Genomics (CG) and Illumina (ILNM) is shown.
Supplementary Figures 1–4, Supplementary note (PDF 3396 kb)
Quality of WGS. (XLSX 914 kb)
Number of de novo SNVs and indels. (XLSX 72 kb)
All de novo variants detected. (XLSX 7305 kb)
All de novo LOF variants detected. (XLSX 36 kb)
All damaging variants in ASD-risk genes. (XLSX 25 kb)
Genes associated with syndromic or related disorders and their potential drug targets (XLSX 14 kb)
Summary of all samples included in MSSNG DB4. (XLSX 168 kb)
Pathogenic CNVs detected. (XLSX 27 kb)
About this article
Cite this article
Yuen, R.K.C., Merico, D., Bookman, M. et al. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder. Nat Neurosci 20, 602–611 (2017). https://doi.org/10.1038/nn.4524