An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder

Abstract

Genomic association studies of common or rare protein-coding variation have established robust statistical approaches to account for multiple testing. Here we present a comparable framework to evaluate rare and de novo noncoding single-nucleotide variants, insertion/deletions, and all classes of structural variation from whole-genome sequencing (WGS). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, we define 51,801 annotation categories. Analyses of 519 autism spectrum disorder families did not identify association with any categories after correction for 4,123 effective tests. Without appropriate correction, biologically plausible associations are observed in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still exhibited the strongest associations. Thus, in autism, the contribution of de novo noncoding variation is probably modest in comparison to that of de novo coding variants. Robust results from future WGS studies will require large cohorts and comprehensive analytical strategies that consider the substantial multiple-testing burden.
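The multiple-testing burden described in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: it applies a plain Bonferroni correction to the two counts given above (51,801 annotation categories and 4,123 effective tests). The family-wise error rate of 0.05 is an assumption, as are the function and variable names, and the abstract does not state that this exact procedure was used.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


ALPHA = 0.05            # assumed family-wise error rate (not stated in the abstract)
N_CATEGORIES = 51_801   # annotation categories defined in the study
N_EFFECTIVE = 4_123     # effective number of tests reported in the study

# Threshold if every category were treated as an independent test.
naive = bonferroni_threshold(ALPHA, N_CATEGORIES)

# Threshold using the effective number of tests, which is less stringent
# because many categories overlap and give correlated p-values.
effective = bonferroni_threshold(ALPHA, N_EFFECTIVE)

# Expected number of effective tests passing a nominal p < ALPHA when no
# category is truly associated (treating the effective tests as independent).
expected_nominal = ALPHA * N_EFFECTIVE

print(f"threshold over all categories: {naive:.2e}")
print(f"threshold over effective tests: {effective:.2e}")
print(f"expected nominal hits under the null: {expected_nominal:.0f}")
```

Under these assumptions, roughly two hundred effective tests would be expected to reach nominal significance by chance alone, which is consistent with the abstract's point that uncorrected analyses yield plausible-looking associations in both cases and controls.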

Fig. 1: Burden analyses for gene-defined annotation categories.
Fig. 2: Defining annotation categories.
Fig. 3: Category-wide association study.
Fig. 4: Structural variation in 519 ASD families.
Fig. 5: Effective number of tests in CWAS and power calculation.


Acknowledgements

We are grateful to the families participating in the Simons Foundation Autism Research Initiative (SFARI) Simplex Collection (SSC). This work was supported by grants from the Simons Foundation for Autism Research Initiative (SFARI 385110 to N.S., A.J.W., M.W.S., S.J.S.; 385027 to M.E.T., J.D.B., B.D., M.J.D., X.H., K.R.; 388196 to G.M., H.C., A.R.Q.; and 346042 to M.E.T.), the US National Institutes of Health (R37MH057881 and U01MH111658 to B.D. and K.R.; HD081256 and GM061354 to M.E.T.; U01MH105575 to M.W.S.; U01MH111662 to M.W.S. and S.J.S.; R01MH110928 and U01MH100239-03S1 to M.W.S., S.J.S., A.J.W.; U01MH111661 to J.D.B.; K99DE026824 to H.B.; U01MH100229 to M.J.D.), the Autism Science Foundation to D.M.W., and the March of Dimes to M.E.T. M.E.T. was also supported by the Desmond and Ann Heathwood MGH Research Scholars award. We thank the SSC principal investigators (A. L. Beaudet, R. Bernier, J. Constantino, E. H. Cook Jr, E. Fombonne, D. Geschwind, D. E. Grice, A. Klin, D. H. Ledbetter, C. Lord, C. L. Martin, D. M. Martin, R. Maxim, J. Miles, O. Ousley, B. Peterson, J. Piggot, C. Saulnier, M. W. State, W. Stone, J. S. Sutcliffe, C. A. Walsh, and E. Wijsman) and the coordinators and staff at the SSC clinical sites; the SFARI staff, in particular N. Volfovsky; D. B. Goldstein for contributing to the experimental design; the Rutgers University Cell and DNA repository for accessing biomaterials; and the New York Genome Center for generating the WGS data.

Author information

Contributions

Experimental design: D.M.W., H.B., J.-Y.A., M.R.S., J.T.G., M.J.W., X.H., N.S., B.M.N., H.C., A.J.W., J.D.B., M.J.D., M.W.S., A.R.Q., G.T.M., K.R., B.D., M.E.T., and S.J.S. Identification of de novo SNVs and indels: D.M.W., J.-Y.A., S.D., M.C.G., J.D.M., L.S., A.J.W., and S.J.S. Identification of structural variants: H.B., J.-Y.A., M.R.S., J.T.G., R.L.C., R.M.L., A.F., H.Z.W., X.Z., M.C.G., R.E.H., S.K., L.S., S.A.M., A.R.Q., G.T.M., and M.E.T. Confirmation of de novo variants: D.M.W., H.B., S.D., G.B.S., H.Z.W., B.B.C., J.D., C.D., C.A.E., R.Y., M.F.W., and M.J.W. Annotation of functional regions: D.M.W., J.-Y.A., S.D., E.M.-P., J.D.M., Y.L., S.P., J.L.R., N.S., M.E.T., and S.J.S. Generation of midfetal H3K27ac and ATAC–seq data: E.M.-P., T.J.N., A.R.K., and J.L.R. Development of genomic prediction score and de novo score: L.Z., L.K., K.R., and B.D. Analysis of SNVs and indels (Figs. 1–3): D.M.W., J.-Y.A., and S.J.S. Analysis of structural variants (Fig. 4): H.B., M.R.S., J.T.G., X.Z., and M.E.T. Assessment of P-value correlations, effective number of tests, and power analysis (Figs. 3 and 5): D.M.W., J.-Y.A., L.Z., G.B.S., K.R., B.D., and S.J.S. Manuscript preparation: D.M.W., H.B., J.-Y.A., M.R.S., L.Z., J.T.G., R.L.C., S.D., B.M.N., H.C., J.D.B., M.J.D., M.W.S., A.R.Q., G.T.M., K.R., B.D., M.E.T., and S.J.S.

Corresponding authors

Correspondence to Bernie Devlin or Michael E. Talkowski or Stephan J. Sanders.

Ethics declarations

Competing interests

J.L.R. is cofounder, stockholder, and currently on the scientific board of Neurona, a company studying the potential therapeutic use of interneuron transplantation. B.M.N. is an SAB member of Deep Genomics and serves as a consultant for Avanir Therapeutics. All other authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–15 and Supplementary Note

Reporting Summary

Supplementary Tables

Supplementary Tables 1–13

Supplementary Data

Visualization plots of de novo structural variants

About this article

Cite this article

Werling, D.M., Brand, H., An, J. et al. An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder. Nat Genet 50, 727–736 (2018). https://doi.org/10.1038/s41588-018-0107-y
