Genome-wide significance testing of variation from single case exomes

Wilfert, Amy B; Chao, Katherine R; Kaushal, Madhurima; Jain, Sanjay; Zöllner, Sebastian; Adams, David R; Conrad, Donald F

doi:10.1038/ng.3697

Analysis
Published: 24 October 2016

Amy B Wilfert¹,
Katherine R Chao²,
Madhurima Kaushal³,
Sanjay Jain³,⁴,
Sebastian Zöllner⁵,
David R Adams² (ORCID: orcid.org/0000-0002-6660-1242) &
…
Donald F Conrad¹,⁴ (ORCID: orcid.org/0000-0003-3828-8970)

Nature Genetics volume 48, pages 1455–1461 (2016)

Abstract

Standard techniques from genetic epidemiology are ill-suited to formally assess the significance of variants identified from a single case. We developed a statistical inference framework for identifying unusual functional variation from a single exome or genome, what we refer to as the 'n-of-one' problem. Using this approach we assessed our ability to identify the causal genotypes in over 5 million simulated cases of Mendelian disease, identifying 39% of disease genotypes as the most damaging unit in a typical exome background. We applied our approach to 129 n-of-one families from the Undiagnosed Diseases Program, nominating 60% of 30 disease genes determined to be diagnostic by a standard clinical workup. Our method can currently produce well-calibrated P values when applied to single genomes, can facilitate integration of multiple data types for n-of-one analyses, and, with further work, could become a widely used epidemiological method like linkage analysis or genome-wide association analysis.
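The benchmark summarized above (how often the disease genotype is the most damaging unit in an exome) can be illustrated with a minimal sketch. This is not the authors' PSAP code: the function names, the tie-breaking rule, and the toy P values are all assumptions made for illustration, and a real analysis would take per-gene P values from the method itself.

```python
# Hypothetical sketch: how often does a spiked-in causal gene rank first?
# All names and numbers here are illustrative assumptions, not from the paper.

def causal_rank(background_pvalues, causal_pvalue):
    """Rank of the causal gene within one exome (1 = smallest P value).

    Assumption: ties are resolved in favor of the causal gene.
    """
    return 1 + sum(p < causal_pvalue for p in background_pvalues)


def top_rank_fraction(cases):
    """Fraction of cases in which the causal gene has the smallest P value.

    `cases` is a list of (background_pvalues, causal_pvalue) pairs.
    """
    hits = sum(causal_rank(bg, causal) == 1 for bg, causal in cases)
    return hits / len(cases)


# Toy cases with made-up P values: (background genes, causal gene).
toy_cases = [
    ([0.40, 0.07, 0.90], 0.001),  # causal gene ranks first
    ([0.0005, 0.30, 0.60], 0.002),  # one background gene is more extreme
    ([0.20, 0.80, 0.05], 0.01),  # causal gene ranks first
    ([0.50, 0.03, 0.70], 0.04),  # one background gene is more extreme
]
fraction = top_rank_fraction(toy_cases)
```

With these toy inputs the causal gene ranks first in two of the four cases. The same summary statistic, computed over simulated cases, is the kind of quantity the 39% figure describes.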

**Figure 1: Approach to the n-of-one problem.**

**Figure 3: Three primary sources of information contribute to the performance of PSAP values: the use of gene-specific models, modeling gene-specific singleton rates and integration of frequency information from ExAC.**

**Figure 4: Benchmarking our ability to identify the causal gene in simulated n-of-one cases.**

**Figure 5: Application of PSAP to real cases of the n-of-1 problem.**

**Figure 6: PSAP facilitates integrative analysis of rare disease patients.**

Acknowledgements

We thank D. Wilson and M. Stephens for helpful comments, N. Huang for useful discussions and for providing updated versions of some annotations used in this work, K. Vigh-Conrad for assistance in preparing the figures, D. MacArthur, M. Lek and the members of the ExAC Consortium for generous prepublication sharing of their data, M. Hoffmann and the WU Kidney Translational Research Core (KTRC) for patient enrolment, and the Genome Technology Access Center (GTAC) for exome sequencing of CAKUT patients. Our work was supported by US National Institutes of Health grant R01MH101810 (to D.F.C.) and March of Dimes Foundation grant #6-FY14-430 (to S.J.).

Author information

Authors and Affiliations

Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, USA
Amy B Wilfert & Donald F Conrad
National Institutes of Health Undiagnosed Diseases Program, US National Institutes of Health, Bethesda, Maryland, USA
Katherine R Chao & David R Adams
Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, USA
Madhurima Kaushal & Sanjay Jain
Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri, USA
Sanjay Jain & Donald F Conrad
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
Sebastian Zöllner

Contributions

D.F.C. designed the study. S.Z. provided helpful conceptual guidance on modeling and interpretation. D.F.C. wrote the simulation code. A.B.W. developed the PSAP pipeline, performed spike-in analyses, and evaluated the impact of population structure on PSAP values. D.F.C., A.B.W., D.R.A. and K.R.C. performed the UDP analyses. M.K. and S.J. contributed CAKUT samples and data. A.B.W. and D.F.C. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Donald F Conrad.

Ethics declarations

Competing interests

D.F.C. is funded by a research contract with PierianDx to develop novel methods for clinical exome analysis.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17, Supplementary Tables 1–7 and Supplementary Note. (PDF 2925 kb)

About this article

Cite this article

Wilfert, A., Chao, K., Kaushal, M. et al. Genome-wide significance testing of variation from single case exomes. Nat Genet 48, 1455–1461 (2016). https://doi.org/10.1038/ng.3697

Received: 15 June 2016
Accepted: 19 September 2016
Published: 24 October 2016
Issue Date: December 2016
DOI: https://doi.org/10.1038/ng.3697
