Genome-wide significance testing of variation from single case exomes

Abstract

Standard techniques from genetic epidemiology are ill-suited to formally assess the significance of variants identified from a single case. We developed a statistical inference framework for identifying unusual functional variation from a single exome or genome, which we refer to as the 'n-of-one' problem. Using this approach we assessed our ability to identify the causal genotypes in over 5 million simulated cases of Mendelian disease, identifying 39% of disease genotypes as the most damaging unit in a typical exome background. We applied our approach to 129 n-of-one families from the Undiagnosed Diseases Program, nominating 60% of 30 disease genes determined to be diagnostic by a standard clinical workup. Our method can currently produce well-calibrated P values when applied to single genomes, can facilitate integration of multiple data types for n-of-one analyses, and, with further work, could become a widely used epidemiological method like linkage analysis or genome-wide association analysis.
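
As a rough illustration of the general idea of scoring how unusual one individual's gene-level variation is against a reference population, the following minimal Python sketch computes an empirical tail probability per gene and ranks genes by it. It is not the authors' pipeline. The function names, the gene labels, the toy scores and the add-one correction are all assumptions made for this example.

```python
from bisect import bisect_left


def empirical_tail_probability(null_scores, observed):
    """Estimate P(score >= observed) from a reference sample.

    null_scores: hypothetical per-individual damage scores for one gene,
                 taken from a reference (unaffected) population.
    observed:    the score seen in the single case for that gene.
    An add-one correction keeps the estimate strictly above zero.
    """
    ordered = sorted(null_scores)
    # Number of reference individuals with a score at least as large.
    n_at_least = len(ordered) - bisect_left(ordered, observed)
    return (n_at_least + 1) / (len(ordered) + 1)


def rank_genes(null_by_gene, case_scores):
    """Return (gene, probability) pairs, most unusual gene first."""
    results = [
        (gene, empirical_tail_probability(null_by_gene[gene], score))
        for gene, score in case_scores.items()
    ]
    return sorted(results, key=lambda pair: pair[1])


# Toy data only: gene labels and scores are invented for illustration.
null_by_gene = {
    "GENE_A": [0, 5, 10, 20],
    "GENE_B": [0, 0, 0, 1],
}
case_scores = {"GENE_A": 10, "GENE_B": 5}
ranking = rank_genes(null_by_gene, case_scores)
```

In this toy case the gene whose observed score exceeds everything in its own reference sample is ranked first, even though its raw score is lower. This is the motivation for comparing each gene against a gene-specific background rather than using one global threshold.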

Figure 1: Approach to the n-of-one problem.
Figure 2: PSAP calibration.
Figure 3: Three primary sources of information contribute to the performance of PSAP values: the use of gene-specific models, modeling gene-specific singleton rates and integration of frequency information from ExAC.
Figure 4: Benchmarking our ability to identify the causal gene in simulated n-of-one cases.
Figure 5: Application of PSAP to real cases of the n-of-one problem.
Figure 6: PSAP facilitates integrative analysis of rare disease patients.

Acknowledgements

We thank D. Wilson and M. Stephens for helpful comments, N. Huang for useful discussions and for providing updated versions of some annotations used in this work, K. Vigh-Conrad for assistance in preparing the figures, D. MacArthur, M. Lek and the members of the ExAC Consortium for generous prepublication sharing of their data, M. Hoffmann and the WU Kidney Translational Research Core (KTRC) for patient enrolment, and the Genome Technology Access Center (GTAC) for exome sequencing of CAKUT patients. Our work was supported by US National Institutes of Health grant R01MH101810 (to D.F.C.) and March of Dimes Foundation grant #6-FY14-430 (to S.J.).

Author information

Contributions

D.F.C. designed the study. S.Z. provided helpful conceptual guidance on modeling and interpretation. D.F.C. wrote the simulation code. A.B.W. developed the PSAP pipeline, performed spike-in analyses, and evaluated the impact of population structure on PSAP values. D.F.C., A.B.W., D.R.A. and K.R.C. performed the UDP analyses. M.K. and S.J. contributed CAKUT samples and data. A.B.W. and D.F.C. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Donald F Conrad.

Ethics declarations

Competing interests

D.F.C. is funded by a research contract with PierianDx to develop novel methods for clinical exome analysis.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17, Supplementary Tables 1–7 and Supplementary Note. (PDF 2925 kb)

Cite this article

Wilfert, A., Chao, K., Kaushal, M. et al. Genome-wide significance testing of variation from single case exomes. Nat Genet 48, 1455–1461 (2016). https://doi.org/10.1038/ng.3697
