Abstract

High-throughput sequencing of related individuals has become an important tool for studying human disease. However, owing to technical complexity and lack of available tools, most pedigree-based sequencing studies rely on an ad hoc combination of suboptimal analyses. Here we present pedigree-VAAST (pVAAST), a disease-gene identification tool designed for high-throughput sequence data in pedigrees. pVAAST uses a sequence-based model to perform variant and gene-based linkage analysis. Linkage information is then combined with functional prediction and rare variant case-control association information in a unified statistical framework. pVAAST outperformed linkage and rare-variant association tests in simulations and identified disease-causing genes from whole-genome sequence data in three human pedigrees with dominant, recessive and de novo inheritance patterns. The approach is robust to incomplete penetrance and locus heterogeneity and is applicable to a wide variety of genetic traits. pVAAST maintains high power across studies ranging from monogenic, high-penetrance phenotypes in a single pedigree to highly polygenic, common phenotypes involving hundreds of pedigrees.

Acknowledgements

An allocation of computer time on the University of Texas MD Anderson Research Computing High Performance Computing (HPC) facility is gratefully acknowledged. This work was supported by US National Institutes of Health grants R01 GM104390 (M.Y., L.B.J., C.D.H. and H.H.), R01 DK091374 (S.L.G., C.D.H. and L.B.J.), R01 CA164138 (S.V.T. and C.D.H.), R44 HG006579 (M.G.R. and M.Y.) and R01 GM59290 (L.B.J.) as well as the University of Luxembourg—Institute for Systems Biology Program. D.S. was supported by grants from the NHLBI (U01 HL100406 and U01 HL098179) related to this project. H.C. was supported by NIH grants R01 MH094400 and R01 MH099134. H.H. was supported by the MD Anderson Cancer Center Odyssey Program. J.X. was supported by NIH grant R00 HG005846.

Author information

Affiliations

  1. Department of Epidemiology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA.

    • Hao Hu
    • Shankaracharya
    • Paul Scheet
    • Chad D Huff
  2. Institute for Systems Biology, Seattle, Washington, USA.

    • Jared C Roach
    • Gustavo Glusman
    • Robert Hubley
    • Hong Li
    • Leroy Hood
  3. Department of Psychiatry, University of Utah, Salt Lake City, Utah, USA.

    • Hilary Coon
  4. Department of Pediatrics, University of Utah, Salt Lake City, Utah, USA.

    • Stephen L Guthery
  5. Department of Pathology, University of Utah School of Medicine, Salt Lake City, Utah, USA.

    • Karl V Voelkerding
  6. ARUP Institute for Clinical and Experimental Pathology, ARUP Laboratories, Salt Lake City, Utah, USA.

    • Karl V Voelkerding
    • Rebecca L Margraf
    • Jacob D Durtschi
  7. Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah, USA.

    • Sean V Tavtigian
  8. Department of Human Genetics and USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, USA.

    • Wilfred Wu
    • Barry Moore
    • Lynn B Jorde
    • Mark Yandell
  9. Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA.

    • Shuoguo Wang
    • Jinchuan Xing
  10. Department of Pediatrics, The Ohio State University, Columbus, Ohio, USA.

    • Vidu Garg
  11. Center for Cardiovascular and Pulmonary Research, Research Institute at Nationwide Children's Hospital, Columbus, Ohio, USA.

    • Vidu Garg
  12. Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg.

    • David J Galas
  13. Pacific Northwest Diabetes Research Institute, Seattle, Washington, USA.

    • David J Galas
  14. Gladstone Institute of Cardiovascular Disease and University of California, San Francisco, San Francisco, California, USA.

    • Deepak Srivastava
  15. Omicia, Inc., Oakland, California, USA.

    • Martin G Reese

Authors

  1. Hao Hu
  2. Jared C Roach
  3. Hilary Coon
  4. Stephen L Guthery
  5. Karl V Voelkerding
  6. Rebecca L Margraf
  7. Jacob D Durtschi
  8. Sean V Tavtigian
  9. Shankaracharya
  10. Wilfred Wu
  11. Paul Scheet
  12. Shuoguo Wang
  13. Jinchuan Xing
  14. Gustavo Glusman
  15. Robert Hubley
  16. Hong Li
  17. Vidu Garg
  18. Barry Moore
  19. Leroy Hood
  20. David J Galas
  21. Deepak Srivastava
  22. Martin G Reese
  23. Lynn B Jorde
  24. Mark Yandell
  25. Chad D Huff

Contributions

C.D.H. conceived the project and oversaw and coordinated the research. C.D.H. and H.H. designed the algorithms. H.H. and B.M. wrote the software. C.D.H., H.H. and P.S. contributed to the statistical development. C.D.H., H.H., J.C.R., M.Y., S.V.T., D.S., K.V.V., L.H., L.B.J., M.G.R. and S.L.G. designed the experiments. H.H., H.C., W.W., R.L.M., J.D.D., S.W., H.L., J.X., Shankaracharya, R.H., B.M., J.C. and G.G. performed the experiments. H.H., C.D.H., M.Y., S.V.T., S.L.G. and L.B.J. analyzed and interpreted the data. H.H. generated the figures. H.H., C.D.H., L.B.J., M.Y., S.L.G., P.S. and S.V.T. wrote the paper. S.L.G., D.S., V.G., D.J.G., L.H., H.L., R.H., K.V.V., R.L.M., J.D.D. and G.G. participated in pedigree identification, recruitment and validation.

Competing interests

M.G.R. is a founder and officer of Omicia, Inc.

Corresponding authors

Correspondence to Mark Yandell or Chad D Huff.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Notes 1–4, Supplementary Figures 1–10 and Supplementary Table 1

Zip files

  1. Supplementary Code

    pVAAST source code

About this article

DOI

https://doi.org/10.1038/nbt.2895
