Exome sequencing and the genetic basis of complex traits

Figure 1: Discovery of novel variants for increasing numbers of samples.
Figure 2: Association analysis.
Figure 3: Extrapolation of gene burden results.

References

  1. Fuller, C.W. et al. The challenges of sequencing by synthesis. Nat. Biotechnol. 27, 1013–1023 (2009).

  2. Rusk, N. & Kiermer, V. Primer: Sequencing—the next generation. Nat. Methods 5, 15 (2008).

  3. Metzker, M.L. Sequencing technologies – the next generation. Nat. Rev. Genet. 11, 31–46 (2010).

  4. Shendure, J. & Ji, H. Next-generation DNA sequencing. Nat. Biotechnol. 26, 1135–1145 (2008).

  5. Clarke, J. et al. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol. 4, 265–270 (2009).

  6. Ng, S.B. et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42, 790–793 (2010).

  7. Teer, J.K. & Mullikin, J.C. Exome sequencing: the sweet spot before whole genomes. Hum. Mol. Genet. 19, R145–R151 (2010).

  8. Hedges, D.J. et al. Comparison of three targeted enrichment strategies on the SOLiD sequencing platform. PLoS ONE 6, e18595 (2011).

  9. Ng, S.B. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276 (2009).

  10. Pierce, S.B. et al. Mutations in the DBP-deficiency protein HSD17B4 cause ovarian dysgenesis, hearing loss, and ataxia of Perrault Syndrome. Am. J. Hum. Genet. 87, 282–288 (2010).

  11. Krawitz, P.M. et al. Identity-by-descent filtering of exome sequence data identifies PIGV mutations in hyperphosphatasia mental retardation syndrome. Nat. Genet. 42, 827–829 (2010).

  12. Wang, J.L. et al. TGM6 identified as a novel causative gene of spinocerebellar ataxias using exome sequencing. Brain 133, 3510–3518 (2010).

  13. Ng, S.B., Nickerson, D.A., Bamshad, M.J. & Shendure, J. Massively parallel sequencing and rare disease. Hum. Mol. Genet. 19, R119–R124 (2010).

  14. Musunuru, K. et al. Exome sequencing, ANGPTL3 mutations, and familial combined hypolipidemia. N. Engl. J. Med. 363, 2220–2227 (2010).

  15. Hoischen, A. et al. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat. Genet. 42, 483–485 (2010).

  16. Zhao, Q. et al. Systematic detection of putative tumor suppressor genes through the combined use of exome and transcriptome sequencing. Genome Biol. 11, R114 (2010).

  17. Wei, X. et al. Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nat. Genet. 43, 442–446 (2011).

  18. Varela, I. et al. Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma. Nature 469, 539–542 (2011).

  19. Agrawal, N. et al. Exome sequencing of head and neck squamous cell carcinoma reveals inactivating mutations in NOTCH1. Science 333, 1154–1157 (2011).

  20. Chang, H. et al. Exome sequencing reveals comprehensive genomic alterations across eight cancer cell lines. PLoS ONE 6, e21097 (2011).

  21. Cohen, J.C. et al. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305, 869–872 (2004).

  22. Ji, W. et al. Rare independent mutations in renal salt handling genes contribute to blood pressure variation. Nat. Genet. 40, 592–599 (2008).

  23. Johansen, C.T. et al. Excess of rare variants in genes identified by genome-wide association study of hypertriglyceridemia. Nat. Genet. 42, 684–687 (2010).

  24. Nejentsev, S., Walker, N., Riches, D., Egholm, M. & Todd, J.A. Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 324, 387–389 (2009).

  25. Ahituv, N. et al. Medical sequencing at the extremes of human body mass. Am. J. Hum. Genet. 80, 779–791 (2007).

  26. Romeo, S. et al. Rare loss-of-function mutations in ANGPTL family members contribute to plasma triglyceride levels in humans. J. Clin. Invest. 119, 70–79 (2009).

  27. Pritchard, J.K. Are rare variants responsible for susceptibility to complex diseases? Am. J. Hum. Genet. 69, 124–137 (2001).

  28. Pritchard, J.K. & Cox, N.J. The allelic architecture of human disease genes: common disease–common variant...or not? Hum. Mol. Genet. 11, 2417–2423 (2002).

  29. Kryukov, G.V., Pennacchio, L.A. & Sunyaev, S.R. Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am. J. Hum. Genet. 80, 727–739 (2007).

  30. Kryukov, G.V., Shpunt, A., Stamatoyannopoulos, J.A. & Sunyaev, S.R. Power of deep, all-exon resequencing for discovery of human trait genes. Proc. Natl. Acad. Sci. USA 106, 3871–3876 (2009).

  31. Boyko, A.R. et al. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4, e1000083 (2008).

  32. Williamson, S.H. et al. Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl. Acad. Sci. USA 102, 7882–7887 (2005).

  33. Eyre-Walker, A., Woolfit, M. & Phelps, T. The distribution of fitness effects of new deleterious amino acid mutations in humans. Genetics 173, 891–900 (2006).

  34. Yampolsky, L.Y., Kondrashov, F.A. & Kondrashov, A.S. Distribution of the strength of selection against amino acid replacements in human proteins. Hum. Mol. Genet. 14, 3191–3201 (2005).

  35. Fay, J.C., Wyckoff, G.J. & Wu, C.-I. Positive and negative selection on the human genome. Genetics 158, 1227–1234 (2001).

  36. Nachman, M.W. & Crowell, S.L. Estimate of the mutation rate per nucleotide in humans. Genetics 156, 297–304 (2000).

  37. Kondrashov, A.S. Direct estimates of human per nucleotide mutation rates at 20 loci causing Mendelian diseases. Hum. Mutat. 21, 12–27 (2003).

  38. Roach, J.C. et al. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636–639 (2010).

  39. Xue, Y. et al. Human Y chromosome base-substitution mutation rate measured by direct sequencing in a deep-rooting pedigree. Curr. Biol. 19, 1453–1457 (2009).

  40. The HIV Controllers Study. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science 330, 1551–1557 (2010).

  41. Ewens, W.J. The sampling theory of selectively neutral alleles. Theor. Popul. Biol. 3, 87–112 (1972).

  42. Kimura, M. Molecular evolutionary clock and the neutral theory. J. Mol. Evol. 26, 24–33 (1987).

  43. Marth, G.T., Czabarka, E., Murvai, J. & Sherry, S.T. The allele frequency spectrum in genome-wide human variation data reveals signals of differential demographic history in three large world populations. Genetics 166, 351–372 (2004).

  44. Coventry, A. et al. Deep resequencing reveals excess rare recent variants consistent with explosive population growth. Nat. Commun. 1, 131 (2010).

  45. Li, Y. et al. Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants. Nat. Genet. 42, 969–972 (2010).

  46. Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).

  47. Halushka, M.K. et al. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet. 22, 239–247 (1999).

  48. Cargill, M. et al. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22, 231–238 (1999).

  49. Bustamante, C.D. et al. Natural selection on protein-coding genes in the human genome. Nature 437, 1153–1157 (2005).

  50. Sunyaev, S., Ramensky, V. & Bork, P. Towards a structural basis of human non-synonymous single nucleotide polymorphisms. Trends Genet. 16, 198–200 (2000).

  51. Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001).

  52. McKenna, A. et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  53. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  54. DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  55. Hellmann, I. et al. Selection on human genes as revealed by comparisons to chimpanzee cDNA. Genome Res. 13, 831–837 (2003).

  56. MacArthur, D.G. & Tyler-Smith, C. Loss-of-function variants in the genomes of healthy humans. Hum. Mol. Genet. 19, R125–R130 (2010).

  57. Purcell, S., Cherny, S.S. & Sham, P.C. Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 19, 149–150 (2003).

  58. Li, B. & Leal, S.M. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am. J. Hum. Genet. 83, 311–321 (2008).

  59. Madsen, B.E. & Browning, S.R. A groupwise association test for rare mutations using a weighted sum statistic. PLoS Genet. 5, e1000384 (2009).

  60. Liu, D.J. & Leal, S.M. A novel adaptive method for the analysis of next-generation sequencing data to detect complex trait associations with rare variants due to gene main effects and interactions. PLoS Genet. 6, e1001156 (2010).

  61. Price, A.L. et al. Pooled association tests for rare variants in exon-resequencing studies. Am. J. Hum. Genet. 86, 832–838 (2010).

  62. Bansal, V., Libiger, O., Torkamani, A. & Schork, N.J. Statistical analysis strategies for association studies involving rare variants. Nat. Rev. Genet. 11, 773–785 (2010).

  63. Asimit, J. & Zeggini, E. Rare variant association analysis methods for complex traits. Annu. Rev. Genet. 44, 293–308 (2010).

  64. Basu, S. & Pan, W. Comparison of statistical tests for disease association with rare variants. Genet. Epidemiol. 35, 606–619 (2011).

  65. Stitziel, N.O., Kiezun, A. & Sunyaev, S.R. Computational and statistical approaches to analyzing variants identified by exome sequencing. Genome Biol. 12, 227 (2011).

  66. Wu, M.C. et al. Rare variant association testing for sequencing data using the sequence kernel association test (SKAT). Am. J. Hum. Genet. 89, 82–93 (2011).

  67. Neale, B.M. et al. Testing for an unusual distribution of rare variants. PLoS Genet. 7, e1001322 (2011).

  68. Kotowski, I.K. et al. A spectrum of PCSK9 alleles contributes to plasma levels of low-density lipoprotein cholesterol. Am. J. Hum. Genet. 78, 410–422 (2006).

  69. Hoffmann, T.J., Marini, N.J. & Witte, J.S. Comprehensive approach to analyzing rare genetic variants. PLoS ONE 5, e13584 (2010).

  70. Ionita-Laza, I., Buxbaum, J.D., Laird, N.M. & Lange, C. A new testing strategy to identify rare variants with either risk or protective effect on disease. PLoS Genet. 7, e1001289 (2011).

  71. Tavtigian, S.V. et al. Rare, evolutionarily unlikely missense substitutions in ATM confer increased risk of breast cancer. Am. J. Hum. Genet. 85, 427–446 (2009).

  72. Sul, J.H., Han, B., He, D. & Eskin, E. An optimal weighted aggregated association test for identification of rare variants involved in common diseases. Genetics 188, 181–188 (2011).

  73. Sul, J.H., Han, B. & Eskin, E. Increasing power of groupwise association test with likelihood ratio test. in Research in Computational Molecular Biology, Lecture Notes in Computer Science Vol. 6577/2011 452–467 (Springer, Berlin/Heidelberg, 2011).

  74. Cooper, G.M. et al. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913 (2005).

  75. Cooper, G.M. et al. Single-nucleotide evolutionary constraint scores highlight disease-causing mutations. Nat. Methods 7, 250–251 (2010).

  76. Ng, P.C. & Henikoff, S. Predicting the effects of amino acid substitutions on protein function. Annu. Rev. Genomics Hum. Genet. 7, 61–80 (2006).

  77. Jordan, D.M., Ramensky, V.E. & Sunyaev, S.R. Human allelic variation: perspective from protein function, structure, and evolution. Curr. Opin. Struct. Biol. 20, 342–350 (2010).

  78. Thusberg, J., Olatubosun, A. & Vihinen, M. Performance of mutation pathogenicity prediction methods on missense variants. Hum. Mutat. 32, 358–368 (2011).

  79. Cooper, G.M. & Shendure, J. Needles in stacks of needles: finding disease-causing variants in a wealth of genomic data. Nat. Rev. Genet. 12, 628–640 (2011).

  80. Hicks, S., Wheeler, D.A., Plon, S.E. & Kimmel, M. Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum. Mutat. 32, 661–668 (2011).

  81. Stephens, M. & Balding, D.J. Bayesian statistical methods for genetic association studies. Nat. Rev. Genet. 10, 681–690 (2009).

  82. Sladek, R. et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445, 881–885 (2007).

  83. Saxena, R. et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 316, 1331–1336 (2007).

  84. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).

  85. Drmanac, R. et al. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science 327, 78–81 (2010).

  86. Lipman, P.J. et al. On the follow-up of genome-wide association studies: an overall test for the most promising SNPs. Genet. Epidemiol. 35, 303–309 (2011).

  87. Price, A.L., Zaitlen, N.A., Reich, D. & Patterson, N. New approaches to population stratification in genome-wide association studies. Nat. Rev. Genet. 11, 459–463 (2010).

  88. Pritchard, J.K., Stephens, M. & Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000).

  89. Price, A.L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).

  90. Kang, H.M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).

  91. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).

  92. Keinan, A., Mullikin, J.C., Patterson, N. & Reich, D. Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nat. Genet. 39, 1251–1255 (2007).

  93. Alexander, D.H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  94. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  95. Holsinger, K.E. & Weir, B.S. Genetics in geographically structured populations: defining, estimating and interpreting FST. Nat. Rev. Genet. 10, 639–650 (2009).

  96. Novembre, J. et al. Genes mirror geography within Europe. Nature 456, 98–101 (2008).

  97. Clayton, D.G. et al. Population structure, differential bias and genomic control in a large-scale, case-control association study. Nat. Genet. 37, 1243–1246 (2005).

Acknowledgements

The authors are grateful to S. Pollack for assistance with EIGENSOFT. This work was made possible, in part, by the US National Institutes of Health (NIH; grant 5R01 MH084676) and, in part, by the International HIV Controllers Study, supported by the Collaboration for AIDS Vaccine Discovery of the Bill and Melinda Gates Foundation (to P.I.W.d.B.), and the AIDS Clinical Trials Group, supported by the NIH (grants AI069513, AI34835, AI069432, AI069423, AI069477, AI069501, AI069474, AI069428, AI69467, AI069415, AI32782, AI27661, AI25859, AI28568, AI30914, AI069495, AI069471, AI069532, AI069452, AI069450, AI069556, AI069484, AI069472, AI34853, AI069465, AI069511, AI38844, AI069424, AI069434, AI46370, AI68634, AI069502, AI069419, AI068636, RR024975 and AI077505). Sequencing of the SCZ control individuals was funded by the NIH (grant RC2MH089905), the Herman Foundation and the Stanley Medical Research Institute. N.O.S. was supported, in part, by an NIH Training Grant (T32-HL07604-25; Division of Cardiovascular Medicine, Brigham and Women's Hospital). B.M.N. was supported by a National Institute of Mental Health (NIMH) grant (1R01MH089208-01). R.D. is supported by a Canadian Institutes of Health Research Banting Postdoctoral Fellowship. The views expressed in this paper do not necessarily represent the views of the NIMH, NIH, Department of Health and Human Services (HHS) or the US government.

Author information

Corresponding author

Correspondence to Shamil R Sunyaev.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Note and Supplementary Tables 1 and 2 (PDF 164 kb)

About this article

Cite this article

Kiezun, A., Garimella, K., Do, R. et al. Exome sequencing and the genetic basis of complex traits. Nat Genet 44, 623–630 (2012). https://doi.org/10.1038/ng.2303

  • DOI: https://doi.org/10.1038/ng.2303
