  • Review Article
Pleiotropy in complex traits: challenges and strategies

Key Points

  • Genome-wide association studies have identified many novel loci for hundreds of traits. Interestingly, numerous genetic loci have been associated with multiple seemingly distinct traits. These cross-phenotype (CP) associations highlight the relevance of pleiotropy in human disease.

  • There is substantial evidence for CP associations in contemporary gene-mapping studies.

  • Different types of pleiotropy (biological, mediated and spurious pleiotropy) can underlie a CP association.

  • Various analytical approaches have been devised for detecting CP associations, especially methods that are based on summary statistics as opposed to individual-level data. Different methods have relative advantages and disadvantages and are distinguished by their underlying algorithms and by the types of phenotype data that they handle.

  • Study design considerations are crucial for minimizing the identification of spurious CP associations.

  • CP associations can highlight shared biological pathways and, when associated with different diseases, have clinical implications for diagnosis, counselling and treatment.

Abstract

Genome-wide association studies have identified many variants that each affect multiple traits, particularly across autoimmune diseases, cancers and neuropsychiatric disorders, suggesting that pleiotropic effects on human complex traits may be widespread. However, systematic detection of such effects is challenging and requires new methodologies and frameworks for interpreting cross-phenotype results. In this Review, we discuss the evidence for pleiotropy in contemporary genetic mapping studies, new and established analytical approaches to identifying pleiotropic effects, sources of spurious cross-phenotype effects and study design considerations. We also outline the molecular and clinical implications of such findings and discuss future directions of research.

Figure 1: Types of pleiotropy.

Acknowledgements

This work was supported in part by the US National Institute of Mental Health (NIMH) grants R01-MH079799 and K24MH094614 (both to J.W.S.).

Author information

Corresponding author

Correspondence to Jordan W. Smoller.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Genome-wide association studies

(GWASs). Studies in which hundreds of thousands (or millions) of genetic markers are tested for association with a phenotypic trait; they are an unbiased approach to survey the entire genome for disease-associated regions using common variation.

Genome-wide-significant

A term describing the statistical significance threshold that accounts for multiple testing in GWASs.
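As a rough illustration of where such a threshold can come from, the sketch below applies a Bonferroni correction. It assumes a family-wise error rate of 0.05 and about one million independent tests, which are the figures commonly used to motivate the conventional 5 × 10^-8 threshold; the function name and both inputs are illustrative and are not taken from the Review.

```python
def bonferroni_threshold(family_wise_alpha: float, n_independent_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_independent_tests <= 0:
        raise ValueError("n_independent_tests must be positive")
    return family_wise_alpha / n_independent_tests


# Assumed inputs: alpha = 0.05 and roughly one million independent tests.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"{threshold:.1e}")
```

The number of effectively independent tests depends on the population and the variants analysed, so the threshold that is appropriate for a given study may differ.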

Complex traits

Traits controlled by a combination of many genes and environmental factors.

Pleiotropy

The phenomenon in which a gene or genetic variant affects more than one phenotypic trait.

Heritability

The proportion of phenotypic variance attributed to genetic differences among individuals in a population.

Colocalizing

Describes different genetic variants that are located in the same gene, are in high linkage disequilibrium and affect different phenotypes.

Single-nucleotide polymorphisms

Single nucleotides in the genome that vary across individuals in the population.

Linkage disequilibrium

(LD). The correlation between genetic markers owing to limited recombination.
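
One widely used measure of this correlation is the squared correlation coefficient r² between two biallelic markers. The sketch below computes it from phased haplotypes; the 0/1 allele coding and the function name are assumptions made for illustration.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic markers.

    `haplotypes` is a list of (a, b) pairs, one per phased haplotype,
    with the allele at each marker coded as 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at marker A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    return d * d / denom
```

A value of 1 means that one marker fully predicts the other, and a value of 0 means that the alleles at the two markers are uncorrelated in the sample.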

Copy number variants

Regions of the genome in which the copy number is polymorphic (for example, deletions and duplications) across individuals.

Polygenic

Controlled by many genes.

Population stratification

A source of bias in genome-wide association studies that occurs when both a phenotype and the allele frequency of a single-nucleotide polymorphism vary owing to ancestral differences.

Batch effect

Systematic biases in the data that arise from differences in sample handling.

Genotype imputation

Inference of missing genotypes or untyped single-nucleotide polymorphisms using statistical techniques.

Ascertainment bias

A consequence of collecting a nonrandom subsample with a systematic bias so that results based on the subsample are not representative of the entire sample.

Tag SNPs

Single-nucleotide polymorphisms (SNPs) chosen to represent a region of the genome owing to strong linkage disequilibrium.

Multivariate analyses

The simultaneous inclusion of two or more phenotypes in one analysis when testing the association with a genetic variant.

Univariate analyses

Tests of association between one phenotype and a genetic variant.

Polygenic scoring

The calculation of a score that aggregates the number of risk alleles a subject carries, weighted by the effect size of each allele for a particular trait. The risk allele and effect size for each single-nucleotide polymorphism are generally taken from a genome-wide association study of an independent sample.
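
The weighted sum described in this entry can be sketched as follows. The SNP identifiers, dosages and weights are made-up illustrative values, and the function name is hypothetical; in practice the weights would be effect sizes (for example, log odds ratios) from an independent study.

```python
def polygenic_score(dosages, weights):
    """Sum of risk-allele counts weighted by per-allele effect size.

    `dosages` maps SNP id -> number of risk alleles carried (0, 1 or 2).
    `weights` maps SNP id -> effect size of the risk allele.
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no genotype for SNPs: {sorted(missing)}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical subject and weights, for illustration only.
subject = {"snp1": 2, "snp2": 1, "snp3": 0}
effect_sizes = {"snp1": 0.10, "snp2": 0.25, "snp3": 0.40}
score = polygenic_score(subject, effect_sizes)
```

Raising an error on missing genotypes is a design choice for this sketch; real pipelines often impute or skip missing SNPs instead.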

Linear mixed-effect model

A linear model that contains both fixed and random effects. This type of model can be used to estimate genetic correlation between traits using a genome-wide set of single-nucleotide polymorphisms.

Cohort studies

Observational studies in which defined groups of people (the cohorts) are followed over time and outcomes are compared in subsets of the cohort who were exposed to different levels of factors of interest. These studies can be carried out either prospectively or retrospectively from historical records.

Cross-sectional studies

Studies in which data are collected on subjects at one specific point in time and subjects are not selected for a particular trait or exposure.

Case–control study

A study that compares cases (that is, a selected group of individuals: for example, those diagnosed with a disorder) with controls (that is, a comparison group of individuals: for example, those who are not diagnosed with the disorder). Genome-wide association case–control studies test whether genetic marker allele frequencies differ between cases and controls.

Generalized estimating equations

A statistical technique used to estimate regression parameters that does not require the joint distribution of the variables to be fully specified.

Log-linear model

A statistical model that captures the dependence among a set of categorical variables.

Bayesian network

A network that captures relationships between variables or nodes of interest (for example, phenotypes and SNPs). Bayesian networks can incorporate prior information in establishing relationships between variables.

Ordinal regression

A regression model in which the outcome variable is ordinal.

Non-parametric approach

A statistical analysis method that does not rely on specific distributional assumptions (for example, normality) for the variables being analysed.

Principal components analysis

A statistical method used to simplify data sets by transforming a series of correlated variables into a smaller number of uncorrelated factors. It is also commonly used to infer continuous axes of variation in genetic data, often representing genetic ancestry.

Summary statistics

Statistics that summarize a set of observations. In the context of genome-wide association studies, summary statistics typically include estimates of the effect size (for example, odds ratio) and its standard error, and meta-analyses can be carried out by using these statistics alone.
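
To illustrate how a meta-analysis can proceed from summary statistics alone, the sketch below combines per-study effect sizes and standard errors using inverse-variance weighting under a fixed-effect model. This is one standard scheme, chosen here as an assumption rather than taken from the Review; effect sizes for binary traits are assumed to be on the log odds ratio scale, and the function name is hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    `betas` are per-study effect estimates (e.g. log odds ratios) and
    `ses` their standard errors. Returns (pooled beta, pooled SE, z score).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]          # precision of each study
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Two hypothetical studies of the same SNP.
beta, se, z = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
```

The scheme assumes that the studies are independent; overlapping samples would require an adjustment that is not shown here.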

Effect heterogeneity

Different effect sizes across phenotypes.
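
One common way to quantify such heterogeneity is Cochran's Q statistic, sketched below for the effect of a single variant on several phenotypes. Its use here is an assumption for illustration, not necessarily the approach taken in the Review, and the function name is hypothetical.

```python
def cochran_q(betas, ses):
    """Cochran's Q for a set of effect estimates and their standard errors.

    Returns (Q, degrees of freedom). Q is the precision-weighted sum of
    squared deviations of each estimate from the pooled estimate.
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two estimates with standard errors")
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return q, len(betas) - 1


# Hypothetical effects of one variant on two phenotypes.
q, df = cochran_q([0.1, 0.3], [0.1, 0.1])
```

If the estimates are independent and share a common true effect, Q approximately follows a chi-squared distribution with the returned degrees of freedom. Estimates from overlapping samples are not independent, so this reference distribution should be treated with caution in cross-phenotype settings.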

Expression quantitative trait loci

Loci at which allelic variation is associated with variation in gene expression.

Fine mapping

Extensively genotyping or sequencing a region of the genome that was identified in genome-wide association studies to identify the causal variant.

Confounding factor

A variable (for example, batch effects or population structure) that is associated with both the genotype and the phenotype of interest and can give rise to a spurious association.

Genetic architecture

A genetic model (that is, the number of single-nucleotide polymorphisms, effect sizes, allele frequency, and so on) underlying a phenotypic trait.

About this article

Cite this article

Solovieff, N., Cotsapas, C., Lee, P. et al. Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet 14, 483–495 (2013). https://doi.org/10.1038/nrg3461

  • DOI: https://doi.org/10.1038/nrg3461
