A mixed-model approach for genome-wide association studies of correlated traits in structured populations

Abstract

Genome-wide association studies (GWAS) are a standard approach for studying the genetics of natural variation. A major concern in GWAS is the need to account for the complicated dependence structure of the data, both between loci and between individuals. Mixed models have emerged as a general and flexible approach for correcting for population structure in GWAS. Here, we extend this linear mixed-model approach to carry out GWAS of correlated phenotypes, deriving a fully parameterized multi-trait mixed model (MTMM) that considers both the within-trait and between-trait variance components simultaneously for multiple traits. We apply this model to correlated blood lipid traits in a human cohort, the Northern Finland Birth Cohort 1966, and show greatly increased power to detect pleiotropic loci that affect more than one blood lipid trait. We also apply this approach to an Arabidopsis thaliana data set of flowering measurements taken in two different locations, identifying loci whose effect depends on the environment.
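
As a rough illustration of what "within-trait and between-trait variance components" can mean, a generic two-trait mixed model might be written as below. This is a sketch under stated assumptions rather than the authors' exact parameterization: the symbols (K for a kinship matrix, s_t for trait indicator vectors, rho_g and rho_e for genetic and residual correlations, beta and alpha for marker effects) are introduced here for illustration and do not appear in the abstract.

```latex
% Assumes amsmath. Two traits measured on the same n individuals,
% stacked trait by trait into a vector y of length 2n.
% s_t : 0/1 indicator vector marking the entries of y that belong to trait t
% x   : marker genotype vector, repeated once per trait (length 2n)
% K   : n x n kinship (relatedness) matrix; I_n : n x n identity matrix
\begin{align}
\mathbf{y} &=
  \begin{pmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{pmatrix}
  = \mathbf{s}_1 \mu_1 + \mathbf{s}_2 \mu_2
  + \mathbf{x}\,\beta
  + (\mathbf{x} \circ \mathbf{s}_1)\,\alpha
  + \mathbf{v},
  \qquad \mathbf{v} \sim N(\mathbf{0}, \mathbf{V}), \\[4pt]
\mathbf{V} &=
  \begin{pmatrix}
    \sigma_{g1}^2 & \rho_g \sigma_{g1} \sigma_{g2} \\
    \rho_g \sigma_{g1} \sigma_{g2} & \sigma_{g2}^2
  \end{pmatrix} \otimes \mathbf{K}
  +
  \begin{pmatrix}
    \sigma_{e1}^2 & \rho_e \sigma_{e1} \sigma_{e2} \\
    \rho_e \sigma_{e1} \sigma_{e2} & \sigma_{e2}^2
  \end{pmatrix} \otimes \mathbf{I}_n .
\end{align}
```

In this notation the within-trait components are the per-trait genetic and residual variances, and the between-trait components are the correlations rho_g and rho_e. A marker effect shared by both traits enters through beta, while a trait-specific (or environment-specific) difference in effect enters through alpha.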


Figure 1: Simulation results.
Figure 2: GWAS of LDL and triglycerides.
Figure 3: Venn diagrams summarizing the GWAS of A. thaliana flowering data (ref. 32).
Figure 4: Summary of FRS6 results.

References

1. Platt, A., Vilhjálmsson, B.J. & Nordborg, M. Conditions under which genome-wide association studies will be positively misleading. Genetics 186, 1045–1052 (2010).
2. Kang, H.M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
3. Price, A.L., Zaitlen, N.A., Reich, D. & Patterson, N. New approaches to population stratification in genome-wide association studies. Nat. Rev. Genet. 11, 459–463 (2010).
4. Hamza, T.H. et al. Genome-wide gene environment study identifies glutamate receptor gene GRIN2A as a Parkinson's disease modifier gene via interaction with coffee. PLoS Genet. 7, e1002237 (2011).
5. Lynch, M. & Walsh, B. Genetics and Analysis of Quantitative Traits (Sinauer Associates, Sunderland, Massachusetts, 1997).
6. Jiang, C. & Zeng, Z.B. Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111–1127 (1995).
7. Ferreira, M.A. & Purcell, S.M. A multivariate test of association. Bioinformatics 25, 132–133 (2009).
8. Zhang, L., Pei, Y.F., Li, J., Papasian, C.J. & Deng, H.W. Univariate/multivariate genome-wide association scans using data from families and unrelated samples. PLoS ONE 4, e6502 (2009).
9. Knott, S.A. & Haley, C.S. Multitrait least squares for quantitative trait loci detection. Genetics 156, 899–911 (2000).
10. Henderson, C.R. Application of Linear Models in Animal Breeding (University of Guelph, Guelph, Canada, 1984).
11. Thomas, D. Gene–environment-wide association studies: emerging approaches. Nat. Rev. Genet. 11, 259–272 (2010).
12. Ober, C. & Vercelli, D. Gene-environment interactions in human disease: nuisance or opportunity? Trends Genet. 27, 107–115 (2011).
13. Yu, J. et al. A unified mixed model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38, 203–208 (2006).
14. Atwell, S. et al. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 465, 627–631 (2010).
15. Huang, X. et al. Genome-wide association studies of 14 agronomic traits in rice landraces. Nat. Genet. 42, 961–967 (2010).
16. Olsen, H.G. et al. Genome-wide association mapping in Norwegian Red cattle identifies quantitative trait loci for fertility and milk production on BTA12. Anim. Genet. 42, 466–474 (2011).
17. Tian, F. et al. Genome-wide association study of leaf architecture in the maize nested association mapping population. Nat. Genet. 43, 159–162 (2011).
18. Zhao, K. et al. An Arabidopsis example of association mapping in structured samples. PLoS Genet. 3, e4 (2007).
19. Kang, H.M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008).
20. Zhang, Z. et al. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 42, 355–360 (2010).
21. Idaghdour, Y. et al. Geographical genomics of human leukocyte gene expression variation in southern Morocco. Nat. Genet. 42, 62–67 (2010).
22. International Multiple Sclerosis Genetics Consortium and Wellcome Trust Case Control Consortium 2. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 476, 214–219 (2011).
23. Stich, B., Piepho, H.P., Schulz, B. & Melchinger, A.E. Multitrait association mapping in sugar beet (Beta vulgaris L.). Theor. Appl. Genet. 117, 947–954 (2008).
24. Lee, S.H., Wray, N.R., Goddard, M.E. & Visscher, P.M. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 88, 294–305 (2011).
25. Yang, J., Lee, S.H., Goddard, M.E. & Visscher, P.M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
26. Lee, S.H. et al. Estimating the proportion of variation in susceptibility to schizophrenia captured by common SNPs. Nat. Genet. 44, 247–250 (2012).
27. Deary, I.J. et al. Genetic contributions to stability and change in intelligence from childhood to old age. Nature 482, 212–215 (2012).
28. Kim, S. & Xing, E.P. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genet. 5, e1000587 (2009).
29. Manning, A.K. et al. Meta-analysis of gene-environment interaction: joint estimation of SNP and SNP × environment regression coefficients. Genet. Epidemiol. 35, 11–18 (2011).
30. Horton, M.W. et al. Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel. Nat. Genet. 44, 212–216 (2012).
31. Sabatti, C. et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat. Genet. 41, 35–46 (2009).
32. Li, Y., Huang, Y., Bergelson, J., Nordborg, M. & Borevitz, J.O. Association mapping of local climate-sensitive quantitative trait loci in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 107, 21199–21204 (2010).
33. Kathiresan, S. et al. A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study. BMC Med. Genet. 8 (suppl. 1), S17 (2007).
34. Teslovich, T.M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).
35. Lin, R. & Wang, H. Arabidopsis FHY3/FAR1 gene family and distinct roles of its members in light control of Arabidopsis development. Plant Physiol. 136, 4010–4022 (2004).
36. Fisher, R. The correlation between relatives on the supposition of Mendelian inheritance. Trans. R. Soc. Edinburgh 52, 399–433 (1918).
37. Price, A.L. et al. Single-tissue and cross-tissue heritability of gene expression via identity-by-descent in related or unrelated individuals. PLoS Genet. 7, e1001317 (2011).
38. Buckler, E.S. et al. The genetic architecture of maize flowering time. Science 325, 714–718 (2009).
39. Valdar, W. et al. Genetic and environmental effects on complex traits in mice. Genetics 174, 959–984 (2006).
40. Smith, E.N. & Kruglyak, L. Gene-environment interaction in yeast gene expression. PLoS Biol. 6, e83 (2008).
41. Gilmour, A., Gogel, B., Cullis, B., Welham, S.J. & Thompson, R. ASReml User Guide Release 1.0 (VSN International, Hemel Hempstead, UK, 2002).
42. Henderson, C. & Quaas, R.L. Multiple trait evaluation using relatives' records. J. Anim. Sci. 43, 1188–1197 (1976).

Acknowledgements

We thank the NFBC1966 Study Investigators for allowing us to use their phenotype and genotype data in our study. The NFBC1966 Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with the Broad Institute, the University of California, Los Angeles (UCLA), the University of Oulu and the National Institute for Health and Welfare in Finland. This manuscript was not prepared in collaboration with investigators of the NFBC1966 Study and does not necessarily reflect the opinions or views of the NFBC1966 Study Investigators, the Broad Institute, UCLA, the University of Oulu, the National Institute for Health and Welfare in Finland or the NHLBI. We furthermore thank N.B. Freimer and S.K. Service for their help in preprocessing the NFBC1966 data. We would also like to thank P. Forai for excellent IT and cluster support at the Gregor Mendel Institute, the INRA MIGALE bioinformatics platform for further computational resources and J. Dekkers, P. Donnelly, E. Eskin, C. Niango and A. Price for comments on the manuscript and/or helpful discussions. This work was supported by grants to M.N. from the US National Institutes of Health (P50 HG002790) and the European Union Framework Programme 7 (TransPLANT, grant agreement 283496), as well as by grants from the Deutsche Forschungsgemeinschaft (DFG) (A.K., KO4184/1-1) and the Ecologie des Forêts, Prairies et milieux Aquatiques (EFPA) department of INRA (V.S.).

Author information

Contributions

All authors helped design the study. A.K., B.J.V. and V.S. developed the theory and implemented the simulations. A.K., B.J.V. and M.N. wrote the paper with input from V.S., A.P. and Q.L.

Corresponding author

Correspondence to Magnus Nordborg.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1–7, Supplementary Figures 1–14 and Supplementary Note (PDF 2464 kb)

About this article

Cite this article

Korte, A., Vilhjálmsson, B., Segura, V. et al. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nat Genet 44, 1066–1071 (2012). https://doi.org/10.1038/ng.2376
