Genome-wide association studies (GWAS) are a standard approach for studying the genetics of natural variation. A major concern in GWAS is the need to account for the complicated dependence structure of the data, both between loci and between individuals. Mixed models have emerged as a general and flexible approach for correcting for population structure in GWAS. Here, we extend this linear mixed-model approach to carry out GWAS of correlated phenotypes, deriving a fully parameterized multi-trait mixed model (MTMM) that considers both the within-trait and between-trait variance components simultaneously for multiple traits. We apply this model to correlated blood lipid traits in a human cohort, the Northern Finland Birth Cohort 1966, and show greatly increased power to detect pleiotropic loci that affect more than one blood lipid trait. We also apply it to an Arabidopsis thaliana data set of flowering measurements taken in two different locations, identifying loci whose effect depends on the environment.
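To make the idea of within-trait and between-trait variance components concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the standard multi-trait formulation in which the phenotypes of t traits are stacked into one vector whose covariance is G ⊗ K + E ⊗ I, with K a kinship matrix, G a t × t genetic covariance matrix and E a t × t residual covariance matrix. The function names `mtmm_covariance` and `gls_fit`, the simulated data and the fixed values of G and E are all hypothetical; in a real analysis the variance components would be estimated from the data rather than supplied.

```python
import numpy as np


def mtmm_covariance(K, G, E):
    """Covariance of the stacked phenotype vector [trait 1; trait 2; ...].

    K: (n, n) kinship matrix between individuals.
    G: (t, t) genetic variances (diagonal) and covariances (off-diagonal).
    E: (t, t) residual variances and covariances.
    """
    n = K.shape[0]
    # Block (i, j) of the result is G[i, j] * K + E[i, j] * I_n.
    return np.kron(G, K) + np.kron(E, np.eye(n))


def gls_fit(y, X, V):
    """Generalized least squares estimate of fixed effects and their covariance."""
    Vinv_X = np.linalg.solve(V, X)
    XtVinvX = X.T @ Vinv_X
    # V is symmetric, so Vinv_X.T @ y equals X.T @ inv(V) @ y.
    beta = np.linalg.solve(XtVinvX, Vinv_X.T @ y)
    return beta, np.linalg.inv(XtVinvX)


# Simulated example with two traits (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, m = 50, 200
Z = rng.standard_normal((n, m))
K = Z @ Z.T / m                           # toy kinship matrix
G = np.array([[1.0, 0.6], [0.6, 1.0]])    # assumed genetic (co)variances
E = np.array([[1.0, 0.2], [0.2, 1.0]])    # assumed residual (co)variances
V = mtmm_covariance(K, G, E)

snp = rng.integers(0, 3, size=n).astype(float)  # genotype coded 0/1/2
ones, zeros = np.ones(n), np.zeros(n)
X = np.column_stack([
    np.concatenate([ones, zeros]),   # intercept for trait 1
    np.concatenate([zeros, ones]),   # intercept for trait 2
    np.concatenate([snp, snp]),      # SNP effect common to both traits
    np.concatenate([zeros, snp]),    # trait-specific (interaction) SNP effect
])
y = rng.multivariate_normal(np.zeros(2 * n), V)  # phenotypes with no SNP effect

beta, cov_beta = gls_fit(y, X, V)
```

In this layout the third coefficient corresponds to an effect shared by both traits, and the fourth to an effect that differs between traits or environments, which is the kind of contrast the abstract describes. A test statistic for either term could be built from `beta` and `cov_beta`.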
We thank the NFBC1966 Study Investigators for allowing us to use their phenotype and genotype data in our study. The NFBC1966 Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with the Broad Institute, the University of California, Los Angeles (UCLA), the University of Oulu and the National Institute for Health and Welfare in Finland. This manuscript was not prepared in collaboration with investigators of the NFBC1966 Study and does not necessarily reflect the opinions or views of the NFBC1966 Study Investigators, the Broad Institute, UCLA, the University of Oulu, the National Institute for Health and Welfare in Finland or the NHLBI. We furthermore thank N.B. Freimer and S.K. Service for their help in preprocessing the NFBC1966 data. We would also like to thank P. Forai for excellent IT and cluster support at the Gregor Mendel Institute, the INRA MIGALE bioinformatics platform for further computational resources and J. Dekkers, P. Donnelly, E. Eskin, C. Niango and A. Price for comments on the manuscript and/or helpful discussions. This work was supported by grants to M.N. from the US National Institutes of Health (P50 HG002790) and the European Union Framework Programme 7 (TransPLANT, grant agreement 283496), as well as by grants from the Deutsche Forschungsgemeinschaft (DFG) (A.K., KO4184/1-1) and the Ecologie des Forêts, Prairies et milieux Aquatiques (EFPA) department of INRA (V.S.).
The authors declare no competing financial interests.
Cite this article
Korte, A., Vilhjálmsson, B., Segura, V. et al. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nat Genet 44, 1066–1071 (2012). https://doi.org/10.1038/ng.2376