Although pioneered by human geneticists as a potential solution to the challenging problem of finding the genetic basis of common human diseases [1, 2], genome-wide association (GWA) studies have, owing to advances in genotyping and sequencing technology, become an obvious general approach for studying the genetics of natural variation and traits of agricultural importance. They are particularly useful when inbred lines are available, because once these lines have been genotyped they can be phenotyped multiple times, making it possible (as well as extremely cost-effective) to study many different traits in many different environments, while replicating the phenotypic measurements to reduce environmental noise. Here we demonstrate the power of this approach by carrying out a GWA study of 107 phenotypes in Arabidopsis thaliana, a widely distributed, predominantly self-fertilizing model plant known to harbour considerable genetic variation for many adaptively important traits [3]. Our results are dramatically different from those of human GWA studies, in that we identify many common alleles of major effect, but they are also, in many cases, harder to interpret because confounding by complex genetics and population structure makes it difficult to distinguish true associations from false ones. However, a priori candidates are significantly over-represented among these associations as well, making many of them excellent candidates for follow-up experiments. Our study demonstrates the feasibility of GWA studies in A. thaliana and suggests that the approach will be appropriate for many other organisms.
References
1. Genome-wide association studies for common diseases and complex traits. Nature Rev. Genet. 6, 95–108 (2005)
2. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007)
3. Naturally occurring genetic variation in Arabidopsis thaliana. Annu. Rev. Plant Biol. 55, 141–172 (2004)
4. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 3, e196 (2005)
5. Role of FRIGIDA and FLC in determining variation in flowering time of Arabidopsis thaliana. Plant Physiol. 138, 1163–1173 (2005)
6. Recombination and linkage disequilibrium in Arabidopsis thaliana. Nature Genet. 39, 1151–1155 (2007)
7. The extent of linkage disequilibrium in Arabidopsis thaliana. Nature Genet. 30, 190–193 (2002)
8. Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genet. 1, e60 (2005)
9. An Arabidopsis example of association mapping in structured samples. PLoS Genet. 3, e4 (2007)
10. Association mapping in structured populations. Am. J. Hum. Genet. 67, 170–181 (2000)
11. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genet. 38, 904–909 (2006)
12. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genet. 38, 203–208 (2005)
13. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008)
14. Structure of the Arabidopsis RPM1 gene enabling dual-specificity disease resistance. Science 269, 843–846 (1995)
15. Molecular analysis of FRIGIDA, a major determinant of natural variation in Arabidopsis flowering time. Science 290, 344–347 (2000)
16. A non-parametric test reveals selection for rapid flowering in the Arabidopsis genome. PLoS Biol. 4, e137 (2006)
17. FLOWERING LOCUS C encodes a novel MADS domain protein that acts as a repressor of flowering. Plant Cell 11, 949–956 (1999)
18. Cloning of DOG1, a quantitative trait locus controlling seed dormancy in Arabidopsis. Proc. Natl Acad. Sci. USA 103, 17042–17047 (2006)
19. Natural variants of AtHKT1 enhance Na+ accumulation in two wild populations of Arabidopsis. PLoS Genet. 2, e210 (2006)
20. Variation in molybdenum content across broadly distributed populations of Arabidopsis thaliana is controlled by a mitochondrial molybdenum transporter (MOT1). PLoS Genet. 4, e1000004 (2008)
21. A single amino acid replacement in ETC2 acts as major modifier of trichome patterning in natural Arabidopsis populations. Curr. Biol. 19, 1747–1751 (2009)
22. ACD6, a novel ankyrin protein, is a regulator and an effector of salicylic acid signaling in the Arabidopsis defense response. Plant Cell 15, 2408–2420 (2003)
23. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009)
24. A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation. PLoS Genet. 4, e1000074 (2008)
25. A genomewide association study of skin pigmentation in a South Asian population. Am. J. Hum. Genet. 81, 1119–1132 (2007)
26. Genomic control for association studies. Biometrics 55, 997–1004 (1999)
27. A general population-genetic model for the production by population structure of spurious genotype-phenotype associations in discrete, admixed, or spatially distributed populations. Genetics 173, 1665–1678 (2006)
28. Next-generation genetics in plants. Nature 456, 720–723 (2008)
Supplementary Information
The Supplementary Information comprises: (1) Genotyping; (2) Association Mapping Methods; (3) Enrichment for a priori candidates; Supplementary Figures 1–152 with legends; Supplementary References; and Supplementary Tables 1–7.