Synopsis

Subject Categories: Functional genomics | Plant Biology

Molecular Systems Biology 5 Article number: 242  doi:10.1038/msb.2008.79
Published online: 17 February 2009
Citation: Molecular Systems Biology 5:242

Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance

Stephanie Plantegenet1,a, Johann Weber2,a, Darlene R Goldstein3,4, Georg Zeller5,6, Cindy Nussbaumer1, Jérôme Thomas2, Detlef Weigel6, Keith Harshman2 & Christian S Hardtke1

  1. Department of Plant Molecular Biology, University of Lausanne, Biophore Building, Lausanne, Switzerland
  2. Lausanne DNA Array Facility, Center for Integrative Genomics, University of Lausanne, Genopode Building, Lausanne, Switzerland
  3. École Polytechnique Fédérale de Lausanne (EPFL), Institut de mathématiques (IMA), Bâtiment MA, Lausanne, Switzerland
  4. Swiss Institute of Bioinformatics, Lausanne, Switzerland
  5. Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany
  6. Max Planck Institute for Developmental Biology, Department of Molecular Biology, Tübingen, Germany

Correspondence to: Christian S Hardtke1, Department of Plant Molecular Biology, University of Lausanne, Biophore Building, Lausanne, CH-1015, Switzerland. Tel.: +41 21 692 4251; Fax: +41 21 692 4195; Email: christian.hardtke@unil.ch

Received 4 September 2008; Accepted 18 December 2008; Published online 17 February 2009

a These authors contributed equally to this work

Article highlights

  • Heritable gene expression level polymorphisms (ELPs) between natural strains are strong candidates for quantitative trait loci (QTL) that could explain intra-specific phenotypic variation.
  • Here we test the assumption that ELPs with simple, single locus inheritance primarily represent sequence variation in the corresponding regulatory sequences in Arabidopsis thaliana, through comprehensive genome-wide analyses linking variation in expression level, regulatory sequence and gene structure.
  • We find that a large fraction of genes representing ELPs with simple inheritance carry uni-parental indels that likely impair gene function. Thus, these ELPs primarily appear to reflect the consequences of structural differences in the corresponding genes, rather than variation in regulatory elements, even if such variation is observed.
  • Our results are in line with the experimentally observed preponderance of indels with drastic effects on gene integrity in cloned Arabidopsis QTL, suggesting that they do not reflect a technical bias and that Arabidopsis QTL representing more subtle regulatory polymorphisms might be less common than anticipated.

Synopsis

Microarray technologies have had a major impact on quantitative genetic analyses, enabling, for instance, large-scale discovery of genetically controlled gene expression level differences. This has led to the identification of numerous expression quantitative trait loci (eQTL) in various organisms (Brem et al, 2002; Morley et al, 2004; Doss et al, 2005; Li et al, 2006; West et al, 2007; Stranger et al, 2007b; Potokina et al, 2008). In this study, we focused on expression level polymorphisms (ELPs) that are observed between parents and display simple, single locus inheritance. Such loci constitute a highly heritable subset of cis-acting eQTL (Doss et al, 2005; Petretto et al, 2006; West et al, 2006, 2007; Keurentjes et al, 2007; Stranger et al, 2007b; Potokina et al, 2008) and are strong quantitative trait locus (QTL) candidates to explain phenotypic variation between parental lines. Despite the abundance of ELPs with simple inheritance, little is known about their molecular basis. Generally, however, they are assumed to reflect cis-acting sequence polymorphisms in regulatory elements of the corresponding genes (Jansen and Nap, 2001; Cowles et al, 2002; Schadt et al, 2003; Pastinen and Hudson, 2004; Ronald et al, 2005; Williams et al, 2007), although few studies have addressed this issue systematically (Cowles et al, 2002; GuhaThakurta et al, 2006).

Counter to the idea that regulatory polymorphisms are major determinants of phenotypic variability, QTL cloning in Arabidopsis thaliana has mostly identified knockout mutations as the underlying molecular cause (e.g. Aukerman et al, 1997; Grant et al, 1995; Johanson et al, 2000; Kliebenstein et al, 2001; Kroymann et al, 2003; Kroymann et al, 2001; Mouchel et al, 2004; Werner et al, 2005). Even if many of these loci represent ELPs, generally, a preponderance of indels is observed among these drastic mutations (Koornneef et al, 2004). However, because structural changes are easier to discover, successful reports of QTL isolation might reflect a technical bias towards knockout alleles. Indeed, studies of recombinant inbred line (RIL) populations created from Arabidopsis accessions have identified numerous eQTL (Keurentjes et al, 2007; West et al, 2007), including a sizable fraction of loci representing parental ELPs with simple inheritance. In this study, we analyzed the molecular basis of such ELPs in greater detail, by comprehensive comparison of gene expression, sequence variation and gene structure.

We identified parental ELPs in 480 genes through microarray analyses of the Arabidopsis accessions Eil-0 and Lc-0, a number comparable with studies of other accessions (West et al, 2006; Keurentjes et al, 2007). Among them, ELPs with simple inheritance were determined by comparison between single nucleotide polymorphism (SNP)-genotyped RILs derived from the two accessions and their two parents in microarray analyses. Approximately 20% of parental ELPs displayed simple cis-inheritance; taking into account technical limitations (such as arbitrary thresholds for differential expression), this group is very likely to be even larger (approximately 44%).

To determine whether these ELPs with simple inheritance are associated with increased regulatory sequence variation in the corresponding genes, we compared their promoter sequences with control genes that displayed very low variability across all microarray experiments. Indeed, sequence diversity between Eil-0 and Lc-0 was considerably higher in the ELP sample, supporting the regulatory hypothesis. However, parallel analyses of genomic Eil-0 and Lc-0 DNA using Arabidopsis tiling arrays, empirically calibrated for indel detection using the sequence data (Figure 3), revealed numerous indels of various sizes in both Eil-0 and Lc-0. Such indels were particularly abundant in genes representing ELPs with simple inheritance and, unlike in controls, were generally larger and frequently affected exons. Thus, uni-parental indels that likely impair or even abolish gene function appear to be much more frequent in genes representing ELPs with simple inheritance than in controls. In the vast majority of cases, the allele that carried deletions was expressed at a lower level, consistent with the idea that the majority of deletions negatively affect gene function and lead to loss of selection on gene maintenance and consequently expression. Supporting this notion, parental ELPs that carried indels in their coding region also displayed increased regulatory sequence variation. Notably, the observation that alleles carrying deletions were expressed at lower levels could simply reflect decreased hybridization signal if the deletion overlapped with the probe. However, except for a minority of loci with large deletions, this was generally not evident from the expression array analyses.

Figure 3

Genomic tiling array analysis of the Eil-0 and Lc-0 genomic DNA hybridized against a tile of the Col-0 genome. Two independent hybridizations were performed for each genotype. For classification of deletions, thresholds were determined by an empirical approach based on the promoter sequencing data described in Figure 2. The deduced settings of a signal drop below 2.8-fold (–1.5 on log2 scale), a minimum run >35 and a maximum gap ≤150 allowed detection of indels >30 bp, but detected neither smaller indels nor SNPs. Examples are shown for tiles of individual sequenced regions. (A) Promoter region of At1g29030. No polymorphisms were observed among Lc-0 or Eil-0 as compared with Col-0 or each other. (B) Promoter region of At1g05830. Sequencing revealed a few dispersed small indels and SNPs between the three genotypes. (C) Promoter region of At1g13650. Sequencing revealed an extended stretch of many small indels and SNPs. (D) Promoter region of At1g33480. Sequencing revealed several small indels and SNPs. Only a 39 bp deletion in Eil-0 is picked up as a positive (red horizontal bars) by our settings. Gene structure is shown at the bottom of each panel (thick yellow blocks, exons; small yellow blocks, UTRs; yellow lines, introns). Difference in hybridization signal between Lc-0 or Eil-0 versus Col-0 along the oligonucleotides representing the tiling path is shown as white vertical bars. Upward deviation from the base line indicates positive hybridization signal, downward deviation negative hybridization signal.
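The thresholds in this legend can be read as a simple run-detection rule. The following is a minimal sketch of that reading, not the authors' software: it assumes that "run" and "gap" are measured in bp, that each probe carries a log2 signal ratio relative to Col-0, and that a probe counts as reduced when its ratio lies strictly below the cut-off. The function name `call_deletions` and the probe tuple layout are hypothetical.

```python
from typing import List, Tuple

LOG2_DROP = -1.5  # signal drop threshold from the legend (about 2.8-fold)
MIN_RUN = 35      # assumed bp: a call must span more than this
MAX_GAP = 150     # assumed bp: largest distance bridged between reduced probes

def call_deletions(probes: List[Tuple[int, int, float]]) -> List[Tuple[int, int]]:
    """Call candidate deletions from probes given as (start, end, log2 ratio).

    Probes must be sorted by start position. Reduced probes separated by at
    most MAX_GAP are merged; merged runs longer than MIN_RUN are reported.
    """
    calls: List[Tuple[int, int]] = []
    run_start = run_end = None
    for start, end, log2_ratio in probes:
        if log2_ratio >= LOG2_DROP:
            continue  # probe does not show the required signal drop
        if run_start is not None and start - run_end <= MAX_GAP:
            run_end = max(run_end, end)  # extend the current run
        else:
            # close the previous run (if any) before starting a new one
            if run_start is not None and run_end - run_start > MIN_RUN:
                calls.append((run_start, run_end))
            run_start, run_end = start, end
    if run_start is not None and run_end - run_start > MIN_RUN:
        calls.append((run_start, run_end))
    return calls
```

Under these assumptions, an isolated reduced probe shorter than the minimum run is ignored, which is in line with the legend's statement that small indels and SNPs are not detected.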

Figure 2

Sequence analysis of regulatory regions of a sample of 61 genes representing parental ELPs with simple inheritance and a control group of 85 genes, which displayed very low variability and differential expression (see Supplementary Materials and methods) in the array experiments ('controls'). For the ELPs with simple inheritance, only genes which perfectly matched predictions (see Figure 1C), and for which at least 10 precise predictions could be made (i.e. loci located in unambiguous chromosome segments in at least five RIL) were included. (A) Summary of sequence analyses of regulatory regions from 61 ELPs with simple inheritance and 85 control genes. Observed total absolute values (tot. line), per gene average values (av. line) and median values (me. line) are indicated. Note that numbers for promoter sequences and 5' leader sequences do not add up to the total, because leader sequences were not defined for all genes investigated. (B) Relative abundance of SNPs (based on total sequence investigated). (C) Relative amount of bp affected by indels (based on total sequence investigated). Asterisks indicate t-test significance between the ELPs with simple inheritance and the control group (*P<0.05; **P<0.01; NS, not significant).

Figure 1

Assessment of ELP heritability by microarray analyses. RIL from a cross between Eil-0 and Lc-0 were genotyped with a set of 79 genome-wide SNP markers (Warthmann et al, 2007), defining the parental origin of chromosome segments. (A) Principles for the assessment of the heritability of ELPs observed between the Eil-0 and Lc-0 parents. Genotyped RIL from the S6 generation were compared with both parents in dye swap replicates. Based on the RIL genotype for a particular chromosome segment as determined by the flanking SNP markers, differential expression of a parental ELP locus on this chromosome segment was not expected in hybridizations of the RIL against the parent from whom the segment was inherited. However, differential expression (>2-fold) was expected in hybridizations against the other parent. ELPs located in regions of ambiguous genotype, i.e. heterozygous regions or segments spanning recombination breakpoints, were omitted from the analysis of that particular RIL. (B) Summary of parental ELP behavior in the hybridizations of the seven RILs (EL lines) against the two parent lines based on the principles outlined in (A). (C) Percentage of parental ELPs matching predictions across all RIL-parent hybridizations at a given frequency (100 or the 10% intervals below).
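The prediction principle in panel (A) and the match frequency in panel (C) can be sketched in a few lines. This is an illustrative sketch only: the function names, the genotype labels for ambiguous segments and the representation of each hybridization as an expression ratio are assumptions, while the >2-fold cut-off is taken from the legend.

```python
from typing import List, Optional, Tuple

FOLD_THRESHOLD = 2.0  # ">2-fold" cut-off stated in the legend
PARENTS = ("Eil-0", "Lc-0")

def expected_differential(segment_genotype: str, reference_parent: str) -> Optional[bool]:
    """Differential expression is expected only against the non-donor parent.

    Returns None for ambiguous segments (e.g. heterozygous or spanning a
    recombination breakpoint), which are omitted from the analysis.
    """
    if segment_genotype not in PARENTS:
        return None
    return segment_genotype != reference_parent

def is_differential(ratio: float) -> bool:
    """True if a positive expression ratio deviates more than 2-fold either way."""
    return max(ratio, 1.0 / ratio) > FOLD_THRESHOLD

def match_fraction(hybridizations: List[Tuple[str, str, float]]) -> Optional[float]:
    """Fraction of scorable RIL-parent hybridizations that match the prediction.

    Each entry is (segment genotype in the RIL, parent hybridized against, ratio).
    """
    scored = matched = 0
    for genotype, parent, ratio in hybridizations:
        expected = expected_differential(genotype, parent)
        if expected is None:
            continue  # ambiguous genotype: not scored for this RIL
        scored += 1
        matched += int(is_differential(ratio) == expected)
    return matched / scored if scored else None
```

A locus with a match fraction of 1.0 across all scorable hybridizations would correspond to the "perfectly matched predictions" class referred to in the Figure 2 legend.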

To independently corroborate our findings, we analyzed reported ELPs with simple inheritance that were found between the Arabidopsis Bay-0 and Sha accessions (West et al, 2006, 2007), using a different microarray platform and a different conceptual approach to extract heritable ELPs. Again, in genome tiling arrays, we observed a strong preponderance of indels in the ELPs with simple inheritance (>6-fold enrichment) as compared with controls. Similar to our results for Eil-0 and Lc-0, the majority of these indels occurred at the level of exons (approximately 33% of loci) or genes (18%).

These findings were corroborated by a recently developed algorithm (Zeller et al, 2008) that identified reduced or absent hybridization signal over extended tracts in an oligonucleotide array-based re-sequencing effort of the Bay-0 and Sha genomes. In nearly all cases, stretches of reduced hybridization coincided with deletions detected by the tiling array approach (Figure 6).

Figure 6

Indel analysis and polymorphic region prediction (PRP) analysis of genes representing ELPs with simple inheritance and controls between the Bay-0 and Sha accessions. (A) Percentage of genes representing ELPs with simple inheritance or controls (same group as in Figure 5) that carry indels in one parent as compared with the other or that display similar gene structure. (B) Detailed classification of genes shown in (A), categories similar to Figure 5B. (C, D) PRP predictions. The graphs (logarithmic scale) represent total PRP size observed in a given gene (in bp, equaling sum of all individual PRPs with respect to the gene model Atxgyyyyy.1, TAIR 7.0 annotation) detected in one accession plotted against the same value for the other accession. Classification of genes is similar to Figure 5B. (C) Genes representing ELPs with simple inheritance. (D) Control genes.

Figure 5

Summary of indel analyses. (A, B) Indel analysis of genes representing parental ELPs between the Eil-0 and Lc-0 accessions. (A) Correlation between strict ELP heritability (matching of hybridization predictions, see Figure 1C) and presence of deletions in the corresponding genes in one of the parents. Percentage of genes in each class displaying structural changes between Eil-0 and Lc-0 ('indels') or not ('similar'). Controls represent an extended group of 97 genes as described in Figure 2. (B) Detailed classification of the parental ELPs and controls shown in (A). None: no indels detected in Eil-0 as compared with Lc-0; introns: indel(s) detected in intron(s) of one parent as compared with the other; UTRs: indel(s) detected in UTR(s) or UTR(s) and intron(s) of one parent as compared with the other; exons: indel(s) detected in exon(s) or exons, UTR(s) and/or intron(s) of one parent as compared with the other; whole gene: >50% of gene deleted or duplicated in one parent as compared with the other. (C) Correlation between the presence of indels in the coding region and increased sequence variation in the corresponding 5' regulatory regions in the parental ELP genes. The quartiles as well as the average (wider line) are indicated. The difference between the two groups is statistically significant (P<0.0390, t-test). (D) Expression microarray hybridization signal distribution of all genes in the Eil-0 versus Lc-0 parent comparison. (E) As in (D), shown for the parental ELPs with simple inheritance.
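The category scheme in panel (B) amounts to assigning each gene to the most severe class of feature affected. A minimal sketch of such a rule follows; it assumes that each detected indel has already been annotated with the gene feature it overlaps and its length in bp, and it approximates the ">50% of gene" criterion by summing indel lengths. The function name `classify_gene` and the input layout are hypothetical.

```python
from typing import List, Tuple

def classify_gene(gene_length: int, indels: List[Tuple[str, int]]) -> str:
    """Assign a gene to one of the panel (B) categories.

    indels: (feature, length in bp) pairs, with feature being one of
    'intron', 'UTR' or 'exon'. The most severe category wins.
    """
    if not indels:
        return "none"
    # Assumption: the summed indel length approximates the affected fraction.
    affected_bp = sum(length for _, length in indels)
    if affected_bp > 0.5 * gene_length:
        return "whole gene"
    features = {feature for feature, _ in indels}
    if "exon" in features:
        return "exons"  # exon hit, with or without UTR/intron hits
    if "UTR" in features:
        return "UTRs"   # UTR hit, with or without intron hits
    return "introns"
```

For example, a 2000 bp gene with a 40 bp intron indel and a 60 bp exon indel would be placed in the "exons" class under this rule.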

In summary, our data suggest that ELPs with simple inheritance in Arabidopsis primarily reflect the consequences of structural differences in the corresponding genes, rather than variation in regulatory elements, even if such variation is observed. Thus, although functional variation in cis-regulatory elements clearly contributes to phenotypic variation (Bentsink et al, 2006; Rus et al, 2006; Sibout et al, 2008), large-effect changes that impact the integrity of transcribed regions should be considered as an equally valid explanation for expression variation. Indeed, the prevalence of indels in ELPs with simple inheritance mirrors the preponderance of indels with drastic effect on gene integrity underlying cloned QTL, suggesting that the latter do not reflect a technical bias in the ease of detection. Thus, Arabidopsis QTL representing more subtle regulatory polymorphisms might be less common than anticipated.

Acknowledgements

We would like to thank Dr K Osmont for helpful comments on the manuscript, O Hagenbüchle and A Paillusson for Affymetrix tiling array hybridizations and E Farmer and P Reymond for the PCR products used to make the custom-spotted DNA microarrays. Contributions: CSH, KH, SP and JW conceived this study and analyzed the data together with DRG and GZ. CSH wrote the manuscript with help from KH, SP, JW, DRG and DW. Recombinant inbred lines were contributed by CSH and SP. All molecular biology experiments except microarray hybridizations were performed by SP. Microarray hybridizations were performed by CN and JT. Statistical analyses of microarray experiments were performed by DRG, JW and GZ. DRG was funded by the Swiss National Science Foundation National Centre for Competence in Research (Plant Survival). This work was supported by the University of Lausanne, by Swiss National Science Foundation Grant 3100A0-107631 to CSH and by the SystemsX 'Plant growth in a changing environment' project funding for CSH.

References

  1. Aukerman MJ, Hirschfeld M, Wester L, Weaver M, Clack T, Amasino RM, Sharrock RA (1997) A deletion in the PHYD gene of the Arabidopsis Wassilewskija ecotype defines a role for phytochrome D in red/far-red light sensing. Plant Cell 9: 1317–1326
  2. Bentsink L, Jowett J, Hanhart CJ, Koornneef M (2006) Cloning of DOG1, a quantitative trait locus controlling seed dormancy in Arabidopsis. Proc Natl Acad Sci USA 103: 17042–17047
  3. Brem RB, Yvert G, Clinton R, Kruglyak L (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296: 752–755
  4. Cowles CR, Hirschhorn JN, Altshuler D, Lander ES (2002) Detection of regulatory variation in mouse genes. Nat Genet 32: 432–437
  5. Doss S, Schadt EE, Drake TA, Lusis AJ (2005) Cis-acting expression quantitative trait loci in mice. Genome Res 15: 681–691
  6. Grant MR, Godiard L, Straube E, Ashfield T, Lewald J, Sattler A, Innes RW, Dangl JL (1995) Structure of the Arabidopsis RPM1 gene enabling dual specificity disease resistance. Science 269: 843–846
  7. GuhaThakurta D, Xie T, Anand M, Edwards SW, Li G, Wang SS, Schadt EE (2006) Cis-regulatory variations: a study of SNPs around genes showing cis-linkage in segregating mouse populations. BMC Genomics 7: 235
  8. Jansen RC, Nap JP (2001) Genetical genomics: the added value from segregation. Trends Genet 17: 388–391
  9. Johanson U, West J, Lister C, Michaels S, Amasino R, Dean C (2000) Molecular analysis of FRIGIDA, a major determinant of natural variation in Arabidopsis flowering time. Science 290: 344–347
  10. Keurentjes JJ, Fu J, Terpstra IR, Garcia JM, van den Ackerveken G, Snoek LB, Peeters AJ, Vreugdenhil D, Koornneef M, Jansen RC (2007) Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc Natl Acad Sci USA 104: 1708–1713
  11. Kliebenstein DJ, Lambrix VM, Reichelt M, Gershenzon J, Mitchell-Olds T (2001) Gene duplication in the diversification of secondary metabolism: tandem 2-oxoglutarate-dependent dioxygenases control glucosinolate biosynthesis in Arabidopsis. Plant Cell 13: 681–693
  12. Koornneef M, Alonso-Blanco C, Vreugdenhil D (2004) Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Biol 55: 141–172
  13. Kroymann J, Donnerhacke S, Schnabelrauch D, Mitchell-Olds T (2003) Evolutionary dynamics of an Arabidopsis insect resistance quantitative trait locus. Proc Natl Acad Sci USA 100 (Suppl 2): 14587–14592
  14. Kroymann J, Textor S, Tokuhisa JG, Falk KL, Bartram S, Gershenzon J, Mitchell-Olds T (2001) A gene controlling variation in Arabidopsis glucosinolate composition is part of the methionine chain elongation pathway. Plant Physiol 127: 1077–1088
  15. Li Y, Alvarez OA, Gutteling EW, Tijsterman M, Fu J, Riksen JA, Hazendonk E, Prins P, Plasterk RH, Jansen RC, Breitling R, Kammenga JE (2006) Mapping determinants of gene expression plasticity by genetical genomics in C. elegans. PLoS Genet 2: e222
  16. Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG (2004) Genetic analysis of genome-wide variation in human gene expression. Nature 430: 743–747
  17. Mouchel CF, Briggs GC, Hardtke CS (2004) Natural genetic variation in Arabidopsis identifies BREVIS RADIX, a novel regulator of cell proliferation and elongation in the root. Genes Dev 18: 700–714
  18. Pastinen T, Hudson TJ (2004) Cis-acting regulatory variation in the human genome. Science 306: 647–650
  19. Petretto E, Mangion J, Dickens NJ, Cook SA, Kumaran MK, Lu H, Fischer J, Maatz H, Kren V, Pravenec M, Hubner N, Aitman TJ (2006) Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet 2: e172
  20. Potokina E, Druka A, Luo Z, Wise R, Waugh R, Kearsey M (2008) Gene expression quantitative trait locus analysis of 16 000 barley genes reveals a complex pattern of genome-wide transcriptional regulation. Plant J 53: 90–101
  21. Ronald J, Brem RB, Whittle J, Kruglyak L (2005) Local regulatory variation in Saccharomyces cerevisiae. PLoS Genet 1: e25
  22. Rus A, Baxter I, Muthukumar B, Gustin J, Lahner B, Yakubova E, Salt DE (2006) Natural variants of AtHKT1 enhance Na+ accumulation in two wild populations of Arabidopsis. PLoS Genet 2: e210
  23. Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS, Mao M, Stoughton RB, Friend SH (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422: 297–302
  24. Sibout R, Plantegenet S, Hardtke CS (2008) Flowering as a condition for xylem expansion in Arabidopsis hypocotyl and root. Curr Biol 18: 458–463
  25. Stranger BE, Nica AC, Forrest MS, Dimas A, Bird CP, Beazley C, Ingle CE, Dunning M, Flicek P, Koller D, Montgomery S, Tavare S, Deloukas P, Dermitzakis ET (2007b) Population genomics of human gene expression. Nat Genet 39: 1217–1224
  26. Werner JD, Borevitz JO, Warthmann N, Trainer GT, Ecker JR, Chory J, Weigel D (2005) Quantitative trait locus mapping and DNA array hybridization identify an FLM deletion as a cause for natural flowering-time variation. Proc Natl Acad Sci USA 102: 2460–2465
  27. West MA, Kim K, Kliebenstein DJ, van Leeuwen H, Michelmore RW, Doerge RW, St Clair DA (2007) Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175: 1441–1450
  28. West MA, van Leeuwen H, Kozik A, Kliebenstein DJ, Doerge RW, St Clair DA, Michelmore RW (2006) High-density haplotyping with microarray-based expression and single feature polymorphism markers in Arabidopsis. Genome Res 16: 787–795
  29. Williams RB, Chan EK, Cowley MJ, Little PF (2007) The influence of genetic variation on gene expression. Genome Res 17: 1707–1716
  30. Zeller G, Clark RM, Schneeberger K, Bohlen A, Weigel D, Ratsch G (2008) Detecting polymorphic regions in Arabidopsis thaliana with resequencing microarrays. Genome Res 18: 918–929
