Levels of gene expression underpin organismal phenotypes1,2, but the nature of selection that acts on gene expression and its role in adaptive evolution remain unknown1,2. Here we assayed gene expression in rice (Oryza sativa)3, and used phenotypic selection analysis to estimate the type and strength of selection on the levels of more than 15,000 transcripts4,5. Variation in most transcripts appears (nearly) neutral or under very weak stabilizing selection in wet paddy conditions (with median standardized selection differentials near zero), but selection is stronger under drought conditions. Overall, more transcripts are conditionally neutral (2.83%) than are antagonistically pleiotropic6 (0.04%), and transcripts that display lower levels of expression and stochastic noise7,8,9 and higher levels of plasticity9 are under stronger selection. Selection strength was further weakly negatively associated with levels of cis-regulation and network connectivity9. Our multivariate analysis suggests that selection acts on the expression of photosynthesis genes4,5, but that the efficacy of selection is genetically constrained under drought conditions10. Drought selected for earlier flowering11,12 and a higher expression of OsMADS18 (Os07g0605200), which encodes a MADS-box transcription factor and is a known regulator of early flowering13—marking this gene as a drought-escape gene11,12. The ability to estimate selection strengths provides insights into how selection can shape molecular traits at the core of gene action.
Raw FASTQ reads for 188 accessions with resequenced genomes were downloaded from the SRA under SRA BioProject accession numbers PRJNA422249 and PRJNA557122. Raw FASTQ reads for a further 27 accessions included in the 3K-RG project were downloaded from the SRA under BioProject accession number PRJEB6180. RNA sequence data that support the findings of this study have been deposited under SRA BioProject accession number PRJNA588478. Processed RNA expression count data have been deposited in Zenodo (https://zenodo.org/record/3533431 with DOI 10.5281/zenodo.3533431), alongside a sample metadata file with a key to the RNA sequence data in SRA BioProject accession number PRJNA588478. This key can also be found in Supplementary Table 4. Source Data for Figs. 1–4 and Extended Data Figs. 1–8 are provided with the paper.
Selection analyses were run using custom-made scripts in Python version 2.7, which are available in Supplementary Notes 1, 2, and on GitHub in repositories icalic/Linear-regression-analysis (https://github.com/icalic/Linear-regression-analysis.git) and icalic/Logistic-regression-analysis (https://github.com/icalic/Logistic-regression-analysis.git). For all other analyses we used previously developed, publicly available software and code: leaf area was assessed using ImageJ v.1.52 and GIMP v.2.10.0; RNA-seq data were processed and analysed using Drop-seq tools v.1.12, STAR aligner v.020201, Picard tools v.2.9.0, DChip v.2010.01 and R v.3.4.3 packages edgeR v.3.14 and lme4 v.1.1; gene-set enrichment analyses were performed using PlantGSEA v.1; statistical analyses were performed in R v.3.4.3, further using packages lme4 v.1.1 and corpcor v.1.6.9; and genome analyses were performed using bbduk v.37.66, bwa-mem v.0.7.16a-r1181, the GATK GenotypeGVCFs engine v.3.8-0-ge9d806836, vcftools v.0.1.15, jvarkit suite v.1, Beagle v.4.1, plink v.1.9 and GAPIT v.3.
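As a rough illustration of the kind of calculation a linear-regression selection analysis performs, the sketch below estimates a standardized linear selection differential S for one transcript following the general approach of Lande and Arnold: fitness is divided by its mean to give relative fitness, the trait is standardized to zero mean and unit variance, and S is the covariance between the two. This is not the authors' code (their scripts are in the repositories listed above and target Python 2.7); the function name and the toy numbers are hypothetical, and the sketch assumes Python 3 with only the standard library.

```python
from statistics import mean, pstdev


def linear_selection_differential(trait, fitness):
    """Return the standardized linear selection differential S = cov(w, z).

    trait   -- expression values of one transcript, one per individual
    fitness -- absolute fitness values (e.g. fecundity), same order
    """
    if len(trait) != len(fitness) or len(trait) < 2:
        raise ValueError("trait and fitness need equal length of at least 2")

    mean_fitness = mean(fitness)
    trait_sd = pstdev(trait)  # population standard deviation
    if mean_fitness == 0 or trait_sd == 0:
        raise ValueError("mean fitness and trait variance must be non-zero")

    # Relative fitness has mean 1; the standardized trait has mean 0, variance 1.
    w = [f / mean_fitness for f in fitness]
    trait_mean = mean(trait)
    z = [(x - trait_mean) / trait_sd for x in trait]

    # Because mean(z) = 0, cov(w, z) reduces to mean(w * z). With var(z) = 1
    # this also equals the slope of a least-squares regression of w on z.
    return mean([wi * zi for wi, zi in zip(w, z)])


# Hypothetical toy data: higher expression goes with higher fecundity.
toy_expression = [1.0, 2.0, 3.0, 4.0]
toy_fecundity = [1.0, 2.0, 3.0, 4.0]
S = linear_selection_differential(toy_expression, toy_fecundity)
```

A positive S indicates that individuals with higher expression had higher relative fitness, and a value near zero is consistent with little or no directional selection on that transcript.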
Fay, J. C. & Wittkopp, P. J. Evaluating the role of natural selection in the evolution of gene regulation. Heredity 100, 191–199 (2008).
Romero, I. G., Ruvinsky, I. & Gilad, Y. Comparative studies of gene expression and the evolution of gene regulation. Nat. Rev. Genet. 13, 505–516 (2012).
Wing, R. A., Purugganan, M. D. & Zhang, Q. The rice genome revolution: from an ancient grain to green super rice. Nat. Rev. Genet. 19, 505–517 (2018).
Kingsolver, J. G. et al. The strength of phenotypic selection in natural populations. Am. Nat. 157, 245–261 (2001).
Lande, R. & Arnold, S. J. The measurement of selection on correlated characters. Evolution 37, 1210–1226 (1983).
Anderson, J. T., Lee, C. R., Rushworth, C. A., Colautti, R. I. & Mitchell-Olds, T. Genetic trade-offs and conditional neutrality contribute to local adaptation. Mol. Ecol. 22, 699–708 (2013).
Lemos, B., Bettencourt, B. R., Meiklejohn, C. D. & Hartl, D. L. Evolution of proteins and gene expression levels are coupled in Drosophila and are independently associated with mRNA abundance, protein length, and number of protein–protein interactions. Mol. Biol. Evol. 22, 1345–1354 (2005).
Lehner, B. Selection to minimise noise in living systems and its implications for the evolution of gene expression. Mol. Syst. Biol. 4, 170 (2008).
MacNeil, L. T. & Walhout, A. J. Gene regulatory networks and the role of robustness and stochasticity in the control of gene expression. Genome Res. 21, 645–657 (2011).
Conner, J. & Via, S. Natural selection on body size in Tribolium: possible genetic constraints on adaptive evolution. Heredity 69, 73–83 (1992).
Franks, S. J. Plasticity and evolution in drought avoidance and escape in the annual plant Brassica rapa. New Phytol. 190, 249–257 (2011).
Kumar, A. et al. Breeding high-yielding drought-tolerant rice: genetic variations and conventional and molecular approaches. J. Exp. Bot. 65, 6265–6278 (2014).
Fornara, F. et al. Functional characterization of OsMADS18, a member of the AP1/SQUA subfamily of MADS box genes. Plant Physiol. 135, 2207–2219 (2004).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Ayroles, J. F. et al. Systems genetics of complex traits in Drosophila melanogaster. Nat. Genet. 41, 299–307 (2009).
Conner, J. Field measurements of natural and sexual selection in the fungus beetle, Bolitotherus cornutus. Evolution 42, 736–749 (1988).
Hoekstra, H. E. et al. Strength and tempo of directional selection in the wild. Proc. Natl Acad. Sci. USA 98, 9157–9160 (2001).
Nourmohammad, A. et al. Adaptive evolution of gene expression in Drosophila. Cell Rep. 20, 1385–1395 (2017).
Ghalambor, C. K. et al. Non-adaptive plasticity potentiates rapid adaptive evolution of gene expression in nature. Nature 525, 372–375 (2015).
Kenkel, C. D. & Matz, M. V. Gene expression plasticity as a mechanism of coral adaptation to a variable environment. Nat. Ecol. Evol. 1, 0014 (2016).
Zhang, L. & Li, W. H. Mammalian housekeeping genes evolve more slowly than tissue-specific genes. Mol. Biol. Evol. 21, 236–239 (2004).
Hendry, A. P. & Kinnison, M. T. The pace of modern life: measuring rates of contemporary microevolution. Evolution 53, 1637–1653 (1999).
Duveau, F. et al. Fitness effects of altering gene expression noise in Saccharomyces cerevisiae. eLife 7, e37272 (2018).
Jimenez-Gomez, J. M., Corwin, J. A., Joseph, B., Maloof, J. N. & Kliebenstein, D. J. Genomic analysis of QTLs and genes altering natural variation in stochastic noise. PLoS Genet. 7, e1002295 (2011).
Plessis, A. et al. Multiple abiotic stimuli are integrated in the regulation of rice gene expression under field conditions. eLife 4, e08411 (2015).
Wilkins, O. et al. EGRINs (environmental gene regulatory influence networks) in rice that function in the response to water deficit, high temperature, and agricultural environments. Plant Cell 28, 2365–2384 (2016).
Huang, X. et al. Genome-wide association study of flowering time and grain yield traits in a worldwide collection of rice germplasm. Nat. Genet. 44, 32–39 (2011).
Wang, Y. et al. Background-independent quantitative trait loci for drought tolerance identified using advanced backcross introgression lines in rice. Crop Sci. 53, 430–441 (2013).
Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034.e6 (2019).
Zaidem, M. L., Groen, S. C. & Purugganan, M. D. Evolutionary and ecological functional genomics, from lab to the wild. Plant J. 97, 40–55 (2019).
Keurentjes, J. J. et al. Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc. Natl Acad. Sci. USA 104, 1708–1713 (2007).
Caicedo, A. L. et al. Genome-wide patterns of nucleotide polymorphism in domesticated rice. PLoS Genet. 3, e163 (2007).
Garris, A. J., Tai, T. H., Coburn, J., Kresovich, S. & McCouch, S. Genetic structure and diversity in Oryza sativa L. Genetics 169, 1631–1638 (2005).
Gutaker, R. M. et al. Genomic history and ecology of the geographic spread of rice. Preprint at bioRxiv https://doi.org/10.1101/748178 (2019).
McCouch, S. R. et al. Open access resources for genome-wide association mapping in rice. Nat. Commun. 7, 10532 (2016).
McNally, K. L. et al. Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. Proc. Natl Acad. Sci. USA 106, 12273–12278 (2009).
Torres, R. O., McNally, K. L., Cruz, C. V., Serraj, R. & Henry, A. Screening of rice genebank germplasm for yield and selection of new drought tolerance donors. Field Crops Res. 147, 12–22 (2013).
Wang, W. et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557, 43–49 (2018).
Abramoff, M. D., Magalhaes, P. J. & Ram, S. J. Image processing with ImageJ. Biophoton. Int. 11, 36–42 (2004).
Bracken, B. Barcoded plate-based single cell RNA-seq. https://www.protocols.io/view/barcoded-plate-based-single-cell-rna-seq-nkgdctw (2018).
Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
Soumillon, M., Cacchiarelli, D., Semrau, S., van Oudenaarden, A. & Mikkelsen, T. S. Characterization of directed differentiation by high-throughput single-cell RNA-seq. Preprint at bioRxiv https://doi.org/10.1101/003236 (2014).
Li, C. & Wong, W. H. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc. Natl Acad. Sci. USA 98, 31–36 (2001).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
R Core Team. R: a language and environment for statistical computing. http://www.R-project.org/ (R Foundation for Statistical Computing, Vienna, 2016).
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
Bates, D., Maechler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
Yi, X., Du, Z. & Su, Z. PlantGSEA: a gene set enrichment analysis toolkit for plant community. Nucleic Acids Res. 41, W98–W103 (2013).
Brodie, E. D. III, Moore, A. J. & Janzen, F. J. Visualizing and quantifying natural selection. Trends Ecol. Evol. 10, 313–318 (1995).
Janzen, F. J. & Stern, H. S. Logistic regression for empirical studies of multivariate selection. Evolution 52, 1564–1571 (1998).
Koenig, W. D., Albano, S. S. & Dickinson, J. L. A comparison of methods to partition selection acting via components of fitness: do larger male bullfrogs have greater hatching success? J. Evol. Biol. 4, 309–320 (1991).
Kassambara, A. Practical Guide to Principal Component Methods in R: PCA, M(CA), FAMD, MFA, HCPC, factoextra (STHDA, 2017).
Davidson, R. M. et al. Comparative transcriptomics of three Poaceae species reveals patterns of gene expression evolution. Plant J. 71, 492–502 (2012).
Yanai, I. et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 21, 650–659 (2005).
Schäfer, J. & Strimmer, K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat. Appl. Genet. Mol. Biol. 4, Article32 (2005).
Larracuente, A. M. et al. Evolution of protein-coding genes in Drosophila. Trends Genet. 24, 114–123 (2008).
Keren, L. et al. Noise in gene expression is coupled to growth rate. Genome Res. 25, 1893–1902 (2015).
Hieno, A. et al. ppdb: plant promoter database version 3.0. Nucleic Acids Res. 42, D1188–D1192 (2014).
Yamamoto, Y. Y. et al. Identification of plant promoter constituents by analysis of local distribution of short sequences. BMC Genomics 8, 67 (2007).
Weirauch, M. T. et al. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158, 1431–1443 (2014).
Proost, S. et al. PLAZA: a comparative genomics resource to study gene and genome evolution in plants. Plant Cell 21, 3718–3731 (2009).
Van Bel, M. et al. PLAZA 4.0: an integrative resource for functional, evolutionary and comparative plant genomics. Nucleic Acids Res. 46, D1190–D1196 (2018).
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
Van der Auwera, G. A. et al. From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Browning, B. L. & Browning, S. R. Genotype imputation with millions of reference samples. Am. J. Hum. Genet. 98, 116–126 (2016).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Tropf, F. C. et al. Human fertility, molecular genetics, and natural selection in modern societies. PLoS ONE 10, e0126821 (2015).
Lipka, A. E. et al. GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397–2399 (2012).
VanRaden, P. M. Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423 (2008).
Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
Kang, H. M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008).
Segura, V. et al. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat. Genet. 44, 825–830 (2012).
Bland, J. M. & Altman, D. G. Multiple significance tests: the Bonferroni method. Br. Med. J. 310, 170 (1995).
Fournier-Level, A. et al. A map of local adaptation in Arabidopsis thaliana. Science 334, 86–89 (2011).
Huang, X. et al. Genome-wide association studies of 14 agronomic traits in rice landraces. Nat. Genet. 42, 961–967 (2010).
Mather, K. A. et al. The extent of linkage disequilibrium in rice (Oryza sativa L.). Genetics 177, 2223–2232 (2007).
Zhao, K. et al. Genome-wide association mapping reveals a rich genetic architecture of complex traits in Oryza sativa. Nat. Commun. 2, 467 (2011).
We thank B. U. Principe, P. C. Maturan and L. Holongbayan for assistance with field management, tissue sampling and trait measurements; the staff of IRRI’s Climate Unit for providing weather data; Z. Fresquez for help with tissue processing; L. Harshman for assistance with a pilot RNA-seq run; the New York University Center for Genomics and Systems Biology GenCore Facility for sequencing support; and New York University High Performance Computing for supplying computational resources. We are grateful to current and former members of the Purugganan laboratory (particularly J. Flowers, R. Gutaker, A. Plessis, O. Wilkins and M. Zaidem) and the IRRI Strategic Innovation and Rice Breeding research platforms (particularly S. Dixit, A. Kohli, Y. Ludwig, K. McNally, R. Oliva, V. Roman-Reyna and N. Tsakirpaloglou) for insightful discussions; M. Quintana for sharing scripts in R; and S. Zaaijer for codesigning the figures. This work was funded in part by grants from the Zegar Family Foundation, the National Science Foundation Plant Genome Research Program and the NYU Abu Dhabi Research Institute to M.D.P., a fellowship from the Natural Sciences and Engineering Research Council of Canada through Grant PDF-502464-2017 to Z.J.-L., and a fellowship from the Gordon and Betty Moore Foundation/Life Sciences Research Foundation through Grant GBMF2550.06 to S.C.G.
The authors declare no competing interests.
Peer review information Nature thanks Anthony Greenberg, Detlef Weigel and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Geographical origins of 220 O. sativa accessions, of which 4 constitute additionally replicated checks (Supplementary Table 1). Seven accessions that are not from Eurasia or Africa are not shown. Varietal group (vg.) Indica accessions are indicated in indigo and vg. Japonica accessions are indicated in jade. Map data ©2019 Google. b, Populations of Indica and Japonica accessions (planted in triplicate alongside one another) were monitored for total lifetime fitness in wet (magenta) and dry (blue) fields. Both fields had identical layouts. Numbers reflect Indica populations with 3 × 136 accessions = 408 individuals planted in each field; Extended Data Fig. 3 shows Japonica populations. Under drought conditions, both multiplicative fitness components (flowering success (lime) and fecundity (green)) were relevant (multiplying to total lifetime fitness), but in wet conditions only the latter was relevant (fecundity equating to total lifetime fitness, magenta). c, Drought exerts truncating selection on the populations (declining and shifting blue versus magenta bar), and end-of-season was reached earlier under drought conditions. d, Cumulative rainfall shows one major rainfall event that caused the rainout shelter over the dry field to close temporarily after the start of the drought treatment and the sampling of leaf tissue for RNA sequencing (>51 DAS). e, During the period of flowering (>51 DAS), there was an increasing deficit in soil water potential. f, g, Patterns of volumetric soil moisture and vapour pressure deficit (VPD) were consistent with the pattern of soil water potential. Lighter shades of grey in f indicate deeper layers of soil. Grey and mustard lines in g indicate the VPD in the wet and dry field, respectively. h, Day length increased over the course of the experiment. i, Air temperature generally increased over the course of the experiment (grey and mustard lines indicate the wet and dry field, respectively).
Extended Data Fig. 2 Systems genetics of gene expression in the Indica populations in wet and dry field environments.
a, Environmental bias for transcript expression. Magenta and blue dots represent transcripts showing a 1.5-fold difference in expression between the wet and dry field environments, respectively. ANOVA, Indica environment FDR-adjusted q < 0.001, n = 136 accessions. b, Distribution of cross-environment genetic correlations (rWD) for transcripts showing significant (blue) genotype × environment (G × E) variance. ANOVA, Indica genotype × environment FDR-adjusted q < 0.001, n = 136 accessions.
Extended Data Fig. 3 Systems genetics of gene expression in the Japonica populations in wet and dry field environments.
a, Monitoring the Japonica populations, with 3 × 84 accessions = 252 individuals planted in both the wet and dry fields, for flowering success, fecundity fitness and total lifetime fitness (legend as in Extended Data Fig. 1b, c). b, Environmental bias for transcript expression. Magenta and blue dots represent transcripts showing a 1.5-fold difference in expression between the wet and dry field environments, respectively. ANOVA, Japonica environment FDR-adjusted q < 0.01, n = 84 accessions. c, Distribution of broad-sense heritabilities (H2) for transcripts with significant expression polymorphism. ANOVA, Japonica genotype FDR-adjusted q < 0.01, n = 84 accessions. d, Distribution of cross-environment genetic correlations (rWD) for transcripts showing significant (blue) genotype × environment (G × E) variance. ANOVA, Japonica genotype × environment FDR-adjusted q < 0.01, n = 84 accessions.
Extended Data Fig. 4 The strength and pattern of selection on Indica rice-leaf transcript levels under drought conditions differ across fitness components.
a, The strength of selection |S| on gene expression differed between selection for flowering success (lime), and fecundity (green) in the dry field. Mann–Whitney U-test, two-sided P < 0.001, n = 15,343. b, Positive directional selection (n = 11,304) was stronger than negative selection (n = 4,039) for fecundity under drought (green) (Mann–Whitney U-test, two-sided P < 0.001), and selection for flowering success showed higher absolute values (Kolmogorov–Smirnov test, two-sided P < 0.001, n = 15,343). c, Patterns of quadratic selection differed significantly for the two fitness components. Kolmogorov–Smirnov test, two-sided P < 0.001, n = 15,343. d, Patterns of conditional neutrality (light grey) and antagonistic pleiotropy (lime and green for transcripts beneficial for flowering success and fecundity, respectively) for gene expression under drought conditions. Black indicates transcripts that experienced selection in the same direction for both fitness components.
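The categories in panel d can be made concrete with a small classifier. The sketch below applies the usual textbook definitions: a transcript is antagonistically pleiotropic when selection is significant in both contexts but in opposite directions, conditionally neutral when selection is significant in only one context, and concordant when it is significant in the same direction in both. The function name, the label strings and the significance cut-off `alpha` are assumptions for illustration; the exact criteria the authors applied are not stated in this section.

```python
def classify_selection_pattern(s_a, p_a, s_b, p_b, alpha=0.05):
    """Classify one transcript from its selection differentials in two contexts.

    s_a, s_b -- selection differentials in context A and B (for example two
                fitness components, or two field environments)
    p_a, p_b -- the matching P values
    alpha    -- hypothetical significance cut-off
    """
    significant_a = p_a < alpha
    significant_b = p_b < alpha

    if significant_a and significant_b:
        # Opposite signs mean the transcript is favoured in one context
        # and disfavoured in the other.
        if s_a * s_b < 0:
            return "antagonistic pleiotropy"
        return "concordant selection"
    if significant_a or significant_b:
        # Under selection in one context, effectively neutral in the other.
        return "conditional neutrality"
    return "no detectable selection"


# Hypothetical values, not taken from the study.
example = classify_selection_pattern(0.12, 0.001, -0.09, 0.01)
```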
Extended Data Fig. 5 Stochastic expression noise and transcript connectivity limit the efficacy of selection on gene expression.
a, b, Partial correlation analyses of factors that negatively (grey) and positively (mustard) influence the strength of selection |S| on gene expression for flowering success (a) and fecundity (b) fitness in dry conditions. Dots indicate statistical significance of Pearson’s partial r (t-test, two-sided P < 0.05, n = 14,753) (Supplementary Table 14). c, Global expression stochasticity limits fecundity under drought conditions. Spearman’s ρ = −0.174, t-test, two-sided P = 0.042, n = 136 accessions. d, As in wet conditions, |S| is bounded by expression connectivity under drought conditions. Kruskal–Wallis test, P = 0.0008, n = 12,502 transcripts. Left, box plot with centre line = median, cross = mean, box limits = upper and lower quartiles, whiskers = 1.5 × interquartile range, points = outliers. Right, mean ± s.e.m. e, In dry as well as in wet conditions, |S| is limited by gene regulatory constraints as assessed through the number of cis-regulatory elements in the promoter (n = 3,907 transcripts, Mann–Whitney U-test, two-sided P = 0.000015), and the number of transcription factors regulating a gene (n = 2,905 transcripts, Mann–Whitney U-test, two-sided P = 0.0027) illustrated for selection for total lifetime fitness under drought. Left, boxes and whiskers as in d. Right, mean ± s.e.m.
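For readers unfamiliar with partial correlation, the sketch below shows the simplest case: the correlation between two variables after removing the linear effect of a third, using the first-order formula r_xy.z = (r_xy − r_xz·r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)). The analysis in the caption involves many factors at once, and the code availability statement lists the R package corpcor for it, so this three-variable version is only a simplified stand-in. The function names and toy vectors are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs need equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical toy vectors: z is uncorrelated with x and y, so the
# partial correlation equals the plain correlation.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
z = [1.0, -1.0, -1.0, 1.0]
r_partial = partial_r(x, y, z)
```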
Extended Data Fig. 6 Distributions of transcript–trait correlations for the three higher-level traits measured in the dry field environment.
a, Absolute Pearson’s correlations |r| of transcripts with leaf area (green). n = 15,635 transcripts. The cloud delineates transcripts (listed) that show significant linear or quadratic selection differentials for fecundity under drought conditions, and significant correlations with leaf area (Supplementary Text). b, Absolute Pearson’s correlations |r| of transcripts with chlorophyll concentration (green). n = 15,635 transcripts. The cloud delineates a transcript that shows a significant quadratic selection differential for fecundity under drought conditions, and a significant correlation with chlorophyll concentration (Supplementary Text). c, Absolute Pearson’s correlations |r| of transcripts with flowering time (lime). n = 15,635 transcripts. The cloud delineates transcripts (listed) that show significant linear selection differentials for flowering success under drought conditions, and significant correlations with early flowering (Supplementary Text).
Extended Data Fig. 7 Genome-wide association mapping of the genetic architecture of transcripts that covary significantly with fitness in the Indica population under drought conditions.
Three out of eight transcripts are partially controlled by trans-eQTLs (illustrated for expression of the glycine-rich family protein-coding gene Os11g0209000 under drought conditions). Supplementary Table 27 provides results for other transcripts and for expression principal components or eigengenes as suites of transcripts. a, PCA of 179,634 SNP markers from the Indica population that were selected for analysis; the three principal components, plus a fourth, were included as cofactors in the multi-locus linear mixed model. b, Distribution of expected versus observed P values for associations between SNP markers and Os11g0209000 expression in a Q–Q plot. n = 131 genotypes; multi-locus linear mixed model, two-sided, Bonferroni-adjusted P < 0.05 for 179,634 SNP markers. c, The Manhattan plot indicates two significant trans-eQTL peaks for expression of Os11g0209000 (gene location indicated with vertical red bar). Only the top approximately 5% of SNPs (10,000 SNPs) are shown.
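The Bonferroni adjustment mentioned in panel b can be read as a per-SNP threshold: with 179,634 markers and a family-wise level of 0.05, an unadjusted P value has to fall below roughly 2.8 × 10⁻⁷ to count as significant. The short sketch below makes that arithmetic explicit. The function name is hypothetical, and the assumption that the adjustment is a plain division by the number of markers is ours.

```python
from math import log10


def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


n_snps = 179_634  # number of SNP markers given in the caption
threshold = bonferroni_threshold(0.05, n_snps)

# The same threshold on the -log10(P) scale used for Manhattan plots.
manhattan_line = -log10(threshold)
```

On a Manhattan plot this corresponds to a horizontal line at a −log10(P) of about 6.56.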
Extended Data Fig. 8 Genome-wide association mapping for fitness in the wet and dry field environments.
Taking the top approximately 0.5% of SNPs (1,000 SNPs) with the strongest association to total lifetime fitness in the wet (magenta) and dry (blue) field conditions after genome-wide association mapping, we observed no enrichment for transcripts (n = 809 and 142 transcripts in the wet and dry fields, respectively) that were expressed in the leaves and had significant linear selection differentials S (n = 408 plants, t-test, two-sided, unadjusted P < 0.05) among transcripts (n = 1,960 transcripts in the wet field and n = 1,671 transcripts in the dry field) from genes in 100-kb regions surrounding these SNPs, compared to transcripts from genes in other genomic regions (χ2, not significant (ns); two-sided P = 0.862 for the wet field and P = 0.85 for the dry field). Supplementary Table 27 provides genome-wide association mapping results for total lifetime fitness in wet and dry conditions, and for flowering success and fecundity under drought conditions.
This file contains Supplementary Text and References, and Supplementary Notes 1-2
Supplementary Table 5 | Systems genetics analysis of variance in the transcriptome of the Indica population in wet and dry field conditions
Supplementary Table 6 | Gene set enrichment analysis of transcripts showing environmentally biased expression patterns in the Indica population
Supplementary Table 7 | Systems genetics analysis of variance in the transcriptome of the Japonica population in wet and dry field conditions
Supplementary Table 8 | Gene set enrichment analysis of transcripts showing environmentally biased expression patterns in the Japonica population
Supplementary Table 10 | Selection differentials for the Indica population across field environments and fitness components
Supplementary Table 11 | Gene set enrichment analyses on the tails of the distributions of |S| for the Indica population across field environments and fitness components
Supplementary Table 13 | Metadata per transcript of factors that may influence the strength of selection on gene expression
Supplementary Table 15 | Global levels of stochastic expression noise per Indica accession in each of the two field environments
Supplementary Table 16 | Global levels of gene expression plasticity per Indica accession in each of the two field environments
Supplementary Table 17 | Metadata per transcript and analysis of gene regulatory network factors that may influence the strength of selection on gene expression
Supplementary Table 18 | Principal components/eigengenes as suites of transcripts for multivariate selection analyses for the Indica population in wet and dry field conditions
Supplementary Table 19 | Statistical analyses for the higher-level traits measured in the Indica and Japonica populations in wet and dry field conditions
Supplementary Table 20 | Multivariate selection analyses on the higher-level traits for the Indica and Japonica populations in wet and dry field conditions
Supplementary Table 21 | Gene set term enrichment analyses on the tails of the distributions of principal components for the transcriptomes of the Indica population across field environments and fitness components
Supplementary Table 23 | Strength of selection on genes grouped by gene ontology biological process for the Indica population in the two field environments
Supplementary Table 26 | Principal component (PC) loadings for SNPs on PCs included as cofactors in genome-wide association mapping
Supplementary Table 27 | Genome-wide association mapping of fitness and (suites of) transcripts under selection in the Indica population across field environments
About this article
Cite this article
Groen, S.C., Ćalić, I., Joly-Lopez, Z. et al. The strength and pattern of natural selection on gene expression in rice. Nature 578, 572–576 (2020). https://doi.org/10.1038/s41586-020-1997-2