Abstract
Here we report a multi-tissue gene expression resource that represents the genotypic and phenotypic diversity of modern inbred maize, and includes transcriptomes in an average of 255 lines in seven tissues. We mapped expression quantitative trait loci and characterized the contribution of rare genetic variants to extremes in gene expression. Some of the new mutations that arise in the maize genome can be deleterious; although selection acts to keep deleterious variants rare, their complete removal is impeded by genetic linkage to favourable loci and by finite population size (refs 1–4). Modern maize breeders have systematically reduced the effects of this constant mutational pressure through artificial selection and self-fertilization, which have exposed rare recessive variants in elite inbred lines (ref. 5). However, the ongoing effect of these rare alleles on modern inbred maize is unknown. By analysing this gene expression resource and exploiting the extreme diversity and rapid linkage disequilibrium decay of maize (ref. 6), we characterize the effect of rare alleles and evolutionary history on the regulation of expression. Rare alleles are associated with the dysregulation of expression, and we correlate this dysregulation to seed-weight fitness. We find enrichment of ancestral rare variants among expression quantitative trait loci mapped in modern inbred lines, which suggests that historic bottlenecks have shaped regulation. Our results suggest that one path for further genetic improvement in agricultural species lies in purging the rare deleterious variants that have been associated with crop fitness.
References
Kimura, M., Maruyama, T. & Crow, J. F. The mutation load in small populations. Genetics 48, 1303–1312 (1963)
Marth, G. T. et al. The functional spectrum of low-frequency coding variation. Genome Biol. 12, R84 (2011)
Henn, B. M., Botigué, L. R., Bustamante, C. D., Clark, A. G. & Gravel, S. Estimating the mutation load in human genomes. Nat. Rev. Genet. 16, 333–343 (2015)
Gibson, G. Rare and common variants: twenty arguments. Nat. Rev. Genet. 13, 135–145 (2012)
Troyer, A. F. A retrospective view of corn genetic resources. J. Hered. 81, 17–24 (1990)
Remington, D. L. et al. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc. Natl Acad. Sci. USA 98, 11479–11484 (2001)
Kono, T. J. Y. et al. The role of deleterious substitutions in crop genomes. Mol. Biol. Evol. 33, 2307–2317 (2016)
Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009)
Li, X. et al. Transcriptome sequencing of a large human family identifies the impact of rare noncoding variants. Am. J. Hum. Genet. 95, 245–256 (2014)
Zhao, J. et al. A burden of rare variants associated with extremes of gene expression in human peripheral blood. Am. J. Hum. Genet. 98, 299–309 (2016)
Jiao, Y. et al. Genome-wide genetic changes during modern breeding of maize. Nat. Genet. 44, 812–815 (2012)
Gore, M. A. et al. A first-generation haplotype map of maize. Science 326, 1115–1117 (2009)
Tenaillon, M. I. et al. Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc. Natl Acad. Sci. USA 98, 9161–9166 (2001)
Vigouroux, Y. et al. Rate and pattern of mutation at microsatellite loci in maize. Mol. Biol. Evol. 19, 1251–1260 (2002)
Beissinger, T. M. et al. Recent demography drives changes in linked selection across the maize genome. Nat. Plants 2, 16084 (2016)
Duvick, D. N. The contribution of breeding to yield advances in maize (Zea mays L.). Adv. Agron. 86, 83–145 (2005)
Troyer, A. F. & Wellin, E. J. Heterosis decreasing in hybrids: yield test inbreds. Crop Sci. 49, 1969–1976 (2009)
Flint-Garcia, S. A. et al. Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J. 44, 1054–1064 (2005)
Eveland, A. L., McCarty, D. R. & Koch, K. E. Transcript profiling by 3′-untranslated region sequencing resolves expression of gene families. Plant Physiol. 146, 32–44 (2008)
Lohman, B. K., Weber, J. N. & Bolnick, D. I. Evaluation of TagSeq, a reliable low-cost alternative for RNAseq. Mol. Ecol. Resour. 16, 1315–1321 (2016)
Bukowski, R. et al. Construction of the third generation Zea mays haplotype map. Gigascience https://doi.org/10.1093/gigascience/gix134 (2017)
Romay, M. C. et al. Comprehensive genotyping of the USA national maize inbred seed bank. Genome Biol. 14, R55 (2013)
Yao, H., Dogra Gray, A., Auger, D. L. & Birchler, J. A. Genomic dosage effects on heterosis in triploid maize. Proc. Natl Acad. Sci. USA 110, 2665–2669 (2013)
Josephs, E. B., Lee, Y. W., Stinchcombe, J. R. & Wright, S. I. Association mapping reveals the role of purifying selection in the maintenance of genomic variation in gene expression. Proc. Natl Acad. Sci. USA 112, 15390–15395 (2015)
Gout, J.-F., Kahn, D., Duret, L. & Paramecium Post-Genomics Consortium. The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution. PLoS Genet. 6, e1000944 (2010)
Hufford, M. B. et al. Comparative population genomics of maize domestication and improvement. Nat. Genet. 44, 808–811 (2012)
Hung, H.-Y. et al. The relationship between parental genetic or phenotypic divergence and progeny variation in the maize nested association mapping population. Heredity 108, 490–499 (2012)
Rodgers-Melnick, E. et al. Recombination in diverse maize is stable, predictable, and associated with genetic load. Proc. Natl Acad. Sci. USA 112, 3823–3828 (2015)
Wan, C. Y. & Wilkins, T. A. A modified hot borate method significantly enhances the yield of high-quality RNA from cotton (Gossypium hirsutum L.). Anal. Biochem. 223, 7–12 (1994)
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014)
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013)
Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015)
Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010)
Money, D. et al. LinkImpute: fast and accurate genotype imputation for nonmodel organisms. G3 5, 2383–2390 (2015)
Bradbury, P. J. et al. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635 (2007)
Swarts, K. et al. Novel methods to optimize genotypic imputation for low-coverage, next-generation sequence data in crop plants. Plant Genome 7, https://doi.org/10.3835/plantgenome2014.05.0023 (2014)
Ramu, P. et al. Cassava haplotype map highlights fixation of deleterious mutations during clonal propagation. Nat. Genet. 49, 959–963 (2017)
Stegle, O., Parts, L., Durbin, R. & Winn, J. A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLOS Comput. Biol. 6, e1000770 (2010)
Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012)
Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010)
Kiesselbach, T. A. The Structure and Reproduction of Corn (Cold Spring Harbor Laboratory, 1999)
Acknowledgements
We thank J. Pardo, J. Wallace, R. Punna, K. Shirasawa and S. Miller for assistance with tissue collection; J. Budka and G. Inzinna for field and greenhouse assistance; R. Bukowski for running the maize HapMap genotyping pipeline; L. Johnson and Z. Miller for database curation; G. Gibson, M. Wolfe, J.-L. Jannink, M. Hufford and J. Ross-Ibarra for discussions; P. Schweitzer, J. Mosher, A. Tate, J. Mattison, M. Magallanes-Lundback, I. Holländer and D. Daujotyte for guidance on RNA extraction, library preparation automation and sequencing; and S. Miller for copy-editing. This work was supported by the US Department of Agriculture–Agricultural Research Service and the National Science Foundation grants IOS-0922493 and IOS-1238014 to E.S.B. The National Science Foundation Graduate Research Fellowship Program grant DGE-1650441 and the Section of Plant Breeding and Genetics at Cornell University provided support to K.A.G.K. The Taiwanese Ministry of Science and Technology Overseas Project for Post Graduate Research grant 104-2917-I-564-015 supported S.-Y.C.
Author information
Contributions
K.A.G.K. and E.S.B. designed the experiments and wrote the manuscript. K.A.G.K. performed the analyses and made the RNA-seq libraries. K.A.G.K., S.-Y.C. and M.-H.S. extracted RNA. N.K.L. and K.A.G.K. managed germplasm and plants. M.C.R., K.L.S. and A.L. produced and imputed HapMap genotypic data. P.J.B. implemented matrixEQTL in Java/TASSEL. F.L. implemented SNP calling from RNA-seq data.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
Reviewer Information Nature thanks N. Springer and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Figure 1 Tissues that were expression profiled by 3′ RNA-seq.
See additional details regarding tissue collection in Methods. Illustrations inspired by ref. 41.
Extended Data Figure 2 Higher numbers of rare alleles are upstream of genes in extreme-expressing individuals, for the most highly expressed genes.
Quadratic regression of the expression rank of each line, for each of the top 5,000 most-expressed genes versus the average local (5-kb upstream) rare-allele count. a, Base of leaf three (n = 263 unique inbred samples). b, Tip of leaf three (n = 265 unique inbred samples). c, Adult leaves collected during the day (n = 204 unique inbred samples). d, Adult leaves collected at night (n = 260 unique inbred samples). e, Kernels at 350-growing-degree days (n = 229 unique inbred samples). f, Roots of germinating seedling (n = 273 unique inbred samples). g, Shoots of germinating seedling (n = 278 unique inbred samples).
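The quadratic regression named in this legend can be illustrated with a short least-squares fit. This is only a sketch on made-up numbers, assuming expression rank is the predictor and the local rare-allele count is the response; the function name `fit_quadratic` and the toy data are hypothetical and are not taken from the study's pipeline.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Power sums S[k] = sum(x**k) and moment sums T[k] = sum(y * x**k).
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system: row i is [S[i], S[i+1], S[i+2] | T[i]].
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [u - factor * v for u, v in zip(A[r], A[col])]
    return tuple(A[i][3] / A[i][i] for i in range(3))


# Toy data (invented): nine expression ranks and a U-shaped rare-allele count.
ranks = list(range(1, 10))
rare_counts = [2.0 + 0.5 * (r - 5) ** 2 for r in ranks]
a, b, c = fit_quadratic(ranks, rare_counts)
print(a, b, c)
```

A positive quadratic coefficient `c` corresponds to a U-shaped curve, that is, more rare alleles at both the low and the high end of the expression ranking.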
Extended Data Figure 3 Higher numbers of rare alleles are upstream of genes in extreme-expressing individuals, for the medium-expressed genes.
Quadratic regression of the expression rank of each line, for each of the genes ranked 5,001–10,000 by expression versus the average local (5-kb upstream) rare-allele count. a, Base of leaf three (n = 263 unique inbred samples). b, Tip of leaf three (n = 265 unique inbred samples). c, Adult leaves collected during the day (n = 204 unique inbred samples). d, Adult leaves collected at night (n = 260 unique inbred samples). e, Kernels at 350-growing-degree days (n = 229 unique inbred samples). f, Roots of germinating seedling (n = 273 unique inbred samples). g, Shoots of germinating seedling (n = 278 unique inbred samples).
Extended Data Figure 4 Comparison of the number of rare cis alleles near genes with differing expression levels.
The 10,000 most-expressed genes in each tissue are divided into groups of 1,000 on the basis of expression level. Plots in each panel show genes ranked 1–1,000, 1,001–2,000, …, 9,001–10,000 from left to right. Each of the individuals represented in each tissue is ranked for expression for each of the 1,000 genes in each group. Individuals in the bottom five expression ranks (fuchsia) versus the middle two quartiles (yellow) versus the top five expression ranks (blue) (mean ± s.e.m.). Y axes refer to mean upstream (within 5 kb) rare-allele count. a, Roots of germinating seedling (n = 273 unique inbred samples). b, Shoots of germinating seedling (n = 278 unique inbred samples). c, Kernels at 350-growing-degree days (n = 229 unique inbred samples). d, Base of leaf three (n = 263 unique inbred samples). e, Tip of leaf three (n = 265 unique inbred samples). f, Adult leaves collected during the day (n = 204 unique inbred samples). g, Adult leaves collected at night (n = 260 unique inbred samples).
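The grouping described in this legend (bottom five ranks, middle two quartiles, top five ranks, each summarized as mean ± s.e.m.) might be sketched as follows for a single gene. The helper names, the quartile cut used for the middle group, and the toy values are assumptions for illustration; the legend does not specify how ties or quartile boundaries were handled, and the published panels aggregate over 1,000 genes per group rather than one.

```python
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    return mean(values), stdev(values) / len(values) ** 0.5


def rank_groups(expression, rare_counts, n_extreme=5):
    """Group individuals by expression rank for one gene.

    Returns the rare-allele counts of the bottom `n_extreme` individuals,
    the middle two quartiles, and the top `n_extreme` individuals.
    """
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    n = len(order)
    groups = {
        "bottom": order[:n_extreme],
        "middle": order[n // 4 : n - n // 4],  # approximate middle two quartiles
        "top": order[-n_extreme:],
    }
    return {name: [rare_counts[i] for i in idx] for name, idx in groups.items()}


# Toy data (invented): 20 individuals with rare alleles enriched at both extremes.
expression = [float(i) for i in range(20)]
rare_counts = [abs(i - 9.5) for i in range(20)]
groups = rank_groups(expression, rare_counts)
for name, values in groups.items():
    print(name, mean_sem(values))
```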
Extended Data Figure 5 eQTL R² distribution comparisons between SNPs in 0.0–0.1 (tropical MAF) and 0.1–0.2 (RNA-set MAF) versus 0.1–0.2 (RNA-set and tropical MAF).
a, Adult leaves collected at night (n = 260 unique inbred samples). b, Adult leaves collected during the day (n = 204 unique inbred samples). c, Tip of leaf three (n = 265 unique inbred samples). d, Base of leaf three (n = 263 unique inbred samples). e, Kernels at 350-growing-degree days (n = 229 unique inbred samples). f, Shoots of germinating seedling (n = 278 unique inbred samples). g, Roots of germinating seedling (n = 273 unique inbred samples). All pairs of distributions within each tissue are significantly different (P < 2.2 × 10⁻¹⁶, two-sided Wilcoxon signed-rank test and Kolmogorov–Smirnov test).
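For readers unfamiliar with the Kolmogorov–Smirnov comparison mentioned in this legend, the two-sample statistic is simply the largest gap between two empirical cumulative distributions. The sketch below computes only that statistic on invented values; it does not compute a P value and does not cover the Wilcoxon test. In practice a library routine such as SciPy's `ks_2samp` would normally be used.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(x) - ECDF_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy values (invented) standing in for two sets of eQTL effect sizes.
set_one = [0.01, 0.02, 0.03, 0.04]
set_two = [0.03, 0.05, 0.06, 0.07]
d = ks_statistic(set_one, set_two)
print(d)
```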
Extended Data Figure 6 eQTL R² distribution comparisons between SNPs in 0.0–0.1 (tropical MAF) and 0.4–0.5 (RNA-set MAF) versus 0.4–0.5 (RNA-set and tropical MAF).
a, Adult leaves collected at night (n = 260 unique inbred samples). b, Adult leaves collected during the day (n = 204 unique inbred samples). c, Tip of leaf three (n = 265 unique inbred samples). d, Base of leaf three (n = 263 unique inbred samples). e, Kernels at 350-growing-degree days (n = 229 unique inbred samples). f, Shoots of germinating seedling (n = 278 unique inbred samples). g, Roots of germinating seedling (n = 273 unique inbred samples). All pairs of distributions within each tissue are significantly different (P < 2.2 × 10⁻¹⁶, two-sided Wilcoxon signed-rank test and Kolmogorov–Smirnov test).
Extended Data Figure 7 Expression value and dysregulation of 5,000 most-expressed genes are both predictive of fitness.
Orange boxes represent correlations between predicted and true seed weight when using expression values. Yellow boxes represent correlations between predicted and true seed weight when using absolute deviation in expression from the population mean. Range of correlations between predicted and true seed weight is displayed from ten repetitions of nested tenfold cross validation (ten inner and ten outer) using ridge regression. In the box plots, the middle horizontal lines represent the median, hinges represent the 25th and 75th percentiles (the interquartile range), the upper and lower whiskers extend to maximum and minimum points no more than 1.5× interquartile range beyond the hinges, and individual dots are outliers beyond the whiskers. Sample sizes: 2-cm root tips of germinating seedlings (unique n = 181) and whole shoots of germinating seedlings (unique n = 183); the 2-cm base (unique n = 181) and tip (unique n = 182) of leaf 3; leaves collected in the field during the day (unique n = 135) and night (unique n = 187); and 350-growing-degree-day kernels (unique n = 171), post sexual maturity (anthesis).
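The nested cross-validation with ridge regression described in this legend can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the penalty grid, the interleaved fold assignment, the use of mean squared error for inner-loop selection, and the simulated data are all hypothetical, and the closed-form solver is only practical for a handful of features (thousands of genes would need a numerical library).

```python
import random


def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ridge_fit(X, y, lam):
    """Closed-form ridge fit on centred data; returns (weights, intercept)."""
    n, p = len(X), len(X[0])
    mx = [sum(row[j] for row in X) / n for j in range(p)]
    my = sum(y) / n
    Xc = [[row[j] - mx[j] for j in range(p)] for row in X]
    A = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * (t - my) for r, t in zip(Xc, y)) for i in range(p)]
    w = solve(A, b)
    return w, my - sum(wj * mj for wj, mj in zip(w, mx))


def predict(model, X):
    w, b0 = model
    return [b0 + sum(wj * xj for wj, xj in zip(w, row)) for row in X]


def folds(indices, k):
    """Split a list of indices into k interleaved folds."""
    return [indices[i::k] for i in range(k)]


def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def nested_cv(X, y, lambdas, k_outer=10, k_inner=10, seed=0):
    """Correlation between out-of-fold predictions and the true values."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    preds = [0.0] * len(y)
    for test in folds(idx, k_outer):
        held_out = set(test)
        train = [i for i in idx if i not in held_out]

        def inner_error(lam):
            # Inner loop: score this penalty using only the outer training set.
            err = 0.0
            for val in folds(train, k_inner):
                val_set = set(val)
                fit = [i for i in train if i not in val_set]
                model = ridge_fit([X[i] for i in fit], [y[i] for i in fit], lam)
                err += mse(predict(model, [X[i] for i in val]), [y[i] for i in val])
            return err

        best = min(lambdas, key=inner_error)
        model = ridge_fit([X[i] for i in train], [y[i] for i in train], best)
        for i, yhat in zip(test, predict(model, [X[i] for i in test])):
            preds[i] = yhat
    return pearson(preds, y)


# Simulated data (invented): 60 lines, 3 features, a noisy linear trait.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [2 * row[0] - row[1] + rng.gauss(0, 0.1) for row in X]
r = nested_cv(X, y, lambdas=[0.01, 1.0, 100.0])
print(r)
```

Calling `nested_cv` with ten different `seed` values would give a range of correlations comparable in spirit to the ten repetitions summarized by each box.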
Extended Data Figure 8 Cumulative expression dysregulation of the 5,000 most-expressed genes in each tissue versus seed weight.
a, Adult leaves collected at night (n = 221 unique inbred samples). b, Adult leaves collected during the day (n = 171 unique inbred samples). c, Tip of leaf three (n = 226 unique inbred samples). d, Base of leaf three (n = 224 unique inbred samples). e, Kernels at 350-growing-degree days (n = 195 unique inbred samples). f, Shoots of germinating seedling (n = 235 unique inbred samples). g, Roots of germinating seedling (n = 226 unique inbred samples). Regression statistics are given in Extended Data Table 1. Sweet corn and popcorn lines were excluded from these regressions.
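One plausible reading of "cumulative expression dysregulation" is the per-line sum, across genes, of the absolute deviation in expression from the population mean. The sketch below implements that reading on invented numbers; whether the published measure sums raw values, ranks or standardized values is not stated in this legend, so the function name and the summation choice are assumptions.

```python
def cumulative_dysregulation(expr):
    """expr[line][gene] -> per-line sum of |expression - gene mean across lines|."""
    n_lines = len(expr)
    n_genes = len(expr[0])
    gene_means = [sum(row[g] for row in expr) / n_lines for g in range(n_genes)]
    return [
        sum(abs(row[g] - gene_means[g]) for g in range(n_genes))
        for row in expr
    ]


# Toy matrix (invented): three lines by two genes.
expr = [
    [10.0, 5.0],
    [12.0, 5.0],
    [20.0, 2.0],
]
scores = cumulative_dysregulation(expr)
print(scores)
```

Each score could then be regressed against a trait such as seed weight, one point per inbred line.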
Extended Data Figure 9 Mean upstream rare-allele count from the 5,000 most highly expressed genes versus seed weight.
a, Adult leaves collected at night (n = 221 unique inbred samples). b, Adult leaves collected during the day (n = 171 unique inbred samples). c, Tip of leaf three (n = 226 unique inbred samples). d, Base of leaf three (n = 224 unique inbred samples). e, Kernels at 350-growing-degree days (n = 195 unique inbred samples). f, Shoots of germinating seedling (n = 235 unique inbred samples). g, Roots of germinating seedling (n = 226 unique inbred samples).
Supplementary information
Supplementary Table 1
This table contains collection details for all sampled genotypes. Sequencing batch, tissue of origin, RNAseq depth, and subpopulation membership are specified for each sample. (XLS 445 kb)
Cite this article
Kremling, K., Chen, SY., Su, MH. et al. Dysregulation of expression correlates with rare-allele burden and fitness loss in maize. Nature 555, 520–523 (2018). https://doi.org/10.1038/nature25966
DOI: https://doi.org/10.1038/nature25966