Sorghum and maize share a close evolutionary history that can be explored through comparative genomics (refs. 1,2). To perform a large-scale comparison of the genomic variation between these two species, we analysed ~13 million variants identified from whole-genome resequencing of 499 sorghum lines together with 25 million variants previously identified in 1,218 maize lines. Deleterious mutations in both species were prevalent in pericentromeric regions, enriched in non-syntenic genes and present at low allele frequencies. A comparison of deleterious burden between sorghum and maize revealed that sorghum, in contrast to maize, departed from the domestication-cost hypothesis that predicts a higher deleterious burden among domesticates compared with wild lines. Additionally, sorghum and maize population genetic summary statistics were used to predict a gene deleterious index with an accuracy greater than 0.5. This research represents a key step towards understanding the evolutionary dynamics of deleterious variants in sorghum and provides a comparative genomics framework to start prioritizing these variants for removal through genome editing and breeding.
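To make the burden comparison concrete, the following is a minimal sketch of how a per-line deleterious burden could be tallied and averaged within wild and domesticated groups. It is not the authors' pipeline: the function names, the data layout (derived-allele counts per site) and the toy genotypes are all assumptions chosen for illustration.

```python
from statistics import mean


def deleterious_burden(genotypes, deleterious_sites):
    """Count derived deleterious alleles carried by one line.

    genotypes: dict mapping site ID -> number of derived alleles (0, 1 or 2)
    deleterious_sites: set of site IDs flagged as putatively deleterious
    (both structures are hypothetical, not taken from the article)
    """
    return sum(n for site, n in genotypes.items() if site in deleterious_sites)


def mean_burden_by_group(lines, deleterious_sites):
    """Average the per-line burden within each group label."""
    per_group = {}
    for line in lines:
        burden = deleterious_burden(line["genotypes"], deleterious_sites)
        per_group.setdefault(line["group"], []).append(burden)
    return {group: mean(values) for group, values in per_group.items()}


# Toy data, invented purely for illustration.
deleterious_sites = {"s1", "s3"}
lines = [
    {"group": "wild", "genotypes": {"s1": 2, "s2": 0, "s3": 1}},
    {"group": "wild", "genotypes": {"s1": 0, "s2": 2, "s3": 1}},
    {"group": "domesticated", "genotypes": {"s1": 2, "s2": 2, "s3": 2}},
    {"group": "domesticated", "genotypes": {"s1": 1, "s2": 0, "s3": 1}},
]

burden = mean_burden_by_group(lines, deleterious_sites)
print(burden)
```

Under the domestication-cost hypothesis the domesticated group would be expected to show the higher mean; the abstract reports that sorghum, unlike maize, departs from that expectation.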
The raw sequencing data for the TERRA-MEPP lines are available through the NCBI BioProject PRJNA513297. The raw data for Mace et al. (ref. 5) are available through the BioProject PRJNA182489. The TERRA-REF raw data are available through the data commons database at CyVerse: http://datacommons.cyverse.org/browse/iplant/home/shared/terraref. The gene expression raw data are available through the BioProject PRJNA503076. The SIFT raw results and VCF files, among others, are available through the CyVerse repository: http://datacommons.cyverse.org/browse/iplant/home/shared/GoreLab/dataFromPubs/Lozano_MaizeSorghum_2019.
The code used throughout the article is available at the GitHub repository: https://github.com/GoreLab/Sorghum-HapMap
Swigonová, Z. et al. Close split of sorghum and maize genome progenitors. Genome Res. 14, 1916–1923 (2004).
Wang, X. et al. Genome alignment spanning major Poaceae lineages reveals heterogeneous evolutionary rates and alters inferred dates for key evolutionary events. Mol. Plant 8, 885–898 (2015).
Fuller, D. Q. & Stevens, C. J. in Plants and People in the African Past: Progress in African Archaeobotany (eds Mercuri, A. M. et al.) 427–452 (Springer International, 2018).
Sagnard, F. et al. Genetic diversity, structure, gene flow and evolutionary relationships within the Sorghum bicolor wild–weedy–crop complex in a western African region. Theor. Appl. Genet. 123, 1231–1246 (2011).
Mace, E. S. et al. Whole-genome sequencing reveals untapped genetic potential in Africa’s indigenous cereal crop sorghum. Nat. Commun. 4, 2320 (2013).
Matsuoka, Y. et al. A single domestication for maize shown by multilocus microsatellite genotyping. Proc. Natl Acad. Sci. USA 99, 6080–6084 (2002).
Piperno, D. R., Ranere, A. J., Holst, I., Iriarte, J. & Dickau, R. Starch grain and phytolith evidence for early ninth millennium b.p. maize from the central Balsas River valley, Mexico. Proc. Natl Acad. Sci. USA 106, 5019–5024 (2009).
Lin, Z. et al. Parallel domestication of the Shattering1 genes in cereals. Nat. Genet. 44, 720–724 (2012).
Lai, X., Yan, L., Lu, Y. & Schnable, J. C. Largely unlinked gene sets targeted by selection for domestication syndrome phenotypes in maize and sorghum. Plant J. 93, 843–855 (2018).
Beissinger, T. M. et al. Recent demography drives changes in linked selection across the maize genome. Nat. Plants 2, 16084 (2016).
Hufford, M. B. et al. Comparative population genomics of maize domestication and improvement. Nat. Genet. 44, 808–811 (2012).
Wang, L. et al. The interplay of demography and selection during maize domestication and expansion. Genome Biol. 18, 215 (2017).
Yang, J. et al. Incomplete dominance of deleterious alleles contributes substantially to trait variation and heterosis in maize. PLoS Genet. 13, e1007019 (2017).
Smith, O. et al. A domestication history of dynamic adaptation and genomic deterioration in Sorghum. Nat. Plants 5, 369–379 (2019).
Ellstrand, N. C. & Foster, K. W. Impact of population structure on the apparent outcrossing rate of grain sorghum (Sorghum bicolor). Theor. Appl. Genet. 66, 323–327 (1983).
Muraya, M. M. et al. Wild sorghum from different eco-geographic regions of Kenya display a mixed mating system. Theor. Appl. Genet. 122, 1631–1639 (2011).
Hufford, M. B., Gepts, P. & Ross-Ibarra, J. Influence of cryptic population structure on observed mating patterns in the wild progenitor of maize (Zea mays ssp. parviglumis). Mol. Ecol. 20, 46–55 (2011).
McCormick, R. F. et al. The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization. Plant J. 93, 338–354 (2018).
Winchell, F., Stevens, C. J., Murphy, C., Champion, L. & Fuller, D. Q. Evidence for Sorghum domestication in fourth millennium bc eastern Sudan: spikelet morphology from ceramic impressions of the Butana Group. Curr. Anthropol. https://doi.org/10.1086/693898 (2017).
de Wet, J. M. J. & Huckabay, J. P. The origin of Sorghum bicolor. II. Distribution and domestication. Evolution 21, 787–802 (1967).
Morris, G. P. et al. Population genomic and genome-wide association studies of agroclimatic traits in sorghum. Proc. Natl Acad. Sci. USA 110, 453–458 (2013).
Brown, P. J., Myles, S. & Kresovich, S. Genetic support for phenotype-based racial classification in Sorghum. Crop Sci. 51, 224–230 (2011).
Deschamps, S. et al. A chromosome-scale assembly of the sorghum genome using nanopore sequencing and optical mapping. Nat. Commun. 9, 4844 (2018).
Ramu, P. et al. Cassava haplotype map highlights fixation of deleterious mutations during clonal propagation. Nat. Genet. 49, 959–963 (2017).
Valluru, R. et al. Deleterious mutation burden and its association with complex traits in sorghum (Sorghum bicolor). Genetics https://doi.org/10.1534/genetics.118.301742 (2019).
Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
Rodgers-Melnick, E. et al. Recombination in diverse maize is stable, predictable, and associated with genetic load. Proc. Natl Acad. Sci. USA 112, 3823–3828 (2015).
Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. & Ng, P. C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).
Mezmouk, S. & Ross-Ibarra, J. The pattern and distribution of deleterious mutations in maize. G3 4, 163–171 (2014).
Schnable, J. C., Springer, N. M. & Freeling, M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl Acad. Sci. USA 108, 4069–4074 (2011).
Moyers, B. T., Morrell, P. L. & McKay, J. K. Genetic costs of domestication and improvement. J. Hered. 109, 103–116 (2018).
Hamblin, M. T. et al. Challenges of detecting directional selection after a bottleneck: lessons from Sorghum bicolor. Genetics 173, 953–964 (2006).
Flagel, L., Brandvain, Y. & Schrider, D. R. The unreasonable effectiveness of convolutional neural networks in population genetic inference. Mol. Biol. Evol. https://doi.org/10.1093/molbev/msy224 (2018).
Schrider, D. R. & Kern, A. D. Supervised machine learning for population genetics: a new paradigm. Trends Genet. 34, 301–312 (2018).
Kremling, K. A. G. et al. Dysregulation of expression correlates with rare-allele burden and fitness loss in maize. Nature 555, 520–523 (2018).
Washburn, J. D. et al. Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA sequence. Proc. Natl Acad. Sci. USA 116, 5542–5549 (2019).
Wang, H., Cimen, E., Singh, N. & Buckler, E. Deep learning for plant genomics and crop improvement. Curr. Opin. Plant Biol. 54, 34–41 (2020).
Weber, J. A., Aldana, R., Gallagher, B. D. & Edwards, J. S. Sentieon DNA pipeline for variant detection—software-only solution, over 20× faster than GATK 3.3 with identical results. Preprint at https://doi.org/10.7287/peerj.preprints.1672v2 (2016).
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Krumm, N. et al. Excess of rare, inherited truncating mutations in autism. Nat. Genet. 47, 582–588 (2015).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Browning, B. L. & Browning, S. R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210–223 (2009).
Yang, J. et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat. Genet. 47, 1114–1120 (2015).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Ng, P. C. & Henikoff, S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814 (2003).
Valluru, R. et al. Deleterious mutation burden and its association with complex traits in Sorghum (Sorghum bicolor). Genetics 211, 1075–1087 (2019).
Schnable, P. S. et al. The B73 maize genome: complexity, diversity, and dynamics. Science 326, 1112–1115 (2009).
Zhang, Y. et al. Differentially regulated orthologs in sorghum and the subgenomes of maize. Plant Cell 29, 1938–1951 (2017).
Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011).
Keightley, P. D. & Jackson, B. C. Inferring the probability of the derived vs. the ancestral allelic state at a polymorphic site. Genetics 209, 897–906 (2018).
Bukowski, R. et al. Construction of the third-generation Zea mays haplotype map. GigaScience 7, gix134 (2018).
Rodgers-Melnick, E., Vera, D. L., Bass, H. W. & Buckler, E. S. Open chromatin reveals the functional maize genome. Proc. Natl Acad. Sci. USA 113, E3177–E3184 (2016).
Simons, Y. B., Turchin, M. C., Pritchard, J. K. & Sella, G. The deleterious mutation load is insensitive to recent population history. Nat. Genet. 46, 220–224 (2014).
Henn, B. M. et al. Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. Proc. Natl Acad. Sci. USA 113, E440–E449 (2016).
Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).
Thornton, K. R. A C++ template library for efficient forward-time population genetic simulation of large populations. Genetics 198, 157–166 (2014).
Garrison, E. vcflib: A C++ library for parsing and manipulating VCF files v1.0.0-rc2 https://github.com/vcflib/vcflib (2019).
Zhang, M., Zhou, L., Bawa, R., Suren, H. & Holliday, J. A. Recombination rate variation, hitchhiking, and demographic history shape deleterious load in poplar. Mol. Biol. Evol. 33, 2899–2910 (2016).
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinform. 15, 356 (2014).
Nei, M. & Li, W. H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl Acad. Sci. USA 76, 5269–5273 (1979).
Gao, F., Ming, C., Hu, W. & Li, H. New software for the fast estimation of population recombination rates (FastEPRR) in the genomic era. G3 6, 1563–1571 (2016).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Preprint at http://arxiv.org/abs/1512.03385 (2015).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. in Advances in Neural Information Processing Systems 25 (eds Pereira, F. et al.) 1097–1105 (Curran Associates, 2012).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
We thank J. Schnable for the maize, sorghum and setaria orthologue lists. We also thank the Ross-Ibarra lab at UC Davis for helpful comments and sound advice on an earlier draft of this manuscript. We thank J. Grimwood and J. Schmutz at the HudsonAlpha Institute for the sequencing of the included TERRA-REF lines. The information, data or work presented herein was funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), US Department of Energy, under Award Numbers DE-AR0000598, DE-AR0000661 and DE-AR0000594. This work was also supported by the United States Department of Agriculture–Agricultural Research Service (USDA–ARS). The views and opinions of the authors expressed herein do not necessarily state or reflect those of the United States government or any agency thereof. This work was also supported by the Next-Generation BioGreen 21 Program (Project No. PJ01321305), Rural Development Administration, Republic of Korea. For J.P.R.d.S., this work was partially supported by FAPESP grant nos 2017/03625-2 and 2017/25674-5 / CAPES (São Paulo Research Foundation) Finance Code 001 / Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).
The authors declare no competing interests.
Peer review information Nature Plants thanks Xuehui Huang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Lozano, R., Gazave, E., dos Santos, J.P.R. et al. Comparative evolutionary genetics of deleterious load in sorghum and maize. Nat. Plants 7, 17–24 (2021). https://doi.org/10.1038/s41477-020-00834-5