  • Letter

Comparative evolutionary genetics of deleterious load in sorghum and maize

Abstract

Sorghum and maize share a close evolutionary history that can be explored through comparative genomics (refs. 1,2). To perform a large-scale comparison of the genomic variation between these two species, we analysed ~13 million variants identified from whole-genome resequencing of 499 sorghum lines together with 25 million variants previously identified in 1,218 maize lines. Deleterious mutations in both species were prevalent in pericentromeric regions, enriched in non-syntenic genes and present at low allele frequencies. A comparison of deleterious burden between sorghum and maize revealed that sorghum, in contrast to maize, departed from the domestication-cost hypothesis that predicts a higher deleterious burden among domesticates compared with wild lines. Additionally, sorghum and maize population genetic summary statistics were used to predict a gene deleterious index with an accuracy greater than 0.5. This research represents a key step towards understanding the evolutionary dynamics of deleterious variants in sorghum and provides a comparative genomics framework to start prioritizing these variants for removal through genome editing and breeding.
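
As an illustration of the burden comparison described above, the following minimal sketch counts putatively deleterious derived alleles per line and averages them within wild and domesticated groups. It is not the authors' pipeline: the toy genotypes, the group labels, the function names and the SIFT cutoff of 0.05 (a commonly used threshold, assumed here) are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Assumption: a commonly used SIFT threshold, not a value taken from the article.
SIFT_CUTOFF = 0.05


def deleterious_sites(sift_scores, cutoff=SIFT_CUTOFF):
    """Return indices of sites whose SIFT score falls below the cutoff.

    Sites without a score (None) are ignored.
    """
    return [i for i, s in enumerate(sift_scores) if s is not None and s < cutoff]


def burden_per_line(genotypes, sites):
    """Sum derived-allele counts (0, 1 or 2) over the selected sites for each line.

    Missing calls (None) are skipped.
    """
    return {
        line: sum(calls[i] for i in sites if calls[i] is not None)
        for line, calls in genotypes.items()
    }


def mean_burden_by_group(burden, groups):
    """Average the per-line burden within each group label."""
    by_group = {}
    for line, label in groups.items():
        by_group.setdefault(label, []).append(burden[line])
    return {label: mean(values) for label, values in by_group.items()}


# Toy data (hypothetical): four sites, derived-allele counts per line.
sift_scores = [0.01, 0.40, 0.03, None]
genotypes = {
    "wild_1": [0, 2, 2, 0],
    "wild_2": [2, 0, None, 2],
    "landrace_1": [0, 2, 0, 2],
    "landrace_2": [2, 2, 0, 0],
}
groups = {
    "wild_1": "wild",
    "wild_2": "wild",
    "landrace_1": "domesticated",
    "landrace_2": "domesticated",
}

sites = deleterious_sites(sift_scores)
burden = burden_per_line(genotypes, sites)
print(mean_burden_by_group(burden, groups))
```

The toy values carry no biological meaning. A real analysis would read genotypes from VCF files and would first need to polarize alleles as ancestral or derived; both steps are omitted here.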

Fig. 1: Population structure and LD patterns in sorghum.
Fig. 2: Comparative analysis of deleterious alleles in sorghum and maize.
Fig. 3: Genomic landscape of sorghum.
Fig. 4: CNN architecture.

Data availability

The raw sequencing data for the TERRA-MEPP lines are available through the NCBI BioProject PRJNA513297. The raw data for Mace et al. (ref. 5) are available through the BioProject PRJNA182489. The TERRA-REF raw data are available through the data commons database at CyVerse: http://datacommons.cyverse.org/browse/iplant/home/shared/terraref. The gene expression raw data are available through the BioProject PRJNA503076. The SIFT raw results and VCF files, among others, are available through the CyVerse repository: http://datacommons.cyverse.org/browse/iplant/home/shared/GoreLab/dataFromPubs/Lozano_MaizeSorghum_2019.

Code availability

The code used throughout the article is available at the GitHub repository: https://github.com/GoreLab/Sorghum-HapMap

References

  1. Swigonová, Z. et al. Close split of sorghum and maize genome progenitors. Genome Res. 14, 1916–1923 (2004).

  2. Wang, X. et al. Genome alignment spanning major Poaceae lineages reveals heterogeneous evolutionary rates and alters inferred dates for key evolutionary events. Mol. Plant 8, 885–898 (2015).

  3. Fuller, D. Q. & Stevens, C. J. in Plants and People in the African Past: Progress in African Archaeobotany (eds Mercuri, A. M. et al.) 427–452 (Springer International, 2018).

  4. Sagnard, F. et al. Genetic diversity, structure, gene flow and evolutionary relationships within the Sorghum bicolor wild–weedy–crop complex in a western African region. Theor. Appl. Genet. 123, 1231–1246 (2011).

  5. Mace, E. S. et al. Whole-genome sequencing reveals untapped genetic potential in Africa’s indigenous cereal crop sorghum. Nat. Commun. 4, 2320 (2013).

  6. Matsuoka, Y. et al. A single domestication for maize shown by multilocus microsatellite genotyping. Proc. Natl Acad. Sci. USA 99, 6080–6084 (2002).

  7. Piperno, D. R., Ranere, A. J., Holst, I., Iriarte, J. & Dickau, R. Starch grain and phytolith evidence for early ninth millennium b.p. maize from the central Balsas River valley, Mexico. Proc. Natl Acad. Sci. USA 106, 5019–5024 (2009).

  8. Lin, Z. et al. Parallel domestication of the Shattering1 genes in cereals. Nat. Genet. 44, 720–724 (2012).

  9. Lai, X., Yan, L., Lu, Y. & Schnable, J. C. Largely unlinked gene sets targeted by selection for domestication syndrome phenotypes in maize and sorghum. Plant J. 93, 843–855 (2018).

  10. Beissinger, T. M. et al. Recent demography drives changes in linked selection across the maize genome. Nat. Plants 2, 16084 (2016).

  11. Hufford, M. B. et al. Comparative population genomics of maize domestication and improvement. Nat. Genet. 44, 808–811 (2012).

  12. Wang, L. et al. The interplay of demography and selection during maize domestication and expansion. Genome Biol. 18, 215 (2017).

  13. Yang, J. et al. Incomplete dominance of deleterious alleles contributes substantially to trait variation and heterosis in maize. PLoS Genet. 13, e1007019 (2017).

  14. Smith, O. et al. A domestication history of dynamic adaptation and genomic deterioration in Sorghum. Nat. Plants 5, 369–379 (2019).

  15. Ellstrand, N. C. & Foster, K. W. Impact of population structure on the apparent outcrossing rate of grain sorghum (Sorghum bicolor). Theor. Appl. Genet. 66, 323–327 (1983).

  16. Muraya, M. M. et al. Wild sorghum from different eco-geographic regions of Kenya display a mixed mating system. Theor. Appl. Genet. 122, 1631–1639 (2011).

  17. Hufford, M. B., Gepts, P. & Ross-Ibarra, J. Influence of cryptic population structure on observed mating patterns in the wild progenitor of maize (Zea mays ssp. parviglumis). Mol. Ecol. 20, 46–55 (2011).

  18. McCormick, R. F. et al. The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization. Plant J. 93, 338–354 (2018).

  19. Winchell, F., Stevens, C. J., Murphy, C., Champion, L. & Fuller, D. Q. Evidence for Sorghum domestication in fourth millennium bc eastern Sudan: spikelet morphology from ceramic impressions of the Butana Group. Curr. Anthropol. https://doi.org/10.1086/693898 (2017).

  20. de Wet, J. M. J. & Huckabay, J. P. The origin of Sorghum bicolor. II. Distribution and domestication. Evolution 21, 787–802 (1967).

  21. Morris, G. P. et al. Population genomic and genome-wide association studies of agroclimatic traits in sorghum. Proc. Natl Acad. Sci. USA 110, 453–458 (2013).

  22. Brown, P. J., Myles, S. & Kresovich, S. Genetic support for phenotype-based racial classification in Sorghum. Crop Sci. 51, 224–230 (2011).

  23. Deschamps, S. et al. A chromosome-scale assembly of the sorghum genome using nanopore sequencing and optical mapping. Nat. Commun. 9, 4844 (2018).

  24. Ramu, P. et al. Cassava haplotype map highlights fixation of deleterious mutations during clonal propagation. Nat. Genet. 49, 959–963 (2017).

  25. Valluru, R. et al. Deleterious mutation burden and its association with complex traits in sorghum (Sorghum bicolor). Genetics https://doi.org/10.1534/genetics.118.301742 (2019).

  26. Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).

  27. Rodgers-Melnick, E. et al. Recombination in diverse maize is stable, predictable, and associated with genetic load. Proc. Natl Acad. Sci. USA 112, 3823–3828 (2015).

  28. Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. & Ng, P. C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).

  29. Mezmouk, S. & Ross-Ibarra, J. The pattern and distribution of deleterious mutations in maize. G3 4, 163–171 (2014).

  30. Schnable, J. C., Springer, N. M. & Freeling, M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl Acad. Sci. USA 108, 4069–4074 (2011).

  31. Moyers, B. T., Morrell, P. L. & McKay, J. K. Genetic costs of domestication and improvement. J. Hered. 109, 103–116 (2018).

  32. Hamblin, M. T. et al. Challenges of detecting directional selection after a bottleneck: lessons from Sorghum bicolor. Genetics 173, 953–964 (2006).

  33. Flagel, L., Brandvain, Y. & Schrider, D. R. The unreasonable effectiveness of convolutional neural networks in population genetic inference. Mol. Biol. Evol. https://doi.org/10.1093/molbev/msy224 (2018).

  34. Schrider, D. R. & Kern, A. D. Supervised machine learning for population genetics: a new paradigm. Trends Genet. 34, 301–312 (2018).

  35. Kremling, K. A. G. et al. Dysregulation of expression correlates with rare-allele burden and fitness loss in maize. Nature 555, 520–523 (2018).

  36. Washburn, J. D. et al. Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA sequence. Proc. Natl Acad. Sci. USA 116, 5542–5549 (2019).

  37. Wang, H., Cimen, E., Singh, N. & Buckler, E. Deep learning for plant genomics and crop improvement. Curr. Opin. Plant Biol. 54, 34–41 (2020).

  38. Weber, J. A., Aldana, R., Gallagher, B. D. & Edwards, J. S. Sentieon DNA pipeline for variant detection—software-only solution, over 20× faster than GATK 3.3 with identical results. Preprint at https://doi.org/10.7287/peerj.preprints.1672v2 (2016).

  39. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  40. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  41. Krumm, N. et al. Excess of rare, inherited truncating mutations in autism. Nat. Genet. 47, 582–588 (2015).

  42. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).

  43. Browning, B. L. & Browning, S. R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210–223 (2009).

  44. Yang, J. et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat. Genet. 47, 1114–1120 (2015).

  45. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).

  46. Ng, P. C. & Henikoff, S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814 (2003).

  47. Valluru, R. et al. Deleterious mutation burden and its association with complex traits in Sorghum (Sorghum bicolor). Genetics 211, 1075–1087 (2019).

  48. Schnable, P. S. et al. The B73 maize genome: complexity, diversity, and dynamics. Science 326, 1112–1115 (2009).

  49. Zhang, Y. et al. Differentially regulated orthologs in sorghum and the subgenomes of maize. Plant Cell 29, 1938–1951 (2017).

  50. Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011).

  51. Keightley, P. D. & Jackson, B. C. Inferring the probability of the derived vs. the ancestral allelic state at a polymorphic site. Genetics 209, 897–906 (2018).

  52. Bukowski, R. et al. Construction of the third-generation Zea mays haplotype map. GigaScience 7, gix134 (2018).

  53. Rodgers-Melnick, E., Vera, D. L., Bass, H. W. & Buckler, E. S. Open chromatin reveals the functional maize genome. Proc. Natl Acad. Sci. USA 113, E3177–E3184 (2016).

  54. Simons, Y. B., Turchin, M. C., Pritchard, J. K. & Sella, G. The deleterious mutation load is insensitive to recent population history. Nat. Genet. 46, 220–224 (2014).

  55. Henn, B. M. et al. Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. Proc. Natl Acad. Sci. USA 113, E440–E449 (2016).

  56. Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).

  57. Thornton, K. R. A C++ template library for efficient forward-time population genetic simulation of large populations. Genetics 198, 157–166 (2014).

  58. Garrison, E. vcflib: A C++ library for parsing and manipulating VCF files v1.0.0-rc2 https://github.com/vcflib/vcflib (2019).

  59. Zhang, M., Zhou, L., Bawa, R., Suren, H. & Holliday, J. A. Recombination rate variation, hitchhiking, and demographic history shape deleterious load in poplar. Mol. Biol. Evol. 33, 2899–2910 (2016).

  60. Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinform. 15, 356 (2014).

  61. Nei, M. & Li, W. H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl Acad. Sci. USA 76, 5269–5273 (1979).

  62. Gao, F., Ming, C., Hu, W. & Li, H. New software for the fast estimation of population recombination rates (FastEPRR) in the genomic era. G3 6, 1563–1571 (2016).

  63. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Preprint at http://arxiv.org/abs/1512.03385 (2015).

  64. Krizhevsky, A., Sutskever, I. & Hinton, G. E. in Advances in Neural Information Processing Systems 25 (eds Pereira, F. et al.) 1097–1105 (Curran Associates, 2012).

  65. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

Acknowledgements

We thank J. Schnable for the maize, sorghum and setaria orthologue lists. We also thank the Ross-Ibarra lab at UC Davis for helpful comments and sound advice on an earlier draft of this manuscript. We thank J. Grimwood and J. Schmutz at the HudsonAlpha Institute for the sequencing of the included TERRA-REF lines. The information, data or work presented herein was funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), US Department of Energy, under Award Numbers DE-AR0000598, DE-AR0000661 and DE-AR0000594. This work was also supported by the United States Department of Agriculture–Agricultural Research Service (USDA–ARS). The views and opinions of the authors expressed herein do not necessarily state or reflect those of the United States government or any agency thereof. This work was also supported by the Next-Generation BioGreen 21 Program (Project No. PJ01321305), Rural Development Administration, Republic of Korea. For J.P.R.d.S., this work was partially supported by FAPESP grant nos 2017/03625-2 and 2017/25674-5 / CAPES (São Paulo Research Foundation) Finance Code 001 / Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).

Author information

Contributions

R.L., E.G., E.S.B., J.R.-I. and M.A.G. designed the project and all the experiments. R.L. and J.P.R.d.S. performed the CNN analysis. R.L. and E.G. performed the bioinformatic analysis. M.G.S. performed the burden simulation analysis. R.V. and N.B. performed the transcriptome analysis. S.B.F., N.S., T.C.M., E.A.C. and P.J.B. contributed the plant material. M.T.P. constructed the genomic libraries for sequencing the 14 S. bicolor subsp. verticilliflorum accessions. P.J.B., N.S., T.C.M., E.S.B., J.R.-I. and M.A.G. contributed the whole-genome sequence data. R.L., E.G., M.G.S., J.P.R.d.S., R.V., S.B.F., J.R.-I. and M.A.G. analysed the data. R.L. prepared the figures and together with M.A.G. and J.R.-I. wrote the manuscript. All authors provided critical insights and read and approved the manuscript.

Corresponding authors

Correspondence to Jeffrey Ross-Ibarra or Michael A. Gore.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Plants thanks Xuehui Huang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–14.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–4.

About this article

Cite this article

Lozano, R., Gazave, E., dos Santos, J.P.R. et al. Comparative evolutionary genetics of deleterious load in sorghum and maize. Nat. Plants 7, 17–24 (2021). https://doi.org/10.1038/s41477-020-00834-5

  • DOI: https://doi.org/10.1038/s41477-020-00834-5
