Abstract
A diploid genome in the Saccharum complex facilitates our understanding of evolution in the highly polyploid Saccharum genus. Here we generated a complete, gap-free genome assembly of Erianthus rufipilus, a diploid species within the Saccharum complex. The complete assembly revealed that centromere satellite homogenization was accompanied by insertions of Gypsy retrotransposons, which drove centromere diversification. An overall low rate of gene transcription was observed in the palaeo-duplicated chromosome EruChr05, similar to other grasses; this might be regulated by methylation patterns mediated by homologous 24 nt small RNAs and could potentially mediate the functions of many nucleotide-binding site genes. Sequencing data for 211 accessions in the Saccharum complex indicated that Saccharum probably originated in the trans-Himalayan region from a diploid ancestor (x = 10) around 1.9–2.5 million years ago. Our study provides new insights into the origin and evolution of Saccharum and accelerates translational research in cereal genetics and genomics.
Data availability
The assembled genome sequences and all raw sequencing data for E. rufipilus were deposited in the National Genomics Data Center (NGDC) database under BioProject accession PRJCA014818. Genome assemblies and annotation files of E. rufipilus are also available in the sugarcane database (http://sugarcane.zhangjisenlab.cn/SugarcaneDB/#/downloads). Source data are provided with this paper.
Acknowledgements
This work was supported by the National Key Research and Development programme (2021YFF1000101 and 2021YFF1000104); the Science and Technology Planting Project of Guangdong Province (2019B020238001); the National High-tech R&D Program (2013AA100604); the National Natural Science Foundation of China (31660420); the Science and Technology Major Project of Guangxi (AA17202025); the Fujian Provincial Department of Education (JA12082); the Natural Science Foundation of Fujian Province, China (2019J0102); the National Natural Science Foundation of China (32201794); the fellowship of China National Postdoctoral Program for Innovative Talents (BX20220349); and the National Natural Science Foundation of China (32001605).
Author information
Contributions
J.Z. conceived this project and coordinated research activities; J.Z., W.Y. and H.T. designed the experiments; T.W., J.Z., B.W., X.L., B.C., X.M., R.M. and M.Z. collected E. rufipilus and sugarcane materials; X.H., T.W., B.W., Zhe Z. and H.S. compared the morphological and anatomical features; Z.Y., Y.H. and Z.D. performed the oligo-FISH experiments; T.W., B.W., Q.Z., G.W. and Y.L. assembled, validated and annotated the T2T E. rufipilus genome; B.W., Y.Q. and T.W. characterized the centromeric sequences; Zeyu Z., L.G. and Yongjun W. analysed bisulfite sequencing data and sRNA sequencing data; T.W., B.Y., Q.Z., J.M., Y.Z., Yuhao W., Z.L., H.P. and S.C. analysed the genomic characteristics and the evolution of PdCPs; T.W., B.W., R.G., Y.Q. and Yuhao W. studied the population genomics; J.Z., T.W., H.T. and J.W. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Plants thanks Ling-Ling Chen, John Riascos and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 FISH using chromosome-specific oligo probes according to the S. officinarum genome in the same metaphase cells of Erianthus rufipilus Yunnan2009-3.
a, b, c, d and e. E. rufipilus chromosome-specific oligo probes for Chr01, Chr03, Chr05, Chr07 and Chr09 are visualized in red. E. rufipilus chromosome-specific oligo probes for Chr02, Chr04, Chr06, Chr08 and Chr10 are visualized in green. Karyotypes of E. rufipilus are shown in f and g. These results confirm the chromosome number of E. rufipilus as 20 and the basic chromosome number as x = 10. The experiment was repeated at least 5 times independently with similar results. Bars = 10 μm.
Extended Data Fig. 2 Genome-wide chromatin interactions in the E. rufipilus genome assembly at 1,000-kb resolution.
a. Genome-wide chromatin interactions. b. Chromatin interactions in each of the 10 chromosomes. The intensity of pixels represents the links between 1,000-kb windows on all chromosomes. A darker red color indicates a higher contact probability.
Extended Data Fig. 3 Alignment of Bionano optical map against in-silico maps of the E. rufipilus genome in the centromere regions.
The well-matched lines between the Bionano optical map and the in-silico maps of the E. rufipilus genome verify the accurate assembly of the centromeres.
Extended Data Fig. 4 Characteristics and epigenetics of all 10 centromeres of E. rufipilus.
The distributions of CEN137 per 10-kbp on forward (red) or reverse (blue) strands, genes (dark blue), transcript abundance (green), LTR in the GYPSY superfamily (purple), methylation patterns (CG, green; CHG, yellow; CHH, dark blue) and CEN137 sequence similarity on the 10 centromeric regions were plotted successively. The bottom shows the methylation patterns on a chromosomal scale.
Extended Data Fig. 5 Whole plastid phylogram indicates that Erianthus rufipilus belongs to the Saccharum genus.
Phylogram of whole chloroplast alignments for Erianthus rufipilus accessions (in red color) from the trans-Himalayan region and “Saccharum” (Tripidium) rufipilum accessions from the South African Sugarcane Research Institute (in red color) with 52 representative chloroplast sequences. Numbers next to nodes represent bootstrap values, and the scale bar at the base of the phylogeny represents the expected number of substitutions per site.
Extended Data Fig. 6 Gene family expansion and contraction in E. rufipilus and representative species in the grass lineages.
Red and blue indicate different rates of evolutionary change (lambda) in the tree, and the size of each circle represents the average ratio of expansion or contraction. The numbers on the left and on the right represent the expansion (+) or contraction (-) of gene families on the node and in each species, respectively. The numbers in brackets represent the number of rapidly expanding or contracting gene families.
Extended Data Fig. 7 Evolution of paleo-duplicated chromosome pairs of Saccharum.
Syntenic blocks were identified on E. rufipilus Chr05 and Chr08 as well as in the corresponding chromosomes in representative grass lineages. The colors in the legend indicate the Ks value of the syntenic gene pairs, and the green arrows indicate the large fragment inversion on E. rufipilus Chr08 and the corresponding chromosome in representative grass lineages.
Extended Data Fig. 8 Population characteristics and evolution of E. rufipilus.
a. Principal component analysis with PC1 and PC2. b. Genetic differentiation values (Fst) between different groups are presented on the dashed lines and nucleotide diversity (π) in different groups is presented in the circles. c. Genome-wide linkage disequilibrium (LD) analysis of E. rufipilus accessions. d. The distribution of Tajima’s D values among E. rufipilus accessions. e. ADMIXTURE plot of E. rufipilus accessions for K = 3 through 7.
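Panel b reports nucleotide diversity (π) for each group. As a rough illustration of what this statistic measures, the sketch below computes π as the mean number of pairwise differences per site over a set of aligned sequences. The function name `nucleotide_diversity` and the toy sequences are hypothetical and are not taken from the study; the published values come from genome-wide variant data, not from this code.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site across aligned, equal-length sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    # Compare every unordered pair of sequences once.
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions within each pair and sum over all pairs.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    # Normalise by the number of pairs and by the alignment length.
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["AAAA", "AAAT", "AATT"]
pi = nucleotide_diversity(toy_alignment)
```

Comparing all pairs is quadratic in the number of sequences, which is fine for a toy example but would be replaced by allele-frequency-based estimators on real population data.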
Extended Data Fig. 9 Selective sweeps in Saccharum.
a. Selective sweep detection by estimating ROD (the genomic nucleotide diversity decrease ratio) implicated genes related to disease resistance or response to water deprivation in S. spontaneum. b. Selective sweeps were detected by estimating ROD and implicated genes related to reproduction, development, or photosynthesis in S. officinarum. c. Expression profiles (log2(FPKM + 1)) of the genes (marked in a) in the developmental gradient leaf segments in S. spontaneum. d. Expression profiles of the genes (marked in b) in the developmental gradient leaf segments in S. officinarum.
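Two quantities in this caption can be sketched briefly. The expression transform log2(FPKM + 1) is stated in the caption. The caption describes ROD only as a nucleotide diversity decrease ratio and does not give its formula, so the definition used below, ROD = 1 − π(target) / π(reference), is an assumption based on one common convention and may differ from the authors' calculation. The function names and numbers are hypothetical.

```python
import math


def log2_fpkm(fpkm):
    """Expression transform from the caption: log2(FPKM + 1)."""
    if fpkm < 0:
        raise ValueError("FPKM cannot be negative")
    return math.log2(fpkm + 1)


def reduction_of_diversity(pi_target, pi_reference):
    """ASSUMED definition (not given in the source): ROD = 1 - pi_target / pi_reference."""
    if pi_reference <= 0:
        raise ValueError("reference diversity must be positive")
    return 1 - pi_target / pi_reference


# Hypothetical values, for illustration only.
expression = [log2_fpkm(v) for v in (0, 3, 7)]
rod = reduction_of_diversity(pi_target=0.002, pi_reference=0.008)
```

Adding 1 before taking the logarithm keeps unexpressed genes (FPKM = 0) at a defined value of 0. Under the assumed definition, windows with ROD close to 1 have lost most of their diversity relative to the reference group.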
Extended Data Fig. 10 Deleterious mutations in Saccharum.
a. Comparison of total deleterious mutation numbers in the (corresponding) chromosomes of S. officinarum, E. rufipilus, and S. spontaneum. In each box, the centerline represents the median; the lower and upper hinges represent the 25th and 75th percentiles, and the whiskers represent 1.5× the interquartile range. n = 10 for all three species. T-test, P-values adjusted using the Holm procedure; * stands for P < 0.1 and **** stands for P < 0.0001. b. Distributions of deleterious mutations on different chromosomes in S. officinarum, E. rufipilus, and S. spontaneum. c. GO term enrichment for genes carrying deleterious mutations in E. rufipilus. Fisher’s exact test, with P-values adjusted using the Benjamini–Hochberg correction for multiple hypothesis testing.
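The box-plot elements and the Holm adjustment named in panel a can be illustrated with a minimal standard-library sketch. The function names `box_stats` and `holm_adjust` and all input values are hypothetical. The quartile method ("inclusive") is an assumption, since plotting packages differ in how they interpolate percentiles, and the code returns the 1.5× interquartile-range fences, not the drawn whisker ends.

```python
import statistics


def box_stats(values):
    """Median, hinges (25th/75th percentiles) and 1.5 x IQR fences."""
    # ASSUMPTION: 'inclusive' interpolation; other tools may use other rules.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


def holm_adjust(pvalues):
    """Holm step-down adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical inputs: ten per-chromosome counts and three raw P-values.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

With three pairwise species comparisons, Holm multiplies the smallest raw P-value by 3, the next by 2 and the largest by 1, which makes it less conservative than a plain Bonferroni correction while still controlling the family-wise error rate.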
Supplementary information
Supplementary Information
Supplementary Notes 1 and 2, Figs. 1–20 and Tables 1–18.
Supplementary Data 1
The seed LTR sequences of E. rufipilus.
Supplementary Data 2
Estimated times of LTRs inserted to centromeres.
Supplementary Data 3
The total number of methylated cytosines of CG, CHG and CHH context in root, stem and leaf.
Supplementary Data 4
Collinear gene pair IDs in inverted regions of E. rufipilus and Sorghum.
Supplementary Data 5
Enriched GO terms of rapidly expanding gene families.
Supplementary Data 6
Synteny gene pairs on PdCPs in E. rufipilus, rice, sorghum, Miscanthus, S. spontaneum Np-X and S. spontaneum AP85-441.
Supplementary Data 7
NBS genes in E. rufipilus, rice and sorghum.
Supplementary Data 8
P values of CG, CHG and CHH methylation on chromosome 05, chromosome 08 and other chromosomes (t-test).
Supplementary Data 9
Correlation of CG, CHG and CHH methylation with gene expression.
Supplementary Data 10
Highly expressed 24 nt sRNA at promoter in the examined tissues.
Supplementary Data 11
Annotation of the corresponding selective sweep genes in S. spontaneum.
Supplementary Data 12
Annotation of the corresponding selective sweep genes in S. officinarum.
Source data
Source Data Fig. 1
Unprocessed gel of PCR product for gap filling in Fig. 1b.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, T., Wang, B., Hua, X. et al. A complete gap-free diploid genome in Saccharum complex and the genomic footprints of evolution in the highly polyploid Saccharum genus. Nat. Plants 9, 554–571 (2023). https://doi.org/10.1038/s41477-023-01378-0
DOI: https://doi.org/10.1038/s41477-023-01378-0