Abstract
The short read length of next-generation sequencing makes it challenging to characterize highly repetitive regions (HRRs) such as centromeres, telomeres and ribosomal DNAs (rDNAs). Building on recent strategies that combine long-read sequencing with exogenous enzymatic labelling of open chromatin, we developed single-molecule targeted accessibility and methylation sequencing (STAM-seq) in plants by further integrating nanopore adaptive sampling, and used it to investigate the HRRs in wild-type Arabidopsis and in DNA methylation mutants that are defective in CG or non-CG methylation. We found that CEN180 repeats show higher chromatin accessibility and lower DNA methylation on their forward strand, that individual rDNA units show a negative correlation between DNA methylation and accessibility, and that both accessibility and CHH methylation levels are lower at telomeres than in the adjacent subtelomeric regions. Moreover, DNA methylation-deficient mutants showed increased chromatin accessibility at HRRs, consistent with the role of DNA methylation in maintaining heterochromatic status in plants. STAM-seq can be applied to study the accessibility and methylation of repetitive sequences across diverse plant species.
Data availability
The STAM-seq data generated in this study have been deposited in the Genome Sequence Archive at the National Genomics Data Center, China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA008945) and are publicly accessible at https://ngdc.cncb.ac.cn/gsa. Source data are provided with this paper.
Code availability
Source code for analysis is available at https://github.com/ZhaiLab-SUSTech/STAM-seq/.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (32270631 to X.D.), the Youth Innovation Promotion Association of CAS (2018131), the National Key R&D Program of China Grant (2019YFA0903903), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06S172), the Shenzhen Sci-Tech Fund (KYTDPT20181011104005), the Key Laboratory of Molecular Design for Plant Cell Factory of Guangdong Higher Education Institutes (2019KSYS006), the Stable Support Plan Program of Shenzhen Natural Science Fund Grant (20200925153345004), the Youth Innovation Promotion Association of CAS (Y2022039) and the Center for Computational Science and Engineering at Southern University of Science and Technology.
Author information
Contributions
J.Z., W.M. and Y.S. conceived and designed the experiments. Y.S., W.M., Y.L. and B.L. performed the experiments. W.M. and Y.S. analysed the data. J.Z. and X.D. oversaw the study. X.C. and T.L. provided conceptual insight. W.M., Y.S., J.Z. and X.D. wrote the manuscript, and all authors revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Plants thanks Ian Henderson, James Walker and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 The fraction of bases called as 6mA in the libraries.
Arabidopsis genomic DNA was treated with or without the EcoGII enzyme. Shown are the average methylation levels for A nucleotides.
Extended Data Fig. 2 Sequencing coverage of HRRs relative to non-HRR regions in the library with or without adaptive sampling.
The target regions include the centromeres, telomeres and 45S rDNA in the Arabidopsis genome.
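As a rough illustration of how an enrichment of this kind could be quantified, the sketch below divides the mean read depth over the targeted HRRs by the mean read depth over the remaining genome. The function names and all numbers are hypothetical and are not taken from the paper or its repository; the authors' actual calculation may differ.

```python
def mean_depth(aligned_bases: int, region_length: int) -> float:
    """Mean read depth: total aligned bases divided by region length (bp)."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return aligned_bases / region_length


def relative_coverage(target_bases: int, target_length: int,
                      other_bases: int, other_length: int) -> float:
    """Depth over target regions (HRRs) relative to depth over non-target regions."""
    return mean_depth(target_bases, target_length) / mean_depth(other_bases, other_length)


# Made-up example values (assumption): 15 Mb of targets, 120 Mb elsewhere.
enrichment = relative_coverage(
    target_bases=600_000_000, target_length=15_000_000,   # 40x over targets
    other_bases=1_200_000_000, other_length=120_000_000,  # 10x elsewhere
)
print(f"HRR coverage relative to non-HRR: {enrichment:.1f}x")
```

Computing the same ratio for a library sequenced without adaptive sampling gives the baseline against which the targeted library can be compared.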
Extended Data Fig. 3 Comparison of STAM-seq with WGBS.
The IGV genomic tracks display a target region located adjacent to the left telomere on Chromosome 1. The tracks show the DNA methylation levels observed in the CG, CHG, and CHH contexts of aggregated STAM-seq data compared to previously published WGBS data (ENA accession PRJEB9919).
Extended Data Fig. 4 Comparison of DNA methylation levels between wild type and mutants over transposable elements (TEs) and gene bodies.
a. Metaplots of DNA methylation levels in CG, CHG, and CHH contexts over TEs in wild-type and mutant samples. b. Metaplots of DNA methylation levels in CG, CHG, and CHH contexts over gene bodies in wild-type and mutant samples.
Extended Data Fig. 5 Comparison of STAM-seq profiles between replicates.
a. Correlation of the STAM-seq accessibility signal between replicates over promoters. Shown is the average 6mA ratio over the TSS ± 200. b-d. Correlation of STAM-seq DNA methylation levels between replicates. Methylation levels for CG, CHG, and CHH are shown in panels b, c, and d, respectively.
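A replicate comparison of this kind can be sketched as a correlation between per-region values from two replicates. The caption does not state which coefficient was used, so Pearson's r is an assumption here, and the input values are invented for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical average 6mA ratios per promoter in two replicates (assumption).
rep1 = [0.10, 0.25, 0.40, 0.05, 0.30]
rep2 = [0.12, 0.22, 0.43, 0.07, 0.28]
print(f"Pearson r between replicates: {pearson_r(rep1, rep2):.3f}")
```

The same function applies unchanged to per-region CG, CHG or CHH methylation levels.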
Extended Data Fig. 6 Strand-specific analysis of accessibility for genic regions.
a, b. Strand-specific view of the chromatin accessibility and DNA methylation signals at a genic region. Only reads that fully span the region are shown. The reads in each panel are arranged from top to bottom in order of increasing chromatin accessibility. c. Metaplots illustrating the chromatin accessibility signal over genes from different strands. Blue corresponds to the forward strand (5′→3′) of the reference genome and red to the reverse strand (3′→5′).
Extended Data Fig. 7 Strand-specific analysis for centromeric region.
a. Strand-specific view of CG methylation for the centromeric region. The reads in each panel are arranged from top to bottom in order of decreasing CG methylation. b. Strand-specific view of CHH methylation for the centromeric region. The reads in each panel are arranged from top to bottom in order of decreasing CHH methylation. c. Strand-specific view of the chromatin accessibility signal for the centromeric region. Only reads that fully span the region are shown. The inaccessible reads (accessibility signal = 0) are highlighted by brackets. The reads in each panel are arranged from top to bottom in order of increasing chromatin accessibility. d. Distribution of the accessibility signal for the example depicted in (c). The dotted line marks the division between accessible and inaccessible reads, with the inaccessible reads to its left.
Extended Data Fig. 8 Epigenetic patterns over the 45S rRNA genes.
a. Metaplots of accessibility (top) and DNA methylation (bottom) over the 45S rRNA genes. A schematic of the 45S rRNA gene region is shown at the top. b. Metaplots of accessibility (top) and DNA methylation (bottom) at the transcription initiation site (TATATAGGGGG, +1 is underlined) of the 45S rRNA genes. The accessibility signal reveals the nucleosome positioning pattern.
Extended Data Fig. 9 Epigenetic patterns between different 45S rDNA classes.
a–d. Metaplots of accessibility (a), CG methylation (b), CHG methylation (c), and CHH methylation (d) in different 45S rRNA gene variants.
Extended Data Fig. 10 DNA methylation level of the three consecutive cytosines in the telomeric repeat.
The methylation level was calculated as the ratio of 5-methylcytosine (5mC) to total cytosine (C) at these positions.
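The ratio described above can be illustrated with a small sketch that tallies, for each of the three cytosine positions in the repeat, the number of 5mC calls over the total number of cytosine observations. The input layout (one `(position, is_5mC)` pair per observed cytosine) and the example calls are assumptions made for illustration; they do not reflect the authors' file formats or results.

```python
from collections import Counter


def methylation_level_by_position(calls):
    """Return {position: 5mC count / total C count} for per-cytosine calls.

    `calls` is an iterable of (position, is_5mC) pairs, where position is the
    index (1, 2 or 3) of the cytosine within the telomeric repeat.
    """
    methylated = Counter()
    total = Counter()
    for position, is_5mc in calls:
        total[position] += 1
        if is_5mc:
            methylated[position] += 1
    return {pos: methylated[pos] / total[pos] for pos in sorted(total)}


# Invented calls (assumption): four observations at C1, two each at C2 and C3.
example_calls = [
    (1, True), (1, False), (1, False), (1, False),
    (2, False), (2, False),
    (3, True), (3, True),
]
levels = methylation_level_by_position(example_calls)
for pos, level in levels.items():
    print(f"C{pos}: {level:.2f}")
```

Positions with no observations are simply absent from the result, which avoids a division by zero.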
Supplementary information
Supplementary Information
Supplementary Tables 1 and 2 and Supplementary Figs. 1–10.
Source data
Source Data Fig. 3
Statistical source data.
Source Data Fig. 5
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mo, W., Shu, Y., Liu, B. et al. Single-molecule targeted accessibility and methylation sequencing of centromeres, telomeres and rDNAs in Arabidopsis. Nat. Plants 9, 1439–1450 (2023). https://doi.org/10.1038/s41477-023-01498-7
DOI: https://doi.org/10.1038/s41477-023-01498-7