Abstract
Probing epigenetic features on DNA has tremendous potential to advance our understanding of the phased epigenome. In this study, we use nanopore sequencing to evaluate CpG methylation and chromatin accessibility simultaneously on long strands of DNA by applying GpC methyltransferase to exogenously label open chromatin. We performed nanopore sequencing of nucleosome occupancy and methylome (nanoNOMe) on four human cell lines (GM12878, MCF-10A, MCF-7 and MDA-MB-231). The single-molecule resolution allows footprinting of protein and nucleosome binding, and determination of the combinatorial promoter epigenetic signature on individual molecules. Long-read sequencing makes it possible to robustly assign reads to haplotypes, allowing us to generate a fully phased human epigenome, consisting of chromosome-level allele-specific profiles of CpG methylation and chromatin accessibility. We further apply this to a breast cancer model to evaluate differential methylation and accessibility between cancerous and noncancerous cells.
Data availability
NanoNOMe data for GM12878, MCF-10A, MCF-7 and MDA-MB-231 are available at National Center for Biotechnology Information Bioproject ID PRJNA510783 (http://www.ncbi.nlm.nih.gov/bioproject/510783). Processed single-read data in select regions are deposited in Zenodo (https://zenodo.org/record/3969567) and processed methylation frequency files are available in GEO accession GSE155791.
Code availability
Source code for analysis is available at https://github.com/timplab/nanoNOMe.
Acknowledgements
This study was supported by the National Human Genome Research Institute (project no. 5R01HG009190).
Author information
Contributions
I.L. and W.T. conceived the study. I.L., T.G. and N.S. acquired data. I.L., R.R., T.G., M.M., A.G., F.J.S., K.D.H., J.T.S. and W.T. analyzed and interpreted data. I.L. and W.T. wrote the paper.
Ethics declarations
Competing interests
W.T. has two patents (8,748,091 and 8,394,584) licensed to Oxford Nanopore Technologies. I.L., T.G., N.S., F.J.S., J.T.S. and W.T. have received travel funds to speak at symposia organized by Oxford Nanopore Technologies. J.T.S. received research funding from Oxford Nanopore Technologies.
Additional information
Peer review information Lei Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team. Nature Methods thanks Jeff Vierstra and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Assessment of CpG and GpC dual methylation calling.
a, Nanopore current-level shifts as a function of where the methylated cytosine falls within a 6-mer (n = 256 unique 6-mers for each group), showing that nanopore sequencing can distinguish cytosine methylation in CpG and GpC contexts. Data are presented as median values, interquartile range (IQR), and 1.5× IQR. b, Validation of the methylation caller by measuring called methylation frequencies in samples treated with methyltransferases.
Extended Data Fig. 2 Bulk NanoNOMe profiles at CTCF binding sites.
Metaplots of a, methylation and b, accessibility as a function of distance to CTCF binding motifs, showing close agreement between nanoNOMe, WGBS, and MNase-seq.
Extended Data Fig. 3 Pairwise Comparison of Methylation and Accessibility at Gene Promoters.
Pairwise scatter plot of average CpG methylation versus GpC accessibility for 400 bp regions centered at each gene TSS, colored by gene expression quartile.
Extended Data Fig. 4 GpC accessibility kernel estimation on single reads.
GpC methylation calls were smoothed using a Gaussian kernel estimator. a, Distributions of the lengths of accessible and inaccessible runs and b, metaplot of accessibility near CTCF binding sites, before and after smoothing. c, Example read-level plot of accessibility in a 2 kb region around a CTCF binding site.
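The smoothing step in this legend can be illustrated with a minimal sketch. It assumes per-read GpC calls are available as sorted positions with binary calls, and it uses a Nadaraya-Watson style Gaussian kernel estimate. The function names, the bandwidth and the 0.5 threshold are hypothetical choices for illustration, not values or code taken from the study.

```python
import math


def smooth_gpc_calls(positions, calls, bandwidth=20.0):
    """Gaussian kernel estimate of accessibility at each GpC site on one read.

    positions: sorted genomic coordinates of GpC sites on the read
    calls: 1 if the GpC was called methylated (accessible), else 0
    bandwidth: kernel standard deviation in bp (hypothetical value)
    """
    smoothed = []
    for x in positions:
        weights = [math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions]
        total = sum(weights)  # always >= 1 because the site weights itself
        smoothed.append(sum(w * c for w, c in zip(weights, calls)) / total)
    return smoothed


def call_runs(positions, smoothed, threshold=0.5):
    """Collapse smoothed values into runs of (state, start, end).

    threshold is a hypothetical cutoff on the smoothed accessibility.
    """
    runs = []
    for pos, value in zip(positions, smoothed):
        state = "accessible" if value >= threshold else "inaccessible"
        if runs and runs[-1][0] == state:
            runs[-1] = (state, runs[-1][1], pos)  # extend the current run
        else:
            runs.append((state, pos, pos))  # start a new run
    return runs
```

With this sketch, an isolated unmethylated call inside an otherwise accessible stretch is absorbed into the surrounding run, which is the kind of noise reduction a kernel estimator is meant to provide before run lengths are measured.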
Extended Data Fig. 5 Single-read epigenetic assessment on CTCF binding sites.
a, Heatmaps of the lengths of runs of accessible chromatin calls on individual reads with respect to distance from CTCF binding sites, separated by the presence of ChIP-seq peaks. b, Density distributions of inaccessible runs at the CTCF binding sites. Sites without CTCF binding have long inaccessible runs, suggesting nucleosome binding, whereas sites with CTCF binding have short inaccessible runs (sub-nucleosomal footprints), suggesting CTCF binding. c, Inaccessible runs were classified as either sub-nucleosomal or nucleosome binding according to their lengths, using Gaussian mixture models.
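The length-based classification described in panel c can be sketched as a two-component, one-dimensional Gaussian mixture fitted by expectation-maximization. This is an illustrative sketch only: the number of components, the initialization, the iteration count and all function names are assumptions, and the study may have used a different fitting procedure.

```python
import math
import statistics


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(lengths, n_iter=100, min_sd=1.0):
    """Fit a two-component 1D Gaussian mixture by EM.

    Returns (weights, means, sds). Initialization at the minimum and
    maximum run length is a hypothetical, simple choice.
    """
    means = [float(min(lengths)), float(max(lengths))]
    spread = max(statistics.pstdev(lengths), min_sd)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each run length
        resp = []
        for x in lengths:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(lengths)
            means[k] = sum(r[k] * x for r, x in zip(resp, lengths)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, lengths)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
    return weights, means, sds


def classify_run(length, weights, means, sds):
    """Label a run by the component with the higher weighted density."""
    short, long_ = (0, 1) if means[0] <= means[1] else (1, 0)
    p_short = weights[short] * normal_pdf(length, means[short], sds[short])
    p_long = weights[long_] * normal_pdf(length, means[long_], sds[long_])
    return "sub-nucleosomal" if p_short >= p_long else "nucleosomal"
```

In this sketch the component with the smaller mean is interpreted as the sub-nucleosomal footprint class and the one with the larger mean as the nucleosome class, mirroring the short-versus-long distinction drawn in panel b.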
Extended Data Fig. 6 CTCF binding classification.
Single-read a, methylation and b, accessibility plots at a CTCF binding motif, clustered by the presence of a sub-nucleosomal footprint at the binding motif, which is taken as a predicted CTCF protein binding event.
Extended Data Fig. 7 Comparison of protein binding prediction with ChIP-seq.
a, The fractions of CTCF-bound reads, determined by sub-nucleosomal footprints, compared with ChIP-seq coverage enrichment per CTCF binding motif, showing that the ChIP-seq signal tends to increase with the CTCF binding fraction. b, Distributions of these fractions for binding motifs with ChIP-seq peaks versus those without peaks, showing that sites with ChIP-seq peaks have higher fractions of CTCF binding. Data are presented as median values, interquartile range (IQR), and 1.5× IQR, as well as density distributions.
Extended Data Fig. 8 Haplotype phasing results on GM12878 nanoNOMe data.
a, The number of reads that could be assigned to the maternal or paternal haplotype based on the presence of heterozygous SNVs in the read, showing that 65% of reads could be phased. b, The fraction of each chromosome that could be phased (the fraction with >10× coverage on each allele after phasing), showing that, on average, 86% of the genome could be phased.
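The idea of assigning reads to a parental haplotype from the heterozygous SNVs they overlap can be sketched as a simple vote. The data structures, the function names and the tie-handling rule below are hypothetical and chosen for illustration. They are not the phasing procedure used in the study, which would also need to account for base-call errors.

```python
def phase_read(read_alleles, phased_snvs, min_votes=1):
    """Assign one read to a haplotype by majority vote over heterozygous SNVs.

    read_alleles: {position: base observed on the read}
    phased_snvs: {position: (maternal_base, paternal_base)}
    min_votes: minimum supporting SNVs required (hypothetical parameter)
    """
    maternal = paternal = 0
    for pos, base in read_alleles.items():
        if pos not in phased_snvs:
            continue  # position is not a known heterozygous SNV
        mat_base, pat_base = phased_snvs[pos]
        if base == mat_base:
            maternal += 1
        elif base == pat_base:
            paternal += 1
    if maternal == paternal or max(maternal, paternal) < min_votes:
        return "unphased"  # no informative SNV, or a tie
    return "maternal" if maternal > paternal else "paternal"


def phased_fraction(reads, phased_snvs):
    """Fraction of reads that receive a maternal or paternal label."""
    labels = [phase_read(r, phased_snvs) for r in reads]
    return sum(label != "unphased" for label in labels) / len(labels)
```

Reads that overlap no heterozygous SNV stay unphased in this sketch, which is why longer reads, spanning more variant sites, raise the fraction of reads that can be assigned.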
Extended Data Fig. 9 X-chromosome inactivation promoter comparisons.
Methylation and accessibility in 500 bp and 100 bp windows, respectively, centered at the TSS were compared between maternal and paternal alleles (N = number of genes in the group). a, Boxplots of the distributions, compared using a one-sided Wilcoxon rank-sum test (data are presented as median values, interquartile range (IQR), and 1.5× IQR; CpG XCI Pat > Mat p-value = 0, GpC XCI Mat > Pat p-value = 1.9e-229). b, Density plots of the difference in methylation frequencies between the two alleles.
Extended Data Fig. 10 Differentially methylated and differentially accessible regions between alleles in GM12878.
Methylation was compared between the two alleles across the genome to find regions of significant difference, which were tested using a one-sided Fisher’s exact test. Accessibility peaks were compared by 1) finding peaks of accessibility on each allele separately, 2) selecting peaks that occur exclusively in one allele, and 3) comparing the accessibility frequency between the two alleles in these candidate regions. a, Volcano plots of the detected DMRs and DARs, with dashed lines representing the thresholds for considering a region a DMR or DAR. b, Using existing ATAC-seq data (GEO accession GSM1155957), we compared allele-specific accessibility in ATAC-seq peaks that overlapped a heterozygous SNP. In the 321 DARs detectable via ATAC-seq, we saw high correlation with nanoNOMe (r = 0.76).
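A one-sided Fisher’s exact test of the kind named in this legend can be sketched on a 2 × 2 table of methylated and unmethylated call counts for the two alleles. The function name and the table layout are assumptions made for illustration. How calls were aggregated into regions in the study is not specified here.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2 x 2 table [[a, b], [c, d]].

    Hypothetical layout: a, b = methylated and unmethylated calls on
    allele 1; c, d = methylated and unmethylated calls on allele 2.
    Returns P(X >= a) under the hypergeometric null, that is, the
    probability of seeing at least this many methylated calls on allele 1.
    """
    row1 = a + b   # total calls on allele 1
    col1 = a + c   # total methylated calls
    n = a + b + c + d
    denom = comb(n, row1)
    upper = min(row1, col1)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    numer = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return numer / denom
```

A small p-value from this sketch indicates that allele 1 carries more methylated calls than expected if methylation were independent of allele. Swapping the two rows tests the opposite direction.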
Supplementary information
Supplementary Information
Supplementary figures, tables and descriptions of Supplementary Data.
Supplementary Data 1
nanoNOMe accessibility peaks in GM12878
Supplementary Data 2
CTCF-binding sites in GM12878
Supplementary Data 3
Estimated protein-bound regions near a subset of gene TSS in GM12878
Supplementary Data 4
Protein binding stratified by promoter epigenetic signatures
Supplementary Data 5
Allele-specific DMRs and DARs in GM12878
Supplementary Data 6
Gene promoter regions with allele-specific DMRs and DARs in GM12878
Supplementary Data 7
Heterozygous SVs in GM12878
Supplementary Data 8
DMRs and DARs in MCF-7 and MDA-MB-231 in comparison to MCF-10A
Supplementary Data 9
Summary of DMRs and DARs with respect to genomic contexts and SVs
Supplementary Data 10
SVs in MCF-10A, MCF-7, and MDA-MB-231
Supplementary Data 11
Promoter epigenetic signatures of differentially expressed genes in MCF-10A, MCF-7 and MDA-MB-231
Supplementary Data 12
Protein-binding regions near differentially expressed genes in MCF-10A, MCF-7 and MDA-MB-231
About this article
Cite this article
Lee, I., Razaghi, R., Gilpatrick, T. et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Nat Methods 17, 1191–1199 (2020). https://doi.org/10.1038/s41592-020-01000-7
DOI: https://doi.org/10.1038/s41592-020-01000-7