Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing

Abstract

Probing epigenetic features on DNA has tremendous potential to advance our understanding of the phased epigenome. In this study, we use nanopore sequencing to evaluate CpG methylation and chromatin accessibility simultaneously on long strands of DNA by applying GpC methyltransferase to exogenously label open chromatin. We performed nanopore sequencing of nucleosome occupancy and methylome (nanoNOMe) on four human cell lines (GM12878, MCF-10A, MCF-7 and MDA-MB-231). The single-molecule resolution allows footprinting of protein and nucleosome binding, and determination of the combinatorial promoter epigenetic signature on individual molecules. Long-read sequencing makes it possible to robustly assign reads to haplotypes, allowing us to generate a fully phased human epigenome, consisting of chromosome-level allele-specific profiles of CpG methylation and chromatin accessibility. We further apply this to a breast cancer model to evaluate differential methylation and accessibility between cancerous and noncancerous cells.

Fig. 1: Overview and assessment of nanoNOMe.
Fig. 2: DNA accessibility assessment on TSSs and CTCF-binding sites using individual reads.
Fig. 3: Single-read epigenetic analysis on gene promoters.
Fig. 4: Allele-specific methylation and accessibility in GM12878.
Fig. 5: Comparative epigenomic analysis of breast cancer model.

Data availability

NanoNOMe data for GM12878, MCF-10A, MCF-7 and MDA-MB-231 are available under National Center for Biotechnology Information BioProject ID PRJNA510783 (http://www.ncbi.nlm.nih.gov/bioproject/510783). Processed single-read data for select regions are deposited in Zenodo (https://zenodo.org/record/3969567), and processed methylation frequency files are available under GEO accession GSE155791.

Code availability

Source code for analysis is available at https://github.com/timplab/nanoNOMe.

Acknowledgements

This study was supported by the National Human Genome Research Institute (project no. 5R01HG009190).

Author information

Contributions

I.L. and W.T. conceived the study. I.L., T.G. and N.S. acquired data. I.L., R.R., T.G., M.M., A.G., F.J.S., K.D.H., J.T.S. and W.T. analyzed and interpreted data. I.L. and W.T. wrote the paper.

Corresponding author

Correspondence to Winston Timp.

Ethics declarations

Competing interests

W.T. has two patents (8,748,091 and 8,394,584) licensed to Oxford Nanopore Technologies. I.L., T.G., N.S., F.S., J.T.S. and W.T. have received travel funds to speak at symposia organized by Oxford Nanopore Technologies. J.T.S. received research funding from Oxford Nanopore Technologies.

Additional information

Peer review information Lei Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team. Nature Methods thanks Jeff Vierstra and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Assessment of CpG and GpC dual methylation calling.

The ability of nanopore sequencing to distinguish cytosine methylation in CpG and GpC contexts is shown by a, examining current-level shifts depending on the placement of the methylation within a 6-mer (n = 256 unique 6-mers for each group). Data are presented as median values, interquartile range (IQR), and 1.5X IQR. The performance of the methylation caller was validated by b, measuring called methylation frequencies in samples treated with methyltransferases.

Extended Data Fig. 2 Bulk nanoNOMe profiles at CTCF binding sites.

Metaplots of a, methylation and b, accessibility as a function of distance to CTCF binding motifs agree closely among nanoNOMe, WGBS and MNase-seq.

Extended Data Fig. 3 Pairwise Comparison of Methylation and Accessibility at Gene Promoters.

Pairwise scatter plot of average CpG methylation to GpC accessibility for 400 bp regions centered at each gene TSS, colored by its gene expression quartile.

Extended Data Fig. 4 GpC accessibility kernel estimation on single reads.

GpC methylation calls were smoothed using a Gaussian kernel estimator. a, Distributions of the lengths of accessible and inaccessible runs and b, metaplot of accessibility near CTCF binding sites before and after smoothing, along with c, an example read-level plot of accessibility in a 2 kb region around a CTCF binding site.
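
The smoothing and run-calling idea in this caption can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson style Gaussian kernel average over the binary GpC calls of a single read, followed by a fixed threshold; the function names, the 10 bp bandwidth and the 0.5 threshold are hypothetical choices for illustration and are not taken from the paper or its source code.

```python
import math

def gaussian_smooth(positions, calls, bandwidth=10.0):
    """Kernel-weighted average of binary calls (1 = methylated GpC) at each site.

    The bandwidth (in bp) is an assumed value, not the one used in the paper.
    """
    smoothed = []
    for x in positions:
        weights = [math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions]
        total = sum(weights)  # always >= 1 because the site weights itself by 1
        smoothed.append(sum(w * c for w, c in zip(weights, calls)) / total)
    return smoothed

def accessibility_runs(positions, smoothed, threshold=0.5):
    """Merge consecutive sites sharing a state into (start, end, accessible) runs."""
    runs = []
    for pos, value in zip(positions, smoothed):
        state = value >= threshold
        if runs and runs[-1][2] == state:
            runs[-1] = (runs[-1][0], pos, state)  # extend the current run
        else:
            runs.append((pos, pos, state))        # start a new run
    return runs

# Toy read: GpC coordinates (bp) and binary methylation calls.
positions = [0, 5, 12, 20, 150, 160, 171, 180, 300, 310]
calls = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]

smoothed = gaussian_smooth(positions, calls)
runs = accessibility_runs(positions, smoothed)
for start, end, accessible in runs:
    print(start, end, "accessible" if accessible else "inaccessible")
```

In this toy read the isolated unmethylated call at position 12 is absorbed into the surrounding accessible run, which is the kind of noise suppression that smoothing is meant to provide before run lengths are measured.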

Extended Data Fig. 5 Single-read epigenetic assessment on CTCF binding sites.

a, Heatmaps of the lengths of runs of accessible chromatin calls on individual reads with respect to distance from CTCF binding sites, separated based on the presence of ChIP-seq peaks. b, Density distributions of inaccessible runs at the CTCF binding sites, showing that sites without CTCF binding have long inaccessible runs, suggesting nucleosome binding, while those with CTCF binding have short inaccessible runs (sub-nucleosomal footprints), suggesting CTCF binding. c, Inaccessible runs were classified as either sub-nucleosomal or nucleosome binding depending on their lengths based on mixed Gaussian models.
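
The length-based classification in panel c can be sketched with a two-component Gaussian mixture fitted by expectation-maximization. This is only an illustration under stated assumptions: it is a hand-written one-dimensional EM rather than the software used in the study, the run lengths are invented toy values, and the names `fit_two_gaussians` and `classify_run` are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_gaussians(lengths, iterations=50, min_sd=1.0):
    """Fit a two-component 1D Gaussian mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean.
    """
    ordered = sorted(lengths)
    half = len(ordered) // 2
    # Initialise each component from one half of the sorted run lengths.
    params = []
    for group in (ordered[:half], ordered[half:]):
        mean = sum(group) / len(group)
        var = sum((v - mean) ** 2 for v in group) / len(group)
        params.append((len(group) / len(ordered), mean, max(math.sqrt(var), min_sd)))
    for _ in range(iterations):
        # E-step: posterior probability that each run belongs to component 0.
        resp = []
        for x in lengths:
            d0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            d1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            total = d0 + d1
            resp.append(d0 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and sd of both components.
        new_params = []
        for r in (resp, [1 - v for v in resp]):
            n = max(sum(r), 1e-9)  # guard against an empty component
            mean = sum(w * x for w, x in zip(r, lengths)) / n
            var = sum(w * (x - mean) ** 2 for w, x in zip(r, lengths)) / n
            new_params.append((n / len(lengths), mean, max(math.sqrt(var), min_sd)))
        params = new_params
    return sorted(params, key=lambda p: p[1])

def classify_run(length, params):
    """Label a run by the component with the larger weighted density."""
    short, long_ = params
    d_short = short[0] * normal_pdf(length, short[1], short[2])
    d_long = long_[0] * normal_pdf(length, long_[1], long_[2])
    return "sub-nucleosomal" if d_short >= d_long else "nucleosome"

# Toy inaccessible run lengths in bp (invented for illustration).
run_lengths = [30, 40, 45, 50, 55, 60, 140, 147, 150, 155, 160, 165]
params = fit_two_gaussians(run_lengths)
labels = [classify_run(length, params) for length in run_lengths]
```

Sorting the fitted components by mean keeps the labels stable regardless of which component EM happens to assign to the short runs.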

Extended Data Fig. 6 CTCF binding classification.

Single-read a, methylation and b, accessibility plots at a CTCF binding motif, clustered by the presence of a sub-nucleosomal footprint at the binding motif, which is taken as a predicted CTCF protein binding event.

Extended Data Fig. 7 Comparison of protein binding prediction with ChIP-seq.

a, The fractions of CTCF-bound reads determined by sub-nucleosomal footprints were compared with ChIP-seq coverage enrichments per CTCF binding motif, showing that the ChIP-seq signal tends to increase with CTCF binding fraction, and b, the distributions of the fractions were stratified into binding motifs with ChIP-seq peaks and those without peaks, showing that sites with ChIP-seq peaks have higher fractions of CTCF binding. Data are presented as median values, interquartile range (IQR), and 1.5X IQR, as well as density distributions.

Extended Data Fig. 8 Haplotype phasing results on GM12878 nanoNOMe data.

a, The number of reads that could be assigned to the maternal or paternal haplotype based on the presence of a heterozygous SNV in the read, showing that 65% of reads could be phased. b, The fraction of each chromosome that could be phased (the fraction that had >10x coverage on each allele after phasing), showing that, on average, 86% of the genome could be phased.
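
The read-assignment step described in panel a can be pictured with a small sketch. It assumes a simple majority vote over the phased heterozygous SNVs that a read covers; this is not the authors' phasing procedure, and `assign_haplotype`, its `min_margin` parameter and the toy variants are hypothetical.

```python
def assign_haplotype(read_alleles, phased_snvs, min_margin=1):
    """Assign a read to a parental haplotype by majority vote.

    read_alleles: {position: base observed on the read}
    phased_snvs:  {position: (maternal_base, paternal_base)} for heterozygous SNVs
    min_margin:   assumed minimum vote difference required to call a haplotype
    """
    maternal = paternal = 0
    for pos, base in read_alleles.items():
        if pos not in phased_snvs:
            continue  # position is not a known heterozygous SNV
        mat_base, pat_base = phased_snvs[pos]
        if base == mat_base:
            maternal += 1
        elif base == pat_base:
            paternal += 1
        # bases matching neither allele are treated as errors and ignored
    if maternal - paternal >= min_margin:
        return "maternal"
    if paternal - maternal >= min_margin:
        return "paternal"
    return "unphased"

# Toy phased variants and reads (invented for illustration).
phased_snvs = {100: ("A", "G"), 250: ("C", "T"), 400: ("G", "A")}
read_1 = {100: "A", 250: "C"}            # two maternal alleles
read_2 = {100: "G", 250: "C", 400: "A"}  # two paternal alleles, one maternal
read_3 = {500: "T"}                      # covers no heterozygous SNV

results = [assign_haplotype(r, phased_snvs) for r in (read_1, read_2, read_3)]
print(results)
```

A read such as `read_3`, which covers no heterozygous SNV, stays unphased; reads of this kind are the ones that cannot be assigned to either parent.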

Extended Data Fig. 9 X-chromosome inactivation promoter comparisons.

Methylation and accessibility in 500 bp and 100 bp windows, respectively, centered at TSS compared between maternal and paternal alleles (N = number of genes in the group), a, by plotting and comparing the distributions using boxplots and one-sided Wilcoxon rank-sum test (Data are presented as median values, interquartile range (IQR), and 1.5X IQR, CpG XCI Pat > Mat p-value = 0, GpC XCI Mat > Pat p-value = 1.9e-229), and b, by density plots of the difference in methylation frequencies between the two alleles.

Extended Data Fig. 10 Differentially methylated and differentially accessible regions between alleles in GM12878.

Methylation was compared between the two alleles across the genome to find regions of significant difference, which were tested using a one-sided Fisher’s exact test. Accessibility peaks were compared by 1) finding peaks of accessibility on each allele separately, 2) selecting peaks that occur exclusively in one allele and 3) comparing the accessibility frequency between the two alleles in these candidate regions. a, The detected DMRs and DARs shown as volcano plots, with dashed lines representing the thresholds for considering a region a DMR or DAR. b, Using existing ATAC-seq data (GEO accession GSM1155957), we compared allele-specific accessibility in ATAC-seq peaks that overlapped a heterozygous SNP. In the 321 DARs detectable via ATAC-seq, we saw high correlation with nanoNOMe (r = 0.76).
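
The one-sided Fisher’s exact test named in this caption can be written out directly from the hypergeometric distribution. The sketch below assumes a 2x2 table of methylated and unmethylated call counts per allele for one region; how the authors aggregated calls into such a table is not stated here, so the table layout, the function name `fisher_exact_greater` and the counts are illustrative assumptions.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: allele 1, allele 2. Columns: methylated, unmethylated calls.
    Returns P(X >= a) under the hypergeometric null, i.e. the probability of
    allele 1 carrying at least this many methylated calls by chance.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    upper = min(row1, col1)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    numerator = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return numerator / denominator

# Toy counts (invented): allele 1 has 18 of 20 calls methylated, allele 2 has 4 of 20.
p_region = fisher_exact_greater(18, 2, 4, 16)
print(f"one-sided p-value: {p_region:.3g}")
```

Exact integer arithmetic via `math.comb` (Python 3.8 or later) avoids floating-point underflow in the individual terms, at the cost of speed for very deep coverage.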

Supplementary information

Supplementary Information

Supplementary figures, tables and descriptions of Supplementary Data.

Reporting Summary

Supplementary Data 1

nanoNOMe accessibility peaks in GM12878

Supplementary Data 2

CTCF-binding sites in GM12878

Supplementary Data 3

Estimated protein-bound regions near a subset of gene TSS in GM12878

Supplementary Data 4

Protein binding stratified by promoter epigenetic signatures

Supplementary Data 5

Allele-specific DMRs and DARs in GM12878

Supplementary Data 6

Gene promoter regions with allele-specific DMRs and DARs in GM12878

Supplementary Data 7

Heterozygous SVs in GM12878

Supplementary Data 8

DMRs and DARs in MCF-7 and MDA-MB-231 in comparison to MCF-10A

Supplementary Data 9

Summary of DMRs and DARs with respect to genomic contexts and SVs

Supplementary Data 10

SVs in MCF-10A, MCF-7, and MDA-MB-231

Supplementary Data 11

Promoter epigenetic signatures of differentially expressed genes in MCF-10A, MCF-7 and MDA-MB-231

Supplementary Data 12

Protein-binding regions near differentially expressed genes in MCF-10A, MCF-7 and MDA-MB-231

About this article

Cite this article

Lee, I., Razaghi, R., Gilpatrick, T. et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Nat Methods 17, 1191–1199 (2020). https://doi.org/10.1038/s41592-020-01000-7
