Abstract
Although circulating tumor DNA (ctDNA) assays are increasingly used to inform clinical decisions in cancer care, they have limited ability to identify the transcriptional programs that govern cancer phenotypes and their dynamic changes during the course of disease. To address these limitations, we developed a method for comprehensive epigenomic profiling of cancer from 1 ml of patient plasma. Using an immunoprecipitation-based approach targeting histone modifications and DNA methylation, we measured 1,268 epigenomic profiles in plasma from 433 individuals with one of 15 cancers. Our assay provided a robust proxy for transcriptional activity, allowing us to infer the expression levels of diagnostic markers and drug targets, measure the activity of therapeutically targetable transcription factors and detect epigenetic mechanisms of resistance. This proof-of-concept study in advanced cancers shows how plasma epigenomic profiling has the potential to unlock clinically actionable information that is currently accessible only via direct tissue sampling.
Data availability
BED files containing genomic alignments of all sequenced fragments as well as ChIP-seq peak locations are available through GEO under accession number GSE243474. Due to privacy restrictions regarding genomic data, raw sequencing data can be shared upon reasonable request under a data use agreement. Requests should be directed to the corresponding author at freedman@broadinstitute.org and should receive a response within 2 weeks.
The following public datasets were used: DNase hypersensitivity sites (https://zenodo.org/record/3838751/files/DHS_Index_and_Vocabulary_hg19_WM20190703.txt.gz), TCGA ATAC-seq peak calls (https://api.gdc.cancer.gov/data/116ebba2-d284-485b-9121-faf73ce0a4ec; lifted over to hg19 from hg38), Human Protein Atlas database annotations (https://www.proteinatlas.org/download/proteinatlas.tsv.zip) and the ENCODE list of high-noise regions for exclusion from ChIP-seq analysis (https://github.com/Boyle-Lab/Blacklist/blob/master/lists/hg19-blacklist.v2.bed.gz).
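To illustrate what excluding high-noise regions involves, the following is a minimal sketch that drops peaks overlapping any blacklisted interval. The source does not state how the exclusion was implemented; the function names and toy intervals here are hypothetical, and in practice a tool such as BEDTools (`bedtools intersect -v`) is commonly used for this step.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two BED intervals overlap (BED is 0-based, half-open: [start, end))."""
    return a_start < b_end and b_start < a_end


def drop_blacklisted(peaks, blacklist):
    """Return the peaks that do not overlap any blacklisted interval.

    Both arguments are lists of (chrom, start, end) tuples.
    """
    # Group blacklist intervals by chromosome so each peak is only
    # compared against intervals on its own chromosome.
    by_chrom = {}
    for chrom, start, end in blacklist:
        by_chrom.setdefault(chrom, []).append((start, end))

    return [
        (chrom, start, end)
        for chrom, start, end in peaks
        if not any(overlaps(start, end, s, e) for s, e in by_chrom.get(chrom, []))
    ]


# Hypothetical toy intervals, not taken from the study.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 300), ("chr2", 200, 400)]
kept = drop_blacklisted(peaks, blacklist)
```

Because BED intervals are half-open, the chr2 peak ending at position 200 does not overlap the blacklisted interval starting at 200 and is retained.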
Code availability
Scripts to reproduce analyses from this study are available at https://github.com/Baca-Lab/cfchip_manuscript.
References
Nuzzo, P. V. et al. Detection of renal cell carcinoma using plasma and urine cell-free DNA methylomes. Nat. Med. 26, 1041–1043 (2020).
Berchuck, J. E. et al. Detecting neuroendocrine prostate cancer through tissue-informed cell-free DNA methylation analysis. Clin. Cancer Res. 28, 928–938 (2022).
Snyder, M. W., Kircher, M., Hill, A. J., Daza, R. M. & Shendure, J. Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164, 57–68 (2016).
Doebley, A.-L. et al. A framework for clinical cancer subtyping from nucleosome profiling of cell-free DNA. Nat. Commun. 13, 7475 (2022).
De Sarkar, N. et al. Nucleosome patterns in circulating tumor DNA reveal transcriptional regulation of advanced prostate cancer phenotypes. Cancer Discov. 13, 632–653 (2022).
Cristiano, S. et al. Genome-wide cell-free DNA fragmentation in patients with cancer. Nature 570, 385–389 (2019).
Sadeh, R. et al. ChIP-seq of plasma cell-free nucleosomes identifies gene expression programs of the cells of origin. Nat. Biotechnol. 39, 586–598 (2021).
Vad-Nielsen, J., Meldgaard, P., Sorensen, B. S. & Nielsen, A. L. Cell-free Chromatin Immunoprecipitation (cfChIP) from blood plasma can determine gene-expression in tumors from non-small-cell lung cancer patients. Lung Cancer 147, 244–251 (2020).
Bradner, J. E., Hnisz, D. & Young, R. A. Transcriptional addiction in cancer. Cell 168, 629–643 (2017).
Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).
Tam, W. L. & Weinberg, R. A. The epigenetics of epithelial–mesenchymal plasticity in cancer. Nat. Med. 19, 1438–1449 (2013).
Pomerantz, M. M. et al. Prostate cancer reactivates developmental epigenomic programs during metastatic progression. Nat. Genet. 52, 790–799 (2020).
Ahcene Djaballah, S., Daniel, F., Milani, A., Ricagno, G. & Lonardi, S. HER2 in colorectal cancer: the long and winding road from negative predictive factor to positive actionable target. Am. Soc. Clin. Oncol. Educ. Book 42, 1–14 (2022).
Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).
Kaukonen, D. et al. Analysis of H3K4me3 and H3K27me3 bivalent promotors in HER2+ breast cancer cell lines reveals variations depending on estrogen receptor status and significantly correlates with gene expression. BMC Med. Genomics 13, 92 (2020).
Takeda, D. Y. et al. A somatically acquired enhancer of the androgen receptor is a noncoding driver in advanced prostate cancer. Cell 174, 422–432 (2018).
Baca, S. C. et al. Reprogramming of the FOXA1 cistrome in treatment-emergent neuroendocrine prostate cancer. Nat. Commun. 12, 1979 (2021).
Cejas, P. et al. Subtype heterogeneity and epigenetic convergence in neuroendocrine prostate cancer. Nat. Commun. 12, 5775 (2021).
Loyfer, N. et al. A DNA methylation atlas of normal human cell types. Nature 613, 355–364 (2023).
Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature 584, 244–251 (2020).
Adalsteinsson, V. A. et al. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 8, 1324 (2017).
Yu, G., Wang, L.-G. & He, Q.-Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383 (2015).
Layer, R. M. et al. GIGGLE: a search engine for large-scale integrated genome analysis. Nat. Methods 15, 123–126 (2018).
Uhlén, M. et al. Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Chen, Y. et al. VirusSeq: software to identify viruses and their integration sites using next-generation sequencing of human cancer tissue. Bioinformatics 29, 266–267 (2013).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Zheng, R. et al. Cistrome Data Browser: expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res. 47, D729–D735 (2019).
Acknowledgements
The authors thank M. Merino for providing the HER2 IHC staining. This work is supported by the US Department of Defense (DoD) (awards W81XWH-21-1-0358 and W81XWH-21-1-0299 to S.C.B. and W81XWH-18-2-0056 to Z.S.); the National Cancer Institute (R01 CA137008 to A.N.H.); the Breast Cancer Research Foundation (BCRF-21-159 to Z.S.); Kræftens Bekæmpelse (R281-A16566 and R342-A19788 to Z.S.); Det Frie Forskningsråd Sundhed og Sygdom (7016-00345B to Z.S.); National Institutes of Health P01 CA228696-01A1 to Z.S. and M.L.F.; and the University of Massachusetts Boston–Dana-Farber/Harvard Cancer Center U54 Partnership Grant (UMass Boston: 2 U54 CA156734-12; DF/HCC: 2 U54 CA156732-12). M.L.F. is supported by the Claudia Adams Barr Program for Innovative Cancer Research, the Dana-Farber Cancer Institute Presidential Initiatives Fund, the H.L. Snyder Medical Research Foundation, the Cutler Family Fund for Prevention and Early Detection, the Donahue Family Fund, W81XWH-21-1-0339 and W81XWH-22-1-0951 (DoD) and the Movember PCF Challenge Award. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
S.C.B., J.-H.S. and M.L.F. designed the study. M.A., C.H.C., J.A.D., W.D.F., T.F.G., A.N.H., S.H., M.E.H., K.L.L., N.L., C.M., K.N., M.G.O., H.A.P., M.M.P., A.R., J.R., M.T., S.M.T., P.Y.W. and J.E.B. contributed plasma samples. M.P.D., G.L., T.E.Z., H.S., J.C. and I.M. performed the cfChIP-seq and cfMeDIP-seq experiments. B.F., K.S., S.S., M.D., X.Q., Z.Z., R.L., Y.J. and L.T. analyzed the data, with guidance from S.C.B., Z.S. and H.L. R.M. and T.E.Z. compiled clinical data. S.C.B., T.K.C. and M.L.F. wrote the paper, with input from all authors.
Ethics declarations
Competing interests
S.C.B., T.K.C. and M.L.F. are co-founders and shareholders of Precede Biosciences. J.D. is a consultant for Kymera Therapeutics and has a sponsored research agreement with Kymera Therapeutics. M.T. served on an advisory board for Incyte. A.N.H. reports research support from Amgen, Blueprint Medicines, BridgeBio, Bristol-Myers Squibb, C4 Therapeutics, Eli Lilly, Novartis, Nuvalent, Pfizer, Roche/Genentech and Scorpion Therapeutics and paid consulting for Engine Biosciences, Nuvalent, Oncovalent, TigaTx and Tolremo Therapeutics. J.R. receives research funding from Equillium, Kite/Gilead, Novartis and Oncternal and consults or is on advisory boards for AvroBio, Akron Biotech, Clade Therapeutics, Garuda Therapeutics, LifeVault Bio, Novartis, Smart Immune and TScan Therapeutics. The remaining authors report no competing interests.
Peer review
Peer review information
Nature Medicine thanks the anonymous reviewers for their contribution to the peer review of this work. Primary handling editor: Anna Maria Ranzoni, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Genomic features overlapping cfChIP-seq and cfMeDIP-seq peaks.
(a) Overlaps for the top 1,000 ctDNA-correlated regulatory elements (CREs) by significance are plotted for each assay type. (b) Overlap of the top 1,000 cfMeDIP-seq CREs with CpG islands, shores, and shelves. Random regions matched for chromosome and size are shown for comparison.
Extended Data Fig. 2 Examples of positive and negative ctDNA-correlated regulatory elements (CREs).
Normalized read counts from epigenomic features correlate with ctDNA fraction at CREs. Spearman correlation coefficients and two-sided p-values are indicated.
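As an illustration of the statistic reported in this figure, the following is a minimal sketch of a Spearman rank correlation between ctDNA fraction and normalized signal at a single CRE. The function names and toy values are hypothetical and are not taken from the study; p-values are not computed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values for one regulatory element.
ctdna_fraction = [0.02, 0.10, 0.25, 0.40, 0.60]
normalized_signal = [1.1, 1.9, 2.4, 5.0, 4.2]
rho = spearman(ctdna_fraction, normalized_signal)
```

Because the coefficient depends only on ranks, it captures monotonic association between signal and ctDNA fraction without assuming a linear relationship.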
Extended Data Fig. 3 Classification of cancer plasma based on H3K4me3 cfChIP-seq profiles.
(a) Receiver operating characteristic (ROC) curves for logistic regression-based classification of cancer plasma vs. healthy plasma, using as features the promoter H3K4me3 signal at a set of tissue-specific genes defined in the Human Protein Atlas (HPA) database (ref. 24; Methods). The classifier considered genes that were annotated as ‘tissue enriched’ or ‘tissue enhanced’ as well as ‘Not detected in immune cells’ in the HPA database. AUC, area under the curve. (b) ROC curves for classification of the three cancer types with the most examples in the cohort.
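The area under a ROC curve can be read as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that rank-based calculation follows, assuming scores (for example, predicted probabilities from a logistic regression) are already available. This is not the authors' pipeline, and the labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. cancer), 0 for the negative class.
    scores: classifier outputs, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of a few hundred samples; a sort-based formulation would be preferable for much larger inputs.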
Extended Data Fig. 4 H3K4me3 cfChIP-seq signal at promoters of selected genes of interest.
Promoter H3K4me3 signal is shown at selected genes across N = 202 biologically independent plasma samples stratified by cancer type. Orange indicates cancer types in which the indicated gene is expected to be expressed. Wilcoxon two-sided p-values are indicated for comparison of samples in which expression is expected versus all other samples. For NECTIN4 and ERBB3, signal is compared between healthy volunteer plasma and cancer patient plasma because these genes are expressed across various cancer types. Signal at GAPDH is shown as a control. Lower, middle, and upper hinges indicate 25th, 50th, and 75th percentiles; whiskers extend to 1.5 × the interquartile range (IQR).
Extended Data Fig. 5 Correlation of serum PSA with H3K4me3 cfChIP-seq signal at KLK3.
Correlation of serum PSA with ctDNA content is shown as a comparison. Pearson two-sided p-values are indicated.
Extended Data Fig. 6 Aggregate H3K27ac cfChIP signal at regulatory elements identified by ATAC-seq in tumor tissue.
Signal in cancer plasma (orange) and healthy plasma (gray) is compared at regulatory elements in the corresponding cancer type defined by ATAC-seq in TCGA tumors (ref. 14). Dark lines show the mean signal across all samples in the indicated class. For comparison, signal at ‘common’ REs is shown, which include 10,000 regulatory elements with DNase hypersensitivity across most or all cell types (ref. 20; Methods). Boxplots indicate area under the curve for the aggregate H3K27ac profile for each sample. Lower, middle, and upper hinges indicate 25th, 50th, and 75th percentiles; whiskers extend to 1.5 × the interquartile range (IQR). Wilcoxon test two-sided p-values are indicated for comparison of healthy vs cancer samples.
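One way to picture the per-sample summary in the boxplots is to average the signal across sites at each position relative to the site center and then integrate the resulting profile. The sketch below does this with the trapezoidal rule. The bin width, the absence of baseline subtraction and the toy values are all assumptions made for illustration; the source does not specify these details here.

```python
def aggregate_profile(site_profiles):
    """Mean signal per positional bin across sites.

    site_profiles: list of equal-length lists, one per regulatory element,
    each holding signal in bins centered on that element.
    """
    n_bins = len(site_profiles[0])
    if any(len(p) != n_bins for p in site_profiles):
        raise ValueError("all site profiles must have the same number of bins")
    n_sites = len(site_profiles)
    return [sum(p[b] for p in site_profiles) / n_sites for b in range(n_bins)]


def trapezoid_auc(profile, bin_width=1.0):
    """Area under a binned profile using the trapezoidal rule."""
    return sum((a + b) / 2 * bin_width for a, b in zip(profile, profile[1:]))


# Hypothetical signal at two sites for one plasma sample, five bins each.
sites = [
    [0.0, 2.0, 4.0, 2.0, 0.0],
    [0.0, 1.0, 2.0, 1.0, 0.0],
]
profile = aggregate_profile(sites)
area = trapezoid_auc(profile, bin_width=100)  # assumed 100-bp bins
```

Computing one such area per sample yields a single number that can be compared between cancer and healthy plasma, for example with a Wilcoxon test.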
Extended Data Fig. 8 Aggregate H3K27ac cfChIP-seq signal at HIF2α binding sites in renal cell carcinoma (RCC) and at AR binding sites in prostate cancer.
Healthy volunteer samples are shown for comparison. Boxplots indicate area under the curve for the aggregate H3K27ac profile for each sample. Lower, middle, and upper hinges indicate 25th, 50th, and 75th percentiles; whiskers extend to 1.5 × the interquartile range (IQR). Wilcoxon test two-sided p-values are indicated for comparison of healthy vs cancer samples.
Extended Data Fig. 9 H3K27ac cfChIP-seq distinguishes prostate cancer subtype-specific FOXA1 binding sites.
(a) H3K4me3 cfChIP-seq signal at the FOXA1 promoter in prostate adenocarcinoma (PRAD) vs. neuroendocrine prostate cancer (NEPC) for N = 25 biologically independent samples. (b) Boxplots indicate aggregate signal at the indicated sites for the indicated epigenetic features for N = 29 biologically independent samples. NEPC-FOXA1 and PRAD-FOXA1 indicate FOXA1 binding sites that are preferentially bound in NEPC relative to PRAD and in PRAD relative to NEPC, respectively, as described previously (ref. 17). Aggregate signal at differential FOXA1 binding sites for each sample is normalized to signal at shared FOXA1 binding sites that are common to NEPC and PRAD. Wilcoxon test two-sided p-values are indicated. Boxplots indicate area under the curve for the aggregate cfChIP-seq profile for each sample. Lower, middle, and upper hinges indicate 25th, 50th, and 75th percentiles; whiskers extend to 1.5 × the interquartile range (IQR).
Extended Data Fig. 10 Aggregate H3K27ac cfChIP signal at neuroendocrine-enriched regulatory elements.
Dark lines show the mean signal across all samples in the indicated class. ‘NE’ indicates samples with neuroendocrine differentiation (SCLC, NEPC, or Merkel cell carcinoma). Wilcoxon test two-sided p-value is indicated. Boxplots indicate area under the curve for the aggregate cfChIP-seq profile for each sample. Lower, middle, and upper hinges indicate 25th, 50th, and 75th percentiles; whiskers extend to 1.5 × the interquartile range (IQR).
Supplementary information
Supplementary Table
Excel spreadsheet containing Supplementary Tables 1 and 2.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Baca, S.C., Seo, JH., Davidsohn, M.P. et al. Liquid biopsy epigenomic profiling for cancer subtyping. Nat Med 29, 2737–2741 (2023). https://doi.org/10.1038/s41591-023-02605-z