ChIP-seq of plasma cell-free nucleosomes identifies gene expression programs of the cells of origin

Abstract

Cell-free DNA (cfDNA) in human plasma provides access to molecular information about the pathological processes in the organs or tumors from which it originates. These DNA fragments are derived from fragmented chromatin in dying cells and retain some of the cell-of-origin histone modifications. In this study, we applied chromatin immunoprecipitation of cell-free nucleosomes carrying active chromatin modifications followed by sequencing (cfChIP-seq) to 268 human samples. In healthy donors, we identified bone marrow megakaryocytes, but not erythroblasts, as major contributors to the cfDNA pool. In patients with a range of liver diseases, we showed that we can identify pathology-related changes in hepatocyte transcriptional programs. In patients with metastatic colorectal carcinoma, we detected clinically relevant and patient-specific information, including transcriptionally active human epidermal growth factor receptor 2 (HER2) amplifications. Altogether, cfChIP-seq, using low sequencing depth, provides systemic and genome-wide information and can inform diagnosis and facilitate interrogation of physiological and pathological processes using blood samples.

Fig. 1: Chromatin immunoprecipitation from plasma.
Fig. 2: cfChIP-seq of multiple marks is informative on gene expression.
Fig. 3: H3K4me3 cfChIP-seq signal is correlated with expression levels.
Fig. 4: cfChIP-seq identifies cell type and program-specific expression patterns.
Fig. 5: cfChIP-seq detects changes in liver-specific transcriptional programs.
Fig. 6: cfChIP-seq identifies molecular heterogeneity in patients with CRC.

Data availability

Data collected in this study were deposited in the European Genome-phenome Archive (EMBL-EBI) repository. BED files and browser tracks are available in the Zenodo repository: https://doi.org/10.5281/zenodo.3967253.

Browser tracks can be viewed in the UCSC Genome Browser:

• Session: http://genome.ucsc.edu/s/nirfriedman/cfChIP-seq

• Track hub: http://www.cs.huji.ac.il/~nir/Hubs/cfChIP-seq/hub.txt

Additional datasets from public repositories:

• UCSC known genes (AH5036), Ensembl transcripts (AH5046) and genomic annotations (AH5040): AnnotationHub (http://bioconductor.org/packages/release/bioc/html/AnnotationHub.html)

• Consolidated ChIP-seq: Roadmap Epigenomics (https://egg2.wustl.edu/roadmap/data/byFileType/alignments/consolidated/)

• mRNA-seq: Roadmap Epigenomics (https://egg2.wustl.edu/roadmap/data/byDataType/rna/expression/57epigenomes.RPKM.pc.gz)

• Consolidated ChromHMM calls: Roadmap Epigenomics (http://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/coreMarks/jointModel/final/all.mnemonics.bedFiles.tgz)

Code availability

R code for processing cfChIP-seq data is available at https://github.com/nirfriedman/cfChIP-seq.git.

Acknowledgements

We thank N. Kaminski, J. Moss, E. Pikarsky, N. Rajewsky, O.J. Rando, A. Regev and members of the Friedman lab for discussions and comments on this manuscript. We thank L. Friedman for help with illustrations and graphics. This work was supported by the European Research Council’s AdG Grants 340712 ‘ChromatinSys’ (to N.F.) and 786575 ‘RxmiRcanceR’ (to E.G.); the Israel Science Foundation’s I-CORE program grant 1796/12 (to T.K. and N.F.) and grants 2612/18 (to N.F.), 3020/20 (to A.G.), 2473/17 (to E.G.) and 486/17 (to E.G.); Israel Ministry of Science and Technology grant 3-14352 (to A.G.); National Institutes of Health grants RM1HG006193 (to N.F.) and CA197081-02 (to E.G.); Deutsche Forschungsgemeinschaft SFB841 (to E.G.); and DKFZ-MOST grant (to E.G.).

Author information

Contributions

R. Sadeh and N.F. developed the concept. R. Sadeh, N.F. and I.S. designed the experiments with help from E.G., B.G., A.Z. and Y.D. R. Sadeh developed the cfChIP-seq method with help from A.R. and I.S. I.S., R. Sadeh and A.R. performed cfChIP-seq experiments. N.F. and G.F. developed analytical tools with help from J.G., M.N., G.M. and T.K. N.F., R. Sadeh, G.F., I.S. and J.G. analyzed the data. I.F.F., D.N. and R. Shemer performed the cfDNA methylation assays. Z.K. collected healthy donor samples. D.Y., T.P., A.H., J.E.C., A.S., M.T., A.G., M.M., S.A.G., A.B.Y., E.S., R. Safadi, D.P., E.G., B.G. and A.Z. provided clinical insights, recruited patients and collected patient samples. N.F., R. Sadeh, J.G., G.F. and A.C. wrote the paper with input from all authors.

Corresponding author

Correspondence to Nir Friedman.

Ethics declarations

Competing interests

A patent application for cfChIP-seq has been submitted by the Hebrew University of Jerusalem. R. Sadeh, I.S., J.G. and N.F. are founders of Senseera.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Supporting data for Fig. 1.

a, Distribution of reads for cfChIP-seq with different antibodies on four samples (H012.1, H012.2, H013.1 and H013.2). We divided the genome into regions that contain (putative) TSSs based on our catalogue (see below) and (putative) enhancers. Because some regions are marked as both (in different tissues), we consider the intersection separately. For each subset, we show the fraction of reads mapped to the region. Within each bar, the fraction estimated as background (based on our background model; Methods) is marked in dark gray. b, Genome browser view (as in Fig. 1c). c, Metaplots (as in Fig. 1d) of ChIP-seq samples from the Roadmap Epigenomics compendium. d, Scatter plots showing signal levels from cfChIP-seq versus leukocyte ChIP-seq of H3K4me3, H3K4me2 and H3K36me3 (similar to Fig. 1e). e, Estimation of the amount of specific reads in cfChIP-seq. Top panel: box plot of the estimated percentage of reads that are above background levels for all the cfChIP-seq samples analyzed in the manuscript (Supplementary Table 1) compared to selected ChIP-seq samples from the Roadmap Epigenomics compendium. Bottom panel: percentage of the signal above background that is in the expected genomic locations (that is, promoters and enhancers for H3K4me1 and H3K4me2, promoters for H3K4me3 and gene bodies for H3K36me3). For comparison, the same analysis pipeline was applied to selected Roadmap Epigenomics ChIP-seq samples for the same marks. Box limits: 25%–75% quantiles; middle: median; upper (lower) whisker: the largest (smallest) value no further than 1.5 × interquartile range from the hinge.
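The box plot convention used in these captions (box at the 25% and 75% quantiles, whiskers at the most extreme values within 1.5 × interquartile range of the hinges) can be sketched as follows. This is a minimal illustration in Python, not the authors' plotting code. The function name `box_stats` is hypothetical, and the quartile method (linear interpolation via `method="inclusive"`) is an assumption, because the captions do not state how the hinges were computed.

```python
import statistics

def box_stats(values):
    """Return the numbers drawn in a box plot, plus any points beyond the whiskers."""
    data = sorted(values)
    # Quartiles by linear interpolation (assumed; the captions do not specify).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that lie inside the fences.
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < lo_fence or v > hi_fence],
    }

# Toy data: the value 100 falls beyond the upper fence, so the whisker stops at 8.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

With the toy data, the whiskers end at observed values rather than at the fences themselves, which is the behavior the caption describes.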

Extended Data Fig. 2 Supporting data for Fig. 1.

a, Fragment length distribution for all samples in this manuscript. Each row represents a histogram of fragment length of a specific sample. Color represents the number of fragments/million with that length (RPM). b, Reproducibility of the cfChIP-seq assay. Shown are technical repeats, biological repeats (two samples from the same donor) and comparison of two different donors for three histone marks. Each dot is a gene, and values are normalized counts at the gene promoter (H3K4me2/3) or body (H3K36me3).

Extended Data Fig. 3 Supporting data for Fig. 2.

a, Testing gene sets defined by genes highly expressed in different cancer types (TCGA; Methods) against genes with higher signal in a CRC tumor sample (Fig. 2a). Hypergeometric test with FDR-corrected q-values. b, Levels of H3K4me2 coverage over colon-specific enhancers (y-axis) in healthy donors and in CRC samples. Box limits: 25%–75% quantiles; middle: median; upper (lower) whisker: the largest (smallest) value no further than 1.5 × interquartile range from the hinge; n = 144. c, Average coverage of H3K36me3 across gene bodies (metagene). d, Coverage of H3K36me3 cfChIP-seq over gene bodies in a healthy donor (H012.1) for genes at different leukocyte expression quantiles. Box limits: 25%–75% quantiles; middle: median; upper (lower) whisker: the largest (smallest) value no further than 1.5 × interquartile range from the hinge.
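The gene-set test in panel a can be illustrated with a short Python sketch: a one-sided hypergeometric test for the overlap between a gene set and the genes with higher signal, followed by a multiple-testing correction. The caption says only "FDR corrected"; the Benjamini-Hochberg procedure used here is an assumption, and the function names are hypothetical rather than taken from the authors' R code.

```python
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """P(overlap >= k) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` belong to the gene set."""
    upper = min(successes, draws)
    favorable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favorable / comb(population, draws)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Toy example: all 5 selected genes fall inside a 5-gene set out of 10 genes.
p_full_overlap = hypergeom_sf(5, population=10, successes=5, draws=5)
q_toy = benjamini_hochberg([0.04, 0.01, 0.03, 0.02])
```

Exact binomial coefficients keep the sketch dependency-free; for genome-scale gene counts a statistics library would be the more practical choice.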

Extended Data Fig. 4 Supporting data for Fig. 3.

a, Comparison of H3K4me3 cfChIP-seq signal from a healthy donor (H012.1) with expected gene expression levels, based on the expression in cells contributing to cfDNA in healthy subjects (Methods). Each dot is a gene. x-axis: normalized number of H3K4me3 reads in the gene promoter. y-axis: expected expression in transcripts per million (TPM). b, Comparison (as in a) of leukocyte H3K4me3 ChIP-seq signal versus leukocyte gene expression levels (both for Roadmap Epigenomics sample E062). c, Comparison (as in a) of H3K4me3 cfChIP-seq signal from a healthy donor (H012.1) versus liver gene expression levels (Roadmap Epigenomics sample E066). d, Summary of correlations of healthy cfChIP-seq levels against different expression patterns from Roadmap Epigenomics and BLUEPRINT. For each category of expression profiles, we show a box plot of r² values. The red line denotes the correlation against the predicted expression mixture of cells contributing to the cfDNA pool (panel a). Box limits: 25%–75% quantiles; middle: median; upper (lower) whisker: the largest (smallest) value no further than 1.5 × interquartile range from the hinge. e, Comparison of the expression levels of genes in two clusters of Fig. 3c (see inset). Cluster A contains 4,690 genes that change between samples, and Cluster B contains 10,177 genes that do not change between samples. Violin plots show the distribution of expression levels in three tissues (PBMC, heart and liver) from the Roadmap Epigenomics expression data. f, Overlap of both clusters with the set of genes with CpG island promoters (blue) and housekeeping genes (green; based on analysis of the GTEx compendium; see Methods). For clarity, we show each cluster in a separate Venn diagram.
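As a rough illustration of the comparison in panels a and d, the sketch below builds an expected expression profile as a weighted mixture of per-cell-type profiles and computes the squared Pearson correlation (r²) against a promoter signal. The cell-type names, weights and numbers are invented for the example, the function names are hypothetical, and the authors' actual mixture estimate is described in their Methods rather than reproduced here.

```python
def expected_expression(profiles, weights):
    """Weighted average of per-cell-type expression vectors (for example, TPM)."""
    total = sum(weights.values())
    n_genes = len(next(iter(profiles.values())))
    return [
        sum(weights[cell] * profiles[cell][g] for cell in weights) / total
        for g in range(n_genes)
    ]

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length vectors."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical three-gene profiles and mixture weights (illustrative only).
profiles = {
    "neutrophil": [0.0, 10.0, 4.0],
    "megakaryocyte": [10.0, 0.0, 4.0],
}
weights = {"neutrophil": 0.75, "megakaryocyte": 0.25}
mixture = expected_expression(profiles, weights)

# Hypothetical promoter signal that is exactly proportional to another vector.
r2_toy = r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In practice, both signal and expression would likely be log-transformed before correlating; the sketch omits that step to keep the arithmetic easy to follow.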

Extended Data Fig. 5 cfChIP-seq is highly sensitive.

a, Schematic of the parameters involved in determining cfChIP-seq sensitivity: (1) the number of informative nucleosomes, that is, the total number of signature-specific nucleosomes in the plasma that carry a mark of interest; (2) the percent contribution of the signature-positive cells to the circulation; (3) the total number of genomes in circulation; (4) the specific capture probability of marked nucleosomes by the cfChIP-seq assay; and (5) the non-specific capture probability of nucleosomes (background). The signal-to-noise ratio (SNR) is the ratio of the specific to non-specific capture probabilities. b, Simulation analysis of event detection power as a function of percent positive (x-axis) and number of informative locations (y-axis). Detection is defined as a 95% probability that assay results (capture and sequencing) reject the null hypothesis of background signal with p < 0.05 (Poisson test; Methods). The simulation assumes number of genomes = 10,000 (10 ml plasma of a healthy donor), capture probability of 1% and SNR of 500 (Methods and Supplementary Note). The sizes of several example signatures are shown.
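The detection criterion in panel b can be approximated with a simplified Poisson model, sketched below in Python. This is not the authors' simulation (see their Methods and Supplementary Note): it assumes that counts at the informative locations are Poisson distributed, that every genome contributes one nucleosome per location, that background capture occurs at rate capture probability / SNR, and it ignores sequencing depth. All function names are hypothetical; the defaults (10,000 genomes, 1% capture probability, SNR of 500) are the values quoted in the caption.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    return max(0.0, 1.0 - sum(poisson_pmf(i, lam) for i in range(k)))

def critical_count(lam_bg, alpha=0.05):
    """Smallest count k such that P(X >= k | background only) < alpha."""
    k, cdf = 0, 0.0  # invariant: cdf == P(X < k)
    while 1.0 - cdf >= alpha:
        cdf += poisson_pmf(k, lam_bg)
        k += 1
    return k

def detection_power(frac_positive, n_locations, n_genomes=10_000,
                    capture_p=0.01, snr=500, alpha=0.05):
    """Probability that the observed count rejects the background-only null."""
    # Simplifying assumption: all genomes contribute non-specific background.
    lam_bg = n_genomes * n_locations * capture_p / snr
    # Marked nucleosomes from signature-positive cells are captured specifically.
    lam_signal = n_genomes * frac_positive * n_locations * capture_p
    return poisson_sf(critical_count(lam_bg, alpha), lam_bg + lam_signal)

# A 100-location signature at 1% versus 0.1% contribution to the cfDNA pool.
power_high = detection_power(0.01, 100)
power_low = detection_power(0.001, 100)
detected = power_high >= 0.95
```

Under these assumptions, the larger contribution clears the 95% power threshold while the smaller one does not, which mirrors the trade-off between percent positive and signature size shown in the panel.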

Extended Data Fig. 6 Sensitivity analysis.

a, Total sizes (in nucleosomes) of TSS (left) and enhancer (right) signatures of various cell types. b, Estimates of specific capture rate and of SNR (specific capture / non-specific capture) over 88 healthy samples, assuming 1,000 genomes/ml and 2 ml input. Box limits: 25%–75% quantiles; middle: median; upper (lower) whisker: the largest (smallest) value no further than 1.5 × interquartile range from the hinge. c, Signal level is linear with input. Plasma of a healthy donor was spiked with different amounts of yeast nucleosomes (x-axis). Shown is the number of counts observed (y-axis) for signatures of different sizes. Error bars show the 20–80% range over 100 different sampled signatures of the given size. d, Genome browser view of chrY male-specific promoters (left) and a representative autosomal region (right) in the male/female titration experiment. e, Test of sensitivity using male spike-in. Plasma samples from healthy female and male donors were titrated at different ratios. Detection of male-specific promoters is shown as a function of the percentage of chrY genomes in the sample (x-axis). Shown are the number of counts (y-axis) and the significance (circle radius) of signal above the background distribution (Methods). f, Simulation study of the effect of capture probability on detection. The blue marks denote the concentrations used in the male/female titration experiment, which had capture probabilities of ~0.1% and SNRs of ~500–800. g, Simulation study of the effect of SNR levels on detection probability.

Extended Data Fig. 7 cfChIP-seq liver signal is proportional to % liver cfDNA.

a, % Liver as estimated using DNA CpG methylation markers vs. signature strength. b, % Liver as estimated using DNA CpG methylation markers vs. estimate of % liver in Fig. 5a.

Extended Data Fig. 8 Supporting data for Fig. 6.

a, Evaluation of classification of CRC samples vs. healthy samples using Digestive (Top) and COAD (Bottom) signature scores (as Fig. 6c). b, Intra-patient comparisons (as Fig. 6e). Inset: time samples drawn on the patient timeline (Fig. 6d).

Extended Data Fig. 9 Supporting data for Fig. 6.

a, Levels of CRC-associated genes in different samples. Each point is a sample plotted with % CRC (x-axis) versus normalized number of reads of the gene (y-axis). Solid points: the signal of the gene is significantly above the expectation given % CRC (Methods). b, Example of immune-related genes in CRC samples, plotted as in a. c, Clustering of gene set enrichment in CRC samples (see Supplementary Table 11). d, Venn diagram of overlaps between cancer gene signatures that were identified in our analysis. e, Evaluation of cancer signatures in CRC samples from TCGA, grouped by their CMS subtype. Box limits: 25%–75% quantiles; middle: median; upper (lower) whisker: the largest (smallest) value no further than 1.5 × interquartile range from the hinge.

Supplementary information

Supplementary Information

Supplementary Note.

Reporting Summary

Supplementary Table 1

Sequencing statistics for samples sequenced in this study

Supplementary Table 2

Genes with abnormal signal (per sample)

Supplementary Table 3

Tumor signatures

Supplementary Table 4

Individuals and samples clinical information

Supplementary Table 5

Cell type signatures

Supplementary Table 6

Full analysis of tissue signatures versus samples

Supplementary Table 7

Full analysis of gene sets versus samples with reference

Supplementary Table 8

Differentially marked genes in pairwise comparisons

Supplementary Table 9

Liver clusters enrichments

Supplementary Table 10

Gene set counts in CRC samples relative to healthy reference

Supplementary Table 11

CRC signature enrichments

Supplementary Table 12

Roadmap Epigenomics samples used

Cite this article

Sadeh, R., Sharkia, I., Fialkoff, G. et al. ChIP-seq of plasma cell-free nucleosomes identifies gene expression programs of the cells of origin. Nat Biotechnol (2021). https://doi.org/10.1038/s41587-020-00775-6
