Sensitive tumour detection and classification using plasma cell-free DNA methylomes

Shen, Shu Yi; Singhania, Rajat; Fehringer, Gordon; Chakravarthy, Ankur; Roehrl, Michael H. A.; Chadwick, Dianne; Zuzarte, Philip C.; Borgida, Ayelet; Wang, Ting Ting; Li, Tiantian; Kis, Olena; Zhao, Zhen; Spreafico, Anna; Medina, Tiago da Silva; Wang, Yadon; Roulois, David; Ettayebi, Ilias; Chen, Zhuo; Chow, Signy; Murphy, Tracy; Arruda, Andrea; O’Kane, Grainne M.; Liu, Jessica; Mansour, Mark; McPherson, John D.; O’Brien, Catherine; Leighl, Natasha; Bedard, Philippe L.; Fleshner, Neil; Liu, Geoffrey; Minden, Mark D.; Gallinger, Steven; Goldenberg, Anna; Pugh, Trevor J.; Hoffman, Michael M.; Bratman, Scott V.; Hung, Rayjean J.; De Carvalho, Daniel D.

doi:10.1038/s41586-018-0703-0

Letter
Published: 14 November 2018

Nature volume 563, pages 579–583 (2018)

Abstract

The use of liquid biopsies for cancer detection and management is rapidly gaining prominence¹. Current methods for the detection of circulating tumour DNA involve sequencing somatic mutations using cell-free DNA, but the sensitivity of these methods may be low among patients with early-stage cancer given the limited number of recurrent mutations^2,3,4,5. By contrast, large-scale epigenetic alterations—which are tissue- and cancer-type specific—are not similarly constrained⁶ and therefore potentially have greater ability to detect and classify cancers in patients with early-stage disease. Here we develop a sensitive, immunoprecipitation-based protocol to analyse the methylome of small quantities of circulating cell-free DNA, and demonstrate the ability to detect large-scale DNA methylation changes that are enriched for tumour-specific patterns. We also demonstrate robust performance in cancer detection and classification across an extensive collection of plasma samples from several tumour types. This work sets the stage to establish biomarkers for the minimally invasive detection, interception and classification of early-stage cancers based on plasma cell-free DNA methylation patterns.

**Fig. 1: The cfDNA methylome as a sensitive approach to detect ctDNA in low levels of input DNA.**

**Fig. 2: The cfMeDIP–seq method can identify thousands of DMRs in circulating cfDNA obtained from patients with pancreatic adenocarcinoma.**

**Fig. 3: Methylome analysis of plasma cfDNA enables tumour classification.**

**Fig. 4: Plasma-derived DMRs are informative of cancer type.**

Data availability

R markdowns (either knit or raw) and scripts used to generate the findings in this study have been deposited on Zenodo (DOIs in Supplementary Table 13). All the cell line datasets generated and/or analysed during the current study are available in the Gene Expression Omnibus repository under accession code GSE79838. The cfMeDIP–seq next-generation sequencing data for patient samples that support the findings of this study are available upon request from the corresponding author to comply with institutional ethics regulation. Source data for Fig. 1b and Extended Data Fig. 3e are provided in Supplementary Table 9, and for Fig. 1c in Supplementary Table 10. Additional source data can be found on Zenodo (Supplementary Table 13).

References

1. Diaz, L. A., Jr & Bardelli, A. Liquid biopsies: genotyping circulating tumor DNA. J. Clin. Oncol. 32, 579–586 (2014).
2. Aravanis, A. M., Lee, M. & Klausner, R. D. Next-generation sequencing of circulating tumor DNA for early cancer detection. Cell 168, 571–574 (2017).
3. Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20, 548–554 (2014).
4. Cohen, J. D. et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 359, 926–930 (2018).
5. Phallen, J. et al. Direct detection of early-stage cancers using circulating tumor DNA. Sci. Transl. Med. 9, eaan2415 (2017).
6. Hoadley, K. A. et al. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell 158, 929–944 (2014).
7. Lehmann-Werman, R. et al. Identification of tissue-specific cell death using methylation patterns of circulating DNA. Proc. Natl Acad. Sci. USA 113, E1826–E1834 (2016).
8. Visvanathan, K. et al. Monitoring of serum DNA methylation as an early independent marker of response and survival in metastatic breast cancer: TBCRC 005 prospective biomarker study. J. Clin. Oncol. 35, 751–758 (2017).
9. Potter, N. T. et al. Validation of a real-time PCR-based qualitative assay for the detection of methylated SEPT9 DNA in human plasma. Clin. Chem. 60, 1183–1191 (2014).
10. Chan, K. C. et al. Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing. Proc. Natl Acad. Sci. USA 110, 18761–18768 (2013).
11. Sun, K. et al. Plasma DNA tissue mapping by genome-wide methylation sequencing for noninvasive prenatal, cancer, and transplantation assessments. Proc. Natl Acad. Sci. USA 112, E5503–E5512 (2015).
12. Grunau, C., Clark, S. J. & Rosenthal, A. Bisulfite genomic sequencing: systematic investigation of critical experimental parameters. Nucleic Acids Res. 29, E65 (2001).
13. Taiwo, O. et al. Methylome analysis using MeDIP-seq with low DNA concentrations. Nat. Protoc. 7, 617–636 (2012).
14. Newman, A. M. et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat. Biotechnol. 34, 547–555 (2016).
15. Sharma, S., Kelly, T. K. & Jones, P. A. Epigenetics in cancer. Carcinogenesis 31, 27–36 (2010).
16. Yin, Y. et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, eaaj2239 (2017).
17. Michiels, S., Koscielny, S. & Hill, C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 365, 488–492 (2005).
18. Pedersen, K. S. et al. Leukocyte DNA methylation signature differentiates pancreatic cancer patients from healthy controls. PLoS ONE 6, e18223 (2011).
19. Teschendorff, A. E. et al. An epigenetic signature in peripheral blood predicts active ovarian cancer. PLoS ONE 4, e8274 (2009).
20. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
21. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
22. Lienhard, M., Grimm, C., Morkel, M., Herwig, R. & Chavez, L. MEDIPS: genome-wide differential coverage analysis of sequencing data derived from DNA enrichment experiments. Bioinformatics 30, 284–286 (2014).
23. Kis, O. et al. Circulating tumour DNA sequence analysis as an alternative to multiple myeloma bone marrow aspirates. Nat. Commun. 8, 15086 (2017).
24. Kennedy, S. R. et al. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat. Protoc. 9, 2586–2606 (2014).
25. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
26. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
27. Ewing, B. & Green, P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194 (1998).
28. Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012).
29. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).
30. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
31. Gu, H. et al. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat. Protoc. 6, 468–481 (2011).
32. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
33. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
34. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
35. Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 28, (2008).
36. Law, C. W., Chen, Y., Shi, W. & Smyth, G. K. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15, R29 (2014).
37. Iorio, F. et al. A landscape of pharmacogenomic interactions in cancer. Cell 166, 740–754 (2016).
38. Aryee, M. J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).

Acknowledgements

This study was conducted with support from the University of Toronto McLaughlin Centre (MC-2015-02), the Canadian Institutes of Health Research (CIHR FDN 148430 and CIHR New Investigator Salary award 201512MSH-360794-228629), Ontario Institute for Cancer Research (OICR) with funds from the province of Ontario, Canada Research Chair (950-231346), and the Princess Margaret Cancer Foundation to D.D.D.C. as well as Canadian Cancer Society (CCSRI 701717) to R.J.H., CCSRI 704716 to R.J.H. and D.D.D.C. and CCSRI 703827 to M.M.H. Recruitment of healthy individuals was supported by Cancer Care Ontario Chair of Population Health and CCSRI 020214 awarded to R.J.H. Collection of lung cancer samples was supported by the Alan B. Brown chair in molecular genomics and the Lusi Wong Lung Cancer Early Detection Program to G.L. We acknowledge the Princess Margaret Genomics Centre for carrying out the next-generation sequencing and the Bioinformatics and HPC Core, Princess Margaret Cancer Centre for their expertise in generating the next-generation sequencing data.

Reviewer information

Nature thanks E. Collisson, A. Teschendorff and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

These authors contributed equally: Shu Yi Shen, Rajat Singhania, Gordon Fehringer, Ankur Chakravarthy

Authors and Affiliations

Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
Shu Yi Shen, Rajat Singhania, Ankur Chakravarthy, Michael H. A. Roehrl, Dianne Chadwick, Ting Ting Wang, Tiantian Li, Olena Kis, Zhen Zhao, Anna Spreafico, Tiago da Silva Medina, Yadon Wang, David Roulois, Ilias Ettayebi, Zhuo Chen, Signy Chow, Tracy Murphy, Andrea Arruda, Grainne M. O’Kane, Catherine O’Brien, Natasha Leighl, Philippe L. Bedard, Neil Fleshner, Geoffrey Liu, Mark D. Minden, Trevor J. Pugh, Michael M. Hoffman, Scott V. Bratman & Daniel D. De Carvalho
Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
Gordon Fehringer, Ayelet Borgida & Rayjean J. Hung
Memorial Sloan Kettering Cancer Center, New York, NY, USA
Michael H. A. Roehrl
Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
Michael H. A. Roehrl, Ting Ting Wang, Ilias Ettayebi, Jessica Liu, Mark Mansour, Geoffrey Liu, Trevor J. Pugh, Michael M. Hoffman, Scott V. Bratman & Daniel D. De Carvalho
Genome Technologies, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
Philip C. Zuzarte
UMR_S 1236, Univ Rennes 1, Inserm, Etablissement Français du sang Bretagne, Rennes, France
David Roulois
Department of Biochemistry and Molecular Medicine, UC Davis Comprehensive Cancer Center, Sacramento, CA, USA
John D. McPherson
Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
Geoffrey Liu & Rayjean J. Hung
Fred Litwin Centre for Cancer Genetics, Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
Steven Gallinger
Department of Surgery, Toronto General Hospital, Toronto, Ontario, Canada
Steven Gallinger
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Anna Goldenberg & Michael M. Hoffman

Contributions

S.Y.S. and D.D.D.C. designed and developed the cfMeDIP–seq protocol. R.J.H. and G.F. conceived and designed the study related to the pancreatic cancer component. S.Y.S., R.S., A.C. and D.D.D.C. conceived and designed the study related to the other cancer types. S.Y.S., S.V.B., T.J.P. and D.D.D.C. designed the experiments. S.Y.S., D.C., M.H.A.R., P.C.Z., Z.C., T.L., O.K., D.R., I.E., Z.C., S.C., G.M.O., J.L., M.M. and Z.Z. performed the experiments. T.d.S.M., Y.W. and C.O. performed the mouse experiments. R.S., A.C., G.F., T.T.W., A.G., T.J.P., M.M.H. and D.D.D.C. analysed the data with scientific input from R.J.H. G.F., A.B., D.C., A.S., T.M., A.A., N.L., M.H.A.R., J.D.M., P.L.B., N.F., G.L., M.D.M., S.G., T.J.P. and R.J.H. collected the clinical data related to the samples, determined the sample selection criteria and matching scheme, and provided the clinical samples. S.Y.S., R.S., A.C. and D.D.D.C. wrote the paper with feedback from all authors.

Corresponding authors

Correspondence to Rayjean J. Hung or Daniel D. De Carvalho.

Ethics declarations

Competing interests

D.D.D.C., S.Y.S., A.C., S.V.B., R.S. and R.J.H. are listed as inventors/contributors on patents filed related to this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Simulation of the probability of detecting ctDNA as a function of the number of DMRs, sequencing depth and percentage of ctDNA in plasma cfDNA, and a proposed method to enrich ctDNA.

a, Bioinformatic simulation of scenarios with different proportions of ctDNA present in the sample (0.001% to 10%, columns) and different numbers of tumour-specific DMRs (1, 10, 100, 1,000 or 10,000, rows) determined through the comparison of ctDNA to normal cfDNA, with reads sampled at varying sequencing depths at each locus (10×, 100×, 1,000× and 10,000×; x axis). The probability of detecting at least five epimutations per DMR (left y axis) increases as the number of available features increases, even at shallow coverage per locus. Each panel depicts probability of detection against coverage per candidate DMR for one simulation scenario. b, Schematic representation of the cfMeDIP–seq protocol.
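The trend described for panel a can be reproduced qualitatively with a closed-form binomial calculation. The sketch below is not the authors' simulation code. It assumes that each read at a tumour-specific DMR derives from ctDNA independently with probability equal to the ctDNA fraction, that reads are pooled across all DMRs, and that detection requires at least five tumour-derived reads. The function names and the pooling assumption are illustrative only.

```python
import math


def prob_at_least_k(n_reads: int, p: float, k: int = 5) -> float:
    """Return P(X >= k) for X ~ Binomial(n_reads, p)."""
    if n_reads < k or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    below = 0.0
    # Sum P(X = i) for i < k, working in log space so large n_reads stays stable.
    for i in range(k):
        log_term = (
            math.lgamma(n_reads + 1)
            - math.lgamma(i + 1)
            - math.lgamma(n_reads - i + 1)
            + i * math.log(p)
            + (n_reads - i) * math.log1p(-p)
        )
        below += math.exp(log_term)
    return max(0.0, 1.0 - below)


def detection_probability(ctdna_fraction: float, depth: int, n_dmrs: int,
                          min_reads: int = 5) -> float:
    """Assumed model: reads from all DMRs are pooled into one binomial draw."""
    return prob_at_least_k(depth * n_dmrs, ctdna_fraction, min_reads)


if __name__ == "__main__":
    # Hypothetical scenario: 0.01% ctDNA at 100x coverage per locus.
    for n_dmrs in (1, 10, 100, 1000, 10000):
        print(n_dmrs, detection_probability(1e-4, 100, n_dmrs))
```

Under these assumptions, increasing the number of DMRs raises the detection probability at a fixed per-locus depth, which matches the direction of the effect stated in the legend.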

Extended Data Fig. 2 Sequencing saturation analysis and quality controls of MeDIP–seq and cfMeDIP–seq carried out on varying starting inputs of HCT116 DNA sheared to mimic cfDNA.

a, Results of the saturation analysis from the Bioconductor package MEDIPS analysing cfMeDIP–seq data from each replicate, for each starting input amount and including an input control. b, The protocol was tested in two biological replicates of four starting DNA inputs (100, 10, 5 and 1 ng) of HCT116 DNA sheared to mimic cfDNA. The specificity of the reaction was calculated using methylated and unmethylated spiked-in A. thaliana DNA. The fold-enrichment ratio was calculated using genomic regions of the fragmented HCT116 DNA (human methylated HIST1H2BA and unmethylated GAPDH). The horizontal dotted line indicates a fold-enrichment ratio threshold of 25, dots represent biological replicates, with lines representing the mean. c, CpG enrichment scores of the sequenced samples (two biological replicates each of four starting DNA inputs (100, 10, 5 and 1 ng) and one input control) show a robust enrichment of CpGs within the genomic regions from the immunoprecipitated samples compared to the input control. The CpG enrichment score was obtained by dividing the relative frequency of CpGs of the regions by the relative frequency of CpGs of the human genome. The horizontal dotted line indicates a CpG enrichment score of 1, dots represent biological replicates, with lines representing the mean. d, Genome-wide Pearson correlations of normalized read counts per 300-bp window between cfMeDIP–seq signal for 1 to 100 ng of input HCT116 DNA sheared to mimic cfDNA (2 biological replicates per concentration). e, Genome Browser snapshot of HCT116 cfMeDIP–seq signal across a window (chr8:145,095,942–145,116,942) selected out of four examined loci, at different starting DNA inputs (1 to 100 ng, in biological replicates), compared with RRBS (ENCODE: ENCSR000DFS) and WGBS (Gene Expression Omnibus: GSM1465024) data (aligned to hg19). For cfMeDIP–seq, the y axis indicates RPKMs; for RRBS, yellow and blue blocks represent hypermethylated and hypomethylated CpGs, respectively. In the WGBS track, peak heights indicate methylation level.
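The CpG enrichment score defined in panel c is a ratio of two relative frequencies. A minimal sketch follows, assuming that "relative frequency of CpGs" means CpG dinucleotides per base and that sequences are available as plain strings. The MEDIPS package used in the study may normalize differently, and the function names here are hypothetical.

```python
def cpg_relative_frequency(sequences):
    """CpG dinucleotides per base across a collection of DNA sequences."""
    total_bases = sum(len(seq) for seq in sequences)
    if total_bases == 0:
        raise ValueError("no sequence supplied")
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    cpg_count = sum(seq.upper().count("CG") for seq in sequences)
    return cpg_count / total_bases


def cpg_enrichment_score(region_sequences, genome_sequences):
    """Ratio of CpG frequency in covered regions to that in the reference genome."""
    genome_freq = cpg_relative_frequency(genome_sequences)
    if genome_freq == 0:
        raise ValueError("reference contains no CpGs")
    return cpg_relative_frequency(region_sequences) / genome_freq


if __name__ == "__main__":
    # Toy sequences, not real data.
    print(cpg_enrichment_score(["CGCGCG"], ["CGATATATAT"]))
```

A score above 1 indicates that the immunoprecipitated regions are richer in CpGs than the genome as a whole, which is why the legend marks 1 with a dotted line.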

Extended Data Fig. 3 Sequencing saturation analysis and quality controls of cfMeDIP–seq from serial dilution.

a, Schematic representation of the CRC DNA (HCT116) dilution series into multiple myeloma DNA (MM.1S). For both CRC and multiple myeloma DNA, the genomic DNA was sheared to mimic cfDNA fragmentation. The entire dilution series was used to carry out cfMeDIP–seq (n = 1) and ultra-deep sequencing for mutation detection (n = 1). b, The specificity of the reaction for each dilution in the series (n = 1) was calculated using methylated and unmethylated spiked-in A. thaliana DNA. c, CpG enrichment representing the ratio of relative frequency of CpGs in regions to relative frequency of CpGs in the human genome for each dilution in the series (n = 1), determined by cfMeDIP–seq. The horizontal dashed line represents a CpG enrichment of 1. d, Saturation analysis of cfMeDIP–seq sequenced reads from each dilution point in the series (n = 1). e, Across a serial dilution series (n = 7 dilution points, two technical replicates, each replicate was used per protocol) of HCT116 DNA spiked into MM.1S multiple myeloma DNA, near-perfect correlations are observed between observed and expected numbers of DMRs. f, g, Ultra-deep sequencing for mutation detection of three CRC-specific point mutations within BRAF (p.P301P), KRAS (p.G13D) and PIK3CA (p.H1047R) in the same dilution series (of CRC into multiple myeloma DNA) (n = 1). UMIs were incorporated into the sequencing adapters and used to create SSCSs (f) and DCSs (g) for the detection of allele frequency for each mutation at each locus. For each mutation, the reference allele is found at the top. The dashed red line indicates the limit of detection.

Extended Data Fig. 4 Quality control of cfMeDIP–seq from circulating cfDNA from patients with PDAC (cases) and healthy donors (controls).

a, b, Specificity of reaction calculated using methylated and unmethylated spiked-in A. thaliana DNA for each case sample (a) and each control sample (b). The fold-enrichment ratio was not calculated owing to the very limited amount of DNA available after final libraries were generated. c, d, CpG enrichment of the sequenced cases (c) and controls (d). The horizontal dashed line represents a CpG enrichment of 1. e, Principal component (PC) analysis of cfDNA methylation from 24 plasma cfDNA samples from healthy donors and 24 plasma cfDNA samples from patients with PDAC, using the 1 million most variable windows by median absolute deviation (300 bp) genome-wide. Left, PC2 against PC1; right, PC3 against PC1. f, Percentage of variance explained by each principal component.
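Panel e selects the most variable windows by median absolute deviation (MAD) before principal component analysis. The selection step can be sketched as below, assuming the per-window signal is held in a dictionary keyed by a window identifier. The data layout and function names are assumptions, and the principal component step itself is omitted.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of absolute deviations from the median."""
    values = list(values)
    centre = median(values)
    return median(abs(v - centre) for v in values)


def top_variable_windows(signal_by_window, n_top):
    """Return the n_top window identifiers with the largest MAD across samples.

    signal_by_window maps a window identifier to a list of per-sample values.
    """
    ranked = sorted(
        signal_by_window,
        key=lambda window: median_absolute_deviation(signal_by_window[window]),
        reverse=True,
    )
    return ranked[:n_top]


if __name__ == "__main__":
    # Toy values for three hypothetical 300-bp windows across four samples.
    toy = {"w1": [1, 1, 1, 1], "w2": [0, 5, 10, 15], "w3": [1, 2, 3, 4]}
    print(top_variable_windows(toy, 2))
```

MAD is less sensitive to single outlying samples than variance, which is a common reason to prefer it for ranking features in small cohorts.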

Extended Data Fig. 5 Methylome analysis of plasma cfDNA distinguishes patients with early-stage PDAC from healthy controls.

a, The difference in plasma cfDNA methylation plotted against the difference in tumour DNA methylation for each overlapping window (n = 547,887). The difference in plasma cfDNA methylation between patients with PDAC and healthy controls is log₁₀-fold, as measured by cfMeDIP–seq. Tumour DNA methylation difference is delta beta from primary PDAC tumour to normal tissue, as measured by RRBS. The blue line is a trend line, with the correlation determined by Pearson’s correlation. b, Scatter plot showing the DNA methylation difference for each overlapping window. The x axis shows the DNA methylation difference for the primary PDAC tumour compared with normal PBMCs from the RRBS data. The y axis shows the DNA methylation difference for the plasma cfDNA methylation from patients with PDAC compared with healthy donors from the cfMeDIP–seq data. Correlation determined by Pearson’s correlation. c, Genome Browser snapshot of RRBS and cfMeDIP–seq signal across a representative chromosomal region selected from four candidate regions (chr8:145,095,942–145,116,942) using reference genome hg19. RRBS tracks show the methylation signal for the laser capture microdissection tissues from PDAC tumour cases and the matching normal tissue, from the same patient, shown in the same order. Each coloured block represents DMCs, with yellow representing hypermethylated and blue representing hypomethylated. cfMeDIP–seq tracks show the methylation signal (RPKMs) detected in the cfDNA, with cases representing plasma from the same PDAC cases and controls corresponding to plasma from age- and sex-matched healthy controls. For the cfMeDIP–seq tracks, green and blue peaks indicate the methylation signal (RPKMs) detected in the cfDNA.
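The agreement between plasma and tumour methylation differences in panels a and b is summarized with Pearson's correlation. As a reminder of what that statistic computes, here is a small standard-library sketch. The input lists stand in for per-window differences and are illustrative, not values from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-window differences (tumour vs. plasma).
    tumour_delta = [-0.4, -0.1, 0.0, 0.2, 0.5]
    plasma_delta = [-0.9, -0.2, 0.1, 0.3, 1.1]
    print(pearson_r(tumour_delta, plasma_delta))
```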

Extended Data Fig. 6 Circulating cfDNA methylation profiles can identify transcription factor footprints and infer active transcriptional networks in the tissue of origin.

a, Expression profile of all transcription factors (n = 42) that were characterized as binding in healthy controls across 53 human tissues from the GTEx project. Several transcription factors that are preferentially expressed in the haematopoietic system were identified (PU.1, NFE2 and GATA1). b, Transcription factors with hypomethylated motifs in controls (n = 42) are overexpressed (ssGSEA scores; single-sample gene set enrichment analysis) compared with 1,000 random sets of 42 transcription factors across GTEx whole-blood data (P < 2.2 × 10⁻¹⁶, Wilcoxon rank-sum test, two-sided). c, Expression profile of all transcription factors (n = 52) characterized as binding in patients with PDAC. Several pancreas-specific or pancreatic-cancer-associated transcription factors were identified. Moreover, hallmark transcription factors that drive molecular subtypes of pancreatic cancer were also identified. d, Transcription factors with hypomethylated motifs in cases (n = 52) are overexpressed (ssGSEA scores) compared with 1,000 random sets of 52 transcription factors in the normal pancreas (GTEx data) (P < 2.2 × 10⁻¹⁶, Wilcoxon rank-sum test, two-sided). e, Transcription factors with hypomethylated motifs in PDAC cases (n = 52) are overexpressed compared with 1,000 random sets of 52 transcription factors in PDAC tissue (TCGA data) (P < 2.2 × 10⁻¹⁶, Wilcoxon rank-sum test, two-sided). For violin plots (b, d, e) the ends of the boxes represent the lower and upper quartiles and the middle line indicates the median. Whiskers represent 1.5× IQR, and outliers are excluded. Rotated kernel densities are also displayed.

Extended Data Fig. 7 Quality control of cfMeDIP–seq from circulating cfDNA from multiple cancer types.

a, c, e, g, i, k, Specificity of the reaction; and b, d, f, h, j, CpG enrichment score for each sample per cancer type. The horizontal dashed lines represent a CpG enrichment of 1.

Extended Data Fig. 8 Comparison of plasma cfDNA DMRs with tumour DMCs.

a, Yield of cfDNA extracted per ml of plasma from healthy donors (n = 24), bladder cancer (n = 20), renal cancer (n = 20), lung cancer (n = 25), breast cancer (n = 25), pancreatic cancer (n = 24), colorectal cancer (n = 23) and AML (n = 28). Horizontal bars represent the mean, with dots representing individual samples. b–h, Scatter plots showing the DNA methylation difference for all overlapping windows in PDAC (n = 245,980 windows) (b), AML (n = 206,735 windows) (c), BLCA (n = 193,943 windows) (d), BRCA (n = 204,623 windows) (e), CRC (n = 210,645 windows) (f), LUC (n = 193,043 windows) (g) and RCC (n = 198,390 windows) (h). The x axis shows the DNA methylation difference between the primary tumour (TCGA data) and normal PBMCs. The y axis shows the DNA methylation difference between the plasma cfDNA methylation for each cancer type and healthy controls from the cfMeDIP–seq data. The blue line is a trend line, with the correlation determined by Pearson’s correlation.

Extended Data Fig. 9 Circulating plasma cfDNA methylation samples used to distinguish between multiple cancer types and healthy donors.

a, b, Pathology stage (according to the AJCC/UICC 7th Edition) breakdown by tumour type for samples in the training set (a) and in the validation set (b). Non-small-cell lung carcinoma, LUC (NSCLC); small-cell lung cancer, LUC (SCLC).

Extended Data Fig. 10 Characterization of hypermethylated regions from cfDNA that are not methylated in leukocytes.

a, Violin plots for the DNA methylation (plotted as beta value) of 38,352 regions in normal blood cells selected on the basis of low DNA methylation levels using IHEC whole-genome bisulfite sequencing data. For violin plots, the ends of the boxes represent the lower and upper quartiles and the middle line represents the median. Whiskers represent 1.5× IQR, and outliers are excluded. Rotated kernel densities are also displayed. b, Volcano plots representing the regions with low DNA methylation levels in normal blood cells that overlap with hypermethylated regions in the plasma cfDNA for PDAC (n = 3,146 CpG sites) relative to normal tissue, and RCC (n = 2,767 CpG sites), BLCA (n = 3,286 CpG sites), BRCA (n = 6,836 CpG sites), CRC (n = 8,360 CpG sites) and LUC (n = 5,239 CpG sites) relative to PBMCs. The x axis represents DNA methylation (plotted as delta beta value), obtained from tumour data from TCGA for cancers other than PDAC and RRBS for PDAC. The y axis represents −log₁₀ q values (Benjamini Hochberg false discovery rate, BHFDR).
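The y axis of the volcano plots in panel b uses q values from the Benjamini–Hochberg procedure. The sketch below shows the standard step-up adjustment and the −log₁₀ transform. It is a generic illustration with made-up P values, not the code used to produce the figure.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def neg_log10(q_values):
    """Transform q values for plotting on a volcano-plot y axis."""
    return [-math.log10(q) for q in q_values]


if __name__ == "__main__":
    # Hypothetical P values for four regions.
    q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
    print(q)
    print(neg_log10(q))
```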

Supplementary information

Supplementary Tables

This file contains Supplementary Tables 1-13 and an SI Tables Guide with full table legends.

Reporting Summary

About this article

Cite this article

Shen, S.Y., Singhania, R., Fehringer, G. et al. Sensitive tumour detection and classification using plasma cell-free DNA methylomes. Nature 563, 579–583 (2018). https://doi.org/10.1038/s41586-018-0703-0

Received: 05 December 2016
Accepted: 25 September 2018
Published: 14 November 2018
Issue Date: 22 November 2018
DOI: https://doi.org/10.1038/s41586-018-0703-0
