Technical Report

Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA

Nature Genetics volume 49, pages 635–642 (2017)

Abstract

Adjacent CpG sites in mammalian genomes can be co-methylated owing to the processivity of methyltransferases or demethylases, yet discordant methylation patterns have also been observed, which are related to stochastic or uncoordinated molecular processes. We focused on a systematic search and investigation of regions in the full human genome that show highly coordinated methylation. We defined 147,888 blocks of tightly coupled CpG sites, called methylation haplotype blocks, after analysis of 61 whole-genome bisulfite sequencing data sets and validation with 101 reduced-representation bisulfite sequencing data sets and 637 methylation array data sets. Using a metric called methylation haplotype load, we performed tissue-specific methylation analysis at the block level. Subsets of informative blocks were further identified for deconvolution of heterogeneous samples. Finally, using methylation haplotypes we demonstrated quantitative estimation of tumor load and tissue-of-origin mapping in the circulating cell-free DNA of 59 patients with lung or colorectal cancer.
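
The abstract names the methylation haplotype load (MHL) without defining it. As an illustration only, the sketch below assumes the commonly cited formulation: for a block of L CpG sites, MHL is the weighted mean, over substring lengths i = 1..L, of the fraction of length-i substrings of the observed haplotypes that are fully methylated, with weights w_i = i. The function name, the '1'/'0' string encoding of reads, and the requirement that every read covers the whole block are assumptions made for this example, not details taken from the text above.

```python
from typing import Sequence


def methylation_haplotype_load(haplotypes: Sequence[str]) -> float:
    """Compute an MHL-style score for one block (illustrative sketch).

    haplotypes: one string per read, '1' = methylated CpG, '0' = unmethylated.
    Assumption: all reads span the same CpG sites, so all strings have equal length.
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")
    if any(set(h) - {"0", "1"} for h in haplotypes):
        raise ValueError("haplotypes may contain only '0' and '1'")

    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(1, length + 1):
        fully_methylated = "1" * i
        total = 0
        methylated = 0
        # Count every contiguous substring of length i across all reads.
        for h in haplotypes:
            for start in range(length - i + 1):
                total += 1
                if h[start:start + i] == fully_methylated:
                    methylated += 1
        # Assumed weight w_i = i, so longer fully methylated runs count more.
        weighted_sum += i * (methylated / total)
        weight_total += i
    return weighted_sum / weight_total
```

Under these assumptions, a block covered only by fully methylated reads scores 1.0, a block covered only by unmethylated reads scores 0.0, and an equal mix of the two scores 0.5. Reads with alternating methylation (for example `"1010"`) score far lower than their 50% average methylation level would suggest, which is the property that makes a haplotype-level metric distinct from per-site averages.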

Accessions

Primary accessions

Gene Expression Omnibus

Referenced accessions

Gene Expression Omnibus

Acknowledgements

We thank S. Kaushal for managing and handling patient samples in the UCSD Moores Cancer Center Biorepository Tissue Technology Shared Resource, and S.M. Lippman, R. Liu and B. Ren for insightful discussions. This study was supported by US National Institutes of Health grants R01GM097253 (Kun Zhang), R01CA217642 (Kang Zhang), R01EY025090 (Kang Zhang) and P30CA23100 (S.M.L.), and a VA Merit Award (Kang Zhang).

Author information

Author notes

    • Shicheng Guo & Dinh Diep

    These authors contributed equally to this work.

Affiliations

  1. Department of Bioengineering, University of California at San Diego, La Jolla, California, USA.

    • Shicheng Guo, Dinh Diep, Nongluk Plongthongkum, Ho-Lim Fung & Kun Zhang

  2. Institute for Genomic Medicine, University of California at San Diego, La Jolla, California, USA.

    • Kang Zhang & Kun Zhang

  3. Shiley Eye Institute, University of California at San Diego, La Jolla, California, USA.

    • Kang Zhang

  4. Veterans Administration Healthcare System, San Diego, California, USA.

    • Kang Zhang

Authors

  1. Shicheng Guo

  2. Dinh Diep

  3. Nongluk Plongthongkum

  4. Ho-Lim Fung

  5. Kang Zhang

  6. Kun Zhang

Contributions

Kun Zhang conceived the initial concept and oversaw the study; S.G., D.D. and Kun Zhang performed the bioinformatics analyses; N.P., D.D. and H.-L.F. performed the experiments; Kang Zhang contributed plasma samples from healthy individuals; and Kun Zhang, S.G. and D.D. wrote the manuscript with input from all co-authors.

Competing interests

S.G., D.D. and Kun Zhang are listed as inventors in patent applications related to the methods disclosed in this manuscript, and Kun Zhang is a co-founder and scientific advisor of Singlera Genomics, Inc.

Corresponding author

Correspondence to Kun Zhang.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–13 and Supplementary Note

Excel files

  1. Supplementary Tables

    Supplementary Tables 1–13

About this article

DOI

https://doi.org/10.1038/ng.3805
