scGHOST: identifying single-cell 3D genome subcompartments

Abstract

Single-cell Hi-C (scHi-C) technologies allow for probing of genome-wide cell-to-cell variability in three-dimensional (3D) genome organization from individual cells. Computational methods have been developed to reveal single-cell 3D genome features based on scHi-C, including A/B compartments, topologically associating domains and chromatin loops. However, no method exists for annotating single-cell subcompartments, which is important for understanding chromosome spatial localization in single cells. Here we present scGHOST, a single-cell subcompartment annotation method using graph embedding with constrained random walk sampling. Applications of scGHOST to scHi-C data and contact maps derived from single-cell 3D genome imaging demonstrate reliable identification of single-cell subcompartments, offering insights into cell-to-cell variability of nuclear subcompartments. Using scHi-C data from complex tissues, scGHOST identifies cell-type-specific or allele-specific subcompartments linked to gene transcription across various cell types and developmental stages, suggesting functional implications of single-cell subcompartments. scGHOST is an effective method for annotating single-cell 3D genome subcompartments in a broad range of biological contexts.

Fig. 1: Overview of the scGHOST framework.
Fig. 2: scGHOST’s application to GM12878 single-cell Hi-C data showcases its accuracy in annotating single-cell subcompartments.
Fig. 3: scGHOST’s application to WTC11 scHi-C data and IMR90 single-cell 3D genome imaging data.
Fig. 4: scGHOST’s application to scHi-C data from the Lee et al. human PFC and Tan et al. developing mouse brain datasets.
Fig. 5: Application to HiRES data of developing mouse embryos.

Data availability

In this work, we used several public datasets. scHi-C data for the GM12878 cell line (ref. 12) were downloaded from the 4DN Data Portal (refs. 29,40) (4DNES4D5MWEZ, 4DNESUE2NSGS and 4DNESTVIP977) in FASTQ format and were processed into contact maps at 500-kb resolution using the recommended processing pipeline of the data source (https://github.com/VRam142/combinatorialHiC; ref. 41). The scHi-C dataset of the human prefrontal cortex (ref. 14) was downloaded from the Gene Expression Omnibus (GEO) (GSE130711) in contact pairs format, which was then transformed into contact maps at 500-kb resolution. The WTC11 scHi-C dataset was downloaded from the 4DN Data Portal (ref. 29) (4DNESF829JOW and 4DNESJQ4RXY5). All scHi-C datasets were imputed with Higashi (ref. 17) (https://github.com/ma-compbio/Higashi; ref. 42) with default parameters. The Dip-C developing mouse brain dataset (ref. 13) was downloaded from the GEO (GSE162511). The HiRES developing mouse embryos dataset (ref. 33) was downloaded from the GEO (GSE223917).

We downloaded the following ENCODE datasets: ENCFF167NBF, ENCFF171MDW, ENCFF803DJF, ENCFF776OVW, ENCFF001GNK, ENCFF001GNN, ENCFF001GOA, ENCFF001GNX, ENCFF001GNT, ENCFF001GNR, ENCFF001GRA, ENCFF001GRD, ENCFF001GRQ, ENCFF001GRM, ENCFF001GRJ, ENCFF001GRG, ENCFF834HNV, ENCFF066MEE, ENCFF366BVS, ENCFF050ZTH and ENCFF519FHW. The imaging dataset (ref. 31) was obtained from Zenodo (https://doi.org/10.5281/zenodo.3928890; ref. 43). We also downloaded the scRNA-seq of multiple cortical areas of the human brain from the Allen Brain Map (refs. 44,45).

The marker genes for astrocyte (Astro), oligodendrocyte (ODC), oligodendrocyte progenitor cell (OPC), endothelial cell (Endo), microglia (MG) and neuron cell types were identified using Seurat (refs. 46,47) with default parameters. For each cell type, the background was chosen as the rest of the cell types. When identifying marker genes for neuron subtypes, the background was chosen as the rest of the neuron cells. The genes were then ranked by the log fold change value between a specific cell type and the background.
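As an illustration of the step in which contact pairs are transformed into contact maps at 500-kb resolution, the following is a minimal sketch and not the pipeline used by the authors. It assumes intra-chromosomal contacts given as two arrays of 0-based base-pair positions; the function name `pairs_to_contact_map` and the input layout are hypothetical choices made for this example.

```python
import numpy as np

BIN_SIZE = 500_000  # 500-kb resolution, as stated in the text


def pairs_to_contact_map(pos1, pos2, chrom_length, bin_size=BIN_SIZE):
    """Bin intra-chromosomal contact pairs into a symmetric count matrix.

    pos1, pos2: 0-based base-pair coordinates of the two ends of each
    contact (hypothetical input layout, chosen for illustration).
    """
    pos1 = np.asarray(pos1, dtype=np.int64)
    pos2 = np.asarray(pos2, dtype=np.int64)
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    rows = pos1 // bin_size
    cols = pos2 // bin_size
    contact_map = np.zeros((n_bins, n_bins), dtype=np.int64)
    # np.add.at accumulates correctly when the same (row, col) repeats.
    np.add.at(contact_map, (rows, cols), 1)
    # Mirror off-diagonal contacts so the map is symmetric;
    # diagonal contacts are counted once.
    off_diag = rows != cols
    np.add.at(contact_map, (cols[off_diag], rows[off_diag]), 1)
    return contact_map


# Toy example: four contacts on a 1.5-Mb chromosome (three 500-kb bins).
example = pairs_to_contact_map(
    pos1=[100_000, 600_000, 1_200_000, 450_000],
    pos2=[700_000, 650_000, 1_400_000, 1_100_000],
    chrom_length=1_500_000,
)
print(example)
```

Real scHi-C pair files also carry chromosome names, strand and quality fields, and inter-chromosomal contacts, all of which this sketch ignores.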

Code availability

The source code of scGHOST can be accessed at https://github.com/ma-compbio/scGHOST (ref. 48), which has also been deposited to Zenodo (https://doi.org/10.5281/zenodo.10141210; ref. 49).

References

  1. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  2. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  3. Xiong, K. & Ma, J. Revealing Hi-C subcompartments by imputing inter-chromosomal chromatin interactions. Nat. Commun. 10, 5069 (2019).

  4. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  5. Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).

  6. Zheng, H. & Xie, W. The role of 3D genome organization in development and cell differentiation. Nat. Rev. Mol. Cell Biol. 20, 535–550 (2019).

  7. Marchal, C., Sima, J. & Gilbert, D. M. Control of DNA replication timing in the 3D genome. Nat. Rev. Mol. Cell Biol. 20, 721–737 (2019).

  8. Misteli, T. The self-organizing genome: principles of genome architecture and function. Cell 183, 28–45 (2020).

  9. Ramani, V. et al. Massively multiplex single-cell Hi-C. Nat. Methods 14, 263–266 (2017).

  10. Nagano, T. et al. Cell-cycle dynamics of chromosomal organization at single-cell resolution. Nature 547, 61–67 (2017).

  11. Tan, L., Xing, D., Chang, C.-H., Li, H. & Xie, X. S. Three-dimensional genome structures of single diploid human cells. Science 361, 924–928 (2018).

  12. Kim, H.-J. et al. Capturing cell type-specific chromatin compartment patterns by applying topic modeling to single-cell Hi-C data. PLoS Comput. Biol. 16, e1008173 (2020).

  13. Tan, L. et al. Changes in genome architecture and transcriptional dynamics progress independently of sensory experience during post-natal brain development. Cell 184, 741–758 (2021).

  14. Lee, D.-S. et al. Simultaneous profiling of 3D genome structure and DNA methylation in single human cells. Nat. Methods 16, 999–1006 (2019).

  15. Liu, H. et al. DNA methylation atlas of the mouse brain at single-cell resolution. Nature 598, 120–128 (2021).

  16. Zhou, J. et al. Robust single-cell Hi-C clustering by convolution- and random-walk–based imputation. Proc. Natl Acad. Sci. USA 116, 14011–14018 (2019).

  17. Zhang, R., Zhou, T. & Ma, J. Multiscale and integrative single-cell Hi-C analysis with Higashi. Nat. Biotechnol. 40, 254–261 (2022).

  18. Zhang, R., Zhou, T. & Ma, J. Ultrafast and interpretable single-cell 3D genome analysis with Fast-Higashi. Cell Syst. 13, 798–807 (2022).

  19. Zhang, Y. et al. Computational methods for analysing multiscale 3D genome organization. Nat. Rev. Genet. 25, 123–141 (2023).

  20. Zhou, T., Zhang, R. & Ma, J. The 3D genome structure of single cells. Annu. Rev. Biomed. Data Sci. 4, 21–41 (2021).

  21. Yu, M. et al. SnapHiC: a computational pipeline to identify chromatin loops from single-cell Hi-C data. Nat. Methods 18, 1056–1059 (2021).

  22. Belmont, A. S. Nuclear compartments: an incomplete primer to nuclear compartments, bodies, and genome organization relative to nuclear architecture. Cold Spring Harb. Perspect. Biol. 14, a041268 (2022).

  23. Liu, Y. et al. Systematic inference and comparison of multi-scale chromatin sub-compartments connects spatial organization to cell phenotypes. Nat. Commun. 12, 2439 (2021).

  24. Ashoor, H. et al. Graph embedding and unsupervised learning predict genomic sub-compartments from HiC chromatin interaction data. Nat. Commun. 11, 1173 (2020).

  25. Grover, A. & Leskovec, J. node2vec: scalable feature learning for networks. In Proc. of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 855–864 (Association for Computing Machinery, 2016).

  26. Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. Preprint at https://arxiv.org/abs/1301.3781 (2013).

  27. Trojer, P. & Reinberg, D. Facultative heterochromatin: is there a distinctive molecular signature? Mol. Cell 28, 1–13 (2007).

  28. Zhu, C. et al. Joint profiling of histone modifications and transcriptome in single cells from mouse brain. Nat. Methods 18, 283–292 (2021).

  29. Reiff, S. B. et al. The 4D Nucleome Data Portal as a resource for searching and visualizing curated nucleomics data. Nat. Commun. 13, 2365 (2022).

  30. Friedman, C. E. et al. Single-cell transcriptomic analysis of cardiac differentiation from human PSCs reveals HOPX-dependent cardiomyocyte maturation. Cell Stem Cell 23, 586–598 (2018).

  31. Su, J.-H., Zheng, P., Kinrot, S. S., Bintu, B. & Zhuang, X. Genome-scale imaging of the 3D organization and transcriptional activity of chromatin. Cell 182, 1641–1659 (2020).

  32. Perez, J. D. et al. Quantitative and functional interrogation of parent-of-origin allelic expression biases in the brain. eLife 4, e07860 (2015).

  33. Liu, Z. et al. Linking genome structures to functions by simultaneous single-cell Hi-C and RNA-seq. Science 380, 1070–1076 (2023).

  34. Zhou, T. et al. Concurrent profiling of multiscale 3D genome organization and gene expression in single mammalian cells. Preprint at bioRxiv https://doi.org/10.1101/2023.07.20.549578 (2023).

  35. Tang, J. et al. LINE: large-scale information network embedding. In Proc. of the 24th International Conference on World Wide Web 1067–1077 (Association for Computing Machinery, 2015).

  36. Perozzi, B., Al-Rfou, R. & Skiena, S. DeepWalk: online learning of social representations. In Proc. of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 701–710 (Association for Computing Machinery, 2014).

  37. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. of the 3rd International Conference on Learning Representations (ICLR, 2015).

  38. Satopaa, V., Albrecht, J., Irwin, D. & Raghavan, B. Finding a ‘kneedle’ in a haystack: detecting knee points in system behavior. In 2011 31st International Conference on Distributed Computing Systems Workshops 166–171 (IEEE, 2011).

  39. Arvai, K. kneed. GitHub https://github.com/arvkevi/kneed (2020).

  40. Dekker, J. et al. The 4D nucleome project. Nature 549, 219–226 (2017).

  41. VRam142/combinatorialHiC. GitHub https://github.com/VRam142/combinatorialHiC (2017).

  42. ma-compbio/Higashi. GitHub https://github.com/ma-compbio/Higashi (2022).

  43. Su, J.-H., Zheng, P., Kinrot, S., Bintu, B. & Zhuang, X. Genome-scale imaging of the 3D organization and transcriptional activity of chromatin. Zenodo https://doi.org/10.5281/zenodo.3928890 (2020).

  44. Hawrylycz, M. J. et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391–399 (2012).

  45. Hodge, R. D. et al. Conserved cell types with divergent features in human versus mouse cortex. Nature 573, 61–68 (2019).

  46. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).

  47. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).

  48. ma-compbio/scGHOST. GitHub https://github.com/ma-compbio/scGHOST (2024).

  49. Xiong, K., Zhang, R. & Ma, J. scGHOST. Zenodo https://doi.org/10.5281/zenodo.10116434 (2023).

Acknowledgements

This work was supported, in part, by National Institutes of Health Common Fund 4D Nucleome Program grant UM1HG011593 (J.M.); National Institutes of Health Common Fund Cellular Senescence Network Program grant UG3CA268202 (J.M.); and National Institutes of Health grants R01HG007352 (J.M.) and R01HG012303 (J.M.). J.M. was additionally supported by a Guggenheim Fellowship from the John Simon Guggenheim Memorial Foundation, a Google Research Collabs Award and a Single-Cell Biology Data Insights award from the Chan Zuckerberg Initiative. R.Z. was additionally supported by funding from the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard.

Author information

Contributions

K.X. and J.M. conceived the development of this work. K.X. and R.Z. developed the software tools. K.X., R.Z. and J.M. conducted data analysis and investigation. K.X., R.Z. and J.M. wrote the paper. J.M. acquired funding to support this work.

Corresponding author

Correspondence to Jian Ma.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Ming Hu, Fulai Jin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Lei Tang, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–32 and Supplementary Note.

Reporting Summary

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xiong, K., Zhang, R. & Ma, J. scGHOST: identifying single-cell 3D genome subcompartments. Nat Methods (2024). https://doi.org/10.1038/s41592-024-02230-9

  • DOI: https://doi.org/10.1038/s41592-024-02230-9
