Data availability
HIC files for both the DLBCL and healthy B cell datasets (ref. 2) are available for download at https://github.com/vaquerizaslab/chess/tree/master/examples/dlbcl, while the raw FASTQ files can be accessed from the ArrayExpress archive under accession code E-MTAB-5875. COOL files for the reproduced and shuffled data can be accessed from https://github.com/hanjunlee21/StructuralSimilarity/tree/main/COOL and have been deposited at Zenodo (https://doi.org/10.5281/zenodo.7937194). HIC files for seven human cell types (refs. 5,6) are available for download at the Gene Expression Omnibus under accession code GSE63525. FASTQ files for the GM12878 dataset are available for download under accession code GSM2360314. The DNase I hypersensitivity assay dataset for GM12878 is available for download at https://www.encodeproject.org/experiments/ENCSR000EMT/. Source data are provided with this paper.
Code availability
All code required for the reproduction of our findings is available on GitHub (https://github.com/hanjunlee21/StructuralSimilarity) and has been deposited at Zenodo (https://doi.org/10.5281/zenodo.7937194). The HiCShuffle source code is publicly available at https://github.com/hanjunlee21/HiCShuffle, is indexed in PyPI as hicshuffle and has been deposited at Zenodo (https://doi.org/10.5281/zenodo.7937187). The CHESS source code (ref. 1) is publicly available at https://github.com/vaquerizaslab/CHESS and is indexed in PyPI as chess-hic.
References
1. Galan, S. et al. CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction. Nat. Genet. 52, 1247–1255 (2020).
2. Díaz, N. et al. Chromatin conformation analysis of primary patient tissue using a low input Hi-C method. Nat. Commun. 9, 4938 (2018).
3. Van Der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
4. Ing-Simmons, E., Machnik, N. & Vaquerizas, J. M. Reply to: Revisiting the use of structural similarity index in Hi-C. Nat. Genet. https://doi.org/10.1038/s41588-023-01595-5 (2023).
5. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
6. Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).
7. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
8. Müller, C. A. et al. The dynamics of genome replication using deep sequencing. Nucleic Acids Res. 42, e3 (2014).
9. Van Steensel, B. & Belmont, A. S. Lamina-associated domains: links with chromosome architecture, heterochromatin, and gene repression. Cell 169, 780–791 (2017).
10. Djekidel, M. N., Chen, Y. & Zhang, M. Q. FIND: difFerential chromatin INteractions Detection using a spatial Poisson process. Genome Res. 28, 412–422 (2018).
11. Knight, P. A. & Ruiz, D. A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 1029–1047 (2013).
12. Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).
13. Busby, M. A. et al. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12, 635 (2011).
Author information
Contributions
H.L., B.B., M.S.L. and T.S. conceptualized the study. H.L. designed the software and carried out the investigation. H.L. prepared and wrote the original draft of the manuscript. H.L., B.B., M.S.L. and T.S. reviewed and edited the draft. H.L., M.S.L. and T.S. supervised the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Schematic of data shuffling.
To destroy any significant differences in chromatin contacts between the query and reference input files of CHESS, reads from the FASTQ files of Díaz et al. (ref. 2) were shuffled to create hybrid FASTQ files containing identical fractions of reads from the DLBCL and NORMAL libraries. DLBCL, diffuse large B-cell lymphoma.
Extended Data Fig. 2 Assessment of the mean SSIM subtraction approach proposed by Ing-Simmons et al. (ref. 4).
a, Distributions of mean SSIM in chromosome 2p for each chromatin-contact map comparison. The gray line indicates the subtracted mean SSIM value, defined as the difference between the mean SSIM value of diffuse large B-cell lymphoma versus healthy B cells (blue) and the mean SSIM value of the two shuffled datasets (red). b, Scatter plot of the relationship between the mean SSIM values of diffuse large B-cell lymphoma versus healthy B cells and the subtracted mean SSIM values (Pearson’s r = −0.012, P = 0.796; two-tailed test). DLBCL, diffuse large B-cell lymphoma; SSIM, structural similarity index measure.
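The subtraction and correlation described in this legend can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the inputs are assumed to be per-region mean SSIM values aligned across the two comparisons, and the sign convention (real comparison minus shuffled comparison) is an assumption based on the wording above.

```python
import numpy as np


def subtracted_mean_ssim(real_ssim, shuffled_ssim):
    """Per-region difference: real comparison minus shuffled comparison.

    Both inputs are assumed to hold one mean SSIM value per genomic region,
    in the same region order (hypothetical layout, not from the paper).
    """
    real = np.asarray(real_ssim, dtype=float)
    null = np.asarray(shuffled_ssim, dtype=float)
    if real.shape != null.shape:
        raise ValueError("both comparisons must cover the same regions")
    return real - null


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Toy values only; these are not data from the study.
real = [0.2, 0.4, 0.6]
shuffled = [0.1, 0.1, 0.1]
diff = subtracted_mean_ssim(real, shuffled)
r = pearson_r(real, diff)
```

A correlation near zero between the original and subtracted values, as reported in panel b, would indicate that the subtracted score carries little of the ranking information in the original mean SSIM.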
Extended Data Fig. 3 Assessment of the heuristic approach proposed by Ing-Simmons et al. (ref. 4).
a, Scatter plots of three key metrics (mean SSIM, inverse of the Fano factor, and mean absolute fold change). Magenta dots indicate regions that passed the heuristically defined thresholds proposed by Ing-Simmons et al. (ref. 4), namely the bottom 10th percentile for mean SSIM and the 90th percentile for the Fano factor, whereas gray dots indicate regions that failed the thresholds. For each group, three representative regions were selected for further analyses (panels 1–6). b, Chromatin-contact maps for panels 1–6. Regions that passed the heuristically defined thresholds exhibited shallow read coverage and showed limited evidence of differential chromatin contact. DLBCL, diffuse large B-cell lymphoma; SSIM, structural similarity index measure.
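The percentile-based selection named in this legend can be illustrated with a short sketch. The function and argument names are hypothetical, and the direction of the second cutoff (keeping regions at or above the 90th percentile of the Fano-factor-based metric) is an assumption, since the legend does not state it explicitly.

```python
import numpy as np


def select_regions(mean_ssim, fano_metric, ssim_pct=10.0, fano_pct=90.0):
    """Boolean mask of regions passing both percentile thresholds.

    Assumed rule: mean SSIM at or below its ssim_pct percentile AND the
    Fano-factor-based metric at or above its fano_pct percentile.
    """
    mean_ssim = np.asarray(mean_ssim, dtype=float)
    fano_metric = np.asarray(fano_metric, dtype=float)
    if mean_ssim.shape != fano_metric.shape:
        raise ValueError("both metrics must cover the same regions")
    ssim_cut = np.percentile(mean_ssim, ssim_pct)
    fano_cut = np.percentile(fano_metric, fano_pct)
    return (mean_ssim <= ssim_cut) & (fano_metric >= fano_cut)


# Toy values only; these are not data from the study.
ssim = np.arange(1, 11) / 10
fano = np.array([10, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
mask = select_regions(ssim, fano)
```

Because both cutoffs are percentiles of the observed distributions, some regions can pass whatever the absolute values are, which is one reason to inspect the selected regions directly, as panel b does.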
Extended Data Fig. 4 Schematic of data shuffling using HiCShuffle.
HiCShuffle is Python-based software indexed in PyPI as hicshuffle. For paired-end experiments, HiCShuffle generates four GZIP-compressed shuffled FASTQ files, each containing half of the query FASTQ file and half of the reference FASTQ file. HiCShuffle is compatible with both FASTQ and GZIP-compressed FASTQ formats and with UNIX-based systems.
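The shuffling idea can be sketched in a few lines of Python. This is not the HiCShuffle implementation: the function names, the in-memory record layout (one tuple of read 1 and read 2 per fragment) and the seeding are all assumptions made for illustration. It builds two hybrid libraries, each drawing half of its read pairs from the query and half from the reference, which would yield four output files for a paired-end experiment.

```python
import gzip
import random


def shuffle_pairs(query_pairs, reference_pairs, seed=0):
    """Build two hybrid libraries, each half query and half reference.

    Inputs are lists of (read1, read2) records, so mates stay together.
    """
    rng = random.Random(seed)
    query = list(query_pairs)
    reference = list(reference_pairs)
    rng.shuffle(query)
    rng.shuffle(reference)
    q_half, r_half = len(query) // 2, len(reference) // 2
    hybrid_a = query[:q_half] + reference[:r_half]
    hybrid_b = query[q_half:] + reference[r_half:]
    # Mix the order so that library of origin is not encoded by position.
    rng.shuffle(hybrid_a)
    rng.shuffle(hybrid_b)
    return hybrid_a, hybrid_b


def split_mates(pairs):
    """Separate paired records into read 1 and read 2 lists, same order."""
    return [p[0] for p in pairs], [p[1] for p in pairs]


def write_fastq_gz(path, records):
    """Write FASTQ records (complete four-line strings) to a gzip file."""
    with gzip.open(path, "wt") as handle:
        for record in records:
            handle.write(record)
```

Keeping both mates in one tuple while shuffling preserves the read pairing that Hi-C mapping depends on. Holding every record in memory is only practical for small inputs, so a real tool would need a streaming strategy.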
Supplementary information
Source data
Source Data Extended Data Figs. 2 and 3
Statistical source data.
About this article
Cite this article
Lee, H., Blumberg, B., Lawrence, M.S. et al. Revisiting the use of structural similarity index in Hi-C. Nat Genet 55, 2049–2052 (2023). https://doi.org/10.1038/s41588-023-01594-6
DOI: https://doi.org/10.1038/s41588-023-01594-6