  • Technical Report

CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction

A Matters Arising to this article was published on 05 December 2023

Abstract

Dynamic changes in the three-dimensional (3D) organization of chromatin are associated with central biological processes, such as transcription, replication and development. Therefore, the comprehensive identification and quantification of these changes is fundamental to understanding evolutionary and regulatory mechanisms. Here, we present Comparison of Hi-C Experiments using Structural Similarity (CHESS), an algorithm for the comparison of chromatin contact maps and automatic differential feature extraction. We demonstrate the robustness of CHESS to experimental variability and showcase its biological applications on (1) interspecies comparisons of syntenic regions in human and mouse models; (2) intraspecies identification of conformational changes in Zelda-depleted Drosophila embryos; (3) patient-specific aberrant chromatin conformation in a diffuse large B-cell lymphoma sample; and (4) the systematic identification of chromatin contact differences in high-resolution Capture-C data. In summary, CHESS is a computationally efficient method for the comparison and classification of changes in chromatin contact data.

Fig. 1: CHESS overview and examples.
Fig. 2: CHESS evaluation on synthetic Hi-C matrices.
Fig. 3: Global comparison of syntenic region similarity between human and mouse using CHESS.
Fig. 4: Identification of chromatin conformational changes in fly embryos after Zelda (zld) knockdown.
Fig. 5: Identification of structural changes in a DLBCL.
Fig. 6: Feature extraction from Capture-C data.

Data availability

The datasets analyzed in this study have been obtained from the Gene Expression Omnibus (Rao et al.10, GSE63525; Bonev et al.12, GSE96107; Despang et al.50, GSE125294) and ArrayExpress (Hug et al.9, E-MTAB-4918; Díaz et al.48, E-MTAB-5875).

Code availability

The CHESS source code and the code for generating the synthetic Hi-C matrices and running tests on them are available on GitHub (https://github.com/vaquerizaslab/CHESS). The intervaltree and tqdm packages used internally in CHESS can be found at https://github.com/chaimleib/intervaltree and https://github.com/tqdm/tqdm, respectively. In addition, CHESS internally uses the following published packages: FAN-C71 (https://github.com/vaquerizaslab/fanc); Cython72; SciPy69; scikit-image59; NumPy73,74; Pandas75; Pathos76; Pybedtools77; Kneed78.

References

  1. Bonev, B. & Cavalli, G. Organization and function of the 3D genome. Nat. Rev. Genet. 17, 661–678 (2016).

  2. Vietri Rudan, M. et al. Comparative Hi-C reveals that CTCF underlies evolution of chromosomal domain architecture. Cell Rep. 10, 1297–1309 (2015).

  3. Acemel, R. D., Maeso, I. & Gómez‐Skarmeta, J. L. Topologically associated domains: a successful scaffold for the evolution of gene regulation in animals. Wiley Interdiscip. Rev. Dev. Biol. 6, e265 (2017).

  4. Lazar, N. H. et al. Epigenetic maintenance of topological domains in the highly rearranged gibbon genome. Genome Res. 28, 983–997 (2018).

  5. Eres, I. E., Luo, K., Hsiao, C. J., Blake, L. E. & Gilad, Y. Reorganization of 3D genome structure may contribute to gene regulatory evolution in primates. PLoS Genet. 15, e1008278 (2019).

  6. Yang, Y., Zhang, Y., Ren, B., Dixon, J. R. & Ma, J. Comparing 3D genome organization in multiple species using phylo-HMRF. Cell Syst. 8, 494–505.e14 (2019).

  7. Ke, Y. et al. 3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis. Cell 170, 367–381.e20 (2017).

  8. Du, Z. et al. Allelic reprogramming of 3D chromatin architecture during early mammalian development. Nature 547, 232–235 (2017).

  9. Hug, C. B., Grimaldi, A. G., Kruse, K. & Vaquerizas, J. M. Chromatin architecture emerges during zygotic genome activation independent of transcription. Cell 169, 216–228.e19 (2017).

  10. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  11. Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336 (2015).

  12. Bonev, B. et al. Multiscale 3D genome rewiring during mouse neural development. Cell 171, 557–572.e24 (2017).

  13. Nagano, T. et al. Cell-cycle dynamics of chromosomal organization at single-cell resolution. Nature 547, 61–67 (2017).

  14. Gibcus, J. H. et al. A pathway for mitotic chromosome formation. Science 359, eaao6135 (2018).

  15. Spielmann, M., Lupiáñez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).

  16. Krijger, P. H. L. & de Laat, W. Regulation of disease-associated gene expression in the 3D genome. Nat. Rev. Mol. Cell Biol. 17, 771–782 (2016).

  17. Darrow, E. M. et al. Deletion of DXZ4 on the human inactive X chromosome alters higher-order genome architecture. Proc. Natl Acad. Sci. USA 113, E4504–E4512 (2016).

  18. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  19. Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).

  20. Yang, T. et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 27, 1939–1949 (2017).

  21. Sauria, M. E. G. & Taylor, J. QuASAR: quality assessment of spatial arrangement reproducibility in Hi-C data. Preprint at bioRxiv https://doi.org/10.1101/204438 (2017).

  22. Ursu, O. et al. GenomeDISCO: a concordance score for chromosome conformation capture experiments using random walks on contact map graphs. Bioinformatics 34, 2701–2707 (2018).

  23. Yan, K.-K., Yardımcı, G. G., Yan, C., Noble, W. S. & Gerstein, M. HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps. Bioinformatics 33, 2199–2201 (2017).

  24. Shavit, Y. & Lio’, P. Combining a wavelet change point and the Bayes factor for analysing chromosomal interaction data. Mol. Biosyst. 10, 1576–1585 (2014).

  25. Huynh, L. & Hormozdiari, F. TAD fusion score: discovery and ranking the contribution of deletions to genome structure. Genome Biol. 20, 60 (2019).

  26. Paulsen, J. et al. HiBrowse: multi-purpose statistical analysis of genome-wide chromatin 3D organization. Bioinformatics 30, 1620–1622 (2014).

  27. Lareau, C. A. & Aryee, M. J. diffloop: a computational framework for identifying and analyzing differential DNA loops from sequencing data. Bioinformatics 34, 672–674 (2018).

  28. Djekidel, M. N., Chen, Y. & Zhang, M. Q. FIND: difFerential chromatin INteractions Detection using a spatial Poisson process. Genome Res. 28, 412–422 (2018).

  29. Stansfield, J. C., Cresswell, K. G., Vladimirov, V. I. & Dozmorov, M. G. HiCcompare: an R-package for joint normalization and comparison of HI-C datasets. BMC Bioinformatics 19, 279 (2018).

  30. Lun, A. T. L. & Smyth, G. K. diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics 16, 258 (2015).

  31. Cook, K. B., Hristov, B. H., Le Roch, K. G., Vert, J. P. & Noble, W. S. Measuring significant changes in chromatin conformation with ACCOST. Nucleic Acids Res. 48, 2303–2311 (2020).

  32. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  33. Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).

  34. Wang, Z. & Bovik, A. C. A universal image quality index. IEEE Signal Process. Lett. 9, 81–84 (2002).

  35. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  36. Harmston, N. et al. Topologically associating domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation. Nat. Commun. 8, 441 (2017).

  37. Lee, J. et al. Synteny Portal: a web-based application portal for synteny block analysis. Nucleic Acids Res. 44, W35–W40 (2016).

  38. Schwarzer, W. et al. Two independent modes of chromatin organization revealed by cohesin removal. Nature 551, 51–56 (2017).

  39. Nora, E. P. et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944.e22 (2017).

  40. Haarhuis, J. H. I. et al. The cohesin release factor WAPL restricts chromatin loop extension. Cell 169, 693–707.e14 (2017).

  41. Rao, S. S. P. et al. Cohesin loss eliminates all loop domains. Cell 171, 305–320.e24 (2017).

  42. Wutz, G. et al. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 36, 3573–3599 (2017).

  43. Gassler, J. et al. A mechanism of cohesin-dependent loop extrusion organizes zygotic genome architecture. EMBO J. 36, 3600–3618 (2017).

  44. Lupiáñez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).

  45. Franke, M. et al. Formation of new chromatin domains determines pathogenicity of genomic duplications. Nature 538, 265–269 (2016).

  46. Flavahan, W. A. et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016).

  47. Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).

  48. Díaz, N. et al. Chromatin conformation analysis of primary patient tissue using a low input Hi-C method. Nat. Commun. 9, 4938 (2018).

  49. Hughes, J. R. et al. Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment. Nat. Genet. 46, 205–212 (2014).

  50. Despang, A. et al. Functional dissection of the Sox9–Kcnj2 locus identifies nonessential and instructive roles of TAD architecture. Nat. Genet. 51, 1263–1271 (2019).

  51. Kalhor, R., Tjong, H., Jayathilaka, N., Alber, F. & Chen, L. Genome architectures revealed by tethered chromosome conformation capture and population-based modeling. Nat. Biotechnol. 30, 90–98 (2011).

  52. Lin, D. et al. Digestion-ligation-only Hi-C is an efficient and cost-effective method for chromosome conformation capture. Nat. Genet. 50, 754–763 (2018).

  53. Beagrie, R. A. et al. Complex multi-enhancer contacts captured by genome architecture mapping. Nature 543, 519–524 (2017).

  54. Cardozo Gizzi, A. M. et al. Microscopy-based chromosome conformation capture enables simultaneous visualization of genome organization and transcription in intact organisms. Mol. Cell 74, 212–222.e5 (2019).

  55. Sampat, M. P., Wang, Z., Gupta, S., Bovik, A. C. & Markey, M. K. Complex wavelet structural similarity: a new image similarity index. IEEE Trans. Image Process. 18, 2385–2401 (2009).

  56. Homola, T., Dohnal, V. & Zezula, P. Searching for sub-images using sequence alignment. In Proc. 2011 IEEE International Symposium on Multimedia 61–68 (IEEE, 2011).

  57. Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).

  58. Knight, P. A. & Ruiz, D. A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 1029–1047 (2013).

  59. van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).

  60. Behara, K. N. S., Bhaskar, A. & Chung, E. Geographical window based structural similarity index for OD matrices comparison. J. Intell. Transp. Syst., https://doi.org/10.1080/15472450.2020.1795651 (2020).

  61. Djukic, T., Hoogendoorn, S. & Van Lint, H. Reliability assessment of dynamic OD estimation methods based on structural similarity index. In Proc. Transportation Research Board 92nd Annual Meeting (Transportation Research Board, 2013).

  62. Breakey, D. & Meskell, C. Comparison of metrics for the evaluation of similarity in acoustic pressure signals. J. Sound Vib. 332, 3605–3609 (2013).

  63. Hines, A. & Harte, N. Speech intelligibility prediction using a Neurogram Similarity Index Measure. Speech Commun. 54, 306–320 (2012).

  64. Tomasi, C. & Manduchi, R. Bilateral filtering for gray and color images. In Proc. Sixth International Conference on Computer Vision 839–846 (IEEE, 1998).

  65. Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. A Syst. Hum. 9, 62–66 (1979).

  66. Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012).

  67. Cock, P. J. A. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).

  68. Cournac, A., Marie-Nelly, H., Marbouty, M., Koszul, R. & Mozziconacci, J. Normalization of a chromosomal contact map. BMC Genomics 13, 436 (2012).

  69. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).

  70. Blythe, S. A. & Wieschaus, E. F. Zygotic genome activation triggers the DNA replication checkpoint at the midblastula transition. Cell 160, 1169–1181 (2015).

  71. Kruse, K., Hug, C. B. & Vaquerizas, J. M. FAN-C: a feature-rich framework for the analysis and visualisation of C data. Preprint at bioRxiv https://doi.org/10.1101/2020.02.03.932517 (2020).

  72. Behnel, S. et al. Cython: the best of both worlds. Comput. Sci. Eng. 13, 31–39 (2011).

  73. Oliphant, T. E. A Guide to NumPy (Trelgol Publishing, 2006).

  74. van der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy array: a structure for efficient numerical computation. Comput. Sci. Eng. 13, 22–30 (2011).

  75. McKinney, W. Data structures for statistical computing in Python. In Proc. Python in Science Conference 56–61 (SciPy.org, 2010).

  76. McKerns, M. M., Strand, L., Sullivan, T., Fang, A. & Aivazis, M. A. G. Building a framework for predictive science. Preprint at https://arxiv.org/abs/1202.1056 (2012).

  77. Dale, R. K., Pedersen, B. S. & Quinlan, A. R. Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics 27, 3423–3424 (2011).

  78. Satopaa, V., Albrecht, J., Irwin, D. & Raghavan, B. Finding a ‘Kneedle’ in a haystack: detecting knee points in system behavior. In Proc. 31st International Conference on Distributed Computing Systems Workshops 166–171 (IEEE Computer Society, 2011).

Acknowledgements

Work in the Vaquerizas laboratory is funded by the Max Planck Society, the Deutsche Forschungsgemeinschaft (DFG) Priority Programme SPP 2202 ‘Spatial Genome Architecture in Development and Disease’ (project no. 422857230 to J.M.V.), the DFG Clinical Research Unit CRU326 ‘Male Germ Cells: from Genes to Function’ (project no. 329621271 to J.M.V.), the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 643062, ZENCODE-ITN (to J.M.V.) and the Medical Research Council in the UK. This research was partially funded by the European Union’s H2020 Framework Programme through the European Research Council (grant no. 609989 to M.A.M.-R.). We acknowledge the support of the Spanish Ministerio de Ciencia, Innovación y Universidades through grant no. BFU2017-85926-P to M.A.M.-R. The Centre for Genomic Regulation acknowledges the support of the Ministerio de Ciencia, Innovación y Universidades to the European Molecular Biology Laboratory partnership, the ‘Centro de Excelencia Severo Ochoa 2013–2017’, agreement no. SEV-2012-0208, the CERCA Programme/Generalitat de Catalunya, Spanish Ministerio de Ciencia, Innovación y Universidades through the Instituto de Salud Carlos III, the Generalitat de Catalunya through the Departament de Salut and Departament d’Empresa i Coneixement and cofinancing by the Spanish Ministerio de Ciencia, Innovación y Universidades with funds from the European Regional Development Fund corresponding to the 2014–2020 Smart Growth Operating Program. S.G. acknowledges support from the Company of Biologists (grant no. JCSTF181158) and the European Molecular Biology Organization Short-Term Fellowship programme.

Author information

Contributions

N.M. and J.M.V. conceptualized the study. S.G., N.M. and K.K. devised the methodology. N.M. and J.M.V. carried out the investigation. S.G., K.K. and N.D. obtained the resources. S.G., N.M., K.K., M.A.M.-R. and J.M.V. prepared and wrote the original draft of the manuscript. S.G., N.M., K.K., N.D., M.A.M.-R. and J.M.V. wrote, reviewed and edited the draft. J.M.V. supervised the study. M.A.M.-R. and J.M.V. acquired the funding.

Corresponding author

Correspondence to Juan M. Vaquerizas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Performance analysis of the CHESS algorithm.

a, CHESS P values as a function of the relative noise level in synthetic matrices. Shown are the cases of equal amounts of noise in reference R and query Q (top) and different amounts of noise (bottom, noise added only to Q). Each case is examined for normalised and observed/expected (obs/exp) matrices and for different window sizes in the SSIM algorithm. b, Empirically determined CHESS P values as a function of the size factor between R and Q for normalised (left) and obs/exp matrices (right) (details in Methods). a,b, Solid lines indicate the mean and shaded areas the standard deviation over 100 simulations per parameter combination.

Extended Data Fig. 2 Technical details of the SSIM algorithm applied to Hi-C matrices.

a, Schematic overview of the structural similarity (SSIM) algorithm. SSIM scores are calculated on all submatrices of R/Q at a given window size (WS). The final SSIM score is the mean of all SSIM submatrix scores. b, SSIM submatrix formula. Different components are coloured: luminance (green), structure × contrast (red). x and y refer to submatrices (at the same positions) of the two full matrices for which the SSIM average is computed (see panel a). μ indicates the mean and σ the standard deviation; c1 and c2 are small constants that are introduced only for numerical stability. c,d, SSIM comparisons of a matrix to itself (red dots) and to 1,000 random matrices of the same size (blue dots). c, SSIM component values as a function of SSIM score for different SSIM window sizes. d, Scatterplots of ranked SSIM scores at window size 100 versus ranked scores at smaller window sizes.
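
The windowed SSIM computation described in panels a and b can be sketched in a few lines of NumPy. This is a minimal illustration and not the CHESS implementation: it assumes the standard two-term SSIM form of Wang et al.33 with population variances and unweighted windows, and the function names (`ssim_window`, `mean_ssim`) and the default values of c1 and c2 are hypothetical choices made here.

```python
import numpy as np


def ssim_window(x, y, c1=1e-4, c2=9e-4):
    """SSIM of two equally shaped submatrices x and y.

    c1 and c2 are small stabilising constants; the defaults are
    illustrative assumptions, not values taken from CHESS.
    """
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()            # population variances
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()  # population covariance

    # Luminance term compares the means of the two windows.
    luminance = (2 * mu_x * mu_y + c1) / (mu_x**2 + mu_y**2 + c1)
    # Combined structure x contrast term compares their (co)variances.
    structure_contrast = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * structure_contrast


def mean_ssim(r, q, window_size):
    """Mean SSIM over all window_size x window_size submatrices of r and q."""
    r = np.asarray(r, dtype=float)
    q = np.asarray(q, dtype=float)
    if r.shape != q.shape:
        raise ValueError("r and q must have the same shape")
    if window_size > min(r.shape):
        raise ValueError("window_size exceeds matrix size")

    n_rows, n_cols = r.shape
    scores = [
        ssim_window(r[i:i + window_size, j:j + window_size],
                    q[i:i + window_size, j:j + window_size])
        for i in range(n_rows - window_size + 1)
        for j in range(n_cols - window_size + 1)
    ]
    return float(np.mean(scores))


# Toy data: random matrices stand in for contact maps.
rng = np.random.default_rng(0)
reference = rng.random((20, 20))
query = rng.random((20, 20))

print(mean_ssim(reference, reference, window_size=7))  # matrix versus itself
print(mean_ssim(reference, query, window_size=7))      # unrelated matrices
```

A matrix compared with itself scores 1 by construction, whereas unrelated matrices score lower, which mirrors the self versus random comparison in panels c and d.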

Extended Data Fig. 3 Additional analysis of the CHESS algorithm.

a, Uniform distribution of empirically determined CHESS P values for comparisons of matrices with 100% noise added. b, Distribution of structural similarity (SSIM) scores for background and truth comparisons at 25 k/Mb and 1.5 M/Mb simulated sequencing depth. Above each: fractional change (value at x% noise / value at 0% noise) of the standard deviation (std) of background scores and mean of truth scores over 100 simulations per parameter combination.

Extended Data Fig. 4 CHESS is robust to changes in noise due to random ligations and sequencing depth in real Hi-C data.

We tested to what extent CHESS is able to identify two matrices as being identical after noise and sequencing depth were adjusted independently in them. Matrices are based on chromosome 19 data from Bonev et al.12. a, Examples of the 5 Mb matrices used in this analysis with 5, 80 and 95% added noise (random ligations between pairs of loci). b, Empirically determined P values and z-scores of CHESS runs with different window sizes, noise levels and simulated sequencing depths (details in Methods). Step size and matrix resolution were both 25 kb. Lines for 2 × 10^5 and 1 × 10^6 overlap for runs with window sizes > 1 Mb. c, As in panel a, but comparing CHESS runs with 2.5 Mb window size on matrices binned at 25 kb and 10 kb. b,c, Solid lines indicate the mean and shaded areas the standard deviation over 1,976, 2,066, 2,156, 2,246 and 2,300 matrix pairs for window sizes 10 Mb, 7.5 Mb, 5 Mb, 2.5 Mb and 1 Mb, respectively.

Extended Data Fig. 5 Reproducibility of CHESS using different window (WS) and step sizes (SS), sequencing depths and resolutions.

This analysis tested different WS (250 kb to 3 Mb), SS (25 kb to 1 Mb), sequencing depths (20 to 80% of reads) and resolutions (10 kb and 25 kb) (details in Methods). X-axis labels: varied parameters in parentheses, fixed parameters before. The first two boxplots with red dots represent the Jaccard indices (JI) between CHESS results in Bonev et al.12 using different WS, SS and sequencing depths. The boxplots with blue dots correspond to the Díaz et al.48 dataset; in this case using different WS, SS, and then between different WS, SS and resolutions. mESC, mouse embryonic stem cells; NPC, neural progenitor cells. Boxplot elements: centre line, median; whiskers, 1.5× interquartile range; box limits, upper and lower quartiles.
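
The Jaccard index used in this figure is the size of the intersection of two result sets divided by the size of their union. The snippet below is a small sketch of that definition only; how regions from two CHESS runs are matched to each other is not specified here, so exact matching of (chromosome, start, end) tuples is an assumption, as are the function name and the toy coordinates.

```python
def jaccard_index(regions_a, regions_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of regions."""
    a, b = set(regions_a), set(regions_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets are identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical significant regions from two runs with different parameters.
run_1 = {("chr1", 0, 2_000_000), ("chr1", 500_000, 2_500_000), ("chr2", 0, 2_000_000)}
run_2 = {("chr1", 500_000, 2_500_000), ("chr2", 0, 2_000_000), ("chr3", 0, 2_000_000)}

print(jaccard_index(run_1, run_2))  # 2 shared regions out of 4 distinct ones: 0.5
```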

Extended Data Fig. 6 CHESS benchmark against HOMER, diffHic and ACCOST.

a, UpSet plot representing the intersection size between differential interactions of CHESS, HOMER, diffHic and ACCOST. Below, an example is shown for each intersected group. b, Computational requirements of CHESS, HOMER, diffHic and ACCOST. The first line plot shows the CPU usage, the second the memory consumption. The vertical dashed line represents the end of the run.

Extended Data Fig. 7 CHESS performance on differently sized simulated matrices with realistic noise and sequencing depth.

Shown are empirically determined CHESS P values and z-scores (details in Methods) for comparisons of R with a read depth of 100 read pairs / 100 bins and a resized copy Q. The scaling factor is indicated on the x-axis. A noise level of 25% was added to both matrices independently. Sequencing depth was adjusted to 100 k/Mb. Solid lines indicate the mean and shaded areas the standard deviation over 100 simulations per parameter combination. Colours correspond to the different sizes of R.

Extended Data Fig. 8 Feature extraction from Capture-C data.

Examples of differential feature extraction with CHESS between the wild type (wt; top contact map) and different mutants (middle contact map) in the Despang et al.50 dataset. Lost and gained structures in the mutants are highlighted in blue and red squares, respectively. Log2 fold-change maps are depicted below (bottom contact map) with identified features coloured according to the directionality of the change. Below each comparison, the genomic annotation is represented, highlighting the modification of each mutant. The vertical lines mark the CTCF binding motifs, dashed when deleted. Red hexagons demarcate TAD boundaries. Feature extraction between wt and a, ∆Bor, in which the border was deleted. b, ∆BorC1, in which the border and the first CTCF binding motif were deleted. c, ∆BorC1-2, in which the border and the first two CTCF binding motifs were deleted. d, ∆BorC1-4, in which the border and four CTCF binding motifs were deleted. e, ∆CTCF, in which the border and all the CTCF binding motifs were removed. f, Bor-KnockIn, in which the border was moved to a new location within the Sox9 locus. g, InvC∆Bor, in which the Sox9 sequence was inverted and the border was removed.

Supplementary information

Supplementary Information

Supplementary Figs. 1–5

Reporting Summary

Supplementary Table

Supplementary Table 1

About this article

Cite this article

Galan, S., Machnik, N., Kruse, K. et al. CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction. Nat Genet 52, 1247–1255 (2020). https://doi.org/10.1038/s41588-020-00712-y

  • DOI: https://doi.org/10.1038/s41588-020-00712-y
