Abstract

Structural variants (SVs) can result in changes in gene expression due to abnormal chromatin folding and cause disease. However, the prediction of such effects remains a challenge. Here we present a polymer-physics-based approach (PRISMR) to model 3D chromatin folding and to predict enhancer–promoter contacts. PRISMR predicts higher-order chromatin structure from genome-wide chromosome conformation capture (Hi-C) data. Using the EPHA4 locus as a model, the effects of pathogenic SVs are predicted in silico and compared to Hi-C data generated from mouse limb buds and patient-derived fibroblasts. PRISMR deconvolves the folding complexity of the EPHA4 locus and identifies SV-induced ectopic contacts and alterations of 3D genome organization in homozygous or heterozygous states. We show that SVs can reconfigure topologically associating domains, thereby producing extensive rewiring of regulatory interactions and causing disease by gene misexpression. PRISMR can be used to predict interactions in silico, thereby providing a tool for analyzing the disease-causing potential of SVs.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Fraser, J., Williamson, I., Bickmore, W. A. & Dostie, J. An overview of genome organization and how we got there: from FISH to Hi-C. Microbiol. Mol. Biol. Rev. 79, 347–372 (2015).
  2. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
  3. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
  4. Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
  5. Phillips-Cremins, J. E. et al. Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell 153, 1281–1295 (2013).
  6. Fraser, J. et al. Hierarchical folding and reorganization of chromosomes are linked to transcriptional changes in cellular differentiation. Mol. Syst. Biol. 11, 852 (2015).
  7. Lupiáñez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).
  8. Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).
  9. Lupiáñez, D. G., Spielmann, M. & Mundlos, S. Breaking TADs: how alterations of chromatin domains result in disease. Trends Genet. 32, 225–237 (2016).
  10. Franke, M. et al. Formation of new chromatin domains determines pathogenicity of genomic duplications. Nature 538, 265–269 (2016).
  11. Duan, Z. et al. A three-dimensional model of the yeast genome. Nature 465, 363–367 (2010).
  12. Lesne, A., Riposo, J., Roger, P., Cournac, A. & Mozziconacci, J. 3D genome reconstruction from chromosomal contacts. Nat. Methods 11, 1141–1143 (2014).
  13. Serra, F. et al. Restraint-based three-dimensional modeling of genomes and genomic domains. FEBS Lett. 589(20 Pt A), 2987–2995 (2015).
  14. Barbieri, M. et al. Complexity of chromatin folding is captured by the strings and binders switch model. Proc. Natl. Acad. Sci. USA 109, 16173–16178 (2012).
  15. Bohn, M. & Heermann, D. W. Diffusion-driven looping provides a consistent framework for chromatin organization. PLoS One 5, e12218 (2010).
  16. Brackley, C. A., Taylor, S., Papantonis, A., Cook, P. R. & Marenduzzo, D. Nonspecific bridging-induced attraction drives clustering of DNA-binding proteins and genome organization. Proc. Natl. Acad. Sci. USA 110, E3605–E3611 (2013).
  17. Chiariello, A. M., Annunziatella, C., Bianco, S., Esposito, A. & Nicodemi, M. Polymer physics of chromosome large-scale 3D organisation. Sci. Rep. 6, 29775 (2016).
  18. Giorgetti, L. et al. Predictive polymer modeling reveals coupled fluctuations in chromosome conformation and transcription. Cell 157, 950–963 (2014).
  19. Nicodemi, M. & Prisco, A. Thermodynamic pathways to genome spatial organization in the cell nucleus. Biophys. J. 96, 2168–2177 (2009).
  20. Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl. Acad. Sci. USA 112, E6456–E6465 (2015).
  21. Scialdone, A., Cataudella, I., Barbieri, M., Prisco, A. & Nicodemi, M. Conformation regulation of the X chromosome inactivation center: a model. PLoS Comput. Biol. 7, e1002229 (2011).
  22. Fudenberg, G. et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 15, 2038–2049 (2016).
  23. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
  24. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  25. Guo, Y. et al. CRISPR inversion of CTCF sites alters genome topology and enhancer/promoter function. Cell 162, 900–910 (2015).
  26. Nora, E. P. et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944.e22 (2017).
  27. Gilissen, C. et al. Genome sequencing identifies major causes of severe intellectual disability. Nature 511, 344–347 (2014).
  28. Hehir-Kwa, J. Y. et al. A high-quality human reference panel reveals the complexity and distribution of genomic structural variants. Nat. Commun. 7, 12989 (2016).
  29. Newman, S., Hermetz, K. E., Weckselblatt, B. & Rudd, M. K. Next-generation sequencing of duplication CNVs reveals that most are tandem and some create fusion genes at breakpoints. Am. J. Hum. Genet. 96, 208–220 (2015).
  30. de Gennes, P. G. Scaling Concepts in Polymer Physics (Cornell Univ. Press, Ithaca, NY, 1979).
  31. Kirkpatrick, S., Gelatt, C. D. Jr & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
  32. Salamon, P., Sibani, P. & Frost, R. Facts, Conjectures, and Improvements for Simulated Annealing (SIAM, Philadelphia, 2002).
  33. Knight, P. A. & Ruiz, D. A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 1029–1047 (2013).
  34. Parisi, G. Statistical Field Theory (Westview Press, New York, 1998).
  35. Rosa, A. & Everaers, R. Structure and dynamics of interphase chromosomes. PLoS Comput. Biol. 4, e1000153 (2008).
  36. Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
  37. Kremer, K. & Grest, G. S. Dynamics of entangled linear polymer melts: A molecular-dynamics simulation. J. Chem. Phys. 92, 5057 (1990).
  38. Andrey, G. et al. Characterization of hundreds of regulatory landscapes in developing limbs reveals two regimes of chromatin folding. Genome Res. 27, 223–233 (2017).
  39. Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
  40. Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837 (2007).
  41. Nagy, K. N. J. in Advanced Protocols for Animal Transgenesis (eds. Pease, S. & Saunders, T. L.) 431–455 (Springer, Berlin, 2011).
  42. Artus, J. & Hadjantonakis, A. K. Generation of chimeras by aggregation of embryonic stem cells with diploid or tetraploid mouse embryos. Methods Mol. Biol. 693, 37–56 (2011).
  43. Hagège, H. et al. Quantitative analysis of chromosome conformation capture assays (3C-qPCR). Nat. Protoc. 2, 1722–1733 (2007).
  44. Wingett, S. et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 4, 1310 (2015).
  45. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  46. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).

Acknowledgements

We thank the sequencing core, transgenic unit, and animal facilities of the Max Planck Institute for Molecular Genetics for technical assistance. This work is supported by grants from the Deutsche Forschungsgemeinschaft (DFG) and the Max Planck Foundation (MPF) to S.M. and D.G.L.; by the Berlin Institute of Health (BIH) to S.M. and A.P.; and by CINECA ISCRA Grants HP10CYFPS5 and HP10CRTY8P, computer resources at INFN and Scope at the University of Naples (M.N.), and the Einstein BIH Fellowship Award to M.N.

Author information

Author notes

    • Darío G. Lupiáñez

    Present address: Epigenetics and Sex Development Group, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin-Buch, Germany

  1. These authors contributed equally: Simona Bianco, Darío G. Lupiáñez, Andrea M. Chiariello, Carlo Annunziatella.

Affiliations

  1. Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso di Monte Sant’Angelo, Naples, Italy

    Simona Bianco, Andrea M. Chiariello, Carlo Annunziatella & Mario Nicodemi

  2. Max Planck Institute for Molecular Genetics, RG Development and Disease, Berlin, Germany

    Darío G. Lupiáñez, Katerina Kraft, Guillaume Andrey & Stefan Mundlos

  3. Institute for Medical and Human Genetics, Charité – Universitätsmedizin Berlin, Berlin, Germany

    Darío G. Lupiáñez, Katerina Kraft & Stefan Mundlos

  4. Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité – Universitätsmedizin Berlin, Berlin, Germany

    Darío G. Lupiáñez & Stefan Mundlos

  5. Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany

    Robert Schöpflin & Martin Vingron

  6. Department Developmental Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany

    Lars Wittler

  7. Epigenetic Regulation and Chromatin Architecture Group, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin-Buch, Germany

    Ana Pombo

Contributions

M.N. and S.M. designed the project. S.M., D.G.L., and M.V. devised the cHi-C experiments. S.B., A.M.C., C.A., and M.N. developed the modeling part; S.B., A.M.C., and C.A. ran the computer simulations and performed their analyses. L.W. derived mouse homozygous lines and performed tetraploid aggregations. D.G.L., K.K., and G.A. performed cHi-C experiments, and R.S. performed bioinformatic analyses. M.N., S.M., D.G.L., S.B., A.P., A.M.C., and C.A. wrote the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Stefan Mundlos or Mario Nicodemi.

Integrated supplementary information

  1. Supplementary Figure 1 Determination of PRISMR parameters.

    (A) The decay of the minimum of the cost function, H0, in our SA algorithm as a function of the total allowed number of different types of binding domains, n, in the studied murine Epha4 locus (case shown λ=0). (B) The increase of the minimum of the cost function, H0, in our SA algorithm as a function of the cost to add an additional binding site, λ, in the studied locus (case shown n=21). (C) The decay of the minimum of the cost function, H0, in our SA algorithm as a function of the resolution of a single polymer bead, r, in the studied locus (case shown λ=1, n=21). Each point in the plots is the mean of 20 independent runs from different initial conditions. Error bars, i.e. standard deviation from the mean, are within the symbol size.

  2. Supplementary Figure 2 Comparison of cHi-C and model data in human fibroblast cells after correcting for genomic proximity effects.

    The model-derived contact matrix (bottom left) of the EPHA4 locus in human fibroblast cells has a high Pearson correlation with our cHi-C data (top left; r=0.93; n=1 with internal control comparing 4 different experiments, see Methods). The matrices in the right panel are obtained by subtracting the average interaction at each genomic distance. The patterns in the data are still captured by the model after the effects of genomic distance are subtracted: the Pearson correlation coefficient remains high (r’=0.69).

  3. Supplementary Figure 3 Epigenetic barcoding of the binding domains envisaged by PRISMR in the Epha4 locus of CH12-LX cells.

    (A) In the Epha4 locus of CH12-LX murine cells, PRISMR envisages n=21 different binding domains (colors). Their genomic positions and abundances are summarized here. (B) ENCODE chromatin features for the same DNA region are listed (ref. 24). (C) Matrix of the statistically significant Pearson correlation coefficients between the binding domains of panel (A) and the ENCODE signals of panel (B). The domains have been clustered according to the similarity of their epigenetic barcodes. Each experiment consists of two isogenic replicates.

  4. Supplementary Figure 4 Comparison of original model predictions with the model that also includes prior knowledge on CTCF and with a model with only CTCF.

    The figure shows the contact matrices from our cHi-C data in mouse E11.5 limb bud tissue (top-left; n=1 with internal control comparing 4 different experiments, see Methods) and from three different models. The bottom left panel reports the results derived by our PRISMR method: they have a Pearson, r, and distance-corrected Pearson correlation, r’, with cHi-C data equal to, respectively, r=0.94 and r’=0.60. The top right panel shows the data from a variant of PRISMR (the ‘PRISMR+CTCF’ model) that takes into account prior knowledge of the CTCF binding sites of the locus; its correlations with cHi-C data are r=0.95 and r’=0.52, comparable to the initial PRISMR model. Conversely, a model that only includes CTCF sites (bottom right) has a lower correlation with cHi-C data (r=0.89, r’=0.05).

  5. Supplementary Figure 5 PRISMR model based on mouse wild-type CH12-LX Hi-C data predicts the effects of homozygous structural variants on chromatin architecture.

    (A) Contact matrices are shown from PRISMR predictions for the listed mutations, derived from wild-type Hi-C in mouse CH12-LX cells (ref. 23). Experimental cHi-C data in mutant E11.5 limb bud tissue are shown below (n=1 with internal control comparing 4 different experiments, see Methods). The Epha4 genomic region with its genes is also shown schematically. Deleted/inverted regions are in grey. PRISMR predicts with high accuracy the 3D chromatin effects of the DelB/DelB deletion (Pearson correlation r=0.93, distance-corrected Pearson correlation r’=0.45), the DelBs/DelBs deletion (r=0.93, r’=0.46) and the InvF/InvF inversion (r=0.92, r’=0.49). (B) The difference matrices between wt and mutants from PRISMR predictions (top) and cHi-C data (bottom; n=1 with internal control comparing 4 different experiments, see Methods) also have similar patterns. Significant gains of interactions are displayed in red and losses in blue. Arrowheads indicate regions of ectopic interaction between the Epha4 TAD and other regions of the genome. See Methods for sample collection. (C) Virtual 4C plots with interactions from the viewpoint of the respective phenotype-causing genes, Pax3 and Wnt6.

  6. Supplementary Figure 6 PRISMR model based on wild-type mouse limbs E11.5 cHi-C data predicts the effects of homozygous structural variants on chromatin architecture.

    (A) Contact matrices are shown from PRISMR predictions for the listed mutations, derived from wild-type cHi-C data in E11.5 limb bud tissue. Experimental cHi-C data in mutant E11.5 limb bud tissue are shown below (n=1 with internal control comparing 4 different experiments, see Methods). The Epha4 genomic region with its genes is also shown schematically. Deleted/inverted regions are in grey. PRISMR predicts with high accuracy the 3D chromatin effects of the DelB/DelB deletion (Pearson correlation r=0.94, distance-corrected Pearson correlation r’=0.50), the DelBs/DelBs deletion (r=0.95, r’=0.55) and the InvF/InvF inversion (r=0.93, r’=0.52). (B) The difference matrices between wt and mutants from PRISMR predictions (top) and cHi-C data (bottom; n=1 with internal control comparing 4 different experiments, see Methods) also have similar patterns. Significant gains of interactions are displayed in red and losses in blue. Arrowheads indicate regions of ectopic interaction between the Epha4 TAD and other regions of the genome. (C) Virtual 4C plots with interactions from the viewpoint of the respective phenotype-causing genes, Pax3 and Wnt6.

  7. Supplementary Figure 7 PRISMR predicted 3D conformations of the Epha4 locus in murine CH12-LX cells.

    Top-left: the PRISMR model based on published Hi-C data in murine CH12-LX cells (n=2; ref. 23) recapitulates (Pearson correlation r=0.91, distance-corrected Pearson correlation r’=0.56) the experimental pairwise contact matrix (see also Fig. 2). The 3D conformation shown is a snapshot of the model of the locus with the relative positions of genes and regulators highlighted. Bottom-left: the PRISMR model inferred from the above wt data is informed with the DelB/DelB deletion and the effects on chromatin folding are predicted (see also Supplementary Fig. 5). The 3D conformation shown is a snapshot of the model bearing the DelB deletion. Top-right: Analogous results for the shorter DelBs/DelBs deletion. Bottom-right: Analogous results for the InvF/InvF inversion.

  8. Supplementary Figure 8 Regions of ectopic interaction in murine cell mutants.

    Zoom of the regions exhibiting significant ectopic interactions within the subtraction matrices from: (A) PRISMR+CTCF model of E11.5 limb tissue (Fig. 3b). (B) PRISMR model of E11.5 limb tissue (Supplementary Fig. 6b). (C) PRISMR model of CH12-LX cells (Supplementary Fig. 5b). (D) Experimental cHi-C data in E11.5 limb tissue (Fig. 3b). The distance-corrected correlation coefficient between model and experiment is reported in Supplementary Table 2 for all the cases shown.

  9. Supplementary Figure 9 Mouse mutations mapped on the mutated genome.

    cHi-C (top; n=1 with internal control comparing 4 different experiments, see Methods) and PRISMR+CTCF model (bottom) contact data in E11.5 limb tissue cells bearing the shown mutations are here mapped on the mutated genome. The model of the DelB deletion, including the Epha4 TAD boundary, predicts a fusion between the remaining Epha4 and Pax3 TADs, as seen experimentally. Ectopic contacts between the same regions are also predicted in the smaller DelBs deletion, which leaves the Epha4/Pax3 boundary intact. The InvF inversion is predicted, and experimentally confirmed, to rearrange the genomic content of the two adjacent TADs.

  10. Supplementary Figure 10 PRISMR predicted 3D conformations of the EPHA4 locus in human fibroblast cells.

    Top-left: the PRISMR model based on our cHi-C data (n=1 with internal control comparing 4 different experiments, see Methods) in healthy human fibroblast cells recapitulates (Pearson correlation r=0.93, distance-corrected Pearson correlation r’=0.69) the experimental pairwise contact matrix (see also Fig. 2). The 3D conformation shown is a snapshot of the model of the locus with the relative positions of genes and regulators highlighted. Top-right: The PRISMR model inferred from the above wild-type control data is informed with the DelB heterozygous deletion (DelB/+) and the effects on chromatin folding are predicted (see also Fig. 4). The 3D conformation shown is a model-derived snapshot of the mutated locus. Bottom-left: Analogous results for the DupF/+ duplication. Bottom-right: Analogous results for the DupP/+ duplication.

  11. Supplementary Figure 11 Regions of ectopic interaction in human fibroblast cells.

    Zoom of the regions exhibiting significant ectopic interactions within the subtraction matrices from the PRISMR model (top) and cHi-C data (bottom) in human fibroblast cells. The distance-corrected correlation coefficients between models and experiments are reported in Supplementary Table 2.

  12. Supplementary Figure 12 Comparison of mean-field-approximated and MD-derived model contact matrices.

    Mean-field-approximated contact matrices of the EPHA4 locus are very similar to those derived from full Molecular Dynamics (MD) in all four cell types considered: in CH12-LX (top left) we find a Pearson correlation r=0.95 and a distance-corrected Pearson correlation r’=0.84; in limbs E11.5 (top right) r=0.92, r’=0.77; in human IMR90 (bottom left) r=0.91, r’=0.83; and in human fibroblasts (bottom right) r=0.95, r’=0.74. Capture Hi-C experiments were performed as singletons with internal control comparing four different experiments (see Methods).

  13. Supplementary Figure 13 Convergence of our SA algorithm during a saw-tooth run.

    The convergence of the cost function during a single saw-tooth run of our SA algorithm to its asymptotic value is shown. The different visible steps in the plot correspond to the different SA temperatures sampled by the algorithm (from very high, initial region, to almost zero, final step). Importantly, the plot shows that a stable minimum is approached well within the time scales of our simulations. Each point in the plot is the mean of 30 independent runs from different initial conditions; error bars (standard deviation from the mean) are within the symbol size.

  14. Supplementary Figure 14 Distance-corrected Pearson correlation after removal of each single binding domain in the model of wild-type and mutant murine data.

    Each color (binding-domain type) of the Epha4 PRISMR model in mouse CH12-LX cells (Fig. 2b) is removed in turn, and the distance-corrected Pearson correlation of the corresponding “reduced” model with Hi-C data is evaluated. The calculations here are made within the mean-field approximation (see Supplementary Note). For that reason, the correlation values can differ slightly from the full MD results reported elsewhere in the manuscript. In particular, the figure shows the correlations: (A) between wild-type model contact data and murine CH12-LX Hi-C data (ref. 23); (B, C, D) between model contact data and cHi-C data in the different mutations considered.

  15. Supplementary Figure 15 Distance-corrected Pearson correlation after removal of each single binding domain in the model of human fibroblasts cHi-C data.

    (A) PRISMR identified n=24 different binding domains (colors) in the model of the EPHA4 locus in human fibroblasts. (B) Each color is removed in turn from the PRISMR model of panel A, and the corresponding distance-corrected Pearson correlation between the contact data of the “reduced” model and cHi-C data is reported. As in Supplementary Figure 14, the calculations reported here are made within the mean-field approximation (see Supplementary Note). For that reason, the correlation values can differ slightly from the full MD results reported elsewhere in the manuscript.

  16. Supplementary Figure 16 Comparison of the TADs in model and cHi-C contact data.

    The overlap between TAD boundaries called in our cHi-C data (top) and in our model contact data (bottom) is on average above 90%. The figure shows, as an example, the comparison of TADs between (A) human fibroblast and PRISMR model and (B) mouse limb tissue E11.5 and PRISMR+CTCF model.
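
Several of the captions above report both a Pearson correlation r and a distance-corrected Pearson correlation r’ between model and experimental contact matrices, where the correction subtracts the average interaction at each genomic distance (Supplementary Fig. 2). The following Python sketch illustrates one way such a pair of values could be computed. It is not the authors' code: the function names are hypothetical, the matrices are assumed to be square, symmetric and binned identically, and the choice to correlate only the upper triangle (excluding the main diagonal) is an assumption of this sketch.

```python
import numpy as np


def subtract_distance_average(matrix):
    """Subtract from every entry the mean contact value at its genomic distance.

    Assumes a square, symmetric contact matrix; the mean is taken along each
    upper diagonal (bins i and j with j - i = d) and applied to both triangles.
    """
    m = np.asarray(matrix, dtype=float)
    n = m.shape[0]
    corrected = m.copy()
    for d in range(n):
        rows = np.arange(n - d)
        cols = rows + d
        mean_d = m[rows, cols].mean()
        corrected[rows, cols] -= mean_d
        if d > 0:
            corrected[cols, rows] -= mean_d
    return corrected


def contact_correlations(model, data):
    """Return (r, r_prime): plain and distance-corrected Pearson correlations.

    Both are evaluated over the upper triangle without the main diagonal
    (an assumption of this sketch, not a detail taken from the source).
    """
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    if model.shape != data.shape or model.shape[0] != model.shape[1]:
        raise ValueError("model and data must be square matrices of equal shape")
    upper = np.triu_indices(model.shape[0], k=1)
    r = np.corrcoef(model[upper], data[upper])[0, 1]
    r_prime = np.corrcoef(
        subtract_distance_average(model)[upper],
        subtract_distance_average(data)[upper],
    )[0, 1]
    return float(r), float(r_prime)
```

Because contact frequency decays strongly with genomic distance in both model and data, r alone can be high even when locus-specific patterns are missed; r’ removes that shared trend, which is consistent with the lower r’ values reported in the captions (for example, the CTCF-only model with r=0.89 but r’=0.05).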

Supplementary information

  1. Supplementary Figures

    Supplementary Figures 1–16

  2. Reporting Summary

  3. Supplementary Notes and Tables

    Supplementary Notes 1–19 and Supplementary Tables 1–5

  4. Supplementary Data Set 1

    Simulation datasets

About this article

DOI

https://doi.org/10.1038/s41588-018-0098-8