Hi-C-constrained physical models of human chromosomes recover functionally-related properties of genome organization

Di Stefano, Marco; Paulsen, Jonas; Lien, Tonje G.; Hovig, Eivind; Micheletti, Cristian

doi:10.1038/srep35985

Download PDF

Article
Open access
Published: 27 October 2016

Hi-C-constrained physical models of human chromosomes recover functionally-related properties of genome organization

Marco Di Stefano¹^na1,
Jonas Paulsen²^na1,
Tonje G. Lien³^na1,
Eivind Hovig^4,5,6^na1 &
…
Cristian Micheletti¹^na1

Scientific Reports volume 6, Article number: 35985 (2016) Cite this article

6448 Accesses
53 Citations
102 Altmetric
Metrics details

Subjects

Abstract

Combining genome-wide structural models with phenomenological data is at the forefront of efforts to understand the organizational principles regulating the human genome. Here, we use chromosome-chromosome contact data as knowledge-based constraints for large-scale three-dimensional models of the human diploid genome. The resulting models remain minimally entangled and acquire several functional features that are observed in vivo and that were never used as input for the model. We find, for instance, that gene-rich, active regions are drawn towards the nuclear center, while gene poor and lamina associated domains are pushed to the periphery. These and other properties persist upon adding local contact constraints, suggesting their compatibility with non-local constraints for the genome organization. The results show that suitable combinations of data analysis and physical modelling can expose the unexpectedly rich functionally-related properties implicit in chromosome-chromosome contact data. Specific directions are suggested for further developments based on combining experimental data analysis and genomic structural modelling.

Reconstruct high-resolution 3D genome structures for diverse cell-types using FLAMINGO

Article Open access 12 May 2022

Hao Wang, Jiaxin Yang, … Jianrong Wang

Defining and detecting links in chromosomes

Article Open access 13 August 2019

Szymon Niewieczerzal, Wanda Niemyska & Joanna I. Sulkowska

Integrative genome modeling platform reveals essentiality of rare contact events in 3D genome organizations

Article Open access 11 July 2022

Lorenzo Boninsegna, Asli Yildirim, … Frank Alber

Introduction

The advent of experimental techniques to study the structural organization of the genome has opened new avenues for clarifying the functional implications of genome spatial arrangement. For instance, the organization of chromosomes in territories with limited intermingling was first demonstrated by fluorescence in-situ hybridization (FISH) experiments^1,2 and, next, rationalised in terms of memory-effects produced by the out-of-equilibrium mitotic → interphase decondensation^{3,4,5,6,7,8,9}. These effects are, in turn, essential for the subsequent chromosomal recondensation step of the cell cycle^5,9. More recently, chromosome conformation capture techniques have allowed for quantifying the contact propensity of pairs of chromosome regions, hence providing key clues for the hierarchical organization of chromosomes into domains with varying degree of compactness and gene activity^7,10,11,12.

Since their very first introduction¹⁰, conformational capture experiments have been complemented by efforts to build coarse-grained models of chromosomes^13,14,15. These modelling approaches have been used with a twofold purpose. On the one hand, general models for long and densely-packed polymers have been used to compare their contact propensities and those inferred from Hi-C data. These approaches are useful to understand the extent to which the Hi-C-probed genome organization depends on general, aspecific physical constraints ^{3,5,7,14,15,16,17,18,19,20,21,22}. On the other hand, Hi-C and other experimental measurements have been used as knowledge-based constraints to build specific, viable candidate three-dimensional representations of chromosomes^{10,14,23,24,25}. These models are valuable because they can expose the genomic structure-function interplay to a direct inspection and analysis, a feat that cannot be usually accomplished with the sole experimental data¹⁰.

Developing such models is difficult. In part, this is because it requires overcoming the limitations of the (currently unavoidable) dimensional reduction where a set of contact propensities is measured in place of the actual three-dimensional conformations, and still obtain the latter. But an additional and key difficulty is the structural heterogeneity of the chromosomal conformational ensemble that is probed experimentally. As for the simpler, but still challenging, problem of proteins with structurally-diverse substates^26,27, such conformational heterogeneity makes it impossible for using all phenomenological restraints to pin down a unique representative structure, and suitable methods must be devised to deal with the inherent heterogeneity.

Here, by building on previous modelling efforts^{10,14,23,24,25}, we tackle these open isssues and ask whether Hi-C data subject to a suitable statistical selection can be indeed be used as phenomenological constraints to obtain structural models of the complete human diploid genome that are viable, i.e. that possess correct functionally-related properties.

The key elements of our approach are two. First, we use advanced statistical tools to single out local and non-local cis-chromosome contacts that are significantly enriched with respect to the reference background of Hi-C data. Second, we employ steered molecular dynamics simulations to drive the formation of these constitutive contacts in a physical model of the human diploid genome, where chromosomes are coarse-grained at the 30 nm level. The viability of this general strategy is here explored for two distinct human cell lines: lung fibroblasts (IMR90) and embryonic stem cells (hESC).

Various functional aspects of the genome organization have been previously addressed with structural models, see ref. 22 for recent reviews. Some of these studies have addressed the architecture of specific, local functional domains^21,24,28, including the formation and plasticity of topologically associated domains (TADs)^{29,30,31,32,33,34}. Structural models were also used to explore general features of the structure-function relationship, such as the interplay of gene co-expression and co-localization in human chromosome 19³⁵ or of epigenetic states and genome folding in D. melanogaster²⁰.

Other studies have instead dealt with the challenge of modelling entire yeast^{23,36,37,38,39} or human chromosomes^25,40 consistently with available experimental data, particularly for the spatial proximity of chromosome loci. The two main challenges of these approaches are: the use of suitable data analysis strategies for inferring pairwise distances from the phenomenological data, such as Hi-C^{21,23,36,40,41,42}, and the optimal use of the distances as phenomenological constraints for three-dimensional models^{21,23,25,40,42}.

Our study addresses both issues and complements earlier efforts in several respects. For data analysis, we use a statistical test based on the zero-inflated negative binomial (ZiNB) distribution to identify significant entries (contacts) from Hi-C matrices of ref. 11 that are much enriched with respect to the expected (background) occurrence of contacts. Typically, this reference distribution is taken as the standard binomial one, which is parametrized based on genomic distance between loci, as well as various Hi-C technical biases^23,43,44,45. In other selection schemes the conditional expectancy has been used to parametrize a negative binomial model based on the interaction frequencies between the restriction fragments⁴⁶. Recently, Rao et al.¹² have taken advantage of ultra high resolution Hi-C data to identify focal peaks in the Hi-C heatmaps by local scanning.

The advantageous property of the ZiNB scheme is its capability to deal with the inevitable sparsity of Hi-C matrices. The distinctive feature of our structural modelling is, instead, the seamless combination of the following features: the modelling is applied to the entire diploid genome inside the nucleus, the coarse-graining level is uniformly set to match the physical properties of the 30 nm fiber and, finally, steered molecular dynamics simulations are used to promote the formation of a subset of the Hi-C contacts, only the significant ones, allowing the unconstrained regions of the chromosomes to organize only under the effect of aspecific physical constraints. The approach is also robust for the introduction of an independent set of constraints based on the high-resolution Hi-C measurements in ref. 12, which provide information about local interactions associated with the boundaries of TADs.

Using our approach, we found that the model chromosomes remain mostly free of topological entanglement and acquire various properties distinctive of the in vivo genome organization. In particular, we found gene-rich and gene-poor regions, lamina associated domains (LADs), loci enriched in histone modifications, and Giemsa bands to be preferentially localized in the expected nuclear space. To our knowledge, this study, which builds on and complements previous genome modelling efforts^22,23,36 is the first to engage in genome-wide physical modelling for two different human cell lines, based on Hi-C data from two different groups, and processed with two alternative statistical analyses. While this breadth ought to make the results interesting per se, the fact that several correct functional features are systematically recovered, makes the approach more relevant and useful for genome modelling. In fact, besides providing a concrete illustration of the genomic structure-function interplay, the results prompt the further development of coarse-grained models as an essential complement of experimental data analysis. Specific directions for such advancements are suggested.

Results

Significant pairwise constraints from Hi-C data

As input data for the knowledge-based three-dimensional (3D) modelling of human chromosomes, we used Hi-C measurements from lung fibroblasts (IMR90) and embryonic stem cells (hESC)¹¹. These data sets provide genome-wide 3D contact information between chromosome regions at 100 kilobases (kb) resolution. We focused on cis-chromosome Hi-C contacts, which, in contrast to trans-chromosome ones, show rich and robust pairing patterns⁷. The matrix of cis contacts is sparse as most of the a priori possible pairings have no associated reads, either because they are genuinely not in spatial proximity, or because their contacting probability is too low to be reliably detected for a given sequencing depth.

This data sparsity must be appropriately dealt with for pinpointing the statistically-significant cis-chromosome pairings that serve as knowledge-based constraints. To this end, we carried out a stringent statistical analysis using the zero-inflated negative binomial distribution (see Methods).

We accordingly singled out 16,409 and 14,928 significant pairings for IMR90 and hESC cells, respectively, using a 1% threshold for the false-discovery rate, see Supplementary Tables S1 and S2, and Supplementary Fig. S1. The number of significant contacts is comparable, and actually larger by a factor of 2, than those found by Rao et al. using different selection criteria and different Hi-C data (and that we shall later incorporate in our modelling as well). The significant pairings obtained with this culling procedure ought to correspond to a core of contacts that are likely present across the heterogenenous conformational repertoire populated by chromosomes. Therefore, this core does not include all contacts present in actual chromosome conformations, so that the restrained model structures are expectedly underconstrained with respect to the real system. Still, as we show later, these core contacts suffice to correctly pin down the larger scale functional features of chromosome organization.

For both cell lines the number of significant pairings correlates only weakly with chromosome length (p-value > 0.08 of non-parametric Kendall rank correlation), but correlates significantly with the number of genes in the chromosomes (p-value < 0.005), see Supplementary Fig. S2. Consistent with this observation, the highest linear density of significant contacts is found for the chromosomes 19 and X, where the gene density is high, while the lowest is found for chromosomes 13 and 18, where the gene density is low.

The observed correlation is not obvious a priori. In fact, because the cross-linking step in Hi-C experiments is not specific for gene-rich regions, the resulting contacts are expected to be unbiased in this respect. In addition, we used the normalization method of Imakaev et al.⁴⁷ to correct for various technical biases, including the possible difficulty of mapping reads on gene-poor chromosomes, whose sequence repetitiveness can be high. It is therefore plausible that the statistical selection criterion is capable of singling out those contact patterns that, being significantly enhanced across the probed cell population, are relevant for gene function.

Genome-wide models from spatial constraints

The statistically significant Hi-C pairings were used as target contacts for the model diploid system of human chromosomes. Following refs 5, 9 and 35, each chromosome was modelled as a semi-flexible chain of beads⁴⁸ with 30 nm diameter, corresponding to about 3 kb⁴⁹. Two copies of each autosome plus one of the X chromosome were packed at the nominal genome density inside a confining nuclear environment. For simplicity, the nuclear shape has been chosen to be spherical with a radius of 4,800 nm, neglecting the flattened ellipsoidal shape of fibroblast cells⁵⁰. The initial positions of the chromosomes were assigned in a stochastic way based on the phenomenological radial position propensities of ref. 50 (phenomenologically placed chromosomes). Steered molecular dynamics simulations were next used to promote the formation of contacts corresponding to the significant Hi-C pairings. Notice that the heterogeneity of the Hi-C sample should make it unfeasible to satisfy simultaneously all contacts corresponding to significant Hi-C pairings. Rather, the selected pairings ought to consist of incompatible subsets of feasible contacts.

The steering process was repeated independently 10 times for each considered cell line. To ensure the statistical independence, we considered one conformation per run, namely the snapshot taken at the end of the steering protocol, for all the following structural analysis. For simplicity and definiteness, we mostly focus on lung fibroblasts (IMR90) with phenomenologically placed chromosomes and point out the relevant analogies with the human embryonic stem cells (hESC) in specific contexts. These and other reference cases, such as pre-steering models and steered systems with random initial placements of the chromosomes, are detailed in Supplementary Figs S3 and S5–S16.

The data in Fig. 1A,B show that the target proximity constraints, despite being practically all unsatisfied before steering, are progressively established in significant proportions during the steering process. The result is consistent with previous findings³⁵ and proves a posteriori that a large fraction of the selected constraints can indeed be simultaneously established in a three-dimensional model without being hindered by the physical incompatibilities within the target contacts. The same properties hold also in the cases of hESC cells and of random initial placements of the chromosomes (Supplementary Fig. S3).

A typical arrangement of the steered chromosome conformation is shown in Fig. 2A. The accompanying tomographic cut (Fig. 2B) shows that chromosomes have a convex shape, as typically observed in FISH imaging^1,50. As discussed later, the limited trans- intermingling observed in chromosomes at the initial, decondensed states, is preserved during steering and is present in the optimized chromosome conformations.

We compared the post-steering distance matrices with the matrices of Hi-C reads. This comparison is meant as a further a posteriori assessment of the system compliance to follow the actual constraints and acquire a spatial organization compatible with the full Hi-C matrices. To do so we used the non-parametric Kendall association test between corresponding entries of the two matrices. The results are shown, for all chromosomes, as Supplementary Fig. S4, and indicate a systematic significant correlation. This is actually an anticorrelation because shorter spatial distances reflect in higher number of reads. To better capture the interplay of aspecific (short-range) and specific (longer-range) domain organization the Kendall correlation coefficient was computed over corresponding entries with genomic distance larger than a minimum threshold, that was varied from 1Mbp to half the chromosome length. The resulting correlation profiles in Supplementary Fig. S4 show that post-steering distance matrices typically maintain a significant anticorrelation with Hi-C upon increasing the genomic distance threshold. By contrast, the same correlation, but measured for the initial chromosome distance matrix degrades rapidly with increasing threshold. This is correct, because it reflects the increasing weight of longer-range specific interactions at the expense of the aspecific, shorter-range ones, of the initial state. Chromosome specific features are hence systematically reproduced only by the steered model chromosomes.

Local variability of chromosomal nuclear positioning

The nuclear position variability of the IMR90 optimised chromosomes is shown in Fig. 3, where the heatmap represents the standard deviations of the radial positioning of all chromosome portions. Chromosomes 19 and X which have the highest number of genes, have the lowest average position variability, while acrocentric chromosomes 13, 14, 15, 21 and telomeric regions of chromosomes 18, 21 and 22 have the largest one. At the same time, the observed position variations are plausible from the functional point of view, particularly regarding the increased motility found to occur for telomeric regions, compared to other, internal repeat regions in the human genome^51,52.

Nuclear positioning of functional regions

The steering optimization of the genome-wide model is based on two phenomenological inputs, the significant Hi-C contacts and the typical radial placement of chromosomes, which are not simply nor manifestly connected to functional aspects. Therefore, a relevant question is whether functionally-related properties can at all be recovered and exposed by the optimized chromosome conformations.

We accordingly considered the radial placement, before and after steering, of gene-rich and gene-poor regions, of lamina associated domains (LADs), of loci enriched in H3K4me3 (activating), H3K9me3 and H3K27me3 (repressive) histone modifications. We also investigated the preferential nuclear localization of Giemsa (G)-bands, which are the resulting coloring patterns of the low-resolution chromosomal staining technique Giemsa banding. Using this technique, heterochromatic, AT-rich and relatively gene-poor regions are depicted by darkly (positive) staining bands, while less condensed chromatin, tending to be GC-rich and more transcriptionally active, appears as light (negative) bands.

These all acquired the correct preferential positioning after steering. Specifically, chromosome regions that are rich in genes, associated with activating or repressing histone modifications or positive Giemsa staining bands occupy preferentially the nuclear center. Conversely, regions poor in genes or corresponding to positive Giemsa bands occupy preferentially the periphery. This holds for LADs too, whose simultaneous location at the nuclear periphery is not expected, given the highly dynamic association with the lamina in vivo⁵³. These properties are shown in Supplementary Figs S13–S16 for hESC, while for fibroblasts, where territorial radial positioning is known to be less definite⁵⁴, are shown in Fig. 4.

We found that: (i) prior to steering, genes are not preferentially near the nucleus center, and that (ii) for entirely random initial chromosome placements, the steered LADs locations are less peripheric. The significance of these differences are shown in Supplementary Figs S9–S12 and indicates that the observed functional properties emerge specifically after introducing the phenomenological constraints on the genome structural model.

The robustness of the functionally-related properties is implied by their consistency across the IMR90 and hESC cell lines. But it is best illustrated by the persistence of the same features upon adding an independent set of spatial constraints. In fact, after completing the steering with the significant IMR90 contacts from ref. 11, we added those selected in ref. 12 for the same cell line, but in different, and higher resolution in situ Hi-C experiments. These additional target contacts are fewer in number (8,040) than the first set and are more local. In fact, the median sequence separation of contacting pairs is 220 kb in the selected set of ref. 12 while it is equal to 46.8 Megabase (Mb) for our reference set. Within our top-down approach, where the constraints are used to introduce progressively detailed structural features on top of an initially generic chromosome model, it is natural to apply the more local constraints of ref. 12 after those from ref. 11. As it is shown in Fig. 5, the added set has the same compliance to steering as the reference one, and the formation of the new contacts does not compete with or disrupt the former. In fact, all previously-discussed functionally related features persist with the added constraints, see Supplementary Figs S17–S19. Interestingly, this compatibility suggests that organizational mechanisms at both local- and non-local levels, meaning sequence separations smaller than 220 kb or larger than 46.8 Mb, can simultaneously concur to the formation of large-scale genome topology and is consistent with current views of how the genome acquires the organization in local domains (TADs)⁵⁵.

In this regard, we have compared the macrodomain organization of the optimised chromosome 19 with that obtained by Kalhor et al.²⁵ based on an independent set of experimental proximity measurements. We chose chromosome 19, because it has the highest linear density of imposed target constraints (~15 constraints/Mb, see Supplementary Table S3), and all of them are simultaneously established in the optimally constrained models within 480 nm. For this comparison, we first used a clustering procedure (see Methods) to optimally partition the chromosome arms into the same number of domains established in ref. 25. Next, we compared the overlap of the two domain partitions and established its significance by comparing it with the overlap distribution of random subdivisions of the two arms in the same number of clusters. The comparison, visualised in Fig. 6A, shows that the domains of chromosome 19 optimised by the combined constraints of refs 11 and 12 have a significant overlap q = 63% (p-value = 0.026) with those of ref. 25. Interestingly, the overlap is appreciably smaller before the addition of the constraints from ref. 12, q = 55% (p − value = 0.065), see Supplementary Fig. S20.

The results give a very vivid example of how the addition of independent sets of phenomenological constraints is not only viable (meaning that they do not interfere negatively) but actually allows to better expose the genuine large-scale organizational feature of the genome. The total number of combined constraints used for chr19 is 1,100, equivalent to ~19 constraints per Mbp. This density of constraints with mixed local and non-local character thus appears to be a good target for robust chromosome modelling.

Furthermore, we compared the structural models, optimised with the phenomenological constraints based on Dixon et al. Hi-C data, with the multidimensional scaling (MDS) constraints provided by the analysis of Lesne et al.³⁶ of the very same Hi-C matrix. MDS is an approximate strategy for “inverting” of a well-populated matrix of pairwise-distances, and hence its applicability to genomic contexts required downsampling the input Hi-C matrix. We accordingly lowered the structural resolution of our models to the 100 kb level (by taking the average bead position of all beads in each 100 kb bin), and then used the similarly resampled Dixon et al. Hi-C matrix to obtain the MDS model of each chromosome via the Floyd-Warshall algorithm³⁶. We compared both steered and non-steered chromosomes with the corresponding MDS-optimized structures, using the root mean square deviation (RMSD) after Procrustes superimposition. As Fig. 6B shows, the steered structures show a much greater similarity to the MDS-optimized structures, indicating that global organizational patterns of individual chromosomes are similar. Furthermore, chromosomes with a denser set of constraints (e.g. chr19, chr17, chr16, chr7) are more similar in terms of RMSD, again indicating that the comparison is sound. We emphasize that MDS is a method for coarse-grained optimization of smaller structures (e.g. single chromosomes), while our modelling approach allow for the joint modelling of all chromosomes simultaneously, and can account for their structural heterogeneity.

Chromosome pre-mitotic recondensation

During the interphase → mitotis step of the cell cycle, the decondensed interphase chromosomes reconfigure in its characteristic rod-like shapes. This rearrangement, while certainly being assisted by topoisomerases, is also aided by the limited incidence of cis and trans topological constraints. This feature is, in turn, fostered by the out-of-equilibrium characteristics of the cell cycle^{3,4,5,6,7,9,14}. In fact, the observed chromosomal entanglement is significantly lower than for equivalent mixtures of long equilibrated polymers, which cannot reconfigure over biologically relevant time scales⁵.

We tested the reconfiguration compliance of the optimised chromosome configurations by switching off the phenomenological target constraints after steering and replacing them with alternative target pairings between loci at the regular sequence separation of 200 kb. These constraints were chosen ad hoc to promote the rearrangement into a linear succession of loops, analogously to the string-like (mitotic) chromosome models of refs 4 and 56.

The steered chromosomes were indeed able to reconfigure and establish most of the new constraints. Their compliance to the target rearrangement is, in fact, very similar to the compliance of the pre-steering conformations which, having been relaxed from the initial rod-like arrangement, are ideally primed to be reconfigured efficiently, see Fig. 7. The lack of significant topological barriers allows the condensing chromosomes to segregate neatly, see Fig. 8, in qualitative accord with FISH observations^1,50.

Discussion

We studied the genome organization of human lung fibroblast (IMR90) and embryonic stem cells (hESC) by combining advanced statistical analysis of Hi-C data and coarse-grained physical models of chromosomes. The models were specialised for each cell line with phenomenological pairwise contacts corresponding to statistically-significant entries of the cis-chromosome Hi-C heatmaps of ref. 11. Being shared by an appreciable fraction of the cells probed experimentally, these contacts, despite not covering all contacts in actual chromosome conformations, ought to be simultaneously compatible, and physically viable. Their formation in the otherwise general models was promoted with steered molecular dynamics simulations on 10 independent replicates per cell line.

With the combined data-analysis and modelling strategy, we studied whether relevant aspects of the genomic structure-function relationship could be retrieved from Hi-C data. For this genome-wide study, we use general chromosomes models that are discretised at the 30 nm level. This fine intrinsic granularity sets a lower bound for the optimised chromosome structure resolution. The latter, in fact, depends on the abundance and type of phenomenological pairwise contacts that are used as phenomenological constraints. The present approach, therefore, aptly complements previous efforts based on larger-scale models incorporating observations from different phenomenological sources (tethered conformation capture techniques²⁵).

We found that, by solely promoting the formation of statistically-significant Hi-C contacts, the model chromosomes, despite being underestrained compared to actual chromosomes, still acquire a number of properties that are distinctive of the in vivo organization. These properties cannot be simply inferred from the direct analysis of Hi-C maps only, a fact that stresses and reinforces the early intuition that three-dimensional models can best expose properties that are otherwise encoded only indirectly in pairwise contact matrices¹⁰.

Specifically, in the optimised chromosome models gene-rich regions and activating or repressing histone modifications preferentially occupy the central space of the nucleus. By contrast, gene-poor ones and lamina associated domains occupy the nuclear periphery. All these features are in accord with experimental observations and with the functionally-related implications of the radial positioning of these regions. The localization of gene-rich regions in the nuclear center can, in fact, be an important prerequisite to organize them in clusters of proximal loci (transcriptional foci) providing an efficient means for their co-expression and co-regulation⁵⁷. Further, a likely explanation for these observations is that the gene-rich, centrally localized chromatin is most actively regulated via histone modifications, while the peripheral regions of the genome, which are less accessible, are preferentially constituted by repressed chromatin. In support of this view, it was recently proposed that lamina-associated chromatin compartments are depleted of most other histone modification signatures¹².

Furthermore, chromosomal regions associated with positive or negative Giemsa bands tend to occupy the nuclear periphery or centre, respectively. This result is again in accord with previous findings⁵⁴. It may also be correlated with the preferential placement of gene-rich and gene-poor regions as suggested in ref. 58. In fact, GC content in mammals, which is also reflected in the Giemsa banding patterns, is correlated with several genomic features that are potentially relevant from a functional viewpoint, including gene density, transposable element distribution, methylation levels, recombination rate, and expression levels. Thus, correlation studies will often tend to observe these together⁵⁹. It is however, also possible that Hi-C experimental uncertainties contribute in part to the observed effect^60,61.

The robustness of the structure-function relationship recovered by using the IMR90 constraints from the data of ref. 11, was tested by adding a further set of phenomenological constraints. These corresponded to the significant contacts selected in ref. 12 from high-resolution in situ Hi-C experiments on the same cell line. The concomitant steering with the two sets of constraints did not ruin the previously-established contacts and, in fact, acquired a sizeable proportion of the added ones. These facts indicate that the two sets are compatible, arguably because the significant contacts of ref. 12 are more local than those selected from the experimental data of ref. 11 with the different statistical methods described in the Methods section. As a result, the two sets complement each other for pinning the large- and small-scale structural features of the doubly-steered chromosome conformations. In fact, these not only maintain the correct preferential radial positioning of functional regions but further acquire a more specific organization in macro-domains that significantly overlaps with that recently identified by ref. 25.

Finally, we characterized the compliance of the optimized, steered chromosomes to reconfigure in a dense linear conformation. This test was aimed at mimicking the interphase → mitotic rearrangements that occur during cell cycle, also aided by topoisomerases, without encountering significant topological hindrance. The optimised chromosomes showed excellent compliance towards developing a dense linear organization. This confirms that chromosome entanglement in the optimised system is minimal, and hence realistic.

To summarise, a consistent accord with phenomenological observations^54,58 was found for all considered properties of the optimised model chromosome configurations, from the preferred radial positioning of several types of functionally-relevant regions to the capability of chromosomes to recondense and segregate without topological hindrance. It is notable that these features emerge by using Hi-C data¹¹ and radial placement⁵⁰ as the sole source of constraints for the general, physics-based chromosome models. This highlights the significant extent to which functionally-related aspects of the genomic organization principles can be extracted from experimental structural data with the aid of suitable data analysis and chromosome modelling. As a further proof of that, we recall that none of the preferential positioning properties of the monitored functional loci emerge without imposing any phenomenological constraints.

We expect that the general approach followed in this study could be profitably extended and transferred to other systems, so to incorporate additional knowledge-based information and to capture more detailed aspects of the spatial organizational principles of eukaryotic nuclei. These will be even more relevant when missing genomic repeat structures eventually become incorporated into analysis. Furthermore, we in particular envisage that extending considerations from cis- to trans-chromosome contacts may be important towards pinning down more precisely the relative positioning of chromosomes, and also their absolute position in the nuclear environment.

Methods

Significant pairwise constraints from Hi-C data

The phenomenological constraints were based on two sources: primarily the raw data of ref. 11 for human lung fibroblasts (IMR90) and embryonic stem cells (hESC) and the list of significant contacts provided by Rao et al. based on their high resolution experiments on IMR90¹².

The significant IMR90 and hESC pairings from the data of ref. 11 were singled out as follows. As input data, x_ij, we took the number of Hi-C reads between two chromosome regions, i and j, discretized at the 100 kb resolution (comparable to the coarse-graining level of the chromosome model) and corrected for technical bias with the method of Imakaev et al.⁴⁷. The distribution of x values at fixed genomic distance, δ ≡ |i − j| was modelled using the discrete zero-inflated negative binomial mixture distribution (ZiNB)⁶². Specifically, the distribution of all reads with a given δ was fitted using the following probability distribution function:

In the above expressions, p is the probability of not having a 3D contact, θ captures the extra variance of the data with respect to a Poisson distribution, and π is the probability of observing additional zeros in the data set. These reflect the intrinsic sparsity of Hi-C data sets, which cannot be accounted for by the sole negative binomial distribution. The fitting procedures were carried out with the pscl package in R⁶³.

Based on this ZiNB model distribution, the p-value for each x_ij, that is the probability of observing a contact frequency equal to or more extreme than x_ij, is given by:

where , , and are the (δ-dependent) best-fit parameters. We correct for multiple testing at fixed δ by selecting significant interactions at 1% false-discovery rate of the Benjamini-Hochberg method⁶⁴. The criterion yields a total of 16,409 and 14,928 interactions (respectively 0.07% and 0.06% of all possible cis-chromosome pairs between 100 kb regions), for lung fibroblasts (IMR90) and for embryonic stem cells (hESC) data sets respectively. The significant contacts for IMR90 are listed in Supplementary Table S1, see Supplementary Fig. S1 for some of their graphical representations, and those of hESC are provided in Supplementary Table S2.

We note that the multiple testing correction assumes that Hi-C maps entries are uncorrelated. Even accounting for the medium resolution of the data sets, this assumption can hold only approximately. Appropriate sampling strategies have been suggested for dealing with such correlations in ref. 12. While these strategies are not primed to be used in conjunction with our analysis, the viability of our method is shown a posteriori. In fact, a similar compliance to steering is observed for the contacts selected using the data of ref. 11 and our statistical method, and for those selected using the data and the statistical analysis of ref. 12. Furthermore, we emphasize that accounting for correlations during multiple testing correction can only result in a less conservative correction procedure, and will not reduce the number of false positives. Thus, in this analysis, the effect would be minor.

Genome-wide modelling and spatial constraints

The statistically significant Hi-C contacts were enforced as spatial constraints for a genome-wide modelling of chromosome organization in lung fibroblasts (IMR90) and embryonic stem cells (hESC) nuclei.

The model chromosomes are packed at the typical nuclear density of 0.012 bp/nm³ inside a confining nucleus. For simplicity, the nuclear shape has been chosen to be spherical with a radius of 4,800 nm, neglecting the flattened ellipsoidal shape of fibroblast cells⁵⁰. Each chromosome is treated as a chain of beads⁴⁸ with diameter σ = 30 nm equal to the nominal chromatin fiber thickness, and hence corresponding to a stretch of 3.03 kb^5,9,35. The chain bending rigidity is chosen so that the model chromatin fiber has the correct persistence length (150 nm)⁵.

The chromosome lengths, in number of beads, are given in Supplementary Table S3 and range from 15,873 to 82,269 (for chromosome 21 and 1, respectively). The total number of beads in the system is 1,952,709, resulting from the presence of two copies for each autosome and a single copy of the sexual chromosome X. The latter is present in a single copy to account for the absent (male) or inactivated (female) copy, while the male chromosome Y is not considered for its small size. Each chromosome is initially prepared in a rod-like structure resulting from stacked rosette patterns, as in ref. 5.

The rod-like chromosomes are initially positioned in the nucleus with two different protocols: random and phenomenological, that is matching the preferential chromosome radial positioning reported in ref. 50. In the random scheme, the chromosomes are consecutively positioned, from the longest to the shortest, by placing their midpoint randomly inside the nucleus and with a random axis orientation. If trans steric clashes arise at any stage, the placement procedure is repeated. The phenomenological scheme is analogous to the random one, except that chromosome midpoints are placed within one of six discretised radial shells, so as to match as close as possible (best positioning out of 10,000 trials) to the experimental average distance from the nuclear center reported in ref. 50. Chromosome protrusions beyond the nuclear sphere are next eliminated by briefly evolving the system with a stochastic (Langevin) molecular dynamics simulation under the action of a radial compressive force, until all beads are brought inside the spherical nuclear boundary, which is treated as impenetrable for all simulations. The Langevin dynamics is integrated numerically with the LAMMPS simulation package⁶⁵, with default parameters and the duration of this initial compressive adjustment phase is set equal to 1,200 τ_LJ, where τ_LJ is the characteristic Lennard-Jones simulation time (see Supplementary Methods).

The chromosomes are next evolved with the stochastic dynamics for a timespan of 120,000 τ_LJ (corresponding to 7 hours in realtime using the time mapping in ref. 5) during which centromeres are compactified by the attractive pairwise interaction of their constitutive beads while chromosome arms relax from the initial rod-like and expand inside the nucleus, adapting to its shape (see Supplementary Methods). After this relaxation phase, the system dynamics is steered to promote the formation of target cis-chromosome pairings corresponding to the statistically significant Hi-C contacts. As in ref. 35, the steering involves setting harmonic constraints between the centers of mass of paired target regions, each spanning 33 beads (equivalent to the 100 kb data resolution). To minimize the out-of-equilibrium driving of the system, the strength of the spring constant is progressively ramped up during the steering phase, which lasts for a maximum of 6,000 τ_LJ, starting from suitably weak initial values of the spring constants. The latter are set based on the sequence separation of the target pair so as to just counteract their entropic recoil (see Supplementary Methods). The same steering procedure has been applied to the optimal conformations (only for IMR90 cell line) to enforce the contacts of ref. 12 for a maximum of 600 τ_LJ. All the steered simulations have been performed using the PLUMED⁶⁶ package for LAMMPS.

Both the system relaxation and the steering procedure are replicated independently 10 times per each of the 4 considered setups, in order to assess the average properties of the models. The four setups are given by the combination of the IMR90 or hESC cell lines and the phenomenological or random chromosome positioning. To ensure the statistical independence, we used one conformation per run, namely the snapshot taken at the end of the steering protocol, for the analysis of the structurally and functionally-oriented properties of the nuclear chromosome organization.

The cumulative computer time of the 40 simulations involved 150,000 single CPU hours on 32 processors for the pre-steering runs, and 15,000 single CPU hours on a single processor for the steered ones.

Nuclear positioning of functionally-related loci

The steered genome-wide organization was profiled for the preferred positioning of various functionally-related loci: genes, lamina associated domains (LADs), H3K4me3 (activating), H3K9me3 and H3K27me3 (repressive) histone modifications, and Giemsa staining bands. The genomic location of genes and bands were obtained from the UCSC Table Browser⁶⁷, those of LADs from ref. 68, and histone modification data were obtained from GEO accession number GSE16256. These regions were mapped on the model chromosomes and their preferential radial positioning, before and after steering, was characterized by subdividing the nucleus in 15 radial shells of equal thickness (~320 nm), and computing for each shell the enrichment in beads associated to a given functional region.

Spatial macrodomains

The macrodomains of chromosome 19, which has the highest density of imposed constraints (~15 constraints/Mb, see Supplementary Table S3), were obtained from a spatial clustering analysis and then compared with the “block” partitions established in ref. 25 based on tethered conformational capture techniques. The clustering consisted of a sequence-continuous K-medoids partitioning of chromosome arms, discretised at the 100kb-long level, corresponding to 33 beads. The entries of pairwise dissimilarity matrices, Δ, were based on the average distances, 〈d〉, of the 100kb-long segments, the average being taken over the final conformations optimized with the constraints of refs 11,12. Specifically, for two segments i and j, the corresponding dissimilarity entry, Δ_ij was set equal to 〈d_ij〉 when the latter was below the 750 nm. The cutoff distance of 1,000 nm was used for more distance pairs. The two chromosome arms were separately subdivided in the same number of domains of ref. 25 and the significance of the resulting correspondence, or overlap, of the sequentially numbered domains was obtained by comparison against 1,000 random sequence-continuous subdivisions of the arms in the same number of domains.

The chromosome pre-mitotic recondensation

To reconfigure the model chromosomes to a linear (mitotic-like) state we switched off the target harmonic constraints and replaced them with couplings between pairs of loci at the regular sequence separation of 200 kb. These constraints promote the chromosome rearrangement into a linear succession of loops analogous to the mitotic chromosome models of refs 4,56. During this simulation, the parameters of all the harmonic constraints are treated on equal footing. The spring constants are maintained equal and unvaried, while the equilibrium distances are decreased in steps of 30 nm, from 200 nm (the maximum extension of a 200 kb chromatin model strand) to 30 nm (the size of a bead), every 0.6 τ_LJ. At the final nominal equilibrium distance, the simulations are extended up to 300 τ_LJ. Besides applying the recondesation procedure to the 10 steered replicates of the lung fibroblast (IMR90), we also apply it to the relaxed, pre-steering chromosomes.

Additional Information

How to cite this article: Di Stefano, M. et al. Hi-C-constrained physical models of human chromosomes recover functionally-related properties of genome organization. Sci. Rep. 6, 35985; doi: 10.1038/srep35985 (2016).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Cremer, T. & Cremer, C. Chromosome territories, nuclear architecture and gene regulation in mammalian cells. Nat. Rev. Genet. 2, 292−301 (2001).
CAS PubMed Google Scholar
Branco, M. R. & Pombo, A. Intermingling of chromosome territories in interphase suggests role in translocations and transcription-dependent associations. PLoS Biol. 4, e138 (2006).
PubMed PubMed Central Google Scholar
Grosberg, A., Rabin, Y., Havlin, S. & Neer, A. Crumpled globule model of the three-dimensional structure of DNA. Europhys. Lett., 23, 373 (1993).
ADS CAS Google Scholar
Sikorav, J. L. & Jannink, G. Kinetics of chromosome condensation in the presence of topoisomerases: a phantom chain model. Biophys. J. 66, 827–837 (1994).
ADS CAS PubMed PubMed Central Google Scholar
Rosa, A. & Everaers, R. Structure and dynamics of interphase chromosomes. PLoS Comput. Biol. 4, e1000153 (2008).
ADS MathSciNet PubMed PubMed Central Google Scholar
Vettorel, T., Grosberg, A. Y. & Kremer, K. Statistics of polymer rings in the melt: a numerical simulation study. Phys. Biol. 6 025013 (2009).
ADS PubMed Google Scholar
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
ADS CAS PubMed PubMed Central Google Scholar
Dorier, J. & Stasiak, A. Topological origins of chromosomal territories. Nucleic Acids Res. 37, 6316–6322 (2009).
CAS PubMed PubMed Central Google Scholar
Rosa, A., Becker, N. B. & Everaers, R. Looping probabilities in model interphase chromosomes. Biophys. J. 98, 2410–2419 (2010).
ADS CAS PubMed PubMed Central Google Scholar
Dekker, J., Rippe, K., Dekker, M. & Kleckner, N. Capturing Chromosome Conformation. Science, 295, 1306–1311 (2002).
ADS CAS PubMed Google Scholar
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485, 376–380 (2012).
ADS CAS PubMed PubMed Central Google Scholar
Rao, S. S. P. et al. A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell, 159, 1665–1680 (2014).
CAS PubMed PubMed Central Google Scholar
Langowski, J. Chromosome conformation by crosslinking: polymer physics matters. Nucleus, 1, 37–39 (2010).
PubMed PubMed Central Google Scholar
Marti-Renom, M. A. & Mirny, L. A. Bridging the Resolution Gap in Structural Modeling of 3D Genome Organization. PLoS Comput. Biol. 7, e1002125 (2011).
ADS CAS PubMed PubMed Central Google Scholar
Nicodemi, M. & Pombo, A. Models of chromosome structure. Curr. Opin. Cell Biol. 28, 90–95 (2014).
CAS PubMed Google Scholar
Mateos-Langerak, J. et al. Spatially confined folding of chromatin in the interphase nucleus. Proc. Natl. Acad. Sci. USA 106, 3812–3817 (2009).
ADS CAS PubMed PubMed Central Google Scholar
Yaffe, E. & Tanay, A. Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat. Gen. 43, 1059–1065 (2011).
CAS Google Scholar
Sexton, T. et al. Three-Dimensional Folding and Functional Organization Principles of the Drosophila Genome. Cell, 148, 458–472 (2012).
CAS PubMed Google Scholar
Barbieri, M. et al. Complexity of chromatin folding is captured by the strings and binders switch model. Proc. Natl. Acad. Sci. USA 109, 16173–16178 (2012).
ADS CAS PubMed PubMed Central Google Scholar
Jost, D., Carrivain, P., Cavalli, G. & Vaillant, C. Modeling epigenome folding: formation and dynamics of topologically associated chromatin domains. Nucleic Acids Res. 42, 9553–9561 (2014).
CAS PubMed PubMed Central Google Scholar
Giorgetti, L. et al. Predictive polymer modeling reveals coupled fluctuations in chromosome conformation and transcription. Cell, 157, 950–963 (2014).
CAS PubMed PubMed Central Google Scholar
Serra, F. et al. Restraint-based three-dimensional modeling of genomes and genomic domains. FEBS lett. 589, 2987–2995 (2015).
CAS PubMed Google Scholar
Duan, Z. et al. A three-dimensional model of the yeast genome. Nature, 465, 363–367 (2010).
ADS CAS PubMed PubMed Central Google Scholar
Baù, D. & Marti-Renom, M. A. Structure determination of genomic domains by satisfaction of spatial restraints. Chromosome Res., 19, 25–35 (2011).
PubMed Google Scholar
Kalhor, R., Tjong, H., Jayathilaka, N., Alber, F. & Chen, L. Genome architectures revealed by tethered chromosome conformation capture and population-based modeling. Nat. Biotech. 30, 90–98 (2012).
CAS Google Scholar
Dedmon, M. M., Lindorff-Larsen, K., Christodoulou, J., Vendruscolo, M. & Dobson, C. M. Mapping Long-Range Interactions in α-Synuclein using Spin-Label NMR and Ensemble Molecular Dynamics Simulations. J. Am. Chem. Soc. 127, 476–477 (2005).
CAS PubMed Google Scholar
Camilloni, C. & Vendruscolo, M. Statistical Mechanics of the Denatured State of a Protein Using Replica-Averaged Metadynamics. J. Am. Chem. Soc. 136, 8982–8991 (2014).
CAS PubMed Google Scholar
Scialdone, A., Cataudella, I., Barbieri, M., Prisco, A. & Nicodemi, M. Conformation regulation of the X chromosome inactivation center: a model. PLoS Comput. Biol. 7, e1002229 (2011).
ADS CAS PubMed PubMed Central Google Scholar
Le Dily, F. et al. Distinct structural transitions of chromatin topological domains correlate with coordinated hormone-induced gene regulation. Genes Dev. 28, 2151–2162 (2014).
CAS PubMed PubMed Central Google Scholar
Benedetti, F., Dorier, J., Burnier, Y. & Stasiak, A. Models that include supercoiling of topological domains reproduce several known features of interphase chromosomes. Nucleic Acids Res. 42, 2848–2855 (2014).
CAS PubMed Google Scholar
Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl. Acad. Sci. USA, 112 E6456–E6465 (2015).
CAS PubMed PubMed Central Google Scholar
Fudenberg, G. et al. Formation of Chromosomal Domains by Loop Extrusion. Cell Rep. 15, 1–12 (2016).
Google Scholar
Chiariello, A. M., Annunziatella, C., Bianco, S., Esposito, A. & Nicodemi, M. Polymer physics of chromosome large-scale 3D organisation. Sci. Rep. 6 (2016).
Tiana, G. et al. Structural fluctuations of the chromatin fiber within topologically associating domains. Biophys. J. 110, 1234–1245 (2016).
ADS CAS PubMed PubMed Central Google Scholar
Di Stefano, M., Rosa, A., Belcastro, V., Di Bernardo, D. & Micheletti, C. Colocalization of coregulated genes: a steered molecular dynamics study of human chromosome 19. PLoS Comput. Biol. 9, e1003019 (2013).
ADS CAS PubMed PubMed Central Google Scholar
Lesne, A., Riposo, J., Roger, P., Cournac, A. & Mozziconacci, J. 3D genome reconstruction from chromosomal contacts. Nat. Methods, 11, 1141–1143 (2014).
CAS PubMed Google Scholar
Wong, H. et al. A Predictive Computational Model of the Dynamic 3D Interphase Yeast Nucleus. Curr. Biol, 22, 1881–1890 (2012).
CAS PubMed Google Scholar
Tjong, H., Gong, K., Chen, L. & Alber, F. Physical tethering and volume exclusion determine higher order genome organization in budding yeast. Genome Res., 22, 1295–1305 (2012).
CAS PubMed PubMed Central Google Scholar
Gong, K., Tjong, H., Zhou, X. J. & Alber, F. Comparative 3D Genome Structure Analysis of the Fission and the Budding Yeast. PLoS One, 10, e0119672 (2014).
Google Scholar
Zhang, B. & Wolynes, P. G. Topology, structures, and energy landscapes of human chromosomes. Proc. Natl. Acad. Sci. USA 112, 6062–6067 (2015).
ADS CAS PubMed PubMed Central Google Scholar
Meluzzi, D. & Arya, G. Recovering ensembles of chromatin conformations from contact probabilities. Nucleic Acids Res., 41, 63–75 (2013).
CAS PubMed Google Scholar
Serra, F., Baù, D., Filion, G. & Marti-Renom, M. A. Structural features of the fly chromatin colors revealed by automatic three-dimensional modeling. bioRxiv, doi: 10.1101/036764 (2016).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell, 38, 576–589 (2010).
CAS PubMed PubMed Central Google Scholar
Ay, F., Bailey, T. L. & Noble, W. S. Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts. Genome Res. 24, 999–1011 (2014).
CAS PubMed PubMed Central Google Scholar
Mifsud, B. et al. GOTHiC, a simple probabilistic model to resolve complex biases and to identify real interactions in Hi-C data. bioRxiv, doi: /10.1101/023317 (2015).
Jin, F. et al. A high-resolution map of the three-dimensional chromatin interactome in human cells. Nature, 503, 290–294 (2013).
ADS CAS PubMed PubMed Central Google Scholar
Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods, 9, 999–1003 (2012).
CAS PubMed PubMed Central Google Scholar
Kremer, K. & Grest, G. S. Dynamics of entangled linear polymer melts: A molecular-dynamics simulation. J. Chem. Phys. 92, 5057–5086 (1990).
ADS CAS Google Scholar
Finch, J. T. & Klug, A. Solenoidal model for superstructure in chromatin. Proc. Natl. Acad. Sci. USA 73, 1897–1901 (1976).
ADS CAS PubMed PubMed Central Google Scholar
Bolzer, A. et al. Three-dimensional maps of all chromosomes in human male fibroblast nuclei and prometaphase rosettes. PLoS Biol., 3, e157 (2005).
PubMed PubMed Central Google Scholar
Wang, X. et al. Rapid telomere motions in live human cells analyzed by highly time-resolved microscopy. Epigenet. Chromatin, 1, 4 (2008).
Google Scholar
Robin, J. D. et al. Telomere position effect: regulation of gene expression with progressive telomere shortening over long distances. Genes Dev. 28, 2464–2476 (2014).
PubMed PubMed Central Google Scholar
Kind, J. et al. Single-cell dynamics of genome-nuclear lamina interactions. Cell, 153, 178–192 (2013).
CAS PubMed Google Scholar
Federico, C., Cantarella, C. D., Di Mare, P., Tosi, S. & Saccone, S. The radial arrangement of the human chromosome 7 in the lymphocyte cell nucleus is associated with chromosomal band gene density. Chromosoma, 117, 399–410 (2008).
CAS PubMed Google Scholar
Ciabrelli, F. & Cavalli, G. Chromatin-driven behavior of topologically associating domains. J. Mol. Biol. 427, 608–625 (2015).
CAS PubMed Google Scholar
Naumova, N. et al. Organization of the mitotic chromosome. Science, 342, 948–953 (2013).
ADS CAS PubMed PubMed Central Google Scholar
Takizawa, T., Meaburn, K. J. & Misteli, T. The meaning of gene positioning. Cell, 135, 9–13 (2008).
CAS PubMed PubMed Central Google Scholar
Bickmore, W. A. & van Steensel, B. Genome architecture: domain organization of interphase chromosomes. Cell, 152, 1270–1284 (2013).
CAS PubMed Google Scholar
Romiguier, J., Ranwez, V., Douzery, E. J. P. & Galtier, N. Contrasting GC-content dynamics across 33 mammalian genomes: relationship with life-history traits and chromosome sizes. Genome Res. 20, 1001–1009 (2010).
CAS PubMed PubMed Central Google Scholar
Gavrilov, A., Razin, S. V. & Cavalli, G. In vivo formaldehyde cross-linking: it is time for black box analysis. Briefings Funct. Genomics, elu037 (2014).
Cheung, M., Down, T. A., Latorre, I. & Ahringer, J. Systematic bias in high-throughput sequencing data and its correction by BEADS. Nucleic Acids Res., 39, e103 (2011).
CAS PubMed PubMed Central Google Scholar
Yau, K. K. W., Wang, K. & Lee, A. H. Zero-Inflated Negative Binomial Mixed Regression Modeling of Over-Dispersed Count Data with Extra Zeros. Biom. J. 45, 437–452 (2003).
MathSciNet Google Scholar
Zeileis, A., Kleiber, C. & Jackman, S. Regression Models for Count Data in R. J. Stat. Softw. 27, 1–25 (2008).
Google Scholar
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B, 57, 289–300 (1995).
MathSciNet MATH Google Scholar
Plimpton, S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. J. Comp. Phys. 117, 1–19 (1995).
ADS CAS MATH Google Scholar
Bonomi, M. et al. PLUMED: a portable plugin for free-energy calculations with molecular dynamics. Comp. Phys. Comm. 180, 1961–1972 (2009).
ADS CAS Google Scholar
Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).
CAS PubMed PubMed Central Google Scholar
Guelen, L. et al. Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature, 453, 948–951 (2008).
ADS CAS PubMed Google Scholar

Download references

Acknowledgements

We are grateful to Marc A. Marti-Renom, Davide Baù, and Angelo Rosa for useful discussions. This work was supported by The Norwegian Cancer Society [grant 71220-PR-2006-0433]; the Research Council of Norway; and the Italian Ministry of Education [grant PRIN 2010HXAW77].

Author information

Di Stefano Marco and Paulsen Jonas contributed equally to this work.

Authors and Affiliations

SISSA, International School for Advanced Studies, Trieste, I-34136, Italy
Marco Di Stefano & Cristian Micheletti
Institute of Basic Medical Sciences, University of Oslo, Oslo, 0317, Norway
Jonas Paulsen
Department of Mathematics, University of Oslo, Oslo, 0316, Norway
Tonje G. Lien
Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, 0310, Norway
Eivind Hovig
Department of Informatics, University of Oslo, Oslo, 0316, Norway
Eivind Hovig
Institute of Cancer Genetics and Informatics, Oslo, 0310, Norway
Eivind Hovig

Authors

Marco Di Stefano
View author publications
You can also search for this author in PubMed Google Scholar
Jonas Paulsen
View author publications
You can also search for this author in PubMed Google Scholar
Tonje G. Lien
View author publications
You can also search for this author in PubMed Google Scholar
Eivind Hovig
View author publications
You can also search for this author in PubMed Google Scholar
Cristian Micheletti
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.D.S., J.P., E.H. and C.M. conceived and designed the experiments; J.P. and T.G.L. performed statistical analyses; M.D.S. performed the molecular dynamics simulations; M.D.S. and J.P. analysed the data and implemented the analysis tools; M.D.S., J.P., E.H. and C.M. wrote the paper.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Supplementary Methods

Supplementary Tables

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Di Stefano, M., Paulsen, J., Lien, T. et al. Hi-C-constrained physical models of human chromosomes recover functionally-related properties of genome organization. Sci Rep 6, 35985 (2016). https://doi.org/10.1038/srep35985

Download citation

Received: 06 June 2016
Accepted: 30 September 2016
Published: 27 October 2016
DOI: https://doi.org/10.1038/srep35985

This article is cited by

Computational methods for analysing multiscale 3D genome organization
- Yang Zhang
- Lorenzo Boninsegna
- Jian Ma
Nature Reviews Genetics (2024)
Loop-extrusion and polymer phase-separation can co-exist at the single-molecule level to shape chromatin folding
- Mattia Conte
- Ehsan Irani
- Mario Nicodemi
Nature Communications (2022)
Inferring chromosome radial organization from Hi-C data
- Priyojit Das
- Tongye Shen
- Rachel Patton McCord
BMC Bioinformatics (2020)
Transcriptional activation during cell reprogramming correlates with the formation of 3D open chromatin hubs
- Marco Di Stefano
- Ralph Stadhouders
- Marc A. Marti-Renom
Nature Communications (2020)
Defining and detecting links in chromosomes
- Szymon Niewieczerzal
- Wanda Niemyska
- Joanna I. Sulkowska
Scientific Reports (2019)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Significant pairwise constraints from Hi-C data

Genome-wide models from spatial constraints

Local variability of chromosomal nuclear positioning

Nuclear positioning of functional regions

Chromosome pre-mitotic recondensation

Discussion

Methods

Significant pairwise constraints from Hi-C data

Genome-wide modelling and spatial constraints

Nuclear positioning of functionally-related loci

Spatial macrodomains

The chromosome pre-mitotic recondensation

Additional Information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Electronic supplementary material

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links