Abstract
Complex biological processes, such as cellular differentiation, require intricate rewiring of intracellular signalling networks. Previous characterisations revealed that a raised network entropy underlies less differentiated and malignant cell states. A connection between entropy and Ricci curvature led to applications of discrete curvatures to biological networks. However, predicting dynamic biological network rewiring remains an open problem. Here we apply Ricci curvature and Ricci flow to biological network rewiring. By investigating the relationship between network entropy and Forman-Ricci curvature, theoretically and empirically on single-cell RNA-sequencing data, we demonstrate that the two measures do not always positively correlate, as previously suggested, and provide complementary rather than interchangeable information. We next employ Ricci flow to derive network rewiring trajectories from stem cells to differentiated cells, accurately predicting true intermediate time points in gene expression time courses. In summary, we present a differential geometry toolkit for understanding dynamic network rewiring during cellular differentiation and cancer.
Introduction
Cellular differentiation is a complex biological process essential for embryonic development as well as the maintenance and repair of adult tissues. Aberrant differentiation underlies a wide spectrum of pathology. This includes malignancy, where cells may fail to differentiate or may dedifferentiate, becoming trapped in a more plastic, proliferative state^{1}. A key feature of cellular differentiation is an orchestrated shift in the intracellular transcriptomic distribution. In 1939, C. H. Waddington proposed a seminal interpretation of the intracellular state during differentiation, known as the Waddington Landscape^{2}. Under this landscape, less differentiated cells occupy a higher potential energy, represented by an elevated position. As cells differentiate they roll down this complex landscape, following a trajectory determined by its hills and valleys, dropping in potential energy until the cell arrives at an attractor state: the differentiated cell.
While this is an intuitive and appealing picture, the deep complexity of the intracellular state revealed by modern transcriptomic and proteomic quantification, as well as the discovery that we can reprogramme cells to earlier phases of differentiation, motivated a recasting of the Waddington Landscape from a metaphorical picture into an interpretable mathematical framework^{3,4}. Modern interpretations of Waddington’s Landscape have reframed cell fate trajectories via the phase space of transcriptomic dynamics^{5,6,7}, while non-deterministic elements of these transcriptomic dynamics have motivated more information-theoretic characterisations of cell fate trajectories^{8}. The latter interpretation has revealed that the intracellular states of less differentiated cells can be considered more “promiscuous”, displaying a higher entropy in their protein-protein interactions, which decreases during differentiation and increases in cancer, providing a quantitative correlate for the “height” in Waddington’s landscape^{9,10,11}.
Even as Waddington’s landscape has evolved from an intuitive picture to a mathematical framework, cell fate transitions have maintained a geometric appeal^{12}. Geometric approaches to studying cell fate have often focused on characterisations of the underlying dynamical system and typically require detailed knowledge of gene-regulatory networks relevant to specific cell fate transitions^{7,13}. However, at the genome-wide scale, we do not have this deep understanding of intracellular interactions and instead rely on sparse graphical representations, known as biological networks, which can be weighted by biological samples to describe relevant dynamics^{9}. The notion that a (weighted) network has an underlying geometry is well-studied and there are numerous methodologies for network embedding^{14}, with application to biological networks^{15,16}. Recently, discrete analogues of tools from differential geometry^{17,18}, a rich mathematical field for studying manifolds and their curvatures, have been applied to the study of biological networks^{19,20,21,22,23}. These tools provide a new window into the geometry of cell fate and a rich theoretical literature to apply.
In particular, discrete analogues of Ricci curvature, well known for its use to describe the curvature of spacetime in Einstein’s theory of general relativity, have been employed to discriminate biological networks weighted with cancer gene expression data from corresponding healthy tissue^{19}. In 2015, Sandhu et al.^{19} proposed a theoretical link between network entropy and a discrete version of Ricci curvature (Ollivier-Ricci curvature^{17}) computed over the edges of a weighted network. This link was motivated by the theoretical results of Lott and Villani, relating a lower bound of the Ricci curvature on a metric-measure space to the convexity of an entropy functional^{24}, suggesting that Ricci curvature and entropy (computed in this way) may be positively correlated. Though network entropy is not theoretically equivalent to the entropy functional from the metric-measure space setting, it was found that, like network entropy, total Ollivier-Ricci curvature is elevated on networks weighted with cancer data, compared to healthy^{19}. Subsequently, similar results have been obtained using the less computationally intensive Forman-Ricci curvature^{21,23}, including that this curvature decreases during cellular differentiation, again like network entropy. It is of note, however, that depending on the construction of this Forman-Ricci curvature, investigators have demonstrated both positive^{22} and negative^{25} correlations with network entropy.
Cellular differentiation and oncogenesis, like all biological events, are dynamic processes, and the recent results detailed above suggest that the geometry of the underlying space of intracellular interactions, described by biological networks, may change predictably during their progression. The dynamic evolution of manifolds is a well-studied topic in differential geometry. In a seminal contribution to the field, Hamilton introduced Ricci flow as a tool to study the topological implications of deforming a metric on a manifold according to its Ricci curvature^{26}, which led subsequently to the striking solution of the Poincaré conjecture by Perelman^{27,28}. Like curvature, Ricci flow can also be defined in a discrete setting^{29}, and recently discrete Ricci flows and curvatures have been applied to problems in network theory^{30,31,32} such as network alignment^{33}, community detection^{34,35,36}, functional community inference for biological networks^{37} and phase transitions in time-varying complex networks^{38}.
In what follows we first present some background on the computation of network entropy and discrete Ricci curvatures in the context of gene expression weighted protein-protein interaction networks. We then propose a framework for employing a discrete Ricci curvature and normalised Ricci flow to predict dynamic trajectories between temporally linked gene expression samples. We next consider the relationship between our Forman-Ricci curvature construction and network entropy; using a simple toy network we show that the two network measures are not always positively correlated. We find that in promiscuous signalling regimes (such as in stem cells) the measures do positively correlate, but in lower entropy regimes they may anticorrelate, suggesting the two measures are complementary rather than interchangeable. By analysing over 6000 single-cell transcriptomes, we confirm these propositions, demonstrating that network entropy and our Forman-Ricci curvature positively correlate in stem cells, but negatively correlate in cancerous and differentiated samples. Lastly, we consider two independent transcriptomic time courses describing multiple time points during cellular differentiation in different tissues. Using our Ricci flow construction we derive gene expression trajectories from the first time point sample to the last, faithfully predicting the ordering of intermediate samples, without prior knowledge.
Results
Intuition, definitions and preliminaries
Intuitively, we interpret the Waddington Landscape as analogous to the phase space of transcriptomic dynamics during cellular differentiation (Fig. 1A). Let n denote the number of genes in the genome and \({\bf{x}}^{{\bf{t}}}:={({x}_{i}^{t})}_{i=1}^{n} > 0\) denote the vector of transcript abundance for each gene at time \(t\in {\mathbb{R}}^{+}\). Consideration of \(\frac{d{\bf{x}}^{{\bf{t}}}}{dt}\) yields an n-dimensional phase space ϕ, describing permissive trajectories of gene expression. Trajectories between two points in ϕ represent geodesics from one transcriptomic state to another, and distances along these trajectories can be computed by equipping ϕ with a Riemannian metric g. The degree to which these geodesic distances differ from Euclidean distances can be assessed via consideration of Ricci curvature, allowing us to recast the n-dimensional manifold (ϕ, g) as an (n + 1)-dimensional manifold Φ with a Euclidean geometry. This added dimension allows us to interpret the “height” of Waddington’s Landscape, and permits investigation of its association with cellular differentiation states.
A key issue in progressing this construct is the knowledge of \(\frac{d{\bf{x}}^{{\bf{t}}}}{dt}\), which will be a highly sophisticated function incorporating transcription, translation and degradation of mRNA and protein for each gene, as well as the complexities of epigenetic regulation, gene-regulatory networks, protein-protein interaction networks and cell-cell/microenvironment interactions.
It has been shown, however, that integration of transcriptomic data with a protein-protein interaction network (PIN), compiled from multiple sources, yields an entropy rate which is a clear correlate of cellular differentiation potential and thus represents a proxy for “height” in Waddington’s differentiation landscape^{9}. This suggests that a pragmatic approach, considering \(\frac{d{\bf{x}}^{{\bf{t}}}}{dt}\) as constructed purely from protein-protein interactions, may be sufficient for initial interrogation of the structure of Φ, in lieu of a more rigorous theoretical understanding of other contributors.
Network entropy and discrete Ricci curvature
In our construct we let G = (V, E) denote the undirected graph describing the human PIN, with adjacency matrix \(A={({a}_{ij})}_{i,j\in V}\), where ∣V∣ = n. For any x ∈ ϕ, where x_{i} > 0 for all i ∈ V, we define the weighted adjacency matrix \(W({\bf{x}})={({a}_{ij}{x}_{i}{x}_{j})}_{i,j\in V}\), and the row-stochastic matrix, \(P({\bf{x}})={({p}_{ij}({\bf{x}}))}_{i,j\in V}\), obtained by normalising each row of W(x) to sum to one:
\[{p}_{ij}({\bf{x}})=\frac{{a}_{ij}{x}_{j}}{{\sum }_{k\in V}{a}_{ik}{x}_{k}}.\qquad (1)\]
The entropy rate S_{R}(x) of P(x) (hereafter denoted as network entropy, Methods) decreases as cells differentiate. This has been established in bulk and single-cell transcriptomic data from cells at different stages of differentiation and throughout differentiation time courses, by us and multiple independent investigators^{9,11,22}. Network entropy is also higher in cancerous compared to healthy tissue, and is prognostic in breast and lung cancer^{10,11,39}.
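As an illustration of the definitions above, the following minimal sketch computes an entropy rate for P(x) on a small graph. It assumes the unnormalised form S_R = −Σ_i π_i Σ_j p_ij log p_ij with natural logarithms (the Methods may apply a normalisation not reproduced here), and it uses the fact that, because W(x) is symmetric, the stationary distribution is proportional to the row sums of W(x). The function name `network_entropy` is hypothetical.

```python
import numpy as np

def network_entropy(A, x):
    """Entropy rate of the random walk P(x) built from W(x) = (a_ij * x_i * x_j).

    Assumes an undirected graph with no isolated nodes and x > 0.
    This is the unnormalised entropy rate (an assumption of this sketch).
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    W = A * np.outer(x, x)            # weighted adjacency W(x)
    strength = W.sum(axis=1)          # row sums of W(x)
    P = W / strength[:, None]         # row-stochastic matrix P(x)
    pi = strength / strength.sum()    # stationary distribution (W is symmetric)
    # Local entropies -sum_j p_ij log p_ij, treating 0 log 0 as 0.
    logP = np.zeros_like(P)
    np.log(P, out=logP, where=P > 0)
    local = -(P * logP).sum(axis=1)
    return float(pi @ local)

# Toy usage: a star with a central node 0 and three leaves.
A_star = np.array([[0, 1, 1, 1],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0]])
s_uniform = network_entropy(A_star, [1.0, 1.0, 1.0, 1.0])
```

On a genome-scale PIN a sparse matrix representation would be preferable; the dense arrays here are only for readability.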
Sandhu et al.^{19} proposed a positive correlation between network entropy and a discrete version of Ricci curvature computed over edges in a weighted network \({(Ri{c}_{e}({\bf{x}}))}_{e\in E}\), with network average or total Ricci curvature defined by:
where deg(i) = ∑_{j∈V}a_{ij} and where \({({\pi }_{i}({\bf{x}}))}_{i=1}^{n}\) is the stationary distribution of P(x). The correlation between network entropy and total discrete Ricci curvature has since been considered by several studies in the following form^{19,21,22}:
\[\Delta {S}_{R}\times \Delta Ric\ge 0.\qquad (3)\]
While not an unreasonable deduction, justification for this inequality derives from a theoretical investigation of metric-measure spaces (M, d, m), where (M, d) is a metric space and m is a measure on the Borel σ-algebra of M (Methods)^{24,40}. Investigation in this setting uncovered a relationship between the convexity of a relative entropy, computed over the space of probability measures on (M, d) with respect to the measure m, and a lower bound of the Ricci curvature of (M, d, m)^{24,40}. From this association, it was concluded that the negative of the relative entropy and Ricci curvature are positively correlated^{19}. We note, however, that the network setting is not equivalent to metric-measure spaces. In particular, network entropy (an entropy rate) is not equivalent to the relative entropy described by Lott and Villani^{24}. The inequality (3) is therefore not guaranteed by the results on metric-measure spaces^{24,40}.
Moreover, discrete Ricci curvatures, though often theoretically rich, are not exact quantifiers of the continuous Ricci curvature on a manifold. There are several approaches to computing a discrete Ricci curvature on edges of a network, including Ollivier-Ricci curvature^{17} and Forman-Ricci curvature^{18}, both of which have been applied to biological networks and demonstrate elevated total curvature in cancer^{19,21,22}. Forman-Ricci curvature follows a combinatorial construction as follows:
where \({({W}_{i})}_{i\in V}\) is a vector of vertex weights and \({({\omega }_{ij})}_{(i,j)\in E}\) is a vector of edge weights. We note that Forman-Ricci curvature is less computationally intensive to evaluate than Ollivier-Ricci curvature.
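To make the combinatorial construction concrete, the sketch below evaluates an edge-wise Forman-Ricci curvature from vertex weights W_i and edge weights ω_ij. It assumes the weighted form commonly used for networks, R_F(i, j) = W_i + W_j − Σ_{k∼i, k≠j} W_i √(ω_ij/ω_ik) − Σ_{l∼j, l≠i} W_j √(ω_ij/ω_jl); this is an assumption of the example and may differ in detail from the construction intended here. The function name `forman_ricci` is hypothetical.

```python
import numpy as np

def forman_ricci(A, omega, W_node):
    """Edge-wise Forman-Ricci curvature, returned as {(i, j): value} for i < j.

    A      : symmetric 0/1 adjacency matrix
    omega  : symmetric matrix of positive edge weights (read only where A is 1)
    W_node : vector of vertex weights
    The formula is the commonly used weighted network form (an assumption here).
    """
    A = np.asarray(A)
    omega = np.asarray(omega, dtype=float)
    W_node = np.asarray(W_node, dtype=float)
    n = A.shape[0]
    curvature = {}
    for i in range(n):
        for j in range(i + 1, n):
            if not A[i, j]:
                continue
            total = W_node[i] + W_node[j]
            # Edges sharing node i with (i, j), excluding (i, j) itself.
            for k in np.flatnonzero(A[i]):
                if k != j:
                    total -= W_node[i] * np.sqrt(omega[i, j] / omega[i, k])
            # Edges sharing node j with (i, j), excluding (i, j) itself.
            for l in np.flatnonzero(A[j]):
                if l != i:
                    total -= W_node[j] * np.sqrt(omega[i, j] / omega[j, l])
            curvature[(i, j)] = float(total)
    return curvature

# Toy usage: a path 0 - 1 - 2 with unit vertex and edge weights.
A_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
curv_path = forman_ricci(A_path, np.ones((3, 3)), np.ones(3))
```

With unit weights this form reduces to 4 − deg(i) − deg(j) on each edge, which gives a quick sanity check on any implementation.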
Though the discrete entropy and curvature measures do not exactly correspond to the metric-measure space setting, the relation (3) suggests an intriguing geometrical interpretation for the observation that network entropy decreases during cellular differentiation. Transcriptomic states representing undifferentiated cells x^{stem} ∈ ϕ have higher network entropy compared to differentiated cells x^{diff} ∈ ϕ. Under (3) it follows that Ric(x^{stem}) > Ric(x^{diff}). Tree-like networks have a very low curvature, whereas cliques are highly curved^{33}, giving a natural interpretation to this inequality in terms of more deterministic pathway activation during differentiation.
In our phase space analogy to Waddington’s Landscape, with \(\frac{d{\bf{x}}^{{\bf{t}}}}{dt}\) essentially described by W(x^{t}), we see stem cells occupying regions of high curvature (hill tops) and curvature decreasing as cells differentiate, analogous to rolling downhill into valleys. This gives us an intuitive, empirical tool to understand construction of the (n + 1)-dimensional space Φ for the n-dimensional phase space (ϕ, g) at given data points.
Normalised discrete Ricci flow
Cellular differentiation is a dynamic process and typically we only have data for the start and end points x^{stem} and x^{diff}, and perhaps a handful of points between. We consider interpolation between these data points via a discrete normalised Ricci flow.
We propose to use a discrete version of the 2-dimensional normalised Ricci flow, which has previously been considered in the context of weighted networks^{30}:
for Δt > 0, where d_{t}(i, j) is a distance between connected nodes i, j ∈ V at time t, \(Ric{({{{{{{{{\bf{x}}}}}}}}}^{{{{{{{{\bf{t}}}}}}}}})}_{(i,j)}\) is the Ricci curvature on edge (i, j) ∈ E at time t and \({\overline{Ric}}_{(i,j)}\) is an edgewise normaliser to which we want to converge.
Here we consider t = 0 to refer to the undifferentiated cell state x^{stem} and define the normaliser via the fully differentiated state: \({\overline{Ric}}_{(i,j)}=Ric{({{{{{{{{\bf{x}}}}}}}}}^{{{{{{{{\bf{diff}}}}}}}}})}_{(i,j)}\). We postulate that (5) will permit estimation of a permissive trajectory from x^{stem} to x^{diff} in ϕ.
For (5) to generate trajectories the following properties are required (Fig. 1B):

Knowledge of d_{t}(i, j) must be sufficient to calculate \(Ric{({{{{{{{{\bf{x}}}}}}}}}^{{{{{{{{\bf{t}}}}}}}}})}_{(i,j)}\).

Δt must be sufficiently small to prevent negative values of d_{t}.
The following properties are also desired:

Knowledge of d_{t} allows calculation of x^{t} or some transformation thereof, e.g., W(x^{t}). This will permit comparison to intermediate real data points to validate the approach.

Computation time of Ricci curvatures must be sufficiently short to permit multiple iterations rapidly, as for large PINs such as those investigated here, there are typically ~150,000 edges.
In what follows we compute \(Ric{({\bf{x}}^{{\bf{t}}})}_{(i,j)}\) as a Forman-Ricci curvature \({R}_{F}^{t}(i,j)\) with edge weights \({\omega }_{ij}:={\omega }_{ij}^{t}=\frac{{a}_{ij}}{{x}_{i}^{t}{x}_{j}^{t}}\) and node weights \({W}_{i}=\frac{1}{\deg (i)}\). \(Ric{({\bf{x}}^{{\bf{t}}})}_{(i,j)}={R}_{F}^{t}(i,j)\) thus obeys:
We further choose \({d}_{t}(i,j)={\omega }_{ij}^{t}\). These choices satisfy all of our required and desired properties and detailed justification can be found in the Methods.
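The construction in this section can be sketched in a few lines. The example below is illustrative only and rests on two labelled assumptions: the flow update is taken to be d_{t+Δt} = d_t − Δt (Ric − \(\overline{Ric}\)) d_t, the form commonly used for normalised discrete Ricci flows on weighted networks, and the curvature is the commonly used weighted Forman-Ricci form evaluated with ω_ij = a_ij/(x_i x_j) and W_i = 1/deg(i). All function names, the step size and the iteration count are hypothetical.

```python
import numpy as np

def edge_weights(A, x):
    """omega_ij = a_ij / (x_i * x_j); zero wherever there is no edge."""
    return np.asarray(A, dtype=float) / np.outer(x, x)

def forman_curvature(A, omega):
    """Forman-Ricci curvature per edge with node weights W_i = 1/deg(i).

    Uses the commonly used weighted network form (an assumption of this sketch).
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    n = len(deg)
    R = np.zeros_like(A)
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] == 0:
                continue
            r = 1.0 / deg[i] + 1.0 / deg[j]
            for k in range(n):
                if A[i, k] and k != j:
                    r -= np.sqrt(omega[i, j] / omega[i, k]) / deg[i]
                if A[j, k] and k != i:
                    r -= np.sqrt(omega[i, j] / omega[j, k]) / deg[j]
            R[i, j] = R[j, i] = r
    return R

def ricci_flow(A, x_start, x_end, dt=0.01, n_iter=20):
    """Iterate the assumed normalised flow with d_t(i, j) = omega_ij^t."""
    A = np.asarray(A, dtype=float)
    target = forman_curvature(A, edge_weights(A, x_end))  # edge-wise normaliser
    d = edge_weights(A, x_start)
    path = [d.copy()]
    for _ in range(n_iter):
        R = forman_curvature(A, d)          # curvature needs only d_t
        d = d - dt * (R - target) * d       # assumed update rule
        if np.any(d[A > 0] <= 0):
            raise ValueError("dt is too large: an edge distance became non-positive")
        path.append(d.copy())
    return path

# Toy usage: a triangle 0-1-2 with a pendant node 3 attached to node 2.
A_toy = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]])
path_toy = ricci_flow(A_toy, x_start=[1.0, 2.0, 1.0, 0.5], x_end=[2.0, 1.0, 1.0, 1.0])
```

Because a_ij is 0 or 1, each state along the path can be mapped back to a weighted adjacency matrix on the edges via W_ij = a_ij/d_t(i, j), which is what allows comparison with W(x^t) from real intermediate samples. The sketch does not claim that the iteration converges to the target curvature.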
Positive correlation between network entropy and total Forman-Ricci curvature requires a specific signalling regime
Previous studies have demonstrated a positive correlation between network entropy and network average (or total) discrete Ricci curvature computed on differentiating stem cells^{19,22}. However, it has recently been demonstrated, using a slightly different construction of Forman-Ricci curvature, that a negative correlation can be observed with network entropy^{25}. As discussed above, a positive correlation between network entropy and discrete Ricci curvature is not guaranteed in general, as the motivating theoretical results relate to slightly different quantities^{24,40}.
To gain intuition we investigated the association between our version of Forman-Ricci curvature and network entropy on a simple k-star network displayed in Fig. 2A, consisting of k + 1 nodes, of which k have a single edge connecting them to a central node i. We assign each node l ≠ j a weight x_{l} = 1 and assign node j a weight x_{j} = ϵ > 0. We can derive analytical expressions for network entropy (S_{R}) and total Forman-Ricci curvature (R_{F}, defined by (6) and (2)) on this simple network in terms of k and ϵ (Methods).
We performed a numerical analysis of these expressions for various values of \(k\in {\mathbb{Z}}^{+}\setminus \{1\}\) and ϵ > 0 (Fig. 2B–D). By construction S_{R} is maximal for ϵ = 1, regardless of k. For k = 2, R_{F} also has a global maximum at ϵ = 1 and the positive correlation with S_{R} expressed in (3) holds. However, for all other values of k, the association between network entropy and total Forman-Ricci curvature follows two regimes depending on ϵ (Fig. 2D). For ϵ < 1, (3) holds and network entropy and total Forman-Ricci curvature are positively correlated. However, for ϵ > 1 we can always find a range of values of ϵ for which network entropy and total Forman-Ricci curvature are negatively correlated; this range becomes larger as k increases.
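The statement that S_{R} peaks at ϵ = 1 for every k can be checked by hand. The short derivation below is a sketch under two assumptions: S_{R} is taken as the unnormalised entropy rate with natural logarithms (any normalisation used in the Methods is not reproduced here), and leaf j is the single node with weight ϵ. The shorthand Z is introduced only for this example.

```latex
% Transition probabilities out of the central node i; every leaf returns to i with probability 1.
p_{ij} = \frac{\epsilon}{Z}, \qquad p_{il} = \frac{1}{Z} \quad (l \neq j), \qquad Z := k - 1 + \epsilon .

% The stationary distribution is proportional to the row sums of W(x), so \pi_i = Z / (2Z) = 1/2.
% Leaves have deterministic transitions and contribute no local entropy.
S_R(\epsilon) = \pi_i \Big( -\sum_{m \sim i} p_{im} \log p_{im} \Big)
             = \frac{1}{2} \left[ \log Z - \frac{\epsilon \log \epsilon}{Z} \right],

\frac{d S_R}{d \epsilon} = - \frac{(k-1) \log \epsilon}{2 Z^{2}} .
```

The derivative is positive for ϵ < 1, zero at ϵ = 1 and negative for ϵ > 1, so under these assumptions S_{R} has its maximum, (1/2) log k, at ϵ = 1 for every k ≥ 2, and it falls away on both sides regardless of which signalling strategy lowers it.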
Though these results only apply to a very simple network, they suggest a fundamental difference in what network entropy and total Forman-Ricci curvature are measuring. This suggests these measures are complementary, rather than interchangeable as has been previously proposed^{19}. In our simple network, network entropy is maximised for ϵ = 1. We can reduce network entropy by reducing ϵ, signalling more to the k − 1 other neighbours of our central node i at the cost of reducing signalling to our chosen neighbour j, a strategy we call “many for one” (Fig. 2E); in this case R_{F} will also decrease. Alternatively, we can reduce network entropy by increasing ϵ, signalling more to our chosen node j at the cost of signalling less to our remaining neighbours, a strategy we call “one for many” (Fig. 2E); in this case, for larger values of k, R_{F} may increase.
Network entropy is blind to the two signalling strategies, but they are biologically distinct. The “one for many” strategy mirrors deterministic pathway activation, characteristic of a low entropy regime. This strategy is more likely in a highly committed cell, performing a very specific function^{9}. Variation in gene expression amongst well-differentiated cells may therefore capture the negative correlation between network entropy and total curvature that our theoretical investigation has shown to be possible. Conversely, the “many for one” signalling strategy, though not maximising entropy, represents a more disordered state than the “one for many” strategy, maintaining the possibility of diverse pathway activation without committing. This regime mirrors the promiscuous signalling of stem cells, which must maintain the option to differentiate and perform a wide variety of functions^{9}. Variation in gene expression amongst stem cells may therefore capture the positive correlation between network entropy and total curvature, which our theoretical investigation has shown to dominate in “many for one” signalling.
The degree of correlation between network entropy and total Forman-Ricci curvature has biological relevance
Our theoretical results suggest that S_{R} and R_{F} may be positively correlated in stem cells, but negatively correlated in more differentiated tissue. Previous studies reporting an association between network entropy and total Forman-Ricci curvature typically present results on stem cell populations^{21,22,25}. Though the curvatures of more differentiated and cancerous tissues are often also examined, the association with network entropy in these tissues is typically not reported^{25,41}. We note that these studies also employ slightly different constructions of Forman-Ricci curvature from our own, and while most show a positive correlation with network entropy in stem cells^{19,22}, one shows a negative correlation^{25}.
We analysed the previously considered scRNA-seq data sets of Chu et al.^{11,22,25,42} describing the early stages of embryonic stem cell (ESC) differentiation. These data consist of 2 separate experiments, one describing 1018 single cells assayed at different stages of multipotency and a second describing 758 single cells assayed at 6 distinct time points during ESC differentiation. On both these data sets we found that network entropy and our total Forman-Ricci curvature were positively correlated (Pearson’s r > 0.78, p < 2.2 × 10^{−16}) and discriminate distinct lineages during stem cell differentiation (Fig. 3A, B), as previously reported^{11,22,25}.
We next analysed a large scRNA-seq data set describing 1257 malignant and 3256 healthy single cells from 19 patients with malignant melanoma^{43}, on which total curvature values have previously been calculated, but the association with network entropy was not presented^{22,25}. These cells represent more differentiated tissue and, as hypothesised from our theoretical investigation, we found a negative association between network entropy and our total Forman-Ricci curvature on these cells (Pearson’s r = −0.77, p < 2.2 × 10^{−16}, Fig. 3C). We also found that malignant cells displayed higher values of network entropy, as expected (two-tailed Wilcoxon p < 2.2 × 10^{−16})^{9}; however, they displayed lower values of total Forman-Ricci curvature (two-tailed Wilcoxon p < 2.2 × 10^{−16}, Fig. 3C). Considering healthy and malignant cells separately, we found that the correlation between network entropy and total Forman-Ricci curvature was significantly more negative across healthy cells compared to malignant (control cells: Pearson’s r = −0.83, p < 2.2 × 10^{−16}; malignant cells: Pearson’s r = −0.009, p = 0.76; Fisher’s z-transformation: p < 2.2 × 10^{−16}).
To confirm this finding we analysed an independent data set describing 272 malignant and 160 healthy cells from patients with colorectal cancer^{25,44}. We again identified a negative correlation between network entropy and total Forman-Ricci curvature (Pearson’s r = −0.86, p < 2.2 × 10^{−16}, Fig. 3D), with higher network entropy (two-tailed Wilcoxon p = 1.5 × 10^{−6}) but lower total Forman-Ricci curvature (two-tailed Wilcoxon p = 8.0 × 10^{−4}) in cancerous cells. Again, considering healthy and malignant cells separately, the correlation between network entropy and total Forman-Ricci curvature was significantly more negative across healthy cells compared to malignant, though the difference was more subtle than in the melanoma data set (control cells: Pearson’s r = −0.90, p < 2.2 × 10^{−16}; malignant cells: Pearson’s r = −0.83, p < 2.2 × 10^{−16}; Fisher’s z-transformation: p < 4.8 × 10^{−3}).
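The comparison of correlations between healthy and malignant cells can be illustrated with a standard Fisher z-test for two independent Pearson correlations. This is a sketch only: it assumes the independent-samples form of the test with a two-sided normal p-value, and it takes the quoted cell counts as the sample sizes; the exact variant used for the reported values is not stated here. The function name `fisher_z_test` is hypothetical.

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent Pearson correlations."""
    z1 = math.atanh(r1)                      # Fisher z-transform of each correlation
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value from the standard normal
    return z, p

# Rounded values quoted in the text, used purely as illustrative inputs.
z_crc, p_crc = fisher_z_test(-0.90, 160, -0.83, 272)    # colorectal: healthy vs malignant
z_mel, p_mel = fisher_z_test(-0.83, 3256, -0.009, 1257) # melanoma: healthy vs malignant
```

A negative z here indicates that the first group (healthy cells) has the more negative correlation.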
This suggests that network entropy and total Forman-Ricci curvature are not interchangeable measures of cell potency, but complementary. Increasing network entropy is seen in both less differentiated tissue and cancer, while total Forman-Ricci curvature increases in less differentiated tissue and decreases in cancer. Together these measures present a more complete picture of the global intracellular signalling state.
Ricci flow for approximating transcriptomic trajectories
We have found that network entropy and our total Forman-Ricci curvature are related quantities but not interchangeable.
We next consider whether Ricci flow can approximate realistic trajectories through gene expression phase space during cellular differentiation. We first considered the time course scRNA-seq data set of Chu et al.^{42}, describing ESC differentiation at 6 time points. For each time point we computed the mean transcriptomic vector across single cells, which we considered representative of the transcriptomic state at this time point, giving us a set of 6 vectors \({({\bf{x}}^{{\bf{t}}})}_{t=0}^{5}\) (Fig. 4A). To provide a null model we considered a straight line trajectory from W(x_{0}) to W(x_{5}) (Methods). We computed the Euclidean distance between points along this straight line and the true intermediate data points \({(W({\bf{x}}_{{\bf{t}}}))}_{t=1}^{4}\), to determine the ordering of the true data points along the straight line trajectory (Methods, Fig. 4B). As anticipated, the straight line trajectory did not pass the true data points in the correct order, and the distance along the trajectory to the closest pass of each true data point was not significantly correlated with the differentiation time of that data point (Pearson’s r = 0.85, p = 0.153, Fig. 4C). We next considered the trajectory from W(x_{0}) to W(x_{5}) produced by our normalised discrete Ricci flow described by (5) (Methods). We found that the Ricci flow trajectory passed the true data points in the correct order, and the number of iterations to the closest pass of the true data points correlated with the differentiation time of those points (Pearson’s r = 0.96, p = 0.04, Fig. 4D).
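The closest-pass comparison described above can be sketched as follows. The example builds a straight-line null trajectory between two end points, finds the step at which the trajectory comes nearest (in Euclidean distance) to each intermediate point, and correlates those steps with the known time labels. The number of interpolation steps and all function names are assumptions of this sketch; the Methods may define the null trajectory differently.

```python
import numpy as np

def straight_line(start, end, n_steps=101):
    """Evenly spaced points on the segment from start to end (inclusive)."""
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    s = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - s) * start + s * end

def closest_pass(trajectory, points):
    """Index of the trajectory step nearest to each point (Euclidean distance)."""
    traj = np.asarray(trajectory, dtype=float)   # shape (steps, dim)
    pts = np.asarray(points, dtype=float)        # shape (m, dim)
    dists = np.linalg.norm(traj[:, None, :] - pts[None, :, :], axis=2)
    return dists.argmin(axis=0)

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(a, b)[0, 1])

# Toy usage with flattened 2-D "states"; real inputs would be flattened W(x) matrices.
null_traj = straight_line([0.0, 0.0], [10.0, 0.0])
intermediates = [[2.0, 1.0], [5.0, -1.0], [8.0, 0.5]]
steps = closest_pass(null_traj, intermediates)
r_time = pearson_r(steps, [1, 2, 3])
```

The same `closest_pass` function applies unchanged to a Ricci flow trajectory, with the iteration number playing the role of the step index.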
To confirm the finding that Ricci flow correctly orders differentiation trajectories, we considered our data set of bulk RNA-sequencing of human myoblast differentiation into multinucleated myotubes, with transcriptomic samples taken at 8 time points in triplicate (Fig. 5A)^{45}. Performing the analysis as above, separately for each triplicate, we found that closest pass progression along a null model linear trajectory correlated with differentiation time but could not robustly discriminate time points across triplicates (Pearson’s r = 0.79, p = 1.0 × 10^{−4}, Fig. 5B). In contrast, closest pass Ricci flow iterations were highly correlated with differentiation time (Pearson’s r = 0.93, p = 3.7 × 10^{−8}, Fig. 5B) and were tightly reproducible across triplicates, discriminating all time points, with the exception of the first two intermediate time points. These initial time points were taken only 90 min apart and thus are unlikely to represent a significant dynamic change.
Discussion
Numerous measures have been developed in network theory to analyse network properties. Classic approaches include studying the degree distribution, clustering coefficient, and shortest path between nodes, all of which provide insights into the network’s geometry^{46}. However, to study the geometric and topological properties of networks more deeply, discrete adaptations of differential geometry have become widely applied^{19,22,30,31,32,33,36,41}. In differential geometry curvature is a key actor, describing the local behaviour of a manifold, and geometric flows can be employed to perturb this important property and examine the consequences. By treating networks as discrete counterparts of manifolds, we can view them as geometric objects and discrete curvatures and flows on networks have proven effective tools for addressing common network theory questions^{31,32,33,36}.
Here we investigated discrete Ricci curvature and Ricci flow, to study properties of biological signalling in differentiating and malignant cells. This work builds on the finding that network entropy is a proxy for “height” in Waddington’s Landscape, having higher values on stem cells and malignant cells compared to healthy differentiated tissue^{9,10,11}, by investigating the enticing theoretical link between Ricci curvature and entropy^{19,24,40}. We propose a framework to calculate the total Forman-Ricci curvature of a single biological sample, which is compatible with a discrete Ricci flow, to infer trajectories between the intracellular signalling regimes of two temporally connected transcriptomic samples.
By investigating our framework in a simple analytically tractable setting, we prove that network entropy and our total Forman-Ricci curvature are not guaranteed to be positively correlated. Our investigation suggests that a positive correlation is likely across samples with a highly promiscuous signalling regime (such as stem cells), with a negative correlation more likely across cells with deterministic signalling (differentiated cells). We provide empirical evidence for this theoretical hypothesis through the analysis of >6000 single-cell transcriptomes. Interestingly, we found that cancer cells have a higher network entropy but lower total Forman-Ricci curvature than healthy differentiated cells and that the correlation between network entropy and total Forman-Ricci curvature is less negative in cancerous cells compared to healthy. This is in contrast to stem cells, where both network entropy and total Forman-Ricci curvature are higher than in healthy differentiated cells and are positively correlated.
One of the hallmarks of malignancy is anaplasia, the dedifferentiation of cancerous cells compared to their healthy counterparts. Anaplasia is typically quantified by histological grade, where tumour cells are compared morphologically to their healthy counterparts and assigned a low grade if they appear similar, or a high grade if they have lost the appearance associated with specialised function. Anaplastic malignant cells gain some of the hallmarks of stem cells, such as a higher proliferative capacity; they also gain additional functions, including those which facilitate metastasis. Our theoretical results suggest that the loss of negative correlation between network entropy and total Forman-Ricci curvature in malignant cells may represent an increase in “many for one” signalling compared to healthy cells, as expected in anaplasia. Highly anaplastic cells may even attain a signalling regime more characteristic of stem cells, and show a positive correlation between network entropy and total Forman-Ricci curvature.
By applying our normalised discrete Ricci flow to the first and last time point of time courses of cellular differentiation from two distinct tissues, we derived biological network rewiring trajectories, which accurately predicted intermediate time points. Predictions made by this approach require experimental validation but offer the possibility of deeper insights into the molecular events underpinning cellular differentiation and early biomarker detection for malignancy and regenerative pathology.
Our findings contrast with other studies, which proposed a positive correlation between network entropy and total discrete curvature of a biological network by appealing to results on metric-measure spaces^{19,24,40}. There are a number of reasons for this contrast. Firstly, the discrete network setting is not the exact analogue of the metric-measure space setting and, in particular, the definitions of “entropy” in the two settings are not identical. Secondly, discrete approximations of Ricci curvature for networks are non-unique and there are several ways of defining them depending upon context, including Ollivier-Ricci curvature derived from optimal transport considerations^{17} and Forman-Ricci curvature derived from consideration of cell complexes^{18}. It has been shown that node averages of these different discrete Ricci curvatures computed on the same network do not always correlate^{20}. Moreover, if we focus only on the Forman-Ricci curvature employed here, it can be seen from (4) that there is considerable flexibility in its definition, via the selection of node and edge weights. Indeed, a positive correlation between Forman-Ricci curvature and network entropy^{22} became negative across the same samples when the investigators used a different choice of edge weights^{25}. The selection of weights for Forman-Ricci curvature therefore requires careful consideration to ensure it is matched to context. In particular, it may be possible to choose weights which artificially engineer a correlation between total Forman-Ricci curvature and network entropy. Moreover, if we define both node and edge weights as variables which change temporally, as has been done previously^{22,25}, then a Ricci flow on edges as we have constructed is computationally intractable.
Our findings therefore motivate theoretical investigation into how to translate the deep results from metric-measure spaces into the biological network setting with more fidelity, as well as a more robust understanding of the impact of parameter choices when applying Forman-Ricci curvature to weighted biological networks. Here we provide a framework for such theoretical investigation and show that our Forman-Ricci curvature is an informative biological network measure, complementing rather than simply correlating with network entropy by providing robust discrimination between healthy, cancerous and stem cells.
Our work paves the way towards addressing questions related to the prediction of network evolution over time and their study with tools adapted from differential geometry. Though both theoretical and experimental investigations are required to fully exploit this area, we demonstrate that important insights into the molecular mechanisms of health and disease can be achieved through analysis of discrete Ricci curvatures and flows.
Methods
Network entropy calculation
The computation of network entropy was as previously described^{9,10,11} employing the SCENT package in R and the symmetric PIN compiled from multiple sources in 2016 available at https://github.com/aet21/SCENT. We denote the adjacency matrix of the PIN by \(A={({a}_{ij})}_{i,j=1}^{n}\).
For each gene expression sample, genes were matched to proteins in the PIN. When multiple genes mapped to a single protein, their expression levels were averaged, and only the largest connected component of the PIN was considered post-matching. For each matched sample \({\bf{x}}={({x}_{i})}_{i=1}^{n} \, > \,0\), a weighted network \(W({\bf{x}})={({a}_{ij}{x}_{i}{x}_{j})}_{i,j\in V}\) and a row-stochastic matrix \(P({\bf{x}})={({p}_{ij}({\bf{x}}))}_{i,j\in V}\) were constructed, where:
\[{p}_{ij}({\bf{x}})=\frac{{a}_{ij}{x}_{i}{x}_{j}}{{\sum }_{k\in V}{a}_{ik}{x}_{i}{x}_{k}}=\frac{{a}_{ij}{x}_{j}}{{(A{\bf{x}})}_{i}}.\qquad (7)\]
We define the local entropy of node i as
\[{S}_{i}({\bf{x}})=-\sum _{j\in V}{p}_{ij}({\bf{x}})\log {p}_{ij}({\bf{x}}).\qquad (8)\]
The entropy rate associated with P(x) is then given by
\[{S}_{R}({\bf{x}})=\sum _{i\in V}{\pi }_{i}({\bf{x}}){S}_{i}({\bf{x}}),\qquad (9)\]
where \(\pi ({\bf{x}})={({\pi }_{i}({\bf{x}}))}_{i\in V}\) is the stationary distribution of P(x), satisfying
\[\pi ({\bf{x}})P({\bf{x}})=\pi ({\bf{x}}).\qquad (10)\]
As G is undirected and a single connected component, by the Perron-Frobenius theorem the stationary distribution π has an analytical solution given by:
\[{\pi }_{i}({\bf{x}})=\frac{{x}_{i}{(A{\bf{x}})}_{i}}{{{\bf{x}}}^{T}A{\bf{x}}}.\qquad (11)\]
When presented in figures network entropy was calculated as the above entropy rate S_{R}(x) normalised by the maximal entropy rate possible from the topology of the matched PIN, following our prior convention, to allow comparison across different networks^{10,11}.
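As an illustration of the calculation described above, the following is a minimal NumPy sketch, not the SCENT implementation. It assumes the standard random-walk construction on W(x), that is p_ij = a_ij x_j / (Ax)_i, local entropy equal to the Shannon entropy of each row of P, and stationary distribution π_i = x_i (Ax)_i / (xᵀAx). The function names are hypothetical, and the normaliser used here (the logarithm of the largest adjacency eigenvalue, a standard expression for the maximal entropy rate of a connected undirected graph) is an assumption about how the maximal entropy rate is obtained.

```python
import numpy as np

def entropy_rate(A, x):
    """Return (S_R, P, pi) for the random walk on W(x) = (a_ij * x_i * x_j)."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    Ax = A @ x                                   # (Ax)_i = sum_k a_ik x_k
    P = A * x[None, :] / Ax[:, None]             # p_ij = a_ij x_j / (Ax)_i
    # Elementwise log, treating 0 log 0 as 0 on non-edges
    logP = np.log(P, out=np.zeros_like(P), where=P > 0)
    S_local = -(P * logP).sum(axis=1)            # local entropy of each node
    pi = x * Ax / (x @ Ax)                       # stationary distribution
    return float(pi @ S_local), P, pi

def max_entropy_rate(A):
    """ASSUMPTION: maximal entropy rate = log of the largest adjacency eigenvalue."""
    return float(np.log(np.linalg.eigvalsh(np.asarray(A, dtype=float)).max()))

# Toy triangle network with uniform expression (hypothetical data)
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
S_R, P, pi = entropy_rate(A, [1.0, 1.0, 1.0])
print(S_R / max_entropy_rate(A))  # normalised entropy rate
```

On the uniform triangle every row of P is uniform over two neighbours, so the entropy rate equals log 2 and the normalised value is close to 1.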
Construction of the Ricci flow equation
Formally, for a smooth manifold Y, a Ricci flow defines, for an open interval \((a,b)\subset {{\mathbb{R}}}^{+}\), a Riemannian metric d_{t} such that:
the constant − 2 is largely conventional and can be replaced with any k < 0, to ensure existence of a unique solution in finite time. Normalised Ricci flows are typically employed for convergence studies when certain properties, e.g., volume, are required to be finite
where \(\overline{Ric}\) is a normaliser.
In two dimensions, normalised Ricci flow is well studied theoretically^{47} and takes a special form:
For normalised discrete Ricci flow we employ the following expression, described in the main text and applied previously^{21}:
for Δt > 0, where d_{t}(i, j) is a distance between connected nodes i, j ∈ V at time t, \(Ric{({{{{{{{{\bf{x}}}}}}}}}^{{{{{{{{\bf{t}}}}}}}}})}_{(i,j)}\) is the Ricci curvature on edge (i, j) ∈ E at time t and \({\overline{Ric}}_{(i,j)}\) is an edgewise normaliser to which we want to converge.
We next must choose expressions for d_{t}(i, j) and \(Ric{({{{{{{{{\bf{x}}}}}}}}}^{{{{{{{{\bf{t}}}}}}}}})}_{(i,j)}\) which satisfy our required and desired properties outlined in the Results.
We select \(Ric{({{\bf{x}}}^{t})}_{(i,j)}\) to be a Forman-Ricci curvature \({R}_{F}^{t}(i,j)\), as this discrete form of Ricci curvature is fast to compute compared to other versions such as Ollivier-Ricci curvature, and we must compute ~ 150,000 edgewise curvatures per iteration of our Ricci flow. We choose the edge weights of this curvature to be \({\omega }_{ij}: \!\!={\omega }_{ij}^{t}=\frac{{a}_{ij}}{{x}_{i}^{t}{x}_{j}^{t}}\) and node weights \({W}_{i}=\frac{1}{{\mathrm{deg}}(i)}\). \({R}_{F}^{t}(i,j)\) thus obeys:
We also choose \({d}_{t}(i,j)={\omega }_{ij}^{t}\). We note that, as for other discrete Ricci flow studies^{30,33}, d_{t}(i, j) is not a metric, as it fails the triangle inequality; however, it is small, implying “close proximity” of connected vertices i, j ∈ V, if the corresponding transcript levels of genes i and j are high at time t. In addition, at each iteration of (5), this choice of d_{t}(i, j) allows computation of \({({\omega }_{ij}^{t+\Delta t})}_{(i,j)\in E}\), which can be input into (4), allowing computation of \({({R}_{F}^{t+\Delta t}(i,j))}_{(i,j)\in E}\) and thus the next iteration of (5). This iterated d_{t+Δt} can simply be inverted to give W(x^{t+Δt}), which allows direct comparison of the Ricci flow generated transcriptomic distribution with real biological data. Our choice of d_{t} thus satisfies all our desired properties and is a reasonable distance measure.
W_{i} is chosen to be independent of x^{t} as the Ricci flow iteration only provides enough equations to calculate updates of edge weights; thus, if W_{i} depended on t, we could not compute \({R}_{F}^{t}(i,j)\) over each iteration of (5). We select \({W}_{i}=\frac{1}{{\mathrm{deg}}(i)}\) to normalise the sums in (4), which is important when comparing total Forman-Ricci curvature and network entropy (see below).
We further note that:
implying that as ω_{ij} decreases, based on our definition of the distance d(i, j) = ω_{ij}, i and j become “closer”, and the Forman-Ricci curvature increases, and vice versa (Fig. 1C). This behaviour is as expected from a curvature. Moreover, considering our Ricci flow construction in (5), if \({R}_{F}^{t}(i,j) \, > \,\overline{{R}_{F}(i,j)}\) then \({d}_{t+\Delta t}(i,j)={\omega }_{ij}^{t+\Delta t}\) will increase, leading to a reduction in \({R}_{F}^{t}(i,j)\) via (17), driving convergence to \(\overline{{R}_{F}(i,j)}\).
Thus our choice of Ricci flow construction is computationally efficient, facilitates convergence of the flow towards the normaliser and satisfies all our required and desired properties outlined in the Results.
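To make the weight choices above concrete, here is a hedged NumPy sketch of an edgewise Forman-Ricci curvature with edge weights ω_ij = a_ij / (x_i x_j) and node weights W_i = 1/deg(i). It assumes the commonly used weighted Forman form, R_F(i,j) = W_i + W_j − Σ_{k∼i, k≠j} W_i √(ω_ij/ω_ik) − Σ_{l∼j, l≠i} W_j √(ω_ij/ω_jl); whether this matches expression (4) term by term is an assumption. The function names are hypothetical, and the graph is assumed to be connected so that every degree is positive.

```python
import numpy as np

def edge_weights(A, x):
    """omega_ij = a_ij / (x_i x_j): zero off the edges, small when both genes are high."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    return A / np.outer(x, x)

def forman_ricci(A, omega):
    """Edgewise curvature matrix (zero on non-edges), ASSUMING the standard
    weighted Forman form with node weights W_i = 1 / deg(i)."""
    A = np.asarray(A, dtype=float)
    W = 1.0 / A.sum(axis=1)                        # node weights 1 / deg(i)
    root = np.sqrt(np.asarray(omega, dtype=float) * A)
    inv = np.divide(1.0, root, out=np.zeros_like(root), where=A > 0)
    s = inv.sum(axis=1)                            # s_i = sum_k 1 / sqrt(omega_ik)
    other_i = W[:, None] * (s[:, None] - inv)      # edges at i other than (i, j)
    other_j = W[None, :] * (s[None, :] - inv)      # edges at j other than (i, j)
    return A * (W[:, None] + W[None, :] - root * (other_i + other_j))

# Path 0 - 1 - 2: raising x_0 lowers omega_01 and raises the curvature of edge (0, 1)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
R_low = forman_ricci(A, edge_weights(A, [1.0, 1.0, 1.0]))
R_high = forman_ricci(A, edge_weights(A, [2.0, 1.0, 1.0]))
print(R_low[0, 1], R_high[0, 1])
```

Under this assumed form the curvature of an edge is decreasing in its own weight, which is the monotonic behaviour described in the text.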
Investigating the correlation between network entropy and total Forman-Ricci curvature on a simple network
We consider the simple k-star network displayed in Fig. 2A, consisting of k + 1 vertices, of which k have a single edge connecting them to a central vertex i. We assign each vertex l ≠ j a weight x_{l} = 1 and assign vertex j a weight x_{j} = ϵ > 0.
Our Forman-Ricci curvature is defined on an edge as follows:
whence
Which we denote as:
for notational ease, where:
We note that via (1):
which gives us the alternative expression, which can be helpful when considering stochastic matrices
Employing the results above it is a simple deduction that for our toy network:
The stationary distribution of the network is also easily calculated from (11) as:
It is also clear that the local entropies will satisfy:
The network entropy of this network is thus simply:
As a function of ϵ, this has a single maximum at ϵ = 1 (Fig. 2B).
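The toy calculation can be checked numerically. The sketch below assumes that vertex j is a leaf of the star, that the local entropy is the Shannon entropy of the outgoing transition probabilities, and that the stationary distribution is π_i = x_i (Ax)_i / (xᵀAx). Under those assumptions the centre carries stationary mass 1/2, the leaves have zero local entropy, and the entropy rate reduces to S_R(ϵ) = ½ [log(k − 1 + ϵ) − ϵ log ϵ / (k − 1 + ϵ)]. Both function names are hypothetical.

```python
import numpy as np

def star_entropy_rate(k, eps):
    """Entropy rate of the k-star: centre 0, leaves 1..k, leaf 1 weighted eps, others 1."""
    n = k + 1
    A = np.zeros((n, n))
    A[0, 1:] = A[1:, 0] = 1.0                    # every leaf joins the centre
    x = np.ones(n)
    x[1] = eps                                   # the perturbed leaf j
    Ax = A @ x
    P = A * x[None, :] / Ax[:, None]             # row-stochastic matrix
    logP = np.log(P, out=np.zeros_like(P), where=P > 0)
    S_local = -(P * logP).sum(axis=1)            # only the centre is non-zero
    pi = x * Ax / (x @ Ax)                       # stationary distribution
    return float(pi @ S_local)

def star_entropy_closed_form(k, eps):
    """Closed form derived under the assumptions stated in the lead-in."""
    return 0.5 * (np.log(k - 1 + eps) - eps * np.log(eps) / (k - 1 + eps))

# Scan eps on a grid that contains 1.0 and report where the entropy rate peaks
grid = np.linspace(0.1, 3.0, 30)
values = [star_entropy_rate(5, e) for e in grid]
print(grid[int(np.argmax(values))])
```

The derivative of the closed form is proportional to −log ϵ, so the entropy rate rises for ϵ < 1 and falls for ϵ > 1, and the grid scan peaks at the point nearest ϵ = 1.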
We now consider the total Forman-Ricci curvature, defined by:
where
In our example, the following can be deduced from equation (23):
Which allows the calculation of
Whence
Network entropy and total FormanRicci curvature comparison
Network entropy was calculated on each gene expression sample as described above. Forman-Ricci curvature was computed over an edge (i, j) using the following expression:
Nodal average Forman-Ricci curvature was computed as previously described^{22,25} via:
and network average, or total Forman-Ricci curvature, was computed via:
where \({({\pi }_{i}({\bf{x}}))}_{i=1}^{n}\) is the stationary distribution of P(x).
The choice of node weights for our Forman-Ricci curvature, \({W}_{i}=\frac{1}{{\mathrm{deg}}(i)}\), is important here as it ensures that the upper bounds of each of the two sums comprising edgewise Forman-Ricci curvature defined in (4) are not dependent on node degree, and so nodal average Forman-Ricci curvature is also independent of degree. This is required as the local entropy of a node i (defined in (8)) takes values on [0, log deg(i)] and thus has a degree dependence. We define total Forman-Ricci curvature here, to mirror network entropy, as a weighted sum of nodal average curvatures, using the stationary distribution \({({\pi }_{i}({\bf{x}}))}_{i=1}^{n}\) as the weights. Our choice of node weights \({W}_{i}=\frac{1}{{\mathrm{deg}}(i)}\) thus prevents total Forman-Ricci curvature and network entropy from correlating purely because of a shared degree dependence. We note that while our choice of W_{i} prevents degree dependence of edgewise and nodal Forman-Ricci curvature, the use of the stationary distribution in the calculation of total Forman-Ricci curvature introduces the relative biological importance of hub nodes^{9}.
Associations between network entropy and total FormanRicci curvature were assessed using Pearson correlation with significance at the 5% level.
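The aggregation from edgewise to total curvature, and the correlation with network entropy across samples, can be sketched as follows. The total is taken as the stationary-distribution-weighted sum of nodal averages, as stated in the text. The nodal average is assumed here to be the unweighted mean of the curvatures on the edges incident to a node, which may differ from the cited definition. Function names are hypothetical, and only the Pearson coefficient is computed; a significance test would need an additional statistics library.

```python
import numpy as np

def total_forman_ricci(A, R_edge, x):
    """Stationary-distribution-weighted sum of nodal average curvatures."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    R_edge = np.asarray(R_edge, dtype=float)
    # ASSUMPTION: nodal average = unweighted mean over the edges incident to each node
    nodal = (A * R_edge).sum(axis=1) / A.sum(axis=1)
    Ax = A @ x
    pi = x * Ax / (x @ Ax)                       # stationary distribution of P(x)
    return float(pi @ nodal), nodal

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(u, v)[0, 1])

# Hypothetical per-sample summaries: one entropy and one total curvature per cell
entropies = [0.81, 0.86, 0.90, 0.93]
curvatures = [-3.2, -2.9, -2.1, -1.8]
print(pearson_r(entropies, curvatures))
```

Keeping the nodal averaging step separate from the weighting by π makes it easy to swap in a different nodal definition without touching the rest of the pipeline.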
Computing linear and Ricci flow trajectories between time-ordered gene expression samples
Trajectories for time course gene expression data were derived via two approaches: a null Euclidean straight-line trajectory and our discrete normalised Ricci flow. For both approaches the first gene expression time point (x^{0}) was used as a starting state and the final time point (x^{T}) was the end state. Intermediate time points were not used in the derivation of the trajectory, only for its validation.
For normalised discrete Ricci flow we employ the following expression described above:
This flow will deform the weight on an edge of the PIN at a rate proportional to the difference between the edge curvature at a starting state and a final state determined by the normaliser.
We set the normaliser of our Ricci flow as the Forman-Ricci curvature calculated at the final time point T: \({\overline{Ric}}_{(i,j)}={R}_{F}^{T}(i,j)\). The time increment Δt was selected empirically. If Δt is too large, negative values of the incremented distance d_{t+Δt} are possible, which are not acceptable by definition; if Δt is very small, convergence of the Ricci flow to the normaliser requires a great number of iterations and is not computationally practical. We therefore considered a range of values for Δt ∈ {10^{−3}, …, 10^{−1}}. For each gene expression time course, we implemented one time step of the Ricci flow from the first time point x^{0} using each Δt value and selected the optimal Δt as the largest which does not admit negative values of d_{0+Δt}. For both time courses considered this value was Δt = 0.06.
We note that the maximal value of Δt which does not admit negative values of d_{0+Δt} can also be derived theoretically and depends on the differences between the edgewise Forman-Ricci curvatures at t = 0 and those of the normaliser via:
where \({E}^{*}=\{(i,j)\in E:\overline{Ri{c}_{(i,j)}}-Ric{({x}^{0})}_{(i,j)} \, > \, 0\}\). For both time courses considered, Δt^{*} ∈ [0.06, 0.065], and Ricci flow was thus implemented using close to the maximal value of Δt possible. Smaller values of Δt can be used to obtain a more fine-grained approximation of the network rewiring trajectory, at the cost of increased computation time and the need for more iterations before convergence.
For both gene expression time courses, we found that after 150 iterations the normalised Ricci flow converged very close to the normaliser, with little change in d_{t+Δt} with subsequent iterations; we thus selected 150 as the optimal number of iterations in the flow. We note that by construction the final transcriptomic time point will always be closest to the end of the trajectory. As the number of iterations is selected to be sufficiently large to ensure convergence, rather than as the minimum number of iterations required for convergence, the end of the trajectory represents signalling in a steady state, as opposed to the precise moment gene expression matches the final time point.
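The iteration and step-size selection described above can be sketched as below. This is an illustration under stated assumptions rather than the published code. It assumes a multiplicative update d_{t+Δt} = d_t − Δt (R̄ − R^t) d_t on each edge, which is consistent with the text (a curvature above the normaliser increases the distance, and the largest admissible step depends only on curvature differences) but is not confirmed by it. It also assumes the standard weighted Forman form with W_i = 1/deg(i). All function names are hypothetical.

```python
import numpy as np

def forman_ricci(A, omega):
    """ASSUMED weighted Forman form with node weights 1/deg(i); zero off the edges."""
    W = 1.0 / A.sum(axis=1)
    root = np.sqrt(omega * A)
    inv = np.divide(1.0, root, out=np.zeros_like(root), where=A > 0)
    s = inv.sum(axis=1)
    other = W[:, None] * (s[:, None] - inv) + W[None, :] * (s[None, :] - inv)
    return A * (W[:, None] + W[None, :] - root * other)

def max_step(A, R0, R_target):
    """Largest dt keeping every distance positive after the first step
    (ASSUMES the multiplicative update used in ricci_flow)."""
    gap = (R_target - R0)[A > 0]
    gap = gap[gap > 0]
    return np.inf if gap.size == 0 else 1.0 / gap.max()

def ricci_flow(A, x0, xT, dt, n_iter=150):
    """Iterate edge distances d = omega from sample x0 towards the curvature of xT."""
    A = np.asarray(A, dtype=float)
    omega = A / np.outer(x0, x0)                   # d_0(i, j) = a_ij / (x_i x_j)
    R_target = forman_ricci(A, A / np.outer(xT, xT))
    trajectory = []
    for _ in range(n_iter):
        R = forman_ricci(A, omega)
        omega = omega * (1.0 - dt * (R_target - R))  # assumed update rule
        if np.any(omega[A > 0] <= 0):
            raise ValueError("dt too large: a distance became non-positive")
        trajectory.append(omega.copy())
    return trajectory

# Toy path network 0 - 1 - 2 with hypothetical start and end expression profiles
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x0, xT = np.ones(3), np.array([2.0, 1.0, 1.0])
traj = ricci_flow(A, x0, xT, dt=0.06)
gap = np.abs(forman_ricci(A, traj[-1]) - forman_ricci(A, A / np.outer(xT, xT)))
print(gap.max())  # remaining distance to the normaliser after 150 iterations
```

Each stored distance matrix can be inverted on the edges to give a predicted weighted network for comparison with data. Because the assumed curvature depends only on ratios of edge weights, the flow drives curvatures towards the normaliser without fixing the overall scale of the weights.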
To derive the Euclidean linear trajectory null model, from the starting gene expression time point to the final, we constructed a straight line from \({W}^{0}={({a}_{ij}{x}_{i}^{0}{x}_{j}^{0})}_{i,j\in V}\) to \({W}^{T}={({a}_{ij}{x}_{i}^{T}{x}_{j}^{T})}_{i,j\in V}\) in \({{\mathbb{R}}}^{n\times n}\). We selected 150 equally spaced points along this line via the following expression
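A minimal sketch of such a straight-line null trajectory is given below. The exact parameterisation is an assumption: points are placed at fractions r/150 of the way from W^0 to W^T for r = 1, …, 150, so that the last point coincides with W^T. The function name is hypothetical.

```python
import numpy as np

def linear_trajectory(A, x0, xT, n_points=150):
    """Equally spaced points on the straight line from W^0 to W^T in R^(n x n)."""
    A = np.asarray(A, dtype=float)
    W0 = A * np.outer(x0, x0)                    # W^0 = (a_ij x_i^0 x_j^0)
    WT = A * np.outer(xT, xT)                    # W^T = (a_ij x_i^T x_j^T)
    # ASSUMPTION: fractions r / n_points for r = 1..n_points, ending exactly at W^T
    return [W0 + (r / n_points) * (WT - W0) for r in range(1, n_points + 1)]

# Toy path network with hypothetical start and end expression profiles
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
points = linear_trajectory(A, [1.0, 1.0, 1.0], [2.0, 1.0, 3.0])
print(len(points), points[-1][1, 2])
```

Each point is itself a weighted network, so it can be compared directly with the networks built from measured intermediate time points.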
Comparing inferred trajectories to true time course gene expression data
For both normalised discrete Ricci flow and the Euclidean linear trajectory null model we derived a trajectory described by 150 discrete points from the starting gene expression state to the final, as above. Each of these discrete data points can be transformed into a prediction of the weighted network: \({W}_{p}({{\bf{x}}}^{r})={a}_{ij}{x}_{i}^{r}{x}_{j}^{r}\) for r ∈ {1, …, 150}. In the case of the Euclidean trajectory, the inferred point is exactly this weighted network, while for the normalised Ricci flow \({W}_{p}({{\bf{x}}}^{r})={(1/{d}_{r}(i,j))}_{i,j\in V}\).
For each true intermediate time point in the gene expression time course {1, …, T − 1} we computed the Euclidean distance between each of the 150 predictions of W_{p}(x^{r}) in each inferred trajectory and the true data points {W(x^{1}), …, W(x^{T−1})}.
The value of r which minimised the distance between W_{p}(x^{r}) and W(x^{t}) was considered the point along the trajectory which most closely corresponded to the true gene expression trajectory at time t.
The association between the trajectory points corresponding to the measured time points and the true intermediate time points themselves (excluding starting and ending time points) was assessed via Pearson correlation, with significance at the 5% level.
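The matching and validation steps above can be sketched as follows, assuming the Euclidean distance between weighted networks is the Frobenius norm of their difference and that trajectory points are indexed from r = 1. Function names and the toy data are hypothetical, and only the Pearson coefficient is computed, not its significance.

```python
import numpy as np

def closest_iterations(trajectory, observed):
    """For each observed network, the 1-based index r of the nearest trajectory point."""
    traj = np.stack(trajectory)                  # shape (n_points, n, n)
    closest = []
    for W in observed:
        dist = np.linalg.norm(traj - W, axis=(1, 2))   # Frobenius distance to each point
        closest.append(int(np.argmin(dist)) + 1)
    return closest

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(u, v)[0, 1])

# Hypothetical trajectory of 10 networks and three measured intermediate time points
trajectory = [r * np.eye(2) for r in range(1, 11)]
observed = [2.2 * np.eye(2), 5.1 * np.eye(2), 7.9 * np.eye(2)]
r_closest = closest_iterations(trajectory, observed)
print(r_closest, pearson_r([1, 2, 3], r_closest))
```

A correlation near 1 between measurement time and closest-pass index indicates that the trajectory visits the intermediate states in the order in which they were measured.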
Entropy and Ricci curvature on networks and metric-measure spaces
A connection between Ricci curvature and relative entropy has been explored in the setting of metric-measure spaces by several investigators^{24,40,48}. Formally, let (M, d, m) be a metric-measure space, where (M, d) is a metric space and m is a measure on the Borel σ-algebra of M. The authors typically aim to define a notion by which (M, d, m) has a Ricci curvature bounded below by \(K\in {\mathbb{R}}\) and explore the consequences. To do so they consider the metric space P_{2}(M) = (P(M), W_{2}), associated with the metric space (M, d), where P(M) is the space of Borel probability measures on M and W_{2} is the Wasserstein-2 distance. W_{2} is a distance measure commonly used in optimal transport. To provide intuition, if m_{1}, m_{2} ∈ P(M) then \({W}_{2}{({m}_{1},{m}_{2})}^{2}\) is the smallest cost of transporting the total mass from the measure m_{1} to the measure m_{2}, where the cost of transporting a unit mass between points a_{1} and a_{2} ∈ M is \(d{({a}_{1},{a}_{2})}^{2}\). Employing results on displacement convexity along geodesics in P(M), a connection between an entropy functional defined on P(M) and the Ricci curvature of (M, d, m) can be proposed.
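Formally, using the notation of Sturm 2006^{40}, we define a relative entropy functional with respect to m on P(M) via:
\[{\rm{Ent}}(\nu \,|\,m)=\int_{M}\rho \log \rho \,dm\]
if ν = ρm is absolutely continuous with respect to m, and Ent(ν∣m) = +∞ otherwise.
It has been proposed (based on results for Riemannian manifolds^{48}) that (M, d, m) has Ricci curvature bounded below by \(K\in {\mathbb{R}}\) if and only if, for any ν_{0}, ν_{1} ∈ P(M), where Ent(ν_{0}∣m), Ent(ν_{1}∣m) < ∞, there exists a geodesic γ: [0, 1] → P(M), where γ(0) = ν_{0} and γ(1) = ν_{1} such that:
\[{\rm{Ent}}(\gamma (t)\,|\,m)\le (1-t)\,{\rm{Ent}}({\nu }_{0}\,|\,m)+t\,{\rm{Ent}}({\nu }_{1}\,|\,m)-\frac{K}{2}\,t(1-t)\,{W}_{2}{({\nu }_{0},{\nu }_{1})}^{2}\quad {\rm{for}}\,{\rm{all}}\ t\in [0,1].\]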
Sandhu et al.^{19} use this statement to infer a positive correlation between an entropy defined as the negative of Ent( ⋅ ∣m) and the Ricci curvature of (M, d, m).
In our setting of networks, there is no unambiguous way to map to a metric-measure space. The definition of the space (M, d, m) could have many choices in terms of network topology as well as vertex and edge weights. Moreover, the definition of Forman-Ricci curvature applied to networks is again non-unique, depending on edge and vertex weights, and the validity of this curvature depends upon an interpretation of the network as a cell complex approximation to a Riemannian manifold. The definition of network entropy as an entropy rate is also not equivalent to the definition of Ent( ⋅ ∣m), and again the choice of m for the network setting is non-unique. Collectively this highlights a distinction between the network setting and metric-measure spaces: results in one setting cannot be expected to be valid in the other, in particular the correlation between entropy and curvature.
Statistics and reproducibility
The association between network entropy and total Forman-Ricci curvature in transcriptomic data sets was evaluated using Pearson’s correlation coefficient. The comparison between network entropy and total Forman-Ricci curvature in cancerous and healthy single cells was evaluated using two-tailed Wilcoxon tests. The association between closest pass Ricci flow iteration/straight line trajectory iteration and true differentiation time was evaluated using Pearson’s correlation coefficient. No statistical method was used to predetermine the sample size. No data were excluded from the analyses. The experiments were not randomised. The investigators were not blinded to allocation during experiments and outcome assessment.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All relevant data supporting the key findings of this study are available within the article. The normalised read count data corresponding to the RNA-sequencing used in this study are available in the GEO database^{49} under the following accession codes. The data describing scRNA-seq of 1018 single cells assayed at different stages of multipotency, alongside data describing 758 single cells assayed at 6 distinct time points during ESC differentiation^{42}, are available in the GEO database under accession code GSE75748. The data describing scRNA-seq of 1257 malignant and 3256 healthy single cells from 19 patients with malignant melanoma^{43} are available in the GEO database under accession code GSE72056. The data describing scRNA-seq of 272 malignant and 160 healthy cells from patients with colorectal cancer^{44} are available in the GEO database under accession code GSE81861. Our data set describing healthy myoblast differentiation at 8 distinct time points^{45} is available in the GEO database under accession codes GSE102812 and GSE123468. Source data are provided with this paper.
Code availability
The R code developed for the analysis presented in the paper is accessible in the following GitHub repository: https://github.com/anthbapt/CellulardifferentiationtrajectorieswithRicciflow and the version of the code used is deposited in Zenodo at https://doi.org/10.5281/zenodo.10469562^{50}.
References
Hanahan, D. Hallmarks of cancer: new dimensions. Cancer Discov. 12, 31–46 (2022).
Waddington, C. H. An Introduction to Modern Genetics. (George Allen & Unwin, London, 1939).
MacArthur, B. D., Ma’ayan, A. & Lemischka, I. R. Systems biology of stem cell fate and cellular reprogramming. Nat. Rev. Mol. Cell Biol. 10, 672–681 (2009).
MacArthur, B. D., Ma’ayan, A. & Lemischka, I. R. Toward stem cell systems biology: from molecules to networks and landscapes. Cold Spring Harb. Symp. Quant. Biol. 73, 211–215 (2008).
Wang, J., Zhang, K., Xu, L. & Wang, E. Quantifying the Waddington landscape and biological paths for development and differentiation. Proc. Natl Acad. Sci. USA 108, 8257–8262 (2011).
Ferrell, J. E. Bistability, bifurcations, and Waddington’s epigenetic landscape. Curr. Biol. 22, 458 (2012).
Sáez, M. et al. Statistically derived geometrical landscapes capture principles of decision-making dynamics during cell fate transitions. Cell Syst. 13, 12–283 (2022).
MacArthur, B. D. & Lemischka, I. R. Statistical mechanics of pluripotency. Cell 154, 484–489 (2013).
Banerji, C. R. S. et al. Cellular network entropy as the energy potential in Waddington’s differentiation landscape. Sci. Rep. 3, 3039 (2013).
Banerji, C. R. S., Severini, S., Caldas, C. & Teschendorff, A. E. Intratumour signalling entropy determines clinical outcome in breast and lung cancer. PLoS Comput. Biol. 11, 1–23 (2015).
Teschendorff, A. E. & Enver, T. Single-cell entropy for accurate estimation of differentiation potency from a cell’s transcriptome. Nat. Commun. 8, 1–15 (2017).
MacArthur, B. D. The geometry of cell fate. Cell Syst. 13, 1–3 (2022).
Rand, D. A., Raju, A., Sáez, M., Corson, F. & Siggia, E. D. Geometry of gene regulatory dynamics. Proc. Natl Acad. Sci. USA 118, 2109729118 (2021).
Baptista, A., Sánchez-García, R. J., Baudot, A. & Bianconi, G. Zoo guide to network embedding. J. Phys. Complex. 4, 042001 (2023).
Ángeles Serrano, M., Boguñá, M. & Sagués, F. Uncovering the hidden geometry behind metabolic networks. Mol. BioSyst. 8, 843–850 (2012).
Zhou, Y. & Sharpee, T.O. Hyperbolic geometry of gene expression. iScience 24 https://doi.org/10.1016/J.ISCI.2021.102225 (2021).
Ollivier, Y. Ricci curvature of metric spaces. C. R. Math. 345, 643–646 (2007).
Forman, R. R. Bochner’s method for cell complexes and combinatorial Ricci curvature. Discrete Comput. Geom. 29, 323–374 (2003).
Sandhu, R. et al. Graph curvature for differentiating cancer networks. Sci. Rep. 5, 1–13 (2015).
Samal, A. et al. Comparative analysis of two discretizations of Ricci curvature for complex networks. Sci. Rep. 8, 8650 (2018).
Pouryahya, M., Mathews, J. & Tannenbaum, A. Comparing three notions of discrete Ricci curvature on biological networks. https://doi.org/10.48550/ARXIV.1712.02943 (2017).
Murgas, K. A., Saucan, E. & Sandhu, R. Quantifying cellular pluripotency and pathway robustness through Forman-Ricci curvature, 616–628 https://doi.org/10.1007/978-3-030-93413-2_51 (2022).
Elkin, R. et al. Geometric network analysis provides prognostic information in patients with high grade serous carcinoma of the ovary treated with immune checkpoint inhibitors. NPJ Genom. Med. 6, 1–11 (2021).
Lott, J. & Villani, C. Ricci curvature for metric-measure spaces via optimal transport. Ann. Math. 169, 903–991 (2009).
Murgas, K. A., Saucan, E. & Sandhu, R. Hypergraph geometry reflects higher-order dynamics in protein interaction networks. Sci. Rep. 12, 1–12 (2022).
Hamilton, R. S. The Ricci flow on surfaces. Contemp. Math. 71, 237–262 (1988).
Perelman, G. The entropy formula for the Ricci flow and its geometric applications. https://arxiv.org/abs/math/0211159 (2002).
Perelman, G. Ricci flow with surgery on three-manifolds. https://arxiv.org/abs/math/0303109 (2003).
Zhang, M., Zeng, W., Guo, R., Luo, F. & Gu, X. D. Survey on discrete surface Ricci flow. J. Comput. Sci. Technol. 30, 598–613 (2015).
Weber, M., Jost, J. & Saucan, E. Forman-Ricci flow for change detection in large dynamic data sets. Axioms 5, 26 (2016).
Weber, M., Saucan, E. & Jost, J. Characterizing complex networks with Forman-Ricci curvature and associated geometric flows. J. Complex Netw. 5, 527–550 (2017).
Cohen, H. et al. Object-based dynamics: applying Forman-Ricci flow on a multigraph to assess the impact of an object on the network structure. Axioms 11, 486 (2022).
Ni, C.-C., Lin, Y.-Y., Gao, J. & Gu, X. in Graph Drawing and Network Visualization (eds Biedl, T., Kerren, A.) 447–462 (Springer, Cham, 2018).
Ni, C.-C., Lin, Y.-Y., Luo, F. & Gao, J. Community detection on networks with Ricci flow. Sci. Rep. 9, 9984 (2019).
Sia, J., Jonckheere, E. & Bogdan, P. Ollivier-Ricci curvature-based method to community detection in complex networks. Sci. Rep. 9, 9800 (2019).
Lai, X., Bai, S. & Lin, Y. Normalized discrete Ricci flow used in community detection. Phys. A Stat. Mech. Appl. 597, 127251 (2022).
Sia, J., Zhang, W., Jonckheere, E., Cook, D. & Bogdan, P. Inferring functional communities from partially observed biological networks exploiting geometric topology and side information. Sci. Rep. 12, 10883 (2022).
Znaidi, M. R. et al. A unified approach of detecting phase transition in time-varying complex networks. Sci. Rep. 13, 17948 (2023).
West, J., Bianconi, G., Severini, S. & Teschendorff, A. E. Differential network entropy reveals cancer system hallmarks. Sci. Rep. 2, 802 (2012).
Sturm, K. T. On the geometry of metric measure spaces. Acta Math. 196, 65–131 (2006).
Pouryahya, M., Mathews, J. & Tannenbaum, A. Comparing three notions of discrete Ricci curvature on biological networks. https://arxiv.org/abs/1712.02943 (2017).
Chu, L.-F. et al. Single-cell RNA-seq reveals novel regulators of human embryonic stem cell differentiation to definitive endoderm. Genome Biol. 17, 1–20 (2016).
Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).
Li, H. & Courtois, E. T. et al. Reference component analysis of single-cell transcriptomes elucidates cellular heterogeneity in human colorectal tumors. Nat. Genet. 49, 708–718 (2017).
Banerji, C. R. S. et al. Dynamic transcriptomic analysis reveals suppression of PGC1α/ERRα drives perturbed myogenesis in facioscapulohumeral muscular dystrophy. Hum. Mol. Genet. 28, 1244–1259 (2018).
Boguñá, M. et al. Network geometry. Nat. Rev. Phys. 3, 114–135 (2021).
Chow, B. & Luo, F. Combinatorial Ricci flows on surfaces. J. Differ. Geom. 63, 97–129 (2003).
Sturm, K.T. Convex functionals of probability measures and nonlinear diffusions on manifolds. J. Math. Pures Appl. 84, 149–168 (2005).
Barrett, T. et al. Ncbi geo: archive for functional genomics data sets—update. Nucl. Acids Res. 41, 991–995 (2012).
Baptista, A., MacArthur, B. D. & Banerji, C.R.S. Charting cellular differentiation trajectories with Ricci flow. Zenodo https://doi.org/10.5281/zenodo.10469562 (2023).
Acknowledgements
All authors gratefully acknowledge funding from the TuringRoche Strategic Partnership, and Prof. Ginestra Bianconi for interesting discussions.
Author information
Contributions
C.R.S.B. and B.D.M. designed research; A.B. and C.R.S.B. performed research; A.B. and C.R.S.B. analysed data; C.R.S.B. created numerical code, with contributions from A.B.; A.B., B.D.M., and C.R.S.B. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Paul Bogdan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Baptista, A., MacArthur, B.D. & Banerji, C.R.S. Charting cellular differentiation trajectories with Ricci flow. Nat Commun 15, 2258 (2024). https://doi.org/10.1038/s41467-024-45889-6