# Cellular network entropy as the energy potential in Waddington's differentiation landscape

## Abstract

Differentiation is a key cellular process in normal tissue development that is significantly altered in cancer. Although molecular signatures characterising pluripotency and multipotency exist, there is, as yet, no single quantitative mark of a cellular sample's position in the global differentiation hierarchy. Here we adopt a systems view and consider the sample's network entropy, a measure of signaling pathway promiscuity, computable from a sample's genome-wide expression profile. We demonstrate that network entropy provides a quantitative, in-silico readout of the average undifferentiated state of the profiled cells, recapitulating the known hierarchy of pluripotent, multipotent and differentiated cell types. Network entropy further exhibits dynamic changes in time course differentiation data, in line with a sample's differentiation stage. In disease, network entropy predicts a higher level of cellular plasticity in cancer stem cell populations compared to ordinary cancer cells. Importantly, network entropy also allows identification of key differentiation pathways. Our results are consistent with the view that pluripotency is a statistical property defined at the cellular population level, correlating with intra-sample heterogeneity, and driven by the degree of signaling promiscuity in cells. In summary, network entropy provides a quantitative measure of a cell's undifferentiated state, defining its elevation in Waddington's landscape.

## Introduction

The observed diversity of mature cells and human tissues arises as a result of a complex, intricate program of cellular differentiation, ultimately originating from (pluripotent) embryonic stem cells1. Although systems biology principles underpinning the transitions between specific cellular states, such as pluripotency and progenitor states, are in the process of being elucidated2,3,4,5, much remains to be learned. In the case of hematopoiesis, one of the best understood developmental systems, the full repertoire of transcription factors and signaling pathways dictating cell-fate is still unknown5,6,7,8,9. Other studies have focused on characterising the pluripotent and progenitor states in terms of genome-wide gene expression10,11,12,13,14,15, DNA methylation and chromatin state profiles16,17,18,19,20. Although these molecular signatures can discriminate cells of specific differentiation stages from each other, there is, as yet, no single quantitative measure that can correctly place a sample within the global differentiation hierarchy. Rephrased in the context of Waddington's differentiation landscape21, we do not yet have a molecular measure that can represent the energy potential, i.e. the elevation, in Waddington's landscape.

Recently, it has been proposed that pluripotency, and more generally, the undifferentiated state, is an emergent statistical property of a population of cells22,23,24, not well-defined at the single-cell level. Specifically, it has been argued that high cellular diversity underpins the pluripotent or multipotent capacity of stem cell populations, with differentiated cell populations representing a more uniform synchronised state22. Motivated by this, we here explore a systems property of a cellular sample, called network entropy, in the context of cellular differentiation. At the single-cell level network entropy can be thought of as an approximate measure of signaling pathway promiscuity22,25,26,27. Thus, a highly undifferentiated cell, such as a pluripotent stem cell, would have a high network entropy since it must maintain the option to initiate the activation of a wide range of different signaling pathways associated with commitment to diverse cell fates6. In contrast, a terminally differentiated cell would have a low network entropy, since it must maintain activation of only a few pathways specific to its fate. At the population level, high network entropy would thus imply increased cellular heterogeneity, since the increased signaling promiscuity results in an increased stochasticity across single cells. Thus, we posited that network entropy would provide a direct molecular correlate of the undifferentiated state of a cellular sample, allowing us to place an arbitrary sample at its appropriate elevation in Waddington's landscape.

To test our hypothesis, we here compute sample-specific network entropies for a large number of gene expression data sets relevant to cellular differentiation, reprogramming and cancer, encompassing over 800 samples, including cell-lines and primary tissue. Our main findings are: (i) network entropy is a highly accurate discriminator of pluripotent and non-pluripotent cell-types, (ii) it can further discriminate cellular states of varying degrees of multipotency within distinct lineages, (iii) it provides a more robust and general measure of a cell's position in the global differentiation hierarchy than gene expression signatures, and does so independently of cell proliferation, and (iv) it predicts a higher cellular heterogeneity in cancer stem cells compared to ordinary cancer cells.

## Results

### Construction and rationale of network entropy

Computing network entropy requires estimating the signaling/interaction probabilities of proteins in a given sample. We therefore integrated the gene expression profile of a given sample with a comprehensive protein interaction signaling network (PIN) (see SI28), using the mass-action principle to construct a sample-specific stochastic matrix $p_{ij}$, where $i$ and $j$ label two distinct genes. The stochastic matrix provides a rough proxy for the interaction probabilities present in the given sample, and its construction is based on the assumption that two genes known to interact at the protein level will have a greater interaction probability when they are both highly expressed (see SI). From the stochastic matrix, the network entropy can be calculated as the entropy rate29,30

$$S_R = \sum_i \pi_i S_i,$$

where $S_i$ is the local entropy of node (gene/protein) $i$ and $\pi_i$ is the $i$th element of the stationary distribution of $p_{ij}$ (i.e. $\pi p = \pi$, see Methods, SI). Thus, the entropy rate gives a steady-state average measure of the uncertainty (or promiscuity) in signaling information flow over the network. To facilitate comparison of entropy rates obtained from samples profiled on different expression arrays, values were always normalised to the maximum possible entropy rate of a given integrated network (SI, fig. S1).

We posited that the entropy rate of a sample (e.g. a cell-line), as computed above, would capture the average level of signaling pathway promiscuity and hence of the cellular heterogeneity in the sample. Under this model, highly undifferentiated and plastic cells, such as stem cells, would be characterised by a state of high network entropy, allowing them the option to differentiate into diverse cell lineages (Fig. 1A). Similarly, since differentiation implicates activation of specific molecular signaling pathways, this activation would lead to a reduction in the uncertainty/promiscuity of information flow, i.e. a low entropy state (Fig. 1A).

As a proof of concept that the entropy rate does indeed measure the level of signaling promiscuity we first devised a simulation model (SI). We compared the entropy rate of our PIN with weights defined by a uniform stochastic matrix (i.e. one with $p_{ij} = 1/k_i$, where $k_i$ is the degree of node $i$), representing a promiscuous poised state, to the entropy rate obtained by randomly activating individual genes and specific signal transduction pathways in the network (SI, Fig. 1B). When individual genes were activated, the global entropy rate was reduced in approximately 70% of perturbations (Binomial $P < 0.001$, Fig. 1B). When whole signaling pathways were activated, the entropy rate was reduced in 85% of cases (Binomial $P < 10^{-10}$, Fig. 1B), consistent with a substantially lower uncertainty in the information flow.
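The comparison can be illustrated on a toy network. The following minimal Python sketch is not the authors' simulation model (which is described in the SI): the fully connected four-gene network, the expression values and the function names are all assumptions made for illustration. It builds the mass-action stochastic matrix, finds the stationary distribution by power iteration, and compares the entropy rate of the uniform state with the entropy rate after the expression of one gene is raised.

```python
import math

def stochastic_matrix(adj, expr):
    """Mass-action weights: p_ij = E_j / (sum of E_k over neighbours k of i)."""
    n = len(adj)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        total = sum(expr[k] for k in range(n) if adj[i][k])
        for j in range(n):
            if adj[i][j]:
                p[i][j] = expr[j] / total
    return p

def stationary_distribution(p, tol=1e-12, max_iter=100000):
    """Power iteration for pi p = pi; assumes a connected, aperiodic network."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

def entropy_rate(adj, expr):
    """S_R = sum_i pi_i S_i, with S_i the local entropy of row i."""
    p = stochastic_matrix(adj, expr)
    pi = stationary_distribution(p)
    local = [-sum(x * math.log(x) for x in row if x > 0) for row in p]
    return sum(w * s for w, s in zip(pi, local))

# Hypothetical fully connected 4-gene network (no self-loops).
adj = [[int(i != j) for j in range(4)] for i in range(4)]
sr_uniform = entropy_rate(adj, [1.0, 1.0, 1.0, 1.0])  # uniform state, p_ij = 1/k_i
sr_active = entropy_rate(adj, [8.0, 1.0, 1.0, 1.0])   # gene 0 "activated" (assumed 8-fold)
print(sr_uniform, sr_active)
```

On this regular toy network the uniform walk already attains the maximum entropy rate, so any activation can only lower it. On a real PIN a reduction is not guaranteed, which is in line with the roughly 70% of single-gene perturbations reported above.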

### Network entropy quantifies the level of multipotency

Based on the simulation results, we sought to determine if network entropy could discriminate biological samples that differ in terms of their signaling promiscuity. Thus, we computed the network entropy rate of samples in the “stem cell matrix” (SCM), a compendium of 219 samples (mostly cell-lines), all profiled with the same Illumina arrays, 59 of which were deemed pluripotent, with the rest (160) deemed non-pluripotent11. We observed that network entropy was significantly higher in the cell-lines deemed pluripotent ($P < 10^{-10}$, Fig. 2A). To provide an independent benchmark we also computed a t-test based pluripotency score (TPSC, SI), constructed from an independent 19-gene pluripotency expression signature, containing important pluripotency markers such as NANOG and LIN28A12. The TPSC was also significantly higher in the pluripotent cell lines (SI, fig. S2), and both measures were significantly correlated, confirming that network entropy is indeed a marker of pluripotency (Fig. 2B). In an independently generated data set profiling 107 human embryonic and 52 induced pluripotent stem cell lines, as well as 32 differentiated tissue samples31, the entropy rate achieved 100% accuracy in discriminating pluripotent from differentiated samples (Figs. 2C–D). Crucially, all these results were independent of cell proliferation, as we verified by removing cell proliferation and cycling genes32 from the network (see SI, figs. S3–S4). Furthermore, passage number and sex did not have noticeable effects on the entropy rate as assessed in 107 human embryonic stem cell (hESC) lines (SI, fig. S5). Consistent with network entropy being a marker of pluripotency, we observed that induced pluripotent stem cell samples exhibited high entropy values, similar to those of hESCs, and significantly higher than those of their parental differentiated cells ($P < 0.0001$, SI, figs. S6–S7).
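The text does not state how discrimination accuracy was computed. One common way to quantify how well a single score separates two groups is the area under the ROC curve (AUC), sketched below in Python. The function name and the entropy values are hypothetical and serve only as an illustration.

```python
def auc(positives, negatives):
    """Fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for x in positives:
        for y in negatives:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical normalised entropy rates, for illustration only.
pluripotent = [0.93, 0.95, 0.94, 0.96]
differentiated = [0.88, 0.90, 0.87]
print(auc(pluripotent, differentiated))  # 1.0: every pluripotent value exceeds every differentiated value
```

An AUC of 1 corresponds to perfect separation of the two groups by a single threshold, and a value of 0.5 to no discrimination.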

Next, we compared the network entropy of hESCs to that of committed but multipotent cell types, including neural stem cells (NSCs), hematopoietic stem cells (HSCs) and mesenchymal stem cells (MSCs). Confirming our hypothesis, all of these stem cell types exhibited entropies which were lower than that of hESCs/iPSCs, but higher than their differentiated progeny (Fig. 3A, SI, figs. S8–S9). Thus, network entropy can discriminate cells within a lineage according to their differentiation status. To test this further, in a combined haematological data set33, encompassing a number of different blood cell types including differentiated types (e.g. monocytes), and less differentiated ones (e.g. CD34+ HSCs and erythroblasts/megakaryocytes), network entropy recapitulated a differentiation hierarchy consistent with prior knowledge34,35 (see SI, fig. S10). Importantly, we observed that network entropy was a relatively robust measure, being fairly insensitive to the normalisation or platform used (SI, figs. S8–S11), although in the case of MSCs biological variations were evident (SI, fig. S8)36,37.

### Network entropy is reduced during differentiation

If network entropy is a general measure of the undifferentiated state of cells, it ought to exhibit dynamic changes in time course differentiation data. To this end, we considered expression data of differentiated retinal pigment epithelial cells, which were induced to de-differentiate, followed by a period of re-differentiation (SI). Remarkably, network entropy increased upon de-differentiation, reaching a maximum, with values subsequently dropping upon re-differentiation (Fig. 3B). As another example, we considered a time course data set consisting of human promyelocytic leukemia progenitor (HL60) cells, differentiating into neutrophils38. There were two separate time courses, using distinct stimuli to induce differentiation of HL60 cells. In both cases, network entropy was significantly reduced with time (ATRA stimulus, $R^2 = 0.96$, $P < 10^{-8}$, Fig. 3B). Once again, these dynamic changes were independent of cell-proliferation (SI, fig. S12).
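A trend of entropy against time, summarised by a slope and an $R^2$ value, can be obtained from an ordinary least-squares fit. The Python sketch below assumes a simple linear model of entropy rate against time. The time points and entropy values are hypothetical and are not taken from the HL60 data.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return intercept, slope, r2

# Hypothetical time points (hours) and normalised entropy rates.
hours = [0, 24, 48, 72, 96]
entropy = [0.950, 0.946, 0.941, 0.938, 0.933]
intercept, slope, r2 = linear_fit(hours, entropy)
print(slope, r2)  # negative slope: entropy falls as differentiation proceeds
```

A negative slope with a high $R^2$ indicates a steady reduction of entropy over the time course. Assessing its significance would additionally require a test on the slope, which is omitted here.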

### Network entropy discriminates cancer stem cells, cancer and normal cells

Differentiation status is a key feature distinguishing cancer from normal cells, with cancer representing a less differentiated and more heterogeneous state. Confirming this, network entropy was consistently higher in cancer tissue compared to normal cells, across four different tissue types, with cancer cell-lines exhibiting even higher values (Fig. 4A). We further analysed an expression data set profiling putative cancer stem cells (CSCs) and their parental cancers across a number of different tissues39. This showed that CSCs exhibited a marginally higher network entropy than their non-stem like counterparts, consistent with the view that CSCs retain a higher level of plasticity (Fig. 4B).

Interestingly, comparing the network entropy of hESCs to that of teratocarcinomas and germ cell tumours (all from the SCM and all deemed pluripotent) revealed marginally higher values in the hESCs (SI, fig. S13). This pattern of higher network entropy in normal stem cells was also seen in the non-pluripotent context: for instance, the network entropy of HSCs and NSCs was, in general, higher than that of leukemic stem and glioma stem cells, respectively (SI, figs. S13–S14). Thus, while CSCs and ordinary cancer cells exhibit significantly increased cellular heterogeneity compared to normal differentiated tissue, CSCs do not appear to exhibit higher values relative to their normal stem cell counterparts, and even appear to show reduced levels of entropy compared to normal stem cells.

### Dynamic changes in local network entropy identify key differentiation genes and pathways

To demonstrate that the dynamic changes in entropy can be related to changes in activation of specific pathways, we considered, as a proof of principle, the case of Notch signaling. Notch signaling is inactive yet inducible in the pluripotent state, with activation normally associated with differentiation40,41,42,43,44,45. Thus, essential components of the Notch signaling pathway should exhibit a lower network entropy in the non-pluripotent state. Using data from the stem cell matrix11, we were able to confirm this for 12 of the 13 Notch pathway genes (SI, figs. S15–S16). To assess the statistical significance of this, we drew 10000 random sets of 13 genes from the PIN. None of them showed the same level of consistency and statistical significance as the Notch pathway genes (P < 0.0001), indicating that reduced entropy of the Notch pathway is a key feature of the non-pluripotent state (SI, fig. S17). It is also important to demonstrate that the interactors driving the lower entropy of Notch genes are other Notch-pathway genes. For many Notch genes (e.g. NOTCH2, NOTCH3, DLL1, JAG1, PSENEN, APH1A, APH1B) this was indeed the case, despite the fact that there were also many non-Notch pathway interactors present (SI, figs. S16,S18).
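The random gene-set comparison can be sketched as an empirical P-value calculation. The text does not specify the statistic used for a gene set (details are in the SI), so the Python sketch below assumes a hypothetical per-gene score and takes the mean score of a set as its statistic. The function names, the scores and the observed value are all illustrative.

```python
import random

def empirical_p(observed, null_stats):
    """One-sided empirical P-value with the usual +1 correction."""
    hits = sum(1 for s in null_stats if s >= observed)
    return (hits + 1) / (len(null_stats) + 1)

def random_set_null(gene_scores, set_size, n_draws, seed=0):
    """Mean score of n_draws random gene sets of the given size."""
    rng = random.Random(seed)
    genes = sorted(gene_scores)
    null = []
    for _ in range(n_draws):
        picked = rng.sample(genes, set_size)
        null.append(sum(gene_scores[g] for g in picked) / set_size)
    return null

# Hypothetical per-gene scores in [-0.5, 0.5), e.g. a measure of entropy reduction.
scores = {f"g{i}": ((i * 37) % 101) / 101 - 0.5 for i in range(200)}
null = random_set_null(scores, set_size=13, n_draws=999)
observed = 2.0  # hypothetical pathway statistic, chosen to exceed every random-set mean
print(empirical_p(observed, null))  # 0.001, i.e. 1 / (999 + 1)
```

With 10000 draws and no random set matching the observed statistic, the same calculation gives 1/10001, consistent with the bound P < 0.0001 quoted above.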

To further test the added value of local network entropy, we revisited the HL60 to neutrophil time course data. Using linear regressions we identified the genes showing the most significant decreases and increases in network entropy. Ranking genes by the size of their reduction in network entropy and performing a subsequent Gene Set Enrichment Analysis (GSEA), we identified JAK-STAT signalling as one of the key pathways (SI, figs. S19–S20). The involvement of this pathway is heavily supported by previous studies46,47,48,49. Attesting to the statistical significance of the JAK-STAT result, entropies computed after randomly permuting the gene expression profiles over the nodes in the network yielded no significantly enriched biological terms (adjusted P-values > 0.05). This is an important result because it shows that the dynamic network entropy changes inferred from the integrated PIN do indeed target specific signaling pathways. Finally, non-network based approaches did not identify the JAK-STAT pathway (SI, fig. S19).

## Discussion

Here we have taken a systems analysis view of cellular differentiation, proposing that network entropy is inversely correlated with the differentiation status of a sample. By computing the network entropy of over 800 samples, encompassing cell types from many diverse cell-lineages and differentiation stages, and profiled using a variety of different microarray platforms, we have demonstrated that entropy provides a near-absolute quantitative measure of the undifferentiated state of any given sample.

In the context of normal physiology, hESCs and other pluripotent cell types were correctly predicted to exhibit the highest levels of network entropy, followed by multipotent stem cells (e.g. NSC/HSC/MSC), with terminally differentiated cells exhibiting significantly lower entropy (Fig. 5). In the context of cancer, CSCs exhibited higher levels of cellular entropy than ordinary cancer cells, although this difference appears substantially reduced in comparison to what is observed between normal stem cells and their differentiated progeny (Fig. 5). Cancer cell lines exhibited a higher entropy than primary cancers, with cancer tissue possessing higher values than normal tissue (Fig. 5). All these findings are consistent with network entropy being a direct measure of the average intrasample cellular heterogeneity, supporting the view that cellular states such as pluripotency are a statistical property of a cell population6,22. Indeed, although we have not analysed genome-wide single-cell expression data, it is highly plausible that the degree of cellular heterogeneity is determined by the level of signaling promiscuity, and hence stochasticity, in single cells6,22. The observation that cancer stem cells exhibit a high but marginally lower network entropy than their normal stem cell counterparts is also consistent with the view that CSCs must be characterised by oncogenic pathway dependencies, which, as shown in a previous study, lead to a lowering of network entropy26. Local entropy analyses aimed at identifying the specific oncogenic pathways driving the lower entropy in CSCs could thus offer novel therapeutic opportunities26.

It is important to stress again that network entropy provides a very general system's measure of the undifferentiated state of a sample. In this regard, we remark that reported pluripotency expression signatures12,15, which lack a systems-level interpretation and understanding, could only consistently discriminate pluripotent from non-pluripotent cell types, but generally failed to discriminate cell types located further down the differentiation hierarchy, irrespective of normal or cancer physiology (SI, figs. S21–S27). Thus, the fact that network entropy provides a more refined classification of the distinct cell types across the global differentiation hierarchy, and that it did so independently of cell-proliferation indices, attests to the biological importance of this measure and of the statistical mechanical framework on which it is based.

Although we observed some variation in entropy rates between studies profiling the same cell types using the same technology, it is nevertheless also important to note that these variations were in general small and that network entropy provided a relatively robust measure of the undifferentiated cellular state: for instance, hESCs always showed the highest levels of network entropy, irrespective of study or platform. This robustness stems from two key features. First, network entropy is a self-calibrating measure, as it is constructed by taking ratios of gene expression intensity values. This makes it a dimensionless quantity and fairly insensitive to the microarray or normalisation method used, unlike the scores derived from pluripotency signatures which showed significant variations between studies (see SI, fig. S28). Second, network entropy is not affected by overfitting since it is a quantity which does not depend on feature selection. Thus, unlike pluripotency expression signatures12,14,15, network entropy does not depend on tunable parameters. It follows that network entropy could provide a simple, general and robust quantitative test for assessing the pluripotency or multipotency of a cellular sample. For instance, it could be used to assess the quality of iPSCs in reprogramming experiments or even to identify mislabeled samples.

Since a sample's network entropy is computed from integration of its genome-wide expression profile with a protein interaction network, it is important to also comment on the robustness of the results in relation to the network, and more importantly on the number of genes that are measured. Considering the HL60 differentiation time course data set as a test case, we observed that randomly subsampling from the underlying integrated network and recomputing the entropy rates for the resulting maximally connected components still resulted in significant decreases of the entropy rate with differentiation stage, as long as we subsampled at least 40% of genes in the network (SI, fig. S29). That the association between network entropy and differentiation stage is robust to subsampling indicates that the dynamic changes in entropy are driven by a subtle interplay between the gene expression changes and the topological properties of the nodes exhibiting these changes. We leave investigation of this and other aspects to a future study.
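The subsampling step can be sketched as follows. This is a minimal Python illustration under stated assumptions: "maximally connected component" is taken to mean the largest connected component of the subsampled network, and the toy network, the function names and the fixed random seed are hypothetical. The entropy rate would then be recomputed on the returned component.

```python
import random
from collections import deque

def largest_component(nodes, edges):
    """Return the node set of the largest connected component (breadth-first search)."""
    neighbours = {v: set() for v in nodes}
    for a, b in edges:
        if a in neighbours and b in neighbours:  # keep only edges between retained nodes
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, best = set(), set()
    for start in nodes:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbours[v]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def subsample_network(nodes, edges, fraction, seed=0):
    """Keep a random fraction of the genes, then return the largest connected component."""
    rng = random.Random(seed)
    keep = rng.sample(sorted(nodes), max(1, round(fraction * len(nodes))))
    return largest_component(keep, edges)

# Hypothetical toy network: a 5-gene path plus a separate 2-gene pair.
nodes = ["A", "B", "C", "D", "E", "X", "Y"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("X", "Y")]
print(sorted(largest_component(nodes, edges)))  # ['A', 'B', 'C', 'D', 'E']
print(sorted(subsample_network(nodes, edges, 0.4)))
```

Restricting to a connected component matters because the stationary distribution used in the entropy rate is only unique on a connected network.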

In summary, we have proposed a relatively simple, computable, systems property of a genome-wide expression profile, called network entropy, which provides an estimate of signaling promiscuity and cellular heterogeneity, and which correlates with the undifferentiated state of cells. Network entropy may thus serve as a quantitative in-silico proxy for a sample's differentiation potential in Waddington's epigenetic landscape.

## Methods

Full details of the data sets, interaction network and all statistical methods used are provided in SI. Below, we give a brief sketch of how network entropy is calculated.

### Construction of the sample specific stochastic matrix and network entropy rate

The sample-specific stochastic matrix is estimated by integrating the gene expression profile of the sample with a comprehensive protein interaction network. Specifically, we invoke the mass-action principle: let $E_i$ denote the normalised expression level of gene $i$ in a given sample. For a given neighbour $j \in N(i)$ (where $N(i)$ labels the neighbours of $i$ in the PIN), the mass-action principle states that the probability of interaction with $i$ is approximated by the product $E_i E_j$. Normalising this to ensure that $\sum_j p_{ij} = 1$, we get

$$p_{ij} = \frac{E_j}{\sum_{k \in N(i)} E_k}.$$

Clearly, if $j \notin N(i)$, then $p_{ij} = 0$. This then defines a sample-specific stochastic matrix. From this stochastic matrix one can then construct a local network entropy for each gene $i$ in the PIN, as

$$S_i = -\sum_{j \in N(i)} p_{ij} \log p_{ij},$$

which reflects the level of uncertainty or promiscuity in the local interaction probabilities around gene $i$. We note that the above expression for the local entropy is not normalised, so that the maximum possible entropy depends on the degree ($k_i$) of the node $i$. In fact, $\max S_i = \log k_i$. Thus, it is convenient to also define a normalised local entropy as (see25),

$$\tilde{S}_i = \frac{S_i}{\log k_i} = -\frac{1}{\log k_i} \sum_{j \in N(i)} p_{ij} \log p_{ij}.$$
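As a small worked illustration of the local entropy and its normalisation by $\log k_i$, consider a gene $i$ with $k_i = 2$ neighbours whose normalised expression levels are $E_1 = 1$ and $E_2 = 3$. These values are hypothetical, natural logarithms are used, and $\tilde{S}_i$ denotes the normalised local entropy:

```latex
\begin{align}
p_{i1} &= \frac{1}{1+3} = 0.25, \qquad p_{i2} = \frac{3}{1+3} = 0.75, \\
S_i &= -\left(0.25 \log 0.25 + 0.75 \log 0.75\right) \approx 0.562, \\
\tilde{S}_i &= \frac{S_i}{\log k_i} = \frac{0.562}{\log 2} \approx 0.811 .
\end{align}
```

If both neighbours were equally expressed, the probabilities would be $0.5$ each and the normalised local entropy would equal its maximum value of 1.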

We stress again that this local network entropy can be computed for each gene $i$ in each given sample. When defining a global network entropy (i.e. for the whole network) one can, in principle, consider the average of these normalised local entropies. This average, however, is a nonequilibrium entropy26, in contrast to the global entropy rate, $S_R$, which is defined in terms of the stationary distribution, $\pi$, of the stochastic matrix $p$, i.e. through $\pi p = \pi$. Specifically, the global entropy rate, $S_R$, is defined by29,30

$$S_R = \sum_i \pi_i S_i,$$

where $S_i$ are the unnormalised local entropies. We note that the network entropy rate is bounded between 0 and a positive maximum value that depends only on the adjacency matrix of the network50. Indeed, it can be shown that the maximum possible entropy rate is attained by a stochastic matrix, $p_{ij}$, defined by $p_{ij} = A_{ij} v_j / (\lambda v_i)$, where $A_{ij}$ is the adjacency matrix (i.e. unweighted) of the PIN, and $v$ and $\lambda$ are the dominant eigenvector and eigenvalue of this adjacency matrix, respectively. The maximum attainable entropy rate, $M_R$, will thus depend on the specifics of the network, including total number of genes, edges and topology. Thus, to facilitate comparison between networks, the network entropy rate, $S_R$, can be scaled relative to the maximum attainable value in that given network, i.e. $S_R / M_R$, so that the normalised entropy rate is always bounded between 0 and 1. In this work, all reported entropy rates have been normalised in this way.
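The calculation described in this section can be sketched end to end in a few lines of Python. This is a minimal illustration, not the authors' R implementation. The four-gene toy network, the expression values and the function names are assumptions. The maximum rate is taken to be $\log \lambda$, the standard value of the entropy rate for the maximum-entropy walk given above, and both the stationary distribution and the dominant eigenvalue are found by plain power iteration, which assumes a connected, non-bipartite network.

```python
import math

def entropy_rate(adj, expr, tol=1e-12, max_iter=100000):
    """S_R = sum_i pi_i S_i for the mass-action random walk on the network."""
    n = len(adj)
    p = []
    for i in range(n):
        total = sum(expr[k] for k in range(n) if adj[i][k])
        p.append([expr[j] / total if adj[i][j] else 0.0 for j in range(n)])
    pi = [1.0 / n] * n
    for _ in range(max_iter):  # power iteration for pi p = pi
        new = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        done = max(abs(a - b) for a, b in zip(new, pi)) < tol
        pi = new
        if done:
            break
    local = [-sum(x * math.log(x) for x in row if x > 0) for row in p]
    return sum(w * s for w, s in zip(pi, local))

def max_entropy_rate(adj, tol=1e-12, max_iter=100000):
    """M_R = log(lambda), with lambda the dominant eigenvalue of the adjacency matrix."""
    n = len(adj)
    v = [1.0] * n
    lam = 0.0
    for _ in range(max_iter):  # power iteration for A v = lambda v
        w = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = max(w)
        if abs(new_lam - lam) < tol:
            return math.log(new_lam)
        v, lam = [x / new_lam for x in w], new_lam
    return math.log(lam)

# Hypothetical toy PIN: a triangle (genes 0, 1, 2) with gene 3 attached to gene 2.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]
expr = [2.0, 1.0, 4.0, 0.5]  # hypothetical normalised expression levels
normalised = entropy_rate(adj, expr) / max_entropy_rate(adj)
print(normalised)  # lies between 0 and 1
```

For a network of several thousand genes, dense nested loops like these would be slow. A sparse matrix representation would be the natural choice in practice.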

We note that computation of the entropy rate is computationally intensive as it requires estimation of the stationary distribution of a large stochastic matrix. For a connected network of size 8290 nodes, computation of a sample's entropy rate takes ~ 10 minutes on a Dell Precision T5400 workstation. R-scripts performing the computations are freely available on request.

## References

1. Keller, G. Embryonic stem cell differentiation: emergence of a new era in biology and medicine. Genes Dev 19, 1129–1155 (2005).
2. Zhou, J. X., Brusch, L. & Huang, S. Predicting pancreas cell fate decisions and reprogramming with a hierarchical multi-attractor model. PLoS One 6, e14752 (2011).
3. Zhou, J. X. & Huang, S. Understanding gene circuits at cell-fate branch points for rational cell reprogramming. Trends Genet 27, 55–62 (2011).
4. Ladewig, J., Koch, P. & Bruestle, O. Leveling Waddington: the emergence of direct programming and the loss of cell fate hierarchies. Nat Rev Mol Cell Biol 14, 225–236 (2013).
5. Heinaniemi, M. et al. Gene-pair expression signatures reveal lineage control. Nat Methods 10, 577–583 (2013).
6. Chang, H. H., Hemberg, M., Barahona, M., Ingber, D. E. & Huang, S. Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature 453, 544–547 (2008).
7. Pina, C. et al. Inferring rules of lineage commitment in haematopoiesis. Nat Cell Biol 14, 287–294 (2012).
8. Kohn, L. A. et al. Lymphoid priming in human bone marrow begins before expression of CD10 with upregulation of L-selectin. Nat Immunol 13, 963–971 (2012).
9. Rodrigues, N. P., Tipping, A. J., Wang, Z. & Enver, T. GATA-2 mediated regulation of normal hematopoietic stem/progenitor cell function, myelodysplasia and myeloid leukemia. Int J Biochem Cell Biol 44, 457–460 (2012).
10. Chen, X. et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117 (2008).
11. Mueller, F. J. et al. Regulatory networks define phenotypic classes of human stem cell lines. Nature 455, 401–405 (2008).
12. Mikkelsen, T. S. et al. Dissecting direct reprogramming through integrative genomic analysis. Nature 454, 49–55 (2008).
13. Wong, D. J. et al. Module map of stem cell genes guides creation of epithelial cancer stem cells. Cell Stem Cell 2, 333–344 (2008).
14. Mueller, F. J. et al. A bioinformatic assay for pluripotency in human cells. Nat Methods 8, 315–317 (2011).
15. Palmer, N. P., Schmid, P. R., Berger, B. & Kohane, I. S. A gene expression profile of stem cell pluripotentiality and differentiation is conserved across diverse solid and hematopoietic cancers. Genome Biol 13, R71 (2012).
16. Meissner, A. et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454, 766–770 (2008).
17. Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
18. Zhu, J. et al. Genome-wide chromatin state transitions associated with developmental and environmental cues. Cell 152, 642–654 (2013).
19. Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian development. Nat Rev Genet 14, 204–220 (2013).
20. Bock, C. et al. DNA methylation dynamics during in vivo differentiation of blood and skin stem cells. Mol Cell 47, 633–647 (2012).
21. Waddington, C. H. The Strategy of the Genes: A Discussion of Some Aspects of Theoretical Biology (Allen & Unwin, London, 1957).
22. Macarthur, B. D. & Lemischka, I. R. Statistical mechanics of pluripotency. Cell 154, 484–489 (2013).
23. Furusawa, C. & Kaneko, K. Chaotic expression dynamics implies pluripotency: when theory and experiment meet. Biol Direct. 15 (2009).
24. Furusawa, C. & Kaneko, K. A dynamical systems view of stem cell biology. Science 338 (2012).
25. Teschendorff, A. E. & Severini, S. Increased entropy of signal transduction in the cancer metastasis phenotype. BMC Syst Biol 4, 104 (2010).
26. West, J., Bianconi, G., Severini, S. & Teschendorff, A. E. Differential network entropy reveals cancer system hallmarks. Sci Rep 2, 802 (2012).
27. Li, Y., Yi, M. & Zou, X. Identification of the molecular mechanisms for cell-fate selection in budding yeast through mathematical modeling. Biophys J. 104, 2282–94 (2013).
28. Cerami, E. G. et al. Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res 39, D685–D690 (2011).
29. Latora, V. & Baranger, M. Kolmogorov-Sinai entropy rate versus physical entropy. Phys Rev Lett 82 (1999).
30. Gomez-Gardenes, J. & Latora, V. Entropy rate of diffusion processes on complex networks. Phys Rev E Stat Nonlin Soft Matter Phys 78, 065102 (2008).
31. Nazor, K. L. et al. Recurrent variations in DNA methylation in human pluripotent stem cells and their differentiated derivatives. Cell Stem Cell 10, 620–634 (2012).
32. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet 40, 499–507 (2008).
33. Watkins, N. A. et al. A HaemAtlas: characterizing gene expression in differentiated human blood cells. Blood 113, e1–e9 (2009).
34. Goldfarb, A. N., Wong, D. & Racke, F. K. Induction of megakaryocytic differentiation in primary human erythroblasts: a physiological basis for leukemic lineage plasticity. Am J Pathol 158, 1191–1198 (2001).
35. Miranda-Saavedra, D. & Göttgens, B. Transcriptional regulatory networks in haematopoiesis. Curr Opin Genet Dev 18, 530–535 (2008).
36. Zipori, D. The stem state: mesenchymal plasticity as a paradigm. Curr Stem Cell Res Ther 1, 95–102 (2006).
37. Krinner, A., Hoffmann, M., Loeffler, M., Drasdo, D. & Galle, J. Individual fates of mesenchymal stem cells in vitro. BMC Syst Biol 4, 73 (2010).
38. Huang, S., Eichler, G., Bar-Yam, Y. & Ingber, D. E. Cell fates as high-dimensional attractor states of a complex gene regulatory network. Phys Rev Lett 94, 128701 (2005).
39. Yu, Y. H. et al. Network biology of tumor stem-like cells identified a regulatory role of CBX5 in lung cancer. Sci Rep 2, 584 (2012).
40. Noggle, S. A., Weiler, D. & Condie, B. G. Notch signaling is inactive but inducible in human embryonic stem cells. Stem Cells 24, 1646–1653 (2006).
41. Yu, K. et al. A precisely regulated gene expression cassette potently modulates metastasis and survival in multiple solid cancers. PLoS Genet 4, e1000129 (2008).
42. Meier-Stiegen, F. et al. Activated Notch1 target genes during embryonic cell differentiation depend on the cellular context and include lineage determinants and inhibitors. PLoS One 5, e11481 (2010).
43. Liu, J., Sato, C., Cerletti, M. & Wagers, A. Notch signaling in the regulation of stem cell self-renewal and differentiation. Curr Top Dev Biol 92, 367–409 (2010).
44. Bigas, A., D'Altri, T. & Espinosa, L. The Notch pathway in hematopoietic stem cells. Curr Top Microbiol Immunol 360, 1–18 (2012).
45. Blank, U., Karlsson, G. & Karlsson, S. Signaling pathways governing stem-cell fate. Blood 111, 492–503 (2008).
46. Minami, M. et al. STAT3 activation is a critical step in gp130-mediated terminal differentiation and growth arrest of a myeloid cell line. Proc Natl Acad Sci U S A 93, 3963–3966 (1996).
47. Caldenhoven, E. et al. Differential activation of functionally distinct STAT5 proteins by IL-5 and GM-CSF during eosinophil and neutrophil differentiation from human CD34+ hematopoietic stem cells. Stem Cells 16, 397–403 (1998).
48. Kanayasu-Toyoda, T., Yamaguchi, T., Uchida, E. & Hayakawa, T. Commitment of neutrophilic differentiation and proliferation of HL-60 cells coincides with expression of transferrin receptor. Effect of granulocyte colony stimulating factor on differentiation and proliferation. J Biol Chem 274, 25471–25480 (1999).
49. Coffer, P. J., Koenderman, L. & de Groot, R. P. The role of STATs in myeloid differentiation and leukemia. Oncogene 19, 2511–2522 (2000).
50. Demetrius, L. & Manke, T. Robustness and network evolution-an entropic principle. Physica A 346, 682–696 (2005).

## Acknowledgements

CRSB was supported by an EPSRC/BBSRC CoMPLEX PhD studentship. AET was supported by a Heller Research Fellowship. We would like to thank Roger Kramer for data collection tasks, and Alex Gutteridge for data pointers.

## Author information

### Contributions

Statistical analysis was performed by C.R.S.B. and A.E.T. Study was conceived by A.E.T., C.R.S.B. and J.Z. A.E.T. wrote the manuscript with contributions from C.R.S.B. D.M.S. contributed data. S.S. contributed funding. M.W. and T.E. contributed ideas.

### Corresponding author

Correspondence to Andrew E. Teschendorff.

## Ethics declarations

### Competing interests

The authors declare no competing financial interests.

## Supplementary information

### Supplementary Information

Supplementary Information (PDF 1852 kb)

## Rights and permissions

Banerji, C., Miranda-Saavedra, D., Severini, S. et al. Cellular network entropy as the energy potential in Waddington's differentiation landscape. Sci Rep 3, 3039 (2013). https://doi.org/10.1038/srep03039
