A multitude of biological processes ranging from genetic and epigenetic mutations, DNA damage, to complex intra- and intercellular signaling dynamics undoubtedly play key roles in triggering cancer in a given patient1,2,3,4,5,6. However, for many of these biological processes the various detailed biochemical reactions that take place are unknown. Similarly, the exact interplay between processes can also be ambiguous. Rather, it is the qualitative effect of varying a particular reactant or altering the environmental conditions in a systematic fashion that we observe, without necessarily understanding all of the underlying processes involved. For example, once formed, tumors seem to evolve in a fairly generic way: They either lie dormant, or grow, fed by the underlying network vasculature, capable of generating new vessels via angiogenesis when needed7. Generally, an absence of nutrients will tend to reduce growth, while sufficient supply leads to a progression in tumor cell behavior from differentiation and proliferation to migration7. Metastasis of cancer to lymph nodes and other organs, thought to be the most lethal aspect of the disease, likewise may depend on myriad patient-specific factors concerning the lymphatic system, immune response, micro-environmental factors and general patient health8. However, once again, the actual process is fairly generic – involving the spread of cancer cells from the primary tumor through the lymphatic and circulatory systems. Most fundamentally, at the heart of all these processes is the essential interplay between an evolving population of cancer cells which is fed by – and feeds back on – an underlying blood vessel network structure which supplies nutrients to the tumor and tissue, but simultaneously provides a transport network through which cancer cells can metastasize to other parts of the body and drugs are delivered to the tumor. 
Yet, the blood vessel structure is typically highly irregular in tumors and further complicated by the highly dynamic structural growth and degradation interplay with the evolving tumor mass, making an averaged description for modeling purposes insufficient. These factors clearly emphasize the importance of incorporating relevant network structures not only for tumor progression prognosis, but also for the analysis of effective treatments. For these reasons, to model the progression of tumor growth behavior it may be more productive and informative to implement universally observed and biologically derived, qualitative behavior in the model dynamics. Such qualitative mechanisms have proved to be useful in building models which correlate well with experimental findings, deepening our understanding of the basic underlying processes and making practical predictions possible9.

Early tumor models often resembled a theoretical exercise, either looking at averaged behavior whilst neglecting the importance of environmental heterogeneities at various length scales, or proving computationally too expensive due to the ambitiously detailed model setup and the sheer number of cells necessary to investigate long-term behavior9,10,11. All models are by design simplifications and approximations of the true biological system, based on assumptions. Cancer models, regardless of mathematical rigor and modeling complexity, are typically criticized as too simplistic for complex tumor-related phenomena9. Promisingly, however, a rapidly growing number of models have emerged from close collaboration between theoreticians, biologists, oncologists and clinicians, which has led to novel model predictions that were subsequently verified experimentally9,10,11,12,13,14,15,16.

We believe that the greatest shortcoming is the current lack of implementation of clinical images into models as initial conditions for patient specific prognosis9,10,11,12,13,14,15,16. Most are seeded with artificial initial conditions of cancer size, shape and density, as well as environmental parameters, struggling to combine the model with data gathered from clinical images17,18,19. Great advances in imaging techniques have enabled more and more accurate visualization of the problem zone, allowing for a wide range in length scales and time resolution, particularly at the molecular level. However, these have not been successfully implemented into tissue level cancer modeling mainly due to multi-scale compatibility issues. Indeed, the majority of models are inflexible to even the simplest extensions, modifications or re-scaling. A direct one-to-one mapping of all cells is unfeasible20,21,22,23,24,25,26,27,28,29, whilst modeling the global spatially-averaged behavior fails to describe important cellular and environmental heterogeneities in the system which may be particularly important during early tumor growth12,30,31,32,33,34,35,36,37,38,39.

Here we present a simple multi-scale model which addresses two fundamental issues. First, the uncertainties in the details of the biological processes are accounted for by describing behavior in local regions by a well established averaged behavior growth equation, whilst preserving the heterogeneities in each region; second, the ability to take advantage of invaluable data gathered from patient images, to be used as seed for the model for comparison and further development, taking an appropriately coarse-grained length scale for the model to adapt to the image resolution. We realize that many more details could be included in a model of primary cancer growth or metastasis. Many models and methodologies exist, ranging in length and time scale, capturing the biology of intra and intercellular signaling up to tissue level dynamics, each successfully mimicking some part of the complex emerging phenomena such as tumor growth, angiogenesis and metastasis. Yet each also comes with limitations. The primary purpose of the presented model is a complementary one to existing models, seeing how far one can go in explaining a wide range of clinical data using a simplified and minimal, adaptive and multi-scalable, yet most crucially, data driven approach to understand and predict a patient's highly personalized tumor progression from early growth through to metastasis and even treatment strategy analysis using a single model.

Our model seeded with in vivo data predicts robust growth corridors of future tumor growth behavior in good agreement with a murine mammary cell line EMT6-HER2 model in BALB/c mice, as well as reproducing clinical human patient data of metastasis. Moreover, the model predicts a hidden scaling relation between the underlying nutrient supplying vessel structure and cancer co-evolution, a finding which estimates a tumor's most likely evolution and more importantly, pinpoints specific vessel target sites to optimally control tumor growth.


In vivo data implementation in mathematical model

A mathematical model was developed to purposefully seed up-to-date patient information for a personalized prognosis and to bridge the gap between the length scale extremes of current mathematical modeling efforts (see Methods section for model description). A minimal-mechanism of a co-evolving nutrient network and cancer population is applied to the growth of a single tumor. Key to the individualized prognosis is the implementation of in vivo images as initial conditions. Later in the paper, we show how exactly the same multi-scale mathematical model can be applied at the level of systemic metastasis simply by making a change in the biological interpretation of its network features.

Figure 1 illustrates the methodology of extracting and coarse-graining information from immunofluorescent stained in vivo images whilst preserving the heterogeneity of the initial vessel density, and hence nutrient supply, of a tumor. The cellular activity inside local regions, created by the boxes of the imposed grid, is described by the biologically ubiquitous discrete logistic map. This map is a good approximation of the universally observed Gompertzian growth behavior of cancer40; it preserves the biological importance of the unit cell and accounts for the local cell-cell and cell-microenvironment interactions41,42,43. From a mathematical modeling point of view, previous applications of the logistic equation to cancer either apply it to the entire tumor44 or form a continuous-time spatial diffusion equation, which allows the unrealistic transfer of arbitrarily small amounts of cancer across space. The presented application is novel since the discretization into smaller regions, forming a grid of coupled autonomous logistic equations, allows universal growth behavior to be applied in each region using appropriate growth rates extracted from an image (regional vessel density), whilst the coupling allows inter-regional diffusion, migration and communication. Our model specifically accounts for the fact that cancer consists physically of discrete units (cells): there is a lower bound below which a continuous formulation of cell density becomes incorrect, yet above which the changing size and mass of cells render a continuous description valid.

Figure 1

Model setup.

Schematic of implementing in vivo immunofluorescent image data into the mathematical model as initial condition. The right half illustrates the step-by-step procedure of extracting and coarse graining information from the in vivo images. The tumor growth behavior in each box is modeled as Logistic growth and the model equations capture the fundamental interplay between an evolving population of cancer cells which is fed by – and feeds back on – an underlying nutrient network and its spreading through transport processes. The blue inset shows the time ordering of events at each time step of the mathematical model.

Experimental in vivo growth data fitting to model

To test the agreement of the model results to in vivo growth data, the model was seeded with in vivo images of muscle vasculature from the flanks of untreated BALB/c mice as initial condition, representing potential regions of primary tumor growth (implantation zone in mouse model). Figure 2 shows a growth corridor (blue shaded area) predicted from our model. The corridor is formed by the blue dashed line, which is the average growth curve of 2000 model simulations, where the tumor seed was virtually implanted at different locations on the image for each run and the solid blue lines are the standard deviation. Hence, Fig. 2 predicts the most likely growth behavior if a cancer were to originate somewhere in the environment depicted by the image. Assuming BALB/c mice generally have similar initial conditions (image in Fig. 2), the flanks of 5 mice were implanted with murine mammary cell line EMT6-HER2, whose growth data (dark blue points) fall inside the growth corridor with good agreement. Similar analysis was repeated with many more images of various other regions of the flanks in BALB/c mice confirming robust behavior in the predicted growth corridors.

Figure 2

Model substantiation.

Fit of in vivo experimental growth data to a growth corridor determined by the model seeded with an image of initial muscle vascular structure in BALB/c mice. The growth corridor (shaded region) is formed by the average growth curve of virtual tumor implantations (blue dashed line) and standard deviation (solid blue lines), showing good agreement with growth data. The yellow circle and box are representative regions with fast and slow growth curves respectively. The three insets show sample growth patterns of the virtual tumor with (a) necrotic core and proliferating ring46 (b) diffusive growth in nutrient rich environments47 (c) multiple source growth. The purple data points show growth data of EMT6-HER2 tumors treated with an endostatin-antibody fusion protein and the dashed purple line model results, where we mimicked the decreasing effect of the protein on the vasculature. The model used the same number of injections and time interval as in the in vivo experiments.

The yellow circle and box in Fig. 2 are representative regions with fast and slow growth curves respectively. The three insets, Fig. 2a–2c, show sample growth patterns of the virtual tumor. Interestingly, the distance between high density vessel sources is of vital importance (as analyzed in Fig. 3). In the absence of angiogenesis, should the maximum radius, l, to which a tumor can grow from a single source be smaller than the distance to the next vessel, d, then the tumor will remain a finite size and eventually starve and die, as shown in Fig. 2b. However, if l > d, a neighboring source can facilitate continued growth, as depicted in Fig. 2c.
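The l versus d argument can be illustrated with a small Python sketch. It is only an idealization under stated assumptions: sources are treated as points, the tumor is assumed to hop to any source closer than l to one it already occupies, and the density threshold effects are ignored. The function name `reachable_sources` is hypothetical and not part of the model.

```python
import math

def reachable_sources(sources, start, l):
    """Indices of vessel sources a tumor seeded at sources[start] can reach.

    ASSUMPTION: a source at distance d from an occupied source is reached
    whenever d < l, where l is the maximum radius sustained by one source.
    """
    reached = {start}
    frontier = [start]
    while frontier:
        i = frontier.pop()
        for j, point in enumerate(sources):
            # A new source is colonized only if it lies within radius l.
            if j not in reached and math.dist(sources[i], point) < l:
                reached.add(j)
                frontier.append(j)
    return reached

# Three clustered sources and one distant source (arbitrary units).
sources = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
isolated = reachable_sources(sources, 0, 0.5)   # l < d: growth stays at the seed
bridged = reachable_sources(sources, 0, 1.5)    # l > d: neighbors sustain growth
```

With l = 0.5 the tumor stays confined to its seed source, whereas with l = 1.5 it bridges the three clustered sources but never reaches the distant one.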

Figure 3

Growth behavior in finite source environments.

A finite number of sources may be due to anti-angiogenic treatment. For example, (a) is an immunofluorescent image of short, scattered vessels inside an EMT6-HER2 tumor treated with an endostatin-antibody fusion protein, (b) is the coarse-grained result for model implementation and (c) shows the remaining sources if a threshold is applied. Model results suggest that small clusters of cancer cells remaining around vessels can lead to more aggressive re-growth (inset of (c)). (d)–(g) illustrate the collapse of data points onto a linear relationship by accounting for the appropriate average distance between sources and the final radius of the tumor (see inset illustrations). The distance between sources is calculated for (d) all sources when maximally distributed, (e) all sources in the system at their actual positions, (f) sources inside the tumor and (g) sources inside the tumor, neglecting the sources on the perimeter of the tumor where the tumor cell density is too small to result in growth.

Systematic treatment strategy analysis

We also briefly illustrate the ability of the model to systematically analyze all possible treatment strategies (dosage, interval and frequency of each drug or treatment combination) in order to predict personalized treatment effectiveness. The dashed purple line in Fig. 2 shows that our model's predictions of treatment are also consistent with in vivo experiments. Results are shown for BALB/c mice implanted with cell line EMT6-HER2 and subsequently injected with the αHER2-huEndo fusion protein45, an endostatin-antibody fusion protein specifically engineered to target the HER2 receptor and limit the growth of adjacent blood vessels through the action of a fused anti-angiogenic endostatin domain. After measuring the biological effect of a single injection of the endostatin-antibody fusion protein on the tumor, we simulated the same number of injections and the same time interval as in the in vivo experiments, with good agreement. A systematic analysis of all possible treatment strategies of varying dosage, frequency and schedule will be presented elsewhere.

Universal growth behavior scaling from vessel location

Given the dynamic interplay of the growing tumor with the underlying vessel structure, Fig. 3 analyses the growth behavior in finite source environments (Fig. 3a–3c), where the distance between vessels, alluded to in Fig. 2a–2c, becomes important to the tumor's progression. For example, Fig. 3a shows an immunofluorescent image of vessels inside an EMT6-HER2 tumor that has been treated with an endostatin-antibody fusion protein45 resulting in a finite number of short, small-clustered and scattered vessels. Model results suggest that in cases where small clusters of cancer cells survive around remaining vessels (even after anti-angiogenic treatment) islands of re-growth can occur, as shown in the inset of Fig. 3c, leading to a more aggressive re-growth rate than before treatment. The heterogeneous nature of remaining vessel locations not only presents the problem of indefinite re-growth of cancer by movement beyond the finite radius each vessel can sustain individually, but also the additional challenge of optimizing and analyzing drug delivery strategies for efficacy and efficiency. Specifically, the vasculature in a tumor is highly irregular in structure creating regions completely void of vessels and regions densely packed with vessels. This implies that drug delivery will be highly disproportional not reaching all areas of the tumor48. Even with the advent of a genetically targeted approach where a drug is specifically designed for a patient, there still exists a need for delivery analysis locally in primary tumors as well as globally via metastatic spread. Our model is ideally suited to systematically analyze the effect of vascular structure on delivery, in addition to the countless possible multiple drug therapies48, to help optimize experimental design by taking into account the heterogeneities of the system which usually cause variation and hence unpredictability.

Much as in forest fires49 or nutrient source manipulation in conservation corridor analysis50, the distance between vessel sources is key in determining the most likely progression of a tumor. Hence, in Fig. 3, we identify a measure based on the distance between sources to predict the tumor's evolution and hence identify the key targets which allow control and limitation of the final tumor size. The results of Fig. 3g show that as long as the initial vasculature heterogeneity can be quantified, the diversity in final tumor size disappears under a universal scaling. The initial vasculature structure can be used to assess where a particular patient's tumor sits on this scaled curve, thereby providing a prediction of its final size.

Figures 3d–3g show the same data using different measures of the average distance between sources; each dot is one realization of the model simulations. Central to the universal scaling of Fig. 3g is identifying which sources to include, as illustrated in the insets of Figs. 3d–3g. In Fig. 3d, we calculated the average distance between all sources, assuming the sources to be maximally separated, and plotted it against the final radius of the tumor, r_max. Figure 3e uses the average distance between all sources at their actual positions within the system. Yet, as argued for Fig. 2a–2c, sources only become significant if their distance is smaller than the potential radius of the growing tumor. Hence, in Fig. 3f, only the distances between sources on or inside the final tumor boundary were included. This pushed the clusters of points below the red line of Fig. 3e closer to the red line, as indicated by the blue arrow. Finally, the scatter below the red line of Fig. 3f can be explained by cases where the growing tumor does reach another source, yet the cancer cell density pushed into it is below a critical threshold, too little to result in cell proliferation. Eliminating such cases resulted in the final plot, Fig. 3g.
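The difference between the measures of Figs. 3e and 3f can be made concrete with a short Python sketch. The text does not specify how the "average distance between sources" is computed, so the mean over all pairwise Euclidean distances used here is an assumption, as are the hypothetical helper names `mean_pairwise_distance` and `sources_inside`.

```python
import math
from itertools import combinations

def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of sources (ASSUMED measure)."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0  # fewer than two sources: no distance to average
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def sources_inside(points, center, radius):
    """Sources lying on or inside a circular tumor boundary."""
    return [p for p in points if math.dist(p, center) <= radius]

# Illustrative coordinates only; not taken from any image in the paper.
all_sources = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d_all = mean_pairwise_distance(all_sources)                  # analogue of Fig. 3e
inner = sources_inside(all_sources, center=(0.0, 0.0), radius=5.0)
d_inner = mean_pairwise_distance(inner)                      # analogue of Fig. 3f
```

Restricting the average to sources on or inside the boundary changes the measure from 20/3 to 5 in this toy configuration, which is the kind of shift indicated by the blue arrow.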

The results of Fig. 3 illustrate the important possibility of systematically targeting specific vessels. For example, in Fig. 3c, say a cancer seed originating from the three sources in the centre is predicted to result in a final tumor radius depicted by the red circle determined from Fig. 3f. Inside the radius is a fourth source, highlighted by the green arrow in Fig. 3c, which would facilitate further growth to a new radius. Hence, one could minimally target this single source (green arrow) to prevent further growth, rather than taking more invasive measures, and thus perhaps preserve functionality of the affected system. This analysis has a powerful consequence: it gives the surgeon an exact size of tumor to remove, or indicates which vessel sources to block, in order to control the final size of the tumor.

Multi-scalability of local model to predict global metastasis data

Finally, we explore the extendibility and multi-scalability of the model to the global phenomenon of metastasis. Metastasis is usually treated as an entirely separate topic in modeling since the underlying biology is different. However, as illustrated in Fig. 4, we successfully apply the same model equations to both single tumors and metastasis, simply by changing the interpretation of the terms: instead of cancer cells diffusing to neighboring boxes on a regular lattice representing free space for growth, the boxes represent lymph nodes and the underlying inter-box connections represent the lymphatic system. As shown in the lower panel of Fig. 4, the growth within each box is now a macro-level version of the single tumor model, in which the logistic growth map is applied to the entire space in which a tumor may grow. In other words, we simply apply exactly the same mathematical equations (Eqns. (1)–(3) in Methods) on a different scale and with a different network for diffusion (Fig. 4). As discussed in Ref. (8), cancer cells can spread to other organs at every time step from the beginning of the primary tumor's growth.

Figure 4

Model implementation and results of metastasis.

Metastasis uses the same model as for single tumor growth. The upper panel shows average cumulative distribution plotted for different underlying networks: random (blue solid) and scale-free (orange solid). The clinical data (red circles) lies somewhere between the two types of networks suggesting that the precise network structure does not matter to make a first-order prediction. The red dashed line is a fit to the clinical data by varying the r distribution51. Finally, the green dashed line shows the Poisson complementary cumulative distribution function with mean equal to the mean number of affected sites from the clinical data. It is the expected curve based on the assumption that nodes get infected independently (i.e. random) and illustrates that the empirical and theory are fat-tailed compared to purely random. The lower panel shows a schematic of the similarities of single tumor growth and metastasis using the same model.

Interestingly, as shown in Fig. 4, the results do not depend sensitively on the choice of network, as long as it is irregular (e.g. random or scale-free). The upper panel shows metastasis on different underlying networks: random (blue solid) and scale-free (orange solid). Clearly, the clinical data (red circles) lie somewhere between the two types of networks. Generally, diffusion on networks is reasonably insensitive to the network structure as long as the distribution of links is fairly broad and the distance over which the diffusion takes place is short. In other words, the cancer does not spread far enough into the network to feel the difference between a random network and a scale-free network, at least to first order. This implies that knowledge of a person's precise lymphatic network details is not required in order to make a first-order prediction of the probability that n nodes will be positive. The red dashed line in Fig. 4 is a fit to the clinical data53. The green dashed line shows the Poisson complementary cumulative distribution function with mean equal to the mean number of affected sites from the clinical data, which demonstrates that the empirical data and theory are fat-tailed compared to purely random.
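The Poisson reference curve can be computed as in the following Python sketch. The convention P(N ≥ n) for the complementary cumulative distribution is an assumption (the text does not state whether the inequality is strict), and the function names and the example counts are hypothetical illustrations, not clinical data.

```python
import math

def poisson_ccdf(n, mean):
    """P(N >= n) for a Poisson variable with the given mean (ASSUMED convention)."""
    cdf_below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n))
    return 1.0 - cdf_below

def empirical_ccdf(counts, n):
    """Fraction of patients with at least n positive nodes."""
    return sum(1 for c in counts if c >= n) / len(counts)

# Made-up counts of positive nodes per patient, for illustration only.
counts = [0, 1, 1, 3]
mean_sites = sum(counts) / len(counts)
# Pairs of (empirical, Poisson) tail probabilities for n = 0..4.
curve = [(empirical_ccdf(counts, n), poisson_ccdf(n, mean_sites)) for n in range(5)]
```

A fat tail shows up as empirical values that stay above the Poisson values at large n.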


The ever increasing number of discoveries about the biological processes underlying tumor progression, set against the many aspects which still remain unknown or ambiguous, has led to the creation of many extremely complex mathematical descriptions (perhaps motivated by the desire to include as many biological details as possible) which are computationally intensive and include many unknown parameters. These models can be generally categorized into two extremes: the molecular level, trying to understand the intra- and intercellular signaling dynamics of individual cells or small clusters of cells, and the tissue level, modeling the emergence of phenomena such as angiogenesis and metastasis. Yet, the molecular models are difficult to scale up to enough cells to comprise a full organ, whilst the tissue level models often lack the heterogeneities vital to an accurate and personalized prediction.

In this paper, we presented a model which aims to bridge this gap and provide a practical, multi-scale model that can be seeded with in vivo images to predict the most likely tumor growth behavior through prediction corridors, as well as the subsequent spreading behavior of metastasis. For both length scales, the model results show good agreement with in vivo growth data of a cell line EMT6-HER2 model in BALB/c mice, as well as with clinical human patient data of metastasis. Furthermore, we outline the use of the model for systematic treatment analysis, focusing on the effect of vascular structure on drug delivery. A novel scaling relationship between the tumor and the underlying nutrient sources not only predicts the most likely progression of the tumor, but also identifies key vessel target sites to optimally control tumor growth.

Despite its quantitative accuracy and simplicity, our model's neglect of the wealth of known biological details associated with cell biology and physiology, may attract criticism of our minimal-model approach as resembling the ‘Consider a spherical cow...’ cliché typically levied at physicists. However, the existing gap between model sophistication and clinical need demands the exploration of such an approach in our opinion. The unique coupling of image data with the mathematical model allows information about the heterogeneity of the system to be preserved and more importantly, be utilized for individualized prognosis. Hence, the model cancer growth is directly driven by in vivo information and demonstrates a new approach to modeling cancer growth using patient specific data, showing good agreement at multiple length scales for a variety of phenomena. As such it complements existing theoretical approaches rather than replacing them and can be integrated with them in the future.


Mathematical model

The blue inset of Fig. 1 shows the time ordering of events at each time step of the mathematical model and corresponds to two coupled, discrete equations applied within each box of the grid. The first equation is the discrete logistic map:

$$ C_{i,n+1} = r_{i,n} \, C_{i,n} \left(1 - C_{i,n}\right) \qquad (1) $$

where C_{i,n} and C_{i,n+1} are the cancer concentrations in box i at the beginning and end of time interval Δt. The tumor growth rate, r_{i,n}, at time step n in box i is assumed to be directly proportional to the vessel density in box i extracted from the image. As described below (see Image information extraction), the initial cancer and endothelial cell densities, C_{i,n=0} and r_{i,n=0}, are extracted from in vivo images stained for both types of cells at time t = 0.

At this stage only vessel density is considered as the primary driving force of growth rate. Nutrients determine individual cell behavior and thus population response. Yet, rather than applying a single r as was done in previous models44, we split the system for maximal heterogeneity, making the model highly non-deterministic.

Furthermore, the model equations capture the tendency of any overcrowding of cancer cells to crush the vasculature or cause it to regress, leading to lower nutrient supply52 and thus slower growth. Hence, the equation for vessel density (i.e. cancer growth rate) is given by:

Following our methodology of implementing a coarse-grained view of a universally observed growth behavior, the single parameter α incorporates all details which may contribute to the vessel density, such as vessel stabilizing and/or destabilizing factors, (anti-)angiogenic growth factors, as well as any therapeutic agents. This may be crude and biologically unsatisfying, yet owing to its observation-driven nature this setup captures the co-evolving, dynamic, feedback-driven interplay between cancer and the underlying nutrient network52.

Finally, cancer cell mobility to neighboring boxes is modeled via simple diffusion:

where again, similar to α, β represents all properties of the environment which could influence the ease of cancer cell diffusion53, as well as other local gradients such as chemotaxis and haptotaxis. More specifically, α is some function of growth promotion (negative α) and inhibition (positive α) factors which influence angiogenesis and nutrient deprivation conditions via the adaptive and feedback-driven value of r_{i,n} at all time steps. Despite a long list of possible influences, we expect as a first approximation that α and β will take on similar values for patients from similar risk groups. In the future, we will make α and β functions of specific factors, making the model more biologically accurate and hence more patient specific. For example, as a first proof-of-principle, we show in Fig. 2 that the effect of an anti-angiogenic endostatin-antibody fusion protein, which breaks down the vessel structure and halts angiogenesis (as verified by in vivo images), can be successfully mimicked by reflecting the fusion protein's destructive effect on the vessels by means of a positive value of α in the model.
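The three events of one time step (growth, vessel feedback, diffusion) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: only the logistic growth step is fixed by the text, whereas the feedback form r(1 − αC) and the rule that a fraction β of each box's cancer is shared equally among its lattice neighbors are assumptions chosen to match the verbal description. The function name `step` and the example values are hypothetical.

```python
def step(C, r, alpha, beta):
    """Advance cancer density C and growth rate r (2D lists) by one time step."""
    rows, cols = len(C), len(C[0])
    # 1) Local logistic growth in every box of the grid.
    grown = [[r[i][j] * C[i][j] * (1.0 - C[i][j]) for j in range(cols)]
             for i in range(rows)]
    # 2) Vessel feedback (ASSUMED form): positive alpha lowers the growth
    #    rate where cancer is dense, negative alpha raises it.
    r_new = [[r[i][j] * (1.0 - alpha * grown[i][j]) for j in range(cols)]
             for i in range(rows)]
    # 3) Diffusion (ASSUMED form): a fraction beta leaves each box and is
    #    split equally among its existing nearest neighbors.
    C_new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            nbrs = [(i + di, j + dj)
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < rows and 0 <= j + dj < cols]
            leaving = beta * grown[i][j] if nbrs else 0.0
            C_new[i][j] += grown[i][j] - leaving
            for a, b in nbrs:
                C_new[a][b] += leaving / len(nbrs)
    return C_new, r_new

# A single seeded box in the middle of a 3 x 3 grid with uniform vessel density.
C0 = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
r0 = [[2.0] * 3 for _ in range(3)]
C1, r1 = step(C0, r0, alpha=0.1, beta=0.4)
```

In this sketch the diffusion step conserves the total amount of cancer produced by the growth step, so any change in total mass comes from the logistic term alone.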

Image information extraction

Without loss of information, the image colors are converted to grayscale for easier manipulation. A grid is imposed, where the box size of the grid is chosen to correspond to approximately 100 cells. The box size can be adapted according to the system and type of image. Finally, the individual pixel values contained in each box are summed and averaged to represent the average vessel density in that box. These values then provide the initial condition for the tumor's evolution, making the model as patient-specific as desired. This procedure can be repeated for any property of interest.
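The extraction procedure can be sketched in pure Python as below. The luminance weights used for the grayscale conversion are an assumption (the text does not say which conversion was applied), partial boxes at the image edge are simply dropped, and the names `to_grayscale` and `coarse_grain` are hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to rows of gray values."""
    # ASSUMPTION: ITU-R BT.601 luma weights; any fixed channel weighting would do.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def coarse_grain(gray, box):
    """Average pixel values over non-overlapping box x box tiles."""
    n_rows = len(gray) // box      # incomplete tiles at the border are ignored
    n_cols = len(gray[0]) // box
    grid = []
    for bi in range(n_rows):
        row = []
        for bj in range(n_cols):
            total = 0.0
            for y in range(bi * box, (bi + 1) * box):
                for x in range(bj * box, (bj + 1) * box):
                    total += gray[y][x]
            row.append(total / (box * box))
        grid.append(row)
    return grid

# A 4 x 4 gray image reduced to a 2 x 2 grid of average intensities.
gray = [[0, 0, 4, 4],
        [0, 0, 4, 4],
        [8, 8, 2, 2],
        [8, 8, 2, 2]]
density = coarse_grain(gray, 2)
```

The resulting grid of averages plays the role of the initial vessel density in each box; in practice the box size would be chosen so that one box covers roughly 100 cells.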

In vivo imaging procedure

In vivo immunofluorescent images of the muscle vascular structure in the flanks of BALB/c mice were taken prior to s.c. contralateral implantation of murine mammary tumor cell line EMT6-HER2 (1×10^6 cells per mouse). Two mice were sacrificed for blood vessel analysis. Histologic sections of muscle from the sacrificed mice were analyzed using immunofluorescent staining for DAPI (red color; example image has 10× magnification).

Growth corridor analysis

We only seeded blood vessel structure for Fig. 2 since an analysis was done prior to implantation of the tumor seed. Hence, we virtually implanted a tumor in the mathematical model, recorded the resulting growth curve and repeated this procedure 2000 times (corresponding to approximately a 10% sample size), each time using a different location. Furthermore, this procedure was repeated with images from various locations in the flanks of the BALB/c mice. Similar initial conditions can be seeded into the model concerning the size and location of an already growing tumor. Immunofluorescent images can be taken of the growing tumor and hence, a similar procedure can be performed. The chosen parameter values for the presented results are at this stage arbitrary, yet our general findings are robust to variations in α and β. A table of parameter values for various cell line types will be presented elsewhere.
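The corridor construction from repeated virtual implantations can be sketched as follows. The sketch assumes that all growth curves are sampled at the same time points and uses the population standard deviation; the text does not state which estimator was used, and the function name `growth_corridor` is hypothetical.

```python
from statistics import mean, pstdev

def growth_corridor(curves):
    """Return (average, lower, upper) lists from equally long growth curves."""
    average, lower, upper = [], [], []
    for values in zip(*curves):      # all runs at one time point
        m = mean(values)
        s = pstdev(values)           # ASSUMPTION: population standard deviation
        average.append(m)
        lower.append(m - s)
        upper.append(m + s)
    return average, lower, upper

# Two toy growth curves standing in for the 2000 simulated implantations.
curves = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
avg, lo, hi = growth_corridor(curves)
```

Measured tumor sizes can then be checked against the band between `lo` and `hi` at each time point.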

Endostatin-antibody fusion protein treatment

BALB/c mice (n = 4 per group) were implanted s.c. contralaterally with EMT6 and EMT6-HER2 (1×10^6 cells per mouse), followed on day 4 by equimolar injections every other day (seven treatments) of αHER2-huEndo-P125A (42 μg), or PBS. On day 12, after four treatments, two mice were sacrificed for the blood vessel analysis. We analyzed histologic sections of tumors from the sacrificed mice using immunofluorescent staining for PECAM (vessels) and DAPI for counter-staining of the nucleus. Although still a preliminary result, the dashed purple line in Fig. 2 is an average of 1000 model results in which we mimicked the inhibitory effect of the protein on vasculature formation.

Metastasis network analysis

For each of the 100 sites (or nodes), we drew ri from a normal distribution N(µ = 1, σ = 0.2) and took α, β for non-primary tumor sites to be β = 0.8, α = 0.2. Furthermore, for each trial we seeded the tumor at a randomly picked primary site with C0 = 0.5 and β' = 0.6, α' = 0.4. The average cumulative distribution of 3000 trials is plotted for both types of networks, where a new network was generated for each trial. The clinical data was fitted by varying the r distribution; in this case a skewed distribution with peak close to r = 0.01. However, the same fit can be achieved by starting from a random network and simply adding more and more links, slowly tending towards a scale-free network.
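The network setup for one trial can be sketched in Python as below. The text does not name the generators used, so an Erdős–Rényi graph for the random network and a preferential-attachment construction for the scale-free network are assumptions, as are the link probability `p`, the attachment number `m` and all function names. Only the number of sites and the normal distribution of r are taken from the text.

```python
import random

def random_network(n, p, rng):
    """Random graph as adjacency sets: each pair is linked with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def scale_free_network(n, m, rng):
    """Preferential attachment: each new node links to m distinct existing nodes."""
    adj = {i: set() for i in range(n)}
    pool = []  # every node appears once per link end, so choice is degree-weighted
    for i in range(m + 1):           # fully connected starting core
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in chosen:
            adj[new].add(target)
            adj[target].add(new)
            pool += [new, target]
    return adj

def draw_growth_rates(n, rng, mu=1.0, sigma=0.2):
    """Draw r_i for each site from a normal distribution N(mu, sigma)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(0)               # fixed seed so the sketch is reproducible
er = random_network(100, 0.05, rng)  # p = 0.05 is an illustrative choice
sf = scale_free_network(100, 2, rng) # m = 2 is an illustrative choice
rates = draw_growth_rates(100, rng)
primary = rng.randrange(100)         # randomly picked primary site
```

Generating a fresh network and a fresh set of rates for each trial, as described above, then amounts to calling these functions inside the trial loop.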