Stochastic Biological System-of-Systems Modelling for iPSC Culture

Large-scale manufacturing of induced pluripotent stem cells (iPSCs) is essential for cell therapies and regenerative medicines. Yet, iPSCs form large cell aggregates in suspension bioreactors, resulting in insufficient nutrient supply and extra metabolic waste build-up for the cells located at the core. Since subtle changes in micro-environment can lead to a heterogeneous cell population, a novel Biological System-of-Systems (Bio-SoS) framework is proposed to model cell-to-cell interactions, spatial and metabolic heterogeneity, and cell response to micro-environmental variation. Building on stochastic metabolic reaction network, aggregation kinetics, and reaction-diffusion mechanisms, the Bio-SoS model characterizes causal interdependencies at individual cell, aggregate, and cell population levels. It has a modular design that enables data integration and improves predictions for different monolayer and aggregate culture processes. In addition, a variance decomposition analysis is derived to quantify the impact of factors (i.e., aggregate size) on cell product health and quality heterogeneity.


Introduction
Since induced pluripotent stem cells (iPSCs) have the potential to differentiate into any cell type in the body, the discovery and availability of iPSCs has created numerous opportunities in the fields of regenerative medicines, cell therapies, drug discovery, and tissue engineering 1,2 .Large-scale manufacturing of iPSCs will be essential to support these clinical and research applications, not currently met to date.
While iPSCs can be grown in small colonies in adherent monolayers (such as in petri-dish), this culture method is not ideal for large-scale manufacturing.Thus, stirred suspension cultures are recommended for manufacturing purposes 3,4 .Due to strong cell-to-cell interactions, one challenge that confronts large-scale iPSC cultures is the tendency of iPSCs to self-aggregate, which can lead to large aggregates in suspension bioreactors.Further, stirred suspension cultures can experience complex hydrodynamics conditions that can affect cell behaviours, including aggregation, metabolism, apoptosis, expansion, and differentiation 5 .
As stem cells are highly sensitive to environmental conditions, a critical concern for iPSC aggregate cultures is spatial heterogeneity.Basically, nutrients and differentiation factors can become unevenly distributed in iPSC aggregates.These variable conditions across an aggregate can result in heterogeneous cell populations and cell death 6,7,8 .Therefore, understanding spatial heterogeneity and controlling aggregate size is crucial for predictable and consistent results in iPSC cultures.
Metabolic kinetic modelling is valuable to advance the scientific understanding of stem cell cultures, predict cell growth and functional behaviours, and guide feeding and agitation strategies.While several models have been developed to describe different aspects of iPSC cultures, a comprehensive mechanistic model that captures the multi-scale and heterogeneity nature of iPSC aggregate cultures is still lacking.For example, the Monod-type unstructured-unsegregated culture model developed by Galvanauskas et al. 9 assumes homogeneity of the entire iPSC population and does not account for intracellular variability.The population balance model proposed by Bartolini et al. 10 focuses on the temporal evolution of the size distribution of embryonic stem cell (ESC) aggregates, while Van Winkle et al. 6 modelled a single human ESC (hESC) spherical aggregate, yet neglected the dynamics of the cell population.Recently, Odenwelder et al. 11 applied a metabolic flux analysis (MFA) 12,13 to describe the effects of glucose and lactate concentrations on monolayer iPSCs.And Wu et al. 14 developed a mechanistic model that described cell aggregation and considered oxygen transport for iPSC aggregate cultures, however, neglected metabolite diffusion, intracellular metabolism, and cell metabolic heterogeneity.
Existing mechanistic models for mammalian cell cultures often assume the cell population is homogenous.
And thus, these metabolic models overlook the stochastic nature of living cells.For cultures such as Chinese hamster ovary (CHO) cells and yeast, the assumption of spatial homogeneity may be sufficient in wellstirred systems where cell aggregates are non-existent or only involve a few cells.In these cases, unsegregated models, such as dynamic flux balance analyses 12,15 and cell culture kinetic models 16 have been demonstrated to predict culture performance.But unlike single cell suspension cultures, aggregate cultures, common to iPSCs, are not homogenous.As the aggregate size increases, spatial heterogeneity likely increases cell-to-cell variation and increases metabolic heterogeneity.While stochastic chemical kinetics 17,18 and queueing network models 19,20 have been developed to describe the dynamics and inherent stochasticity of chemically reacting systems and metabolic networks, these approaches do not account for the effect of cell aggregation on metabolite variations and potential cell heterogeneity common to iPSC cultures.Additionally, spatial variance component analysis (SVCA) was introduced to study spatial heterogeneity contributed by cell-to-cell interactions, intrinsic effect, and environmental effect 21 .However, this study is built on the random effect model that is challenging to faithfully characterize the underlying complex interaction mechanisms and causal interdependencies from molecular to cellular to macroscopic levels.
In a broader scope, several multiscale bioprocess models have been developed for studying aggregate structures in the clinical studies, such as cancer tissues 22,23 , and biofilms 24 .Notable examples include software packages like PhysiCell 25 , Chaste 26 , ChemChaste 27 , BMX 28 , and Morpheus 29 .These tools fall under the category of agent-based models, which integrate cell-based metabolic reactions, reactiondiffusion processes, and individual cell growth dynamics.Furthermore, these models have the potential to include specialized stochastic cell models, allowing them to represent regulatory structures and account for the mechanical and chemical interactions.While these methods could be effective for modelling small aggregate systems, these models become computationally prohibitive when applied to large-scale iPSC manufacturing processes.
To overcome the limitations of existing methods, this paper presents a novel biological systems-of-system (Bio-SoS) model with modular design to characterize the mechanisms in iPSC aggregate cultures and describe the spatial-temporal causal interdependencies from individual cells to cell aggregates, and to cell population.It is built on mechanistic modules, including: (1) a stochastic metabolic network (SMN) model describing cell metabolic response to environmental variation; (2) a population balance model (PBM) describing the iPSC proliferation, collision, and aggregation process due to cell-to-cell interactions; and (3) a reaction-diffusion model (RSM) characterizing spatial heterogeneity of micro-environmental conditions which accounts for the different diffusion rates of nutrient and metabolite molecules through cell aggregates.
The modular design allows us to organically assemble individual mechanistic modules to construct a Bio-SoS model for iPSC aggregate cultures, which facilitates the integration of data and information collected from different cell culture processes with different dynamics and spatial heterogeneous micro-environment conditions.This Bio-SoS modelling philosophy for multi-scale bioprocess is extendable and applicable to general biological ecosystems, accounting for complex interactions and inherent stochasticity.
The proposed Bio-SoS framework represents a novel in-silico process analytical technology (PAT) for iPSC aggregate cultures, offering valuable insights into individual cell response to micro-environmental changes and metabolic information across distinct aggregate locations.The key contributions are threefold.First, the proposed multi-scale Bio-SoS model has a modular design that facilitates the integration of data from different culture conditions (such as 2D monolayer cultures used in the lab and aggregate cultures recommended for industrial manufacturing).This modular design will accelerate iPSC large-scale manufacturing process development without conducting extensive experiments.It can provide a valuable tool for yield optimization and cell product quality consistency control.Second, since the objective for iPSC cultures is undifferentiated biomass, i.e., the cells are the product, the proposed variance decomposition analysis on the Bio-SoS mechanistic model enhances our systematic understanding of iPSC culture spatial heterogeneity, predicts the impact of critical factors (i.e., aggregate size) on metabolic heterogeneity, and enables us to avoid unwanted cell death or heterogeneous cell populations during expansion.It can identify the root-causes of cell-to-cell variation, analyze intracellular metabolism at different positions within an aggregate, and determine the optimal aggregate size range across different bioreactor conditions.Third, the proposed Bio-SoS simulation provides a comprehensive and efficient way to account for the complex and stochastic nature of iPSC aggregate and monolayer cultures.In contrast to existing agent-based simulation tools [22][23][24][25][26][27][28][29] , the Bio-SoS model is a specialized simulation tool tailored for iPSC cultures.It can improve simulation efficiency through: (1) modelling iPSC aggregation using population balance equations to ensure a more efficient representation of cellular dynamics during aggregation; (2) constructing a coarse-grained approach that divides each aggregate into small spherical shells and assumes cellular metabolisms and micro-environment are homogeneous in each spherical shell; and (3) utilizing a single-cell stochastic metabolic network model to predict the cell metabolic response to environmental variation.

Bio-SoS Model for Multi-scale iPSC Cultures
Within (3-dimensional or 3D) aggregate cultures, cells proliferate and interact with each other at three levels: intracellular metabolic reactions, metabolites and nutrients diffusion through the cell aggregates, and aggregate interactions within bulk culture fluid.Each individual cell within the culture is a complex system with potentially stochastic behavior.And as these cells cluster together, a larger system of systems is formed with micro-environmental conditions shaped by cell interactions.Taken together, aggregates comprise the entire iPSC population in the bioreactor interacting with the bulk media.
The proposed Bio-SoS model characterizes the complex interactions and regulatory reaction network mechanisms from molecular-to cellular-, and to macro-kinetics; see Fig. 1.Based on the causal interdependencies between these biological systems, the Bio-SoS model was constructed incorporating three interconnected mechanistic modules summarized below (see "Methods" section for details).The intracellular regulatory metabolic reaction network is connected to the aggregate via the transport of metabolites across the cellular membrane, while cells within an aggregate are linked through the diffusion of intra-aggregate metabolites and nutrient supplies.This Bio-SoS reaction network mechanism for iPSC aggregate culture is characterized through the integration of a reaction-diffusion model (RDM) and singlecell stochastic metabolic networks (SMNs) quantifying the metabolic heterogeneity.The aggregation process of the cell population is effectively characterized using PBM.The Bio-SoS model can both sample and computational efficiently characterize the spatial heterogeneity in micro-environmental conditions and the variability in cell-to-cell metabolism.The SMN is built to describe the intracellular reactions to microenvironmental perturbations.Suppose cells residing in each spherical shell are homogeneous.The metabolic heterogeneity is characterized by the difference in fluxes between exterior cells and inner cells locating at different positions in aggregates.

Population balance model (PBM)
. A PBM is used to describe the aggregation process accounting for cell proliferation and coalescence/collisions of iPSC aggregates.
2. Reaction-diffusion Model (RDM).It is constructed to describe cell-to-cell interactions and the intraaggregate fluid dynamics, which is a fundamental process involving the transport of reacting nutrients and metabolites through iPSC aggregates.This movement is influenced by the metabolite production/consumption and a diffusive flux that is proportional to the local concentration gradient.
By considering the complex interplay between individual cell metabolic reactions and their microenvironment, this model can estimate how long cells might experience a particular condition.The diffusion coefficients used in this study are listed in Supplementary Table 4. 3. Stochastic metabolic network (SMN) for single cells.The metabolic molecular reaction network is constructed from the curated biochemical interactions based on experimental data 11 .These interactions characterize intracellular metabolisms and the cellular responses to environmental perturbations.By following the studies on stochastic molecular reaction models 19,20 , a Poisson process is utilized for modelling molecular enzymatic reaction occurrence within the SMN to capture the random nature of enzyme-substrate collisions and subsequent reactions.The expected metabolic reaction rates for homogeneous cell population were derived from Wang et al. 30 and calibrated using 2D monolayer culture experimental data of extracellular metabolites and intracellular isotopic measurements from Odenwelder et al. 11 .The iPSC stochastic metabolic network (composed of 32 metabolites and 38 reactions) mainly focuses on central carbon metabolism and contains the major reactions for glucose consumption, the TCA cycle, anaplerosis, pentose phosphate pathway (PPP), and amino acid metabolism.The reactions, metabolites and enzymes that are used in this study are listed in Supplementary Table 1, 2, and 3.
Built on the Bio-SoS model characterizing the causal interdependencies of iPSC aggregate culture, a spatial variance analysis approach is derived to study the root-cause of cell metabolic heterogeneity (see "Methods" section for details), which can be used to guide the control of iPSC cultures, including aggregate size.Because the system comprises aggregates of multiple sizes, cell aggregates were numerically classified into  groups.To further consider the spatial heterogeneity in each aggregate with radius, say  ℓ for ℓ = 1,2, … , , the aggregates were divided into spherical shells; see the illustration of cells located in the -th spherical shell [ ℓ "#$ ,  ℓ " ] in Fig. 1(b).The proposed Bio-SoS variance analysis approach can assess the contribution of spatial heterogeneity from aggregates with different sizes to the cell product metabolic variation and quantify the metabolic heterogeneity through studying the relative changes of metabolic fluxes for cells residing at different locations and time within an aggregate.

Model Validation for Bio-SoS Framework
The three individual model modules (i.e., PBM, RDM, and SMN) were validated with experimental data from literature 7,11,14 , then the integrated Bio-SoS model was validated based on both monolayer culture data from Odenwelder et al. 11 and aggregate culture data from Kwok et al. 7 .
First, the PBM was validated using cell proliferation and aggregation dynamics data from Wu et al. 14 .The experimental and model predicted aggregate growth profiles are shown in Fig. 2a and the time-varying aggregate size distribution is illustrated in Fig. 2b.Overall, the PBM captured the aggregation dynamics.
The model-based prediction is obtained through solving the PBM numerically by using finite differences over equally spaced interval (i.e., 1 m) in the radius domain of an aggregate and 0.1 hour in the time domain 31 .
Second, the RDM was validated by using stirred-tank suspension bioreactor data from Wu et al. 14 .The transport of metabolites and nutrients through 3D aggregates is crucial for the effectiveness of cultivation systems.This is especially significant in stem cell cultures, as cell metabolism determines iPSC product functional quality attributes and it plays an important role in determining pluripotency and lineage specification 32 .Given the stem cell aggregate property (i.e., porosity and tortuosity) estimated based on the measures in Wu et al. 14 , the RDM was solved analytically by applying a local quasi-steady-state approximation, i.e., aggregates are spatially decomposed into equally sized small spherical shells and the metabolic concentrations are assumed to be homogenous within each shell.Then, the model predicted profiles of heterogeneous metabolite concentrations under steady-state conditions are shown in Fig. 2c and Supplementary Fig. 2. It should be noted that these results were obtained from the single representative aggregate simulation.Clearly, the diffusion of nutrients (i.e., glucose, glutamine, serine) becomes limited as the aggregate radius increases leading to reduced nutrient levels for cells residing closer to the core.For large aggregates, the cells in the inner area are starving due to shortage of glucose, glutamine, and serine.
This finding is consistent with the glucose transport limitation in human mesenchymal stem cells reported by Zhong et al. 33 Also, limited diffusion resulted in higher concentration of metabolite wastes and inhibitors (i.e., lactate, ammonium, glutamate) in the aggregates as the size increases.Future investigations of nutrient and metabolic waste levels within aggregates using imaging techniques could provide more evidence.2.
In addition to being able to model the monolayer culture behavior, we sought to further validate the Bio-SoS model on aggregate cultures.As mentioned earlier, the Bio-SoS model was developed and trained by utilizing monolayer culture data from K3 iPSCs 11 and cell aggregation profiles from hESCs 14 .To test its extrapolation prediction performance, we used a collection of aggregate culture data of FSiPS (short for FS hiPSC clone 2) collected from stirred suspension bioreactor from Kwok et al.

In-silico Study of iPSC Aggregate Health Conditions
Aggregate size influences the transport of nutrients, oxygen, metabolites, and growth factors due to the diffusion of each species, which is related to molecular size and charge.Cells inside large aggregates potentially experience hypoxic conditions and poor nutrient supply due to the limited diffusion of oxygen and nutrients from the bulk media to the center.As a result, the cell growth is reduced.It has already been reported that oxygen concentration in the center region of larger embryoid bodies (EBs) (400 m in diameter) is 50% lower compared to in medium EBs (200 m in diameter), which caused apoptosis at the core due to low oxygen diffusion 6,33 .Further, low diffusion can limit removal of waste metabolites, such as lactate and ammonia, and increase cell necrosis in the center region as these species reach critical levels.
Previous studies have shown that high lactate concentration can have negative effects on stem cell pluripotency.For example, murine embryonic stem cells (mESCs) and iPSCs proliferated and maintained pluripotency in lactate concentrations up to 40 mM 35 , while hESCs exhibited decreased pluripotency through Tra-1-60 expression after continuous passaging in 22 mM lactate-containing media 36 .Ouyang et al. 37 showed that murine embryonic stem cells (mESC) are extremely sensitive to the presence of lactate in media.They inferred that the growth of mESC was inhibited at lactate greater than 16 mM, and that high lactate affected the cell pluripotency.Glucose has also been observed to affect the embryoid body formation potential of mESC when the concentration less than 2.5 mM 35 .Therefore, in this study, we define cells as being unhealthy if glucose was less than 2.5 mM or lactate was greater than 40 mM.
The Bio-SoS model was used to predict the fraction of unhealthy cells using glucose and lactate concentration criteria (glucose < 2.5 mM or lactate > 40 mM) for aggregates of radius ranging from 30 m to 600 m.Fig. 5a shows the percentages of unhealthy cells within an aggregate with a particular radius with varying bulk glucose concentration and a fixed lactate concentration of 0 mM.Notably, all aggregates, regardless of size, exhibited 100% of cells being classified as unhealthy when the bulk glucose concentration fell below 2.5 mM.In contrast, aggregates of radii less than 150 m, had higher fractions of healthy cells for bulk glucose concentrations greater than 5 mM.Fig. 5b shows the percentage of unhealthy cells within an aggregate with a particular radius with varying bulk lactate concentration and a fixed glucose concentration of 20 mM.A different pattern was observed for cell health when the bulk lactate concentration changed.The fraction of unhealthy cells increases with the increased aggregate size, and it appears that cells are more sensitive to higher lactate concentrations compared to lower glucose concentrations.

In-silico Study of Cell-to-Cell Metabolic Heterogeneity
It is well accepted that culture conditions need to remain uniform for optimal iPSC metabolic function and pluripotency maintenance.The formation of large aggregates increases the risk of heterogeneity due to limited diffusion of nutrients and growth factors, and removal of waste metabolites.In order to describe the expected metabolic flux rate response of homogeneous cell population to environmental change, the metabolic regulatory networks were adapted from a companion work by Wang et al , we construct a SMN for single cells that can characterize the stochastic reaction network for individual cells and cell metabolic response to environmental change.
To understand how cell metabolism is affected by the aggregate size and micro-environmental changes, the metabolic reaction intracellular flux rates that are sensitive to the aggregate size are shown in Fig. 6a.
Further, the key extracellular metabolite concentrations are shown in Fig. 6b.The flux rates and metabolite a b concentrations were standardized by subtracting the mean and dividing by the standard deviation calculated over all aggregates (see "Method" section).The expected flux rates of the (forward) reactions (e.g., GLNSf and HK) decreased gradually as the aggregate sizes increased, and the flux rate of the reverse reaction (e.g., GLNSr) increased as the aggregate sizes increase.The biomass flux was observed to be relatively high for small aggregates, suggesting such aggregates provide favorable metabolic conditions for biomass production.The reversible reactions such as GLNS, LacT, GLDH, ASTA were affected by the extracellular metabolite concentrations.For example, the flux of LacTr increases in large aggregates due to the high extracellular lactate levels while the fluxes of GLNSr and GLDHr increase due to low-glutamine levels inside large aggregates (Fig. 6b).In summary, these results confirm the importance of maintaining aggregate size for optimal biomass production.It has been observed previously that aggregates exceeding a diameter of 300 m experience hypoxia and low core nutrient concentrations, resulting in cell necrosis and loss of pluripotency 38 .We further investigated the metabolic heterogeneity between inner and outer cells within aggregates of radii ranging from 60 m to 360 m (Fig. 7).The blue dashed line depicts the flux rate at the outer cell, while the colored dots represent the relative flux rate of inner cells at various locations.This ratio was calculated as the flux rate of the inner cell relative to the outer cell.The results in Fig. 7 also show the significant increase in metabolic heterogeneity from 24 to 48 hours, as the greater flux deviations observed in the latter period.This can be attributed to the relative shortage of intracellular metabolites after 48 hours.For example, the 48-hour serine consumption flux (SAL) becomes more diverse due to low serine, while the poor nutrient supply affected biomass production leading to heterogeneous biomass fluxes (see Supplementary Fig. 1).To quantitatively assess the overall metabolic heterogeneity of each aggregate, the difference between flux rates of the inner and outer cells was calculated (Supplementary Table 5 and 6).Based on the simulation results, the metabolic heterogeneity of a 240-μm aggregate was found to be approximately 14 times greater than that of a 60-μm aggregate.After 48 hours, both aggregate sizes had doubled in metabolic heterogeneity.These simulations demonstrate that metabolic heterogeneity increases with both aggregate size and culture duration.
It is also worth noting that TCA (tricarboxylic acid) cycle has been observed to maintain a stable flux in cell aggregates of different sizes as shown in Fig. 7 (see the reactions highlighted in blue).This observation supports the widely accepted understanding that the TCA cycle is a fundamental housekeeping metabolic pathway (i.e., the flux rate maintains relatively stable in different conditions) 39 , and that it is tightly regulated 40 .

Optimal Aggregate Size
An important factor that needs to be strictly controlled in bioreactors is the aggregate size.If iPSC aggregates become too large, uneven diffusion of nutrients and growth factors can occur, causing cell death or heterogeneous cell populations.The optimal aggregate size can be determined by considering the balance between metabolic heterogeneity and biomass yield.
Simulations of a batch bioreactor were performed to calculate the mean and relative standard deviation (RSD) for biomass productivity (Fig. 8a-c) for a 72-hour culture.Supplementary Fig. 3 presents additional simulation results for the mean and RSD of biomass productivity at 6-hour intervals.These results indicated a yield-heterogeneity trade-off, as the mean biomass productivity consistently decreased while the RSDs consistently increased with increased aggregate size.It was also observed that the RSDs consistently increased with the culture time.During the initial 24 hours, the majority of aggregates in the bioreactors exhibited relatively stable biomass productivity (Fig. 8a).The aggregate size distribution (Fig. 2b) indicated that the number of aggregates with a radius greater than 200 m was very low.This observation implies that the heterogeneity within the cell population is limited early in cultures.However, after 48 hours, the culture entered steady state (as reported previously 11,30 ).Fig. 8b shows increased heterogeneity of biomass yield for larger aggregates.Notably, there was a distinct cut-off radius of 150 m after 48-hour culture.For larger radius, the RSD increased significantly faster.These findings suggest the presence of an optimal aggregate size range that minimizes biomass productivity variability and aggregate heterogeneity.
Lastly, as Fig. 8c suggests, the biomass productivity decreased significantly as the available nutrients were nearly depleted around 72 hours.Overall, aggregates with a radius below 150 m had relatively high biomass productivity and a relatively low heterogeneity for 48-hour cultures.These simulation results suggest that iPSC cultures should be maintained as uniform aggregates around 150 m in radius and should not be greater than 200 m in radius.This evidence agrees with experiment observations in the literature 6, 8, 38 .

Discussion
The introduction of the Bio-SoS framework marks an important step in the development of multi-scale bioprocess mechanistic model and analytical technology for iPSC cultures.By effectively characterizing cell-to-cell interactions and complex mechanisms of iPSC aggregate culture, the proposed Bio-SoS not only supports data integration and enables the prediction of metabolic dynamics at different scales, but also models spatial heterogeneity and quantifies metabolic intensities and variations across different aggregate locations and time.The model's versatility is shown by its ability to study cell health in different-sized aggregates, predict micro-environmental profile metabolite concentrations under diverse conditions, and determine the optimal aggregate size range and feeding strategy for maximum bioreactor efficiency.
Ultimately, the proposed Bio-SoS presents a promising pathway towards low cost and high-quality personalized cell therapies and offers a platform for optimizing large-scale iPSC manufacturing without extensive experiments.

a b c
The model validation study demonstrated that the proposed Bio-SoS model has good extrapolation prediction performance.Even though only the monolayer culture data of K3 iPSC with various initial conditions was used, the model was able to predict the metabolic dynamics for a different cell line (FSiPS) cultured in aggregates.This demonstrates the potential for transferring the learning from a monolayer culture to a stirred-suspension culture.From the methodological perspective, this success relies on the proposed biological system-of-systems modelling principle that can organically assemble single-cell model to construct a Bio-SoS model for iPSC aggregate cultures, characterizing cell-to-cell interactions and cell response to spatial heterogeneous micro-environment conditions.
The versatility of the proposed framework is demonstrated by three in-silico scenarios.First, the Bio-SoS model was employed to simulate cell health within aggregates of varying sizes and extracellular metabolite concentrations.These simulation results were able to reproduce the trend observed for experimental findings.Specifically, the model predicted that larger iPSC aggregates would experience hypoxic conditions and poor nutrient supply for inner cells.This stress would then lead to reduced cell growth and increased cell necrosis at the center of the aggregates.Second, the Bio-SoS model was used to investigate the impact of aggregate size on both cell metabolism and micro-environmental heterogeneities.Those simulations provided quantitative insights into the variation of metabolic fluxes across cells at different positions within iPSC aggregates and under varying culture conditions.Third, the Bio-SoS model was used to identify the optimal aggregate size range to maximize efficiency and yield of pluripotent stem cells cultured in a bioreactor.These simulations suggested an ideal aggregate size range is around 150 m in radius for biomass productivity, which agreed with published reports 8,38 .
The system of modules can be a) assembled to facilitate data integration and improve the prediction of monolayer and aggregate cultures; and b) utilized to control cell product quality heterogeneity and optimize the production process performance.Also, the modular design of the Bio-SoS model will facilitate future extensions to incorporate cell responses (i.e., flux rates, phenotype, gene expression) to both mechanical (e.g., stirred speed, hydrodynamic force) and chemical (e.g., concentrations of nutrients/metabolites/oxygen, pH) environmental conditions.

Cell Proliferation and Aggregation Dynamics Modelling
The aggregation process is characterized by the dynamic evolution of a density profile of aggregates.Let (, ) denote the average number of aggregates or clusters of size  (i.e., mass and volume) at time , where mass is equal to a buoyant density times volume.Due to the fact that buoyant density of cells does not vary significantly during the cell cycle, it was assumed the size  corresponds either to the volume or mass with the relationship with radius (denoted by ) 14 , i.e.,  ∝  . .The important factors impacting on the merge rate of cell aggregates include: (1) mass of aggregates; (2) position or distance of aggregates; and (3) velocity.Therefore, for any cell aggregate with mass , the rate at which it merges with another aggregate with mass  / is proportional to their densities, i.e., the average number of coalescences, (,  / ) →  +  / , per unit time per unit volume is $ 0 (, )( / , )(| / ).
The aggregation kernel, denoted by (| / ), is associated with the average coalescence or merge rate of cell aggregates with mass  and  / .In this work, only a purely coalescing process was considered where the merge rate was only dependent on the size of aggregates, ℎ  ,  $ ,  > 0. ( The exponential part in equation (1) accounts for the fact that there is the decreasing coalescence efficiency as the aggregate size increases 14  The break-up effects of aggregates were considered negligible and the profile (, ) evolves through coalescence events only.A population balance model (PBM) is employed to simulate the temporal evolution of size of the aggregates accounting for cell proliferation contributions and collisions between particles to aggregate size growth 14,41,42 .The temporal evolution of cell aggregate profile becomes, .Since the size of each cell is  8 , the average number of cells in each aggregate is given by  ;<== =  $ ()/ 8 .Let  denotes the cell density and  is the volume.
Because the system comprises aggregates of multiple sizes, cell aggregates were classified into  groups.

Reaction-Diffusion Model
The reaction-diffusion model provides a means of characterizing cell-to-cell interactions and quantifying the spatial heterogeneity in micro-environment conditions, by modelling the dynamic change of spatial and temporal concentration profiles of crucial components such as nutrients and metabolites like glucose and lactate.Suppose that the aggregates have cell spheres.A cluster of cells in a liquid medium was considered and assumed to have uniform diffusion in all directions along the radial axis.Let  = l $ , … ,  " ' m denote the spatial profile of intracellular metabolite concentrations for species  = 1,2, … ,  + at time  and located at radius .Let  = l $ , … ,  " ' m denote the spatial profile of concentration of each extracellular species  = 1,2, … ,  ; at time  and located at radius .
The concentration of the -th species at time  and located at aggregate radius , denoted by  & (, ), satisfies the set of reaction-diffusion equations in spherical coordinate with boundary conditions, i.e., Due to the reaction and diffusion, the metabolite concentrations at different locations or depth of a cell aggregate are different.We divide each aggregate of radius  ℓ into  ℓ spherical shells (shaped like a 3D annulus or "rings") and assume that the cellular metabolisms are homogeneous in each spherical shell.The -th spherical shell of the ℓ-th aggregate is located between the radii between  ℓ "#$ and  ℓ " ; see the illustration in Fig. 1b.The volume of the spherical shell is the difference between the enclosed volume of the outer sphere and the enclosed volume of the inner sphere: . W ℓ ("#$) X . .
Quasi-steady-state solution in radial cases of   (, ).It is challenging to directly solve the transient reaction-diffusion equation (4).Thus, a similar approach as that used in McMurtrey 44 was applied to account for the quasi-steady-state setting, i.e.,

D;
( D-= 0. A quasi-steady-state solution of reaction-diffusion model is widely used in cell aggregation literature 44,45 to describe the metabolite diffusion inside an aggregate.In each -th spherical shell, the extracellular concentration profile or micro-environmental condition can be solved analytically 44 as the nutrient consumption or inhibitor formulation rates  & (, ) are constant in the spherical shell specified by [ ℓ "#$ ,  ℓ " ]: where  & (ℓ,",-) is the consumption/formulation rate of metabolite  & in the -th spherical shell of the ℓth aggregate and  & (ℓ,",-) denotes the the boundary condition of metabolite  & in the -th spherical shell of the ℓ-th aggregate at time .The formulation/consumption rate  & (ℓ,",-) depends on the intracellular metabolism of cells located in the  -th spherical shell of the ℓ -th aggregate and its mathematical formulation will be discussed in the next section.
Here,  & is the effective diffusion coefficient of the -th species.Given the porosity  and tortuosity , , where  & ' is the diffusion coefficient of the  -th species in aqueous condition.In a 4-day culture in spinner flasks at the agitation rate of 60 rpm, the porosity and tortuosity of hESC aggregates were reported to be 0.270 ± 0.007 unitless and 1.551 ± 0.086 unitless, respectively 14 .
Since the metabolites are transported between the outer spherical shell and extra-aggregate environment, the boundary condition for the  ℓ -th spherical shell is given by  & (ℓ,G ℓ ,-) =  & ().Further, metabolites are freely transported between spherical shells and thus the metabolite concentrations on the outer surface of the ( − 1)-th spherical shell equal to the concentrations on the inner surface of the -th spherical shell, i.e., mathematically,  & (ℓ,"#$,-) =  & W ℓ (") , X.The solution in equation (5) illustrates that: (a) the nutrient concentrations, increasing from aggregate center to the surface, becomes highest on the surface (, ) =  & (); and (b) the metabolite waste concentrations, decreasing from center to surface, is highest at the center Estimating reaction rate  & in reaction-diffusion model.Conceptually the reaction rate  & (nmol/(m .⋅ ℎ)) of the metabolite  & is the weighted sum of the associated flux rates (nmol/ 10 : cells/h).Thus, putting all in vectors, we have  =  ⋅  ⋅  where  represents the stoichiometric matrix,  is the flux rate vector, and  is a unit conversion factor.By assuming the averaged single cell radius to be 7.5 m, the average cell volume can be estimated by  =  ⋅ 4@ .× 7.5 .(m ./cell) with porosity .Then 1 nmol/10 : cells/h of flux rate can be converted to the reaction rate of

Stochastic Metabolic Reaction Network Model for Single-Cell
For each iPSC, let us consider a metabolic network system with  species ( $ ,  0 , … ,  " ) which interact with each other through  chemical reactions.Let  " & () denote the number of molecules of metabolite  & in an individual cell at time .Denote the vector  • = ( " $ ,  " 0 , … ,  " " ).Here  • includes both intracellular and extracellular metabolites, i.e.,  • = ( ",  ").The metabolic reaction network can be expressed as 46 where  &U and  &U / are nonnegative integers.Let  U be the vector whose  -th component is  &U representing the number of molecules of the -th metabolite consumed in the -th reaction.Let  U / be the When the flux rate is constant  U l • ( ()m =  U , the dynamics of expected metabolite concentration is the same as dynamic flux balance model 15,48 (see "Supplementary Methods").Thus, the deterministic ODE-based metabolic model can be interpreted as a special case of the stochastic reaction model with mean metabolite concentrations and constant flux rates, ignoring cell-to-cell variation.
Second, the heterogeneous cell population is considered.Let  & () denote the concentration of metabolite  & (i.e., the number of molecules per unit of volume) in the system at time .Here the metabolite concentrations include both intracellular and extracellular metabolites, i.e.,  = (, ).Recall that the cell aggregates were divided to  groups with different size.Each aggregate is divided to  ℓ spherical shells with ( ℓ " , t) and ( ℓ " , t) representing the concentrations of extracellular (within aggregate) and intracellular metabolites in the ℓ-th aggregate at the radius of  ℓ " at time Thus, the cell density (i.e., the cell number per unit of volume) in the -th spherical shell of the ℓ-th aggregate group at time  can be computed based on the aggregate density and the number of cells in that spherical shell, i.e., .
where the aggregate density  ℓ () (equation (3)) was obtained by solving the population balance model.
Recall that  W ℓ (") , X can be obtained by solving the reaction-diffusion model (see "Methods" section).
After that, we can divide the culture time where (0) is the initial concentrations.

Variance Decomposition Analysis
Stem cells, such as iPS cells, that proliferate in the culture process can undergo considerable variation in metabolic reaction rates due to the change in specific environmental conditions.To better understand the metabolic heterogeneity of stem cell culture and identifies sources of uncertainty, we developed a variance decomposition method based on the Bio-SoS model.It can be used to explain how the variance measuring the metabolic heterogeneity is contributed by different sized cell aggregates and how the variance is spatially distributed within each cell aggregate.
Relative standard deviation (RSD) is a statistical measurement that describes the spread of data with respect to the mean.The RSD of metabolite  & in each ℓ-th cell aggregate at time  is expressed by • Cell Population Heterogeneity: Given well-controlled bulk bioreactor conditions, we suppose the independence of different aggregates.During the time interval (,  + d], the total variance of the metabolite  & concentration change in the bio-system, denoted by  &,- 0 , can be divided into the contribution from each ℓ-th group of aggregates with radius  ℓ , i.e., This study can support the analysis and provide the insights on both metabolic heterogeneity and spatial heterogeneity.The metabolic heterogeneity can be assessed through studying the relative changes of metabolic fluxes for cells residing at different locations and time within an aggregate as shown in Fig. 7. At the same time, the impact of spatial heterogeneity can be investigated through measuring the contribution of (biomass) output variance from the cell aggregates with different radius size as illustrated in Fig. 8.
This can guide the selection of optimal aggregate size to maximize the expected yield and control the output variance.

Flux and Metabolite Standardization
Standardization of the flux rates and metabolite concentrations used was performed in three steps: (1) the average fluxes and metabolite concentrations were calculated for each ℓ-th group of aggregates with radius  ℓ ranging from 30 to 600 m; (2) the means and standard deviations of those fluxes and concentrations over all aggregates were computed; and (3) for each flux or metabolite concentration, the mean value was subtracted and the result was divided by the corresponding standard deviation.For example, average flux of the -th reaction in the ℓ-th group of aggregates is given by ̅ and the standardization of this within-aggregate flux is performed by

Statistics and reproducibility
The details about sample sizes, parameters and steps of statistical analysis are provided in relevant methods and results sections, figure legends, and tables where applicable.All statistical analysis is performed in Python.
population balance model are adapted from Wu et al. 14

Fig. 1
Fig. 1 An illustration of the proposed Bio-SoS for iPSC aggregate culture (Created with BioRender.com).(a) The sizes of cell aggregates grow over culture time.The PBM model was used to describe the dynamics of cell aggregation process and predict the aggregate size distribution.(b) Cell aggregates are spatially decomposed into equally sized small spherical shells.In each shell, the metabolic concentrations are assumed to be homogenous.RDM was used to characterize the intra-aggregate diffusion dynamics of nutrient and metabolite concentrations, showing that the nutrient concentrations increased as distance from the center increases, while the metabolic waste concentrations decreased from the center.(c) The SMN is built to describe the intracellular reactions to microenvironmental perturbations.Suppose cells residing in each spherical shell are homogeneous.The metabolic heterogeneity is characterized by the difference in fluxes between exterior cells and inner cells locating at different positions in aggregates.

Fig. 2 Fig. 3
Fig. 2 PBM and RDM models to predict iPSC aggregation process in stirred suspension bioreactor cultures.Panel (a) shows the experimental and the predicted aggregate size; Panels (b) shows representative experimental data and PBM predictions of size (radius) distribution for cell aggregates  % (, ) at 24, 48 and 72 hr.The dotted orange lines represent the experimental data from Wu et al. 14 , while the blue lines show the PBM size distributions.(c) Steady-state concentration profiles of Glucose, Lactate, and Alanine for iPSC aggregates with radii ranging from 60 m to 600 m, each with identical values for diffusion coefficients  & ' , porosity  = 0.27 and tortuosity  = 1.5 and initial intracellular metabolite concentrations.

7 4 .Fig. 4
Fig. 4 Glucose and lactate concentration profiles predicted by the Bio-SoS model in comparison with the literature data collected from stirred-tank suspension bioreactor.(a) Glucose (b) Lactate.The Bio-SoS model (red line) was trained only with the monolayer data for K3 iPSC described in Odenwelder et al. 11 .The experimental data shown (blue circle, Mean±SD) are for FSiPS cultured in aggregate and described in Kwok et al. 7 .Note the trends predicted by the Bio-SoS model, which was trained on monolayer data, match the observed trends of aggregate cultures.

Fig. 5
Fig. 5 Unhealthy cells as a function of aggregate radius and metabolite concentration.(a) Effect of the glucose concentration and aggregate radius on percentage of unhealthy cells at a fixed bulk lactate concentration of 0 mM.(b) Effect of the lactate concentration and aggregate radius on percentage of unhealthy cells at a fixed bulk glucose concentration of 20 mM.Results are averaged by 30 simulation runs.

Fig. 6
Fig. 6 Expected reaction flux rates and extracellular metabolite concentrations by aggregate size.(a) Standardized flux rates.(b) Standardized metabolite concentrations.Simulations were conducted with constant bulk glucose (25 mM), lactate (5 mM), and alanine (0.1 mM) concentrations.To facilitate visual comparison, all flux rates and concentrations were standardized by subtracting the mean and dividing the standard deviation calculated over all aggregates.Aggregates range from 30 m to 600 m.Results are averaged by 30 simulation runs.

Fig. 7 Fig. 7
Fig. 7 Metabolic heterogeneity of aggregates of varying sizes (60, 120, 240 and 360  in radius) at 24 hours and 48 hours.Specifically, the relative fluxes were compared for cells located at the center of the aggregates of various sizes, called the inner cells.The blue dashed lines represent relative fluxes of outer cells, which were consistent across all four aggregate sizes.The circle (purple) represents the relative flux of the inners cells for the 360 m aggregates; triangle (red) represents that for 240 m aggregate; star (green) represents that for 120  m aggregate; square (orange) represents that for 60  m aggregate; normalized to the fluxes of outer cells.Results are averaged by 30 simulation runs.

Fig. 8
Fig. 8 Biomass yield and variance decomposition analysis.In panel, we simulated the biomass production of different aggregates (ranging from 30 m to 300 m in radius) for a duration of one hour with 100 replicates.The recorded results in panels (a), (b) and (c) correspond to 24-hour, 48-hour and 72hour culture respectively.The blue line represents the mean, and the red line represents the RSD of the biomass (i.e.,  (&)*'++,ℓ,-with ℓ = 30,45, … ,300 from equation (11)).
fact that the collisions of these aggregates and the shear induced by micro-fluids with nonlinear velocity profile through a pore network within each cell cluster can lead to film drainage.It involves the draining of the film surrounding the interactive aggregates to permit actual coalescence of aggregates with size  and  / .The effects from other factors are lumped into the constant  of the model.In general, except aggregate size, the merge rate should be a function of other factors, such as agitation speed and medium compositions.The cell-to-cell and cell-to-medium interactions influence the aggregation process and the design of the kernel (⋅).This falls outside the scope of this paper but is a subject for future work.The kernel parameters (  = 1.26 × 10 #. hr -1 m #2/.,  $ = 1.94 × 10 #4 m #.' ,  = 8.06 × 10 #$ ) are adapted from Wu et al.14 .

6 $
5 , )( / , )( 5 | / ) / 6 − ^(, )( / , )(| / ) where the density function (, ) is defined such that (, ) is the fraction of aggregates with sizes between  and  +  at time  per unit volume of the culture.Let  8 denote the size of a single cell.The first and second terms on the right side of equation (2) coming from the Smoluchowski coagulation equation, accounting for random coalescence.The first term describes the formation of aggregates with size  / due to the aggregation of cell clusters with size  / >  8 and  5 =  −  / .The last two terms represent the "loss" of aggregate with size  due to their merging with those of size  / to form larger aggregates and due to cell proliferation /.Following Wu et al. 14 , the rate of aggregate size was modelled by the change / due to cell proliferation by Gompertz equation 42 :   =  9 ⋅ log M   N where  is the aggregate size reached as  → ∞ and  9 is a constant characterizing the cell proliferation.The Gompertz equation parameters ( = 9.71 × 10 : and  9 = 5.72 × 10 #. hr -1 ) are adapted from Wu et al. 14 .Based on the iPSC culture data utilized in this study, the shrinking of iPSC aggregates has not been observed.If the experimental observation indicates the cell death impact, our model can be extended to account for the shrinking of the iPSC aggregates due to the cell death.The transformation of the density function from aggregate size  to aggregate radius  can be done with the rule of transformation of random variables.For the strictly monotone transformation :  ↦  !" and its inverse  =  #$ () =  .the probability density function of the aggregate radius, denoted by  % (, ), is given by,  % (, ) = l #$ ()m n  #$ ()  n = 2( ., ) 0 .The first moments of the cell distribution yield the total aggregate size (accounting for the aggregate porosity ),  $ () = (1 − ) ∫ (, ) 7 6 $ subject to boundary conditions: ()  & (, ) =  & ()  () (0, )  = 0 where  & is the effective diffusion coefficient,  & () is the extra-aggregate environment conditions (e.g., the bulk concentrations of glucose and lactate in the bioreactor), and  is the aggregate size.Here the reaction rate  & is negative for nutrient consumption and positive for inhibitor formulation.The boundary condition (a) comes from the fact that the metabolite concentration on the surface of an aggregate equals to that measured in the bioreactor; and (b) is produced by the condition of spherical symmetry.

Equation ( 6 ) 8 d𝜏. Equation ( 9 )
vector whose -th component is  &U / representing the number of molecules of the -th metabolite produced by the -th reaction.By writing them in matrix form, i.e.,  = ( $ , … ,  W ) and  / = ( $ / , … ,  W / ), we define the stoichiometric matrix as  =  / − .Let  U ( •) be the flux rate at which the -th reaction occurs for an individual cell, that is, the propensity/intensity of the -th reaction as a function of the number of molecules of metabolites.Later on, we will show that the rate  U ( •) is a generalized flux rate, analogous to that of the metabolic flux analysis (MFA).Let () be the -dimensional vector whose -th component is  U (), representing the number of times the -th molecular reaction has occurred by time  in a single cell.Thus, at time , the profile of intracellular and extracellular metabolite molecules follows the dynamics, i.e.,  •() =  •(0) + - U ()l U / −  U m =  •(0) +  ⋅ () is a mass balance equation where  •() −  •(0) is the difference of number of molecules in time interval [0, ] and  ⋅ () is the net amount of reaction output by time .The number of occurrences of the  -th molecular reaction,  U ( + d) −  U () , during time interval [,  + d] is modelled as a Poisson random variable with mean (and variance)  U l •()md and  U () follows a nonhomogeneous Poisson process with the flux rate or molecule generation rate  U l •()m 17, 20, 46 .Based on the definition of Poisson process, during the time interval (,  + d], the probability that the th reaction occurs  times becomes 46 Pl U ( + d) −  U ( ) Define the expected accumulated occurrences of the -th molecular reaction by time t, i.e., Λ U () = ∫  U l •()m can be rewritten as  U () =  WΛ U ()X, where () is a unit (or rate one) Poisson process.Let () = WlΛ $ ()m, … , lΛ W ()mX, then equation (6) becomes,  •() =  •(0) + - WΛ U ()X l U / −  U m Mechanistic Model for iPSC Aggregate Culture By assembling the single-cell metabolic network with population balance and reaction-diffusion models, we developed the Bio-SoS characterizing the dynamics of iPSC aggregate culture with either homogeneous or non-homogeneous cell population.Since cells create the driving force for the dynamics of the culture process, we focus on modelling the evolution of cell metabolic dynamics and associated microenvironmental conditions.Let  denote the cell density, i.e., number of cells per unit of volume.Let  • ( () denote the number of molecules of metabolites within and "around" each -th cell with  = 1,2, … , .First, the homogeneous cell population is assumed.Let (t, ) = ∑  • ( () .e., the number of molecules of metabolites per unit of volume) in the system at time .Given a cell density () (i.e., the number of cells per unit of volume), the change in the number of metabolite molecules during the short time interval (,  + d] is obtained by summing the changes from each cell, i.e., Δ(, ) ≡ ( + Δ, ) − (, ) = -[ • ( ( + Δ) −  • ( ()] [0, ] into G equally spaced intervals [( − 1)Δ, Δ] with  = 1,2, … , , and numerically compute the metabolite concentrations (t) as () ≈ (0) + ---- W̅ U (ℓ,",(a#$)`-) &U ) represent the concentration change of metabolite  & due to the reactions occurred in cells located in the -th spherical shell of the ℓ-th aggregate during time interval (,  + d] .Then, the total variance of the metabolite  & in an individual aggregate, denoted by  &,ℓ,- 0 , can be expressed by

Table 5 . Comparison of reaction flux rates between outer and inner cells in aggregates with a radius of 60, 120,240 and 360 𝝁𝐦 at 24 hours.
. The diffusion coefficients are from multiple