# Investigating heterogeneities of live mesenchymal stromal cells using AI-based label-free imaging

## Abstract

Mesenchymal stromal cells (MSCs) are multipotent cells that have great potential for regenerative medicine, tissue repair, and immunotherapy. Unfortunately, the outcomes of MSC-based research and therapies can be highly inconsistent and difficult to reproduce, largely due to the inherently significant heterogeneity in MSCs, which has not been well investigated. To quantify cell heterogeneity, a standard approach is to measure marker expression on the protein level via immunochemistry assays. Performing such measurements non-invasively and at scale has remained challenging as conventional methods such as flow cytometry and immunofluorescence microscopy typically require cell fixation and laborious sample preparation. Here, we developed an artificial intelligence (AI)-based method that converts transmitted light microscopy images of MSCs into quantitative measurements of protein expression levels. By training a U-Net+ conditional generative adversarial network (cGAN) model that accurately (mean $$r_s$$ = 0.77) predicts expression of 8 MSC-specific markers, we showed that expression of surface markers provides a heterogeneity characterization that is complementary to conventional cell-level morphological analyses. Using this label-free imaging method, we also observed a multi-marker temporal-spatial fluctuation of protein distributions in live MSCs. These demonstrations suggest that our AI-based microscopy can be utilized to perform quantitative, non-invasive, single-cell, and multi-marker characterizations of heterogeneous live MSC culture. Our method provides a foundational step toward the instant integrative assessment of MSC properties, which is critical for high-throughput screening and quality control in cellular therapies.

## Introduction

Mesenchymal stromal cell (MSC) therapy offers a promising treatment option for inflammatory disorders1, immune-mediated diseases2, and neurological damage3. MSCs are popular for their easy in vitro proliferation4 and ability to differentiate into various mesenchymal tissues4,5. MSC tissue culture has been shown to modulate immune responses for autoimmune diseases, sepsis, and transplant surgery6,7,8,9. The efficacy of these critical therapeutic functions is difficult to predict due to the inherent MSC heterogeneity, arising from variations between MSC donors, batches, and clones10,11,12,13. Understanding and controlling this notorious cellular heterogeneity in vitro has been a central challenge in both basic and translational research14,15. Unfortunately, it has been difficult to precisely define and measure such heterogeneity in live MSCs, since current MSC characterization methods are relatively non-specific, time-consuming, or invasive16,17.

Traditional MSC characterization methods have focused on analyzing either the morphological phenotypes18,19,20,21,22,23 or expression of surface markers24,25,26,27. While measuring cellular morphology allows for non-invasive assessment of live cells, no strong scientific link has been established between cellular morphology and MSC characteristics. In contrast, molecular-based gene expression measurements, which primarily rely on immunochemistry assays for quantifying protein expression, provide more rigorous analysis of MSC characteristics28. The downside of these assays is that they usually require laborious sample preparation, damaging cell fixation, and repetitive experimental procedures for multiple marker analysis29,30. Collectively, the inability to perform non-invasive, instant, and biologically rigorous characterizations substantially limits the therapeutic potential of MSCs.

To address these technical challenges, we developed an AI-based label-free imaging platform for studying the characteristic heterogeneity in live MSCs. AI-based image translation approaches have been proven to be useful tools for visualizing 3D organelle structures30, distinguishing cell types and states31,32, and improving image quality33,34,35,36, segmentation37, and restoration38.

In this work, we developed a deep convolutional neural network (CNN) to predict immunofluorescent-like images using transmitted light microscopy data. This non-invasive imaging method allows us to directly observe gene expression on the protein level for multiple markers simultaneously in live MSCs. We demonstrated that the AI-translated immunofluorescence images showed a high degree of similarity to the ground truth and can be applied to quantitative studies. Using the results of our trained model, we combined gene expression levels of 8 common MSC markers with the measurement of 12 morphological features to draw deeper conclusions about MSC heterogeneity. From clustering and PCA analysis we found that while both morphology and gene expression can be used to effectively identify MSC heterogeneities, they provide contrasting assessments on cell characteristics. Lastly, utilizing the AI-predicted images, we performed further analyses to profile gene expression fluctuations and characterize protein localizations on the sub-cellular level. Our model has potential to advance current methods for assessing cellular heterogeneity and can be broadly applied in clinical and therapeutic applications.

## Results

### Machine learning model development

Our machine learning (ML) model pairs two convolutional neural networks (CNNs), one a generator and the other a discriminator (Fig. 1a). The generator network, based on a U-Net architecture (Fig. S1)30,39, learns the nonlinear relationship between the phase contrast image and its corresponding fluorescent target. During training, the neural network minimizes the loss function that quantifies the pixel-to-pixel differences between the predicted and target image. Here, the target is a fluorescent image of cultured MSCs at passage 3 that are immunofluorescently labeled (“Methods”). After propagating through the U-Net, the resultant image is loaded into the discriminator network, developed using a conditional generative adversarial network (cGAN) (Fig. S1)40, which evaluates the probability of similarity between prediction and target. During training, the discriminator output, an adaptive loss function, is iterated over a set number of cycles through the model to optimize the prediction. Following the completion of the iteration process, the resultant trained model was used for predicting virtual fluorescent MSC labels from unused phase contrast data.

We found that the trained AI model is able to predict images that closely recapitulate many features found in the target on multiple length scales. An example of model prediction is shown in Fig. 1b where the target is a fluorescent image of MSCs stained for CD105, a membrane glycoprotein that has been commonly used as a positive MSC marker. As demonstrated in the figure, the prediction captured the overall intensity distribution, cell morphology, and sub-cellular structures including protein localization and nucleus shape.

One advantage of our machine learning approach is the individual training process for each marker, allowing us to avoid emission channel cross-contamination in multi-color imaging and improve the fluorescent signal specificity. Moreover, since individual models can be directly combined upon completion of training, there is no limit on the number of markers that can be predicted simultaneously from one phase contrast input. Applying this feature, we trained models for a panel of 8 MSC markers, which were strategically selected to cover a wide range of MSC properties; the markers included CD105, CD90, and CD73 to define MSC subpopulations41, CD29 to implicate cell migration42, CD146 to show vascular smooth muscle commitment43, CD106 (VCAM1)44 and STRO-145 to illustrate MSC immunomodulation capacity, and CD44 to designate the cell-cell and cell-matrix interactions46. We show the combined predicted image composite in Fig. 1c, in which each marker exhibits distinct distributions and local enrichments within the cells. This approach can also be used to process stitched tile images (Fig. S2).

To evaluate the prediction accuracy of our model, we calculated the pixel-level Pearson correlation coefficient $$r_s$$ (“Methods”) between the prediction and target images. Overall, we observed fairly high accuracy in all tested markers with an $$r_s$$ average of $$\sim$$ 0.77 (Fig. 1d). However, the actual prediction accuracy depends on the specific marker of interest. For example, the surface proteins that show a more uniform distribution (e.g., CD105 and CD44) exhibit higher values of $$r_s$$ than the markers that show more protein localization (e.g., CD73 and CD29). By comparing our results with the data from a generator-only model (i.e., U-Net only), we found that the addition of a discriminator CNN improved both the prediction accuracy and robustness (Fig. S3) for most tested markers. Such improvements were also found in the Laplacian Pearson correlation analysis $$r_{lap}$$ (Fig. S4, “Methods”) that compares the Laplacian fields of prediction and target images (Fig. 1e). Here, the real-space analysis $$r_s$$ focuses on the overall intensity distribution, whereas the Laplacian-space analysis $$r_{lap}$$ highlights the details of signal variation. This prediction accuracy evaluation was repeated using a different measurement metric (absolute error)47 and consistent results were obtained (Fig. S5).
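The distinction between the real-space and Laplacian-space metrics can be illustrated with a minimal sketch. It assumes NumPy and uses a simple 5-point discrete Laplacian as a stand-in; the kernel used in this work is larger (see "Methods"), and the synthetic images and function names below are hypothetical.

```python
import numpy as np

def pearson(a, b):
    """Pixel-level Pearson correlation between two equally shaped images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def laplacian(img):
    """5-point discrete Laplacian on interior pixels (illustrative stand-in kernel)."""
    img = np.asarray(img, dtype=float)
    return (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
            - 4.0 * img[1:-1, 1:-1])

def laplacian_pearson(a, b):
    """Pearson correlation between the Laplacian fields of two images."""
    return pearson(laplacian(a), laplacian(b))

# Synthetic "target": a smooth blob plus fine-scale texture (random noise here).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
blob = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / 200.0)
target = blob + 0.05 * rng.standard_normal((64, 64))

# Synthetic "prediction": captures the blob but none of the fine texture.
prediction = blob

r_s = pearson(prediction, target)
r_lap = laplacian_pearson(prediction, target)
print(f"r_s = {r_s:.3f}, r_lap = {r_lap:.3f}")
```

In this toy case the real-space value stays high because the overall intensity distribution matches, while the Laplacian-space value drops because the local variation is not reproduced.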

### Characterizations of predicted images

To characterize the robustness of our ML model, we studied how training data properties including dataset size, signal-to-noise ratio (SNR) of the target images, and the presence of impurities, influence the prediction accuracy. For simplicity, we focused on CD105 as it shows the highest Pearson correlation coefficient, allowing us to perturb the training data systematically.

First, we analyzed differences in prediction accuracy by modulating the number of training images. We found that with only 20 training images, relatively accurate predictions can readily be achieved by the ML model (Fig. 2a). Nevertheless, more training images still improved the prediction outcome, particularly for the local intensity variation, as shown in the rightmost column of Fig. 2a. This observation is also illustrated by the analyses of $$r_s$$ (Fig. 2b) and $$r_{lap}$$ (Fig. 2c).

We also tested the sensitivity of our ML model to input images with varying signal-to-noise ratios (SNR). In doing this, we lowered the microscope illumination power stepwise and repeated the data acquisition and training process for nine different intensities (Fig. S6). This experiment aimed to simulate the imaging of fluorophores with different emission intensities. Within the SNR range explored, we found that our ML model is able to reliably reproduce the input images without overfitting (Fig. 2d). We also found that our model smoothed the intensity distribution in low SNR images and effectively denoised the data, consistent with previous findings40,48. To summarize the relationship between the prediction accuracy and image brightness, we plotted $$r_s$$ (Fig. 2e) and $$r_{lap}$$ (Fig. 2f) as a function of SNR. We also added a theoretical upper bound $$C_{max}$$ that assumes the pixel-level noise is uncorrelated for comparison (“Methods”). We found that $$r_s (SNR)$$ qualitatively followed the trend of $$C_{max}(SNR)$$, maintaining a high value $$>0.8$$ throughout all tested SNRs (Fig. 2e). However, $$r_{lap}$$ showed a more significant dependence on SNR, illustrated by its nearly halved value at SNR $$=8$$ (Fig. 2f). We further identified that this more pronounced decay is attributed to both the weakly-correlated noise fluctuation and the less-accurate prediction of local intensity (Fig. S7).
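For reference, the upper bound $$C_{max}=\sqrt{SNR/(1+SNR)}$$ given in "Methods" can be tabulated directly. The sketch below is illustrative only; the SNR values in the loop are hypothetical and are not the nine illumination settings used in the experiment.

```python
import math

def c_max(snr):
    """Upper bound on the target-prediction correlation for uncorrelated pixel noise."""
    if snr < 0:
        raise ValueError("SNR must be non-negative")
    return math.sqrt(snr / (1.0 + snr))

# Hypothetical SNR values chosen only to show the shape of the curve.
for snr in (1, 2, 4, 8, 16, 32):
    print(f"SNR = {snr:>2}: C_max = {c_max(snr):.3f}")
```

The bound rises quickly and then saturates toward 1, which is why a real-space correlation can remain high across a wide range of SNR values.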

Lastly, we evaluated the impact of impurities in the images (e.g., microscope slide dust, fluorescent speckles, non-specific binding) on the CNN training (Fig. S8). We tested three cases, in which the training images, the test images (used for making new predictions), or both contained impurities. Our results suggested that the training set quality has a greater impact on the prediction accuracy than the test set quality (Fig. S8). This suggests that careful sample preparation and quality control of the training set play a crucial role in achieving optimal prediction outcomes. In addition, we found that while impurities in the phase contrast images can propagate through to the prediction, most of the fluorescent speckles that only appear in the target images can be suppressed through ML training (Fig. S8a).

### Multi-marker heterogeneity

Our AI-based imaging approach opens up the possibility to perform a multi-marker characterization of MSCs in situ, which has been suggested to provide a more accurate description of the cell state27,49. Compared to multi-color fluorescence-activated cell sorting (FACS), which has been commonly used for similar subpopulation analyses, our method is free from spectral overlaps and does not require complex compensation procedures50. Specifically, since each model (marker) was trained with only one fluorescent channel at a time, the final multi-color composite is simply a combination of predictions from individually trained models. This training process is free from the influence of protein colocalization, and, theoretically, allows us to include unlimited markers for characterizing cells. We performed a single-cell analysis (Fig. 3a, Fig. S9) of 500 cells where we measured the overall fluorescent intensity over individual cells for all eight markers (Fig. 3b). This microscopy-based quantitative measurement has been extensively validated and conducted for evaluating gene expression51,52,53. After calculating the cell-level overall pixel intensity, we plotted the prediction value versus the target value and determined the corresponding correlation coefficient (Fig. S10). Consistent with the pixel-to-pixel comparison (Fig. 1d), we also observed high prediction-target correlations for all tested markers.

In addition to quantifying the gene expression level, our approach also allowed us to simultaneously characterize the MSC phenotypic morphology (Fig. S11, “Methods”), as it has been shown that non-invasive microscopy can potentially be used to predict the stem cell fate of human MSCs for clinical applications based on morphology19,55. Such integrative gene-morphology datasets then enabled us to directly elucidate the relationship between such essential cell properties. To visualize the correlation between the tested 20 variables (8 MSC markers and 12 morphological features), we performed an unsupervised hierarchical clustering analysis and created the resulting heatmap shown in Fig. 3c. The clustering indicated that while both MSC markers and cell morphology were able to separate cell subpopulations, their identified heterogeneities exhibited distinct patterns. Specifically, among all tested morphological features, we found that the cell aspect ratio and nucleus-to-cell ratio have higher correlations to MSC markers as they clustered together far from the rest of the morphological features. In contrast, nucleus area, cell total area, and perimeter show a clustering result that is less correlated with all other properties explored. Moreover, our data suggest that MSC markers provide additional orthogonal assessments on cell behavior that may not be fully revealed using the morphology analysis alone.

To further validate this finding, we conducted a Principal Component Analysis (PCA) to examine the correlation between MSC marker expression and cell morphology (Fig. 3d,e). In addition, we used a previous morphology-based categorization protocol54 to divide the cells into three subgroups: elongated and spindle-shaped cells (SS, blue); small, triangular or star-shaped cells (RS, green); and large, flattened cells with prominent nuclei (FC, red). This additional characterization data was incorporated into our PCA gene-morphology plot to validate previous findings. In the PCA biplot, where only the MSC markers were used (Fig. 3d), we did not observe any segregation of the data points. As expected, after including the morphological features in the PCA (Fig. 3e), these previously categorized subpopulations showed indications of separation. This incomplete segregation may be attributed to human errors in classifying the cells based on visual grouping.
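The PCA step can be sketched as follows. The analysis in this work was performed in OriginPro (see "Methods"); the NumPy version below is only an assumed equivalent using standardized variables and a singular value decomposition, and the random data matrix is a hypothetical placeholder for the 500-cell, 20-variable dataset.

```python
import numpy as np

def pca(features, n_components=2):
    """PCA on standardized variables via SVD.

    features: array of shape (n_cells, n_variables).
    Returns (scores, loadings, explained_variance_ratio).
    """
    x = np.asarray(features, dtype=float)
    # Standardize so markers and morphological features share a common scale.
    x = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # cell coordinates in the biplot
    loadings = vt[:n_components].T                    # variable directions in the biplot
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]

# Hypothetical placeholder data: 500 cells x 20 variables (8 markers + 12 features).
rng = np.random.default_rng(0)
data = rng.standard_normal((500, 20))
scores, loadings, explained = pca(data, n_components=2)
print(scores.shape, loadings.shape, explained)
```

Running the same function on the marker columns alone and then on all 20 columns would give the two kinds of biplots compared in the text.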

Our observed low correlation between surface marker expression and morphology is consistent with previous studies19,56,57,58,59. For example, it has been shown that expression of MSC markers does not directly report the proliferation capacity and senescence of MSCs56. Also, recent experiments have used the morphological phenotype to better quantitatively predict the MSC immunomodulation potency19,57 and differentiation potential58,59. In fact, many MSC markers were originally designed to exclude hematopoietic cells that are usually co-harvested during isolation. While they can effectively validate the identity and origin of MSCs27,60,61, they may be insufficient to describe the cell state and function. Collectively, our gene-morphology data suggested that combining surface protein marker expression with morphological features could provide a more complete and specific evaluation of MSC heterogeneity. This finding highlights another important advantage of conducting cell assessment using our non-invasive AI-based imaging method.

### Realtime gene expression measurement

Realtime gene expression assessment is crucial to evaluate MSC culture quality for basic research, therapeutics, and drug screening12. Our AI-based microscopy allows us to perform instantaneous gene expression measurements on both the cellular and subcellular levels. Demonstrating this technique, we acquired a phase-contrast time-lapse video of live MSCs over 48 hours and generated the corresponding AI-predicted fluorescent videos for 4 MSC markers, CD105, CD29, CD44, and STRO-1 (Fig. 4a, Videos S1, S2). We validated the usage of ML models of fixed samples for predicting live cells by testing the influence of the fixative on the cell morphology. We found that no significant morphological change could be observed (Fig. S12). As shown in the predicted video, we observed significant fluctuations in the gene expression level, reminiscent of the stochastic gene expression findings in other systems62,63. We analyzed and plotted the evolution of the overall expression level of a single cell in Fig. 4b. To validate the observed fluctuation, we highlighted the 60% confidence interval of the predicted value with gray bands (Fig. 4b), which was estimated by analyzing the distribution statistics of the prediction-target error, see Fig. S13. This prediction uncertainty was found to be $$<2$$% for all markers (Fig. S13c).
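One plausible way to turn a prediction-target error distribution into a confidence band is sketched below. This is an assumption for illustration, not the procedure documented in Fig. S13: it takes the relative error on validation cells, reads off the central 60% of that distribution, and applies those quantiles to a new predicted value. All names and numbers are hypothetical.

```python
import numpy as np

def relative_error_quantiles(pred, target, coverage=0.60):
    """Quantiles of the relative error (target - pred) / pred covering `coverage`."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    rel_err = (target - pred) / pred
    tail = 100.0 * (1.0 - coverage) / 2.0
    q_lo, q_hi = np.percentile(rel_err, [tail, 100.0 - tail])
    return float(q_lo), float(q_hi)

def confidence_band(new_pred, q_lo, q_hi):
    """Lower and upper band edges for new predicted expression levels."""
    new_pred = np.asarray(new_pred, dtype=float)
    return new_pred * (1.0 + q_lo), new_pred * (1.0 + q_hi)

# Hypothetical validation data: predictions of 100 with targets spread by +/- 5%.
val_pred = np.full(101, 100.0)
val_target = 100.0 * (1.0 + np.linspace(-0.05, 0.05, 101))
q_lo, q_hi = relative_error_quantiles(val_pred, val_target, coverage=0.60)
lower, upper = confidence_band(100.0, q_lo, q_hi)
print(q_lo, q_hi, lower, upper)
```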

To understand the relationships of the fluctuations between MSC markers, we analyzed two types of marker cross-correlations. The first analysis averaged the overall cross-correlation values over different cells for a single time point. This calculation was performed using the dataset shown in Fig. 3. The second analysis, in contrast, averaged the marker cross-correlation over the entire temporal fluctuation for a single cell (Fig. 4b). By comparing these analyses (Fig. 4c), we found that the fluctuation correlation (averaged over time) is significantly lower than the overall correlation (averaged over cells) for all tested markers. This reduced correlation value indicated that the observed temporal fluctuations are strongly influenced by uncorrelated noise, consistent with previous stochastic gene expression experiments64,65,66.
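The contrast between the two averages can be reproduced with synthetic data. The sketch below assumes NumPy and a deliberately simple toy model (a shared per-cell factor across cells, independent noise over time); it is not the analysis code used for Fig. 4c, and the noise amplitudes are hypothetical. The frame count follows from imaging every 2 min for 48 h.

```python
import numpy as np

def mean_pairwise_correlation(data):
    """Mean off-diagonal Pearson correlation between columns (markers).

    data: array of shape (n_samples, n_markers); samples are either cells
    at one time point or time points of one cell.
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(off_diagonal.mean())

rng = np.random.default_rng(1)
n_cells, n_frames, n_markers = 500, 1440, 4   # 1440 frames = 48 h at 2 min

# Across cells: a shared per-cell factor scales all markers together.
cell_factor = rng.lognormal(mean=0.0, sigma=0.5, size=(n_cells, 1))
across_cells = cell_factor * (1.0 + 0.1 * rng.standard_normal((n_cells, n_markers)))

# Over time in one cell: each marker fluctuates with independent noise.
over_time = 1.0 + 0.1 * rng.standard_normal((n_frames, n_markers))

overall = mean_pairwise_correlation(across_cells)
fluctuation = mean_pairwise_correlation(over_time)
print(f"overall (over cells) = {overall:.2f}, fluctuation (over time) = {fluctuation:.2f}")
```

In this toy model the over-cells correlation is high and the over-time correlation is near zero, which is the signature expected when temporal fluctuations are dominated by uncorrelated noise.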

While these fluctuations are relatively independent, we found that all markers exhibited a similar autocorrelation decay trend (Fig. 4d). Here, the autocorrelation function characterizes the persistency of a temporal fluctuation. By fitting an exponential decay to the head of the curve, we found that all four markers displayed a correlation time around 40 minutes (inset of Fig. 4d). The minor anticorrelation in Fig. 4d may be attributed to the finite measurement duration50.
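The correlation-time estimate can be sketched as follows. The autocorrelation in this work was computed in Excel (see "Methods"); the NumPy version below is an assumed equivalent, with the exponential fit done as a straight-line fit to the logarithm of the first few lags. The simulated trace, the number of lags in the head, and the function names are hypothetical.

```python
import numpy as np

def autocorrelation(signal, max_lag):
    """Normalized autocorrelation of a mean-subtracted trace for lags 0..max_lag."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    norm = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / norm for k in range(max_lag + 1)])

def correlation_time(acf, dt, n_head):
    """Fit acf ~ exp(-t / tau) to the first n_head lags and return tau."""
    head = np.asarray(acf[:n_head], dtype=float)
    if np.any(head <= 0):
        raise ValueError("head of the autocorrelation must be positive for a log fit")
    lags = np.arange(n_head) * dt
    slope = np.polyfit(lags, np.log(head), 1)[0]
    return -1.0 / slope

# Hypothetical trace: first-order autoregressive noise with a 40 min correlation
# time, sampled every 2 min for 48 h (1440 frames).
rng = np.random.default_rng(2)
dt, tau_true, n_frames = 2.0, 40.0, 1440
phi = np.exp(-dt / tau_true)
noise = rng.standard_normal(n_frames)
trace = np.empty(n_frames)
trace[0] = noise[0]
for t in range(1, n_frames):
    trace[t] = phi * trace[t - 1] + np.sqrt(1.0 - phi ** 2) * noise[t]

acf = autocorrelation(trace, max_lag=60)
tau_est = correlation_time(acf, dt=dt, n_head=10)
print(f"estimated correlation time: {tau_est:.1f} min")
```

Because a 48 h recording contains only a few dozen correlation times, the estimate from a single trace is noisy, and a finite recording can also produce small negative values in the tail of the curve.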

Lastly, we analyzed the intracellular heterogeneity using the acquired multi-marker time-lapse data (Video S2). We calculated a projected intracellular heterogeneity by averaging the gene expression field over the vertical axis and determined the 1-dimensional profiles of gene expression level across the cell (Fig. S14). This calculation was then repeated for each time point, in which the final spatial-temporal matrices are illustrated using heatmaps shown in Fig. 5a–c. The heatmaps illustrated clear subcellular differential expression between all markers (CD105, CD29, STRO-1, CD44) throughout the cell and over time. CD105 (Fig. 5a) and STRO-1 (Fig. 5b) showed the strongest expression difference. This differential expression was nearly 50% of the analyzed gene expression fluctuations (Fig. 5c). The distinct fluctuation frequency and persistency of these protein distributions were shown by the 2D Fast Fourier Transform (FFT) plots (Fig. 5d), in which CD105 displayed a temporal (red arrows) and spatial (black arrows) fluctuation that is more significant than that for STRO-1, CD29, and CD44. Our findings collectively indicated that the characteristics of the intracellular heterogeneity, which can be evaluated by the protein localization, are strongly gene dependent. While it would be interesting to further validate these findings using fluorescent reporters, creating such a multi-marker reporter line is challenging and beyond the scope of this work. Nevertheless, all of our analyses mainly focused on the statistical characteristics of the spatial-temporal gene expression fluctuation (e.g., correlation time and 2D FFT), minimizing the impact of potential ML prediction errors on the findings.
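The projection and spectral steps described above can be sketched with NumPy. The FFT in this work was computed with the Fiji FFT command on gray-scale heatmaps (see "Methods"), so the code below is an assumed stand-in rather than the original pipeline; the synthetic travelling-wave frames and all function names are hypothetical.

```python
import numpy as np

def projected_profile(frame):
    """Average a 2D expression field over the vertical axis to get a 1D profile."""
    return np.asarray(frame, dtype=float).mean(axis=0)

def spatiotemporal_matrix(frames):
    """Stack the per-frame profiles into a (time, position) matrix."""
    return np.stack([projected_profile(f) for f in frames])

def power_spectrum_2d(matrix):
    """Centered 2D power spectrum of the mean-subtracted matrix."""
    m = np.asarray(matrix, dtype=float)
    m = m - m.mean()
    return np.abs(np.fft.fftshift(np.fft.fft2(m))) ** 2

# Hypothetical frames: a pattern with 3 spatial cycles across the cell that
# completes 4 temporal cycles over the recording; uniform along the vertical axis.
n_t, n_y, n_x = 64, 16, 32
t = np.arange(n_t)[:, None, None]
x = np.arange(n_x)[None, None, :]
phase = 2.0 * np.pi * (3.0 * x / n_x + 4.0 * t / n_t)
frames = 1.0 + 0.5 * np.sin(phase) + np.zeros((1, n_y, 1))

heatmap = spatiotemporal_matrix(frames)
spectrum = power_spectrum_2d(heatmap)

# After fftshift, zero frequency sits at index n // 2 along each axis.
row, col = np.unravel_index(np.argmax(spectrum), spectrum.shape)
print("temporal frequency index:", abs(int(row) - n_t // 2))
print("spatial frequency index:", abs(int(col) - n_x // 2))
```

Peaks displaced along the time axis of the spectrum indicate temporal fluctuation, and peaks displaced along the position axis indicate spatial fluctuation.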

## Discussion

Overall, we demonstrated that AI-based label-free microscopy offers a powerful experimental platform for conducting non-invasive, quantitative, and multi-marker characterizations of MSC heterogeneity. With this tool, we found that the MSC morphology and conventional surface markers provide complementary assessments of the cell characteristics. On the cellular level, we showed that the stochastic gene expression fluctuations for all tested markers shared a similar correlation timescale. On the subcellular level, we observed and quantified the spatial-temporal variation of surface protein distribution. As a proof of concept, this work only focused on the MSC heterogeneity within a single-donor culture for similar passage numbers and seeding density. In the future, it would be informative to utilize a similar approach to investigate the heterogeneity associated with cell passage number, senescent state67, donor variation, and media composition, factors that have been shown to have significant impacts on MSC marker expression. Specifically, studying the relationship between MSC heterogeneity and replicative senescence would be particularly interesting, as it has been shown to be profoundly related to the characteristics, proliferation capacity, and multilineage potency of MSCs68,69,70. Furthermore, in addition to the tested classical MSC markers, several genes (e.g., indoleamine 2,3-dioxygenase (IDO)71, alkaline phosphatase (ALP) and CD27172) have been recently shown to relate closely with MSC functions, providing an interesting opportunity for future study.

While AI-based microscopy presents many advantages over standard immunochemistry assays, many challenges persist. For example, it remains unknown whether the distribution of highly localized proteins (e.g., vesicle structures) or RNAs (e.g., telomeres), which are strictly invisible in the phase contrast images, can be accurately predicted by the ML models. Also, the black-box nature of CNNs provides limited information on how the model extracts transmitted light image features and translates them into fluorescent signals31,73. Furthermore, to apply this approach to high-throughput screening, it is crucial to develop reliable and automatic cell segmentation tools74.

Nevertheless, the current imaging platform can already enable many exciting applications in both basic and translational studies of MSCs. For instance, by studying the gene expression fluctuation during cell division, it is possible to address the clonal heterogeneity origin, which has been hypothesised to be associated with asymmetric partitioning of MSCs75. By introducing relevant receptor, pathway signaling, and oxidative stress genes, our method will also advance the specificity, readout dynamics, and throughput of toxicity and pharmacokinetic studies. Broadly, the ability to instantly quantify MSC characteristics paves the way for controlling cell source quality, optimizing culture conditions, and engineering specific cell functions, all critical steps for advancing current MSC-based cell therapies.

## Methods

### MSC culture

Human bone marrow-derived MSCs (ATCC, PCS-500-012) were cultured according to the manufacturer's instructions and previously published protocols76,77. In brief, after the MSCs were thawed, they were seeded into tissue culture flasks at a density of 5000 cells/cm$$^{2}$$ with the culture media comprising DMEM (Gibco, 1 g/mL glucose, 500 mL), 10% fetal bovine serum (Gibco), and 1% Penicillin/Streptomycin (Gibco). The MSC culture media was then replaced every 24–48 h. Subculture of MSCs was performed at $$\sim 80$$% confluency, in which the cells were washed with 1$$\times$$ PBS −/− twice, and incubated with 0.5% Trypsin-EDTA at 37 $$^\circ$$C for cell detachment. The dissociated cells were then centrifuged at 250 g for 3 min, resuspended in warmed culture media and reseeded at a seeding density of 5000 cells/cm$$^{2}$$. For experiments, we seeded passage-3 MSCs into 2-well Ibidi slides (Ibidi, 80296) at a seeding density of 10,000 cells/cm$$^{2}$$. The experiment was carried out at least 24 h after seeding to ensure cell attachment.

### Sample preparation and immunostaining

To fix and immunostain the MSC samples, each Ibidi slide was first washed with PBS +/+. 4% PFA (Thermo Fisher Scientific, 28908) in 1$$\times$$ PBS +/+ (Gibco) was used as the fixative. After 5–10 min of incubation, the slides were washed with PBS. For staining, the MSCs were first blocked using a mixture of 2% donkey serum (Sigma-Aldrich, D9663-10ML) and 0.5% Triton X-100 (Sigma-Aldrich, T8787-50ML) for 30 min. After blocking, the slides were washed with PBS twice, and then incubated with the primary staining solution (0.5% BSA, 0.25% Triton X-100, and the primary antibody, see Supplementary Table S1). The slides were left in the staining solution for 30 min and then washed twice with 1$$\times$$ PBS. Afterwards, the secondary staining solution (with NucBlue and the secondary antibody) was added for 30 min. We then washed the samples twice with PBS and added 0.1% Tween 20 (Sigma-Aldrich, P9416-50ML) for storage.

### Transmitted light microscopy and fluorescent imaging

All stained MSC samples were imaged using an inverted microscope (Etaluma LS720, Lumaview 720/600-Series software) with a 20$$\times$$ phase contrast objective (Olympus, LCACHN 20XIPC) that allowed the acquisition of both phase contrast and fluorescence images. Approximately 600–900 images for each channel (i.e., phase contrast, 405 nm, 488 nm, and 594 nm) were obtained with a field of view of $$\sim 380\;\upmu$$m $$\times$$ $$380\;\upmu$$m. To conduct the time-lapse experiment, the same microscope and software were used in an incubator to image MSCs every 2 min over a period of 48 h (Videos S1, S2). While all of our selected antibodies have been previously validated, we also examined the non-specific binding by measuring the fluorescent intensity in samples that were only stained with secondary antibodies. Prior to further analyses, the background of the fluorescent data was evaluated and subtracted, following typical quantitative immunofluorescent microscopy procedures.

### CNN model training and characterization

Our model, consisting of a generator and discriminator pair, was adapted from a previous image-to-image translation work40. Specifically, we constructed our machine learning code using a Python deep learning library, PyTorch, ensuring that our AI platform is compatible with different operating systems. The architecture of the generator was the standard U-Net that has been widely used for image segmentation. In this work, it was modified as an image translator that transformed the phase contrast images into fluorescent-like data. In particular, the U-Net model used here contained short-cut connections between hidden features at the same level, effectively combining the fine-grained input details with the high-level semantics of shapes. The adapted discriminator was a multilayer CNN that used the concatenation of signal and target images as input, and generated a tensor containing the information for judging the prediction-target similarity. Instead of adopting the typical approaches to solving the classification problem in GAN, we utilized the least square loss, which has been shown to be a more stable function78. Different CNNs were trained to predict the fluorescent images for individual markers, with the training data consisting of approximately 600–1500 pairs of phase contrast (input) and immunofluorescent (target) images.
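The least square objective can be written out in a few lines. The sketch below uses NumPy arrays in place of PyTorch tensors so that it runs without a deep learning framework, and it is not taken from the released training code. Two choices are assumptions: the pixel-to-pixel term is written as a mean absolute error, and the weight `l1_weight=100.0` is a hypothetical default borrowed from common image-to-image translation practice.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Least-squares discriminator loss: push real outputs to 1 and fake outputs to 0."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def generator_loss(d_fake, prediction, target, l1_weight=100.0):
    """Least-squares adversarial term plus an assumed pixel-wise L1 term."""
    d_fake = np.asarray(d_fake, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    adversarial = 0.5 * np.mean((d_fake - 1.0) ** 2)   # generator wants D(fake) -> 1
    pixel = np.mean(np.abs(prediction - target))        # pixel-to-pixel difference
    return adversarial + l1_weight * pixel

# Hypothetical discriminator outputs (one score per image patch) and images.
rng = np.random.default_rng(3)
target_img = rng.random((8, 8))
predicted_img = target_img + 0.01
d_on_real = np.full((4, 4), 0.9)
d_on_fake = np.full((4, 4), 0.2)
print(discriminator_loss(d_on_real, d_on_fake))
print(generator_loss(d_on_fake, predicted_img, target_img))
```

Unlike the cross-entropy form, the squared penalty keeps a nonzero gradient for samples that are classified correctly but lie far from the decision value.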

To quantify the prediction accuracy of the CNN, the Pearson correlation coefficient between the target x and prediction y pixel intensities was evaluated as $$\sum _{i=1}^{n}(x_i-{\bar{x}})(y_i-{\bar{y}})/ \left[ \sum _{i=1}^{n}(x_i-{\bar{x}})^2\sum _{i=1}^{n}(y_i-{\bar{y}})^2 \right] ^{1/2}$$. This calculation was applied to both the images and their corresponding Laplacian fields (kernel size = 31) in Python to obtain $$r_s$$ and $$r_{lap}$$, respectively (Figs. 1d,e, 2b,c,e,f). The $$C_{max}$$ curve shown in Fig. 2e was evaluated using $$C_{\max }=\frac{{\mathbb {E}}[Y\cdot {\hat{Y}}]}{\sqrt{{\mathbb {E}}[Y^2]{\mathbb {E}}[{\hat{Y}}^2]}}=\sqrt{\frac{SNR}{1+SNR}}$$ with $${\text {SNR}}={\sigma ^2_x}/{\sigma ^2_{\epsilon }}$$ , which determined the upper bound of the target-prediction correlation where the signal noise is uncorrelated. To calculate the SNR values of the fluorescent images used for Fig. 2d–f, we used the Root Mean Square (RMS) method with the following formula: $${\text {SNR}}={(S_{signal}-S_{noise})}/{N_{noise}}$$. The mean ($$S_{signal}$$ and $$S_{noise}$$) and standard deviation ($$N_{noise}$$) of pixel intensities were measured using Fiji ImageJ79.
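The SNR definition above can be expressed as a short function. In this work the means and standard deviation were measured in Fiji ImageJ; the NumPy sketch below is an assumed equivalent in which the signal and background regions are passed as boolean masks. The mask-based interface and the toy image are hypothetical.

```python
import numpy as np

def snr_rms(image, signal_mask, background_mask):
    """SNR = (S_signal - S_noise) / N_noise.

    S_signal: mean intensity inside the signal region.
    S_noise:  mean intensity inside the background region.
    N_noise:  standard deviation of the background region.
    """
    image = np.asarray(image, dtype=float)
    s_signal = image[signal_mask].mean()
    s_noise = image[background_mask].mean()
    n_noise = image[background_mask].std()
    return float((s_signal - s_noise) / n_noise)

# Toy image: background alternates between 9 and 11 (mean 10, std 1),
# and the signal region has a uniform intensity of 18.
toy = np.empty((4, 4))
toy[:, 0] = 9.0
toy[:, 1] = 11.0
toy[:, 2:] = 18.0
background = np.zeros((4, 4), dtype=bool)
background[:, :2] = True
signal = ~background
print(snr_rms(toy, signal, background))
```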

### Single cell measurement

We used the Fiji ImageJ79 polygon selection tool to manually outline 100 cells for both the immunofluorescence images and the corresponding ML-predicted images. The outlined cells were then analyzed to evaluate the overall pixel intensity and the prediction accuracy (Fig. S9). We then repeated this outlining procedure to obtain 500 cell data points, which were used for the multi-marker analysis and morphology measurements (Fig. 3). PCA (Fig. 3d,e) was performed using OriginLab (OriginPro, Version 2019. OriginLab Corporation, Northampton, MA, USA). The unsupervised clustering analysis (Fig. 3c) was performed using Python 3.8 (seaborn). Figures 3b and 4b,d were generated using OriginLab; for Fig. 4d, the autocorrelation function of Excel (RealStatistics package) was used. The heatmaps displayed in Fig. 5a–c were obtained using Python 3.8 as described in the Supplementary Information (Fig. S14). We used the Fiji ImageJ79 FFT command to compute the Fourier transform and display the frequency spectrum of the heatmaps displaying intracellular marker distribution. Heatmaps were converted to gray scale prior to FFT computation and a density-based 16-color look-up table was applied to the resulting FFT to highlight the spectrum pattern (Fig. 5d).

## Code availability

Software for training and an example dataset are available at https://xuanqing94.github.io/ai-reporter/.

## References

1. Newman, R. E., Yoo, D., LeRoux, M. A. & Danilkovitch-Miagkova, A. Treatment of inflammatory diseases with mesenchymal stem cells. Inflamm. Allergy Drug Targets 8, 110–123. https://doi.org/10.2174/187152809788462635 (2009).

2. Wang, L.-T. et al. Human mesenchymal stem cells (MSCs) for treatment towards immune- and inflammation-mediated diseases: Review of current clinical trials. J. Biomed. Sci. 23, 76. https://doi.org/10.1186/s12929-016-0289-5 (2016).

3. Azari, M. F. et al. Mesenchymal stem cells for treatment of CNS injury. Curr. Neuropharmacol. 8, 316–323. https://doi.org/10.2174/157015910793358204 (2010).

4. Caplan, A. I. Mesenchymal stem cells. J. Orthop. Res. 9, 641–650. https://doi.org/10.1002/jor.1100090504 (1991).

5. Kolf, C. M., Cho, E. & Tuan, R. S. Mesenchymal stromal cells. Biology of adult mesenchymal stem cells: regulation of niche, self-renewal and differentiation. Arthritis Res. Therapy 9, 204. https://doi.org/10.1186/ar2116 (2007).

6. Zappia, E. et al. Mesenchymal stem cells ameliorate experimental autoimmune encephalomyelitis inducing T-cell anergy. Blood 106, 1755–1761. https://doi.org/10.1182/blood-2005-04-1496 (2005).

7. Horák, J. et al. Mesenchymal stem cells in sepsis and associated organ dysfunction: A promising future or blind alley? Stem Cells Int. 2017, 7304121. https://doi.org/10.1155/2017/7304121 (2017).

8. Zhao, K. & Liu, Q. The clinical application of mesenchymal stromal cells in hematopoietic stem cell transplantation. J. Hematol. Oncol. 9, 46. https://doi.org/10.1186/s13045-016-0276-z (2016).

9. Le Blanc, K. et al. Treatment of severe acute graft-versus-host disease with third party haploidentical mesenchymal stem cells. Lancet (London, England) 363, 1439–1441. https://doi.org/10.1016/S0140-6736(04)16104-7 (2004).

10. Pevsner-Fischer, M., Levin, S. & Zipori, D. The origins of mesenchymal stromal cell heterogeneity. Stem Cell Rev. Rep. 7, 560–568. https://doi.org/10.1007/s12015-011-9229-7 (2011).

11. Ho, A. D., Wagner, W. & Franke, W. Heterogeneity of mesenchymal stromal cell preparations. Cytotherapy 10, 320–330. https://doi.org/10.1080/14653240802217011 (2008).

12. Kim, N. & Cho, S.-G. New strategies for overcoming limitations of mesenchymal stem cell-based immune modulation. Int. J. Stem Cells 8, 54–68. https://doi.org/10.15283/ijsc.2015.8.1.54 (2015).

13. McLeod, C. M. & Mauck, R. L. On the origin and impact of mesenchymal stem cell heterogeneity: New insights and emerging tools for single cell analysis. Eur. Cells Mater. 34, 217–231. https://doi.org/10.22203/eCM.v034a14 (2017).

14. Rennerfeldt, D. A. & Van Vliet, K. J. Concise review: When colonies are not clones: Evidence and implications of intracolony heterogeneity in mesenchymal stem cells. Stem Cells 34, 1135–1141. https://doi.org/10.1002/stem.2296 (2016).

15. Whitfield, M. J., Lee, W. C. J. & Van Vliet, K. J. Onset of heterogeneity in culture-expanded bone marrow stromal cells. Stem Cell Res. 11, 1365–1377. https://doi.org/10.1016/j.scr.2013.09.004 (2013).

16. Dwarshuis, N. J., Parratt, K., Santiago-Miranda, A. & Roy, K. Cells as advanced therapeutics: State-of-the-art, challenges, and opportunities in large scale biomanufacturing of high-quality cells for adoptive immunotherapies. Adv. Drug Deliv. Rev. 114, 222–239. https://doi.org/10.1016/j.addr.2017.06.005 (2017).

17. Rivière, I. & Roy, K. Perspectives on manufacturing of high-quality cell therapies. Mol. Ther. J. Am. Soc. Gene Ther. 25, 1067–1068. https://doi.org/10.1016/j.ymthe.2017.04.010 (2017).

18. Marklein, R. A. et al. Morphological profiling using machine learning reveals emergent subpopulations of interferon-γ–stimulated mesenchymal stromal cells that predict immunosuppression. Cytotherapy 21, 17–31. https://doi.org/10.1016/j.jcyt.2018.10.008 (2019).

19. Klinker, M. W., Marklein, R. A., Lo Surdo, J. L., Wei, C.-H. & Bauer, S. R. Morphological features of IFN-γ–stimulated mesenchymal stromal cells predict overall immunosuppressive capacity. Proc. Natl. Acad. Sci. 114, E2598–E2607. https://doi.org/10.1073/pnas.1617933114 (2017).

20. Sekiya, I. et al. Expansion of human adult stem cells from bone marrow stroma: conditions that maximize the yields of early progenitors and evaluate their quality. Stem Cells (Dayton, Ohio) 20, 530–541. https://doi.org/10.1634/stemcells.20-6-530 (2002).

21. Smith, J. R., Pochampally, R., Perry, A., Hsu, S.-C. & Prockop, D. J. Isolation of a highly clonogenic and multipotential subfraction of adult stem cells from bone marrow stroma. Stem Cells (Dayton, Ohio) 22, 823–831. https://doi.org/10.1634/stemcells.22-5-823 (2004).

22. Docheva, D. et al. Researching into the cellular shape, volume and elasticity of mesenchymal stem cells, osteoblasts and osteosarcoma cells by atomic force microscopy. J. Cell. Mol. Med. 12, 537–552. https://doi.org/10.1111/j.1582-4934.2007.00138.x (2008).

23. Colter, D. C., Class, R., DiGirolamo, C. M. & Prockop, D. J. Rapid expansion of recycling stem cells in cultures of plastic-adherent cells from human bone marrow. Proc. Natl. Acad. Sci. 97, 3213–3218. https://doi.org/10.1073/pnas.97.7.3213 (2000).

24. Simmons, P. J. & Torok-Storb, B. Identification of stromal cell precursors in human bone marrow by a novel monoclonal antibody, STRO-1. Blood 78, 55–62 (1991).

25. Ode, A. et al. CD73 and CD29 concurrently mediate the mechanically induced decrease of migratory capacity of mesenchymal stromal cells. Eur. Cells Mater. 22, 26–42. https://doi.org/10.22203/ecm.v022a03 (2011).

26. Lo Surdo, J. & Bauer, S. R. Quantitative approaches to detect donor and passage differences in adipogenic potential and clonogenicity in human bone marrow-derived mesenchymal stem cells. Tissue Eng. Part C Methods 18, 877–889. https://doi.org/10.1089/ten.TEC.2011.0736 (2012).

27. Lv, F.-J., Tuan, R. S., Cheung, K. M. C. & Leung, V. Y. L. Concise review: The surface markers and identity of human mesenchymal stem cells. Stem Cells 32, 1408–1419. https://doi.org/10.1002/stem.1681 (2014).

28. Campioni, D. et al. Immunophenotypic heterogeneity of bone marrow-derived mesenchymal stromal cells from patients with hematologic disorders: Correlation with bone marrow microenvironment. Haematologica 91, 364–368 (2006).

29. Skylaki, S., Hilsenbeck, O. & Schroeder, T. Challenges in long-term imaging and quantification of single-cell dynamics. Nat. Biotechnol. 34, 1137–1144. https://doi.org/10.1038/nbt.3713 (2016).

30. Ounkomol, C., Seshamani, S., Maleckar, M. M., Collman, F. & Johnson, G. R. Label-free prediction of three-dimensional fluorescence images from transmitted-light microscopy. Nat. Methods 15, 917–920. https://doi.org/10.1038/s41592-018-0111-2 (2018).

31. Christiansen, E. M. et al. In silico labeling: Predicting fluorescent labels in unlabeled images. Cell 173, 792-803.e19. https://doi.org/10.1016/j.cell.2018.03.040 (2018).

32. Rivenson, Y. et al. Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning. Nat. Biomed. Eng. 3, 466–477. https://doi.org/10.1038/s41551-019-0362-y (2019).

33. Belthangady, C. & Royer, L. A. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nat. Methods 16, 1215–1225. https://doi.org/10.1038/s41592-019-0458-z (2019).

34. Wang, H. et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Methods 16, 103–110. https://doi.org/10.1038/s41592-018-0239-0 (2019).

35. Ouyang, W., Aristov, A., Lelek, M., Hao, X. & Zimmer, C. Deep learning massively accelerates super-resolution localization microscopy. Nat. Biotechnol. 36, 460–468. https://doi.org/10.1038/nbt.4106 (2018).

36. Nehme, E., Weiss, L. E., Michaeli, T. & Shechtman, Y. Deep-STORM: Super-resolution single-molecule microscopy by deep learning. Optica 5, 458–464. https://doi.org/10.1364/OPTICA.5.000458 (2018).

37. Moen, E. et al. Deep learning for cellular image analysis. Nat. Methods 16, 1233–1246. https://doi.org/10.1038/s41592-019-0403-1 (2019).

38. Weigert, M. et al. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Methods 15, 1090–1097. https://doi.org/10.1038/s41592-018-0216-7 (2018).

39. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015 (eds Navab, N. et al.) 234–241 (Springer International Publishing, 2015).

40. Isola, P., Zhu, J.-Y., Zhou, T. & Efros, A. Image-to-image translation with conditional adversarial networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 5967–5976. https://doi.org/10.1109/CVPR.2017.632 (2017).

41. Dominici, M. et al. Minimal criteria for defining multipotent mesenchymal stromal cells. International Society for Cellular Therapy position statement. Cytotherapy 8, 315–317. https://doi.org/10.1080/14653240600855905 (2006).

42. Ip, J. E. et al. Mesenchymal stem cells use integrin beta1 not CXC chemokine receptor 4 for myocardial migration and engraftment. Mol. Biol. Cell 18, 2873–2882. https://doi.org/10.1091/mbc.e07-02-0166 (2007).

43. Espagnolle, N. et al. CD146 expression on mesenchymal stem cells is associated with their vascular smooth muscle commitment. J. Cell. Mol. Med. 18, 104–114. https://doi.org/10.1111/jcmm.12168 (2014).

44. Yang, Z. X. et al. CD106 identifies a subpopulation of mesenchymal stem cells with unique immunomodulatory properties. PLoS ONE 8, e59354. https://doi.org/10.1371/journal.pone.0059354 (2013).

45. Nasef, A. et al. Selected Stro-1-enriched bone marrow stromal cells display a major suppressive effect on lymphocyte proliferation. Int. J. Lab. Hematol. 31, 9–19. https://doi.org/10.1111/j.1751-553X.2007.00997.x (2009).

46. Zhu, H. et al. The role of the hyaluronan receptor CD44 in mesenchymal stem cell migration in the extracellular matrix. Stem Cells (Dayton, Ohio) 24, 928–935. https://doi.org/10.1634/stemcells.2005-0186 (2006).

47. Botchkarev, A. Performance metrics (error measures) in machine learning regression, forecasting and prognostics: Properties and typology. Interdiscip. J. Inf. Knowl. Manag. 14, 45–79. https://doi.org/10.28945/4184 (2019).

48. Kim, H.-J. & Lee, D. Image denoising with conditional generative adversarial networks (CGAN) in low dose chest images. Nucl. Instruments Methods Phys. Res. Sect. A: Accel. Spectrometers, Detect. Assoc. Equip. 954, 161914. https://doi.org/10.1016/j.nima.2019.02.041 (2020).

49. Baghaei, K. et al. Isolation, differentiation, and characterization of mesenchymal stem cells from human bone marrow. Gastroenterol. Hepatol. Bed Bench 10, 208–213 (2017).

50. Desponds, J. et al. Precision of readout at the hunchback gene: analyzing short transcription time traces in living fly embryos. PLoS Comput. Biol. 12, e1005256. https://doi.org/10.1371/journal.pcbi.1005256 (2016).

51. Esposito, A. et al. Quantitative fluorescence microscopy techniques. Methods Mol. Biol. (Clifton, NJ) 586, 117–142. https://doi.org/10.1007/978-1-60761-376-3_6 (2009).

52. Blot, V. & McGraw, T. E. Use of quantitative immunofluorescence microscopy to study intracellular trafficking: Studies of the GLUT4 glucose transporter. In Membrane Trafficking Vol. 457 (ed. Vancura, A.) 347–366 (Humana Press, 2008). https://doi.org/10.1007/978-1-59745-261-8_26.

53. Waters, J. C. Accuracy and precision in quantitative fluorescence microscopy. J. Cell Biol. 185, 1135–1148. https://doi.org/10.1083/jcb.200903097 (2009).

54. Haasters, F. et al. Morphological and immunocytochemical characteristics indicate the yield of early progenitors and represent a quality control for human mesenchymal stem cell culturing. J. Anat. 214, 759–767. https://doi.org/10.1111/j.1469-7580.2009.01065.x (2009).

55. Seiler, C. et al. Time-lapse microscopy and classification of 2D human mesenchymal stem cells based on cell shape picks up myogenic from osteogenic and adipogenic differentiation. J. Tissue Eng. Regen. Med. 8, 737–746. https://doi.org/10.1002/term.1575 (2014).

56. Yang, Y.-H.K., Ogando, C. R., Wang See, C., Chang, T.-Y. & Barabino, G. A. Changes in phenotype and differentiation potential of human mesenchymal stem cells aging in vitro. Stem Cell Res. Ther. 9, 131. https://doi.org/10.1186/s13287-018-0876-3 (2018).

57. Chinnadurai, R. et al. Potency analysis of mesenchymal stromal cells using a combinatorial assay matrix approach. Cell Rep. 22, 2504–2517. https://doi.org/10.1016/j.celrep.2018.02.013 (2018).

58. Matsuoka, F. et al. Morphology-based prediction of osteogenic differentiation potential of human mesenchymal stem cells. PLoS ONE 8, e55082 (2013).

59. Sasaki, H. et al. Label-free morphology-based prediction of multiple differentiation potentials of human mesenchymal stem cells for early evaluation of intact cells. PLoS ONE 9, e93952 (2014).

60. Wang, W. & Han, Z. C. Heterogeneity of human mesenchymal stromal/stem cells. Adv. Exp. Med. Biol. 1123, 165–177. https://doi.org/10.1007/978-3-030-11096-3_10 (2019).

61. Harichandan, A., Sivasubramaniyan, K. & Bühring, H.-J. Prospective isolation and characterization of human bone marrow-derived MSCs. Adv. Biochem. Eng./Biotechnol. 129, 1–17. https://doi.org/10.1007/10_2012_147 (2013).

62. Araújo, I. S. et al. Stochastic gene expression in Arabidopsis thaliana. Nat. Commun. 8, 2132. https://doi.org/10.1038/s41467-017-02285-7 (2017).

63. Dar, R. D. et al. Transcriptional burst frequency and burst size are equally modulated across the human genome. Proc. Natl. Acad. Sci. 109, 17454–17459. https://doi.org/10.1073/pnas.1213530109 (2012).

64. Sun, L., Ashcroft, P., Ackermann, M. & Bonhoeffer, S. Stochastic gene expression influences the selection of antibiotic resistance mutations. Mol. Biol. Evol. 37, 58–70. https://doi.org/10.1093/molbev/msz199 (2020).

65. Raj, A. & van Oudenaarden, A. Nature, nurture, or chance: Stochastic gene expression and its consequences. Cell 135, 216–226. https://doi.org/10.1016/j.cell.2008.09.050 (2008).

66. Elowitz, M. B., Levine, A. J., Siggia, E. D. & Swain, P. S. Stochastic gene expression in a single cell. Science 297, 1183–1186. https://doi.org/10.1126/science.1070919 (2002).

67. Liu, J., Ding, Y., Liu, Z. & Liang, X. Senescence in mesenchymal stem cells: Functional alterations, molecular mechanisms, and rejuvenation strategies. Front. Cell Dev. Biol. 8, 258. https://doi.org/10.3389/fcell.2020.00258 (2020).

68. Li, Y. et al. Senescence of mesenchymal stem cells (review). Int. J. Mol. Med. 39, 775–782. https://doi.org/10.3892/ijmm.2017.2912 (2017).

69. Drela, K., Stanaszek, L., Nowakowski, A., Kuczynska, Z. & Lukomska, B. Experimental strategies of mesenchymal stem cell propagation: Adverse events and potential risk of functional changes. Stem Cells Int. 2019, 7012692. https://doi.org/10.1155/2019/7012692 (2019).

70. Hu, Y. et al. Comparative study on in vitro culture of mouse bone marrow mesenchymal stem cells. Stem Cells Int. 2018, 6704583. https://doi.org/10.1155/2018/6704583 (2018).

71. Rubtsov, Y. et al. Molecular mechanisms of immunomodulation properties of mesenchymal stromal cells: A new insight into the role of ICAM-1. Stem Cells Int. 2017, 6516854. https://doi.org/10.1155/2017/6516854 (2017).

72. Kowal, J. M., Schmal, H., Halekoh, U., Hjelmborg, J. B. & Kassem, M. Single-cell high-content imaging parameters predict functional phenotype of cultured human bone marrow stromal stem cells. Stem Cells Transl. Med. 9, 189–202. https://doi.org/10.1002/sctm.19-0171 (2020).

73. Buhrmester, V., Muench, D. & Arens, M. Analysis of explainers of black box deep neural networks for computer vision: A survey. In Computing Research Repository (CoRR). http://arxiv.org/abs/1911.12116 (2019).

74. Lugagne, J.-B., Lin, H. & Dunlop, M. J. DeLTA: Automated cell segmentation, tracking, and lineage reconstruction using deep learning. PLoS Comput. Biol. 16, e1007673 (2020).

75. Huh, D. & Paulsson, J. Non-genetic heterogeneity from stochastic partitioning at cell division. Nat. Genet. 43, 95–100. https://doi.org/10.1038/ng.729 (2011).

76. Adamzyk, C. et al. Different culture media affect proliferation, surface epitope expression, and differentiation of Ovine MSC. Stem Cells Int. 2013, 387324. https://doi.org/10.1155/2013/387324 (2013).

77. Hagmann, S. et al. Different culture media affect growth characteristics, surface marker distribution and chondrogenic differentiation of human bone marrow-derived mesenchymal stromal cells. BMC Musculoskelet. Disord. 14, 223. https://doi.org/10.1186/1471-2474-14-223 (2013).

78. Mao, X. et al. Least squares generative adversarial networks. 2017 IEEE Int. Conf. Comput. Vis. (ICCV) https://doi.org/10.1109/ICCV.2017.304 (2017).

79. Schindelin, J. et al. Fiji: An open-source platform for biological-image analysis. Nat. Methods 9, 676–682. https://doi.org/10.1038/nmeth.2019 (2012).

## Acknowledgements

We thank Takuya Matsumoto, Eri Harada, Alex Hofmann, Gregory R. Johnson and Roy Wallman for insightful discussions. This work was supported by the UCLA SPORE in Prostate Cancer Grant (P50 CA092131), the Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at UCLA and California NanoSystems Institute at UCLA Planning Award grant and the National Science Foundation Grant (IIS-1901527).

## Author information

### Contributions

S.I., C.H. and N.Y.C.L. conceived the project. X.L. implemented the deep learning model. S.I., X.L., B.S.L. and N.Y.C.L. designed the experiments. S.I., X.L. and B.S.L. conducted the experiments. S.I., X.L., B.S.L., M.C.P., C.H. and N.Y.C.L. analyzed the results. S.I., X.L., B.S.L., M.C.P., C.H. and N.Y.C.L. wrote the paper. All authors reviewed the manuscript.

### Corresponding author

Correspondence to Sara Imboden.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

### Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Supplementary information

Supplementary Video S1.

Supplementary Video S2.

## Rights and permissions

Imboden, S., Liu, X., Lee, B.S. et al. Investigating heterogeneities of live mesenchymal stromal cells using AI-based label-free imaging. Sci Rep 11, 6728 (2021). https://doi.org/10.1038/s41598-021-85905-z
