Morphodynamics facilitate cancer cells to navigate 3D extracellular matrix

Cell shape is linked to cell function. The significance of cell morphodynamics, namely the temporal fluctuation of cell shape, is much less understood. Here we study the morphodynamics of MDA-MB-231 cells in type I collagen extracellular matrix (ECM). We systematically vary ECM physical properties by tuning collagen concentrations, alignment, and gelation temperatures. We find that morphodynamics of 3D migrating cells are externally controlled by ECM mechanics and internally modulated by Rho/ROCK-signaling. We employ machine learning to classify cell shape into four different morphological phenotypes, each corresponding to a distinct migration mode. As a result, we map cell morphodynamics at mesoscale into the temporal evolution of morphological phenotypes. We characterize the mesoscale dynamics including occurrence probability, dwell time and transition matrix at varying ECM conditions, which demonstrate the complex phenotype landscape and optimal pathways for phenotype transitions. In light of the mesoscale dynamics, we show that 3D cancer cell motility is a hidden Markov process whereby the step size distributions of cell migration are coupled with simultaneous cell morphodynamics. Morphological phenotype transitions also facilitate cancer cells to navigate non-uniform ECM such as traversing the interface between matrices of two distinct microstructures. In conclusion, we demonstrate that 3D migrating cancer cells exhibit rich morphodynamics that is controlled by ECM mechanics, Rho/ROCK-signaling, and regulate cell motility. Our results pave the way to the functional understanding and mechanical programming of cell morphodynamics as a route to predict and control 3D cell motility.


S1: Additional Experimental Details
3D cell culture GFP-labeled MBA-MB-231 human breast carcinoma cells are purchased from GenTarget Inc. and are maintained according to the manufacturer's instructions. Briefly, growth media is prepared using Dulbeccos Modified Eagle Medium (Gibco, US) supplemented with 10% fetal bovine serum (Gibco, US), 1% penicillinstreptomyocin (Gibco, US), and 0.1 mM non-essential amino acid (NEAA 100x, ThermoFisher, US). Collagen solutions are prepared by diluting rat-tail collagen type I (Corning, US) with prepared growth medium, phosphate-buffered saline (PBS, 10x), and sodium hydroxide (NaOH, 0.1M) to a concentration of 1.5 mg/mL or 3.0 mg/mL with pH 7.4. To embed the cells in 3D collagen matrices, cells are suspended at very low density of approximately 650 cells/µL in ice-cold neutralized collagen solution and added to a 35 mm collagen coated glass bottom dish with a 7 mm microwell diameter (MatTek, US). The microwell containing ice-cold cell-collagen solution is covered with a coverslip so that the dish may be inverted during gelation to ensure dispersion of cells in 3D. The dish is then incubated on either a warming plate set to 25 • C, or in a tissue culture incubator (37 • C, 5% CO 2 ) for 30 minutes in order to solidify the matrix. For fiber alignment, a small magnetic iron shaving (<200 µm) is first placed onto the dish and then immersed in cell-collagen solution and placed onto the warming plate. We then drag the particle through the solution by an external magnet along a line for approximately 3 minutes while warming, and is then left to solidify [1]. The coverslip is removed after gelation time and the cellularized ECM is immersed with tissue culture medium and continuously incubated for 24 hours before imaging.

Immunofluorescence
Following a standard two-step immunofluorescence staining protocol (Ther-moFisher), each sample is incubated at room temperature with 4% formaldehyde solution in PBS for 15 minutes to fix the sample. After washing with PBS for an additional 5 minutes, the sample is then incubated at room temperature for 15 minutes in a working concentration of 0.5% Triton X-100 in PBS, used as a permeabilization agent. We then further wash the sample with PBS for 5 minutes. To prevent non-specific binding, a 3% bovine serum albumin blocking solution in PBS is added and then incubated at room temperature for 60 minutes. We then wash the sample with PBS for 5 minutes. F-actin antibody (Invitrogen) or phalloidin dye (Invitrogen) is added and then incubated at room temperature for 30 minutes. Finally, the sample is washed with PBS for 5 minutes and then is imaged with a 40X objective.

Microscopy and image analysis
Continuous imaging is done with a Leica TCS SPE confocal microscope equipped with a stage-top incubator (ibidi). Images are captured at a rate of 1 frame per 15 minutes. The raw images are gray scale with a resolution of 1024 x 1024 pixel 2 . The voxel size has been calibrated to equal 0.538 µm. A single x-y plane is imaged every 10 µm in the z-dimension per experiment for up to 24 hours. Using custom Matlab scripts, cell images are maximum-projected onto a x-y plane and tracked over time (see section S2). The projected images are manually segmented, and screened to remove cells that are not entirely within the viewing window.

S2: Geometric Characterization of Cell Images
Image processing Following acquisition of fluorescent images, data regarding cell shape and position are obtained by processing and binarizing the time-lapse images using custom NIH ImageJ and Matlab scripts. First, fluorescence images are background subtracted using a rolling ball radius of 50 pixels (26.88 µm) and then log-transformed in order to make cell edges highly visible and so that less fluorescent-intense cells are also quantified. A manual threshold is then applied for each image. After, cells are manually segmented for each z-stack if applicable. Since consecutive z-stacks may have cell overlap, custom Matlab scripts are then used to determine if the same cell is in multiple z-stacks. After, we take a maximum projection (2D) of each cell. Geometrical measurements are then taken on binary objects using Matlabs regionprops function, including area, perimeter, major axis length, minor axis length, solidity, eccentricity, convex area, extent, equivalent diameter, convex perimeter, fiber length (skeletonized max length), maximum inscribed radius, and maximum bleb radius (maximum secondary circle). Additional measures of form factor ( Area P erimeter 2 ), aspect ratio ( M ajoraxislength M inoraxislength ), Convexity (  Figure S1: Schematic of image processing and measures taken from binary using custom Matlab and Python scripts (scale-bar = 10 µm). This figure is prepared with Mathlab R2020a (www.mathworks.com) and ImageJ (https://imagej.net).

Tracking cell position
In order to track the real-space center of the cell, we use maximum inscribed circle (MIC) of the cell image. Compared to imaging the nucleus directly, which causes phototoxicity and is prone to multinucleate staining, MIC does not require additional probes. To further demonstrate the accuracy of MIC, we compare short videos of dual labeled MDA-MB-231 cells where the GFP channel labels the cytoplasm and RFP channel labels the cell nucleus (SYTO-64, Ther-moFisher). We find that MIC agrees very well with direct nucleus staining when determining the cell position, as shown in figure S2. For most of the frames, the deviation is less than 10% of the cell long axis. The root mean squared deviation is approximately 3 microns.

S3: ECM Dimension Modulates Real Space and Shape Space Dynamics
The morphodynamics of a cell display a strongly sub-diffusive characterization in geometric shape-space. Shown in Table S1, we quantify the fits of mean square displacement of measures and report the power over a ten hour lag period.  To evaluate performance, we employed the standard 10-fold cross validation scheme [2,3]. In particular, the validation set was randomly selected from available annotated training images. This means 10% of the total 3766 images, or 377 images are reserved for validation and the remaining 90%, or 3389 images were used for training per fold. In the ten-fold scheme, this process is repeated 10 times and an average of 92% accuracy is obtained. The high success rate validates the feasibility of the model. Then to obtain the final model which we will be using for experimental data analysis, all 3766 images were used for training. To examine how the trained machine learning model can be applied to real world problems, we evaluate the final performance with a dataset that is not included in the original 3766 images. This dataset is called unseen data, and the performance of the machine learning model (88% accuracy) is based on the results with this unseen dataset. This unseen test set was prepared with 50 cell images per class. The training and test sets are available on Figshare [4].

Details of machine-learning and SVM
In order to classify cells into particular migration mechanisms, we used supportvector machine (SVM) learning [5,6]. As a maximal-margin classifier, SVM was particularly attractive as the overlap between different phenotypes was unknown. Also important is that cells can display multiple phenotypes at once and thus a soft-margin classifier was vital. Lastly, given our small dimensional space and small training-sets, SVM was an optimum choice for classification purposes.
Labeled data were first acquired as described previously. Images were binarized and then geometrical data was obtained on labeled cells for training. We performed parameter grid search for RBF, linear, and polynomial kernel models, with 10-fold cross validation to determine best performance. A grid search determined that an RBF kernel with γ = 5.8 −3 and C = 4818 yields an average training, validation, and test set accuracy of 93.1%, 86.5%, and 85.5%, respectively. However, the model generalization was improved with γ = 0.01 and C = 1000 (average training, validation, and test set accuracy of 92.9%, 89.8%, and 89.5%, respectively). A linear model also recorded strong performance with C = 191.9 with average training, validation, and test set accuracy of 90.7%, 89.4%, and 84.5%, respectively. Although easier to interpret, the linear model was found to not consistently match supervised classification. For this reason, along with the slight increase in performance, we proceeded with the RBF kernel SVM model. The final optimized SVM model used in this work performs at 92.3%, 91.8%, and 88.0% on the train, validation, and held out test sets discussed in the Random-Forest classifier.
By closely examining the misclassified held out test data sets, we notice that the most errors come from binarization process which fail to preserve the geometric features of small protrusions. Therefore the performance of the SVM classifier can be further optimized by improving imaging quality, as well as more sophisticated binarization algorithms.

Errors introduced by 2D projections of 3D cell
In order to quantify classification errors made by our SVM algorithm as a result of using geometries of 2D maximum projections of cells, we have taken high resolution confocal stacks of two sample cells, one clearly a filopodia morphology, and another clearly a blebbing cell. Using the confocal stacks, we construct deconvoluted 3D volumetric cell renderings. We numerically rotate the 3D volumetric rendered cell images at 325 uniformly spaced spherical angles (3D orientations). For each orientation, we take 2D maximum projection, and examine the classification of the projected image. For the filopodial cell, we find that the orientation introduces no more than an 8.33% classification error rate, where all the errors are made as highly elongated filopodia become oriented parallel to the viewing angle. Figure S3 shows some of the 2D projections along with visual representations of the classification error.

Comparisons with Random-Forest classifier
To compare performance to other multi-classification models, we have trained a Random-Forest classifier, tuning hyperparameters of depth, number of estimators, and features by grid search [7]. The optimized model uses 1200 estimators, with a maximum depth of 180 splits, and uses no feature subset selection, yielding a bagged ensemble of regression trees. We perform 10-fold cross validation on the SVM training data, utilizing class weighting to account for the class imbalance during training. We also include our 200 image held out test set for evaluation. The mean accuracy score was 83. 3

Feature importance
In order to interpret the features used by the SVM and random-forest classifiers in their classifications, we have examined the overall feature importance. Figure  S4A shows the top two positive and negative features of each class used by the linear kernel model of SVM. Figure S4B shows the relative importance of features used by the random-forest classifier. Feature acronyms on the y-axis are described in order top to bottom: maximum inscribed radius, minor axis length, convex area, convexity, aspect ratio, eccentricity, area, perimeter, equivalent diameter, convex perimeter, curl, sphericity, extent, form factor, solidity, relative bleb length, major axis length, fiber length, inscribed area, bleb length, and perimeter curl. This figure is prepared with Mathlab R2020a (www.mathworks.com).

Details of parametric t-SNE embedding algorithm
To visualize high-dimensional morphological trajectories in three-dimensional space, a t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm is prepared for dimensionality reduction utilizing the geometric characterization of cells without requiring labels [8]. Since the t-SNE algorithm is a non-parametric model, it utilizes all available data to separate data clusters and cannot classify new data points without retraining which often leads to different cluster shapes. We required that the non-linear algorithm could embed new data points or entire trajectories into a consistent 3D space so that different experiments could be tracked similarly. To this end, a parametric t-SNE model is prepared which does not require re-training for new data and thus forms a consistent manifold. To make the parametric t-SNE, we train a neural network on 15,000 data points to learn a mapping by minimizing the Kullback-Leibler (KL) divergence between the Gaussian distance metric in the 21-dimensional geometric space and the Students-t distributed distance metric in the output 3-dimensional space. We use the same architecture as [9], which is a dense neural network with layers: 21 → 500 → 500 → 2000 → 3 where 21 is the input dimensionality of our geometric features and 3 is the output dimensionality representing the t-SNE embedded shape-space. In summary, the model is made up of three fully connected layers with subsequent ReLU activation after each layer, and an additional final fully connected layer calculating the three t-SNE components for each points. Functionally, the training of a non-parametric t-SNE and a parametric t-SNE is very similar. However, because the parametric model learns weights in the dense layers to perform the mapping from high to low dimensions, new input data can be transformed to the same embedded space. This model uses an Adam optimizer set to minimize the Kullback-Leibler divergence loss, and is trained with 800 iterations with a batch size of 256 examples and the tunable perplexity parameter set to 30.0. We then use the parametric t-SNE model to transform the high-dimensional trajectories of new cells into the three-dimensional space.
As seen in Fig. 2C of the manuscript, 15,000 randomly selected cell geometries were used as a training set for our parametric t-SNE model. Since the data is made up of a diverse group of cell geometries, t-SNE forms a continuous spectrum of cell shape which we call embedded "shape-space". However, once the data points are colorized using independent classification from SVM, it is obvious that the embedding is consistent with the SVM classification. Further, we use this same embedded shape-space to embed new data points and trajectories, as seen in Fig. S6.
To visualize the SVM boundaries of our training set, we also performed t-SNE on our SVM training set. Figure S5 shows cluster formation after 300 iterations with a perplexity factor of 30. Blebbing-based (green) and filopodial (magenta) morphologies show a clear separation in the t-SNE manifold both in 2-D and 3-D projections, with the space between them filled by actin-enriched leading edge (yellow) and lamellipodial migration which display some overlap but a separation in the 2-D projection.

S5: Manipulating Rho/Rock Signaling by Pharmacological Treatments t-SNE trajectories of pharmacological treatments
To demonstrate the effects of the Rho-ROCK pathway on cellular morphology, we chemically induce activation and inhibition of pathway proteins. Y27632, a specific inhibitor of Rho-associated protein kinases as well as ROCK-II activity [10][11][12], was purchased from Sigma-Aldrich and diluted to a working concentration at 3 µg/mL [10 µM] (0.1% v/v DMSO) in serum-free growth medium. Similarly, Rho activator II (CN03), known to robustly increase the level of GTP bound RhoA [13,14], was purchased from Cytoskeleton and diluted to a working concentration at 2 µg/mL (0.1% v/v DMSO) in serum-free growth medium.
Samples are prepared by suspending GFP-labeled MDA-MB-231 cells at low density in ice-cold neutralized collagen solution to approximately 650 cells/µL, following the same preparation and neutralization procedure described in S1. Ice-cold cell-collagen solution was then plated on a 35 mm collagen coated glass bottom dish with a 7 mm microwell diameter (MatTek, US). The microwell containing ice-cold cell-collagen solution is covered with a coverslip so that the dish may be inverted during gelation to ensure dispersion of cells in 3D. The dish is then incubated in a tissue culture incubator (37 • C, 5% CO 2 ) for 30 minutes in order to solidify the matrix. The coverslip is removed after gelation time and the cellularized ECM is then immersed with serum-free culture medium and continuously incubated for 20 hours before imaging. Culture media is replaced with serum-free prepared growth media containing HEPES (0.1% v/v DMSO) with or without chemical dilution for experimental or control condition, respectively, and then imaged with confocal microscopy for 24 hours. Figure S6 shows additional trajectories of cells that responded to chemical treatment. Cells treated with ROCK-inhibitor Y27632 show characteristic production of protrusions and/or sustained pre-existing protrusions. In Fig. S6A, this is shown by most trajectories moving toward filopodial migration (magenta) or toward actin-enriched leading edge migration (yellow). Conversely, cells treated with CN03 exhibit morphologies characteristic of cell contraction. Notably, cells do not produce protrusions following treatment and typically retract protrusions post-treatment. Visually, this is seen in Fig. S6B as a notable slide toward the blebbing migration mode (green) or toward lamellipodial cellspreading (blue). For comparison, Fig. S6C shows trajectories of cells without chemical treatment as control conditions. These cells exhibit all types of migration modes, with switching being non-uniform. Figure S6: Additional t-SNE time projections (solid) of (A) ROCK-inhibiting Y27632, (B) Rho-activating CN03 drug treated cells, and (C) control condition non-treated cells. This figure is prepared with Mathlab R2020a (www.mathworks.com).
We have further quantified the trajectories of drug treated and control cells in the t-SNE space. In particular, we calculate the ratio between net displacement and contour length in the t-SNE space for cells with and without drug treatments. A lower value of the ratio indicates the cell makes many detours (fluctuations in direction). As shown in Fig. S7, the ratio for CN03-treated cells is about 50% lower compared with the cells treated with Y27632. This is consistent with the observation made in the main text that cells treated with CN03 manifest strongly fluctuating and diverging trajectories. Cell culture and maintenance for ROCK activity and RhoA activation assays MDA-MB-231 cells are cultured to 90% confluency (treatment and control) in 100 mm dishes for treatment. CN03-treated and control cells are first serumstarved for 24 hours, while Y27632-treated and control cells are kept in 10% FBS culture medium prior to treatment. Control condition dishes are then immersed in their respective culture medium conditions with 0.1% DMSO v/v. CN03-treated cells are immersed in serum-free culture media at 2 µg/mL CN03 with 0.1% DMSO v/v. CN03-treated and control cells are kept in a tissue culture incubator (37 • C, 5% CO 2 ) for 4 hours. Y27632-treated cells are immersed in 10% FBS culture medium with 10 µM Y27632 and 0.1% DMSO v/v and kept in a tissue culture incubator for 12 hours. Thereafter, the media is aspirated from the dishes, and cells are washed with ice-cold PBS twice. 1 ml of cell lysis buffer with protease inhibitors and PMSF is added to the cells and the dishes are kept on ice for 20 minutes, following which the cells are transferred to a microcentrifuge tube. These cell lysates are then cleared by centrifugation at 14,000 g for 10-15 minutes at 4 • C. Control and treatment lysates are then appropriately diluted to equalize protein concentration as measured by NanoDrop.

RhoA activation assay
The examination of RhoA levels is performed using a RhoA Activation Assay Kit (Cytoskeleton, Inc., Denver, CO, USA, Cat no. BK036). For RhoA pulldown, Rhotekin RBD Agarose beads are added to all the samples and incubated at 4 • C for 1 hour with gentle agitation. After incubation, the beads are centrifuged, and the supernatant is removed. These beads are then washed thrice before resuspension in sample buffer for a subsequent immunoblotting procedure.

Quantitative assay for Rho kinase activity
Measurement of Rho kinase (ROCK) activity is performed using a ROCK Activity Immunoblot kit (Cell Biolabs, Inc., San Diego, CA, USA, Cat no. STA-415) according to the manufacturer's instructions. Briefly, cells resuspended in lysis buffer with protease inhibitors and PMSF are centrifuged at 14,000 g for 10 minutes at 4 • C. For both, control and treatment samples, kinase reaction is initiated by the addition of a kinase/ATP/substrate solution, followed by gentle agitation at 30 • C for 60 minutes. This reaction was quenched by the addition of sample buffer, after which samples are run on an SDS-PAGE gel for western blotting. The nitrocellulose membranes with proteins transferred over are blocked with 5% non-fat dry milk for 1 h at room temperature. The primary antibodies are then dissolved in 5% non-fat dry milk at 1:1000 concentration and used to detect the target protein in the blots. For the RhoA activation blot, the RhoA primary antibody was replaced with Anti-Mouse RhoA (LSBio, Cat. # C355184). The blots remain incubated at room temperature for 2 hours. These blots are then washed with TBST thrice for 5 minutes. Next, the blots are incubated with Goat anti-Mouse secondary antibodies at 1:2000 concentration for 1 hour at room temperature. The blots are developed using the Radiance Chemiluminescent substrate (Azure Biosystems, California, USA) and are visualized on an Azure Western blot molecular imaging system (Azure Biosystems, California, USA).

F-Actin structures respond to Rho/ROCK modulation
To visualize changes in stress fiber formation due to pharmological treatments, cells are plated on a 35 mm collagen coated glass bottom dish with a 7 mm microwell diameter (MatTek, US) at very low cell density. The dish is then incubated in a tissue culture incubator (37 • C, 5% CO 2 ) in growth media with 10% FBS until cells reach approximately 30% confluence. Cells are then administered with pharmological treatment as follows: ROCK-Inhibitor Y27632-treated cells are immersed in growth media with 10% FBS at 10 µM Y27632 (0.1% v/v DMSO). Y27632-control comparison cells are immersed in growth media with 10% FBS (0.1% v/v DMSO). Rho-activated CN03 cells are first washed once with PBS and then serum-starved in serum-free growth media for an additional 24 hours. CN03-treated cells are then immersed in 2 µg/mL CN03 (0.1% v/v DMSO) in serum-free growth medium for 4 hours. CN03-control comparison cells are also first washed once with PBS and then serum-starved for 24 hours, before finally being immersed in new serum-free growth media (0.1% v/v DMSO) for four hours. Cells are then fixed, permeabilized, blocked, stained, and imaged following the procedure described in S1 using phalloidin (ThermoFisher).

Confocal reflection intensity autocorrelation function
In order to quantify the density fluctuations of collagen ECM, we compute the autocorrelation functions of confocal reflection images of collagen gels. Reflection images are first background subtracted using a rolling ball with radius of 50 pixels (26.88 µm). Images are then log-transformed to make fibers highly quantifiable. Images are then mean subtracted, and the autocorrelation is calculated. Following, the autocorrelation is normalized and then smoothed using interpolation. The results are shown in figure S10. The spatial uniformity indicates an appropriate distribution of randomly oriented gel fibers. Additionally, these show that the decay in the autocorrelation is slower for higher density gels, an indication of the fiber quantity, and much faster for higher temperature gel, caused by the shorter fiber lengths and smaller pores [15]. The anisotropy of decay in the final graph indicates the direction of alignment.

ECM fiber network coherency
In order to determine the degree of local and global alignment of collagen fibers, we take the pre-processed confocal reflection images and use OrientationJ with 9-14 circular ROIs per image packed without overlap, with ROI sizes ranging from 145 to 450 pixel diameter. At least 3 images (one per experiment) were used to quantify coherency. Empirically, larger ROIs are reliable to quantify global alignment, while smaller ROIs were better used for local alignment measures. ROIs smaller than 145 pixels are not used, as it was noticed the coherency measured by OrientationJ can be highly biased for large fibers such as in 25 • C gels (145 pixels [78 µm] is approximately twice the average fiber length [≈ 41 µm] in a 25 • C gel). The results are shown in figure S11 and follows previous literature [16].

Rheology
Strain sweep rheology measurements are performed on varying density gels and varying gelation temperature with a AR-G2 rheometer (TA instruments) at a 1 Hz frequency in a parallel plate geometry. A standard peltier plate (TA instruments) allows gels to be formed at 25 • C and 37 • C. Young's modulus is shown in table S2. As shown, storage moduli in the linear regime for gels with 1.5, and 3.0 mg/mL collagen prepared at 25 • C are about 325 and 830 Pa, respectively. Similarly, for gels with 1.5, and 3.0 mg/mL collagen prepared at 37 • C are about 170 and 425 Pa, respectively.

Linear Regime G'
Collagen Density Temperature G' 1.5 mg/mL 25 • C 325 Pa 1.5 mg/mL 37 • C 171 Pa 3.0 mg/mL 25 • C 832 Pa 3.0 mg/mL 37 • C 427 Pa Table S2: The linear storage modulus of collagen fiber networks using AR-G2 Rheometer (TA Instruments) in a parallel plate geometry at a 1 Hz frequency.

S7: Additional Details of Morphological Phenotype Dynamics
Dwell time definition The dwell time of state i is determined by using where r i→i is the transition rate from state i back to itself, and D i→i is the corresponding dwell time. We find that this definition of dwell time maximizes the usefulness of data and is more robust to experimental shortfalls that affect the naive method of simply counting the number of frames a cell remains in the same phenotype for.

Details of transition rate calculations
Assuming the probability distribution of cell morphological phenotypes follow a Boltzmann distribution, then the probability P i is given by where N is the number of states, r i→j is the transition rate from state i to state j, and conversely r j→i is the transition rate from state j to state i. The transition matrix is thus calculated as probability per unit time, and hence each row in the transition matrix will sum to 1 N j r i→j = 1.
The transition rate r i→j is defined as the probability of transition from state i to state j per unit time, given by where dt m i→j is the transition time of the mth observation from class i to class j, C is the number of classes, and N i→j is the number of observed transitions from class i to class j. The transition time dt m i→j is the number of frames until a cell goes from a state with classification probability above threshold (>60%) to another state that exceeds the threshold, and therefore has a lower bound of 1 if the transition is immediate and increases as the cell passes through intermediate states. Using this method, we find that the transition rates and dwell times are stable, regardless of the length of trajectories in computational experiments.
Following SVM classification, phenotype dynamics can be properly drawn from data. Importantly, where maximum decisions by the SVM classifier do not exceed 60%, the classification is thus determined to an intermediate between two states. The intermediate state can be a chimera of two states, with most occurrences being as a cell transitions between morphological phenotypes. Because this state is not considered to be unique, we calculate the transition rate from state i that passes through N intermediate states prior to state j as 1/N f rames . We find by simulation that this method can most-accurately recover transition rates in comparison to methods using soft-max or ignoring intermediate states.

Probability flux
To investigate broken detailed balance in morphological phenotype space, we report probability flux calculations shown in figure S12. The probability flux from state i to state j is given by where N i→j is the number of transitions from state i to state j, and both summations in the denominator are over all available states. The probability flux φ i→j and the transition rate r i→j are related by φ i→j = P i r i→j , where P i is the probability of a cell being at state i. The net probability flux between states i and j is given by ∆φ i→j − ∆φ j→i . We find that the net flux between states are all less than 0.003 probability difference per hour for cells in any ECM condition we tested. To reveal if any small difference in probability flux may be significant, we also report the probability flux percent difference between states i and j is given by ∆φ i→j − ∆φ j→i ∆φ i→j + ∆φ j→i We find that the maximum net probability flux percent difference for cells in any ECM condition evaluated is less than 9%. In order to determine the migrational persistence of migration modes, we report first the velocity autocorrelation observed for cells in collagen ECM prepared at room temperature and [col] = 1.5 mg/mL. We find that the autocorrelation quickly decays to zero in a single time step (15 minutes) shown in figure S13(A). We additionally checked the cos(θ) distribution between consecutive steps and find that the concurrent steps seem to be taken randomly in direction, as given by the U-shaped distribution of figure S13(B). Step Size (µm) Step Size (µm) Step Size (µm) Step Size (µm) Step Size (µm) Step Size (µm) Step Size (µm) Step Size (µm) Step Size (µm) To evaluate the motility of various migration mode transitions, steps sizes (frame-to-frame displacements) are separated into components parallel and per-pendicular to the direction of the previous step. Figure S14A shows the binned step size distribution in the persistent direction fit with a Gaussian distribution. It can be seen that these fits miss a significant portion of large steps. Rather, we find more suitable fits are obtained by fitting the magnitude of the step sizes, which follow a log-normal distribution as shown in figure S14B. The parameters from fits shown in figure S14B are subject to bin-size, and therefore we instead report parameters obtained by fitting the empirical cumulative distribution (CDF) of step size magnitude with a log-normal CDF, shown in figure S15. Step Size ( m) Step Size ( m) Step Size ( m) Because intermediate states account for a sizable portion of the data (approximately 10%), we seek to further evaluate its impacts in the motility characteristics. As an alternative approach, we eliminate the intermediate state by classifying cells according to their highest probability scores. This reduces the number of coarse-grained phenotypes from three (AM, ME and intermediate) to two (AM and ME). Using this approach, we reanalyze the motility characteristics and find there are little changes in the fitted means and variances of the steps. Therefore neither the intermediate states, nor the precise value of classification threshold (currently 60%) contribute significantly to the results presented in Fig. 5.
In our study, to reduce the imaging phototoxicity, and to account for the high resolution single cell images that vary their shapes, we uniformly apply the same MIC calculation to all cells, regardless of the cell shape. We also conduct additional analysis to examine the relation between MIC cell center and cell shape. For instance, we find the step size normalized by cell aspect ratio have similar distributions when tracked via MIC or cell nucleus (NUC for short) (figure S17A). Since aspect ratio is one of the most significant features that distinguish AM and ME cells, we conclude that MIC and NUC tracking yield consistent relations between cell motility and cell geometry.
We also confirm the conclusion that ME cells are more motile than AM cells when tracking cell nucleus or by tracking center of mass (CENT for short) (figure S17B-D). These results are consistent with our findings in the main text (Fig. 5). Step Size (µm) Step Size (µm) Step Size (µm)

Experiment details
Interface experiment was done in triplet. Briefly, the outer gel was first made by gelling cells in collagen solution (1.5 mg/mL neutralized) at 25 • C for 20 minutes on the DIGME stage [17]. After, the needle was gently removed and a new ice-cold collagen solution (1.5 mg/mL neutralized) containing cells was then dripped into the hole, gently swirled, and then placed into the incubator at 37 • C for 15 minutes. After, collagen gels were immersed in 3 mL of growth medium and continuously incubated for 24 hours before imaging. Imaging was taken near the interface, imaging bright-field, fluorescence (green), and backreflection confocal images. Using the back-reflection images, the interface was manually traced out through the z-stack. The distance to the interface from cell centroids in corresponding z-stacks were then measured away from the closest marked interface point using Matlab bwdist. The frequency vs distance is shown in figure S18, indicating a large number of cells were images close to the interface.

Details of interface analysis
Migration mode transitions were first determined by using continuous trajectories accounting properly for intermediate state classification, and then distances away from the interface were determined by the initial state location. A 1-D Gaussian kernel was used to acquire continuous local spatial probability densities of transitions (per hour), centered every 5.376 µm (10x the distance-to-pixel ratio) with a standard deviation of 26.88 µm. This yields the probability density (per hour) of observing a transition P (i → j, x) at a location x, as used in the main text. To mitigate the bias from non-uniform cell density, we then divide the prior probability by the spatial probability of observing a cell within the spatial window centered at x, M (x). M (x) is calculated using the same Gaussian kernel. To further demonstrate the biological significance of the machine-classified morphological phenotypes, we have examined the characteristic structures of actin cytoskeletons of different cell types. Figure S19 shows the immunofluorescent images of two typical cells per each morphological phenotype. It can be seen that filopodial cells (FP type, figure S19 A-B) have highly polarized F-actin bundles extending the long axis of the cell body. Blebbing cells (BB type, figure S19 C and G) show spherical blebs of various curvatures at cell membrane without actin polarization. Lamellipodial cells (LP type, figure S19 D and H) exhibit smooth actin-rich arcing at leading edges of overall elongated cell bodies. Finally, actin-enriched leading edge cells (AE type, figure S19 E and F) display sharp, actin-rich protrusions from the cell surface without actin polarization in the cell body. These features are consistent with previously reported actin cytoskeleton structures of different migration modes [18][19][20][21].