Abstract
Stochastic time series are ubiquitous in nature. In particular, random walks with time-varying statistical properties are found in many scientific disciplines. Here we present a superstatistical approach to analyse and model such heterogeneous random walks. The time-dependent statistical parameters can be extracted from measured random walk trajectories with a Bayesian method of sequential inference. The distributions and correlations of these parameters reveal subtle features of the random process that are not captured by conventional measures, such as the mean-squared displacement or the step width distribution. We apply our new approach to migration trajectories of tumour cells in two and three dimensions, and demonstrate the superior ability of the superstatistical method to discriminate cell migration strategies in different environments. Finally, we show how the resulting insights can be used to design simple and meaningful models of the underlying random processes.
Introduction
Stochastic time series, here used synonymously with random walks, play an important role in earth and life sciences, technology, medicine and economics. Most of these disciplines deal with complex systems in which multiple hierarchical processes interact at different timescales. Systems with this level of complexity are likely to change their statistical properties as a function of time, resulting in heterogeneous time series. It is therefore surprising that only a few tools are available for the analysis and characterization of such time-varying random walks. Some of these tools are used in finance^{1,2,3}, mainly with the goal of forecasting. In science, heterogeneous time series have been successfully described by hidden Markov models^{4}. However, systems with continuously time-varying statistics cannot be adequately modelled by a few discrete hidden states.
Owing to this lack of appropriate tools, many studies still rely on conventional evaluation methods that were designed for simple physical systems. The most frequently used statistical measures for random walks, in particular the step width distribution (SWD), the mean-squared displacement (MSD) and the velocity autocorrelation function, implicitly assume that the stochastic process can be globally described by a few characteristic parameters, such as a constant variance and a constant correlation time.
We demonstrate in this paper that the application of these conventional methods to heterogeneous random walks generates ‘anomalous’ results, such as non-Gaussian SWDs or power-law MSDs with fractional exponents^{5,6,7}. These anomalies emerge inevitably from the temporal averaging over changing local statistics during the evaluation period (Supplementary Note 1), and therefore do not provide meaningful insights into the random walk apart from its heterogeneous nature. Moreover, these temporally averaging measures may remain unchanged even if the experimental conditions are significantly altered. This lack of sensitivity points to a fundamental limitation of conventional statistical methods for analysing heterogeneous processes. SWD, MSD and autocorrelation function average over the successive statistical parameters of the heterogeneous random walk, instead of using the parameter dynamics as a rich additional source of information.
In this study, we propose a superstatistical framework for modelling and analysing heterogeneous random walks. The term superstatistics refers to the superposition of several different stochastic processes^{8,9,10,11}. Accordingly, we describe the time series locally by a homogeneous random walk model with a minimum number of statistical parameters. In the case of cell migration, we use an autoregressive process of first order (AR1) with a persistence parameter q and an activity parameter a. These parameters (q_{t},a_{t}) are allowed to change with every time step of the random walk. In this way, heterogeneous time series of arbitrary complexity can be described (Supplementary Note 2).
We provide a new sequential Bayesian method to infer the time-dependent parameters from measured random walk trajectories. In contrast to conventional maximum likelihood parameter estimation within a sliding time window, our method can handle both gradual and abrupt changes of the parameters. As a Bayesian method, it provides not only point estimates but also their confidence intervals. After extraction of (q_{t},a_{t}) from the measurements, the statistical properties of the time-dependent parameters can be analysed by computing the temporally averaged joint posterior distribution p(q,a), the temporal autocorrelations C_{qq}(Δt) and C_{aa}(Δt), and the cross-correlations C_{qa}(Δt).
In this paper, we use the migration of individual tumour cells as a case study of superstatistical analysis. Cell migration plays an essential role in many fundamental biological processes, such as embryogenesis, tissue repair or cancer development^{12,13,14}. Anomalous features of cellular random walks have been reported by several groups, and a variety of models have been proposed in the literature to account for those anomalies^{5,7,15,16,17,18}.
We demonstrate that anomalies of conventional statistical measures to describe cell migration are attributable to fluctuations of migration persistence q and activity a. Moreover, the joint distribution of persistence and activity, p(q,a), and the auto- and cross-correlations C_{ij}(Δt) of these two parameters provide characteristic fingerprints of the underlying random walks. Unlike globally averaging statistical measures, a superstatistical analysis can clearly resolve the effects of different environments on cell migration, such as migration in a three-dimensional (3D) collagen network versus migration on a planar 2D culture dish. Furthermore, by observing individual cells in microfabricated 1D channel structures with varying diameter, we demonstrate that the temporal changes of the (q_{t},a_{t}) parameters are directly associated with different local microenvironments that the cells experience along their migration path. Finally, we show how the extracted statistical properties of the time-dependent parameters can be used to construct simplified models that reproduce all key features of the data, including the non-Gaussian SWD and power-law MSD. While other types of models have also successfully reproduced these anomalous features, for example, using fractional diffusion equations^{7} or integro-differential equations with complex memory kernels^{19}, the superstatistical framework achieves this with the simplest persistent random walk model (the two-parameter AR1 process), extended by the temporal variations of the two parameters (persistence and activity).
Results
Cell migration in 2D and 3D
We study the migration of the breast carcinoma cell line MDA-MB-231 in a 3D collagen gel and on a tissue culture-treated 2D plastic surface, either uncoated or coated with the adhesion ligand fibronectin. Three-dimensional cell positions within the random fibre network of a collagen gel (Fig. 1a,b) are detected by analysing the characteristic refraction pattern (Fig. 1b inset) around the cell nucleus. From the individual cell trajectories (Fig. 1c), we compute momentary migration properties, such as cell speed versus time (Fig. 1c inset). Since the gel has a free upper surface and thus a lower effective stiffness in the z-direction, cells react with a more pronounced horizontal (x–y direction) alignment and motion, in agreement with theoretical predictions based on active cellular mechanosensing mechanisms^{20}. Therefore, only the x–y coordinates are used for comparing 2D and 3D migration.
Globally averaging statistical measures
For each individual cell trajectory, we compute the SWD, defined as the probability p(Δx,Δt) that the cell changes its x-coordinate by Δx within a lag time interval Δt, as well as the MSD, defined as r^{2}(Δt)=〈(r(t+Δt)−r(t))^{2}〉_{t,e}, where 〈〉_{t,e} indicates temporal and subsequent ensemble averaging over the different individual cells of the same migration environment.
Regardless of environment, the SWD shows a leptokurtic, approximately exponential shape (Fig. 5a inset and Supplementary Note 3). For lag times below 500 min, the MSD can be approximated by power laws (Fig. 5a) with a fractional exponent of 1.3 in the cases of 3D collagen and uncoated 2D plastic, but with a larger exponent of 1.7 in the case of fibronectin-coated 2D plastic. It is remarkable that the SWD and MSD are practically indistinguishable for migration in 3D collagen and on uncoated 2D plastic, even though these environments require different migration strategies.
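The two conventional measures defined above can be sketched for a single trajectory as follows. This is a minimal illustration, not the authors' code: the function names `msd` and `step_widths` and the synthetic trajectory are hypothetical, lags are counted in frames rather than minutes, and only the temporal average is computed (the subsequent ensemble average over cells is omitted).

```python
import numpy as np

def msd(r, max_lag):
    """Time-averaged mean-squared displacement of one trajectory r, shape (T, dim)."""
    r = np.asarray(r, dtype=float)
    lags = np.arange(1, max_lag + 1)
    values = np.array(
        [np.mean(np.sum((r[lag:] - r[:-lag]) ** 2, axis=1)) for lag in lags]
    )
    return lags, values

def step_widths(r, lag, axis=0):
    """Displacements along one coordinate for a given lag (in frames)."""
    r = np.asarray(r, dtype=float)
    return r[lag:, axis] - r[:-lag, axis]

# Synthetic test case (an assumption): an uncorrelated 2D walk with unit-variance steps.
rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=(5000, 2)), axis=0)

lags, msd_values = msd(traj, max_lag=50)
# Slope of the MSD in a log-log plot; ordinary diffusion gives a value near 1.
slope = np.polyfit(np.log(lags), np.log(msd_values), 1)[0]
# Histogramming these displacements yields an estimate of the SWD at this lag.
dx = step_widths(traj, lag=1)
```

For a heterogeneous walk, the same fit would typically return a fractional slope, which is the kind of 'anomalous' exponent discussed in the text.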
Within collagen, cells assume a pronounced elongated shape and typically form a pathfinding long and thin protrusion that can extend over >100 μm (Supplementary Movies 1 and 2; ref. 21). The directionally persistent trajectory of the cells is mainly defined by the contour of this long protrusion, resembling the movement of a needle in an array of obstacles^{22}. However, cells can also pull themselves along bundles of collagen fibres in a process known as contact guidance^{23,24}. Occasionally, encounters with obstacles or small pores in the disordered collagen network can force the cell to withdraw or change directions (Supplementary Movie 2). On planar surfaces by contrast, the cells spread and assume a flat, irregular shape. They also polarize and move preferentially along their polarization axis (Supplementary Movie 3), but they cannot take advantage of external cues to keep a persistent migration direction.
Despite these diverging migration modes, the net spatial advancement of MDA-MB-231 cells over time is similar in both environments. Therefore, the SWD and MSD for migration in 3D collagen and on uncoated 2D plastic are nearly identical. On fibronectin-coated 2D plastic, the cells migrate more slowly but with a higher directional persistence (Supplementary Movie 4). Over time, this leads to a larger net spatial advancement compared with uncoated plastic. Accordingly, the MSD shows a higher fractional exponent of 1.7, and the SWD broadens (Fig. 5a).
Bayesian inference of time-dependent parameters
For the superstatistical analysis of the data, we first compute for each cell trajectory {r_{t}=(x_{t},y_{t})} the vectorial displacements u_{t}=r_{t}−r_{t−1} for each measurement time step δt=5 min. The statistical relationship between two successive displacements is described by a 2D first-order autoregressive process (AR1) defined by
u_{t}=q_{t}u_{t−1}+a_{t}n_{t}.
This process is equivalent to a persistent random walk or a time-discrete Ornstein–Uhlenbeck process. The parameter q_{t}∈[−1,+1] describes the local persistence of the random walk, with q_{t}=−1 corresponding to anti-persistent motion, q_{t}=0 to non-persistent diffusive motion and q_{t}=+1 to persistent motion. The parameter a_{t}∈[0,∞] describes the local activity (noise amplitude) and sets the spatial scale of the random walk. Together, the two parameters determine the variance of the displacements according to var(u)=a^{2}/(1−q^{2}). The vector n_{t}=(n_{xt},n_{yt}) is normally distributed, uncorrelated random noise with unit variance.
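As a quick numerical check of the variance relation var(u)=a^{2}/(1−q^{2}), the following sketch simulates an AR1 process, assuming the update rule u_{t}=q_{t}u_{t−1}+a_{t}n_{t} with unit-variance Gaussian noise per component. The function name `simulate_ar1` and the chosen parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_ar1(q, a, n_steps, dim=2, seed=0):
    """Simulate displacements u_t = q_t * u_{t-1} + a_t * n_t.

    q and a may be scalars (homogeneous walk) or arrays of length n_steps
    (heterogeneous walk with prescribed parameter sequences).
    """
    rng = np.random.default_rng(seed)
    q = np.broadcast_to(np.asarray(q, dtype=float), (n_steps,))
    a = np.broadcast_to(np.asarray(a, dtype=float), (n_steps,))
    u = np.zeros((n_steps, dim))
    for t in range(1, n_steps):
        u[t] = q[t] * u[t - 1] + a[t] * rng.normal(size=dim)
    return u

# Homogeneous case with constant parameters (values chosen for illustration).
u = simulate_ar1(q=0.5, a=1.0, n_steps=100_000)
empirical_var = u.var(axis=0).mean()        # variance per coordinate
predicted_var = 1.0 ** 2 / (1.0 - 0.5 ** 2)  # a^2 / (1 - q^2)
```

Passing arrays for `q` and `a` gives a heterogeneous trajectory; the positions follow from `np.cumsum(u, axis=0)`.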
To extract the time-dependent joint probability density P(q_{t},a_{t}) of the parameters q_{t} and a_{t} from a sequence of displacements u_{t}, we use sequential Bayesian updating. We start at time t=0 with a flat prior distribution P_{0}(q,a) (see P_{0} in Fig. 2), which can be interpreted as a ‘first guess’ about the parameter values. From the measured successive displacements u_{0} and u_{1}, we compute the likelihood distribution L_{1}(q,a) (see L_{1} in Fig. 2), which provides initial information about probable parameter values.
The prior distribution P_{0} and the likelihood distribution L_{1} are multiplied to obtain the posterior distribution P_{0}L_{1}, which updates our guess of the parameter values for the next time step. In the case of a temporally homogeneous process with constant parameters, iterative multiplication of the posterior distributions with the likelihood distributions, P_{t}=P_{t−1}L_{t} (Fig. 2), would yield an increasingly accurate estimate of the parameter values. For heterogeneous processes, however, the possibility of changing parameters has to be taken into account. This is achieved by a transformation K of the posterior distribution, P_{t}=K(P_{t−1}L_{t}). The transformation K (blurring the posterior distribution and preventing it from falling below a small cutoff value) is chosen such that both gradual and abrupt parameter changes can be identified (see Methods section). Finally, we perform the same sequential parameter inference in the reverse time direction (not shown in Fig. 2) and combine both distributions.
We validate this method by simulating random walk trajectories from prescribed stepwise (Fig. 3a) or gradually (Fig. 3c) changing parameter sequences {(q_{t},a_{t})}. We then reconstruct the parameter sequences from the simulated trajectories by sequential Bayesian inference. The mean values of the posterior distributions fluctuate around the ‘true’ parameter values, but follow the prescribed time evolution closely, both for abrupt (Fig. 3a) and gradual (Fig. 3c) parameter changes. We also find that the Bayesian method is superior to a maximum likelihood estimation with a sliding time window. The maximum likelihood estimation method cannot handle abrupt and gradual parameter changes equally well, and the user must find a compromise between long time windows that wash out sudden parameter jumps and short windows that lead to noisy results (Supplementary Note 4).
Heterogeneity of measured random walks
We next apply the Bayesian inference method to measured cell trajectories. An example of the parameter evolution of a cell migrating on uncoated 2D plastic is shown in Fig. 4. We find large variations of cell behaviour, both with time (Fig. 4a,b) and between individual cells (Fig. 4c). By plotting the cell activity versus persistence for all time points, we further find that individual cells can occupy different regions of the (q,a) parameter plane (Fig. 4c). Some cells remain in a small compact region of the (q,a) plane during the entire measurement period (brown), whereas others jump between disjoint subregions (green) or continuously change their parameters over time (Fig. 4c).
Superstatistical data evaluation
Joint probability distributions. We average the posterior distributions p(q,a) for all time points and all cells measured in the same environment (Fig. 5b). In contrast to MSD and SWD, the ensemble-averaged posterior distributions show large differences between all three environments. The peak position of the distribution shows the lowest persistence and highest activity for collagen, and the highest persistence and lowest activity for fibronectin-coated plastic. Moreover, the spread of the distributions indicates that migration in collagen gels is more heterogeneous than migration on plastic. The p(q,a) distributions thus provide characteristic ‘fingerprints’ of the migration environments that can be used for automatic trajectory classification. In a ‘leave-one-out’ cross-validation, we were able to assign ∼90% of the cell trajectories to the correct environment (see Methods section).
Parameter correlations. The auto- and cross-correlations of the time-dependent parameters q_{t} and a_{t} reveal even larger differences between migration strategies in 2D versus 3D environments. Autocorrelation times are noticeably longer in a 3D environment (Fig. 5c), where the local biopolymer fibre configuration provides a guiding or trapping microstructure that influences a given migration mode for long time periods. Large differences between different environments are also visible in the cross-correlations of the time-dependent parameters (Fig. 5d). On fibronectin-coated plastic, persistence and activity are negatively correlated for up to 100 min. This is consistent with the long-known observation that on highly adhesive surfaces, cells maintain persistent motion by performing sequences of small steps along the same direction^{25}. This continuous gliding motion is not seen on less adhesive, uncoated plastic surfaces. Instead, we observe a weakly positive cross-correlation between q_{t} and a_{t}. In collagen, we find strong positive correlations between q_{t} and a_{t}, consistent with the observation that cells intermittently cover large distances with high directional persistence guided by long protrusions (Supplementary Movies 1 and 2).
Note that the activity parameter a_{t} should not be interpreted literally as the momentary cell speed u_{t}, but as a scale parameter that, together with q_{t}, determines the most probable value of the cell speed. To clarify this point, we also investigate the correlation between persistence q_{t} and momentary cell speed u_{t}. For migration on coated and uncoated plastic surfaces, we find a positive correlation between q_{t} and u_{t} (Supplementary Note 6). A similar relationship has been reported for a variety of different cell types migrating on fibronectin-coated surfaces^{26}. In collagen, however, persistence and cell migration speed are uncorrelated (Supplementary Note 6).
Effect of local microenvironment
In the previous section, we have tacitly assumed that the local microenvironment has an immediate effect on migration persistence and activity. To test this assumption, we use a microstructured environment and measure cell migration through a linear (1D) array of sequentially narrowing channels and wider chambers. After extracting the time-dependent parameters q_{t} and a_{t} from individual cell trajectories (Fig. 6a), we plot q_{x} (Fig. 6b) and a_{x} (Fig. 6c) versus the x-position.
The precise migration mechanism of different cell types through such environments is not well understood and may involve integrin-mediated adhesion-dependent^{27} or adhesion-independent^{28} strategies. Regardless of the migration mechanism, our microstructured environment forces the cells to adapt to different degrees of confinement in rapid succession. A cell that enters a channel first has to polarize and deform its nucleus. It can then transit the channel with high persistence and activity. When the cell nucleus exits the narrow channel and enters the wider chamber, persistence and activity decrease markedly. Thus, the superstatistical migration parameters are strongly correlated with the local properties of the environment.
Superstatistical modelling
We construct a series of simple models of cell migration that approximate the statistical properties of q_{t} and a_{t} found in the data. All models are based on an AR1 process. The superstatistical parameters q_{t} and a_{t} switch to new values, drawn from a fixed distribution p_{model}(q,a), after exponentially distributed time intervals with mean value T_{model}. This regime-switching approach leads to exponentially decaying autocorrelations of the parameters with correlation time T_{model}. We choose T_{model}=200 min, taken from migration experiments in collagen (Fig. 5c). The parameter distribution p_{model}(q,a) is modelled as a bivariate Gaussian, centred at the main peak of the experimentally observed distribution p(q,a) (Fig. 5b).
We first consider the limit of zero variance for p(q,a), which corresponds to a homogeneous correlated random walk with constant q and a. In this case, the MSD crosses over from a ballistic (slope 2) to a diffusive (slope 1) behaviour at a specific lag time that depends only on the persistence q. Increasing the variance of q generates a continuous mixture of crossover times, and the MSD starts to resemble a power law (Supplementary Note 1). In addition, the SWD becomes leptokurtic, but it does not show the exponential distribution found in the experiments. Finally, using an asymmetric bivariate normal distribution with positive correlations between q and a (Fig. 5b, dashed grey ellipse), the SWD, MSD and correlation functions match the measured data nearly perfectly (Fig. 5a,c,d, dashed grey line).
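The ballistic-to-diffusive crossover of the homogeneous case can be made explicit with a short derivation. It is a sketch under the assumption of a stationary AR1 process u_{t}=q\,u_{t−1}+a\,n_{t} in d dimensions with constant 0<q<1, so that the displacement autocorrelation is \langle \mathbf{u}_t\cdot\mathbf{u}_{t+k}\rangle = d\,a^2 q^{|k|}/(1-q^2). The step count n and the correlation length n_c are symbols introduced here for illustration.

```latex
\begin{aligned}
\langle r^{2}(n\,\delta t)\rangle
  &= \sum_{i=1}^{n}\sum_{j=1}^{n}\langle \mathbf{u}_i\cdot\mathbf{u}_j\rangle
   = \frac{d\,a^{2}}{1-q^{2}}\left[\,n + 2\sum_{k=1}^{n-1}(n-k)\,q^{k}\right] \\
  &= \frac{d\,a^{2}}{1-q^{2}}\left[\frac{n\,(1+q)}{1-q}
     - \frac{2q\,(1-q^{n})}{(1-q)^{2}}\right], \\[4pt]
\langle r^{2}\rangle &\approx \frac{d\,a^{2}}{1-q^{2}}\;n^{2}
   \quad (n \ll n_c,\; q \to 1), \qquad
\langle r^{2}\rangle \approx \frac{d\,a^{2}}{(1-q)^{2}}\;n
   \quad (n \gg n_c), \qquad
n_c = -\frac{1}{\ln q}.
\end{aligned}
```

The activity a only rescales the MSD, whereas the crossover lag n_c\,\delta t is set by q alone. A distribution of q values therefore produces a mixture of crossover times.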
This example demonstrates how superstatistics can recapitulate the anomalous features of heterogeneous random walks by mapping the complexity of the system into a suitable distribution of parameter values p_{model}(q,a), while keeping the underlying stochastic process simple.
Discussion
In this study, we have applied the superstatistical framework to the specific example of tumour cell migration in environments with different dimensionality. The same approach, including the particular choice of the AR1 process as a local model, can be used for many other heterogeneous random walks in life sciences. For this purpose, we provide a Python implementation of the Bayesian algorithm for inferring the time-dependent parameters q_{t} and a_{t} from random walk trajectories (Supplementary Software 1).
In principle, a sequential, grid-based inference of superstatistical parameters can also be performed by a Markov Chain Monte Carlo approach. In this case, the vector of model parameters to be inferred consists of the full set {(q_{t},a_{t})} of superstatistical parameters for all time points. In the past, Markov Chain Monte Carlo methods, mostly based on the Metropolis-Hastings algorithm, exhibited serious convergence problems when applied to such high-dimensional parameter spaces. Only recently, a novel sampling method based on Hamiltonian Monte Carlo has markedly improved the convergence^{29}. Our preliminary tests demonstrate that this new sampling algorithm can indeed find the parameter vector of a hierarchical superstatistical model, albeit with a considerably longer computation time.
Our superstatistical framework can be readily adapted to more complex types of stochastic systems. In particular, the AR1 process can be replaced by any parameterized model with a defined likelihood function. For example, fluorescent beads attached to the cytoskeleton of living cells show fluctuations that can be described by a particle diffusing in a harmonic potential well^{30,31}. Due to cytoskeletal remodelling, the centre position of the potential well changes on longer timescales. Together, this process can be modelled with an inhomogeneous random walk of the centre position, superposed with a harmonic overdamped oscillator^{32}. As a final example, recordings of neural spike trains are frequently modelled as inhomogeneous Poisson processes with a time-dependent spike rate. In this case, sequential Bayesian inference can be used to extract the local spike rates from the time series of measured inter-spike intervals.
Methods
Cell culture and migration measurements
For migration experiments in collagen, on plastic and on fibronectin-coated plastic, we use MDA-MB-231 breast carcinoma cells (obtained from the American Type Culture Collection (ATCC)). Cells are cultured in 75 cm^{2} flasks in Dulbecco's modified Eagle's medium (DMEM) (1 g l^{−1} D-glucose) and 10% fetal bovine serum, 1% penicillin/streptomycin at 37 °C, 5% CO_{2} and 95% humidity. Cells are passaged every second day. Trypsin-ethylenediaminetetraacetic acid (Trypsin-EDTA) is used to detach cells.
To study cell migration on planar surfaces, we use tissue culture-treated plastic dishes with and without fibronectin coating (69 and 177 cells, respectively). In all 2D experiments, the sample time interval between frames was δt_{2D}=1 min.
For 3D experiments, we use reconstituted collagen gels (Fig. 1a) with controlled material properties as a substitute for biological tissue. At a collagen concentration of 2.4 mg ml^{−1}, these gels have an average pore radius of 1.3 μm and a shear modulus of 108 Pa (ref. 33). Cells are mixed with collagen solution before polymerization at a concentration of 15,000 cells per ml. The x-, y- and z-position of the cells within the collagen gel is determined from a characteristic intensity profile of the refraction pattern around the nucleus of the cell (inset of Fig. 1b). A 3D deconvolution of the intensity profile then defines the cell position with an accuracy of 2 μm (r.m.s.). Cell tracking is performed automatically in real time, and the cell position is used to keep the motorized microscope x–y-centred and z-focused onto the cell at all times. Using a time-sharing mode, we are able to observe and follow up to 20 individual migrating cells within the same cell culture well over prolonged time periods (24 h). We record discrete cell positions with a sample time interval of δt_{3D}=2.5 min (Fig. 1c). Cells undergoing cell division during the time of observation were excluded. The number of analysed cells in collagen was 65.
We also study the migration of primary inflammatory ductal breast cancer cells (gift from Pamela Strissel and Reiner Strick, Women's Hospital, University Clinics Erlangen) within a microfabricated channel structure made of polydimethylsiloxane. The structure has a constant height of 3.7 μm and consists of 15 consecutive channels with diameters decreasing from 11 to 1.7 μm, separated by 20 × 20 μm wide chambers (Fig. 6a). After staining the cell nuclei with Hoechst 33342 (1 μg ml^{−1}), the centre positions are tracked with a sample time interval of δt_{1D}=5 min. For superstatistical evaluation, a cell is chosen that passed through two successive channels within 150 min.
Bayesian parameter inference
Since the iterative updating of the parameter distribution described in this work is not analytically tractable, the presented algorithm is implemented using discretized probability distributions. Based on equally spaced parameter values q_{i} and a_{j} (i∈{1,2,..,N_{q}}, j∈{1,2,..,N_{a}}), a distribution p(q,a) can be approximated by an N_{q} × N_{a}-dimensional matrix: (p(q,a))_{ij}=p(q=q_{i},a=a_{j}). The multiplication of two distributions is thus reduced to the element-wise multiplication of two matrices.
The prior distribution P_{t}=p(q_{t+1},a_{t+1}) holds the preliminary belief about the latent parameter values for the next time step, before seeing the corresponding data point. Using the data point u_{t+1}, we subsequently update the prior distribution by multiplying it with the likelihood L_{t+1}=p(u_{t+1}|q_{t+1},a_{t+1};u_{t}) that describes the probability of observing a certain measurement u_{t+1}, given the values of the latent parameters (and the previous measurement u_{t}).
For the underlying AR1 process, the likelihood is given by
L_{t+1}(q_{t+1},a_{t+1})=(2πa_{t+1}^{2})^{−d/2} exp(−|u_{t+1}−q_{t+1}u_{t}|^{2}/(2a_{t+1}^{2})),
where d is the number of dimensions of the velocity vectors (two in this study). Note that the inference method can also be applied to other underlying stochastic processes with more complicated likelihood functions. As our approach uses only the numerical values of the likelihood for discrete points of the (q_{t},a_{t}) grid, the likelihood need not be expressed analytically as long as it can be computed numerically.
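To illustrate how a likelihood is evaluated on the discrete parameter grid, the sketch below assumes the Gaussian AR1 likelihood (2πa^{2})^{−d/2} exp(−|u_{t+1}−q u_{t}|^{2}/(2a^{2})), which follows from unit-variance normal noise in each of the d components. The function name `ar1_likelihood`, the grid ranges and the example displacements are hypothetical choices, not values from the paper.

```python
import numpy as np

def ar1_likelihood(u_prev, u_next, q_grid, a_grid):
    """Likelihood of observing u_next after u_prev, as an (N_q, N_a) matrix."""
    u_prev = np.asarray(u_prev, dtype=float)
    u_next = np.asarray(u_next, dtype=float)
    d = u_next.size
    # Squared residual |u_next - q * u_prev|^2 for every q on the grid.
    resid2 = ((u_next[None, :] - q_grid[:, None] * u_prev[None, :]) ** 2).sum(axis=1)
    a2 = a_grid[None, :] ** 2
    return (2.0 * np.pi * a2) ** (-d / 2.0) * np.exp(-resid2[:, None] / (2.0 * a2))

# Equally spaced parameter values (assumed ranges and resolutions).
q_grid = np.linspace(-1.0, 1.0, 201)
a_grid = np.linspace(0.01, 2.0, 200)

L = ar1_likelihood([1.0, 0.0], [0.5, 0.5], q_grid, a_grid)
i, j = np.unravel_index(np.argmax(L), L.shape)
q_best, a_best = q_grid[i], a_grid[j]   # grid point with the highest likelihood
```

A single pair of displacements constrains the parameters only weakly; the sequential multiplication of many such matrices is what sharpens the estimate.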
The next prior P_{t+1} is computed from the posterior distribution as P_{t+1}=K(P_{t}L_{t+1}), where K is a transformation that accounts for both gradual and abrupt parameter changes. To allow for abrupt parameter changes, we set the minimal probability of the posterior distribution to p_{min}=10^{−7}, so that the distribution cannot fall below this cutoff value anywhere on the grid.
To allow for gradual parameter changes, we blur the distribution by convolution with a box kernel B of radius R=0.03 defined as
B(q,a)∝Θ(R−|q|)Θ(R−|a|).
Here, Θ(x) is the Heaviside step function. The posterior distribution of the parameters is normalized at every time step, since the transformation K does not preserve normalization. A systematic procedure to find optimal values for the two parameters p_{min} and R is given in Supplementary Note 5.
Starting with a flat prior P_{0} and moving forward in time using the iteration described above, a series of ‘forward’ priors (written here as P^{F}_{t}) is generated. In the same way, we can start the iteration at the end of a trajectory and build a series of ‘backward’ prior distributions (written here as P^{B}_{t}). Finally, for each time step t, we multiply the t−1 and t+1 priors with the likelihood L_{t} to compute the final posterior distribution of the parameters (q_{t},a_{t}), so that p(q_{t},a_{t})∝P^{F}_{t−1}L_{t}P^{B}_{t+1}. Note that the inference algorithm is run in both directions of time to ensure that for each estimated parameter pair (q_{t},a_{t}), all measured data points are taken into account and not only those of earlier times 0…t. In principle, however, the algorithm can also be used only in the forward direction, which may be useful for online analysis of a data stream.
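The complete forward and backward iteration can be sketched as follows. This is a simplified illustration and not the authors' Supplementary Software. All function names are hypothetical, and several choices are assumptions: the Gaussian AR1 likelihood, a cutoff p_min applied to normalized grid values, a box blur whose radius R is converted to grid points with edge padding, the same likelihood sequence in both time directions, and the grid ranges and test parameters.

```python
import numpy as np

def likelihood(u_prev, u_next, q_grid, a_grid):
    """Gaussian AR1 likelihood on the (q, a) grid, shape (N_q, N_a)."""
    d = u_next.size
    resid2 = ((u_next[None, :] - q_grid[:, None] * u_prev[None, :]) ** 2).sum(axis=1)
    a2 = a_grid[None, :] ** 2
    return (2.0 * np.pi * a2) ** (-d / 2.0) * np.exp(-resid2[:, None] / (2.0 * a2))

def box_blur(p, rq, ra):
    """Average over a (2*rq+1) x (2*ra+1) box of grid points (edge-padded)."""
    padded = np.pad(p, ((rq, rq), (ra, ra)), mode="edge")
    out = np.zeros_like(p)
    for di in range(2 * rq + 1):
        for dj in range(2 * ra + 1):
            out += padded[di:di + p.shape[0], dj:dj + p.shape[1]]
    return out / ((2 * rq + 1) * (2 * ra + 1))

def transform_K(p, p_min, rq, ra):
    """Normalize, apply the minimal probability, blur, and normalize again."""
    p = np.maximum(p / p.sum(), p_min)
    p = box_blur(p, rq, ra)
    return p / p.sum()

def infer(u, q_grid, a_grid, p_min=1e-7, R=0.03):
    """Posterior over (q_t, a_t) for t = 1..n-1, shape (n-1, N_q, N_a)."""
    n = len(u)
    rq = int(round(R / (q_grid[1] - q_grid[0])))
    ra = int(round(R / (a_grid[1] - a_grid[0])))
    flat = np.full((q_grid.size, a_grid.size), 1.0 / (q_grid.size * a_grid.size))
    L = [None] + [likelihood(u[t - 1], u[t], q_grid, a_grid) for t in range(1, n)]
    fwd = [flat] * n              # fwd[t]: prior built from data up to time t
    for t in range(1, n):
        fwd[t] = transform_K(fwd[t - 1] * L[t], p_min, rq, ra)
    bwd = [flat] * (n + 1)        # bwd[t]: prior built from data at time t and later
    for t in range(n - 1, 0, -1):
        bwd[t] = transform_K(bwd[t + 1] * L[t], p_min, rq, ra)
    post = np.empty((n - 1, q_grid.size, a_grid.size))
    for t in range(1, n):
        p = fwd[t - 1] * L[t] * bwd[t + 1]
        post[t - 1] = p / p.sum()
    return post

# Synthetic test trajectory with an abrupt change of persistence (assumed values).
rng = np.random.default_rng(1)
n = 400
q_true = np.where(np.arange(n) < 200, 0.8, 0.0)
a_true = 0.5
u = np.zeros((n, 2))
for t in range(1, n):
    u[t] = q_true[t] * u[t - 1] + a_true * rng.normal(size=2)

q_grid = np.linspace(-1.0, 1.0, 81)
a_grid = np.linspace(0.05, 2.0, 79)
post = infer(u, q_grid, a_grid)
q_mean = (post.sum(axis=2) * q_grid).sum(axis=1)   # posterior mean of q_t
a_mean = (post.sum(axis=1) * a_grid).sum(axis=1)   # posterior mean of a_t
```

Dropping the backward pass and using `fwd[t - 1] * L[t]` alone gives the forward-only variant mentioned for online analysis.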
Temporal and ensemble averages
Throughout this paper, the symbol 〈f_{t}〉_{t} denotes temporal averaging over all discrete time points. For our data evaluation (SWD, MSD and auto- and cross-correlations), we have additionally ensemble-averaged the time-averaged properties over the individual cells of the same migration environment.
Auto- and cross-correlations
The autocorrelation C_{qq}(Δt) of the persistence parameter q_{t} is defined in the standard way as C_{qq}(Δt)=〈(q_{t}−〈q〉)(q_{t+Δt}−〈q〉)〉_{t}/σ_{q}^{2}, where 〈q〉=〈q_{t}〉_{t} is the temporal average and σ_{q}^{2}=〈(q_{t}−〈q〉)^{2}〉_{t} is the variance of the parameter. The definition of the activity autocorrelation C_{aa}(Δt) is analogous. Finally, the cross-correlation C_{qa}(Δt) between the two parameters is defined as C_{qa}(Δt)=〈(q_{t}−〈q〉)(a_{t+Δt}−〈a〉)〉_{t}/(σ_{q}σ_{a}).
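A normalized correlation function of this kind can be computed for a single parameter time series as sketched below. The sketch assumes the standard definition with the second series evaluated at the later time t+Δt, counts lags in time steps, and omits the ensemble average over cells. The function name `correlation` and the synthetic test series are hypothetical.

```python
import numpy as np

def correlation(x, y, max_lag):
    """C_xy(k) = <(x_t - <x>)(y_{t+k} - <y>)>_t / (sigma_x * sigma_y), k = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    norm = x.std() * y.std()
    n = len(x)
    return np.array([np.mean(dx[: n - k] * dy[k:]) / norm for k in range(max_lag + 1)])

# Synthetic series (an assumption): a scalar AR1 signal with coefficient 0.9 ...
rng = np.random.default_rng(0)
x = np.zeros(20_000)
for t in range(1, len(x)):
    x[t] = 0.9 * x[t - 1] + rng.normal()
# ... and a copy of it delayed by three time steps.
y = np.roll(x, 3)

c_xx = correlation(x, x, max_lag=20)   # autocorrelation, decays roughly as 0.9**k
c_xy = correlation(x, y, max_lag=20)   # cross-correlation, peaks at the delay
peak_lag = int(np.argmax(c_xy))
```

Applied to the inferred posterior means, `correlation(q, q, ...)`, `correlation(a, a, ...)` and `correlation(q, a, ...)` would give estimates of C_{qq}, C_{aa} and C_{qa}.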
Superstatistical modelling of cell migration
To model the statistical properties of cell trajectories in collagen (Fig. 5, grey dashed lines), we use a superstatistical regime-switching process with an average switching time of τ=200 min. Parameter values (q_{t},a_{t}) are drawn from a bivariate Gaussian distribution, (q_{t},a_{t})∼N(μ,Σ), centred around the mean μ=(μ_{q},μ_{a})=(−0.05,0.55). The covariance matrix is Σ=σ^{2}((1,ρ),(ρ,1)) with σ=0.3 and ρ=0.65. The 50% credibility region of the distribution is shown in Fig. 5b as a grey dashed ellipse. The values of q_{t} are restricted to the interval [−1,1].
Environment-specific cell classification
For ‘leave-one-out’ cross-validation, we calculate the squared deviation D between the time-averaged posterior distribution of a single cell, denoted p_{single}(q,a), and each of the three ensemble- and time-averaged distributions p_{env}(q,a) (excluding that one cell). The calculation of the squared deviation is carried out as a sum over the N_{q} × N_{a} grid:
D=∑_{i=1}^{N_{q}}∑_{j=1}^{N_{a}}(p_{single}(q_{i},a_{j})−p_{env}(q_{i},a_{j}))^{2}.
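The regime-switching model can be sketched in a few lines. The values τ=200 min, μ=(−0.05, 0.55), σ=0.3 and ρ=0.65 are taken from the text; everything else is an assumption made for illustration: the covariance form σ^{2}[[1, ρ], [ρ, 1]], the time step of 5 min, redrawing samples until −1<q<1 and a>0 as the way to restrict the parameters, and the function names.

```python
import numpy as np

def draw_parameters(rng, mu, cov):
    """Draw (q, a) from the bivariate Gaussian, rejecting values outside the valid range."""
    while True:
        q, a = rng.multivariate_normal(mu, cov)
        if -1.0 < q < 1.0 and a > 0.0:
            return q, a

def simulate_regime_switching(n_steps, dt=5.0, tau=200.0, mu=(-0.05, 0.55),
                              sigma=0.3, rho=0.65, dim=2, seed=0):
    """AR1 walk whose parameters jump to new values after exponential waiting times."""
    rng = np.random.default_rng(seed)
    cov = sigma ** 2 * np.array([[1.0, rho], [rho, 1.0]])   # assumed covariance form
    q_t = np.empty(n_steps)
    a_t = np.empty(n_steps)
    u = np.zeros((n_steps, dim))
    q, a = draw_parameters(rng, mu, cov)
    next_switch = rng.exponential(tau)        # waiting time in minutes
    for t in range(n_steps):
        while t * dt >= next_switch:          # start a new regime
            q, a = draw_parameters(rng, mu, cov)
            next_switch += rng.exponential(tau)
        q_t[t], a_t[t] = q, a
        if t > 0:
            u[t] = q * u[t - 1] + a * rng.normal(size=dim)
    return np.cumsum(u, axis=0), q_t, a_t

r, q_t, a_t = simulate_regime_switching(n_steps=20_000)
qa_correlation = np.corrcoef(q_t, a_t)[0, 1]   # positive, reflecting rho > 0
```

The returned positions `r` can then be passed to the same SWD, MSD and correlation analysis as measured trajectories.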
A cell is counted as correctly classified if the deviation to its true environment is the smallest, compared with the other two environments.
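The classification rule can be sketched as below, assuming that each cell is represented by its time-averaged posterior on a common (q,a) grid and that D is the sum of squared differences over all grid points. The function name `classify_leave_one_out` and the synthetic posteriors (Gaussian bumps with slightly jittered centres) are hypothetical and only serve to exercise the code.

```python
import numpy as np

def classify_leave_one_out(posteriors, labels):
    """Assign each cell to the environment whose mean posterior (without that cell) is closest."""
    posteriors = np.asarray(posteriors, dtype=float)   # shape (n_cells, N_q, N_a)
    labels = np.asarray(labels)
    envs = np.unique(labels)
    predicted = np.empty_like(labels)
    for i in range(len(labels)):
        keep = np.arange(len(labels)) != i             # leave cell i out
        deviations = []
        for env in envs:
            p_env = posteriors[keep & (labels == env)].mean(axis=0)
            deviations.append(np.sum((posteriors[i] - p_env) ** 2))   # squared deviation D
        predicted[i] = envs[int(np.argmin(deviations))]
    return predicted

# Synthetic posteriors: three environments with distinct peak positions on a 20 x 20 grid.
rng = np.random.default_rng(0)
ii, jj = np.meshgrid(np.arange(20), np.arange(20), indexing="ij")
centres = {0: (5, 5), 1: (10, 12), 2: (15, 8)}
posteriors, labels = [], []
for env, (ci, cj) in centres.items():
    for _ in range(10):
        si, sj = rng.integers(-1, 2, size=2)           # jitter of the peak by up to one cell
        p = np.exp(-((ii - ci - si) ** 2 + (jj - cj - sj) ** 2) / (2 * 2.0 ** 2))
        posteriors.append(p / p.sum())
        labels.append(env)

predicted = classify_leave_one_out(posteriors, labels)
accuracy = np.mean(predicted == np.asarray(labels))
```

Here `accuracy` is the fraction of correctly classified cells, the quantity reported as roughly 90% for the measured trajectories.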
Additional information
How to cite this article: Metzner, C. et al. Superstatistical analysis and modelling of heterogeneous random walks. Nat. Commun. 6:7516 doi: 10.1038/ncomms8516 (2015).
References
 1.
Pedrycz W., Chen S. (eds.) Time Series Analysis, Modeling and Applications Springer (2013).
 2.
Zumbach, G. Discrete Time Series, Processes, and Applications in Finance Springer (2013).
 3.
Kirchgässner, G., Wolters, J. & Hassler, U. Introduction to Modern Time Series Analysis Springer (2013).
 4.
Rabiner, L. A tutorial on hidden markov models and selected applications in speech recognition. Proc. IEEE 77, 257–286 (1989).
 5.
Wu, P.H., Giri, A., Sun, S. X. & Wirtz, D. Threedimensional cell migration does not follow a random walk. Proc. Natl Acad. Sci. USA 111, 3949–3954 (2014).
 6.
Bursac, P. et al. Cytoskeletal remodelling and slow dynamics in the living cell. Nat. Mater. 4, 557–561 (2005).
 7.
Dieterich, P., Klages, R., Preuss, R. & Schwab, A. Anomalous dynamics of cell migration. Proc. Natl Acad. Sci. USA 105, 459–463 (2008).
 8.
Beck, C. & Cohen, E. Superstatistics. Physica A 322, 267–275 (2003).
 9.
Beck, C., Cohen, E. & Swinney, H. From time series to superstatistics. Phys. Rev. E 72, (2005).
 10.
Beck, C. Generalized statistical mechanics for superstatistical systems. Phil. Trans. R. Soc. A 369, 453–465 (2011).
 11.
Van der Straeten, E. & Beck, C. Superstatistical fluctuations in time series: Applications to shareprice dynamics and turbulence. Phys. Rev. E 80, 036108 (2009).
 12.
Rorth, P. Collective cell migration. Annu. Rev. Cell. Dev. Biol. 25, 407–429 (2009).
 13.
Rorth, P. Fellow travellers: emergent properties of collective cell migration. EMBO Rep. 13, 984–991 (2012).
 14.
Friedl, P. & Gilmour, D. Collective cell migration in morphogenesis, regeneration and cancer. Nat. Rev. Mol. Cell Biol. 10, 445–457 (2009).
 15.
Potdar, A. A., Jeon, J., Weaver, A. M., Quaranta, V. & Cummings, P. T. Human mammary epithelial cells exhibit a bimodal correlated random walk pattern. PLoS ONE 5, e9636 (2010).
 16.
Demou, Z. N. & McIntire, L. V. Fully automated three-dimensional tracking of cancer cells in collagen gels: determination of motility phenotypes at the cellular level. Cancer Res. 62, 5301–5307 (2002).
 17.
Niggemann, B. et al. Tumor cell locomotion: differential dynamics of spontaneous and induced migration in a 3D collagen matrix. Exp. Cell Res. 298, 178–187 (2004).
 18.
Takagi, H., Sato, M. J., Yanagida, T. & Ueda, M. Functional analysis of spontaneous cell movement under different physiological conditions. PLoS ONE 3, e2648 (2008).
 19.
Selmeczi, D. et al. Cell motility as random motion: A review. Eur. Phys. J. 157, 1–15 (2008).
 20.
Bischofs, I. B. & Schwarz, U. S. Cell organization in soft media due to active mechanosensing. Proc. Natl Acad. Sci. USA 100, 9274–9279 (2003).
 21.
Koch, T. M., Münster, S., Bonakdar, N., Butler, J. P. & Fabry, B. 3D traction forces in cancer cell invasion. PLoS ONE 7, e33476 (2012).
 22.
Höfling, F., Frey, E. & Franosch, T. Enhanced diffusion of a needle in a planar array of point obstacles. Phys. Rev. Lett. 101, 120605 (2008).
 23.
Dickinson, R. B., Guido, S. & Tranquillo, R. T. Biased cell migration of fibroblasts exhibiting contact guidance in oriented collagen gels. Ann. Biomed. Eng. 22, 342–356 (1994).
 24.
Provenzano, P. P., Inman, D. R., Eliceiri, K. W., Trier, S. M. & Keely, P. J. Contact guidance mediated three-dimensional cell migration is regulated by Rho/ROCK-dependent matrix reorganization. Biophys. J. 95, 5374–5384 (2008).
 25.
DiMilla, P. A., Stone, J. A., Quinn, J. A., Albelda, S. M. & Lauffenburger, D. A. Maximal migration of human smooth muscle cells on fibronectin and type IV collagen occurs at an intermediate attachment strength. J. Cell Biol. 122, 729–737 (1993).
 26.
Maiuri, P. et al. Actin flows mediate a universal coupling between cell speed and cell persistence. Cell 1–13 (2015).
 27.
Mierke, C. T., Frey, B., Fellner, M., Herrmann, M. & Fabry, B. Integrin α5β1 facilitates cancer cell invasion through enhanced contractile forces. J. Cell. Sci. 124, 369–383 (2011).
 28.
Hawkins, R. J. et al. Pushing off the walls: a mechanism of cell motility in confinement. Phys. Rev. Lett. 102, 1–4 (2009).
 29.
Hoffman, M. D. & Gelman, A. The No-U-Turn Sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15, 1351–1381 (2014).
 30.
Metzner, C., Raupach, C., Paranhos Zitterbart, D. & Fabry, B. Simple model of cytoskeletal fluctuations. Phys. Rev. E 76, 021925 (2007).
 31.
Raupach, C. et al. Stress fluctuations and motion of cytoskeletal-bound markers. Phys. Rev. E 76, 011918 (2007).
 32.
Metzner, C., Raupach, C., Mierke, C. T. & Fabry, B. Fluctuations of cytoskeleton-bound microbeads: the effect of bead–receptor binding dynamics. J. Phys. Condens. Matter 22, 194105 (2010).
 33.
Mickel, W. et al. Robust pore size analysis of filamentous networks from three-dimensional confocal microscopy. Biophys. J. 95, 6072–6080 (2008).
Acknowledgements
We thank Thorsten Koch for help with image acquisition, and Caroline Gluth and Amy Rowat for helping with microfluidic device design and preparation. We also thank Pamela Strissel and Reiner Strick (University of Erlangen, University Clinics) for establishment of a primary inflammatory breast cancer cell line and for sharing these cells with our laboratory. This work was supported by the Deutsche Forschungsgemeinschaft, the Research Training Group 1962 ‘Dynamic Interactions at Biological Membranes: From Single Molecules to Tissue’, and the National Institutes of Health.
Author information
Affiliations
Department of Physics, Biophysics Group, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen 91052, Germany
Claus Metzner, Christoph Mark, Julian Steinwachs, Lena Lautscham, Franz Stadler & Ben Fabry
Contributions
C.Me. and B.F. designed the study. J.S. and L.L. developed the data acquisition software and performed the cell experiments. C.Me., C.Ma. and F.S. developed the theoretical model and analyzed the data. C.Me., B.F. and C.Ma. wrote the paper. All authors read and approved the final manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to Claus Metzner.
Supplementary information
PDF files
 1.
Supplementary Information
Supplementary Figures 1–12 and Supplementary Notes 1–6
Videos
 1.
Supplementary Movie 1
Time-lapse phase contrast images of an MDA-MB-231 tumour cell migrating within a collagen gel over the course of 7 h. Time is indicated in the upper-left corner (in h and min). The microscope automatically follows the movement of the cell to keep it in focus at all times. Note that the cell centre moves persistently despite frequent changes in cell shape.
 2.
Supplementary Movie 2
Time-lapse integrated modulation contrast images of MDA-MB-231 tumour cells migrating within a collagen gel over 250 min. Time is indicated in the lower-right corner (in min). The cell probes its environment with a long protrusion that guides the movement of the cell body.
 3.
Supplementary Movie 3
Time-lapse integrated modulation contrast images of MDA-MB-231 tumour cells migrating on tissue treated plastic over 18 h. Time is indicated in the upper-left corner (in h and min). Cells retrace their own paths or those of other cells. The scale bar is 100 μm.
 4.
Supplementary Movie 4
Time-lapse images of MDA-MB-231 tumour cells migrating on fibronectin-coated plastic over 14 h. Time is indicated in the upper-left corner (in h and min). Note that cells move very persistently, only rarely retracing their paths. The scale bar is 100 μm.
Zip files
 1.
Supplementary Software 1
Python scripts and a corresponding readme file. The first script, 'simulatedExamples.py', demonstrates our Sequential Bayesian Inference algorithm on simulated data and provides extended documentation of the implementation details. The second script, 'dataAnalyzer.py', allows users to analyze arbitrary two-dimensional velocity data using Sequential Bayesian Inference. The readme file describes the requirements and usage of the software.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/