Traction force microscopy with optimized regularization and automated Bayesian parameter selection for comparing cells

Adherent cells exert traction forces onto their environment, which allows them to migrate, to maintain tissue integrity, and to form complex multicellular structures during developmental morphogenesis. Traction force microscopy (TFM) enables the measurement of traction forces on an elastic substrate and thereby provides quantitative information on cellular mechanics in a perturbation-free fashion. In TFM, traction is usually calculated via the solution of a linear system, which is complicated by undersampled input data, acquisition noise, and large condition numbers for some methods. Therefore, standard TFM algorithms either employ data filtering or regularization. However, these approaches require a manual selection of filter or regularization parameters and consequently exhibit a substantial degree of subjectiveness. This shortcoming is particularly serious when cells in different conditions are to be compared, because optimal noise suppression needs to be adapted for every situation, which invariably results in systematic errors. Here, we systematically test the performance of new methods from computer vision and Bayesian inference for solving the inverse problem in TFM. We compare two classical schemes, L1- and L2-regularization, with three previously untested schemes, namely Elastic Net regularization, Proximal Gradient Lasso, and Proximal Gradient Elastic Net. Overall, we find that Elastic Net regularization, which combines L1 and L2 regularization, outperforms all other methods with regard to accuracy of traction reconstruction. Next, we develop two methods, Bayesian L2 regularization and Advanced Bayesian L2 regularization, for automatic, optimal L2 regularization. Using artificial and experimental data, we show that these methods enable robust reconstruction of traction without requiring a difficult selection of regularization parameters specifically for each data set.
Thus, Bayesian methods can mitigate the considerable uncertainty inherent in comparing cellular tractions in different conditions.


Introduction
Mechanical forces between cells and their embedding matrix are essential for a variety of biological processes, ranging from the migration of cells, including immune cells and cancer cells, to tissue maintenance and organ development; see [1-7] for only a few of the many review articles on this topic. Many of the relevant processes occur on a micrometer or sub-micrometer length scale, for instance in nascent cell adhesion sites, filopodia, and bacterial adhesion. To understand the mechanics of these processes and their biological control, reliable and accurate methods for the measurement of cellular forces are required.
Traction force microscopy (TFM) is a versatile and perturbation-free method yielding a spatial image of the substrate stress exerted by cells on relatively soft elastic gel substrates. This method has its origins in pioneering work by Harris et al., 8 who employed flexible silicone substrates to investigate the mechanical forces that cells generate. 9-14 Traction force microscopy includes three distinct procedures, illustrated in Fig. 1(a): (1) Cells are plated on an elastic substrate containing fiducial markers, for instance fluorescent beads or quantum dots, that allow gel deformation to be quantified visually. 15,16 The deformations caused by adherent cells are recorded by taking images of the gel before and after removing the cell. (2) A discrete gel displacement field u is calculated by tracking the markers. The most common techniques for tracking are particle tracking velocimetry (PTV) and particle image velocimetry (PIV). (3) Finally, the traction force field f is calculated from the displacement field u by making use of a mechanical model of the elastic substrate. A variety of methods exist for this purpose, including finite element methods, 17-21 boundary element methods, 15,22,23 and methods operating in Fourier space. 11,21,23-25 Usually, calculation of traction from displacement requires either filtering or regularization approaches to reduce the effect of noise. The TFM methodology is limited by two common serious issues that introduce systematic errors. First, the resolution of the measured traction is usually not high enough to resolve processes at micrometer-sized cellular structures. Secondly, the most commonly used TFM algorithms require the user to choose a filter or a regularization parameter, which introduces a considerable degree of subjectivity regarding smoothness and magnitude of the resulting traction. In this article, we suggest methods for improving the state of the art with respect to these issues.
In the standard TFM approach it is assumed that the substrate is a homogeneous, isotropic, and linear elastic half-space. The mechanical model relating a continuous displacement field U_i(x) to the traction force field F_j(x′) on a two-dimensional (x = (x_1, x_2)) surface of the gel is expressed as the integral equation 26

U_i(x) = ∫_Ω G_ij(x − x′) F_j(x′) dx′,  (1)

where Ω denotes the whole surface of the substrate. The integrand contains a Green's function G_ij(x) = (1 + ν)/(πE)[(1 − ν)δ_ij/r + ν x_i x_j/r^3], where E and ν represent the Young modulus and Poisson ratio, respectively. We also write r = |x|, and δ_ij is the Kronecker delta. Calculation of the traction F_j requires inversion of Eq. (1). A very popular and practical approach is to solve Eq. (1) in Fourier space. 23,24,27 With this approach, the inversion is often directly feasible if noise in the displacement data has been filtered prior to calculation of the traction. Optimal filtering, however, requires the input of a predefined filter function that imposes a smoothness constraint on the calculated traction. Moreover, spatial clustering of traction into sparse regions is not conserved when switching from real space to Fourier space. To take advantage of the sparsity of traction patterns for better reconstruction, one can solve Eq. (1) in real space. Here, the integral in Eq. (1) can be converted into a matrix product by discretizing the traction field F_j and interpolating it as a piecewise linear, continuous function using pyramidal shape functions h(x). 23,28 We write the two-dimensional, discrete displacement field as a 2m × 1 vector u, where m is the number of discretization nodes. The discrete traction field f is a 2n × 1 vector, where n is the number of nodes at which traction is prescribed. Then, Eq.
(1) becomes

u = Mf + s,  (2)

where we also explicitly included the linear acquisition noise s that is present in the experimental data. The matrix M represents the coefficients of a discretized integration. It can be calculated with different techniques. The required convolution of the Green's function with the shape functions h(x) can be done by numerical integration. 23 However, the computational effort can be significantly reduced by using the convolution theorem in Fourier space on regular grids. 28 In the supplementary S1, we show how this method for fast calculation of M can be extended to irregularly spaced measurements with the help of the shift theorem. Regardless of how M is calculated, its condition number is typically very large. This means that even the smallest noise s leads to very large errors if a direct inversion of Eq. (2) is attempted. A further characteristic of TFM is that the experimental data is always undersampled, which introduces a large degree of uncertainty. The challenge in traction force microscopy is to turn Eq. (2) into a well-posed problem while providing a traction field f with correct magnitude at high spatial resolution. The standard approaches employed in TFM are L1 or L2 regularization, which penalize norms of the solution to render traction calculation a well-defined numerical problem. However, linear problems involving large, ill-conditioned matrices occur commonly throughout engineering and physics. Consequently, a variety of solution methods exist that have not yet been employed in the context of TFM. One example is elastic net regularization, which figuratively behaves like a stretchable fishing net that retains "all the big fish" while removing the small background signal. 29 An alternative are proximal gradient methods, which usually operate in wavelet space and employ adaptive or non-adaptive thresholding of high spatial frequencies.
30,31 Proximal gradient methods are used, for instance, for reconstructing lost parts of an image, 32-34 for analysis of MRI data, 35 and for analysis of genomic data. 36,37 All regularization methods have the weakness that they require selection of one or more constant regularization parameters to allow discrimination of noise and signal.
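The extreme noise sensitivity that motivates all of this regularization machinery is easy to reproduce numerically. The following sketch uses a generic ill-conditioned toy matrix as a stand-in for the discretized Green's-function matrix M of Eq. (2) (it is not the actual TFM matrix, and the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A 12x12 Hilbert-type matrix serves as a generic stand-in for the
# ill-conditioned matrix M of Eq. (2).
n = 12
M = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
print(f"condition number: {np.linalg.cond(M):.2e}")  # astronomically large

f_true = rng.standard_normal(n)        # "true traction"
u = M @ f_true                         # noise-free "displacements"
s = 1e-6 * rng.standard_normal(n)      # tiny acquisition noise

f_naive = np.linalg.solve(M, u + s)    # direct inversion of Eq. (2)
rel_err = np.linalg.norm(f_naive - f_true) / np.linalg.norm(f_true)
print(f"relative error: {rel_err:.2e}")  # noise amplified by many orders
```

Regularization tames this amplification, but, as discussed above, at the price of parameters that must be chosen by the user.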
Bayesian statistics provides one solution to this problem. From the Bayesian point of view, regularization parameters can be seen as random variables that are picked for every experimental sample from a prior distribution. Thus, one infers the most probable value of the regularization parameters in a conceptually similar way as one infers the solution. 38 Such methods do not require the choice of a regularization parameter and are therefore potentially less prone to subjectiveness than classical regularization. The conceptual framework of employing hierarchical priors for data regularization has been exploited for a large number of different applications, for instance in astrophysics, 39-41 machine learning, 42 mechanical structure monitoring, 43 face recognition, 44 and radar imagery. 45 Therefore, it is to be expected that Bayesian analysis can be of great use for traction force reconstruction.
In this work, we systematically compare a range of different approaches for solving Eq. (2). Altogether, we study the performance of seven methods, illustrated as a schematic diagram in Fig. 1(b). First, we test various regularization methods. Here, we complement the classical TFM approaches, L1 and L2 regularization, with previously untested methods from computer vision, namely Elastic Net (EN) regularization, Proximal Gradient Lasso (PGL), and Proximal Gradient Elastic Net (PGEN). We find that the new EN regularization scheme has a substantially improved accuracy as compared to previous approaches but requires considerable extra computational cost. Secondly, we seek to establish Bayesian models that can automatically perform an optimal regularization of the data. Initial tests indicate that various freely available Bayesian hierarchical models are of little use for TFM, since the large number of hidden variables, even when used with sparsity priors, does not enforce sufficient data faithfulness. Instead, we find that the simplest possible Bayesian model with global priors yields robust results that can be interpreted as optimal L2 regularization. We study two variants of this algorithm: Bayesian L2 regularization (BL2), where the magnitude of the noise in the displacement data must be measured separately, and Advanced Bayesian L2 regularization (ABL2), which requires no extra input. We test the Bayesian methods using artificial data and real experimental data. Our results suggest that BL2 is not only very robust, but also superior to classical L2 regularization when measurement noise is large. Most importantly, BL2 automatically determines the degree of regularization, which removes subjectiveness from the result. This advance is particularly relevant for an in-detail comparison of cells in different conditions, where the varying signal-to-noise ratio previously made an unambiguous comparison challenging.

Regularization
A common heuristic approach to solve Eq. (2) is regularization of the solution through additional constraints. Here, not only the residual (u − Mf) is minimized in a least-squares sense, but the magnitude of the solution is also penalized through its p-norm, denoted by ‖f‖_p. The trade-off between minimization of the residual and minimization of the solution norm is determined by fixed regularization parameters, λ_1 and λ_2, leading to a minimization problem of the type

f̂ = argmin_f { ‖Mf − u‖_2^2 + λ_1 ‖R_1 f‖_1 + λ_2 ‖R_2 f‖_2^2 }.  (3)

The two norms are explicitly written as

‖f‖_1 = ∑_i |f_i| and ‖f‖_2^2 = ∑_i f_i^2.  (4)

R_1 and R_2 are functions that are to be defined, e.g., as the unit matrix I. Of the large number of existing regularization strategies, we will focus on the following:
• L2 regularization, employing an L2-norm with λ_2 > 0 and λ_1 = 0 to penalize traction magnitude through R_2 = I, is currently the most common technique used for TFM. 23,46,47 L2 regularization is also known as ridge regression or Tikhonov regularization, 48 and this method efficiently produces a continuous and smooth reconstructed traction field. This approach conveys a high level of robustness for the inversion problem in real space and also in Fourier space. See the supplementary information for our implementation in real space. 49
• L1 regularization, also called Lasso, 50 is realized by setting λ_2 = 0, λ_1 > 0, and R_1 = I. With L1 regularization, small values of the reconstructed signal are efficiently set to zero. L1 regularization is therefore frequently used in the field of compressive sensing (CS), 51 where the underlying assumption is that the signal can be represented in a sparse form in which all but a few components of the signal vanish. Recently, the technique has found use for TFM, 28,52-54 and it is appropriate for traction fields containing few, sparsely located traction hotspots. In this case, it has been found that L1 regularization improves the ability to distinguish different traction hotspots.
• The elastic net regularization combines L1 and L2 regularization and both parameters through λ_1 > 0, λ_2 > 0, and R_1 = R_2 = I. Note that the actual implementation also involves a rescaling of the variables as suggested in Ref. 29. This regularization scheme is known to have a better accuracy compared to L1 and L2 regularization if the coefficient matrix has many correlated entries. 29 EN regularization is well established for a wide variety of applications, most notably the analysis of genetic data, 36,55-57 but has to date not been used for TFM. For our implementation of the EN we employ the popular convex optimization solver CVX, see supplementary information. 58,59
• Proximal gradient methods are an alternative approach to the optimization problems arising from the non-differentiable target functions in L1 regularization (PGL) and EN regularization (PGEN). 30,60 Here, the penalty terms are chosen to be norms of wavelet transforms, ∑_l |⟨Ψ_l, f⟩|, where the Ψ_l constitute an orthonormal wavelet basis. 61,62 The optimization problem is solved through iterative soft thresholding, where the regularization parameters control the threshold below which the wavelet coefficients are set to zero. 63-65 Therefore, these methods may be useful for TFM, where traction images are reconstructed from undersampled displacement data. Details regarding our implementation are given in the supplementary information.
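To make the penalties above concrete, here is a minimal sketch of two basic building blocks: the closed-form L2 (ridge) solution with R_2 = I, and the soft-thresholding operator at the core of the proximal gradient schemes (illustrative helper names; this is not the implementation used in this work):

```python
import numpy as np

def ridge_solve(M, u, lam):
    """L2 (Tikhonov/ridge) regularization with R2 = I: minimizes
    ||M f - u||_2^2 + lam * ||f||_2^2 via the normal equations
    (M^T M + lam I) f = M^T u."""
    n = M.shape[1]
    return np.linalg.solve(M.T @ M + lam * np.eye(n), M.T @ u)

def soft_threshold(x, t):
    """Proximal operator of t*||.||_1 (soft thresholding): the
    elementwise update used in iterative thresholding schemes such as
    PGL, setting coefficients below the threshold exactly to zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(1)
M = rng.standard_normal((40, 20))
u = M @ rng.standard_normal(20) + 0.1 * rng.standard_normal(40)

# A larger L2 parameter shrinks the solution norm.
assert np.linalg.norm(ridge_solve(M, u, 1e2)) < np.linalg.norm(ridge_solve(M, u, 1e-3))
# Soft thresholding zeroes out small entries and shrinks large ones.
assert np.allclose(soft_threshold(np.array([3.0, -0.5, 1.0]), 1.0), [2.0, 0.0, 0.0])
```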
These schemes have in common that they require the choice of one or two regularization parameters. Selecting the optimal regularization parameters is often a non-trivial problem. For L2 and L1 regularization, one can use the so-called L-curve criterion 49 to find regularization parameters that provide a tradeoff between minimization of the residual from the inverse problem and the regularization penalty. 21,23,28,46,47,66 Usually, the regularization parameter is assumed to be located at the inflection point of a curve described by the norm of the residual versus the norm of the solution on double-logarithmic axes. However, the L-curve criterion is of limited use for real data, since the inflection point does not always exist. Alternatively, multiple inflection points can appear, and the points are hard to localize precisely on the employed logarithmic scales. Moreover, the L-curve criterion does not behave consistently in the asymptotic limit of large system sizes or when the data is strongly corrupted by noise. 67,68 Hence, in practice, regularization parameters are often chosen by visual inspection of the resulting traction field. This procedure lacks objectivity and significantly biases any conclusions drawn from later analysis of the traction forces.
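A sketch of how such an L-curve is computed for the L2 case: sweep the regularization parameter and record residual norm against solution norm on logarithmic scales (helper names are ours):

```python
import numpy as np

def l_curve_points(M, u, lams):
    """Log residual norm vs. log solution norm for a sweep of L2
    regularization parameters; the L-curve criterion then looks for a
    corner/inflection point of this curve."""
    n = M.shape[1]
    MtM, Mtu = M.T @ M, M.T @ u
    pts = []
    for lam in lams:
        f = np.linalg.solve(MtM + lam * np.eye(n), Mtu)
        pts.append((np.log(np.linalg.norm(M @ f - u)),
                    np.log(np.linalg.norm(f))))
    return np.array(pts)

rng = np.random.default_rng(2)
M = rng.standard_normal((50, 30))
u = M @ rng.standard_normal(30) + 0.05 * rng.standard_normal(50)
pts = l_curve_points(M, u, np.logspace(-4, 2, 25))

# Along the sweep, the solution norm decreases monotonically while the
# residual norm grows; the "corner" between the two regimes is what the
# criterion tries (and on real data often fails) to localize.
assert np.all(np.diff(pts[:, 1]) < 0)
assert np.all(np.diff(pts[:, 0]) > -1e-9)
```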
Note that this problem is not specific to regularization: the issue of distinguishing between noise and "real" signal appears quite generally with any method whenever the data is processed to reduce noise.

Bayesian approaches for traction reconstruction
The discrete inverse problem u = Mf + s is of a two-fold statistical nature. First, the traction varies from cell to cell and from image to image. Thus, f is a sample drawn from a distribution of possible traction values, which we denote by p(f|α) with an undetermined parameter α. The function p(f|α) describes any prior knowledge about the distribution of traction. For reasons that will become clear below, we will assume that the prior distribution for the 2n × 1 vector f is a Gaussian

p(f|α) = exp(−α E_f(f))/Z_f,  (5)

where E_f(f) = f^T f/2 and Z_f = (2π/α)^n. The second source of randomness is the acquisition noise s. Typically, s is assumed to be drawn from a zero-mean Gaussian with unknown variance 1/β. 38,39,41,69 In the language of Bayesian statistics, the probability distribution p(u|f, β) is called the likelihood function and determines the probability to measure a particular vector u given a traction vector f. Since the noise is Gaussian, the likelihood function is

p(u|f, β) = exp(−β E_u(u|f))/Z_u,

where E_u(u|f) = (Mf − u)^T (Mf − u)/2 and Z_u = (2π/β)^m. m is, as above, the number of two-dimensional displacements. The likelihood function p(u|f, β) describes a situation that is exactly the reverse of the experimental situation, where we are looking for the probability of having f given measurements u. This situation is described by the posterior distribution p(f|u) and can be related to the likelihood via Bayes' rule

p(f|u, α, β) = p(u|f, β) p(f|α) / p(u|α, β).  (6)

Here, the marginal likelihood p(u|α, β) is the overall probability of finding the displacements u when the traction distributions are integrated out. Thus, p(u|α, β) is also called the evidence for the model with {α, β, u}.
Assuming that α and β are known constants, one can maximize the posterior probability p(f|u, α, β) with respect to f. The resulting solution then satisfies (β M^T M + α I) f_MP = β M^T u, which is exactly the formula employed for L2 regularization, Eq. (3), if the parameters α and β are related to the L2 regularization parameter as λ = α/β. 70 Thus, our choice of a Gaussian prior is justified if we intend to perform an L2 regularization. Other popular choices for replacing Eq. (5) as prior are the Laplace distribution p(f|θ) = (θ/2) exp(−(θ/2)‖f‖_1), 71 and a product of a Gaussian and a Laplace distribution. 72 Using these priors, we would have found the formulas corresponding to L1 regularization and EN regularization, respectively. Thus, regularization is equivalent to maximizing the posterior probability of a measurement assuming fixed, known parameters of the prior distributions.
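The equivalence between the maximum a posteriori (MAP) solution and L2 regularization with λ = α/β can be checked directly on random data (a sketch with arbitrary dimensions and parameter values):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((30, 20))
u = rng.standard_normal(30)
alpha, beta = 2.0, 50.0

# MAP estimate: minimize beta*E_u(f) + alpha*E_f(f); setting the
# gradient to zero gives (beta M^T M + alpha I) f = beta M^T u.
f_map = np.linalg.solve(beta * (M.T @ M) + alpha * np.eye(20),
                        beta * (M.T @ u))

# L2 regularization, Eq. (3), with lambda = alpha/beta is identical.
lam = alpha / beta
f_l2 = np.linalg.solve(M.T @ M + lam * np.eye(20), M.T @ u)
assert np.allclose(f_map, f_l2)
```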
However, α and β can also be treated as variables whose values can be determined by maximizing their probability p(α, β|u) = p(u|α, β)p(α, β)/p(u) ∼ p(u|α, β), where we assume a uniform prior p(α, β), and we can omit the marginal probability since it plays no role for the optimization. To calculate the evidence p(u|α, β), we need to integrate out f in the posterior given in Eq. (6). Due to the Gaussian probabilities, this integration can be done conveniently by expanding the integrand around the most probable value f_MP. On defining K(f) ≡ α E_f(f) + β E_u(f) and its Hessian, A ≡ ∇∇K(f), we expand as

p(u|α, β) = (1/(Z_f Z_u)) ∫ exp(−K(f)) df = (1/(Z_f Z_u)) exp(−K(f_MP)) (2π)^n (det A)^{−1/2}.  (7)

Taking the logarithm yields the final result

log p(u|α, β) = −α E_f(f_MP) − β E_u(f_MP) − (1/2) log(det A) + n log α + m log β − m log(2π).  (8)

The right-hand side of this equation is a typical example of target functions employed in Bayesian analysis, for example in the context of data fitting. 38 For TFM, numerical calculation of log(det A) requires some care. We employ here a Cholesky decomposition of the positive-definite matrix A = LL^T, yielding log(det A) = 2 ∑_i log L_ii. The logarithmic evidence, Eq. (8), assumes a maximum at those parameters α and β that are most likely associated with the measurement u. The implicit equations resulting from maximizing Eq. (8) read 38 2α E_f^MP = 2n − α Tr A^{−1} and 2β E_u^MP = 2m − 2n + α Tr A^{−1}. Since f_MP and A depend on α and β, these equations need to be solved numerically. Once α and β are determined, the equivalent optimal L2 regularization parameter follows as λ = α̂/β̂. We employ two approaches for determining the numerical values of α and β. In the first approach, called Bayesian L2 regularization (BL2), we estimate the inverse noise variance β directly from the data by calculating the variance of the measured displacements in spatial regions that are far away from any cell. Thus, in BL2 only α is determined through maximization of Eq.
(8). In the second approach, termed advanced Bayesian L2 regularization (ABL2), we solve directly for α and β, which results in an increased computational cost. For both approaches, it is imperative to standardize the data to adjust its spread in different dimensions. For a displacement vector u of length 2m, we first subtract the mean, ũ = u − 1_{2m} ū with ū = ∑_{i=1}^{2m} u_i/(2m). Next, we calculate the mean M̄_j and standard deviation ω_j for all columns of the matrix M. Thus, we can define a problem matrix where each column is normalized by its spread, M̃_ij = (M_ij − M̄_j)/ω_j. The standardized problem therefore reads ũ_i = M̃_ij f̃_j, which yields f_i = f̃_i/ω_i.
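These pieces can be combined into a compact sketch of BL2-style parameter selection. Here β is treated as known (as in BL2), and the evidence of Eq. (8) is simply scanned over α on a logarithmic grid rather than solving the implicit equations; helper names and dimensions are ours:

```python
import numpy as np

def standardize(M, u):
    """Center the displacements and normalize each column of M by its
    spread, as described in the text; a solution f~ of the standardized
    problem maps back via f = f~ / omega."""
    u_t = u - u.mean()
    omega = M.std(axis=0)
    M_t = (M - M.mean(axis=0)) / omega
    return M_t, u_t, omega

def log_evidence(alpha, beta, M, u):
    """Logarithmic evidence log p(u|alpha, beta), Eq. (8), with
    log(det A) computed from a Cholesky factorization A = L L^T."""
    m2, n2 = M.shape                     # 2m and 2n in the text
    A = alpha * np.eye(n2) + beta * (M.T @ M)
    L = np.linalg.cholesky(A)
    logdetA = 2.0 * np.log(np.diag(L)).sum()
    f_mp = np.linalg.solve(A, beta * (M.T @ u))   # most probable traction
    Ef = 0.5 * f_mp @ f_mp
    r = M @ f_mp - u
    Eu = 0.5 * r @ r
    return (-alpha * Ef - beta * Eu - 0.5 * logdetA
            + 0.5 * n2 * np.log(alpha) + 0.5 * m2 * np.log(beta)
            - 0.5 * m2 * np.log(2.0 * np.pi))

# Synthetic problem with known noise level sigma, so that beta =
# 1/sigma^2 is "measured" separately, as in BL2.
rng = np.random.default_rng(3)
M = rng.standard_normal((60, 30))
sigma = 0.05
u = M @ rng.standard_normal(30) + sigma * rng.standard_normal(60)

M_t, u_t, omega = standardize(M, u)
beta = 1.0 / sigma**2
alphas = np.logspace(-3, 3, 61)
ev = [log_evidence(a, beta, M_t, u_t) for a in alphas]
i_best = int(np.argmax(ev))
lam_best = alphas[i_best] / beta        # equivalent L2 parameter
assert 0 < i_best < len(alphas) - 1     # interior evidence maximum
```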

Generation of artificial test data
To quantitatively compare the performance of different reconstruction methods, we require artificial data with exactly known traction forces and displacements. The process of generating this data is shown in Fig. 2(a,i)-(a,iv) and involves prescribing traction force magnitude and direction in distributed circular areas, analytical calculation of the resulting displacements, 23 sampling of the displacements at discrete positions and addition of noise, and finally the reverse traction reconstruction. Supplementary S5 provides further details on the involved analytical calculations. Throughout the article, artificial test data is generated for gel substrates with a Young modulus of E = 10 kPa and a Poisson ratio of ν = 0.3. The size of the image plane is arbitrary, but fixed to 25 µm × 25 µm, and contains 9 or 15 circular traction spots. For these fixed geometries we vary the traction magnitude, the density of displacements, and the noise level.
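The first step of this pipeline, prescribing traction in circular patches, can be sketched as follows (patch centers, grid size, and the uniform +x direction are illustrative choices of ours; the study prescribes both magnitudes and directions):

```python
import numpy as np

def circular_traction_field(grid_n=64, size=25.0,
                            centers=((6.0, 6.0), (12.5, 12.5), (19.0, 19.0)),
                            radius=1.0, magnitude=100.0):
    """Prescribe a constant traction magnitude inside circular patches
    on a size x size micrometer plane (step (a,i) of the test-data
    pipeline). All patches point along +x here for simplicity."""
    xs = np.linspace(0.0, size, grid_n)
    X, Y = np.meshgrid(xs, xs)
    t = np.zeros((grid_n, grid_n, 2))   # (t_x, t_y) at every node, in Pa
    for cx, cy in centers:
        inside = (X - cx) ** 2 + (Y - cy) ** 2 <= radius ** 2
        t[inside, 0] = magnitude
    return t

t = circular_traction_field()
mag = np.hypot(t[..., 0], t[..., 1])
assert mag.max() == 100.0        # traction inside the patches
assert (mag > 0).mean() < 0.05   # sparse: most of the plane carries no traction
```

The subsequent analytical computation of displacements via the Green's function and the noise addition follow as described above.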

Evaluation metrics for assessing the quality of traction reconstruction
To evaluate the quality of the reconstructed traction, we introduce four different error measures comparing the reconstructed traction with the known original traction. For this purpose, the traction at every grid node is written as a two-dimensional vector t = {t_x, t_y}. Real and reconstructed traction are distinguished by the superscripts t^real and t^recon. The error measures are calculated by discriminating traction inside and outside of N_i circular traction patches in a test sample.
• The Deviation of Traction Magnitude at Adhesions (DTMA) 23 is defined as

DTMA = (1/N_i) ∑_i [ ∑_j (|t_j^recon| − |t_j^real|) / ∑_j |t_j^real| ],

where N_i is the number of circular traction patches and the index i runs over all patches. The index j runs over all traction vectors in one patch. A DTMA of 0 represents a perfect average traction recovery, and a negative or positive value implies underestimation or overestimation, respectively.
• The Deviation of Traction Magnitude in the Background (DTMB) is the normalized difference between the reconstructed and real traction magnitude outside the circular patches,

DTMB = (1/N_i) ∑_i [ ∑_k (|t_k^recon| − |t_k^real|) / ∑_j |t_j^real| ],

where the index k runs over all traction vectors outside the patches. A DTMB with a magnitude much smaller than unity implies low background noise in the reconstructed traction.
• The Signal to Noise Ratio (SNR) for TFM,

SNR = (1/N_i) ∑_i ⟨|t_j^recon|⟩_j / σ_k(|t_k^recon|),

measures the detectability of a real signal within a noisy background. 73 As before, the index k runs over all traction vectors outside the patches, while j is the index of each traction vector in the patch i; ⟨·⟩_j denotes the mean over a patch and σ_k(·) the standard deviation over the background. The value of the SNR runs from 0 to infinity, where an SNR much larger than unity indicates a good separation between traction and noise.
• The Deviation of the traction Maximum at Adhesions (DMA) measures how the peak values of the traction over- or underestimate the true value. The quantity is defined as

DMA = (1/N_i) ∑_i (max_j |t_j^recon| − max_j |t_j^real|) / max_j |t_j^real|,

where the maxima of the traction magnitude are calculated for each traction patch i separately through the index j. This error measure is particularly important since traction maxima are easy to extract from real experimental data. A DMA of 0 means that the local traction maxima in the reconstruction and in the original data are equal. Positive or negative values of the DMA indicate that the maximum of traction is overestimated or underestimated, respectively.
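The patch-wise averaging behind these measures is straightforward to transcribe. A sketch for DTMA and DMA, taking lists of per-patch traction-magnitude arrays as inputs (helper names are ours):

```python
import numpy as np

def dtma(recon_patches, real_patches):
    """Deviation of Traction Magnitude at Adhesions: per-patch
    normalized deviation of the mean traction magnitude, averaged over
    all patches. 0 = perfect average recovery; the sign indicates
    under- or overestimation."""
    devs = [(np.mean(r) - np.mean(t)) / np.mean(t)
            for r, t in zip(recon_patches, real_patches)]
    return float(np.mean(devs))

def dma(recon_patches, real_patches):
    """Deviation of the traction Maximum at Adhesions: same averaging,
    but for the peak traction magnitude of each patch."""
    devs = [(np.max(r) - np.max(t)) / np.max(t)
            for r, t in zip(recon_patches, real_patches)]
    return float(np.mean(devs))

patch = [np.array([100.0, 90.0, 110.0])]   # one patch, magnitudes in Pa
assert abs(dtma(patch, patch)) < 1e-12     # perfect recovery -> 0
assert abs(dma([1.1 * patch[0]], patch) - 0.1) < 1e-12  # 10% peak overshoot
```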

Experimental procedures
Primary murine podocytes were isolated and maintained following previously published protocols. 74 In brief, mGFP-positive podocytes were isolated from mTom/mGFP*Nphs2Cre reporter mice, and subsequent FACS-based purification resulted in a primary podocyte culture of highest purity. 75 Substratum matrices were prepared according to previously established protocols. 70 Primary podocytes were seeded on polyacrylamide substrates with a Young's modulus of 16 kPa. Beforehand, gels were functionalized via SULFO-SANPAH-based crosslinking of fibronectin (UV light applied for 5 minutes). Cells were cultivated for 12-16 hours before imaging. Cover slips were transferred into flow chambers, and cell removal to image the stress-free gel was achieved with a cell micromanipulator (Eppendorf). Heart muscle cells from embryonic rats were freshly prepared as described previously. 76 Cover slides were coated with an approximately 70 µm thick silicone elastomer layer produced from a commercial two-component formulation (Sylgard 184, Dow Corning; mixing ratio 50:1 base to crosslinker by weight; cured overnight at 60 °C). These substrates contained fluorescent beads in their uppermost layer (FluoSpheres Crimson carboxylate-modified beads; Invitrogen) and were coated with fibronectin before cell seeding. Details on preparation and cell culture are published elsewhere; 76 calibration of stiffness 77 yielded a Young's modulus of 15 kPa and a Poisson ratio of 0.5. Live cell microscopy on spontaneously beating cardiac myocytes was performed, and positions of fluorescent beads were determined by cross-correlation as described in detail previously. 76,78

Manual selection of optimal regularization parameters is challenging
The optimal regularization parameters λ_1/2 in Eq. (3) are usually unknown. Classical methods for their choice are the L-curve criterion 67,79 or generalized cross validation (GCV) for L2 regularization. 49,80 However, these two methods hardly ever produce the same parameter values, and the results can differ substantially in the presence of noise, see supporting Fig. S1. To illustrate the strong effect of regularization on traction reconstruction, we focus on artificial test data where the underlying traction pattern is known. Figure 2(a) illustrates the generation of artificial traction fields consisting of circular patches, each exerting 100 Pa. Figure 2(c) demonstrates how variation of the regularization parameters affects the error of traction reconstruction with different methods. Note that the errors exhibit minima for intermediate values of the regularization parameters. For the methods shown in panels i, iv, and v of Fig. 2(c) (L2 regularization, PGL, PGEN), minima occur in the positive error of the background traction DTMB. In contrast, L1 regularization, shown in panel ii of Fig. 2(c), exhibits a maximum in the DTMA, indicating a parameter regime where the traction magnitude is not underestimated.
The occurrence of clear minima in the error measures suggests that the corresponding regularization parameter values produce a faithful traction reconstruction. Indeed, employing the values corresponding to the error minima yields traction fields that visually compare well with the original data, see Figs. 2(a) and 2(d). Note that for L1 regularization, the reconstruction clearly overestimates the maximum traction locally. As shown in the supplementary Fig. S2(b), the overestimation of the maximum quantified by the DMA can only be reduced through a ∼10-fold reduction of λ_1, which however leads to strong background traction and suppression of real traction, see also Fig. S6. While the minima of the error measures in Fig. 2(c) allow determination of a "best" regularization for test data, the resulting regularization parameter values deviate from those suggested by the L-curve criterion, see green lines in Fig. 2(c). Moreover, the L-curves for these samples are complex and exhibit multiple turning points, illustrating the difficulty of choosing the right regularization parameter in experiments, see supporting Fig. S5.

The elastic net outperforms other regularization methods for traction reconstruction
To facilitate a quantitative comparison of the different reconstruction methods, we employ artificial data consisting of 15 circular traction spots with traction magnitudes between 0 Pa and 250 Pa, see Fig. 3(a). Gaussian noise with a standard deviation given in percent of the maximal absolute value of the true displacements is added. The spots have a diameter of 2 µm, and the mesh constant for traction reconstruction is 0.5 µm.
Results from the different regularization approaches are shown in Fig. 3(a). The figure illustrates that L2 regularization can yield realistic estimates for the absolute magnitude of traction on the spots but produces a strong traction background, which may render identification of traction sites difficult. The opposite deficiencies occur for results from L1 regularization. Here, the background is nicely suppressed, which can allow excellent resolution of very small traction spots. However, the peak tractions are significantly overestimated, which cannot be mitigated by increasing the regularization parameter, see the supplementary Fig. S6. Note that the quality of L1 regularization can be improved by using an Iterative Reweighted Least Squares algorithm and the solution from the L2 regularization as an initial guess, see the Supplemental Material. The best results are obtained with EN regularization, which combines the advantages of L1 and L2 regularization. Here, we obtain a clean background combined with acceptable accuracy in the absolute traction magnitude on the circular patches. The results from the proximal gradient methods PGL and PGEN qualitatively have a smooth appearance with a level of background traction that lies between those of L2 and L1. Figure 3(b) quantifies the described differences between the regularization methods through the error in traction magnitude on the traction spots (DTMA), the error in traction magnitude in the background (DTMB), the signal to noise ratio (SNR), and the error in maximum traction on the spots (DMA). The supplementary Fig.
S9 contains additional plots of these quantities. We find that the reconstruction quality of traction and background improves with an increasing number of displacement measurements m. Furthermore, EN regularization outperforms the other regularization methods with regard to reconstruction accuracy for undersampled data (m/n < 1). However, the advantage of EN regularization comes at significantly increased computation time and memory requirements, as shown in Table 1.

Bayesian variants of the L2 regularization allow parameter-free traction reconstruction
We next consider the performance of the two Bayesian methods, BL2 and ABL2, which allow an automatic choice of the optimal L2 regularization parameter, as schematically shown in Fig. 4(a). Both methods select the optimal regularization parameter by maximizing the logarithmic evidence, Eq. (8). As illustrated in Fig. 4(a), the regularization parameter is here deduced from the parameters β and α, characterizing the distributions of measurement noise and traction, respectively. We first employ the same test data as used for Fig. 3, containing 5% Gaussian noise in the displacements with β = 400 Pix^−2. With BL2, the log evidence exhibits a clear maximum in a one-dimensional space, as seen in Fig. 4(c). Figure 4(d) shows the reconstructed traction employing the optimal parameter λ_2 = 76.75 Pa^2/Pix^2. Visual comparison of the color-coded traction magnitude in Figs. 4(d) and 4(b) clearly shows that the reconstructed traction has the correct range.
For ABL2, the evidence is a function of β and α, as seen in Fig. 4(e). Numerical localization of the maximum yields α = 3.06 × 10⁴ Pa⁻² and β = 394 Pix⁻², which is very close to the known input value of β = 400 Pix⁻². The optimal regularization parameter in this case is thus λ2 = α/β = 77.66 Pa²/Pix², which agrees well with the estimate from BL2 (76.75 Pa²/Pix²). The resulting traction map is shown in Fig. 4(f) and is very similar to the traction map resulting from BL2 in Fig. 4(d). Thus, BL2 and ABL2 yield consistent parameter estimates that produce traction reconstruction of good accuracy. See supplementary Fig. S9 for a comparison of the Bayesian methods with non-Bayesian approaches.
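The evidence maximization underlying BL2 can be illustrated with a minimal sketch. The log-evidence below is the standard expression for a Gaussian linear model with Gaussian prior (the structure of Eq. (8)); the test matrix, the synthetic data, and the grid over α are illustrative assumptions, not the paper's implementation. As in BL2, β is treated as known (estimated from far-field noise) and only α is optimized.

```python
import numpy as np

def log_evidence(X, u, alpha, beta):
    """Log marginal likelihood of a Gaussian linear model; alpha and beta are
    the precisions of the traction prior and of the measurement noise."""
    m, n = X.shape
    A = alpha * np.eye(n) + beta * X.T @ X          # posterior precision matrix
    t_map = beta * np.linalg.solve(A, X.T @ u)      # MAP traction = L2 solution with lambda2 = alpha/beta
    E = 0.5 * beta * np.sum((u - X @ t_map) ** 2) + 0.5 * alpha * np.sum(t_map ** 2)
    _, logdet = np.linalg.slogdet(A)
    return (0.5 * n * np.log(alpha) + 0.5 * m * np.log(beta)
            - E - 0.5 * logdet - 0.5 * m * np.log(2 * np.pi))

rng = np.random.default_rng(1)
m, n = 80, 40
X = rng.normal(size=(m, n))                         # stand-in system matrix
beta_true = 400.0                                   # noise precision, as in the artificial data
t_true = rng.normal(scale=0.1, size=n)              # true prior precision alpha = 100
u = X @ t_true + rng.normal(scale=beta_true ** -0.5, size=m)

beta = beta_true                                    # BL2: beta fixed from far-field noise
alphas = np.logspace(-2, 4, 200)
ev = [log_evidence(X, u, a, beta) for a in alphas]
alpha_opt = alphas[int(np.argmax(ev))]
lam2 = alpha_opt / beta                             # optimal L2 regularization parameter
```

The one-dimensional evidence curve has a clear interior maximum near the true prior precision, and the ratio α/β directly gives the regularization parameter, which is the mechanism exploited by BL2.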
As with other regularization approaches, the quality of reconstruction strongly depends on the noise present. When the magnitude of the noise is comparable to the magnitude of the displacements caused by the traction (σ_n/σ_ū ≈ 1), little information can be recovered. For instance, the small circular traction sites labeled 1 and 2 in Fig. 3(a) are almost impossible to detect in the presence of 5% noise, but can be reconstructed in the noise-free case, see the supplementary Fig. S7. To quantitatively assess the fidelity of reconstruction with small traction forces, we employ a constant 5% noise level but scale the tractions to mean values of 12 Pa, 16 Pa, 60 Pa, and 120 Pa. The resulting relative strength of noise and displacements is quantified through the ratio of standard deviations σ_n/σ_ū, which is plotted against our reconstruction quality measures in Fig. 4(g). For comparison, results from manual selection of the regularization parameter using the L-curve criterion are also given. The reconstruction qualities of BL2, ABL2, and L2 are similar when σ_n/σ_ū ≪ 1 (high traction). However, BL2 and ABL2 have an improved signal-to-noise ratio (SNR) compared with the L-curve approach when σ_n/σ_ū approaches unity, see (iii). This is due to the difficulties with the L-curve criterion at high noise. The logarithmic evidence function exhibits in all cases a clear maximum, which enables a robust and reliable choice of optimal parameters with BL2 and ABL2. In general, however, the results from BL2 are more reliable, since the optimization here involves only one parameter. Overall, the tests with artificial data show that these Bayesian methods, containing few additional parameters to be determined from the data, can resolve the ambiguity associated with a manual choice of the regularization parameters over a wide range of signal strengths σ_n/σ_ū < 1.
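For reference, the L-curve used in this comparison can be sketched as follows. The criterion scans λ2, records residual and solution norms of the L2 solution, and picks the corner of the resulting log-log curve; the corner-finding step is omitted here, and the test system is an arbitrary stand-in rather than TFM data.

```python
import numpy as np

def l_curve_points(X, u, lambdas):
    """Residual norm ||X t - u|| and solution norm ||t|| of the L2 (Tikhonov)
    solution over a range of regularization parameters. The L-curve criterion
    selects the corner of this curve on a log-log plot."""
    n = X.shape[1]
    res, sol = [], []
    for lam in lambdas:
        t = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ u)
        res.append(np.linalg.norm(X @ t - u))
        sol.append(np.linalg.norm(t))
    return np.array(res), np.array(sol)

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 40))                                 # stand-in system matrix
u = X @ rng.normal(size=40) + 0.1 * rng.normal(size=60)
res, sol = l_curve_points(X, u, np.logspace(-6, 2, 50))
# res grows and sol shrinks monotonically with lambda; the "corner" balances both
```

At high noise the corner of this curve flattens out, which is why the manual L-curve choice becomes ambiguous exactly in the regime σ_n/σ_ū ≈ 1 discussed above.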
BL2 and ABL2 are based on the simplest structure of a Bayesian model with only one, global prior distribution. One may hypothesize that more complex hierarchies of priors yield an improved traction estimate. For instance, it is possible to prescribe a position-dependent prior for sparse traction patterns through hierarchical Bayesian networks. Such methods require more advanced techniques for sampling of the probability distributions and optimization, such as variational techniques or Markov-chain Monte Carlo methods. 83 Results are shown in the supplementary information. However, the tested algorithms all produce highly overestimated, localized traction patterns that sensitively depend on noise. Such errors are likely due to the many free parameters of the models that, in spite of the sparsity constraints, do not favor a faithful data reconstruction. Thus, our tests suggest that these hierarchical network models are not suited for the inverse problem associated with TFM.

Test of methods with experimental data
To compare the performance of the different methods for real cells, we employ primary mouse podocytes studied with a standard TFM setup on polyacrylamide gels having a Young's modulus of ∼16 kPa. The deformation field resulting from cellular traction is shown in Fig. 5(a). Using this displacement data, we find that the variance of noise is ∼0.01 Pix² = 103.4 nm² in regions that are far away from the cell. The maximum displacement is 0.52 µm. Figs. 5(b)-(h) show reconstruction results using all methods discussed above. As for artificial data, we find here that EN regularization results in a very clean background. The traction magnitudes and outlines of adhesion sites are similar to those resulting from regularization with the L2 method. For L1 regularization, traction localizes in sparse regions and has a significantly higher value than for the other methods. The proximal gradient methods produce smooth traction profiles, as expected from the use of soft wavelet thresholding. The magnitude of traction measured with PGL and PGEN is close to the results of EN and L2.
Next, we considered the performance of the Bayesian methods. The logarithmic evidence, Eq. (8), calculated with BL2 and ABL2 reveals pronounced maxima, allowing a robust choice of the optimal parameters for the experimental data. See also supplementary Fig. S11. The resulting values for λ2 are 30.4 Pa²/Pix² and 24.4 Pa²/Pix² for BL2 and ABL2, respectively, and thus agree reasonably well with each other. Figures 5(e),(h) show the traction fields calculated with BL2 and ABL2. These traction fields are visually very similar to the one obtained with standard L2 regularization. However, the L-curve criterion provides a much more uncertain estimate of a regularization parameter due to the difficulty of localizing it on a logarithmic scale, see supplementary Fig. S12. Note that the regularization parameters obtained from the L-curve criterion cannot be directly compared to the parameters resulting from the Bayesian methods due to the standardization employed for the latter. Overall, the suggested Bayesian models can eliminate ambiguity in TFM by automatically providing a consistent parameter choice.

Bayesian regularization enables consistent analysis of traction time sequences
TFM is frequently employed to study dynamical aspects of cell mechanics. Examples include cell migration, cell division, or cytoskeletal reorganization in response to extracellular stimuli. Such processes are usually accompanied by a change in the traction distribution. As a result, the optimal regularization parameter varies among different images in a time sequence of microscopy data. Additionally, the regularization parameter can also change if the degree of noise varies over time, which can be caused, for example, by stage drift or photobleaching. In such cases, it is very challenging to perform a consistent, frame-by-frame analysis to determine the degree of regularization with conventional methods. Thus, one fixed parameter is commonly employed for the whole time sequence, thereby sacrificing both precision and accuracy of traction reconstruction.
To test whether our Bayesian methods can be useful in this situation, we employ TFM data with a spontaneously beating cardiac myocyte, see Fig. 6. Due to the large size of the cell, we focus on a region of interest shown in Fig. 6(a). The analyzed time sequence corresponds to one cell contraction. Snapshots from frames 1, 4, and 6 are shown for illustration. Figure 6(b) shows the overall norm of reconstructed traction where λ2 is either chosen according to the L-curve criterion, held at an intermediate, constant value, or automatically determined in BL2. The overall traction magnitudes are similar in frames 2-6, where traction is high. Differences occur, however, in the low-traction regime, where BL2 systematically yields lower values of traction. We expect that the results from BL2 are more trustworthy in this regime since the L-curve criterion yields highly ambiguous values for the regularization parameters, see supplementary Fig. S13. Figure 6(c) shows the overall norm of the gel displacement and the optimal regularization parameter estimated with BL2. The noise variance is small, ∼0.00003 Pix², in regions that are far away from the cell (pixel size 0.2 µm). As expected, λBL2 is inversely correlated with the displacement magnitude.
Figure 6(e) shows snapshots of the resulting traction fields, which illustrate again that BL2 produces slightly different results for low traction, most apparent in Figs. 6(e),i and 6(f),i. Note that the traction field resulting from classical L2 regularization in Fig. 6(e),i shows a noise background outside of the cell that is almost comparable to the real cellular traction. In contrast, BL2 suppresses this background at the price of an apparently reduced spatial resolution, as seen in Fig. 6(f),i. However, this provides an objective distinction between real signal and noise, which is what is to be expected from a faithful data reconstruction.
Many, if not all, TFM methods critically rely on some form of noise reduction. Usually, traction is calculated from substrate displacement through the solution of a linear problem involving elastic Green's functions. Here, the effects of noise are not a technical issue relating to the data precision, but are connected directly to the structure of the linear problem, where even the slightest numerical noise can be amplified to an extent that the true solution is entirely lost. The most immediate approach to deal with noise is to filter the displacement field prior to traction reconstruction. Filtering becomes possible if the solution is calculated in Fourier space because the convolution theorem simplifies the matrix inversion. 24 However, filtering the input data generally reduces the spatial resolution, and optimal resolution can only be gained if the filter is adapted for each sample. In certain cases, moreover, data filtering is not sufficient to guarantee stability of the solution, for example, if the three-dimensional position of displacements is included.
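This amplification effect can be demonstrated in a few lines. The ill-conditioned matrix below is a synthetic stand-in for a Green's function matrix, constructed only to make the singular-value decay explicit; the noise level and regularization weight are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
# Build an ill-conditioned stand-in system matrix with rapidly decaying singular values
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s = 10.0 ** -np.linspace(0, 8, n)           # condition number ~1e8
X = U @ np.diag(s) @ V.T

t_true = rng.normal(size=n)
u = X @ t_true + 1e-6 * rng.normal(size=n)  # even tiny measurement noise...

t_naive = np.linalg.solve(X, u)             # ...is amplified by up to 1/s_min = 1e8
t_reg = np.linalg.solve(X.T @ X + 1e-8 * np.eye(n), X.T @ u)  # Tikhonov damps it

err_naive = np.linalg.norm(t_naive - t_true) / np.linalg.norm(t_true)
err_reg = np.linalg.norm(t_reg - t_true) / np.linalg.norm(t_true)
```

The naive inversion loses the true solution entirely, while the regularized solve trades a controlled bias for stability, which is the rationale behind all regularization schemes discussed here.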
A popular alternative strategy for enforcing well-behaved solutions is regularization. With regularization, Fourier-space inversion becomes more robust. 21,23 However, regularization is also used for real-space approaches, in conjunction with finite element methods or boundary element methods. Solving the linear problem in real space is generally more demanding, but has the advantage that the spatial sparsity of traction patterns is conserved. For TFM, two regularization methods have to date been used, namely L2 regularization 23,70,73 and L1 regularization. 28,53,54 These methods each have one regularization parameter that is chosen manually based on heuristics, which introduces a considerable degree of subjectivity into the resulting traction.
In this work, we systematically compare the classical L1 and L2 regularization to three other methods that have, to our knowledge, not yet been employed for TFM. These three regularization methods are the Elastic Net (EN), Proximal Gradient Lasso (PGL), and Proximal Gradient Elastic Net (PGEN). Our tests with artificial data clearly demonstrate that EN regularization outperforms the other regularization methods with regard to reconstruction quality. Here, accurate traction reconstruction results from a simultaneous suppression of background noise and penalization of large traction magnitudes. In contrast, the proximal gradient methods PGL and PGEN are effective at producing a smoothed traction field due to the local removal of high-frequency spatial variations. These results obtained with artificial data agree qualitatively with results from tests with experimental data. Here too, L1 and L2 regularization yield overestimated or underestimated traction on adhesion sites. EN again yields a clean background without producing excessively sharp traction peaks at adhesions, see Fig. 5. PGL and PGEN yield smooth traction fields and rounded adhesion site contours. While our work presents a comprehensive overview of regularization variants in TFM, it does not cover all variants and solution procedures. For example, it has been suggested to use an L1 norm for both the residual and the regularization term, 54 and various other iterative regularization procedures can be tested for TFM in the future.
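As a generic illustration of the proximal gradient idea, the Lasso variant can be sketched as ISTA, alternating a gradient step on the data term with soft thresholding. This is a sketch under simplifying assumptions: the paper's PGL applies the shrinkage in a wavelet basis, which is omitted here, and the problem sizes and penalty weight are illustrative.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal map of the L1 norm: shrink toward zero by tau, clip at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_grad_lasso(X, u, lam1, n_iter=1000):
    """ISTA for 0.5 ||X t - u||^2 + lam1 ||t||_1: a gradient step on the
    quadratic data term followed by soft thresholding."""
    L = np.linalg.norm(X, 2) ** 2         # Lipschitz constant of the gradient
    t = np.zeros(X.shape[1])
    for _ in range(n_iter):
        t = soft_threshold(t - X.T @ (X @ t - u) / L, lam1 / L)
    return t

rng = np.random.default_rng(4)
m, n = 50, 30
X = rng.normal(size=(m, n))               # stand-in system matrix
t_true = np.zeros(n)
t_true[:3] = 5.0                          # three sparse "traction" entries
u = X @ t_true + 0.1 * rng.normal(size=m)
t_est = prox_grad_lasso(X, u, lam1=5.0)
# the repeated thresholding sets background entries exactly to zero
```

Replacing the elementwise shrinkage with shrinkage of wavelet coefficients yields the smoothing behavior of PGL and PGEN noted above, since small high-frequency components are removed locally.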
Next, we ask if Bayesian methods can eliminate the necessity of a manual choice of regularization parameters. Here, the corresponding parameter values are inferred by maximizing their evidence given a fixed class of chosen probability distributions. Using the simplest approach, our prior assumption on the traction forces is that they are drawn from one global Gaussian distribution with an unknown variance 1/α. The posterior distribution determining the probability of a particular traction field given a measured displacement field is then determined by the parameter 1/α and a further parameter 1/β, quantifying the variance of the measurement noise. For fixed values of α and β, maximization of the posterior distribution corresponds exactly to L2 regularization with λ2 given by α/β. However, the values of α and β can also be determined by maximizing their probability conditioned on a given measurement and the chosen probability distributions. Here, this is equivalent to maximizing the evidence for u, given any two parameter values. We refer to the simultaneous determination of both parameters from the evidence as advanced Bayesian L2 regularization (ABL2). In an even simpler approach, we estimate the noise strength directly from the displacement data, leaving only one parameter α to be determined by maximization of the evidence, which we call Bayesian L2 regularization (BL2). These methods represent an automatic optimization of the L2 regularization. Thus, the resulting traction field has all the qualitative features of L2 regularization, including the suppression of exceedingly high traction values and a visible background traction. For all our tests, we found that BL2 was a robust method yielding reasonable estimates for traction and regularization parameters. Due to the difficulty of choosing the correct regularization parameter manually, BL2 has substantial advantages over the classical L2 procedure, in particular if the traction is so small that the resulting displacements are comparable to the noise, σ_n/σ_ū ≈ 1. We mention that we have also tested more elaborate hierarchical Bayesian network algorithms that were originally designed for purposes other than TFM. These include a variational approach termed "Bayesian compressive sensing using Laplace priors" (BCSL) 71 and Markov chain Monte Carlo methods, for instance the "Bayesian Lasso". 83 In our experience, however, none of these methods could compete with the much simpler Bayesian L2 regularization when applied to TFM problems, see Figs. S8-S10 and S14 in the supplementary information.
The advantage of employing Bayesian traction reconstruction is most apparent when cells in different conditions are to be compared. To perform a correct comparison of situations with different traction, different substrate rigidities, etc., it is technically necessary to adapt the regularization parameter for each case. However, the difficulty of finding the corresponding parameters usually makes this impossible, which introduces significant quantitative errors. Bayesian methods present a possible solution to this problem. We have shown here that BL2 produces regularization that varies smoothly and robustly with strong temporal changes in the traction exerted by a cell. Thus, we expect that this method can be of wide use for quantitative studies of cell physiology.

Figure 5. Test of all reconstruction methods using experimental data. (a) Image of an adherent podocyte with substrate displacements shown as green vectors. (b)-(h) Reconstructed traction forces using L2, L1, EN, BL2, PGL, PGEN, and ABL2, respectively. Reconstruction with L2-type regularization exhibits a comparatively high background noise. L1 regularization shows very high, localized traction. Based on tests with artificial data, we expect that these peaks overestimate the traction. The EN method combines the advantages of L1 and L2 regularization, namely a clean background and localized traction of reasonable magnitude. PGL and PGEN yield smooth traction forces at adhesions and in the background. (g-h) The Bayesian methods BL2 and ABL2 yield very similar results to standard L2 regularization without requiring a search for the optimal regularization parameters. For better visibility, only every fourth traction vector is shown. Scale bar: 25 µm.

Table 1. Overview of calculation time and RAM requirement for each method. The benchmarks were conducted with a data set consisting of 1000 displacement measurements and a traction field consisting of 2500 entries.