Direct observation of Thermomyces lanuginosus lipase diffusional states by Single Particle Tracking and their remodeling by mutations and inhibition

Lipases are interfacially activated enzymes that catalyze the hydrolysis of ester bonds and constitute prime candidates for industrial and biotechnological applications ranging from the detergent industry to chiral organic synthesis. As a result, there is an incentive to understand the mechanisms underlying lipase activity at the molecular level, so as to be able to design new lipase variants with tailor-made functionalities. Our understanding of lipase function primarily relies on bulk assays that average the behavior of a large number of enzymes, masking structural dynamics and functional heterogeneities. Recent advances in single molecule techniques based on fluorogenic substrate analogues revealed the existence of lipase functional states and, furthermore, how they are remodeled by regulatory cues. Single particle studies of lipases, on the other hand, directly observed diffusional heterogeneities and suggested that lipases operate in two different modes. Here, to decipher how mutations in the lid region control Thermomyces lanuginosus lipase (TLL) diffusion and function, we employed a Single Particle Tracking (SPT) assay to directly observe the spatiotemporal localization of TLL and rationally designed mutants on native substrate surfaces. Parallel imaging of thousands of individual TLL enzymes and Hidden Markov Model (HMM) analysis allowed us to observe and quantify the diffusion, abundance and microscopic transition rates between three linearly interconverting diffusional states for each lipase. We propose a model that correlates diffusion with function and allows us to predict that lipase regulation, via mutations in the lid region or product inhibition, primarily operates by biasing transitions to the active states.


M1. Extraction of diffusion coefficient from step length.
To extract diffusion coefficients from single particle tracking data, we applied a simple Brownian diffusion model (a subclass of the Gamma distribution). The approach was used to a) predict the most likely diffusion coefficient for each individual trajectory (and thus extract a mean diffusion coefficient, see Table 1) and b) predict the most likely diffusion coefficient for each state in each trajectory found via HMM analysis. The principle of maximum likelihood was used to estimate D, as it requires no binning of data and hence avoids the introduction of binning bias. To estimate D, we maximized the likelihood (as we published earlier 1 ) given a distribution of observed step lengths, where the probability of a step, for two-dimensional Brownian motion, is given by:

$$P(r \mid D, t) = \frac{r}{2Dt} \exp\left(-\frac{r^{2}}{4Dt}\right)$$

where r is the observed step length, t is the time interval between steps and D is the corresponding diffusion coefficient.
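As a minimal sketch of such an unbinned estimate, assume the standard two-dimensional Brownian step-length density P(r) = r/(2Dt) exp(-r²/(4Dt)). Under that assumption the likelihood has a closed-form maximum at D = <r²>/(4t). The function names, the simulated steps and the values D = 0.5 µm²/s and t = 30 ms are illustrative assumptions, not the routines or parameters used in this work.

```python
import math
import random

def log_likelihood(steps, D, dt):
    """Log-likelihood of step lengths for 2D Brownian motion (assumed density)."""
    return sum(math.log(r / (2 * D * dt)) - r * r / (4 * D * dt) for r in steps)

def mle_diffusion(steps, dt):
    """Closed-form maximum of the likelihood above: D = <r^2> / (4 dt)."""
    return sum(r * r for r in steps) / (4 * dt * len(steps))

def simulate_steps(D, dt, n, seed=0):
    """Draw n step lengths from 2D Brownian motion (hypothetical test data)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2 * D * dt)  # per-axis displacement standard deviation
    return [math.hypot(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n)]

DT = 0.03  # s, illustrative frame interval
steps = simulate_steps(D=0.5, dt=DT, n=5000)
D_hat = mle_diffusion(steps, DT)
print(f"estimated D = {D_hat:.3f} um^2/s")
```

Because every step enters the likelihood individually, no histogram is built and no bin width has to be chosen.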

M2. Determination of functional states and transition rates by Hidden Markov Analysis
Deconvolution of the underlying functional states was performed with a Hidden Markov Model (HMM) approach, similar to published methodologies 2 , applied to the distribution of observed step lengths for each trace, with the step lengths of each state following a gamma distribution. Analysis of traces across lipase variants revealed, based on the BIC score (see Table S1), that a 3-state model provided the best description. The analysis was done by fitting the total distribution of observed step lengths for each variant with 1-4 gamma distribution(s) and then comparing the results. All states found for a given lipase variant were plotted together in a histogram and fitted with a mixture of three Gaussians (four different populations in total across variants, see Fig. 2).
Each pair of mobility transitions found ("state before" and "state after") was plotted in Transition Density Plots (TDPs), as shown in Fig. S9, and separated into two clusters per diagonal using a combination of k-means clustering, as reported recently by us 3 , and a two-dimensional Gaussian mixture model (four clusters in total). The number of clusters was determined from the Gaussian distributions discussed earlier and corresponds to the three underlying states found for each variant. To exclude severe outliers in each cluster, data points outside a 98 % confidence interval (2.5 sigma) of each cluster center were excluded from further treatment. While this ensures reasonable cluster separation, we note that clusters in close proximity may still overlap and that this could influence the kinetic characterization (see Fig. S9-10). Transition rates were determined by fitting a single exponential decay to the lifetime of each state (see Fig. S10 for single exponential fits and Table 2 for transition rates) 2 . From this, a four-state linear model ( Fig. 2) was derived using methods similar to those published recently 2,4 .
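The step from an assigned state sequence to a rate can be sketched as follows. Note the swapped-in technique: instead of fitting a decay curve to a lifetime histogram, this sketch uses the unbinned maximum likelihood estimate for a single exponential, k = 1 / mean lifetime. The two-state sequence, the per-frame leaving probabilities and the 30 ms frame interval are hypothetical test inputs.

```python
import random
from itertools import groupby

def dwell_times(states, dt):
    """Group a per-frame state sequence into dwell durations (s) per state.

    The first and last runs are dropped because the observation window
    may truncate them.
    """
    runs = [(state, len(list(frames))) for state, frames in groupby(states)]
    dwells = {}
    for state, n_frames in runs[1:-1]:
        dwells.setdefault(state, []).append(n_frames * dt)
    return dwells

def exit_rate(durations):
    """ML rate of a single-exponential lifetime distribution: k = 1 / mean."""
    return len(durations) / sum(durations)

# Hypothetical two-state trajectory with per-frame leaving probabilities.
DT = 0.03  # s, illustrative frame interval
p_leave = {0: 0.1, 1: 0.2}
rng = random.Random(1)
state, sequence = 0, []
for _ in range(40000):
    sequence.append(state)
    if rng.random() < p_leave[state]:
        state = 1 - state

rates = {s: exit_rate(d) for s, d in dwell_times(sequence, DT).items()}
for s in sorted(rates):
    print(f"state {s}: k = {rates[s]:.2f} 1/s (expected about {p_leave[s] / DT:.2f})")
```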
The equilibrium constant between two adjacent states i and j can be found as

$$K_{eq} = \frac{k_{ij}}{k_{ji}}$$

from which the relative free energy difference between adjacent states can be calculated,

$$\Delta G = -RT \ln K_{eq}$$

where R is the gas constant, T the absolute temperature (here 298 K) and Keq is the equilibrium constant from above. Alternatively, the equilibrium concentrations, and thus state occupancies, can be extracted using the equilibrium constants between states, assuming they sum up to 1 and are interdependent:

$$E_1 = \frac{1}{1 + K_1 + K_1 K_2}, \qquad E_2 = \frac{K_1}{1 + K_1 + K_1 K_2}, \qquad E_3 = \frac{K_1 K_2}{1 + K_1 + K_1 K_2}$$

where Kn is the equilibrium constant between adjacent states. Note that since all lipase variants display only three out of the four observed global states, the model for each individual variant looks as:

Slow ⇌ Intermediate ⇌ Fast/Active

Furthermore, the activation energy (energy barrier), Ea, can be found from transition state theory using the equation:

$$E_a = -RT \ln\left(\frac{k_{ij} h}{k_B T}\right)$$

where R is the gas constant, T the absolute temperature (298 K), h the Planck constant, kij the rate from state i to j and kB the Boltzmann constant. Note that since all variants display 3 states, the Slow ⇌ Intermediate ⇌ Fast/Active scheme above can be directly applied, even though the actual mobility of E3 changes between the highly active and the less active variants.
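A short numerical sketch of these relations follows. It assumes that each equilibrium constant is the ratio of forward to backward rate between adjacent states, that the occupancies of a linear three-state scheme are normalized to 1, and that the barrier follows from the Eyring expression with a transmission coefficient of 1. The rate values are hypothetical and serve only to exercise the functions.

```python
import math

R = 8.314462618      # gas constant, J mol^-1 K^-1
KB = 1.380649e-23    # Boltzmann constant, J K^-1
H = 6.62607015e-34   # Planck constant, J s
T = 298.0            # absolute temperature, K

def equilibrium_constant(k_forward, k_backward):
    """K = k_ij / k_ji for two adjacent states (assumed convention)."""
    return k_forward / k_backward

def delta_g(k_eq, temperature=T):
    """Relative free energy difference in J/mol: dG = -RT ln K."""
    return -R * temperature * math.log(k_eq)

def occupancies(k1, k2):
    """Equilibrium occupancies (E1, E2, E3) of a linear three-state scheme."""
    z = 1 + k1 + k1 * k2
    return (1 / z, k1 / z, k1 * k2 / z)

def activation_energy(k_ij, temperature=T):
    """Barrier in J/mol from transition state theory: Ea = -RT ln(k h / (kB T))."""
    return -R * temperature * math.log(k_ij * H / (KB * temperature))

# Hypothetical rates in 1/s: k12, k21, k23, k32.
K1 = equilibrium_constant(2.0, 1.0)
K2 = equilibrium_constant(0.5, 1.0)
occ = occupancies(K1, K2)
print("occupancies:", occ)
print(f"dG(1->2) = {delta_g(K1) / 1000:.2f} kJ/mol")
print(f"Ea(1->2) = {activation_energy(2.0) / 1000:.1f} kJ/mol")
```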

M3. Fitting HMM states by Gaussian distributions.
Gaussian distributions were used to describe and deconvolute the individual underlying functional states within the observed HMM states (Fig. 2). The fitting was done using maximum likelihood estimation. Since this method maximizes the likelihood function, and the resulting fit is generated from the probability of each individual data point belonging to a certain Gaussian population, no binning is needed. This was preferred over a traditional non-linear least-squares approach, as binning may bias the fit and thus obscure the true populations. All fitting was done using custom-made routines in Python.
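One common way to obtain such an unbinned maximum likelihood fit is expectation-maximization (EM). The sketch below is a plain one-dimensional EM for a Gaussian mixture and is an assumption about how this could be done, not the custom routine used here. The three synthetic populations and the initial guesses for the means are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_gaussian_mixture(data, initial_means, n_iter=100):
    """EM fit of a 1D Gaussian mixture; returns (weights, means, sigmas)."""
    k = len(initial_means)
    means = list(initial_means)
    sigmas = [(max(data) - min(data)) / (2 * k)] * k  # broad starting widths
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: probability of each data point belonging to each population.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: update parameters from the responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = math.sqrt(var) + 1e-9  # small floor against collapse
    return weights, means, sigmas

# Three hypothetical populations (arbitrary units), 200 points each.
rng = random.Random(2)
data = []
for mu, sd in [(0.1, 0.03), (0.5, 0.05), (1.0, 0.08)]:
    data += [rng.gauss(mu, sd) for _ in range(200)]

weights, means, sigmas = fit_gaussian_mixture(data, initial_means=(0.0, 0.6, 1.2))
print("means:", [round(m, 3) for m in means])
```

Each point contributes through its own membership probabilities, so the result does not depend on any histogram bin width.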

M4. Bayesian Information Criterion (BIC)
All fits were evaluated using the Bayesian Information Criterion (BIC), a widely employed methodology that avoids overfitting by penalizing the addition of parameters in relation to the likelihood of the fit 5 . BIC can be calculated as

$$\mathrm{BIC} = k \ln(n) - 2 \ln(L)$$

where L is the maximized value of the likelihood function for a given dataset and set of free parameters, k is the number of parameters included in the model and n is the number of data points. The model with the lowest BIC is preferred, which allowed us to identify the optimal number of states for each individual condition (see Table S1).
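The comparison can be sketched in a few lines, using the standard definition BIC = k ln(n) - 2 ln(L). The log-likelihood values, the number of data points and the parameter count per model (3m - 1 for an m-component mixture) are hypothetical placeholders, not values from Table S1.

```python
import math

def bic(log_likelihood, n_params, n_points):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L); lower is better."""
    return n_params * math.log(n_points) - 2 * log_likelihood

# Hypothetical maximized log-likelihoods for models with 1-4 states.
N_POINTS = 4000
log_likelihoods = {1: -5230.0, 2: -5105.0, 3: -5060.0, 4: -5057.0}

scores = {
    m: bic(ll, n_params=3 * m - 1, n_points=N_POINTS)  # assumed parameter count
    for m, ll in log_likelihoods.items()
}
best_n_states = min(scores, key=scores.get)
for m in sorted(scores):
    print(f"{m} state(s): BIC = {scores[m]:.1f}")
print("preferred number of states:", best_n_states)
```

In this made-up example the fourth state raises the likelihood only slightly, so the parameter penalty outweighs the gain and the three-state model has the lowest score.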

M5. Hydrodynamic radius calculations
The hydrodynamic radius of a spherical body undergoing Brownian motion can be calculated from Stokes-Einstein theory 6 using the equation:

$$D = \frac{k_B T}{6 \pi \eta r}$$

where D is the diffusion coefficient, kB is the Boltzmann constant, T the temperature in Kelvin, η the dynamic viscosity of the medium and r the Stokes radius (hydrodynamic radius). In this equation, the only unknown parameter besides r is the dynamic viscosity of the medium. To utilize the equation, one must assume true Brownian motion, meaning a hard sphere experiencing drag force from a viscous solvent.
As pure trimyristin at 21 °C does not display a dynamic viscosity 7 , we used mobile lipids (see Fig. S1) to estimate the dynamic viscosity of our surfaces, following recently published methods 8 . By estimating the size of the DOPE-ATTO655 conjugate to be 1.5 nm 9 , a resulting viscosity of 5.82 ± 0.016 poise was found for the slightly hydrated trimyristin. From this, the hydrodynamic radius of the lipases was calculated using the Stokes-Einstein equation. The results can be seen in Supplementary Table S4.
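The two-step use of the Stokes-Einstein relation (probe of known size gives the viscosity, viscosity gives the radius of the enzyme) can be sketched as below. The probe radius of 1.5 nm and the viscosity of 5.82 poise are taken from the text; the temperature of 294.15 K and both diffusion coefficients are assumptions chosen for illustration, not measured values.

```python
import math

KB = 1.380649e-23  # Boltzmann constant, J K^-1

def viscosity_from_probe(D, radius, temperature):
    """Dynamic viscosity (Pa s) from a probe of known radius: eta = kB T / (6 pi D r)."""
    return KB * temperature / (6 * math.pi * D * radius)

def hydrodynamic_radius(D, eta, temperature):
    """Stokes radius (m) from a diffusion coefficient: r = kB T / (6 pi eta D)."""
    return KB * temperature / (6 * math.pi * eta * D)

T = 294.15         # K, assumed measurement temperature
R_PROBE = 1.5e-9   # m, probe size from the text
ETA_TEXT = 0.582   # Pa s, i.e. 5.82 poise (1 poise = 0.1 Pa s)

# Hypothetical probe diffusion coefficient, chosen to be consistent with ETA_TEXT.
D_probe = KB * T / (6 * math.pi * ETA_TEXT * R_PROBE)   # m^2/s
eta_est = viscosity_from_probe(D_probe, R_PROBE, T)

# Hypothetical lipase diffusing half as fast as the probe.
D_lipase = D_probe / 2
r_lipase = hydrodynamic_radius(D_lipase, eta_est, T)
print(f"eta = {eta_est * 10:.2f} poise, lipase radius = {r_lipase * 1e9:.2f} nm")
```

Since temperature and viscosity are shared between probe and enzyme, the result reduces to r_lipase = r_probe × D_probe / D_lipase, so the radius estimate depends only on the ratio of the two diffusion coefficients.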

Supplementary Figures

S1. Lipid diffusion within the trimyristin layer by SPT
Fig. S1. Lipid diffusion (ATTO-655-DOPE) within the trimyristin surface layer by SPT (1 ppm). A) Single particle tracking of fluorescently tagged DOPE lipids additionally reveals no significant change, again for both samples containing active lipase (dark green) and no lipase (light green). Error bars are from at least triplicate measurements.

Fig. S6. Comparison of step length from simulation and SPT. A) Step length distributions from simulated data (green) and tracked data (yellow). Data were simulated by making a video of Gaussian intensity spots moving on a surface, following a single Brownian diffusion model. The data confirm the software's ability to track single particles on a surface. Noise was introduced to match the experimental data. B) BIC values for HMM analysis on the tracked data from A. As expected, the methodology suggests that a single underlying diffusional state is the best description.

[Figure: axis label "Step length [µm]"; legend: Lipase, Streptavidin]

[...] which could originate from poorly separated data in the corresponding TDP (see Fig. S9D). As for panel E (top right), we ascribe this to a lack of data points. However, as 18/20 data sets seem to fit well using a single exponential fit, we believe this to be the best approximation for explaining the observed behaviors.