Ratiometric Matryoshka biosensors from a nested cassette of green- and orange-emitting fluorescent proteins

Sensitivity, dynamic and detection range as well as exclusion of expression and instrumental artifacts are critical for the quantitation of data obtained with fluorescent protein (FP)-based biosensors in vivo. Current biosensor designs are, in general, unable to simultaneously meet all these criteria. Here, we describe a generalizable platform to create dual-FP biosensors with large dynamic ranges by employing a single FP-cassette, named GO-(Green-Orange) Matryoshka. The cassette nests a stable reference FP (large Stokes shift LSSmOrange) within a reporter FP (circularly permuted green FP). GO-Matryoshka yields green and orange fluorescence upon blue excitation. As proof of concept, we converted existing, single-emission biosensors into a series of ratiometric calcium sensors (MatryoshCaMP6s) and ammonium transport activity sensors (AmTryoshka1;3). We additionally identified the internal acid-base equilibrium as a key determinant of the GCaMP dynamic range. Matryoshka technology promises flexibility in the design of a wide spectrum of ratiometric biosensors and expanded in vivo applications.

and seedling 3 respectively) that roots were exposed to salt shock by addition of NaCl to the growth media. Final NaCl concentrations in reservoirs were estimated to be between 10-25 mM.
Samples were co-excited with 440 and 488 nm lasers. Emission intensities were simultaneously collected at 500-540 nm (green channel) and 570-650 nm (orange channel).
Crystal structure of AfAMT1 with highlighted suppressor mutations. Cartoon illustration of the positions of F138 (cyan) and L255 (orange) in side (a) and top (b) views of the AMT monomer (AfAMT1 was used as a proxy; PDB: 2B2F). Note that both residues point towards the ammonium transporter pore.
The experiment was repeated in three individual oocytes.

Supplementary Tables
Supplementary Table 1

Determination of chromophore maturation efficiency
In order to assess the effects of the Matryoshka concept on the maturation efficiencies of the constituent fluorescent proteins (FPs), we took a two-pronged approach, analyzing the separate single FP species (cpEGFP, sfGFP, and LSSmOrange) and the derivative Matryoshkas (i.e., eGO and sfGO) by visible absorbance spectroscopy and intact protein mass spectrometry. All analyzed proteins were purified first by Ni2+-affinity chromatography and then by anion exchange chromatography.
GFP is well known to exhibit two pH-sensitive absorbance bands, which correspond to the protonated neutral (A-state) and deprotonated ionized (B-state) forms of the chromophore. We collected absorbance spectra at a range of pH values to vary the relative magnitude of these two bands and extrapolated to the intrinsic basis spectra for each form for both cpEGFP and cpsfGFP. LSSmOrange has only one peak, which is largely pH insensitive and was used directly as the LSSmOrange basis spectrum. The weighting of extinction coefficients was performed using the standard base denaturation method introduced by Ward et al. [3].
With the basis spectra in hand, we measured the absorbance spectra of the eGO and sfGO Matryoshka variants and performed least-squares fitting using Matlab to determine the relative concentrations of mature GFP and OFP (orange FP, here LSSmOrange) chromophores (Supplementary Fig. 3a, b). Our findings suggest that the ratios of OFP:GFP are not exactly 1:1, as would be expected for complete maturation of both species. Instead, we observe that there is somewhat less mature LSSmOrange than GFP. This fact, on its own, does not limit the usefulness of the internal reference provided that the fraction of immature LSSmOrange is consistent and ideally reflects the maturation efficiency of solitary LSSmOrange itself.
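The least-squares decomposition described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the Matlab routine used in the study. The Gaussian band shapes, band positions, concentrations, and the function names `gaussian_band` and `fit_chromophore_concentrations` are hypothetical stand-ins for the measured, extinction-weighted basis spectra.

```python
import numpy as np

def gaussian_band(wavelength_nm, center_nm, width_nm):
    """Toy absorbance band standing in for a measured basis spectrum (assumption)."""
    return np.exp(-0.5 * ((wavelength_nm - center_nm) / width_nm) ** 2)

def fit_chromophore_concentrations(measured, basis_matrix):
    """Solve measured ~ basis_matrix @ concentrations in the least-squares sense."""
    concentrations, *_ = np.linalg.lstsq(basis_matrix, measured, rcond=None)
    return concentrations

wavelength = np.arange(350.0, 601.0, 1.0)

# Hypothetical basis spectra per unit concentration; positions are illustrative only.
gfp_a_state = gaussian_band(wavelength, 400.0, 20.0)  # protonated GFP chromophore
gfp_b_state = gaussian_band(wavelength, 490.0, 15.0)  # deprotonated GFP chromophore
lss_morange = gaussian_band(wavelength, 437.0, 25.0)  # LSSmOrange, single band

basis = np.column_stack([gfp_a_state, gfp_b_state, lss_morange])

# Synthetic nested-FP spectrum built from made-up concentrations plus small noise.
true_conc = np.array([0.3, 0.7, 0.8])
rng = np.random.default_rng(0)
measured = basis @ true_conc + rng.normal(0.0, 1e-3, wavelength.size)

conc = fit_chromophore_concentrations(measured, basis)
# Mature OFP relative to total mature GFP (A-state plus B-state).
ofp_to_gfp = conc[2] / (conc[0] + conc[1])
print(f"OFP:GFP = {ofp_to_gfp:.2f}")
```

A ratio below 1 in such a fit would correspond to the observation that there is somewhat less mature LSSmOrange than GFP.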
To obtain a more complete picture including both mature and immature species, we turned to intact protein mass spectrometry (MS). The chromophore maturation process involves a dehydration step and an oxidation step, which collectively result in a loss of 20 Da. This mass shift can be resolved with high-resolution intact protein ESI-MS. Purified cpEGFP, cpsfGFP, LSSmOrange, eGO, and sfGO were all analyzed by LC-MS with a Bruker micrOTOF-Q II mass spectrometer. The resulting mass spectra were processed by MaxEnt for deconvolution of the multiple charge states. All samples resulted in a major mass peak corresponding to N-terminal methionine loss and the expected mass shift due to chromophore maturation (-20 Da for single FP species, -40 Da for nested FP species) (Supplementary Fig. 3c). Additionally, there were a number of smaller peripheral peaks, which may have arisen from immature chromophore species, salt adducts, methionine oxidation, etc. Those proteins containing LSSmOrange (LSSmOrange, eGO, and sfGO) in particular showed a greater abundance of these side peaks, possibly owing to incomplete chromophore maturation.
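As a worked check of the expected mass shifts, the following sketch adds up the loss of one water (dehydration) and two hydrogens (oxidation) per chromophore, plus the loss of the N-terminal methionine residue. Standard average atomic masses are used. The function names and the 60,000 Da starting mass are hypothetical and serve only as an example.

```python
# Standard average atomic masses in Da.
AVERAGE_MASS = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994, "S": 32.065}

def formula_mass(composition):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(AVERAGE_MASS[element] * count for element, count in composition.items())

WATER = formula_mass({"H": 2, "O": 1})   # lost on dehydration
TWO_H = formula_mass({"H": 2})           # lost on oxidation
MET_RESIDUE = formula_mass({"C": 5, "H": 9, "N": 1, "O": 1, "S": 1})

MATURATION_LOSS = WATER + TWO_H          # about 20 Da per mature chromophore

def expected_mass(unprocessed_mass, n_chromophores, met_removed=True):
    """Expected intact mass after chromophore maturation and optional Met loss."""
    mass = unprocessed_mass - n_chromophores * MATURATION_LOSS
    if met_removed:
        mass -= MET_RESIDUE
    return mass

# Hypothetical 60,000 Da translated sequence, used only for illustration.
single_fp = expected_mass(60000.0, n_chromophores=1)
nested_fp = expected_mass(60000.0, n_chromophores=2)
print(f"loss per chromophore: {MATURATION_LOSS:.2f} Da")
print(f"nested minus single: {nested_fp - single_fp:.2f} Da")
```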
The central question is whether the Matryoshka strategy (i.e., nesting an LSSmOrange inside a single FP biosensor) compromises the chromophore maturation efficiency. The absorbance analysis and the presence of the major peak at the expected masses of the GO species (Supplementary Fig. 3c) suggest that a significant portion of LSSmOrange is mature. However, for a more quantitative perspective we took a data-driven approach to the mass spectrometry data.
If the FPs do not influence the maturation properties of each other then one expects that the mass spectrum of the nested FP species should be the convolution of the two single spectra. For example, eGO should be the spectrum of cpEGFP convolved with the spectrum of LSSmOrange.
The corresponding spectra were numerically convolved in Matlab and are plotted against the spectra of eGO and sfGO (Supplementary Fig. 3d, e; compare yellow traces to blue traces). The convolved and measured spectra correspond closely in shape.
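The convolution argument can be illustrated with a small numerical sketch. It assumes, as a simplification, that each deconvoluted spectrum is expressed as a distribution of mass deviations from the expected mature mass on a common uniform grid. The peak positions and weights below are made up, and `predict_nested_spectrum` is a hypothetical helper name.

```python
import numpy as np

def normalize(spectrum):
    """Scale a spectrum so that its intensities sum to one."""
    spectrum = np.asarray(spectrum, dtype=float)
    return spectrum / spectrum.sum()

def predict_nested_spectrum(spec_a, spec_b):
    """Predicted deviation spectrum of the nested FP if both parts mature independently."""
    return np.convolve(normalize(spec_a), normalize(spec_b), mode="full")

step_da = 1.0
offsets = np.arange(-50, 51) * step_da       # deviation from expected mass, in Da
zero = int(np.argmin(np.abs(offsets)))       # grid index of zero deviation

# Made-up single-FP spectra: main peak at 0 Da, immature species at +20 Da.
gfp = np.zeros(offsets.size)
gfp[zero], gfp[zero + 20] = 0.9, 0.1
lss = np.zeros(offsets.size)
lss[zero], lss[zero + 20] = 0.7, 0.3

predicted = predict_nested_spectrum(gfp, lss)
# The convolved axis spans the sum of the two deviation ranges.
nested_offsets = np.arange(-100, 101) * step_da

for offset in (0.0, 20.0, 40.0):
    index = int(np.argmin(np.abs(nested_offsets - offset)))
    print(f"{offset:+.0f} Da: {predicted[index]:.2f}")
```

In this toy case the predicted nested spectrum has peaks for both chromophores mature (0 Da), one immature (+20 Da), and both immature (+40 Da).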
We took this analysis one step further by decomposing the LSSmOrange spectrum (Supplementary Fig. 3c) into the projected correct peak and the remaining peripheral peaks. The "correct" spectral peak is centered at the expected mature protein mass and has a width dictated by counting statistics and the natural isotope abundances. The "peripheral" spectrum was simply the LSSmOrange spectrum minus the "correct" spectrum. Then the numerical convolution was calculated for the GFP spectrum and a linear combination of the "correct" and "peripheral" LSSmOrange spectra. The best fits were determined by least-squares optimization in Matlab and are shown as the red traces (Supplementary Fig. 3d, e). This fitting procedure results in nearly perfect reproduction of the GO spectra. In both cases the best fit actually has a slight enrichment of the "correct" peak relative to the original LSSmOrange spectrum.
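Because convolution is linear, fitting the weights of the "correct" and "peripheral" components reduces to an ordinary linear least-squares problem. The sketch below shows one way to set this up in Python. It is an illustration under assumed, synthetic peak shapes, not the Matlab procedure used for Supplementary Fig. 3d, e, and the function names are hypothetical.

```python
import numpy as np

def gaussian_peak(offsets, center, width):
    """Synthetic peak shape used only for illustration."""
    return np.exp(-0.5 * ((offsets - center) / width) ** 2)

def fit_correct_and_peripheral(gfp, lss_correct, lss_peripheral, nested):
    """Fit nested ~ conv(gfp, a * correct + b * peripheral) for the weights a and b.

    By linearity this equals a * conv(gfp, correct) + b * conv(gfp, peripheral).
    """
    design = np.column_stack([
        np.convolve(gfp, lss_correct, mode="full"),
        np.convolve(gfp, lss_peripheral, mode="full"),
    ])
    weights, *_ = np.linalg.lstsq(design, nested, rcond=None)
    return weights

offsets = np.arange(-60.0, 61.0, 1.0)
gfp = gaussian_peak(offsets, 0.0, 3.0)
lss_correct = gaussian_peak(offsets, 0.0, 3.0)
lss_peripheral = (0.3 * gaussian_peak(offsets, 20.0, 3.0)
                  + 0.2 * gaussian_peak(offsets, 38.0, 3.0))

# Synthetic nested spectrum with the correct peak enriched (made-up weights).
nested = np.convolve(gfp, 1.2 * lss_correct + 0.8 * lss_peripheral, mode="full")

w_correct, w_peripheral = fit_correct_and_peripheral(
    gfp, lss_correct, lss_peripheral, nested)
enrichment = w_correct / w_peripheral
print(f"correct: {w_correct:.2f}, peripheral: {w_peripheral:.2f}")
```

Equal weights would reproduce the original LSSmOrange spectrum, so a ratio `w_correct / w_peripheral` above 1 corresponds to enrichment of the "correct" peak.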
Overall, these results indicate that the Matryoshka construct leads to no decrease in the extent of LSSmOrange maturation and that the OFP:GFP disparity found from the absorbance measurements is likely a consequence of incomplete maturation, which is probably also present in solitary LSSmOrange.

Predicted determinants of dynamic range
We used the data of the calcium sensors GCaMP6s and sfGCaMP6s to investigate determinants of the dynamic range, with the intention of developing a more generalizable approach to sensor engineering using a mathematical model. With a set of absorbance and fluorescence excitation measurements performed at a range of pH values under calcium-free and calcium-saturated conditions, we have been able to examine in detail the factors influencing the biosensor dynamic ranges. Previous efforts to understand the origins of the sensor response have focused primarily on changes in fluorescence quantum yields and differential pKa's between the apo- and ligand-bound protein [4-6]. Our analysis has revealed that, in addition to these two identified mechanisms, a third factor, the internal acid-base equilibrium, is of considerable importance to the dynamic range.
Fluorescent proteins (FPs) are routinely described as having a pKa that characterizes the transition between the neutral (often dark) and anionic (usually emissive) forms of the chromophore as a function of pH. A simple single-site titration is implicit in this treatment, yet the coupling between the ionization state of the chromophore and the surrounding ionizable amino acid side chains can be quite significant. Consequently, these titrations can exhibit complex behavior, including negative cooperativity, response plateau regions, or apparent mixed states in limits of high or low pH. The original wild-type Aequorea victoria avGFP, in fact, dramatically exemplifies strong coupling to E222, resulting in a nearly flat pH response from pH 7-11. Our analysis indicates that site coupling plays a prominent role in these calcium biosensors as a major determinant of sensor performance.
We constructed a simple model to better understand the relationship among the three central mechanisms of sensor dynamic range: 1) the difference in the pKa's (or apparent pKa's, as shall be further clarified) between the apo and saturated species, 2) the difference in quantum yield of the emissive state between the apo and saturated species, and 3) the internal ionization equilibrium of the apo species. A two-site model (Supplementary Figure 13a) captures the essence of the internal buffering process and has been shown in previous studies to satisfactorily explain many of the anomalous pH titration behaviors in fluorescent proteins [7-9]. This two-site model postulates a secondary ionizable site, "X", whose ionization state is energetically coupled to the chromophore ionization. That is, the propensity of the chromophore to become deprotonated is different depending on whether "X" is ionized or neutral. When the magnitude of this coupling is large, one observes plateaus in the pH titrations, as exemplified in Supplementary Figure 13b. In this pH range (pH ~8-9) the behavior is dominated by the internal buffering process, independent of the external pH. The role of "X" is clearly filled by E222 in avGFP. Its identity in single-FP sensors is less clear, but the role is likely played by nearby ionizable residues, which perhaps accounts, in part, for the performance sensitivity to the flanking amino acid sequences. The data for all the apo calcium biosensors revealed a plateau at high pH in the population of deprotonated chromophore short of 100%. This clearly indicates that the chromophores are experiencing internal buffering, because the chromophore ionization would otherwise proceed to completion. Importantly, this sub-saturating limit reduces the possible amount of the emissive "B" state (fB), thus limiting the fluorescence of the apo species and thereby boosting the dynamic range by a factor of 1/fB for any pH at or below the plateau region.
The relative quantum yield (Φapo/Φsat) serves a similar role, by increasing the dynamic range in a pH-independent manner.
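The plateau behavior of a coupled two-site titration can be reproduced with a short numerical sketch. It uses a generic four-microstate formulation, which is an assumption here and may differ in parameterization from the model in Supplementary Figure 13a. The microscopic pKa values, the coupling strength, and the function names are hypothetical.

```python
import numpy as np

def chromophore_deprotonated_fraction(pH, pK_chrom, pK_x, coupling):
    """Fraction of deprotonated chromophore in a generic two-site model.

    pK_chrom: chromophore pKa while site X is protonated.
    pK_x:     pKa of site X while the chromophore is protonated.
    coupling: pKa units by which ionizing one site disfavors ionizing the other.
    """
    pH = np.asarray(pH, dtype=float)
    # Statistical weights relative to the fully protonated state (weight 1).
    w_chrom = 10.0 ** (pH - pK_chrom)   # chromophore ionized, X neutral
    w_x = 10.0 ** (pH - pK_x)           # X ionized, chromophore neutral
    w_both = 10.0 ** (2.0 * pH - pK_chrom - pK_x - coupling)
    return (w_chrom + w_both) / (1.0 + w_chrom + w_x + w_both)

def plateau_fraction(pK_chrom, pK_x):
    """Sub-saturating plateau (f_B) reached when the coupling is strong."""
    return 1.0 / (1.0 + 10.0 ** (pK_chrom - pK_x))

# Hypothetical parameters, chosen only to make the plateau visible.
pK_chrom, pK_x, coupling = 7.0, 6.8, 5.0
f_B = plateau_fraction(pK_chrom, pK_x)

for pH in (3.0, 7.0, 9.5, 14.0):
    fraction = float(chromophore_deprotonated_fraction(pH, pK_chrom, pK_x, coupling))
    print(f"pH {pH:4.1f}: deprotonated fraction {fraction:.3f}")
print(f"plateau f_B = {f_B:.3f}")
```

With strong coupling the population of deprotonated chromophore levels off at f_B over a wide pH window and only approaches completion once the second transition sets in at much higher pH.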

The calcium-saturated species closely conform to a simple single-site titration (i.e., they reach complete deprotonation in a sigmoidal fashion). The apo species do not. However, with their sizable site couplings, the first transition is indeed well described in terms of a single-site titration to the sub-saturating limit, fB. Furthermore, for the sake of simplicity we assume excitation of only the deprotonated GFP absorbance band and that the intrinsic extinction coefficients of the apo and ligand-bound deprotonated species are equal. Using these assumptions, we can derive an expression for the dynamic range as a function of pH,

$$\mathrm{DR}(\mathrm{pH}) = \frac{\Phi_{sat}}{\Phi_{apo}} \cdot \frac{1}{f_B} \cdot \frac{f_{SS}(\mathrm{pH}, \mathrm{p}K_{a,sat})}{f_{SS}(\mathrm{pH}, \mathrm{p}K_{app,apo})} \qquad (1)$$

where pKa,sat is the pKa of the ligand-saturated species, pKapp,apo is the apparent pKa of the apo species (explicitly defined in Supplementary Figure 13b), fSS is a function containing the pH dependence of a single-site deprotonation as

$$f_{SS}(\mathrm{pH}, \mathrm{p}K_a) = \frac{1}{1 + 10^{\mathrm{p}K_a - \mathrm{pH}}},$$

fB is the internal equilibrium factor in the apo species (explicitly defined in Supplementary Figure 13b), and Φsat and Φapo are the quantum yields of the emissive state of the saturated and apo species.

with saturated and quantum yield-weighted apo curves. It is important to note that the dynamic range effects are multiplicative, not additive, and thus the relative significance of each may not be visually apparent. Note also that for illustrative purposes the dynamic range curves are calculated explicitly from the model in Supplementary Figure 13 in order to include the eventual transition to B'. Implicit in Eq. 1 is the assumption that this final transition falls outside the pH region of practical interest (i.e., Eq. 1 is valid from pH 4-9 in this example). The dashed blue line represents the noise-free idealization and predicts that the dynamic range should improve asymptotically with decreasing pH. In actuality, however, the fluorescence intensity quickly approaches zero and will eventually be overwhelmed by background noise (red trace with 1% noise).
This trade-off between increasing intrinsic dynamic range and decreasing signal-to-noise leads to a peaked value of realized dynamic range as a function of pH.
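The trade-off can be made concrete with a short sketch. It assumes that Eq. 1 takes the multiplicative form DR = (Φsat/Φapo) · (1/fB) · fSS(pH, pKa,sat) / fSS(pH, pKapp,apo), with fSS(pH, pKa) = 1/(1 + 10^(pKa - pH)). Noise is modeled in a deliberately simple way, as a constant background equal to 1% of the maximal saturated signal added to both intensities. All parameter values and function names are hypothetical and are not taken from Supplementary Table 5.

```python
import numpy as np

def f_ss(pH, pKa):
    """Single-site deprotonated fraction."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))

def intensities(pH, pKa_sat, pKapp_apo, f_B, phi_apo_over_sat):
    """Saturated and apo intensities, normalized to the saturated sensor at high pH."""
    f_sat = f_ss(pH, pKa_sat)
    f_apo = phi_apo_over_sat * f_B * f_ss(pH, pKapp_apo)
    return f_sat, f_apo

def ideal_dynamic_range(pH, **params):
    """Noise-free dynamic range under the assumed form of Eq. 1."""
    f_sat, f_apo = intensities(pH, **params)
    return f_sat / f_apo

def realized_dynamic_range(pH, background, **params):
    """Dynamic range when a constant background is added to both signals."""
    f_sat, f_apo = intensities(pH, **params)
    return (f_sat + background) / (f_apo + background)

# Hypothetical sensor parameters, for illustration only.
params = dict(pKa_sat=6.5, pKapp_apo=8.5, f_B=0.5, phi_apo_over_sat=0.5)

pH_grid = np.linspace(4.0, 10.0, 121)
ideal = ideal_dynamic_range(pH_grid, **params)
realized = realized_dynamic_range(pH_grid, background=0.01, **params)

peak_index = int(np.argmax(realized))
print(f"ideal DR at pH 4: {ideal[0]:.0f}")
print(f"realized DR peaks at pH {pH_grid[peak_index]:.2f} "
      f"with value {realized[peak_index]:.1f}")
```

In this toy setting the ideal curve keeps rising toward acidic pH, whereas the background-limited curve passes through a maximum at intermediate pH and falls toward 1 as both signals vanish.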
The theoretical maximum dynamic range can be obtained from Eq. 1 in the limit of acidic pH and zero noise. To explore the practical implications of the above formulation, we tabulated the factors influencing the dynamic range for GCaMP6s and sfGCaMP6s (Supplementary Table 5). These two specific examples are broadly representative of the two categories of calcium sensors we built based on the original cpEGFP or cpsfGFP scaffolds, respectively. A number of interesting implications follow from the itemized accounting of dynamic range. Based on the analyses, a key factor is the previously unappreciated internal equilibrium factor described here. GCaMP6s derives its large dynamic range by utilizing all three mechanisms: differential pKa's and quantum yields of the ligand-saturated and apo species, and internal acid-base equilibrium. By contrast, sfGCaMP6s relies almost exclusively on the differential pKa's between the apo and saturated species. Due to the multiplicative nature of the factors, sfGCaMP6s could significantly gain dynamic range from even a small improvement of the relative quantum yield and internal equilibrium factors. The maximum pKa-associated factors stand out for their extremely high values (Supplementary Table 5). This metric is rather deceptive, however, because it occurs in the limit of vanishing fluorescence intensity. In practice, when optimizing a biosensor, the realizable dynamic range also depends on the brightness and may be better served by improving other factors, i.e., relative quantum yield and internal equilibrium, which do not negatively impact brightness of the saturated species.
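The acidic-pH limit and the itemized factors can be written out explicitly. The derivation below is a sketch that assumes Eq. 1 has the multiplicative form used in the text, with a single-site function fSS(pH, pKa) = 1/(1 + 10^(pKa - pH)); the symbols follow the definitions given for Eq. 1.

```latex
% Assumed form of Eq. 1 (multiplicative factors):
\mathrm{DR}(\mathrm{pH})
  = \underbrace{\frac{\Phi_{sat}}{\Phi_{apo}}}_{\text{quantum yield}}
    \cdot \underbrace{\frac{1}{f_B}}_{\text{internal equilibrium}}
    \cdot \underbrace{\frac{1 + 10^{\,\mathrm{p}K_{app,apo} - \mathrm{pH}}}
                           {1 + 10^{\,\mathrm{p}K_{a,sat} - \mathrm{pH}}}}_{\text{differential } \mathrm{p}K_a}

% Acidic limit (pH far below both pKa values): the 1's become negligible.
\lim_{\mathrm{pH} \to -\infty} \mathrm{DR}
  = \frac{\Phi_{sat}}{\Phi_{apo}} \cdot \frac{1}{f_B}
    \cdot 10^{\,\mathrm{p}K_{app,apo} - \mathrm{p}K_{a,sat}}

% Plateau region (pH well above both pKa values, below the transition to B'):
\mathrm{DR} \approx \frac{\Phi_{sat}}{\Phi_{apo}} \cdot \frac{1}{f_B}
```

Under these assumptions the pKa-associated factor reaches its maximum, 10^(pKapp,apo - pKa,sat), only where both species are almost fully protonated and therefore dim, whereas the quantum yield and internal equilibrium factors contribute at every pH in the valid range.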

An important limitation of single-FP biosensors is their high sensitivity to the environmental pH. That is, the true signal due to the intended analyte can be confounded by fluctuations in the cellular pH. The severity of this problem is tied directly to the relative contribution of the three dynamic range determinants described herein. In particular, the larger the dynamic range contribution due to the differential pKa, the greater the sensitivity to pH will be. In contrast, the relative quantum yield and the internal equilibrium should not be significantly affected by pH changes. Consequently, it may be an advantageous trade-off to have a biosensor that relies chiefly upon the latter two factors, even at the cost of a lower overall dynamic range, because its signal response would be largely immune to pH variation.
versions, as well as systematic site-directed mutagenesis of the residues listed above using the methods described here, may be a path to optimizing the sensors and to better understanding the molecular basis of the interplay of the three principal components that affect the dynamic range.
In summary, we identified the internal acid-base equilibrium as a major determinant of the dynamic range achievable by single-FP and, by extension, Matryoshka biosensors. Furthermore, this new factor has been placed in quantitative context with the previously known mechanisms of differential pKa's and quantum yield differences between the ligand-saturated and apo species.
The importance of internal acid-base equilibrium is underlined by its responsibility for the majority of the dynamic range of GCaMP6s, one of the most responsive biosensors available to date. This more refined understanding of dynamic range may help facilitate future sensor design and optimization.

Supplementary Note 3 Generation of AmTrac sensors based on cpsfGFP
For comparison of the AmTryoshka sensor series with their appropriate parent sensors and to assess the effects of cpsfGFP versus cpEGFP in the original AmTrac [2], variants that carried only cpsfGFP without the LSSmOrange (termed sfAmTrac) were generated. In AmTrac, we found
Affinities were derived from the Hill plot in Supplementary Figure 17b.