Enlarging the scenario of site directed 19F labeling for NMR spectroscopy of biomolecules

The possibility of using selectively incorporated 19F nuclei for NMR spectroscopic studies has attracted increasing interest in recent years. The high gyromagnetic ratio of 19F and its absence in native biomolecular systems make this nucleus an interesting alternative to standard 1H NMR spectroscopy. Here we show how a label carrying a 19F atom can be attached to protein tyrosines through a specific three component Mannich-type reaction. To validate the efficacy and the specificity of the approach, we tested it on two selected systems with the aid of ESI-MS measurements.

and to increase the 19F chemical shift dispersion 35. However, cysteine modification can have some drawbacks, including the fact that in a number of proteins cysteines are in the active site and/or coordinate metal cofactors.
A valuable alternative for protein labelling is tyrosine. One of the features that makes tyrosine an interesting labelling site is its low average natural abundance (just above 3%) 36, making it one of the rarest amino acids in protein sequences. Moreover, since tyrosine is a partially hydrophobic residue, its location can range from deeply buried inside the protein hydrophobic core to surface exposed. Since buried tyrosines far outnumber exposed ones, it could be possible, by selecting the right reaction conditions, to covalently label the relatively rare surface-exposed tyrosines in a site-selective way 37. Several approaches have been proposed, and recently reviewed, for the modification of tyrosine residues 37. The most relevant ones involve the use of diazonium coupling reactions 38,39, of diazodicarboxamides 40,41, of sulfur/fluoride exchange (SuFEx) chemistry 42 and of the Mannich-type reaction 43. This latter reaction targets the phenolic side chain of tyrosine residues on proteins and is one of the oldest methods developed for tyrosine bioconjugation; it has been successfully used for grafting both fluorophores 43 and synthetic peptides 44 to chymotrypsinogen.
The three component Mannich-type reaction (Fig. 1) is characterized by a first step in which an imine condensation between an aldehyde and an electron-rich aromatic amine takes place. Afterwards, the phenol ring of tyrosine is deprotonated and undergoes an electrophilic aromatic substitution with the iminium ion, resulting in the formation of a new carbon-carbon bond. This reaction was used by Francis et al. to chemically modify proteins using either small peptides or small molecules 43. Moreover, an interesting application was reported by Belle et al., in which this reaction was used to selectively incorporate a novel spin label for EPR spectroscopy experiments 45.
In this work we report a protocol for labelling protein tyrosines with para-fluoroaniline (p-FA), whose efficacy has been tested, through ESI mass spectrometry and 19F NMR measurements, on two proteins of different size.

p-FA tyrosine conjugation
The immunoglobulin binding domain of Streptococcal protein G (GB1) and Hen Egg White Lysozyme (HEWL) were selected as test proteins (Fig. 2); both proteins have three tyrosine residues, located in different positions of the protein structure.
Both proteins were reacted, at an appropriate reagent ratio, with formaldehyde and the free 19F label. The pH at which the reaction is carried out plays a crucial role in the formation of the desired fluorinated tyrosine adducts. Indeed, operating at pH 6.5 is crucial for minimising possible side reactions on unwanted amino acid residues such as tryptophans. Moreover, at this pH value the equilibrium that characterizes the reaction could be driven mostly towards the formation of the open ring Mannich adduct. The reacted samples were then analysed by 19F NMR to assess the presence of the fluorinated tag conjugated to the protein tyrosines, and to estimate the overall amount of fluorine nuclei conjugated onto the proteins and the number of tyrosines effectively involved in the conjugation reaction. The attachment of the fluorinated tag was further investigated by ESI-MS spectra of the intact protein before and after the coupling reaction. Mass spectrometry data were used to verify the efficiency of the conjugation reaction and the number of residues to which the tag is attached.

GB1
The 1D 19F NMR spectrum of GB1 shows one well defined main peak and two smaller peaks (Fig. 3b), which differ in both shape and chemical shift from the free fluorinated tag signal (Fig. 3a). However, it is impossible to establish the exact number of residues involved in the conjugation reaction by relying just on these NMR spectra.
It is plausible that the high intensity signal arises from one labelled tyrosine, but only the ESI-MS spectra clearly indicated (Fig. 4) that a single residue has been successfully labelled. The smaller upfield peaks observed in the 19F NMR spectra can be attributed to non-covalent interactions between the protein and a small fraction of p-FA that cannot be efficiently removed during the purification steps, probably owing to π-π stacking interactions between the aromatic ring of the tag and the aromatic rings of other residues.
The mass spectrometry data (Fig. 4b) indicate that the native unlabelled protein is still the predominant species; yet, a new peak is observed with a mass increase of 123 Da (50% of the intensity of the main peak). This peak originates from GB1 (native unlabelled protein, Fig. 4a) with the p-FA tag attached to one residue which, according to the molecular weight increase, forms the open ring adduct. By assuming an equal ionization efficiency for both species, we can directly assess the ratio of labelled to unlabelled protein, which turned out to be 50:100. This high efficiency is somewhat surprising, since the Mannich reaction usually employs electron-rich anilines, which attack the carbonyl group of formaldehyde more readily as nucleophiles. The adduct was obtained through several optimization steps of the reaction conditions, such as the reaction time, the temperature, and the ratio of the reagents.
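As a quick check of this arithmetic, the minimal sketch below estimates the expected mass shift of the open ring adduct and converts a 50:100 intensity ratio into a labelled fraction. It assumes standard average atomic masses, that the open ring adduct corresponds to the net addition of one p-fluoroaniline and one formaldehyde with loss of one water molecule, and, as in the text, equal ionization efficiency for the two species. The function and variable names are illustrative and not part of the original protocol.

```python
# Average atomic masses (g/mol); standard values rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}


def average_mass(formula):
    """Average mass of a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


P_FLUOROANILINE = {"C": 6, "H": 6, "F": 1, "N": 1}
FORMALDEHYDE = {"C": 1, "H": 2, "O": 1}
WATER = {"H": 2, "O": 1}

# Assumed net change for the open ring Mannich adduct:
# protein + CH2O + p-FA -> adduct + H2O
open_ring_shift = (
    average_mass(P_FLUOROANILINE) + average_mass(FORMALDEHYDE) - average_mass(WATER)
)


def labelled_fraction(labelled_intensity, unlabelled_intensity):
    """Fraction of labelled protein, assuming equal ionization efficiency."""
    return labelled_intensity / (labelled_intensity + unlabelled_intensity)


print(f"open ring mass shift: {open_ring_shift:.2f} Da")
print(f"labelled fraction at 50:100: {labelled_fraction(50, 100):.1%}")
```

Under these assumptions the computed shift is about 123.1 Da, in line with the observed increase of 123 Da, and a 50:100 ratio corresponds to roughly one third of the protein molecules carrying the tag.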
To identify the specific tyrosine modified by the bioconjugation reaction, we recorded 1H-13C HSQC NMR spectra of both the native, unlabelled protein and the fluorinated protein. The comparison of the spectra (Supporting Information S3) suggests that the tag is attached to tyrosine 3.

HEWL
A tagged HEWL sample showed a 1D 19F NMR spectrum featuring two broad, very weak peaks close to each other at around -49 ppm, and another peak with higher intensity (Fig. 5a) at around -50 ppm. At first glance, this spectrum might suggest that the latter peak originates from an effectively 19F-tagged protein tyrosine and that the weaker and broader peaks are due to low-level tagging of the two other tyrosines. However, the intense and sharp 19F NMR signal at -50 ppm is detected even for a mixture of the protein and the correct amount of p-FA tag without addition of formaldehyde (the linker needed between the protein and the fluorinated label), thus indicating that this signal is due to a non-covalent interaction between the fluorinated tag and the protein, while the broad peaks could arise from the tag covalently bound to the protein. Mass spectrometry data (Fig. 6b) confirmed that a single tyrosine among the three of lysozyme (native unlabelled protein, Fig. 6a) was modified and that the two distinct broad peaks of the NMR spectrum could originate from the coexistence of the open and closed ring Mannich adducts. The existence of this equilibrium was confirmed through the comparison between simulated and experimental isotopic patterns of the sample under investigation, with peaks at +123 Da for the open ring adduct and at +137 Da for the closed one. Nevertheless, in this case, the reaction efficiency was significantly lower than for the GB1 protein, probably due to the reduced exposure of the tyrosine residues on the protein surface.
To further corroborate this hypothesis, HEWL was treated with 30% DMSO before adding the conjugation reaction reagents. Addition of DMSO induces a partial unfolding of the protein, thus increasing the solvent exposure of the residues, including tyrosine, and leading to an increase in the reaction efficiency. The 1D 1H NMR spectrum was used to confirm the partial unfolding of HEWL after the addition of DMSO (Supporting Information S1). Partially unfolded HEWL was then subjected to the p-fluoroaniline labelling procedure through the Mannich reaction, following the same protocol and time scheme used for the completely folded protein. The mass spectrum (Fig. 6c) shows a set of signals similar to those observed after tag binding to the folded protein, but with an overall increase of the labelling efficiency, with a tagged:untagged ratio of 40:100. Moreover, the existence of a second set of signals, with a lower intensity and a higher molecular weight, confirms the presence of a second tyrosine residue labelled with the fluorinated tag.
Therefore, the use of DMSO increases the amount of fluorinated tag attached to the tyrosine residue that is only partially labelled in the absence of DMSO.
These data confirm that only the accessible tyrosine residues can be effectively tagged and that, based on residue exposure, some selectivity in the tagging can be obtained.

Conclusions
This work presents a different application of an established mild bioconjugation reaction for NMR spectroscopy, achieving the labelling of tyrosine residues with a small molecule containing the 19F atom. The incorporation of a specific tag containing the 19F atom offers the chance to investigate biomolecular systems in less crowded spectra compared to 1H spectroscopy. Moreover, the opportunity to insert 19F atoms through an approach other than the direct overexpression of proteins with fluorinated amino acids provides a helpful alternative for situations where direct overexpression is not applicable. We demonstrated that, using the three component Mannich-type reaction, it is possible to achieve valuable site selectivity among tyrosine residues depending on their surface exposure. The amphiphilicity of the phenolic side chain plays a crucial role in obtaining labelling selectivity, since most tyrosines are buried deep in the hydrophobic core and are not available for external modification. Therefore, both the chemical environment and the surface exposure of these residues play an important role in determining whether a residue can be labelled or not. Here, we have demonstrated that, upon adding a given amount of DMSO, HEWL can go from being labelled on one tyrosine to being labelled on two. The denaturing action of dimethyl sulfoxide exposes to the solvent residues that were previously inaccessible and buried inside the protein core, and allows the conjugation reaction between the tyrosine and the fluorinated molecule. The use of a commercially available fluorinated tag such as the p-fluoroaniline used here suggests the possible application of a wide variety of relatively cheap molecules. Moreover, the reaction yield obtained for both GB1 and HEWL with the addition of DMSO should be considered remarkable. In fact, the first step of the three component Mannich-type reaction is an imine condensation between formaldehyde and p-fluoroaniline. Since the imine formation starts with a nucleophilic addition of the amine to the carbonyl group, the reaction has a higher efficiency if the amine is electron rich. The fluorine atom is considered an electron withdrawing group (EWG) that decreases the electron density on the nitrogen atom and reduces the efficiency of the nucleophilic attack on the carbonyl group. In conclusion, we have demonstrated how, under optimized conditions, a low-cost reaction can be exploited to perform post-expression conjugation of small fluorinated molecules to tyrosine residues. In addition, we established how the protein folding properties play a crucial role in the number of tyrosine residues that can be labelled and even in the efficiency of the reaction towards specific amino acids.

GB1 T53C expression and purification
GB1 was expressed and purified according to existing protocols 46. Briefly, a pET-21a vector encoding the immunoglobulin binding domain of streptococcal protein G (containing the mutation T53C) was used to transform the BL21 (DE3) gold cell strain. E. coli cells were grown to mid-log phase at 37 °C in LB medium, and then induced with 0.6 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). After induction the cells were grown for a further 5 h at 20 °C. The cell pellet was collected by centrifugation at 6000 rpm for 20 min and resuspended in phosphate buffer (100 mM sodium phosphate, 150 mM NaCl, pH 6.5). The suspension was heated to 80 °C for 5 min using a thermal bath, then cooled on ice for 15 min and finally centrifuged at 40,000 rpm for 40 min. After filtering the supernatant, 5 mM DTT was added to the solution, which was loaded onto a 16/600 Superdex 30 Increase column (Cytiva), exchanging the buffer with 100 mM sodium phosphate, 150 mM NaCl, 1 mM TCEP, pH 6.5.

Hen egg white lysozyme
Hen egg white lysozyme was purchased from Sigma Aldrich.

19F site directed labelling protocol
Both GB1 and HEWL were reacted with formaldehyde and p-fluoroaniline (both purchased from Sigma Aldrich) with a ratio of 1:100:30 in 100 mM sodium phosphate buffer at pH 6.5. In particular, after thawing, 100 µL of 250 µM GB1 were buffer exchanged into the final phosphate reaction buffer using a PD10 desalting column. Then, 50 µL of 0.25 M formaldehyde and 30 µL (for each tyrosine) of p-fluoroaniline were added to the protein solution.
For lysozyme, 3.5 mg of protein were resuspended in 1 mL of phosphate buffer to a final concentration of 250 µM. The same amounts of formaldehyde and p-fluoroaniline used for GB1 were added to the HEWL solution. A second sample of HEWL was first pre-treated with 30% DMSO and then reacted with formaldehyde and p-fluoroaniline. For both proteins the reaction was incubated at 37 °C for 36 h in a shaking incubator. Afterwards, the excess p-fluoroaniline was removed through a 2.5 mL PD10 desalting column. The 3.5 mL sample volume obtained after passing through the desalting column was concentrated using a 3 kDa (for GB1) or a 10 kDa (for HEWL) Centricon (Merck). The volume was reduced until the protein reached a concentration of 300 µM. However, this method was not able to completely remove the unreacted p-FA tag, especially in the HEWL case. For this reason, the labelled lysozyme was further purified by gel filtration. Briefly, the sample, in 100 mM sodium phosphate buffer at pH 6.5, was loaded onto a Superdex 16/60 75 pg size exclusion chromatography column through a 1 mL loop. The labelled protein was collected in 1.5 mL fractions, which were concentrated to 300 µM. The 1D 19F NMR spectrum recorded on this sample confirmed the complete removal of the free unreacted p-FA (Supporting Information S2). After the purification step, small aliquots of each sample were immediately taken and frozen for mass spectrometry (ESI-MS) analysis.
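The concentration and stoichiometry figures in this protocol can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes a molecular weight of about 14.3 kDa for HEWL (a commonly quoted value, not stated in the text) and reads the 1:100:30 ratio as molar equivalents of protein:formaldehyde:p-fluoroaniline, which is an interpretation rather than something the protocol spells out. The helper names are hypothetical.

```python
HEWL_MW = 14300.0  # g/mol; assumed approximate molecular weight of HEWL


def molar_concentration_uM(mass_mg, volume_mL, mw_g_per_mol):
    """Concentration in µM from mass (mg), volume (mL) and molecular weight (g/mol)."""
    moles = (mass_mg * 1e-3) / mw_g_per_mol   # mol
    litres = volume_mL * 1e-3                 # L
    return moles / litres * 1e6               # µmol/L


def reagent_amounts_nmol(protein_uM, volume_mL, equivalents):
    """nmol of each reagent for the given molar equivalents relative to protein."""
    protein_nmol = protein_uM * volume_mL     # µmol/L x mL = nmol
    return {name: protein_nmol * eq for name, eq in equivalents.items()}


hewl_uM = molar_concentration_uM(mass_mg=3.5, volume_mL=1.0, mw_g_per_mol=HEWL_MW)
amounts = reagent_amounts_nmol(250.0, 1.0, {"formaldehyde": 100, "p-FA": 30})

print(f"HEWL concentration: {hewl_uM:.0f} µM")
for name, nmol in amounts.items():
    print(f"{name}: {nmol / 1000:.1f} µmol")
```

With these assumptions, 3.5 mg of HEWL in 1 mL gives roughly 245 µM, consistent with the nominal 250 µM, and 1 mL of 250 µM protein would call for 25 µmol of formaldehyde and 7.5 µmol of p-fluoroaniline.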

19F magnetic resonance spectra (19F NMR) were recorded with a Bruker 600 MHz spectrometer with a TXI probe. Chemical shifts are reported in delta (δ) units, parts per million (ppm), and were referenced to trifluoroacetic acid (TFA) as internal standard. 10% of deuterated water was added to the NMR tubes of each sample. All spectra were recorded at 298 K.
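To relate the ppm scale to frequency units on such an instrument, the sketch below converts a chemical shift difference into hertz. It is an approximation under stated assumptions: the 1H frequency is taken as exactly 600.0 MHz (the nominal value; the true spectrometer frequency is not given here), and the 19F observe frequency is estimated from the IUPAC frequency ratio Ξ(19F) = 94.094011%, which is defined against a CCl3F reference rather than the TFA reference used in this work. The names are illustrative.

```python
XI_19F = 0.94094011  # IUPAC frequency ratio for 19F relative to 1H (assumed applicable)


def fluorine_frequency_MHz(proton_MHz):
    """Approximate 19F observe frequency for a given 1H frequency."""
    return proton_MHz * XI_19F


def ppm_to_Hz(delta_ppm, observe_MHz):
    """Convert a shift difference in ppm to Hz (1 ppm equals observe_MHz hertz)."""
    return delta_ppm * observe_MHz


f19 = fluorine_frequency_MHz(600.0)
separation_Hz = ppm_to_Hz(abs(-49.0 - (-50.0)), f19)

print(f"19F frequency: {f19:.1f} MHz")
print(f"1 ppm separation: {separation_Hz:.0f} Hz")
```

On this estimate the 19F nuclei resonate near 564.6 MHz, so signals about 1 ppm apart, such as those around -49 and -50 ppm, are separated by roughly 565 Hz.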

Figure 1. Reaction scheme. General representation of the three component Mannich-type reaction on a tyrosine residue.

Figure 3. GB1 19F NMR spectrum. (a) Comparison between 19F NMR spectra of p-FA (black) and GB1 after the conjugation reaction with p-FA (blue). (b) 4-fluoroaniline spectrum processed with a line broadening of 1 Hz, showing distinct heteronuclear coupling between 19F and 1H nuclei.

Figure 4. ESI-MS spectra of GB1. Deconvoluted ESI mass spectra of (a) GB1, 10⁻⁶ M in ammonium acetate and (b) GB1, 10⁻⁶ M, after the reaction with the fluorinated tag. The peak at 6347 Da represents the GB1 open ring adduct. The bound fragment is shown in red in the drawn structure.

Figure 5. 19F NMR spectra of lysozyme. (a) 19F NMR spectra of p-FA (blue), lysozyme after the three component Mannich reaction (black). (b) Lysozyme after the three component Mannich reaction without (red) DMSO. The green spectrum represents the non-covalent interactions between the protein and the p-FA.

Figure 6. ESI-MS spectra of lysozyme. (a) Deconvoluted ESI mass spectrum of free HEWL, (b) deconvoluted ESI mass spectrum of HEWL after the conjugation reaction, (c) deconvoluted ESI mass spectrum of HEWL after the conjugation reaction in the presence of DMSO. The bound fragments are shown in red in the drawn structures.