Highly selective tungstate transporter protein TupA from Desulfovibrio alaskensis G20

Molybdenum and tungsten are taken up by bacteria and archaea as their soluble oxyanions through high-affinity transport systems belonging to the ATP-binding cassette (ABC) transporters. Component A (ModA/TupA) of these transporters is the first selection gate at which the cell differentiates between MoO₄²⁻, WO₄²⁻ and other similar oxyanions. We report the biochemical characterization and the crystal structure of apo-TupA from Desulfovibrio alaskensis G20 at 1.4 Å resolution. Small-angle X-ray scattering data suggest that the protein adopts a closed and more stable conformation upon ion binding. The role of arginine 118 in oxyanion selectivity was also investigated, and three variants were constructed: R118K, R118E and R118Q. Isothermal titration calorimetry clearly shows the relevance of this residue for metal discrimination and oxyanion binding: all three variants lost the ability to bind molybdate, while the R118K variant retains an extremely high affinity for tungstate. These results contribute to an understanding of the metal-protein interaction and make the R118K variant a suitable candidate for the recognition element of a biosensor for tungsten detection.

Tungsten is a rare transition metal from group VI of the periodic table, together with chromium and molybdenum. It has several important characteristics, including a high melting point (the highest of all metals, 3422 °C), good conductivity and a large variety of oxidation states (from −2 to +6) 1,2. Due to its versatility, W is an important component in a broad range of civil and military industries. Applications include X-ray equipment, implanted medical devices, microwaves, ammunition, light bulb filaments, jewelry and metal tools, among others 3,4. Portugal has had an important W deposit since the end of the 19th century and is the ninth largest producer in the world (according to the U.S. Geological Survey in 2016) 5. In the past, W was viewed as a non-toxic metal, but the scientific community has become aware of its potential impact on the environment through accumulation in air, soils and water (where WO₄²⁻ is the dominant species), mainly at mining sites but also at some industrial and military sites. Recent data indicate that high concentrations of W are associated with peripheral arterial disease 6, increased prevalence of stroke 7, fasting plasma glucose levels and diabetes 8,9. In biological systems, the incorporation of tungsten as a cofactor is limited to bacteria and archaea, with W-containing enzymes almost restricted to obligate anaerobic prokaryotes 10,11. In contrast, the similar element molybdenum is also found in several enzymes from Eukarya. In most cases, the function of Mo- and W-enzymes is to catalyze oxygen atom transfer reactions. One of the challenges for biological systems is to differentiate one metal from the other to avoid incorrect insertion in the catalytic site of enzymes. Misincorporation of metals into proteins often leads to compromised activity or even inactive enzymes 12-14.
Both tungsten and molybdenum are taken up by cells in the form of their soluble oxyanions (MoO₄²⁻ or WO₄²⁻) by highly specific ABC transporters 15,16. In prokaryotes, there are three types of tungstate/molybdate transporters: ModABC 17, TupABC 18 and WtpABC 19. They are composed of a periplasmic protein (component A), a transmembrane pore-forming protein (component B) and a cytoplasmic protein (component C), which hydrolyzes ATP to transport the oxyanion into the cell cytoplasm 16. The genes encoding the three components
Comparison of DaG20 TupA with related structures. Several models of molybdate/tungstate binding proteins are available at the PDB. The ModA proteins from eubacteria or archaea share low sequence identity with DaG20 TupA (less than 22%) and, even though the protein fold is similar, the RMSD upon superposition is over 3 Å. When the 3D model reported here is used as a search model at the PDB, three bacterial proteins with over 45% sequence identity are retrieved. These are the already mentioned GsTupA, a protein of unknown function from Vibrio parahaemolyticus (Q87PK2, PDB code 3muq) and a LysR substrate binding domain from Wolinella succinogenes (Q7M8V9, PDB code 3kn3), all obtained by the Protein Structure Initiative (PSI). The proteins have in common the TTTS motif and some other residues important for tungstate binding (see below), indicating that they should be classified as component A of the TupABC system (Fig. 2). DaG20 TupA shares 45.6% sequence identity with GsTupA, with conserved secondary structure elements, suggesting that the two proteins are structurally very similar. However, superposition of the entire protein gives an unexpectedly high RMSD value (2.4 Å upon superposition of 208 Cα atoms).
A more realistic comparison is obtained by superposing the individual lobes of the two proteins, which yields much lower deviations (the RMSD for the superposition of lobes A and B from the two proteins is 1.24 Å and 1.16 Å for 106 and 96 Cα atoms, respectively). These values support a high structural similarity between DaG20 TupA and GsTupA but also indicate some degree of flexibility of the lobes with respect to one another, as expected for the periplasmic A proteins of these transporters. GsTupA was likely crystallized in the presence of an ionic form of tungsten, and the deposited model reports a free W⁶⁺ accommodated in the metal binding cleft. In this GsTupA-W⁶⁺ holo form, the central cleft volume (363.9 Å³) is about four times smaller than that of the apo form of DaG20 TupA reported here (1480 Å³). Unexpectedly, the cation is not coordinated to water molecules or protein residues. The water molecules surrounding the W⁶⁺ are at a distance of 3.23-3.82 Å and the closest residues are at 3.9 Å (the OG1 of T9 and NH1 of R118 (GsTupA numbering) from lobes A and B, respectively). In DaG20 TupA these residues occupy equivalent positions, although they are not superimposable with those of GsTupA. The data suggest that conformational changes take place upon metal binding, with the protein in the holo form adopting a more compact conformation.
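The contrast between the whole-protein RMSD and the lobe-wise RMSD values is what one expects when two rigid lobes move relative to each other. The following is a minimal sketch of that effect using the Kabsch superposition algorithm on synthetic coordinates. The point clouds, the 15° hinge rotation and all function names are illustrative assumptions, not the TupA coordinates or the software used in this work; only the lobe sizes (106 and 96 points) echo the numbers quoted above.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    P0 = P - P.mean(axis=0)          # remove translation
    Q0 = Q - Q.mean(axis=0)
    H = P0.T @ Q0                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # avoid improper rotations (reflections)
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # optimal rotation mapping P0 onto Q0
    diff = (R @ P0.T).T - Q0
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def rot_z(deg):
    """Rotation matrix about the z axis (hypothetical hinge axis)."""
    t = np.radians(deg)
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0,        0.0,       1.0]])

# Synthetic two-lobe "protein": random point clouds standing in for C-alpha atoms.
rng = np.random.default_rng(0)
lobe_a = rng.normal(scale=8.0, size=(106, 3))
lobe_b = rng.normal(scale=8.0, size=(96, 3)) + np.array([25.0, 0.0, 0.0])

# Second conformation: lobe A unchanged, lobe B moved as a rigid body.
lobe_b_moved = lobe_b @ rot_z(15.0).T

rmsd_global = kabsch_rmsd(np.vstack([lobe_a, lobe_b]),
                          np.vstack([lobe_a, lobe_b_moved]))
rmsd_lobe_a = kabsch_rmsd(lobe_a, lobe_a)
rmsd_lobe_b = kabsch_rmsd(lobe_b, lobe_b_moved)

print(f"global: {rmsd_global:.2f} A, lobe A: {rmsd_lobe_a:.2f} A, lobe B: {rmsd_lobe_b:.2f} A")
```

In this toy case each lobe superposes essentially perfectly on its own, while the global superposition cannot absorb the hinge motion. With real structures the lobe-wise values are non-zero because the lobes also differ in sequence and local conformation.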
Analysis of the overall structure and stability upon ligand binding. To elucidate the possible conformational changes of DaG20 TupA upon ligand binding, synchrotron SAXS data were collected both in the presence and absence of tungstate and molybdate. The scattering data of the apo form indicate a monomeric globular TupA protein with a radius of gyration (Rg) of 24.2 Å and a maximum particle size (Dmax) of about 95 Å (see Table 1). In the presence of tungstate or molybdate (datasets DaG20 TupA-WO₄²⁻ and DaG20 TupA-MoO₄²⁻, respectively), the overall shape of the protein remains globular but becomes more spherical and compact, leading to a decrease in Rg (23 Å) and Dmax (90 Å). Importantly, the datasets obtained for TupA in the presence of tungstate and molybdate are very similar: the CorMap approach 28 reveals no statistically significant difference (C = 12, p-value = 0.12) between the two datasets, indicating that in solution the protein adopts the same conformation upon binding either oxyanion.
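As an illustration of how a radius of gyration is obtained from the low-angle part of a scattering curve, the sketch below applies the Guinier approximation, ln I(q) = ln I(0) − (Rg²/3)·q², to a synthetic, noise-free curve. The intensities are generated from the approximation itself with the apo-form Rg of 24.2 Å; they are not the measured data, and the function name and the I(0) value are assumptions made for the example.

```python
import numpy as np

def guinier_fit(q, intensity):
    """Fit ln I(q) versus q^2 with a straight line; return (Rg, I0).

    The approximation is usually trusted only for q * Rg below about 1.3.
    """
    slope, intercept = np.polyfit(q ** 2, np.log(intensity), 1)
    rg = np.sqrt(-3.0 * slope)       # slope = -Rg^2 / 3
    i0 = np.exp(intercept)           # forward scattering I(0)
    return float(rg), float(i0)

# Synthetic low-angle data built from the Guinier law (illustrative values).
rg_true = 24.2                                 # Angstrom, apo-form value quoted in the text
q = np.linspace(0.01, 1.3 / rg_true, 30)       # 1/Angstrom, restricted to q * Rg <= 1.3
i_q = 100.0 * np.exp(-(q * rg_true) ** 2 / 3.0)

rg_fit, i0_fit = guinier_fit(q, i_q)
print(f"Rg = {rg_fit:.2f} A, I(0) = {i0_fit:.1f}")
```

With experimental data the fit range has to be chosen with care, since aggregation or interparticle interference distorts the lowest angles.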
Comparison of the distance distribution functions (P(r)) of the holo and apo forms shows a more compact conformation for the holo form, in agreement with the idea that the protein 'closes' upon binding of the metal ion. The shape of the apo-form P(r) supports the notion that the two lobes are further apart than in the holo state. The slowly decaying tail at long distances observed in the P(r) function (Fig. 3, inset) suggests an unstructured N-terminal region containing the 23-residue expression tag.
The crystal structure reported here was used as a starting point to create models for both the holo-form and apo-form states. The 23 N-terminal residues that are missing in the crystal structure were modeled using BUNCH 29. Based on the GsTupA-W⁶⁺ adduct, a tungstate group was added to the expected binding site of the BUNCH model, generating a theoretical holo-form model. The theoretical scattering curves calculated using this holo-form model of DaG20 TupA are in very good agreement with the SAXS measurements of the protein in the presence of tungstate or molybdate (discrepancy χ² = 0.9) (Fig. 4). The apo-form model was created by refining the initial BUNCH model with the SREFLEX program 30. This approach revealed a slight opening of the lobes (RMSD of 1.5 Å for 274 Cα atoms), yielding an excellent agreement with the TupA SAXS data measured in the absence of tungstate or molybdate (χ² = 1.0). The optimum holo-form model, yielding the best χ² to the experimental data and the smallest RMSD to the original structure, is presented in Fig. 5, where the holo-apo transition is depicted by arrows. The SAXS data corroborate the hypothesis that DaG20 TupA is a flexible protein that adopts a loose conformation in the free form but switches to a more compact fold when binding molybdate or tungstate. This feature can be exploited to design new detection systems, as discussed later.
To study the impact of metal binding on protein stability, urea-polyacrylamide gel electrophoresis was carried out for recombinant DaG20 TupA in the presence and absence of tungstate and molybdate. TupA-ligand samples were prepared using a 10-fold excess of metal, and the excess was removed by size exclusion chromatography. The gel shows that, upon tungstate binding, the wild-type protein migrates in the gel to a greater extent than in the absence of metal or in the presence of molybdate. This indicates that TupA adopts a more compact conformation that is likely to increase stability under a urea gradient, in agreement with our SAXS analysis (Fig. 5, inset).

Table 1. Data collection and structural parameters obtained by SAXS. Molecular mass (M) was estimated from forward scattering I(0) and Porod volume, respectively; the theoretical molecular mass predicted from sequence is 29.7 kDa. Radius of gyration, Rg (Å), was calculated using the Guinier approximation and also from the distance distribution function (P(r), using GNOM), which also estimates the maximum particle dimension (Dmax). χ² values correspond to discrepancies between models and experimental data; the lowest χ² value per dataset is highlighted. MX: crystal structure. *Data collected at BM29 beamline, ESRF, Grenoble, France. The rest of the datasets were collected at EMBL P12 beamline, DESY, Hamburg, Germany.

Figure 3.
SAXS scattering data (points) and GNOM fits (lines) for TupA in the absence (TupA) and presence of tungstate (TupA W). The data collected in the presence of molybdate were omitted from the main plot for clarity, as they match the TupA W data to within noise. Inset: distance distribution functions (P(r)) for the different conditions measured.

Figure 4.
For each experimental curve, CRYSOL fits are displayed for the crystallographic structure reported in this work (MX) and the holo- and apo-form models generated from it. The best fit for TupA is the apo-form model (χ² = 1.0), while the best fit for both TupA W and TupA Mo is the holo-form model (χ² = 0.9).

Figure 5.
Three-dimensional coordinates for the holo-form hybrid model are displayed using a cartoon representation. The N-terminal section modeled with BUNCH is shown as small spheres. The large gray sphere in the center corresponds to the tungstate group (small red spheres represent O atoms) modeled by homology with PDB entries 3cfz and 3lr1. Vectors have been drawn connecting Cα atoms from the holo-form model to the apo-form model generated by SREFLEX, after superposition of lobe A, to display the 'opening' conformational transition. Upon optimal superposition including all Cα atoms (274), the RMSD is 1.5 Å. Inset: Urea-polyacrylamide gel electrophoresis of (1) TupA, (2) TupA+MoO₄, (3) TupA+WO₄. The samples in the presence of ligand were first passed through size exclusion PD-10 minitrap G-25 columns to remove the excess ligand.
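The χ² discrepancies quoted for the model fits compare a computed scattering curve with the experimental one after scaling. A minimal sketch of this kind of calculation is given below, using the commonly written form χ² = 1/(N−1) Σ [(I_exp(q) − c·I_calc(q))/σ(q)]² with a least-squares scale factor c. The curves are synthetic toy profiles with 3% noise, and the function and variable names are assumptions for illustration; this is not a reimplementation of CRYSOL, which also models the hydration shell.

```python
import numpy as np

def saxs_chi2(i_exp, sigma, i_calc):
    """Reduced chi-square between experimental and model intensities.

    The model curve is first scaled by the weighted least-squares factor c.
    Returns (chi2, c).
    """
    w = 1.0 / sigma ** 2
    scale = np.sum(w * i_exp * i_calc) / np.sum(w * i_calc ** 2)
    resid = (i_exp - scale * i_calc) / sigma
    chi2 = np.sum(resid ** 2) / (len(i_exp) - 1)
    return float(chi2), float(scale)

rng = np.random.default_rng(1)
q = np.linspace(0.01, 0.20, 200)               # 1/Angstrom

def toy_curve(rg):
    """Toy Gaussian-shaped profile; a stand-in for a model scattering curve."""
    return np.exp(-(q * rg) ** 2 / 3.0)

# "Experimental" data: the Rg = 23 A toy curve, scaled by 2, with 3% Gaussian noise.
i_true = 2.0 * toy_curve(23.0)
sigma = 0.03 * i_true
i_exp = i_true + rng.normal(0.0, sigma)

chi2_good, scale_good = saxs_chi2(i_exp, sigma, toy_curve(23.0))   # matching model
chi2_bad, scale_bad = saxs_chi2(i_exp, sigma, toy_curve(24.2))     # mismatched model

print(f"matching model: chi2 = {chi2_good:.2f}; mismatched model: chi2 = {chi2_bad:.1f}")
```

A value close to 1 means the model reproduces the data within the experimental errors, which is how the values of 0.9 and 1.0 reported for the holo- and apo-form models can be read.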

Oxyanion binding site.
Most of the residues that form the binding cleft of ModA are polar but poorly conserved among proteins from different organisms, as discussed for EcModA 25. In DaG20 TupA, the cleft is lined with positively charged residues, as seen in the electrostatic surface potential calculations (Fig. 6). The pronounced positive environment of the pocket is likely an advantage for capturing the oxyanion, even when the extracellular concentration is low. Several residues are likely to be involved in ligand binding, attracting, accommodating and delivering the oxyanion to the membrane component of the transporter system, TupB. The TTTS motif is in lobe A, with the serine pointing towards the metal binding site. Opposite to these residues are R118, T124 and D170, which are also likely involved in oxyanion interaction (Fig. 7). These residues are highly conserved among TupA proteins from other Desulfovibrio species and also in proteobacteria, green non-sulfur bacteria and even Gram-positive bacteria such as Firmicutes. When searching for similar sequences with the TTTS motif and excluding the Desulfovibrio genus, over 500 sequences were found with more than 44% identity.
Moreover, one chloride anion, arising from the crystallization conditions, was found at 3.8 Å from R118, indicating the propensity of the pocket to attract negatively charged ions. Although phosphate buffer was used during protein purification, phosphate does not occupy the WO₄²⁻ binding site, in agreement with what is already known for this type of transporter and with our previous experimental data showing that DaG20 TupA is highly specific for WO₄²⁻.
To examine the importance of R118 in ligand binding, site-directed mutagenesis was carried out, as described in the following section.
The relevance of arginine 118 in oxyanion binding. Using ITC, we previously reported the binding affinity of DaG20 TupA for tungstate and molybdate (KD of 6.30 ± 0.02 pM and 6.1 ± 0.9 nM, respectively) 31. These dissociation constants agree with what has been observed for the putative TupA Cj1540 from C. jejuni, which binds tungstate (KD 1.0 ± 0.2 pM) more tightly than molybdate (KD 50 ± 10 nM) 26. The same methodology was used to assess the ability of the R118E/Q/K variants to bind the oxyanions (see Fig. 8 and Table 2). The results show that substitution of the positive side chain by a carboxylic acid (R118E variant) prevents binding of both molybdate and tungstate, confirming the relevance of this conserved residue. When R118 is replaced by a glutamine (R118Q variant), the protein binds tungstate with much lower affinity than the wild-type (KD of 90 ± 50 nM) and also loses the ability to bind molybdate. However, if the arginine is replaced by a lysine, the results are remarkably different. Analysis of the ITC thermograms of R118K titrated with tungstate shows a very steep binding curve that indicates a strong interaction. The resulting curve hampers the determination of KD, which is far smaller than that determined for the wild-type. This problem was also observed before for DaG20 TupA-tungstate binding but was overcome by competition studies using molybdate. However, R118K has completely lost the ability to bind molybdate, so this competition strategy cannot be adopted. Nevertheless, the results clearly show that the R118K variant has an extremely high affinity and selectivity for tungstate.
This contrasts with what has been observed for E. acidaminophilum, where the authors reported that the R/K substitution strongly diminishes the binding of tungstate 16,32.
Table 2. ITC analysis of tungstate binding to TupA variants at 30 °C. In each case 10 µM protein was used for the titrations. n = measured stoichiometry of binding.
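Why a very steep ITC curve hampers the determination of KD can be made concrete with the dimensionless parameter c = n·[protein]/KD, often called the Wiseman parameter. A commonly cited rule of thumb, which is general ITC practice and not a statement from this work, is that direct fitting works best for c between roughly 1 and 1000. The sketch below evaluates c for the dissociation constants quoted above at 10 µM protein, assuming n = 1 (an assumption; the measured stoichiometries are those listed in Table 2), and also computes the wild-type molybdate/tungstate KD ratio.

```python
def wiseman_c(n_sites, protein_molar, kd_molar):
    """Dimensionless c value: n * [protein] / KD."""
    return n_sites * protein_molar / kd_molar

protein = 10e-6          # 10 uM protein in the cell, as used for the titrations
n_sites = 1              # assumed single binding site

# Dissociation constants quoted in the text (mol/L).
kd = {
    "WT + tungstate": 6.30e-12,
    "WT + molybdate": 6.1e-9,
    "R118Q + tungstate": 90e-9,
}

c_values = {name: wiseman_c(n_sites, protein, value) for name, value in kd.items()}

# Selectivity of the wild-type for tungstate over molybdate, as a KD ratio.
selectivity = kd["WT + molybdate"] / kd["WT + tungstate"]

for name, c in c_values.items():
    print(f"{name}: c = {c:.3g}")
print(f"KD(molybdate) / KD(tungstate) = {selectivity:.0f}")
```

For picomolar binding at micromolar protein concentration, c is of the order of 10⁶, so the isotherm is essentially a step function and only the stoichiometry and enthalpy are well defined. This is why displacement (competition) titrations are used for such tight binders, and why the loss of molybdate binding in R118K removes that option.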

Conclusions
Tungsten is a widely used heavy metal that is raising environmental concerns due to its accumulation in soils and water. Currently, tungsten detection requires expensive equipment (such as ICP-mass spectrometry), and there are very few reports of new detection methodologies that can be applied directly in the field. A promising research field is the use of enzyme biosensors for determining toxic compounds, since the associated analytical systems are simple, rapid and selective. Insertion of tungsten (and molybdenum) into bacterial enzymes is crucial for their activity and requires an uptake system for the metal, usually in the form of its oxyanion. Organisms have developed an efficient biological system for this purpose, an ABC-type transporter that allows metal uptake as well as differentiation between the two similar elements, W and Mo. Three classes of ABC transport systems are described in the literature, namely ModABC, WtpABC and TupABC.
In this work, we characterized structurally and biochemically the oxyanion coordination and selectivity of the wild-type DaG20 TupA and three R118 variants. X-ray crystallography and SAXS reveal that the TupA architecture shares a feature with other substrate binding proteins, namely the existence of two separate lobes. This characteristic may improve the interaction of the periplasmic component with the dimeric transmembrane component (TupB), and may also provide the structural flexibility that allows TupA to switch between a loose and a compact conformation in the absence and presence of the oxyanion, respectively.
The ITC results clearly indicate the relevance of R118 in oxyanion coordination, since all three variants lost the ability to bind molybdate. Curiously, in the R118K variant the residue substitution increases the selectivity of the protein towards tungstate, which is bound with high affinity, while abolishing its ability to bind molybdate.
Due to the conformational changes of TupA upon ligand binding and the high affinity and selectivity of the engineered R118K variant, this protein can be considered for biotechnology applications. An alkaline phosphatase-based biosensor to detect W in aqueous media was recently described by Alvarado-Gámez and co-workers. The detection relies on the fact that tungstate affects the activity of the immobilized enzyme. However, this method is highly compromised by the presence of other elements such as selenium, iron, calcium or aluminum, which interfere with the heavy metal detection even at low concentrations 33. The remarkable advantage of using TupA_R118K as the bioreceptor component of a sensor is that its tungstate binding is not affected by the presence of other similar ions such as molybdate, sulfate, phosphate or perchlorate, allowing selective detection. By combining different techniques, we could better understand the mode of tungstate binding in the TupABC system, paving the way for the development of a new approach to W detection.

DNA Cloning and Site-directed Mutagenesis.
The pET46-tupA expression vector 31 containing the tupA gene (locus tag Dde_0234) was used as a template to perform site-directed mutagenesis of the R118 residue. The primers listed in Table 3 were designed to construct the pET46-tupA_R118K, pET46-tupA_R118E and pET46-tupA_R118Q variants using the Site-Directed Mutagenesis Kit (QuikChange®, Stratagene) following the manufacturer's instructions. XL1-Blue supercompetent cells were transformed with each expression vector and plasmids were isolated from a single colony using the NZY-Tech Miniprep kit (NZY-Tech, Lisboa, Portugal). The variant constructs were confirmed by DNA sequencing using an ABI3700 DNA analyzer (Perkin/Elmer/Applied Biosystems, STABvida, Caparica, Portugal). The sequences were aligned and analyzed using the online tools BLASTp 34 and ClustalW 35.
Expression of TupA variants. The heterologous expression conditions previously optimized for TupA 31 were tested for the three variants. These conditions yielded a high percentage of protein in the insoluble fraction (data not shown). To overcome this problem, several parameters were tested, such as IPTG concentration (0.1, 0.3, 0.5, 0.8 and 1 mM) and different expression host strains (Rosetta 2(DE3)pLysS, Origami (DE3) and Tuner (DE3), from Merck Millipore). The best results were obtained using E. coli Rosetta 2 and Origami cells for pET46-tupA_R118K and pET46-tupA_R118E, respectively. In both cases, the cells were transformed and cultured in sterile Luria-Bertani medium supplemented with ampicillin (100 µg/mL) at 180 rpm and 37 °C. When the OD600 reached 0.6, protein expression was induced using 1 mM IPTG and cells were grown for 16 hours at 19 °C. For the pET46-tupA_R118Q plasmid, E. coli Tuner cells were transformed and expression was performed as described above but using 0.3 mM IPTG during the induction period.
Table 3. Primers used to mutate the R118 residue.
SCIENTIFIC RePORTS | 7: 5798 | DOI:10.1038/s41598-017-06133-y
Protein isolation protocol. After protein expression, the cells were harvested at 5000 × g for 20 min and the pellet was resuspended in a ratio of 2 g cells/mL in 50 mM sodium phosphate buffer (pH 8.0) containing 500 mM NaCl, 5 mM DNase and 1 tablet/L of Protease Inhibitor Cocktail -EDTA (Sigma-Aldrich). The cells were disrupted by sonication and the solution was clarified by centrifugation (13000 × g for 30 min). The soluble fraction was filtered through a 0.45 µm membrane. Protein purification was performed in a one-step protocol using immobilized-metal affinity chromatography (IMAC) on a His GraviTrap column (GE Healthcare) following the manufacturer's instructions. Target proteins were eluted using 50 mM sodium phosphate buffer (pH 8.0) containing 500 mM NaCl and 250 mM imidazole. The fractions were analyzed by 10% tris-tricine/polyacrylamide gel electrophoresis stained with Coomassie blue. Fractions containing TupA_R118K, TupA_R118E and TupA_R118Q were dialyzed against 5 mM Tris-HCl pH 7.6. All steps were performed at 4 °C.
Isothermal Titration Calorimetry. ITC experiments were performed as described previously 31 using a VP-ITC calorimeter (MicroCal GE Healthcare). Prior to the experiments, the protein was extensively dialyzed against the reaction buffer (5 mM Tris-HCl, pH 7.5, prepared with Milli-Q dH2O). The reaction cell containing 10 µM protein was equilibrated at 30 °C and titrated with sodium tungstate or molybdate (20 or 23 injections of 10 µL of a 100 µM oxyanion solution), and the heat response was recorded. After subtraction of the baseline, the integrated heats were fitted to the single binding site model using the ORIGIN software package supplied with the calorimeter to derive n, Ka and ΔH values.
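To show how the titration described above drives the protein towards saturation, the sketch below evaluates the standard single-site binding equilibrium, [PL] = ½{(P + L + KD) − √((P + L + KD)² − 4·P·L)}, after each injection. It is a simplified illustration rather than the ORIGIN fitting procedure: the cell volume of 1.4 mL is an assumed nominal value not given in the text, dilution and displaced volume are ignored, and heats are not modeled. The KD used is the wild-type molybdate value quoted in the Results.

```python
import numpy as np

def bound_complex(p_total, l_total, kd):
    """Concentration of the 1:1 complex from total protein, total ligand and KD (all in mol/L)."""
    b = p_total + l_total + kd
    return (b - np.sqrt(b ** 2 - 4.0 * p_total * l_total)) / 2.0

cell_volume_l = 1.4e-3   # ASSUMED nominal cell volume (L); not stated in the text
protein = 10e-6          # mol/L protein in the cell
syringe = 100e-6         # mol/L oxyanion in the syringe
inj_volume_l = 10e-6     # 10 uL per injection
n_injections = 23

# Cumulative ligand in the cell after each injection (dilution ignored for simplicity).
ligand_mol = syringe * inj_volume_l * np.arange(1, n_injections + 1)
ligand_total = ligand_mol / cell_volume_l
molar_ratio = ligand_total / protein

kd_molybdate = 6.1e-9    # wild-type KD for molybdate quoted in the text
fraction_bound = bound_complex(protein, ligand_total, kd_molybdate) / protein

for ratio, frac in zip(molar_ratio[::4], fraction_bound[::4]):
    print(f"ligand/protein = {ratio:.2f}  fraction of protein bound = {frac:.3f}")
```

Under these assumptions the titration ends at a ligand-to-protein ratio above 1, so a 1:1 binder with nanomolar affinity is essentially saturated by the last injections.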
Urea-polyacrylamide gel electrophoresis. The stability of TupA after ligand binding (tungstate and molybdate) was analyzed by urea gel electrophoresis using Novex 6% tris(hydroxymethyl)aminomethane (Tris)-borate (TBE)-urea minigels and an XCell SureLock™ Mini-Cell (Invitrogen), as previously described by Mehtab et al. 36. The protein (5 µL, 50 µM) was mixed with 5 µL of 2× Novex sample buffer (without EDTA). The electrophoresis was carried out for 150 min at 180 V and 40 mA. The protein bands were visualized using Coomassie blue staining. To avoid metal chelation, EDTA was omitted from the electrophoresis solutions.

Crystallization of
To obtain a crystal structure of the holo form of TupA (MoO₄²⁻/WO₄²⁻-TupA), soaking and co-crystallization experiments were performed. For the soaking experiments, crystals obtained in the described condition were stabilized by adding a harvesting buffer solution containing 32% (w/v) PEG 3350. The crystals were then incubated with a 5-, 10- or 20-fold excess of ligand (prepared in 0.2 M magnesium chloride, 0.1 M HEPES pH 7.5 and 32% (w/v) PEG 3350) for 10 min to 24 hours. Macroscopically, the crystals were not damaged during soaking and were afterwards flash frozen using the mentioned cryo-protectant. Although more than 120 crystals were tested, almost all showed poor or no diffraction.
Data Collection, structure determination and refinement. Crystals were flash-cooled in liquid nitrogen using Paratone oil as a cryoprotectant and maintained at 100 K under a stream of gaseous nitrogen during data collection. A complete dataset was collected at beamline ID23-1 at the European Synchrotron Radiation Facility (ESRF, Grenoble, France). The crystals diffract beyond 1.40 Å resolution at a wavelength of 0.954 Å and belong to space group P12₁1. Data were processed with the XDS 37 package and AIMLESS 38 from the CCP4 program package v. 6.3.0 (Collaborative Computational Project, Number 4, 1994) 39. The data collection and processing statistics are presented in Table 4.
Structure determination was carried out by molecular replacement using PHASER 40. Several molecular models were selected according to sequence alignment homologies, namely a conserved protein of unknown function from Vibrio parahaemolyticus serotype O3:K6 (PDB code 3muq) and GsTupA (PDB code 3lr1), after omitting all cofactors and solvent molecules. The MR solution could only be obtained when the two models were superposed and small domains of the protein were used separately: Domain I, including the first 81 residues; Domain II, comprising residues 82 to 188; and Domain III, with residues 189 to 236. After structure solution, Buccaneer was used for automated model building 41 and REFMAC5 for restrained refinement 42. The water molecules were automatically added by REFMAC5 and manually inspected in Coot 43. Geometrical validation and model improvement were carried out using PDB_REDO 44, and final values of 0.176 and 0.217 were obtained for the R and R-free factors, respectively. The Ramachandran plot has 97.15% of the residues in the preferred regions, without any outliers. Mean bond angle and bond length deviations from ideal values and other refinement statistics are presented in Table 4. The deposited model contains 250 protein residues, 332 water molecules, two chlorides and one sodium ion. During inspection of the electron density, two mutated residues were found: R107K and S138P. These mutations are likely to come from the cloning strategy but, since they are located far from the tungstate binding pocket, they were considered irrelevant for the structure analysis.
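For readers less familiar with the refinement statistics quoted above, the crystallographic R factor is conventionally defined as shown below. This is the standard textbook definition, given here as a reminder and not taken from the source; k denotes a scale factor between observed and calculated structure factor amplitudes.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - k\,|F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

R-free uses the same expression but sums only over a test set of reflections excluded from refinement (5.1% of the reflections here, according to the note to Table 4), so the gap between 0.176 and 0.217 gives a rough indication of how much the model has been fitted to the working data.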
Small-angle X-ray scattering, data collection and analysis. SAXS data were collected at EMBL P12 beamline, DESY, Hamburg, Germany and at EMBL BM29 beamline, ESRF, Grenoble, France with protein concentrations ranging from 0.5 to 12 mg/mL. Data were cropped from the first point of the Guinier region until the end of the useful range defined by SHANUM 45. High and low concentration curves were merged to counter concentration effects such as interparticle interference using the program PRIMUS from the ATSAS package 46. GNOM 47 was used to obtain the P(r) function and determine the corresponding Rg and Dmax values. BUNCH 29 and SREFLEX 30 were used to generate and refine high-resolution hybrid models using the crystallographic structure reported here as the starting point. The scattering curves from the high-resolution models were calculated using CRYSOL 48.
Data Availability. Coordinates and observed structure factor amplitudes of DaG20 TupA have been deposited in the Protein Data Bank under the accession code 5my5. The collected SAXS data and the generated high-resolution hybrid models have been deposited and are available at SASBDB (entries: SASDBD9, SASDBE9, SASDBF9, SASDBG9 and SASDBH9) 49.
Table 4. Data collection, processing and structure refinement statistics for TupA crystal. Values in parentheses correspond to the highest resolution shell. # AU: asymmetric unit. * R free was calculated for 5.1% of the reflections randomly chosen from the data set.