Inhibitor binding mode and allosteric regulation of Na+-glucose symporters

Sodium-dependent glucose transporters (SGLTs) exploit sodium gradients to transport sugars across the plasma membrane. Due to their role in renal sugar reabsorption, SGLTs are targets for the treatment of type 2 diabetes. Current therapeutics are phlorizin derivatives that contain a sugar moiety bound to an aromatic aglycon tail. Here, we develop structural models of human SGLT1/2 in complex with inhibitors by combining computational and functional studies. Inhibitors bind with the sugar moiety in the sugar pocket and the aglycon tail in the extracellular vestibule. The binding poses corroborate mutagenesis studies and suggest a partial closure of the outer gate upon binding. The models also reveal a putative Na+ binding site in hSGLT1 whose disruption reduces the transport stoichiometry to the value observed in hSGLT2 and increases inhibition by aglycon tails. Our work demonstrates that subtype selectivity arises from Na+-regulated outer gate closure and a variable region in extracellular loop EL5.

Next, we resolved the differences between the two methods. The easiest segments to resolve were TM5, 6, and 8, where there was good agreement throughout much of the sequence. Both alignments agree on the N-terminal side of the helices; however, the sequence alignment does not create a gap for TM6 and 8, which motivated our choice of this alignment. The structural alignment was chosen for TM5 for the same reason. The alignments of TM3, 4, and 9 were more difficult to address due to low sequence identity and/or less agreement between the two alignment methodologies.
For TM3, both methods agree despite a very low sequence identity for this segment. Unfortunately, both methods introduce a gap in the middle of the helix so that V141 in vSGLT fails to align to a residue in SiaT. We chose to close the gap by retaining the alignment on the N-terminal side of the helix, which has a pattern of basic residues that are conserved in both transporters. TM4 is a short segment, and the two alignment methods disagree throughout the entire stretch. Examining the structural alignment, it is clear that there is a major shift in the position of TM4 relative to the rest of the segments during transport. For this reason, we decided to adopt the result suggested by the sequence alignment. Finally, for TM9 there is a large discrepancy between the alignments produced by the two methods, and while we would typically lean toward the structural alignment, TM9 undergoes a large motion with respect to the rest of the protein during the transport process. For this reason, we believe that the structural alignment is quite poor, and we adopted the result from the sequence alignment, which has a much higher identity in this region (~28%) than the structural alignment.
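The per-segment identity values used to compare the two alignments (such as the ~28% quoted for TM9) can be illustrated with a minimal sketch. The function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are assumptions made for illustration; the text does not state which identity definition or tool was used.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: identity is normalized by the number of ungapped columns.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned segments (hypothetical residues, not the real TM9 sequences).
toy_a = "ALGV-TLSI"
toy_b = "ALCVWTMSA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Applying the same function to one transmembrane segment under each candidate alignment gives a like-for-like comparison of the kind described above.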

The sugar binding mode of vSGLT is conserved during transport
The hydrogen bonding pattern between the sugar bundle residues (Q69, E88, S91, N260, and K294) and the polar groups on the sugar is conserved between the inward-facing X-ray structure 4 Fig. 4c). Regardless, the high level of agreement between the binding mode of galactose in this outward-facing homology model and the inward-facing X-ray structure provides additional confidence in our alignment between SiaT and vSGLT, and our ability to make faithful homology models.

Sodium site stability in hSGLT1
We set up four short molecular dynamics simulations of hSGLT1 in which Na+ was present at either the Na1 or Na3 site, in single occupancy or in double occupancy with another Na+ in the conserved Na2 site (Supplementary Table 1). Both attempts to simulate Na+ at the Na1 position failed to provide a stable configuration (Supplementary Fig. 4). In the absence of a Na2 ion, the Na1 ion quickly departs from its initial placement and moves toward the Na2 site over the 10 ns simulation (Supplementary Fig. 4a). In the presence of a Na2 ion, the Na1 site is even more unstable, and the ion leaves the protein for the extracellular space in 3 ns (Supplementary Fig. 4b).
In contrast to this initial finding, Na+ in the Na3 site is stable in the presence or absence of the Na2 ion when simulated for 10 ns or 30 ns, respectively (Supplementary Fig. 4c,d).
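One simple way to quantify this kind of site stability is to track how far an ion moves from its starting position across trajectory frames. The sketch below is illustrative only: the function names, the 3.0 Å cutoff, and the toy coordinates are assumptions, and it presumes the frames have already been aligned on the protein. The text does not state the criterion actually used to judge stability.

```python
import math


def displacement_trace(positions):
    """Distance (in Å) of the ion from its first-frame position, per frame."""
    start = positions[0]
    return [math.dist(start, p) for p in positions]


def first_departure(positions, cutoff=3.0):
    """Index of the first frame where the ion is farther than `cutoff` Å
    from its starting position, or None if it never leaves.

    The 3.0 Å default is an arbitrary illustrative threshold.
    """
    for frame, distance in enumerate(displacement_trace(positions)):
        if distance > cutoff:
            return frame
    return None


# Toy trajectories (hypothetical coordinates, one (x, y, z) tuple per frame).
stable_ion = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.3, 0.4, 0.0)]
leaving_ion = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

print(first_departure(stable_ion))
print(first_departure(leaving_ion))
```

Multiplying the returned frame index by the frame spacing converts it into a departure time.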

Na+ binding site analysis
We catalogued Na+ binding sites in unique chains of the structural database using the program Probe 6,7. For an input protein structure, Probe outputs a list of all residues within van der Waals contact (or closer) to a Na+ ion. Residues with contact type "wc" (wide contact) from Probe were not included in Na+ coordination spheres, which excludes the more weakly coordinated residues from consideration. Supplementary Fig. 6a, b, and c were compiled from the binding site data.
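The filtering step described above can be sketched as follows. This does not parse real Probe output: it assumes the contacts have already been reduced to hypothetical `(ion_id, residue, contact_type)` tuples, and apart from "wc", the contact-type labels and all identifiers in the toy data are placeholders.

```python
from collections import defaultdict


def coordination_spheres(contacts, excluded_types=("wc",)):
    """Group contacting residues by ion, dropping excluded contact types.

    `contacts` is an iterable of (ion_id, residue, contact_type) tuples,
    a hypothetical pre-parsed form of the contact list.
    """
    spheres = defaultdict(set)
    for ion_id, residue, contact_type in contacts:
        if contact_type in excluded_types:
            continue  # skip weakly coordinated (wide-contact) residues
        spheres[ion_id].add(residue)
    return {ion: sorted(residues) for ion, residues in spheres.items()}


# Toy contact list (made-up identifiers, for illustration only).
toy_contacts = [
    ("NA1", "SER 10", "cc"),
    ("NA1", "THR 11", "cc"),
    ("NA1", "LEU 50", "wc"),
    ("NA2", "ASP 20", "cc"),
    ("NA2", "GLY 21", "wc"),
]

for ion, residues in coordination_spheres(toy_contacts).items():
    print(ion, residues)
```

An ion whose contacts are all wide contacts does not appear in the result, which matches the idea of keeping only the more tightly coordinated residues.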
The sodium-coupled sialic acid symporter (PDB ID 5NV9, resolution of 1.95 Å) and sodium-calcium exchanger proteins (PDB ID 5HYA, resolution of 1.90 Å) shown in Supplementary Fig. 5b,c were found to have binding sites similar to that of hSGLT1 (panel a) after searching through Na+ binding sites that featured Ser/Thr side chain residues at relative positions i and i + 1 in sequence. Binding sites similar to that of 5NV9, shown in Supplementary Fig. 5d, were found by searching through Na+ binding sites that featured a bidentate Glu or Asp residue (green), as observed in the hSGLT1/2 homology models and the 5NV9 crystal structure. Structural similarity between 5NV9 and other binding sites was quantified via superposition and root-mean-square deviation (RMSD) between the following atoms. The four carboxylate atoms of D182 (CB, CG, OD1, OD2) from 5NV9 were mapped to those in the bidentate Asp/Glu of the query binding site (CB/CG, CG/CD, OD1/OE1, OD2/OE2).
The Na+ ions of each site were also included in the superposition. The three remaining oxygen atoms of the 5NV9 site (O342, OG345, OG346) were mapped sequentially onto all possible permutations of three oxygen atoms in the query binding site; the binding sites were then superimposed and the RMSD was calculated. The superposition with the lowest RMSD was returned.
vSGLT was aligned to SiaT using a sequence-based alignment or a structure-based alignment method indicated by seq. and struct., respectively. Inclusion of hSGLT1 and hSGLT2 in the alignment follows from previously published alignments to vSGLT. The sequence used for all homology modeling is indicated by final and is
Four hSGLT1 simulations were performed to assess the stability of Na+ at each putative ion site. Na+ at Na1 is not stable in single occupancy (a) or double occupancy (b) with another ion in Na2. The Na1 ion escapes the protein after 3 ns in the latter case. Na+ in the Na3 site is stable in double (c) and single occupancy (d) for 10 ns and 30 ns, respectively. In all panels, hSGLT1 is white and bound ions are represented as yellow spheres drawn every 10 ps throughout the trajectory.
Each Na+ coordination sphere in a, b, and c contains five oxygen atoms. Four of these five oxygen atoms (O342/O206, OG345/OG1209, OG346/OG210, OD1183/OG77) and the Na+ from 5NV9 and 5HYA were superimposed, and the RMSD was calculated with respect to these atoms. The poses shown are those resulting after superposition. Sodium is shown as purple spheres. (d) The Protein Data Bank contains many examples of Na+ coordination spheres similar to the Na3 site of hSGLT1. Structural matches to the coordination geometry of 5NV9 (a nearly identical coordination sphere to hSGLT1) are shown with PDB accession codes and RMSD. Dashed, cyan lines highlight the coordination between Na+ and oxygen atoms of the matches that correspond to oxygen atoms in 5NV9. The binding sites from 5NV9 and each structural match were superimposed via carboxylate, Na+, and oxygen atoms (see Methods in the main text), and the RMSD was calculated with respect to these atoms.
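The permutation search used in the binding site comparison (fixed carboxylate and Na+ atom pairs, plus every ordered choice of three query oxygens) can be sketched as below. This is an illustration under assumptions: it uses NumPy and the Kabsch algorithm for the rigid superposition, whereas the text does not name the superposition method, and the function names are hypothetical.

```python
from itertools import permutations

import numpy as np


def kabsch_rmsd(ref, mob):
    """RMSD after optimal rigid superposition of `mob` onto `ref`.

    Both inputs are (N, 3) coordinate arrays with rows already paired.
    """
    ref = np.asarray(ref, dtype=float)
    mob = np.asarray(mob, dtype=float)
    ref_c = ref - ref.mean(axis=0)  # remove translation
    mob_c = mob - mob.mean(axis=0)
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    fitted = mob_c @ rot.T
    return float(np.sqrt(np.sum((fitted - ref_c) ** 2) / len(ref)))


def best_permuted_rmsd(ref_fixed, ref_oxygens, query_fixed, query_oxygens):
    """Lowest RMSD over all ordered selections of query oxygens.

    `ref_fixed`/`query_fixed` hold atoms with a known pairing (for example
    the four carboxylate atoms and the Na+ ion); the oxygens are matched
    by exhaustive permutation. Returns (rmsd, query_index_tuple).
    """
    ref_all = np.vstack([ref_fixed, ref_oxygens])
    query_oxygens = np.asarray(query_oxygens, dtype=float)
    best_rmsd, best_idx = float("inf"), None
    for idx in permutations(range(len(query_oxygens)), len(ref_oxygens)):
        query_all = np.vstack([query_fixed, query_oxygens[list(idx)]])
        rmsd = kabsch_rmsd(ref_all, query_all)
        if rmsd < best_rmsd:
            best_rmsd, best_idx = rmsd, idx
    return best_rmsd, best_idx
```

With three reference oxygens and a query site that has n candidate oxygens, the loop evaluates n(n-1)(n-2) superpositions, which stays small for typical coordination spheres.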

Supplementary Figures
The poses shown are those resulting after superposition. Bidentate Glu or Asp residues are colored green.
Waters are shown as red spheres. Sodium is shown as purple spheres.
[Figure residue: histogram panels of counts (axis 0 to 400) by residue type for Na+ coordination spheres; one panel is labeled "main chain".]
The site of Na3 was identified via combination of simulation and functional assay, and it is located far from all other sites near the inward gate coordinating TM3, TM6, and TM10. The 1.9 Å X-ray structure of SiaT revealed two Na+ ions, and a Na+ Hill coefficient of 1.5, similar to values reported for hSGLT1, which is well established to have a 2-to-1 stoichiometry. MD simulations revealed that Na+ in this new site, called Na3 (not to be confused with the Na3 site in GlyT2), is very stable 10. The Na3 site in SiaT and hSGLT1, based on homology, is below Na2, engaging TM1, TM5, and TM8.
Check marks indicate that Na+ is present in the structure at the start of the simulation.