Structural insight into the substrate recognition and transport mechanism of the human LAT2–4F2hc complex

The L-type amino acid transporters (LATs) mediate the transport of neutral amino acids and thyroid hormones across the membrane 1–5. LAT2, which is mainly expressed in the kidney and small intestine, has a broader substrate range than LAT1, including small amino acids 6–9. LAT1 and LAT2 are light chains of the heterodimeric amino acid transporters (HATs), each of which is composed of a light chain and a heavy chain. The 4F2 cell-surface antigen heavy chain (4F2hc) is the heavy chain for both LAT1 and LAT2; it plays a role in the plasma membrane localization of LATs 4 and is required for the stability and transport activity of HATs 10. The first cryo-EM structures of the human LAT1–4F2hc complex were solved recently, but the structure of a HAT bound to a native substrate remains unknown. Here, we report the cryo-EM structures of the human LAT2–4F2hc complex bound to the substrate Leu or Trp at resolutions of 2.9 Å and 3.4 Å, respectively. These structures exhibit an inward-open conformation, similar to that of the human LAT1–4F2hc complex. Both Leu and Trp bind at the bottom of the inner pocket of the transporter, while Trp might adopt two different binding modes. Structural analysis and biochemical assays provide an important basis for understanding the working mechanism of HATs, especially by elucidating the substrate-binding mechanism.

The details of recombinant expression and purification of the LAT2–4F2hc complex
The full-length human cDNA of LAT2 (accession number: NM_012244.4) was subcloned into pCAG with an N-terminal FLAG tag, and that of 4F2hc (isoform b, accession number: NM_001012662.2) into pCAG with an N-terminal 10×His tag. Mutations were generated by standard two-step PCR.
For preparation of the complex sample to be incubated with tryptophan or leucine, cells were incubated at 4 °C for 2 hours with 1% (w/v) GDN (Anatrace) and then centrifuged at 18,700 × g for 45 min to remove cell debris. The supernatant was loaded onto anti-FLAG M2 affinity resin (Sigma). The resin was washed with wash buffer containing 25 mM Tris (pH 8.0), 150 mM NaCl and 0.05% (w/v) GDN, and the protein was then eluted with wash buffer plus 0.2 mg/mL FLAG peptide. The eluate from the anti-FLAG M2 affinity resin was further purified with Ni-NTA affinity resin (Qiagen). The wash and elution buffers for the nickel resin were the wash buffer described above supplemented with 10 mM and 300 mM imidazole, respectively. The protein complex was then subjected to size-exclusion chromatography (Superose 6 Increase 10/300 GL, GE Healthcare) in buffer containing 25 mM Tris (pH 8.0), 150 mM NaCl and 0.02% GDN. The peak fractions were collected and concentrated for EM analysis.
For the transport activity assay, the protein complex was purified essentially as described above, except that the membrane fraction was solubilized at 4 °C for 2 hours with 1% (w/v) LMNG (Anatrace) supplemented with 0.1% (w/v) cholesteryl hemisuccinate Tris salt (Anatrace) instead of GDN. The peak fractions were collected and stored at −80 °C for proteoliposome preparation.

In vitro transport activity assay
Liposomes and proteoliposomes were prepared as described previously with a slight modification 1. The reaction buffer inside the liposomes and proteoliposomes contained 20 mM potassium phosphate (pH 6.5), 150 mM KCl and 10 mM L-leucine (Sigma). All transport activity assays were performed at room temperature. Uptake of leucine or L-[14C]tryptophan was stopped after 1 min or 5 min by rapidly filtering the reaction solution through a 0.22 µm GSTF filter (Millipore), which was then washed with 2 mL of ice-cold wash buffer (20 mM potassium phosphate (pH 6.5) and 150 mM KCl). The filter was then used for liquid scintillation counting.
To initiate the substrate competition assay, an additional 1 mM of unlabeled amino acid was added to the reaction buffer. Km and Vmax were measured in the presence of unlabeled leucine at the indicated concentrations in the reaction buffer outside the proteoliposomes, and the uptake of 3H-labeled leucine was stopped after 15 s. In each measurement, liposomes containing no protein were included as an empty control.
Data were processed with GraphPad Prism software.
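As a rough illustration of how Km and Vmax can be extracted from uptake rates measured at several substrate concentrations, the sketch below applies a Hanes-Woolf linearisation of the Michaelis-Menten equation in plain Python. It is not the fitting procedure used for this study, which relied on GraphPad Prism. The function names and the synthetic concentrations and rates are assumptions made for the example only.

```python
# Hanes-Woolf linearisation of the Michaelis-Menten equation:
#   v = Vmax * S / (Km + S)   =>   S / v = S / Vmax + Km / Vmax
# All numbers below are synthetic placeholders, not data from this study.


def michaelis_menten(s, km, vmax):
    """Initial uptake rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_hanes_woolf(conc, rate):
    """Least-squares line through the points (S, S/v); returns (Km, Vmax)."""
    x = list(conc)
    y = [s / v for s, v in zip(conc, rate)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Hypothetical noiseless rates generated with Km = 50 and Vmax = 200
# (arbitrary units), used only to show that the fit recovers them.
TRUE_KM, TRUE_VMAX = 50.0, 200.0
concentrations = [5.0, 10.0, 25.0, 50.0, 100.0, 250.0, 500.0]
rates = [michaelis_menten(s, TRUE_KM, TRUE_VMAX) for s in concentrations]

km_fit, vmax_fit = fit_hanes_woolf(concentrations, rates)
print(f"Km = {km_fit:.2f}, Vmax = {vmax_fit:.2f}")
```

With real, noisy measurements a direct nonlinear fit is generally preferable, since linearised forms distort the error structure. Rates would also first be corrected by subtracting the signal of the empty-liposome control.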

Cryo-EM sample preparation and data acquisition
The purified LAT2-4F2hc complex was concentrated to ~10 mg/mL and, where required, incubated with 10 mM Leu or 2 mM Trp for 2 hours before being applied to the grids. Aliquots (3.5 µL) of the protein complex were placed on glow-discharged holey carbon grids (Quantifoil Cu R1.2/1.3). The grids were blotted for 3 s or 3.5 s and flash-frozen in liquid ethane cooled by liquid nitrogen using a Vitrobot (Mark IV, Thermo Fisher Scientific). The prepared grids were transferred to a Titan Krios operating at 300 kV and equipped with a Gatan K3 Summit detector and a GIF Quantum energy filter for the LAT2-4F2hc+Leu complex, or with a Cs corrector, a Gatan K2 Summit detector and a GIF Quantum energy filter for the LAT2-4F2hc+Trp complex. A total of 3,384 and 5,030 movie stacks were automatically collected using AutoEMation for LAT2-4F2hc+Leu and LAT2-4F2hc+Trp, respectively, with a slit width of 20 eV on the energy filter and a preset defocus range from −1.2 µm to −2.2 µm in super-resolution mode. The total electron dose was approximately 48 e−/Å2 for each stack, which contained 32 frames. The stacks were motion corrected and dose-weighted 2 with MotionCor2 3 and binned 2-fold, resulting in a pixel size of 1.087 Å/pixel or 1.091 Å/pixel for the LAT2-4F2hc+Leu or the LAT2-4F2hc+Trp complex, respectively. The defocus values were estimated with Gctf 4.
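The acquisition parameters above imply a few derived quantities that are easy to check by hand. The following minimal sketch computes the average dose per frame, the pixel size before 2-fold binning and the Nyquist limit of the binned data. The variable and function names are chosen for illustration, and the dose is assumed to be spread evenly over the frames.

```python
# Values quoted in the acquisition description above.
TOTAL_DOSE = 48.0    # approximate total dose per movie stack, e-/A^2
N_FRAMES = 32        # frames per stack
BIN_FACTOR = 2       # binning applied after motion correction
BINNED_PIXEL = {"Leu": 1.087, "Trp": 1.091}  # A/pixel after binning

# Assumes the dose is distributed evenly over all frames.
dose_per_frame = TOTAL_DOSE / N_FRAMES


def unbinned_pixel_size(binned, bin_factor=BIN_FACTOR):
    """Pixel size (A) of the super-resolution movies before binning."""
    return binned / bin_factor


def nyquist_limit(pixel_size):
    """Finest resolution (A) representable at a given pixel size."""
    return 2.0 * pixel_size


print(f"dose per frame: {dose_per_frame:.2f} e-/A^2")
for name, pixel in BINNED_PIXEL.items():
    print(
        f"{name}: unbinned {unbinned_pixel_size(pixel):.4f} A/pixel, "
        f"Nyquist {nyquist_limit(pixel):.3f} A"
    )
```

Both reported resolutions (2.9 Å and 3.4 Å) lie above the Nyquist limit of roughly 2.2 Å for the binned data, so the binning does not restrict them.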

Data processing
A total of 2,830,087 or 3,217,534 particles were automatically picked from 2,357 or 4,656 manually selected micrographs using Relion 5–10 for the LAT2-4F2hc+Leu or LAT2-4F2hc+Trp complex, respectively. After 2D classification, a total of 1,584,241 or 2,569,261 particles were selected for the LAT2-4F2hc+Leu or LAT2-4F2hc+Trp complex, respectively. The selected particles were subjected to global angular searching 3D classification against an initial model generated with Relion. For each of the last several iterations of the global angular searching 3D classification, a local angular searching 3D classification was performed, during which the particles were classified into 4 classes. A total of 1,584,241 or 1,152,978 non-redundant good particles were selected from the local angular searching 3D classification for the LAT2-4F2hc+Leu or LAT2-4F2hc+Trp complex, respectively. These selected particles were then subjected to multi-reference 3D classification and local defocus refinement. The overall resolutions of the 3D auto-refinement after post-processing were 2.9 Å or 3.4 Å with a particle number of 751,924 or 713,248 for the LAT2-4F2hc+Leu or LAT2-4F2hc+Trp complex, respectively.
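To make the particle selection easier to follow, the short sketch below tabulates the counts quoted above and the fraction of auto-picked particles retained at each stage. The stage labels and function name are illustrative choices; the counts are copied from the text as reported.

```python
# Particle counts at each stage, as reported in the text.
STAGES = [
    "auto-picked",
    "after 2D classification",
    "after local-search 3D classification",
    "final reconstruction",
]
COUNTS = {
    "Leu": [2_830_087, 1_584_241, 1_584_241, 751_924],
    "Trp": [3_217_534, 2_569_261, 1_152_978, 713_248],
}


def retention(counts):
    """Fraction of the initial picks remaining at each stage."""
    first = counts[0]
    return [c / first for c in counts]


for name, counts in COUNTS.items():
    for stage, n, frac in zip(STAGES, counts, retention(counts)):
        print(f"{name:>3}  {stage:<38} {n:>10,d}  {frac:6.1%}")
```

For both datasets roughly a quarter of the initial picks contribute to the final map.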
The 2D classification, 3D classification and auto-refinement were performed with Relion 3. The local defocus refinement was accomplished with Gctf. The resolution was estimated with the gold-standard Fourier shell correlation 0.143 criterion 11,12 with high-resolution noise substitution 13 .
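The 0.143 criterion amounts to reading off the spatial frequency at which the half-map Fourier shell correlation (FSC) curve first falls below 0.143. The sketch below shows one simple way to do this with linear interpolation between shells. It is not the Relion implementation, and the function name and the FSC values are hypothetical and serve only to demonstrate the calculation.

```python
def resolution_at_threshold(spatial_freq, fsc, threshold=0.143):
    """Resolution (A) at which the FSC curve first drops below threshold.

    spatial_freq: increasing shell frequencies in 1/A.
    fsc: FSC value for each shell.
    Returns None if the curve never crosses the threshold from above.
    """
    for i in range(1, len(fsc)):
        above, below = fsc[i - 1], fsc[i]
        if above >= threshold > below:
            f0, f1 = spatial_freq[i - 1], spatial_freq[i]
            # Linear interpolation between the two bracketing shells.
            f_cross = f0 + (above - threshold) * (f1 - f0) / (above - below)
            return 1.0 / f_cross
    return None


# Hypothetical FSC curve, not taken from this study.
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]
fsc_values = [1.00, 0.99, 0.97, 0.90, 0.75, 0.50, 0.20, 0.05]

resolution = resolution_at_threshold(freqs, fsc_values)
print(f"estimated resolution: {resolution:.2f} A")
```

In practice the curve is computed between masked half maps and corrected for mask effects, which is the role of the high-resolution noise substitution mentioned above.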

Model building and structure refinement
Model building of the LAT2-4F2hc+Trp and the LAT2-4F2hc+Leu complex was performed with Coot 14 based on the cryo-EM maps, with the PDB model of the LAT1-4F2hc complex (PDB ID: 6IRT) as a starting template. The subsequent modeling was performed in Coot with aromatic residues as landmarks, as most of these residues were clearly visible in our cryo-EM maps. Each residue was checked manually, taking its chemical properties into account during model building.
A total of 927 amino acid residues were built for the LAT2-4F2hc+Trp and the LAT2-4F2hc+Leu complex. The N-terminal sequences of both 4F2hc and LAT2 were not modeled because the corresponding density was not visible in the map. Eight sugar moieties, 2 lipid moieties and 1 ligand amino acid molecule were assigned for the LAT2-4F2hc complex according to the cryo-EM maps. Six water molecules were assigned for the LAT2-4F2hc+Leu complex.
Structure refinement was performed with Phenix 15 with secondary structure and geometry restraints to prevent overfitting. To monitor overfitting, the model was refined against one of the two independent half maps from the gold-standard 3D refinement approach. The refined model was then tested against the other half map 16. Statistics associated with data collection, 3D reconstruction and model refinement can be found in Supplemental Data Table S1.

Supplementary information, Figure S6
Figure S6. Sequence alignment of LAT2 homologues. The sequences were aligned using ClustalX. The seven aligned sequences are LAT2, LAT1, ASC1, y+LAT1, y+LAT2, b0,+AT and xCT from Homo sapiens. Amino acids that are identical or conserved in at least four sequences are coloured red or yellow, respectively. The gating residue Phe243 is labelled with a solid black circle. The residues that are involved in substrate specificity for neutral and smaller amino acids are marked with solid blue and magenta circles, respectively. The Asp residues of b0,+AT, y+LAT1 and y+LAT2, at positions where LAT1 and LAT2 have non-charged residues, are outlined with a black box. The residues important for the second substrate binding mode of LAT2 are indicated by solid green circles. The UniProt IDs of the aligned sequences are listed below. LAT2: Q9UHI5; LAT1: Q01650; ASC1: Q9NS82; y+LAT1: Q9UM01; y+LAT2: Q92536; b0,+AT: P82251; xCT: Q9UPY5.