3D architecture and structural flexibility revealed in the subfamily of large glutamate dehydrogenases by a mycobacterial enzyme

Glutamate dehydrogenases (GDHs) are widespread metabolic enzymes that play key roles in nitrogen homeostasis. Large glutamate dehydrogenases composed of 180 kDa subunits (L-GDHs180) contain long N- and C-terminal segments flanking the catalytic core. Despite the relevance of L-GDHs180 in bacterial physiology, the lack of structural data for these enzymes has limited the progress of functional studies. Here we show that the mycobacterial L-GDH180 (mL-GDH180) adopts a quaternary structure that is radically different from that of related low molecular weight enzymes. Intersubunit contacts in mL-GDH180 involve a C-terminal domain that we propose as a new fold and a flexible N-terminal segment comprising ACT-like and PAS-type domains that could act as metabolic sensors for allosteric regulation. These findings uncover unique aspects of the structure-function relationship in the subfamily of L-GDHs. Lázaro et al. report the first 3D structure of a large glutamate dehydrogenase (L-GDH), the one corresponding to the Mycobacterium smegmatis enzyme composed of 180 kDa subunits (mL-GDH180), obtained by X-ray crystallography and cryo-electron microscopy. This structure reveals that mL-GDH180 assembles as tetramers with the N- and C-terminal domains being involved in inter-subunit contacts and unveils unique features of the subfamily of L-GDHs.

The relevance of L-GDHs 180 in bacterial physiology has been emphasized in previous studies of environmental 8 and pathogenic species 10,11 . Among the latter, the mycobacterial L-GDH 180 (mL-GDH 180 ) is part of a signal transduction pathway that senses amino acid availability to control metabolism and virulence of Mycobacterium tuberculosis 7,12,13 . This enzyme is essential for the in vitro growth of the tubercle bacillus 10,11 whereas it is crucial for Mycobacterium bovis BCG survival in media containing glutamate as the sole carbon source 14 . Moreover, diverse mechanisms have been implicated in the regulation of L-GDHs 180 . The catabolism of glutamate by mL-GDH 180 is inhibited by the regulator GarA 6,7 when extracellular nitrogen donor amino acids are available 12 whereas the L-GDH 180 from Streptomyces clavuligerus 1 (phylum Actinobacteria, which includes mycobacteria) as well as L-GDHs 180 from Proteobacteria 2,4,5 are directly regulated by amino acids. Despite the key roles of L-GDHs 180 in the redistribution of amino groups within cells, their 3D structure has remained elusive, preventing a deeper understanding of the molecular basis of enzyme function.
Here we report the 3D structure of the mL-GDH 180 isoform from Mycobacterium smegmatis, obtained through an integrative approach that combined single-particle cryo-EM and X-ray protein crystallography data at resolutions between 3.59 and 6.27 Å. Our findings reveal unique characteristics of domain organization and oligomeric assembly in the L-GDHs subfamily, allow the annotation of the Pfam family PF05088 (which includes the L-GDHs 180 ) to be updated, and offer a rationale for the direct regulation of L-GDHs 180 by metabolites. Furthermore, our cryo-EM data uncover fluctuations of the quaternary structure of mL-GDH 180 that are possibly relevant for the allosteric regulation of the enzyme activity.

Results
The 3D architecture of mL-GDH 180 . As revealed by X-ray protein crystallography and single-particle cryo-EM (Figs. 1 and 2), mL-GDH 180 assembles into a homotetramer. mL-GDH 180 monomers are arranged around perpendicular twofold axes that pass through a central cavity in the structure. The 6.27 Å resolution crystal structure of the selenomethionine (Se-Met) derivative of mL-GDH 180 ( Fig. 1 and Table 1), obtained as illustrated in Supplementary Fig. 1 through an integrative strategy that also included cryo-EM data up to 3.59 Å, revealed that the protein subunits display, to the best of our knowledge, a unique domain organization (Fig. 1a). The N-terminal segment comprises three ACT (Aspartate kinase-Chorismate mutase-TyrA) -like 15 (hereafter ACT*, see below) domains (ACT*1-3), a PAS (Per-Arnt-Sim) -type 16 domain and three helical motifs (HM1-3). Notably, the primary structures of ACT and PAS domains are poorly conserved and, therefore, these modules are often difficult to identify from BLAST searches 15,16 . The C-terminal region consists of a single helical domain that showed no detectable structural similarity to previously characterized proteins in Dali 17 , ECOD 18 , CATH 19 , and VAST 20 searches and, therefore, constitutes a possible new fold.
The catalytic domains in the mL-GDH 180 complex were not found to contribute intersubunit contacts (Fig. 1a). Instead, the N-and C-terminal regions of mL-GDH 180 provide dimer-like interactions between pairs of monomers. Contacts between mL-GDH 180 subunits engage the ACT*2, ACT*3, and C-terminal domains (Fig. 1b). Most of the residues involved in interfacial hydrogen bonds or salt bridges in mL-GDH 180 are strictly conserved in the enzyme isoform from M. tuberculosis (O53203, 72% sequence identity) 7 , the L-GDH 180 from S. clavuligerus (E2Q5C0, 47% sequence identity) 1 and the L-GDH 115 from Nocardia farcinica (A0A0H5NTF9, 55% sequence identity over non-gap aligned columns). Except for a single amino acid (Arg560), the same group of residues is also conserved in the L-GDH 180 from P. aeruginosa (Q9HZE0, 40% sequence identity) 2 . These observations underscore the functional relevance of the oligomeric assembly found for mL-GDH 180 .
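The sequence identities quoted above are percentages, and for one of them the text specifies that identity was computed over non-gap aligned columns. A minimal sketch of that kind of calculation is given below, assuming a pairwise alignment supplied as two equal-length strings with "-" as the gap character; the function name and the toy sequences are hypothetical and are not taken from the L-GDH alignment.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not columns:
        raise ValueError("no non-gap columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical sequences, for illustration only).
toy_identity = percent_identity("MKT-AYIAK", "MRTQAY-AK")
```

In the toy alignment, two of the nine columns contain a gap and are skipped, so the identity is computed over the seven remaining columns. Other conventions (for example, dividing by the length of the shorter sequence) give different values, which is why the denominator is worth stating.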
ACT and PAS modules are known to regulate functionally diverse proteins by driving conformational and/or quaternary structural changes 15,16 . The binding of specific amino acids to ACT-ACT interfaces confers allosteric control to oligomeric enzymes involved in amino acid metabolism 15 whereas PAS modules sense and transduce chemical or physical stimuli to typically dimeric effector domains 16 . The ACT* domains of mL-GDH 180 differ from the archetypal ACT fold in that strand β 1 is located in the position usually occupied by strand β 4 , creating an ACT-like ββαββα topology with a β 1 β 2 β 4 β 3 antiparallel sheet ( Fig. 1b and Supplementary Fig. 2). Similar variations of the characteristic ACT fold have been described for aspartate kinases and a mammalian tyrosine hydroxylase 15,21 , including sixteen core residues that are conserved in the ACT*1-3 domains of mL-GDH 180 ( Supplementary Fig. 2). Notably, the interaction between ACT*3 modules in mL-GDH 180 produces a continuous eight-stranded antiparallel β-sheet with helices on one side (Fig. 1b). A similar side-by-side arrangement of ACT domains generates allosteric amino acid binding sites in 3-phosphoglycerate dehydrogenases and aspartate kinases 15 . Close to a dimer-like interface, the PAS module in mL-GDH 180 adopts a typical fold (Fig. 1c), comprising a core five-stranded β-sheet usually involved in signal sensing 16 , and displays up to 12% sequence identity with PAS domains in sensor histidine kinases retrieved in Dali 17 searches.
Similarly to S-GDHs 50 , the catalytic core of mL-GDH 180 consists of subdomains SDI and SDII (Fig. 1d), with the active site located in a groove in-between. Functionally important residues in the catalytic domain of L-GDHs 180 have been previously identified by their conservation in sequence comparisons of diverse GDHs 1 . The SDI in mL-GDH 180 contains most of the residues of the glutamate-binding region whereas the SDII forms the dinucleotide-binding site.
Intrinsic flexibility and alternate conformers of mL-GDH 180 . Cryo-EM and SAXS data uncovered the intrinsic flexibility of native mL-GDH 180 (Fig. 2, Supplementary Fig. 3, and Supplementary Table 1). The domains comprised in the 1-500 region of the protein are stabilized by crystal contacts in the crystallographic structure. In contrast, 2D averages for side views of mL-GDH 180 tetramers revealed a high degree of flexibility at distal ends, where ACT*1-2 and PAS domains reside, and their corresponding densities vanished in 3D cryo-EM maps (Fig. 2a). A 3D classification of the detected mL-GDH 180 particles was performed to distinguish alternate conformers of the enzyme. Two mL-GDH 180 conformers were found, called the open and closed conformations (Fig. 2b), for which the ACT*3 module, the HM3, the catalytic domain and the C-terminal region were defined in [...] (Table 2).

Fig. 1 [...] a [...]; a tetramer (as ribbons) is formed by crystallographic symmetry (CS); oval symbols represent twofold axes. The 2mFo-DFc electron density (gray mesh), contoured to 1.5σ, is shown for one protein subunit on the right. Domain boundaries are given in residue numbers in a scheme below; CD catalytic domain, CTD C-terminal domain, AS active site. A comparative scheme of L-GDHs 180 , L-GDHs 115 , and S-GDHs 50 is also provided, with approximate residue numbers. b Oligomeric interfaces (areas in Å 2 ) involve the domains ACT*2, ACT*3, and CTD. Contacting residues (as sticks in insets) labeled in bold characters are strictly conserved in diverse L-GDHs. The topology of domains ACT*2 and ACT*3 is highlighted with rainbow colors; white positions within the rainbow depict conserved core residues 15 . c The PAS domain. d The CD is shown with the SDI and SDII in yellow and orange, respectively. The βαβ motif is involved in dinucleotide binding 1 . The glutamate-binding region (GluBR, cyan) and the dinucleotide-binding region (DNBR, green) 1 [...]
The two conformers differ in the relative positions of the centers of mass of the subunits (Fig. 2b). The catalytic domains in mL-GDH 180 monomers in contact through their N-terminal segments are found closer to each other in the less stable closed conformation compared to the open form. Overall, these findings reveal transitions of the quaternary structure that could take part in the allosteric regulation of the enzyme.

Discussion
L-GDHs 180 were discovered in 2000 from a study of Streptomyces clavuligerus 1 and were later isolated from other diverse bacterial species, including Pseudomonas aeruginosa 2 , psychrophilic bacteria 3-5 , Caulobacter crescentus 8 , and Mycobacterium spp. 6,7 . As sequences of L-GDHs were identified, they were analyzed in light of the available crystallographic evidence for S-GDHs 50 1,22 .
S-GDHs 50 are hexameric enzymes in which the oligomeric interfaces are formed by motifs that are located within the catalytic domain 1,22 . Most of these motifs are substantially modified in L-GDHs, either through sequence changes, insertions, or deletions 1,22 . In agreement with proposals that the oligomeric assembly would then be different for the two enzyme subfamilies 1,22 , the quaternary structure of mL-GDH 180 depends on interactions established by the N- and C-terminal regions flanking the catalytic domain (Fig. 1a) and is radically different from that of S-GDHs 50 (Fig. 2e). The stoichiometry of the mL-GDH 180 complex observed by cryo-EM and X-ray protein crystallography (Figs. 1 and 2) is supported by molecular weight estimates from SAXS data (Supplementary Table 1) and is consistent with previous reports of tetrameric complexes of L-GDHs studied in solution 2,9 . Furthermore, most of the residues involved in interactions between mL-GDH 180 monomers are conserved (Fig. 1b) not only in mycobacterial isoforms of the enzyme but also in L-GDHs from diverse species in Actinobacteria and Proteobacteria. Notably, the C-terminal domain, for which we did not find structural similarity with other characterized proteins, has a conserved length among L-GDHs (Fig. 1a) and the contacts between residues predicted from sequence alignments by RaptorX (Supplementary Fig. 6) further support the interactions observed experimentally. All these findings suggest that the oligomeric assembly of mL-GDH 180 may be a common theme in the enzyme subfamily.
The catalytic domains in the mL-GDH 180 complex are oriented opposite to those in S-GDHs 50 (Fig. 2e), with the SDI (Fig. 1d) directed toward the distal ends of the protein, where the N-terminal region of each monomer resides. This segment comprises ACT-like modules as well as a PAS-like domain arranged in tandem (Fig. 1a) and shows a high degree of flexibility (Fig. 2a). A comparison of the mL-GDH 180 conformers identified by cryo-EM (Fig. 2b) shows that conformational changes in the N-terminal region correlate with alterations in the relative positions of the catalytic domains. Taking into account the known roles of ACT modules in the allosteric control of oligomeric enzymes involved in amino acid metabolism 15 , our findings offer a rationale for previous evidence pointing to the direct regulation of diverse L-GDHs 180 by metabolites 1,2 .
In conclusion, our findings suggest that the N-terminal segment of mL-GDH 180 (as well as in related enzymes) could transduce intracellular metabolic stimuli to the catalytic core by driving changes in the quaternary structure. The reported 3D model of mL-GDH 180 can now frame future studies to dissect the structure-function relationship of this enzyme and other members of the L-GDHs subfamily.

Methods
Protein production and purification. The sequence coding for the L-GDH 180 from M. smegmatis MC 2 -155 (MSMEG_4699, Uniprot A0R1C2) was cloned into vector pLIC-His 23 employing the oligonucleotides Fw: CCAGGGAGCAGCCTCGATGATTCGCCGGCTTTCGG and Rv: GCAAAGCACCGGCCTCGTTACCCAGTCGTTCCGGTCCC. The resulting plasmid was used to produce N-terminally His6-tagged mL-GDH 180 in E. coli cells. Transformed E. coli cells were grown at 37°C in medium supplemented with ampicillin or carbenicillin until reaching 0.8 units of optical density at 600 nm. Protein expression was then induced by adding isopropyl β-D-1-thiogalactopyranoside (IPTG) to a final concentration of 0.5 mM, and the incubation was continued for 18 h at 14°C. Cells were harvested by centrifugation and sonicated. Following clarification by centrifugation, the supernatant was loaded onto a HisTrap HP column (GE Healthcare) equilibrated with buffer 25 mM HEPES, 500 mM NaCl, 20% v/v glycerol, 20 mM imidazole, pH 8.0, and His6-tagged mL-GDH 180 was purified by applying a linear imidazole gradient (20-500 mM). The protein was then further purified by size-exclusion chromatography, as described below. mL-GDH 180 containing fractions, as confirmed by SDS-PAGE and measurements of glutamate dehydrogenase activity 6 , were pooled and used immediately. The protein was quantified by electronic absorption using the molar absorption coefficient of 171,090 M −1 cm −1 , predicted from the amino acid sequence by the ProtParam tool (http://web.expasy.org/protparam/).
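The quantification step above amounts to applying the Beer-Lambert law with the quoted molar absorption coefficient. The sketch below illustrates the arithmetic only; it assumes an absorbance reading at 280 nm (the wavelength for which ProtParam predicts coefficients), a 1 cm path length, and, for the conversion to mg/ml, the nominal 180 kDa subunit mass rather than the exact sequence mass. The function names and the example reading are hypothetical.

```python
EPSILON_280 = 171_090.0  # M^-1 cm^-1, ProtParam value quoted in the text


def molar_concentration(absorbance: float, path_cm: float = 1.0,
                        epsilon: float = EPSILON_280) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and epsilon must be positive")
    return absorbance / (epsilon * path_cm)


def mass_concentration(absorbance: float, mw_da: float,
                       path_cm: float = 1.0) -> float:
    """Concentration in mg/ml (numerically equal to g/L)."""
    return molar_concentration(absorbance, path_cm) * mw_da


# Hypothetical reading: A280 = 0.50 in a 1 cm cuvette.
conc_molar = molar_concentration(0.50)
# ASSUMPTION: nominal subunit mass of 180 kDa, not the exact sequence mass.
conc_mg_ml = mass_concentration(0.50, mw_da=180_000.0)
```

With these assumed inputs the result is in the low micromolar range per subunit; a diluted sample would need the dilution factor applied afterwards.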
For EM and SAXS experiments, native mL-GDH 180 was produced in E. coli BL21(DE3) cells grown in LB broth. Size-exclusion chromatography was performed using a Superose 6 10/300 GL column (GE Healthcare) equilibrated in buffer 20 mM MES, 300 mM NaCl, 5 mM MgCl 2 , pH 6.0. In contrast, Se-Met mL-GDH 180 for crystallographic studies was produced in E. coli B834 (DE3) cells grown in SelenoMethionine Medium Complete (Molecular Dimensions), and size-exclusion [...]

[...] 25 , and the final average included frames 2-15 with a total dose of 28 e − /Å 2 on the sample. The contrast transfer function (CTF) of the micrographs was estimated using CTFFIND4 26 . The particles were automatically selected from the micrographs using autopicking from RELION-3 27 . Evaluation of the quality of particles and selection was performed after 2D classifications with SCIPION 28 and RELION-3 27 software packages. The initial volume for 3D image processing was calculated using common lines in EMAN 29 and using the algorithm 3D-RANSAC 30 . With this initial reference, additional rounds of automated particle picking were performed. An initial data set of 276,704 particles was subjected to 2D and 3D class averaging in order to select the best particles. The 3D classification of the 106,190 final particles with imposed D2 symmetry resulted in two different conformations, a closed (40%) and an open form (60%), with estimated resolutions of 6.6 Å and 4.47 Å, respectively. The set of particles for the open tetramer was further refined after particle polishing in RELION-3 27,31 over dose-weighted frames (total set of 20 frames), resulting in a 3D EM map at 4.19 Å. A focused refinement on the core of the subunits (excluding blurred regions at the tip ends) further improved the resolution to 3.59 Å for a monomer in the open conformation. This refinement focused on single mL-GDH 180 subunits was performed after the alignment of all the monomers following the D2 symmetry, with masked subunits.
Local resolution was estimated using RELION-3 27,31 .
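The particle numbers reported for the cryo-EM processing can be turned into a quick bookkeeping check. The sketch below uses only figures quoted in the text (276,704 initial particles, 106,190 final particles, and the rounded 40%/60% split between conformers); because the percentages are rounded, the per-class counts it derives are approximations and not values reported by the authors.

```python
INITIAL_PARTICLES = 276_704   # particles before 2D/3D class selection
FINAL_PARTICLES = 106_190     # particles entering the final 3D classification
CLASS_FRACTIONS = {"closed": 0.40, "open": 0.60}  # rounded values from the text

# Fraction of the initial data set that survived particle selection.
retained_fraction = FINAL_PARTICLES / INITIAL_PARTICLES

# Approximate particle count per conformer, limited by the rounded percentages.
approx_counts = {
    name: round(fraction * FINAL_PARTICLES)
    for name, fraction in CLASS_FRACTIONS.items()
}
```

Roughly 38% of the picked particles were kept, and the open class holds about half again as many particles as the closed class, which is in line with the better resolution reported for the open conformer.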
Model fitting into cryo-EM maps was performed using the programs UCSF Chimera 32 , Namdinator 33 , phenix.real_space_refine 34 [...]

Negative staining electron microscopy. Negative-stained grids of mL-GDH 180 were prepared using 2% uranyl acetate and visualized on a JEM-1230 transmission electron microscope (JEOL Europe) at an acceleration voltage of 80 kV. Images were taken in low dose conditions at a nominal magnification of ×30,000 using a GATAN CCD camera, resulting in 2.3 Å/pixel sampling.
Labeling of N-terminally His6-tagged mL-GDH 180 was performed by direct incubation of electron microscopy grids in solutions containing 5 nm Ni-NTA-Nanogold (Nanoprobes). Briefly, after glow discharging the grids, the protein was incubated for 1 min on the grids, fixed with 2% paraformaldehyde for 10 min at 4°C, washed for 5 min with PBS, incubated for 15 min with Nanogold diluted 1/75 in PBS, washed twice with PBS, and finally stained with 2% uranyl acetate for 45 s.
Crystallization, X-ray data collection, and structure determination. Crystallization screenings were carried out using the sitting-drop vapor diffusion method and a Mosquito (TTP Labtech) nanoliter-dispensing crystallization robot. Crystals of Se-Met mL-GDH 180 grew after 4-6 months from a 16.5 mg/ml protein solution containing an equimolar amount of GarA from M. tuberculosis, by mixing equal volumes of protein solution and mother liquor (100 mM sodium cacodylate pH 5.8, 12% v/v glycerol, and 1.25 M (NH 4 ) 2 SO 4 ), at 4°C. Single crystals were cryoprotected in mother liquor containing 32% v/v glycerol and flash-frozen in liquid nitrogen. X-ray diffraction data were collected at the synchrotron beamline ID23-1 (European Synchrotron Radiation Facility, Grenoble, France), at 100 K, using wavelength 0.99187 Å. Diffraction data were processed using XDS 36 and scaled with Aimless 37 from the CCP4 program suite 38 .
The crystal structure of Se-Met mL-GDH 180 was solved by molecular replacement using the program Phaser 39 . As search probe we used the atomic coordinates of a model built as follows. First, a poly-Ala model of mL-GDH 180 was obtained from a preliminary ca. 7 Å resolution cryo-EM map of the protein, by employing the program phenix.map_to_model 40 . Features of the catalytic domain in mL-GDH 180 monomers became apparent in the model, suggesting that the N-terminus of the polypeptide chains was located at the tips of the particle. This was confirmed by labeling N-terminally His6-tagged mL-GDH 180 with Ni-NTA-Nanogold (Nanoprobes) and visualizing particles by negative staining electron microscopy. Then, the catalytic domain of mL-GDH 180 (residues 702-1220) was homology-modeled by using the structure of the S-GDH 50 from C. glutamicum (PDB code 5GUD) as template and employing MODELLER 41 as implemented in the HHpred server 42 . One copy of the model of the catalytic domain was rigid-body fitted into the 7 Å cryo-EM map of mL-GDH 180 , which allowed updating the starting poly-Ala model by correcting helical elements and incorporating strands corresponding to the catalytic domain in one monomer of mL-GDH 180 . From this, the D2 tetramer was then rebuilt by applying NCS operators detected by phenix.find_ncs 43 and the model was refined against the 7 Å cryo-EM map using phenix.real_space_refine 34 with NCS and secondary structure restraints. Finally, one of the protein chains in the resulting model was used as search probe to solve the crystal structure of Se-Met mL-GDH 180 by molecular replacement.
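Rebuilding a D2 tetramer from one subunit, as described above, comes down to applying the four rotations of point group D2 to the monomer coordinates. The sketch below is a simplified illustration and not the phenix.find_ncs procedure: it assumes the three perpendicular twofold axes coincide with the x, y, and z coordinate axes and intersect at the origin, whereas operators detected in a real map are general rotations plus translations. All names and the toy coordinates are hypothetical.

```python
from itertools import product

# Rotations of point group D2 with twofold axes along x, y, and z:
# the identity plus one 180-degree rotation about each axis.
D2_OPERATORS = (
    ((1, 0, 0), (0, 1, 0), (0, 0, 1)),     # identity
    ((1, 0, 0), (0, -1, 0), (0, 0, -1)),   # twofold about x
    ((-1, 0, 0), (0, 1, 0), (0, 0, -1)),   # twofold about y
    ((-1, 0, 0), (0, -1, 0), (0, 0, 1)),   # twofold about z
)


def rotate(op, xyz):
    """Apply a 3x3 rotation (tuple of rows) to one coordinate triple."""
    return tuple(sum(op[i][j] * xyz[j] for j in range(3)) for i in range(3))


def compose(a, b):
    """Matrix product a @ b for 3x3 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def is_closed(operators):
    """Check that composing any two operators gives another operator in the set."""
    return all(compose(a, b) in operators for a, b in product(operators, repeat=2))


def build_tetramer(monomer):
    """Return four copies of the monomer, one per D2 operator."""
    return [[rotate(op, atom) for atom in monomer] for op in D2_OPERATORS]


# Toy monomer with two pseudo-atoms (coordinates in Angstrom, hypothetical).
toy_monomer = [(10.0, 5.0, 2.0), (12.0, 6.0, 3.0)]
tetramer = build_tetramer(toy_monomer)
```

The closure check is a cheap way to confirm that a set of operators really forms the symmetry group before it is used to generate symmetry mates.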
Two monomers were placed within the asymmetric unit, which taken together with nearby crystallographic symmetry mates replicate the quaternary structure observed by cryo-EM. After crystallographic refinement using phenix.refine 44,45 with NCS and secondary structure restraints, mFo-DFc and 2mFo-DFc electron density maps displayed rod-shaped electron density peaks that remained unmodeled at this stage and that most likely corresponded to helices in the N-terminal region of mL-GDH 180 . Phase improvement by density modification with RESOLVE 46 provided additional evidence in support of such elements. The N-terminal segment of mL-GDH 180 (residues 1-701) was modeled ab initio using RaptorX 47,48 , one of the top-ranking ab initio structure prediction methods according to recent CASP evaluations 49,50 . RaptorX works by initially estimating residue-residue contacts from residue coevolution patterns and uses the predicted contacts to drive model building; this technique has proven highly successful, especially when integrated with experimental data (multiple examples overviewed in Abriata et al. 51 ). The residue-residue contact map predicted by RaptorX and the models produced from it revealed that the N-terminal segment of mL-GDH 180 comprises an array of contiguous domains, which were subsequently individually rigid-body fitted into the electron density maps. Similarly, the C-terminal domain of mL-GDH 180 (residues 1221-1594) was modeled ab initio employing RaptorX 47,48 and used to correct and complete the crystallographic model. Finally, un-modeled or poorly modeled segments in the CD were manually built employing Coot 35 from a 3.59 Å resolution cryo-EM map obtained for a monomer of mL-GDH 180 .
The structure was then further refined by iterative cycles of manual model building with Coot 35 , used to apply stereochemical restraints, and crystallographic refinement of atomic coordinates and individual B-factors using phenix.refine 44,45 with NCS and secondary structure restraints. The final model contained 93% of the residues within favored regions of the Ramachandran plot and 0.2% of outliers. The crystallographic structure of Se-Met mL-GDH 180 correctly explained the connecting loops and bulky amino acid side chains evidenced for residues 500-1588 by a 4.19-Å cryo-EM map of the protein, which validated the strategy used for model building. Furthermore, the position of Se-Met residues in the crystal structure of Se-Met mL-GDH 180 matched the position of peaks in an anomalous difference map calculated with diffraction data acquired at 0.979338 Å (12.66 keV), the Se K-edge.
Even though Se-Met mL-GDH 180 crystallized in the presence of GarA from M. tuberculosis, electron density maps did not reveal evidence of co-crystallization, and molecular replacement attempts with Phaser 39 using the atomic coordinates of GarA in PDBs 2KFU or 6I2P failed. The evidence of helical elements in all mL-GDH 180 domains allows the presence of GarA (an all-β protein) to be excluded from modeled regions, particularly from those involved in crystal contacts (mL-GDH 180 residues 1-500). The crystallization of a protein from a mixture of two or more proteins is not an unusual phenomenon, and it has even been reported that a protein can crystallize in a different space group due to the presence of other proteins in the sample without giving rise to co-crystals 52 , just to mention one example.
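The wavelength and photon energy quoted for the anomalous data set are related by E = hc/λ. The short sketch below makes that conversion explicit for the two wavelengths mentioned in the Methods; the constant is the standard value of hc expressed in keV·Å, and the function name is hypothetical.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV*Angstrom


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for an X-ray wavelength given in Angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted in the text.
anomalous_energy = photon_energy_kev(0.979338)  # data used for the anomalous map
native_energy = photon_energy_kev(0.99187)      # data used for structure solution
```

The first value reproduces the 12.66 keV stated in the text; the second shows that the data set used for structure solution was collected at a slightly lower energy than the anomalous data set.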
Atomic coordinates and structure factors obtained for Se-Met mL-GDH 180 were deposited in the Protein Data Bank under the accession code 7JSR.
Small angle X-ray scattering. Synchrotron SAXS data were collected at the BioSAXS ID14EH3 beamline (European Synchrotron Radiation Facility, Grenoble, France) and recorded at 15°C using a PILATUS 1M pixel detector (DECTRIS) at a sample-detector distance of 2.43 m and a wavelength of 0.931 Å, resulting in a momentum transfer (s) range of 0.009 to 0.6 Å −1 .
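To relate the quoted momentum transfer range to scattering angles and real-space distances, the sketch below assumes the common definition s = 4π sin(θ)/λ, where 2θ is the scattering angle; the text does not state which definition was used, so this is an assumption. The function names are hypothetical, and 2π/s is used only as a rough indicator of the length scale probed.

```python
import math

WAVELENGTH_A = 0.931     # Angstrom, from the text
S_MIN, S_MAX = 0.009, 0.6  # Angstrom^-1, range quoted in the text


def two_theta_deg(s: float, wavelength: float = WAVELENGTH_A) -> float:
    """Scattering angle 2*theta in degrees, ASSUMING s = 4*pi*sin(theta)/lambda."""
    sin_theta = s * wavelength / (4.0 * math.pi)
    if not 0.0 <= sin_theta <= 1.0:
        raise ValueError("s is outside the physically accessible range")
    return 2.0 * math.degrees(math.asin(sin_theta))


def momentum_transfer(two_theta: float, wavelength: float = WAVELENGTH_A) -> float:
    """Inverse of two_theta_deg: s in Angstrom^-1 from 2*theta in degrees."""
    return 4.0 * math.pi * math.sin(math.radians(two_theta / 2.0)) / wavelength


def real_space_distance(s: float) -> float:
    """Approximate real-space length scale (Angstrom) probed at a given s."""
    return 2.0 * math.pi / s


largest_distance = real_space_distance(S_MIN)   # length scale at the lowest s
smallest_distance = real_space_distance(S_MAX)  # length scale at the highest s
```

Under this definition the data span length scales from several hundred ångströms down to about ten, which covers both the overall particle dimensions and domain-level features.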
mL-GDH 180 was assayed at concentrations ranging from 1 to 14 mg/ml in buffer 25 mM Tris, 150 mM NaCl, pH 7.5. For the buffer and the samples, 10 2D scattering images were acquired and processed to obtain radially averaged 1D curves of normalized intensity versus scattering angle. In order to optimize background subtraction, buffer scattering profiles recorded before and after measuring every sample were averaged. Then, for each protein sample, the contribution of the buffer was subtracted. All subsequent data processing was performed using the ATSAS suite 53 .
Average scattering curves corresponding to different protein concentrations were compared using PRIMUS 53,54 . To obtain the idealized scattering curve, the low s region of the most diluted sample and the high s region of the most concentrated sample were merged. The values of the forward scattering intensity I(0), the radius of gyration R g as well as the dimensionless Kratky plot were calculated using PRIMUS 53,54 . Guinier plots of independent average scattering curves evidenced a constant R g at different protein concentrations. The Porod volume was estimated using DATPOROD 53 and an s max value equal to 7.5/R g . The pairwise distance distribution function p(r) and the maximum particle dimension D max were calculated using GNOM 53,55 with a reduced χ 2 value of 1.07 for curve fitting. After running DAMMIN 53,56 , the excluded volume was estimated as V ex = (volume of a single dummy atom × number of dummy atoms)/0.74. Finally, the MW was estimated from the Porod volume and the excluded volume.
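The excluded-volume estimate and the s max cutoff described above are simple enough to write out. The sketch below assumes spherical dummy atoms of a given radius, so that the volume of one dummy atom is (4/3)πr³, and takes the divisor 0.74 directly from the text. The dummy-atom radius, the number of dummy atoms, and the R g value used in the example are hypothetical placeholders and not values from this study, and the conversion from volume to MW is not reproduced because the text does not give the factors used.

```python
import math

PACKING_DIVISOR = 0.74  # divisor quoted in the text for the bead model


def dummy_atom_volume(radius: float) -> float:
    """Volume of one spherical dummy atom, in cubic Angstrom (ASSUMES spheres)."""
    return (4.0 / 3.0) * math.pi * radius ** 3


def excluded_volume(radius: float, n_atoms: int) -> float:
    """V_ex = (volume of a single dummy atom * number of dummy atoms) / 0.74."""
    if radius <= 0 or n_atoms <= 0:
        raise ValueError("radius and number of dummy atoms must be positive")
    return dummy_atom_volume(radius) * n_atoms / PACKING_DIVISOR


def porod_s_max(rg: float) -> float:
    """Upper s limit used for the Porod volume estimate: s_max = 7.5 / R_g."""
    if rg <= 0:
        raise ValueError("R_g must be positive")
    return 7.5 / rg


# Hypothetical inputs, for illustration only.
example_vex = excluded_volume(radius=5.0, n_atoms=2000)
example_s_max = porod_s_max(rg=75.0)
```

Dividing by 0.74 enlarges the summed bead volume to account for the empty space between spheres in the model, which is why V ex is always larger than the bare sum of dummy-atom volumes.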
Statistics and reproducibility. One protein crystal was employed for structure determination. For EM and SAXS studies, a protein batch was prepared in each case immediately before the experiment. No data were excluded from the analyses. All attempts of replication were successful.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.