Structural and mechanistic insights into the bifunctional HISN2 enzyme catalyzing the second and third steps of histidine biosynthesis in plants

The second and third steps of the histidine biosynthetic pathway (HBP) in plants are catalyzed by a bifunctional enzyme–HISN2. The enzyme consists of two distinct domains, active respectively as a phosphoribosyl-AMP cyclohydrolase (PRA-CH) and phosphoribosyl-ATP pyrophosphatase (PRA-PH). The domains are analogous to single-domain enzymes encoded by bacterial hisI and hisE genes, respectively. The calculated sequence similarity networks between HISN2 analogs from prokaryotes and eukaryotes suggest that the plant enzymes are closest relatives of those in the class of Deltaproteobacteria. In this work, we obtained crystal structures of HISN2 enzyme from Medicago truncatula (MtHISN2) and described its architecture and interactions with AMP. The AMP molecule bound to the PRA-PH domain shows positioning of the N1-phosphoribosyl relevant to catalysis. AMP bound to the PRA-CH domain mimics a part of the substrate, giving insights into the reaction mechanism. The latter interaction also arises as a possible second-tier regulatory mechanism of the HBP flux, as indicated by inhibition assays and isothermal titration calorimetry.


Results and discussion
Phylogenetic analysis suggests the evolutionary origin of plant HISN2 sequences. We have analyzed 53 111 available sequences assigned to InterPro families IPR008179, IPR021130, IPR002496, and IPR038019 to assess the sequence similarity between prokaryotic and eukaryotic HISN2-equivalent enzymes and trace the evolution of plant HISN2 proteins. The analysis suggests that plant bifunctional enzymes derive from the Myxococcales order in the class of Deltaproteobacteria (Fig. 2). Fungal trifunctional proteins (HIS4 in yeast) with PRA-PH, PRA-CH, and HDH (histidinol dehydrogenase) activities also derive from orders close to Myxococcales. Moreover, sequences from some Gammaproteobacteria and Spirochaetia of PRA-PH, PRA-CH, and ProFAR isomerase activities seem to derive from a similar common ancestor. Multifunctional enzymes permit an optimal yield of gene expression without a need for additional transcription regulation, as noted in the genetic history of the HBP 16. Aside from the multifunctional enzymes, most bacterial classes like Alpha-, Beta-, Gamma-, and Deltaproteobacteria, Actinobacteria, Flavobacteria, Cytophagia, and Opitutae express single-activity enzymes. Monofunctional enzymes are also common in the superkingdom of Archaea; however, there is a small group of archaeal species with bifunctional enzymes (Fig. 2).
As recently reported by Del Duca et al. 23 , gene elongation was a leading mechanism in the evolution of hisA, hisF, hisB, and hisD histidine biosynthetic genes. The hypothesis for their evolution was confirmed by high sequence similarities between two halves of the proteins and by structural and biochemical studies. Since sequences of the four enzymes encoded by those genes are highly conserved in prokaryotic and eukaryotic organisms, it is most likely that the gene elongation occurred in the early stage of HBP evolution, before the Last Universal Common Ancestor 23 . The diversity in hisI/E (bacteria), HIS4 (fungi), and HISN2 (plants) may be another example of the importance of the gene elongation and duplication that occurred at the very early stage of the HBP evolution.
The overall structure of MtHISN2: a dimeric enzyme with discrete and directly interacting pyrophosphohydrolase and cyclohydrolase domains. The complete sequence of MtHISN2 contains 283 amino acid residues (UniProt ID 24 : A0A072U2X9; Gene: 25498966). All plant enzymes of the HBP are encoded by the genomic DNA and contain N-terminal chloroplast-targeting signal peptides 22 . In MtHISN2, bioinformatic analysis with TargetP 25 suggested the signal peptide encompasses approx. forty N-terminal residues. In A. thaliana HISN2, the target peptide spans fifty residues (UniProt ID: O82768). We designed the construct to include sequence conserved in plant species; hence our final construct starts from Val49, preceded by a linker tripeptide, Ser-Asn-Ala.
The X-ray structure of MtHISN2 was solved by experimental phasing using single-wavelength anomalous dispersion (SAD) on zinc cations bound to the protein. The unliganded protein (with metals) crystallized in the C2 space group (Table 1) with two protein chains in the asymmetric unit (ASU). The MtHISN2-AMP complex also crystallized in the C2 space group but with different unit cell parameters (Table 1) and six protein chains (three dimers) in the ASU. The obtained electron density maps allowed us to trace most of the protein chain unambiguously, except for up to eighteen C-terminal residues and fragments within residues 157-165 and 186-194 (model- and chain-dependent) that were disordered.
MtHISN2 forms a tight dimer of 26.4 kDa subunits (Fig. 3A), sharing a ~4000 Å² interface, according to PISA analysis 26,27. The dimeric form is consistent with the size-exclusion elution profile (not shown). The dimer's surface area is ~20,000 Å² and is negatively charged (Fig. 3B), agreeing with the calculated pI of 5.3. The negative charge suggests that metal cations play an important role in interactions with negatively charged, phosphate-containing substrates, PR-ATP and PR-AMP. The enzyme dimer is formed by two mutually swapped polypeptide chains, forming a bilobal protein in which each domain forms one lobe (Fig. 3). By sequence analogy to corresponding enzymes from bacteria and other plant species, those domains catalyze PRA-PH and PRA-CH reactions (Fig. 1). The PRA-CH domain is located at the N-terminus, spanning residues 49-158 (Fig. 3). The PRA-PH domain lies at the C-terminus, ranging from residues 172-283. It must be noted here that in this article we treat a domain as a complete and functional dimeric entity with two active sites. The existence of a monomeric form of either PRA-PH or PRA-CH domain is highly improbable as it would expose vast hydrophobic regions. In Arabidopsis, both domains, apparently as dimers, were shown to be functionally independent, even when expressed separately 28.
The PRA-PH domain consists of two overlapping and swapped protein chains built entirely of α-helices connected by loops. Each chain of the domain contributes five α-helices (α4-α8, Fig. 3A). Helices α6 and α7 form a four-helix bundle with their counterparts from the dimer mate, α6* and α7* (an asterisk denotes an element from the other subunit within the dimer). Helices α6 and α7 contain the PRA-PH active sites, defined near the metal-binding sites 1 and 2 (MBSs, Fig. 4A). Except for the bundle consisting of the four longest helices, there are short helices α4 and α8 and their counterparts α4*, α8* that overlap on top of each other, creating a tight chain swap. The swap separates the four-helix bundle from the PRA-CH domain.
In general, PRA-PH enzymes are Mg²⁺-dependent 29. However, the MtHISN2 crystals could only be grown in the presence of a low concentration of Zn²⁺ in addition to Mg²⁺. Zinc often binds to proteins at non-specific sites or at sites naturally binding other metals 30, which likely was the case here. Thus, we decided to use a more general term, MBSs, in this work to avoid confusion. There are two unique MBSs in the PRA-PH domain. MBS1 contains Zn²⁺ coordinated by two carboxyl oxygen atoms of Glu220 and one carboxyl oxygen of Glu217 (Fig. 4A). In MBS2, Zn²⁺ is tetrahedrally coordinated by carboxylic groups of Glu214, Glu234, Asp237, and a water molecule. In some subunits, Glu217 also participates in Zn²⁺ coordination in MBS2, resulting in the disappearance of MBS1. Because metal at MBS1 was absent in some subunits in our structures, only MBS2 may be catalytically relevant. Also, it is very likely that in vivo Mg²⁺ cations (not Zn²⁺) occupy MBS2, as magnesium, not zinc, is required for PRA-PH activity 31.
The PRA-CH domain also consists of two overlapping chains but has an entirely different structure (Fig. 3A). The domain connects with the PRA-PH domain via two long loops consisting of twelve residues (159-171), each belonging to one chain. The core of the PRA-CH domain is made of β-strands and α-helices forming the so-called barrelizing β-grasp fold (β-GF), wherein the β-sheet "grasps" an α-helix in a fasciclin-like assemblage 32. There are many kinds of β-GF, but all of them share a similar topology, where β-strands form a mixed β-sheet surrounding a helix (α2 in MtHISN2). The most characteristic feature of the core four-stranded β-sheet is that the flanking strands are parallel to each other, while the two middle strands are anti-parallel to the flanking strands. This means that the first and the last strands (by sequence) are located in the central part of the sheet with a crossover via an α-helical fragment.
The variety of unrelated proteins in which the β-GF has been found indicates that, despite its relatively small size, the β-GF is a multifunctional scaffold suited for small-molecule binding (PR-AMP in this case). In MtHISN2, a β-strand is followed by a helix and a loop that together form a super-secondary motif responsible for the domain swap. The β-sheet is connected with the motif via a long loop that spans residues 138-149 and contains residues coordinating MBS4 and MBS5 (see Fig. 4B).
In our structures, the PRA-CH domains contain two or three (model- and subunit-dependent) MBSs that bind metal cations through conserved aspartate (Asp125*, Asp127*, and Asp129* in MtHISN2), cysteine (Cys126*, Cys142, Cys149), and histidine (His143) residues (Fig. 4B). As noted by D'Ordine et al. 33, corresponding residues are universally conserved in cyclohydrolases. In the PRA-CH structure from Methanobacterium thermoautotrophicum, the aspartate residues (Asp85, Asp87, and Asp89) coordinated Cd²⁺ in a site corresponding to the MBS3 of MtHISN2 (Asp125, Asp127, and Asp129, respectively), where Zn²⁺ was bound 33. The MBS3 is formed by the carboxylic groups of Asp125*, Asp127*, Asp129* and by two water molecules in the shape of a trigonal bipyramid. In the next site, MBS4, Zn²⁺ is coordinated tetrahedrally by two water molecules and Nε of His143 and by the thiol of Cys126*. However, we did not observe a metal cation bound at MBS4 in the MtHISN2-AMP complex, suggesting that a metal bound to the MBS4 can either promote substrate binding or may not be physiologically relevant. Lastly, Zn²⁺ bound in the MBS5 is coordinated by thiols of Cys126*, Cys142, and Cys149.

Table 1. Diffraction data and refinement statistics (columns: MtHISN2, MtHISN2-AMP).

Structural alignment of MtHISN2 and its bifunctional bacterial counterpart reveals differences in the enzyme architecture while individual domains are similar. Structural comparisons of bacterial PRA-PH enzymes revealed high structural similarity, despite low sequence identities 29. For instance, sequence identity as low as 31% between HisE from Mycobacterium tuberculosis (MtbHisE, PDB ID: 1Y6X) and Chromobacterium violaceum HisE (CvHisE, 2A7W) still results in a very similar three-dimensional structure. BLAST sequence alignment between MtbHisE and MtHISN2 shows no significant similarity; however, both proteins share similar architecture in secondary structure topology, chain swaps, and the four-helix bundle. The structural similarity despite relatively low sequence identity applies to the PRA-CH domain as well, as reflected by the RMSD of 0.68 Å between the MtHISN2 PRA-CH domain and HisI of M. thermoautotrophicum, sharing sequence identity of 40%. As pointed out by D'Ordine et al., alignment between archaeal, bacterial, and eukaryotic sequences, e.g., M. thermoautotrophicum, E. coli, S. cerevisiae, reveals that some residues are highly conserved among PR-AMP cyclohydrolases 33, which is consistent with their role in metal coordination also in MtHISN2.
So far, the only structure of a bifunctional HisIE enzyme has been determined for Shigella flexneri SfHisIE (PDB ID: 6J2L) 35 . Sequences of MtHISN2 and SfHisIE share 35% identity and 51% similarity, which indicates relatively low conservation. However, SfHisIE has a similar topology to MtHISN2 and lacks only the β7 strand and the α5 helix (in MtHISN2 topology). The SfHisIE sequence has three gaps, corresponding to residues 159-171, 185-188, 223-225 in MtHISN2 (Fig. 5A).
Although MtHISN2 and SfHisIE are somewhat distant homologs, their structural alignment reveals significant similarity in both individual PRA-PH (RMSD of 0.90 Å) and PRA-CH domains (RMSD of 0.84 Å). For instance, the PRA-CH active sites of SfHisIE and MtHISN2 share a very similar architecture (Fig. 5B). However, significant differences arise from the comparison of the entire enzyme molecules. When the PRA-CH domains are superposed, relative rotations of the PRA-PH domains, measured using the axis of the α4 helix, differ by ~40° (Fig. 5C). Another major difference is the presence of a super-secondary strand-helix-loop motif near the domain-domain interface in the plant enzyme. It encompasses residues 150-172 of the MtHISN2 sequence, which correspond to 105-110 in SfHisIE. In MtHISN2, it is involved in domain swapping by mutually overlapping corresponding chains, whereas SfHisIE lacks that motif entirely (Fig. 5C). In summary, most differences between MtHISN2 and SfHisIE appear near or at the inter-domain junction.
The architecture of MtHISN2 indicates that the PR-AMP intermediate is released between the two catalytic events. The protein structure was investigated using the CAVER 3.0 PyMOL plugin 36 in the context of possible tunnels that may connect active sites of PRA-PH and PRA-CH domains to shuttle the PR-AMP intermediate. Such tunnels are common in hydrolases, including two-domain hydrolases 37-39. In MtHISN2, none of the detected tunnels would allow the transport of molecules, even as small as water, between the catalytic sites. We note that in some cases, such tunnels appear after binding of small molecules that change the overall shape of a protein; however, (i) we did not detect any conformational changes in the enzyme, and (ii) the narrow fragment between the domains is only ~15 Å in diameter. This excludes the possibility of moving the PR-AMP intermediate between the catalytic sites. Because the catalytic sites of both domains are > 40 Å apart, PR-AMP must diffuse to the solvent (chloroplast stroma) after pyrophosphate cleavage to reach the PRA-CH domain. This also means that after being produced by the PRA-PH domain, PR-AMP molecules may be processed further by a PRA-CH domain in a different enzyme molecule.

AMP binding to the PRA-PH domain: positioning of the PR-ATP N1-phosphoribosyl. Our MtHISN2-AMP complex showed that the enzyme active sites are adapted to bind nucleotides despite the lack of super-secondary structures typical for such specificity. More precisely, there are no Rossmann-fold motifs, often associated with cofactors like FAD, NAD⁺, and NADP⁺, or Walker motifs, commonly present in ATP-binding proteins 40,41. The previous analysis of SfHisIE also did not reveal Rossmann fold and Walker motifs 35. In the MtHISN2-AMP complex, AMP molecules were found near MBSs in both domains, PRA-PH and PRA-CH. For clarity, representative AMP molecules with the lowest B-factors are described. AMP bound in the PRA-PH domain formed hydrogen bonds through the phosphate moiety and the adenine ring (Fig. 6A). The guanidine group of Arg183 formed polar hydrogen bonds with one oxygen of the phosphate. The second oxygen of the phosphate group interacted with the hydroxyl groups of Ser195 and Thr197 and with the backbone amide of Thr197. The backbone amide of Trp196 contacted the third phosphate oxygen. The adenine N1 atom interacted with the Arg263 guanidine group. We also observed the π-π stacking between the adenine ring and the Tyr240 side chain; the approximate inter-ring distance was 3.6 Å (Fig. 6A).
In that context, we note that AMP bound to the PRA-PH domain in our structure most likely does not represent a part of PR-ATP (substrate) or PR-AMP (product). This conclusion is based on the orientation of the AMP phosphate group pointing away from the metal center (MBS1-2) and interacting with the guanidine group of Arg183 instead. In contrast, the ATP fragment of PR-ATP should have its triphosphate group near the metal center for the hydrolysis to occur. To gain more insights, we utilized two in silico methods in parallel. We analyzed putative phosphate-binding regions in the MtHISN2 structure using Nucleos 42. It indicated that more phosphate groups (e.g., triphosphate) could bind near the MBS1-2 sites rather than near Arg183 (Fig. 6B). Molecular docking of PR-ATP with AutoDock Vina was consistent with the Nucleos results (Fig. 6B). The proposed orientation of the adenine ring of PR-ATP was rotated by ~180° in the ring's plane relative to the AMP pose in the MtHISN2-AMP complex. This means that the binding of AMP to the PRA-PH domain in our MtHISN2 complex apparently shows the positioning of the N1-phosphoribosyl of PR-ATP and the plane of its adenine ring.
AMP binding to the PRA-CH domain: an update to the catalytic mechanism. The second AMP binding site was located within the PRA-CH domain (Fig. 6C). The phosphate moiety formed an extensive network of hydrogen bonds with surrounding residues. The phosphate O1 atom bound to Nε of Trp107 and the backbone N of Gly110. The O2 atom interacted with the hydroxyl group and the backbone N of Ser113 and the hydroxyl group and backbone N of Thr112. The O3 atom was bound to the hydroxyl group of Ser113, the amine group of Lys109, and a water molecule. Moreover, the adenine N6 atom interacted with the carbonyl of Thr141, whereas N7 H-bonded with the amine group of Lys109. We also observed edge-to-face interaction between the aromatic rings of the adenine and Trp107, with ≈ 3.5 Å distance and angle ω ≈ 45°. As reported by D'Ordine et al. 33 , the in silico docking of PR-AMP to the PRA-CH enzyme from M. thermoautotrophicum indicated that the substrate molecule in the active site is bound mainly by eighteen residues of which sixteen are conserved, and one is preserved in all PR-AMP cyclohydrolases 33,43 . The authors proposed two phosphate-binding regions, (i) Ser60, Thr61 and Ser62 (Ser100, Arg101, Ser102 in MtHISN2) for the N9-phosphoribosyl, and (ii) Glu71, Ser72 and Ser73 for the N1-phosphoribosyl (Glu111, Thr112, Ser113 in MtHISN2). Another interaction predicted by the authors to assist in substrate recognition is edge-to-face interaction between the adenine ring and Trp67 (Trp107 in MtHISN2). The N9 ribosyl group was proposed to interact with Mg 2+ and Arg15, which has no corresponding residue in MtHISN2. His110 (His143 in MtHISN2) was predicted to have a role in catalysis and π-stacking with the incoming substrate molecule. In terms of N1-and N9-phosphoribosyl orientations, a similar model has been reported by Wang et al. 35 , who also used in silico PR-AMP docking.
The AMP position in our MtHISN2-AMP complex does not agree with the previously presented in silico models. Nevertheless, the MtHISN2-AMP complex is the first experimental structure showing (at least) a part of the PR-AMP substrate in the PRA-CH active site. In the MtHISN2-AMP complex, N9-phosphoribosyl interacts with the region formed by residues ¹⁰⁷WTKGETS¹¹³, suggesting that the PR-AMP pose would be rotated by ~180° in the adenine ring plane, compared to the model by D'Ordine et al. 33. In consequence, the region formed by residues ¹⁰⁰SRS¹⁰² likely interacts with the N1-phosphoribosyl. It is also possible that MBS3 plays a role in binding the N1-phosphoribosyl, especially since Mg²⁺ bound to the corresponding site was essential for the activity of other PRA-CH enzymes 33,44. Our AMP pose with the N6 atom pointing towards the protein core (and not the solvent) agrees with the complexes of adenosine deaminases, a family of Zn²⁺-dependent hydrolases acting on adenosine-like substrates 45,46. We must also note that we observed C2′-endo ribose in the MtHISN2-AMP complex, meaning that even AMP, lacking the N1-phosphoribosyl, already binds "contracted" to the PRA-CH active site. D'Ordine et al. acknowledged that dealing with the flexibility of ribose rings was a big challenge during docking 33. In our docking experiments, PR-AMP was bound to the PRA-CH domain (Fig. 6D) in a pose that is compatible with that of AMP in the MtHISN2-AMP complex (Fig. 6B,D).
Thanks to the conserved three-cysteine active site (Cys142, Cys149, and Cys126*, MBS5), the general PRA-CH mechanism may be adopted from other reports 33,47 and updated by the experimental position of AMP, which mimics a part of PR-AMP (Fig. 6C,D). First, PR-AMP is oriented in the catalytic pocket by the two phosphate-binding regions, namely (i) N1-phosphoribosyl orients towards ¹⁰⁰SRS¹⁰² and/or Mg²⁺ coordinated by Asp125*, Asp127*, and Asp129*, while (ii) N9-phosphoribosyl is attracted to ¹⁰⁷WTKGETS¹¹³. The adenine moiety is secured by a hydrogen bond between its N7 atom and Nζ amine of Lys109 and by the edge-to-face interaction with Trp107. The nucleophilic water molecule in the Zn²⁺ coordination sphere (MBS5) is activated by His143, acting as a general base. A metal cation (MBS4) may play a role in priming His143; in the unliganded MtHISN2 structure, His143 does not bind a water molecule but instead is in the MBS4 coordination sphere (Fig. 4B). The activated water molecule (or rather a hydroxyl anion) performs a nucleophilic attack on the purine C6 atom, breaking the N1-C6 bond. Distances observed in the MtHISN2-AMP complex, Zn²⁺…H₂O of 2.4 Å, Nδ of His143…H₂O of 3.0 Å, and H₂O…C6 of 3.1 Å, are consistent with this mechanism. The role of the His143 as the general base is supported by the lack of detectable activity of the H143E mutant, while a (weaker) binding of PR-AMP may still occur, as deduced from the Kd for AMP of 68 μM (Fig. 7). Moreover, the environment of the active site pocket suggests that the optimal positioning of N1-phosphoribosyl may stretch the substrate, aiding the ring hydrolysis (Fig. 6D).

AMP is an inhibitor of the PRA-CH domain of MtHISN2 at physiologically-relevant concentrations. AMP is an activity regulator of plant HISN1 enzymes and their counterparts from other kingdoms of life. Although it has been shown that AMP alone does not exhibit an inhibitory effect on MtHISN1, it significantly increases sensitivity to feedback regulation by free histidine 18. However, so far, there have been no indications that other HBP enzymes could be regulated by AMP. In this work, MtHISN2 inhibition by AMP was assayed using PR-ATP produced enzymatically, as PR-ATP is commercially unavailable. The PR-ATP production, prior to MtHISN2 measurements, was monitored spectrophotometrically (at 290 nm, Fig. 7A). That mixture was then used to trigger AMP inhibition assays with MtHISN2, in which the PR-ATP concentration was 18 μM, so that absorbance changes (at 300 nm) could be monitored 44. Since the ATP-PRT enzyme was still present in the MtHISN2 reaction mixture, we also cross-validated the assay by including free histidine (at 100 μM), known to inhibit ATP-PRTs. We observed that 100 μM AMP caused over 60% inhibition. It must also be noted here that the AMP concentration, for instance, in maize chloroplasts ranges from 40 μM to 260 μM 48. This puts the MtHISN2-AMP interaction forward as a possible secondary regulation mechanism of the HBP flux. Unfortunately, to our knowledge there are no data on the PR-ATP concentration in vivo. Notwithstanding, the 18 μM concentration used in our assay may even be exaggerated, as PR-ATP is readily processed by the HBP. Interestingly, when both AMP and histidine were present, the MtHISN2 inhibition was mitigated to 41% (Fig. 7B). Because ATP-PRT enzymes bind AMP in the presence of histidine, the pool of AMP available to bind to MtHISN2 decreases, providing the most likely explanation for this phenomenon. The control sample, without MtHISN2, excluded the impact of the ATP-PRT reaction on the observed absorbance change at 300 nm at the moment of the HISN2 reactions, which were run simultaneously.
AMP interaction with MtHISN2 in solution was further investigated using isothermal titration calorimetry (ITC). Our data show that AMP binding to MtHISN2 (Fig. 7D) is characterized by the Kd value of 47 ± 6 µM and stoichiometry N = 1. Thermodynamic parameters are ΔH = -3352 ± 324 cal/mol and ΔS = 8.6 cal/mol/deg. To deduce whether the obtained Kd can be attributed to AMP binding to the PRA-PH or to the PRA-CH domain, we performed ITC experiments on point mutants of MtHISN2. Four mutants within the PRA-CH domain (K109A, T112V, S113A, and H143E) and three within the PRA-PH domain (R183E, T197V, and Y240T) were tested and the results are shown in Fig. 7E. The results clearly indicate that the AMP binding affinity is lowered in the case of PRA-CH domain mutants. Moreover, these mutations significantly lower the heat effect of AMP binding in comparison with PRA-PH domain mutants (Fig. 7E). These two observations indicate that AMP binding to the PRA-CH domain is driven by enthalpy and thus can be measured by ITC. We also cannot exclude an auxiliary impact of AMP binding to the PRA-PH domain on the overall MtHISN2 activity.
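The reported ITC parameters can be cross-checked against each other, because the binding free energy follows both from the dissociation constant (ΔG = RT ln Kd, with a 1 M standard state) and from ΔG = ΔH − TΔS. The short Python sketch below is only an illustrative consistency check and is not part of the original analysis; it uses the values quoted above (Kd = 47 µM, ΔH = −3352 cal/mol, ΔS = 8.6 cal/mol/deg) and the 298 K measurement temperature given in the methods. The function and variable names are hypothetical.

```python
import math

R_CAL = 1.987204  # gas constant in cal/(mol*K)


def delta_g_from_kd(kd_molar: float, temperature_k: float) -> float:
    """Binding free energy (cal/mol) from Kd, assuming a 1 M standard state."""
    return R_CAL * temperature_k * math.log(kd_molar)


def delta_g_from_h_and_s(delta_h: float, delta_s: float, temperature_k: float) -> float:
    """Binding free energy (cal/mol) from enthalpy and entropy changes."""
    return delta_h - temperature_k * delta_s


T = 298.0                      # K, ITC measurement temperature
dg_kd = delta_g_from_kd(47e-6, T)
dg_hs = delta_g_from_h_and_s(-3352.0, 8.6, T)
enthalpic_fraction = -3352.0 / dg_hs  # share of the free energy that is enthalpic

print(f"dG from Kd:        {dg_kd:.0f} cal/mol")
print(f"dG from dH - T*dS: {dg_hs:.0f} cal/mol")
print(f"enthalpic share:   {enthalpic_fraction:.2f}")
```

Both routes give roughly −5.9 kcal/mol, so the quoted Kd, ΔH, and ΔS are mutually consistent, and more than half of the binding free energy is enthalpic.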

Conclusions and outlook
This article is the fifth in a series of papers that show the structures of plant HBP enzymes. Previous structures were reported for: HISN1 18 , HISN5 49 , HISN7 50 , and HISN8 51 . In this work, we experimentally solved the structure of the HISN2 enzyme from the model legume, Medicago truncatula using X-ray diffraction data. The bifunctional MtHISN2, with distinct PRA-PH and PRA-CH domains, showed significantly different relative orientation of the domains than in bacterial enzymes. Comparing bacterial and plant enzymes shed new light on the possible design of small-molecule inhibitors as potential antibiotics or herbicides. In this perspective, HisI, HisE, (or HisIE), homologs of fungal HIS4, and plant HISN2 enzymes may arise as promising molecular targets. If one wants to target bacterial or plant enzymes specifically, regions other than the conserved active sites appear most auspicious. The proposed insights into the regulation and catalytic mechanism provide groundwork for the design of HISN2 inhibitors, in addition to bringing a deeper comprehension of the plant HBP.
MtHISN2 interacts with AMP, as shown by our complex crystal structure, inhibition assays, and ITC experiments, which indicated that MtHISN2 activity regulation occurs in a physiologically-relevant range of AMP concentration. This way, the HBP flux can be tightly controlled on two steps, catalyzed by HISN1 and HISN2 enzymes. The need to control the HBP flux rises from a high metabolic cost of the pathway, estimated as equivalent to over thirty ATP molecules 52 . The HBP is at the same time the only pathway of amino acid biosynthesis that utilizes carbon and nitrogen directly from ATP. As fluctuations of the AMP/ATP ratio reflect the cell metabolic status, an AMP-based control can regulate resource consumption by the HBP.

Materials and methods
Cloning, expression, and purification. The total RNA was isolated from young M. truncatula leaves using the RNeasy Plant Mini Kit (Qiagen). The following reverse transcription with oligo dT18 primer yielded the complementary DNA (cDNA). The chloroplast-targeting peptide was recognized using the TargetP 1.1 server 25,53, and the produced construct was N-truncated at Val49. The desired fragment was amplified by polymerase chain reaction; primers used in this work are given in Table 2. The expression plasmid, based on the pMCSG68 backbone (Midwest Center for Structural Genomics), was created by the ligase-independent cloning method 54. Mutagenic substitutions were conducted using the Polymerase Incomplete Primer Extension (PIPE) method 55 on the wild-type MtHISN2 expression plasmid as a template and primers listed in Table 2. Correctness of all inserts was confirmed by DNA sequencing. Overexpression was carried out in BL21 Gold E. coli cells (Agilent Technologies) in LB media with 150 μg/mL ampicillin. After incubation with shaking at 190 rpm at 37 °C until the A600 reached 1.0, the cultures were chilled to 18 °C, and isopropyl-D-thiogalactopyranoside was added at a final concentration of 0.5 mM to start overexpression, which went on for 18 h. The cells from the 2-L culture were pelleted by centrifugation at 3500×g for 20 min at 4 °C, resuspended in 35 mL of binding buffer [50 mM Hepes-NaOH pH 7.5; 500 mM NaCl; 20 mM imidazole; 2 mM tris(2-carboxyethyl)phosphine (TCEP)], and stored at −80 °C for purification.

Table 2. Primer sequences used in this work.

Primer name     Sequence
MtHISN2-WT-F    TAC TTC CAA TCC AAT GCC GTA GAC TCA TTG TTG GAC AGT GTA AAATG
MtHISN2-WT-R    TTA TCC ACT TCC AAT GTT ATC AAT TTT CCA CCG ATT TCT GGG TTGG
K109A-F         GTT GTG GAC CGC GGG AGA GAC CTC CAA TAA TTT CAT CAA TGT C
K109A-R         GTC TCT CCC GCG GTC CAC AAC GAT GAT CGT GACC
T112V-F         GGA GAG GTG TCC AAT AAT TTC ATC AAT GTC CAT GAT GTC
T112V-R         GAA ATT ATT GGA CAC CTC TCC TTT

The cells were disrupted by sonication (4 min with intervals for cooling), and the cell debris was removed by centrifugation at 25,000×g for 30 min at 4 °C. The supernatant was mixed with 3 mL of HisTrap HP resin (GE Healthcare) in a column on the VacMan setup (Promega). The resin-bound protein was washed five times with the binding buffer and eluted with 20 mL of elution buffer (50 mM Hepes-NaOH pH 7.5; 500 mM NaCl; 400 mM imidazole; 2 mM TCEP). The His 6 -tag was cleaved with TEV protease (at final concentration 0.2 mg/ mL) overnight, simultaneously with dialysis to lower the imidazole concentration to 20 mM. The second run through the HisTrap resin resulted in pure MtHISN2 in the flow-through to which ZnCl 2 was added at 100 µM final concentration. The sample was concentrated to 2.4 mL and loaded on a HiLoad Superdex 200 16/60 column (GE Healthcare), equilibrated with buffer: 25 mM Hepes-NaOH pH 7.5, 100 mM KCl, 50 mM NaCl, 100 µM ZnCl 2 , and 1 mM TCEP. The protein was then concentrated and used for crystallization or functional assays.
Crystallization, X-ray data collection, and processing. MtHISN2 was crystallized using the vapor diffusion method. The protein concentration was 10 mg/ml, as determined by A280 measurement (molar extinction coefficient, ε, of 43,430 M⁻¹·cm⁻¹). The unliganded structure results from crystals (hanging-drop) obtained by mixing 4 μl of the protein solution and 2 μl of 60% Morpheus D1 condition (Molecular Dimensions) 56 1 μl mixtures). The crystals appeared in A11 condition (0.2 M potassium iodide, 20% polyethylene glycol 3350). Immediately before crystal harvesting, 1 μl of PEG/Ion A11 condition with 50% of glycerol was added to the drop. All crystals were vitrified in liquid nitrogen and stored for synchrotron data collection.
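The concentration determination by A280 follows the Beer-Lambert law, A = ε·c·l. The Python sketch below illustrates the conversion between absorbance and mass concentration under stated assumptions: a 1 cm path length and a molar mass of 26,400 g/mol (the 26.4 kDa subunit mass quoted in the Results, taken here as the mass of the crystallized construct). Neither assumption is stated explicitly for this measurement, and the function names are hypothetical.

```python
EPSILON_280 = 43_430.0   # M^-1 cm^-1, extinction coefficient given in the text
MOLAR_MASS = 26_400.0    # g/mol, ASSUMPTION: 26.4 kDa subunit mass from the Results
PATH_CM = 1.0            # cm, ASSUMPTION: standard cuvette path length


def mg_per_ml_from_a280(a280: float, dilution_factor: float = 1.0) -> float:
    """Protein concentration (mg/ml) from a measured A280 (Beer-Lambert law)."""
    molar = a280 * dilution_factor / (EPSILON_280 * PATH_CM)  # mol/L
    return molar * MOLAR_MASS  # g/L is numerically equal to mg/ml


def a280_from_mg_per_ml(mg_per_ml: float) -> float:
    """Expected undiluted A280 for a given mass concentration."""
    return mg_per_ml / MOLAR_MASS * EPSILON_280 * PATH_CM


expected = a280_from_mg_per_ml(10.0)
print(f"Expected undiluted A280 at 10 mg/ml: {expected:.2f}")
# A 20-fold diluted sample would read expected / 20 in the cuvette.
print(f"Back-calculated: {mg_per_ml_from_a280(expected / 20, 20):.1f} mg/ml")
```

Under these assumptions, a 10 mg/ml sample corresponds to an undiluted absorbance well above the linear range of a typical spectrophotometer, which is why the sketch includes a dilution factor.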
Diffraction data were collected at the SER-CAT beamline 22-ID and SBC 19-ID at the Advanced Photon Source, Argonne National Laboratory, USA. Diffraction data were processed with the XDS package 57 . Anisotropic truncation of X-ray data for the MtHISN2-AMP complex was done using the STARANISO server 58 . Data processing statistics are given in Table 1.
Determination and refinement of the crystal structures. The crystal structure of MtHISN2 was solved by SAD using protein crystallized in the presence of 100 µM ZnCl 2 , using the same data as for the MtHISN2 unliganded structure refinement (PDB ID: 7BGM). Notably, other MtHISN2 crystals were also soaked with selenourea crystal, as proposed by Luo 59 , but no selenourea molecules were found upon inspection of the final electron density maps. The phasing was performed with Phenix.Autosol 60 . The initial model was built using Phenix.AutoBuild 61 , and was placed inside the unit cell with the ACHESYM server 62 . COOT 63 was used for manual model corrections between rounds of automatic model refinement in Phenix.Refine 64 . The nearly finished model of MtHISN2 served to solve the AMP complex by molecular replacement with PHASER 65 . The refinement statistics are listed in Table 1.
The inhibition assay was performed in five cuvettes simultaneously; their contents and the experimental results are shown in Fig. 7B. Before the reaction, the cuvettes containing 900 µl of the kinetic buffer +/− AMP and/or histidine, both at 100 µM (final concentration), and wild-type MtHISN2 at 19 nM (final concentration) were incubated for 30 min. The control cuvette did not contain MtHISN2. To start the reaction, 100 µl of the R1 mixture (PR-ATP) was added; the initial PR-ATP concentration was ~18 µM. The reaction progress was measured by monitoring ProFAR formation at λ = 300 nm 44 .
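The concentrations quoted for the assay are related by simple dilution arithmetic. As a minimal Python sketch, assuming additive volumes (900 µl in the cuvette plus 100 µl of R1, so 1000 µl in total): an initial PR-ATP concentration of ~18 µM would then correspond to roughly 180 µM PR-ATP in the R1 mixture. That stock value is an inference from the stated numbers, not a figure given in the text, and the function names are illustrative.

```python
def final_conc(stock_conc, added_vol_ul, total_vol_ul):
    """Concentration after diluting `added_vol_ul` of stock into `total_vol_ul`."""
    if total_vol_ul <= 0 or added_vol_ul < 0 or added_vol_ul > total_vol_ul:
        raise ValueError("volumes must satisfy 0 <= added <= total and total > 0")
    return stock_conc * added_vol_ul / total_vol_ul


def stock_conc_needed(target_final_conc, added_vol_ul, total_vol_ul):
    """Stock concentration required to reach `target_final_conc` after dilution."""
    if added_vol_ul <= 0 or total_vol_ul <= 0:
        raise ValueError("volumes must be positive")
    return target_final_conc * total_vol_ul / added_vol_ul


# Assumed total volume: 900 µl buffer + 100 µl R1 mixture.
r1_pr_atp_um = stock_conc_needed(18.0, added_vol_ul=100.0, total_vol_ul=1000.0)
```

Both functions keep whatever concentration unit the caller passes in (µM here), so no unit conversion is hidden inside them.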
The comparative activity assay of MtHISN2 mutants was performed using 790 µl of kinetic buffer, to which 200 µl of the R1 mixture was added. The reactions were started by adding 10 µl of 1 mg/ml solutions of MtHISN2 variants. The control cuvette did not contain MtHISN2. The assay was performed in eight 1-ml cuvettes simultaneously, and the reaction progress was monitored at λ = 300 nm; the result is shown in Fig. 7C.
Microcalorimetric study of the interaction between HISN2 and AMP. ITC measurements were carried out with a MicroCal PEAQ-ITC instrument (Malvern) at 298 K. AMP (2 mM) was titrated into MtHISN2 (kept at ≈100 µM, as determined from absorbance at 280 nm) in 25 mM HEPES buffer pH 7.5 (100 mM NaCl, 50 mM KCl, 1 mM TCEP, 4 mM MgCl₂, 10 µM ZnCl₂). AMP was injected in 19 aliquots of 2 µl. Raw ITC data were analyzed with the Origin 7.0 software (OriginLab) to obtain thermodynamic parameters: stoichiometry (N), dissociation constant (Kd), and the changes in enthalpy (ΔH) and entropy (ΔS). A one-set-of-binding-sites model was fitted to the data. Reference power was set to 5. A stirring speed of 750 rpm and an injection spacing of 150 s were used. Experiments were performed in triplicate. To assign AMP binding to a particular domain, analogous AMP titrations were carried out on MtHISN2 mutants of the PRA-CH domain (K109A, T112V, S113A, H143E) and of the PRA-PH domain (R183E, T197V, Y240T).
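The thermodynamic parameters reported by an ITC fit are linked by standard relations: ΔG = RT·ln(Kd), with Kd expressed relative to a 1 M standard state, and ΔS = (ΔH − ΔG)/T. The Python sketch below illustrates these relations at the experimental temperature of 298 K. It is not the fitting procedure used in Origin, and the Kd and ΔH values in the usage lines are hypothetical placeholders rather than results from this study.

```python
import math

R_J_MOL_K = 8.314462618  # gas constant, J mol^-1 K^-1


def delta_g_kj_mol(kd_molar, temp_k=298.0):
    """Binding free energy (kJ/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0 or temp_k <= 0:
        raise ValueError("Kd and temperature must be positive")
    return R_J_MOL_K * temp_k * math.log(kd_molar) / 1000.0


def delta_s_j_mol_k(delta_h_kj_mol, kd_molar, temp_k=298.0):
    """Binding entropy (J mol^-1 K^-1) from ΔH (kJ/mol) and Kd (mol/L)."""
    dg = delta_g_kj_mol(kd_molar, temp_k)
    return (delta_h_kj_mol - dg) * 1000.0 / temp_k


# Hypothetical example values, NOT measured data: Kd = 10 µM, ΔH = -40 kJ/mol.
example_dg = delta_g_kj_mol(10e-6)
example_ds = delta_s_j_mol_k(-40.0, 10e-6)
```

With these placeholder inputs, ΔG is less negative than ΔH and the computed ΔS is therefore negative; real values would come from the fitted isotherms.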
In-silico analyses and data presentation. The EFI-ESN web server 68 served to calculate the sequence similarity network. The 53 111 sequences in the four included InterPro families (IPR008179, IPR021130, IPR002496, and IPR038019) were limited to the UniRef90 subset, which contained 21 942 sequences. The calculations were based on an alignment score of 50 for sequences between 70 and 1000 residues long. The figure was created in Cytoscape 3.3 69 ; 6748 outliers were manually excluded from the figure. Molecular figures were created in UCSF Chimera 70 , which also served to calculate the RMSD values for Cα atom pairs within a 2-Å distance. Molecular docking was performed in AutoDock Vina 71 . The ligand and receptor files were prepared in PyRx 72 and the UCSF Chimera DockPrep tool. The receptor file was based on the MtHISN2-AMP complex, with AMP removed. The search box was approx. 30 × 30 × 30 Å, centered at the AMP binding sites.
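One way to define a docking search box centered on a bound ligand is to take the centroid of the ligand atoms in the complex and write it into an AutoDock Vina configuration file. The Python sketch below illustrates this under several assumptions: the centroid is used as the box center (the text only says the box was centered at the AMP binding sites), the ligand residue name is AMP in fixed-column PDB format, and the file names are hypothetical. It does not reproduce the PyRx workflow used here.

```python
def ligand_centroid(pdb_text, resname="AMP", resseq=None):
    """Centroid (x, y, z) of HETATM atoms with the given residue name.

    `resseq` optionally restricts the selection to one residue number,
    which is useful when several ligand copies are present.
    """
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        if line[17:20].strip() != resname:
            continue
        if resseq is not None and int(line[22:26]) != resseq:
            continue
        # Fixed PDB columns: x 31-38, y 39-46, z 47-54 (1-based).
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no matching ligand atoms found")
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def vina_config(center, size=30.0,
                receptor="receptor.pdbqt", ligand="ligand.pdbqt"):
    """Return Vina config text for a cubic box; file names are placeholders."""
    cx, cy, cz = center
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        f"size_x = {size:.1f}",
        f"size_y = {size:.1f}",
        f"size_z = {size:.1f}",
    ]
    return "\n".join(lines) + "\n"
```

Because AMP binds to both domains, a separate box per site (selected with `resseq`) is probably the more natural reading of "centered at the AMP binding sites", but that choice is an assumption.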
The Nucleos webserver 42 was used to identify putative phosphate binding sites in the MtHISN2 structure. The allowed RMSD for the structural matches between the MtHISN2 structure and the reference mini-structures of nucleobases, carbohydrates, and phosphates was set to a default value of 0.6 Å. The results for nucleobase and carbohydrate predictions were omitted in the presentation.
The Caver 3.0.3 PyMOL plugin was used to calculate molecular tunnels in the structure of MtHISN2 with the following parameters: minimum probe radius = 0.9, shell depth = 10, shell radius = 8, clustering threshold = 3.5.

Research involving plants.
Studies complied with local and national regulations for using plants.