Molecular basis of enzymatic nitrogen-nitrogen formation by a family of zinc-binding cupin enzymes

Molecules with a nitrogen-nitrogen (N-N) bond in their structures exhibit various biological activities and other unique properties. A few microbial proteins have recently emerged as dedicated N-N bond-forming enzymes in natural product biosynthesis. However, the details of these biochemical processes remain largely unknown. Here, through in vitro biochemical characterization and computational studies, we report the molecular basis of hydrazine bond formation by a family of di-domain enzymes. These enzymes are widespread in bacteria and sometimes naturally exist as two standalone enzymes. We reveal that the methionyl-tRNA synthetase-like domain/protein catalyzes ATP-dependent condensation of two amino acid substrates to form a highly unstable ester intermediate, which is subsequently captured by the zinc-binding cupin domain/protein and undergoes redox-neutral intramolecular rearrangement to give the N-N bond-containing product. These results provide important mechanistic insights into enzymatic N-N bond formation and should facilitate future development of novel N-N bond-forming biocatalysts.

Reviewer #1 (Remarks to the Author):
I suggest providing other E69 substituted variants for in vitro analysis in addition to E69A (e.g., E69D, N, Q, L) because the E69A also affects the size of the cavity.
Please also discuss the amino acid residues which might be important for interacting with the gamma carboxylic acid of Glu, alpha amine and alpha carboxylic acid of Lys in the model of RHS1.
Please add more descriptions for the QM calculation in the material and methods. Please describe which residues were used for QM layer. For deprotonation of NH2, via carboxylic acid of Glu, is there any amino acid residue that might be interacting with Glu to stimulate this process? How the two water molecules come from in this deprotonation? I could not understand why the authors give these two water molecules. Are these residues stabilized by the protein? Or observed in the crystal structure? If it is not, I am speculative about this process.
Please also provide the energy diagrams of the other steps in the supplementary figure.
I wonder the authors can try to soak the product (1) into the crystal of RHS1 to solve the structure of the complex of RHS1 with the product. Because the condition for crystallization condition for RHS1 should be available, the authors can at least try this. The result should strengthen their hypothesis.
Please also confirm the native function of RHS1 using heterologous expression using MetRS-like domain encoded near its gene. It is useful to confirm the native substrate of it.
Please describe the existence of lysine N-hydroxylases in supplementary figure 18b. I am interested in the substrate specificities of both MetRS and cupin domains described in Supplementary Figure 18. I assume that MetRS domains have high substrate specificities while cupin domains may have rather promiscuous substrate specificities. Can authors elucidate this by shuffling the cupin domains in the heterologous expression system?
Authors also should confirm that the isomerization of 4 to 3 and difficulty of 4 to 1 using QM calculation without enzyme?
Multiple grammatical mistakes were found throughout the manuscript. Please check the manuscript carefully. Some of them are listed below.

Reviewer #2 (Remarks to the Author):
In this manuscript Zhao et al. show the catalytic process of a stepwise N-N bond formation route involving an esterification followed by rearrangement steps. This paper adds to our knowledge of N-N bond biosynthesis and provides a chance to find new natural products containing N-N linkages. In general the paper is worth publishing but the following issues need to be resolved first:
1. The authors need to show the production of 3 by solely the MetRS-like domain (PyrN-MetRS) to exclude the possible catalytic role of other residues of PyrN-cupin in the proposed first step. Similarly, the authors also have to present the assays with the co-incubation of PyrN-MetRS and RHS1. The currently shown assays employing E56A are inadequate to support the full role of RHS1 due to the presence of almost all residues of the PyrN-cupin.
2. In Fig. 2a, the assay without tRNA should be included and the RNase treated assays can be moved to supplementary material.
3. The point mutations of the proposed metal-binding residues caused partial or full loss of zinc ion binding. For these mutants, the authors should try the supplementation of exogenous zinc to check if activity can be restored.
4. The assays in this paper contain 10 mM MgCl2, which can bind EDTA via chelation. It therefore needs to be explained why the authors only treat the assays using 5 mM EDTA.
5. Writing mistakes: the "Spd40" (line 63, 80, 127, 211, and 335) and "s56-1" (line 63 and 81) should be "Spb40" and "s56-p1" according to reference 5, respectively.
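The arithmetic behind point 4 can be sketched as a simple 1:1 binding equilibrium. The sketch below is illustrative only: it considers Mg2+ and EDTA alone, ignores the pH-dependent protonation of EDTA and any protein-bound zinc, and uses an assumed formation constant rather than a value from the manuscript. The function and constant names are hypothetical.

```python
import math

def complex_concentration(total_metal, total_ligand, k_formation):
    """Equilibrium concentration (M) of a 1:1 metal-ligand complex.

    Solves K = x / ((M - x) * (L - x)) for x and returns the physically
    meaningful (smaller) root of the resulting quadratic.
    """
    b = total_metal + total_ligand + 1.0 / k_formation
    disc = b * b - 4.0 * total_metal * total_ligand
    # Rationalised form of (b - sqrt(disc)) / 2, which avoids cancellation.
    return 2.0 * total_metal * total_ligand / (b + math.sqrt(disc))

def free_ligand(total_metal, total_ligand, k_formation):
    """Unbound ligand (M), taken from the equilibrium expression itself."""
    x = complex_concentration(total_metal, total_ligand, k_formation)
    return x / (k_formation * (total_metal - x))

# Assay concentrations quoted in point 4.
MG_TOTAL = 10e-3    # 10 mM MgCl2
EDTA_TOTAL = 5e-3   # 5 mM EDTA
# ASSUMPTION: illustrative Mg-EDTA formation constant (M^-1); the
# conditional constant under the actual assay conditions may differ.
K_MG_EDTA = 10 ** 8.7

free_edta = free_ligand(MG_TOTAL, EDTA_TOTAL, K_MG_EDTA)
print(f"free EDTA: {free_edta:.2e} M "
      f"({free_edta / EDTA_TOTAL:.1e} of total EDTA)")
```

Under these assumptions Mg2+ is in two-fold excess, so almost all EDTA is Mg-bound. Whether the remaining chelating capacity is enough to strip zinc depends on the relative affinities, which this sketch does not model.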

Reviewer #3 (Remarks to the Author):
The authors describe a mechanistic proposal that is supported largely on the basis of computational results for the formation of N-N bonds by a family of cupin enzymes. The biochemical results follow closely the work of Nishiyama (JACS 2018 9083) and hence the impact and novelty rests in the mechanistic study.
The general mechanism shown in Figure 3 is a 5-endo-tet. This is almost certainly not the case and no data (experimental or computational) has been provided to suggest or support this.
Additionally, the mechanistic study has a flaw that invalidates the general pathway. Namely, the authors have not performed accurate transition state calculations. The only data provided is that in Sup. Figure 17 where TS1 was located. It was not confirmed that this is the transition state, and in fact this method of locating a transition state is not expected to find it. Instead the authors need to locate the transition states and determine their energies. The coordinates of these TS's need to be provided and the level of theory needs to be provided and benchmarked (none of these things have been done).
Insufficient characterization detail is provided for the new compounds, and if a revision were to be considered against my recommendation, I could outline those issues.
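The transition-state concern raised above rests on a standard criterion: a transition state is a first-order saddle point, meaning a stationary point whose Hessian has exactly one negative eigenvalue (one imaginary vibrational frequency). A maximum along a scanned coordinate does not by itself guarantee this. The following is a minimal sketch of that check on two-dimensional model surfaces. The surfaces and function names are hypothetical illustrations, unrelated to the enzyme system or to the software used by the authors.

```python
import math

def hessian_2d(f, x, y, h=1e-4):
    """Central finite-difference Hessian of f(x, y) as (fxx, fxy, fyy)."""
    f0 = f(x, y)
    fxx = (f(x + h, y) - 2.0 * f0 + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2.0 * f0 + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h**2)
    return fxx, fxy, fyy

def count_negative_modes(f, x, y, tol=1e-6):
    """Number of negative Hessian eigenvalues at (x, y).

    0 -> minimum, 1 -> first-order saddle (transition-state-like),
    2 -> hilltop, which is a scan maximum but not a transition state.
    """
    fxx, fxy, fyy = hessian_2d(f, x, y)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (fxx + fyy)
    spread = math.hypot(0.5 * (fxx - fyy), fxy)
    return sum(1 for ev in (mean - spread, mean + spread) if ev < -tol)

def double_well(x, y):
    """Model surface: minima at (+/-1, 0), saddle at (0, 0)."""
    return (x**2 - 1.0)**2 + 2.0 * y**2

def hilltop(x, y):
    """Model surface that is a maximum in both coordinates at (0, 0)."""
    return -(x**2) - y**2

print("double well, (0, 0):", count_negative_modes(double_well, 0.0, 0.0))
print("double well, (1, 0):", count_negative_modes(double_well, 1.0, 0.0))
print("hilltop,     (0, 0):", count_negative_modes(hilltop, 0.0, 0.0))
```

In a real calculation the same test is applied to the full mass-weighted Hessian of the optimized structure, and the single imaginary mode is inspected to confirm that it connects the intended reactant and product.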

REVIEWER COMMENTS:
This manuscript describes a detailed analysis of the N-N bond-synthesizing enzyme, PyrN, by mainly using in vitro analysis of the recombinant enzyme. The authors detected N-N bond formation activity using the recombinant enzyme using glutamate and N-hydroxylysine as substrates. PyrN and its homolog, Spd40 were shown to catalyze N-N bond formation in vivo but their function could not be reconstituted in the previous study. Therefore, this result has significant importance. Furthermore, they analyzed the function of cupin and MetRS domains of PyrN individually by using PyrN variants and truncated enzymes. In addition, they analyzed the products using isotope-labeled substrates carefully. These results clearly indicated that the MetRS domain catalyzes the synthesis of ester intermediate (4) using glutamate and N-hydroxylysine via AMPylation. Furthermore, they showed that 4 is unstable and rapidly isomerized to N-glutamyl-N6-hydroxy-lysine in the absence of the active cupin domain.
The cupin domain was shown to catalyze the isomerization of 4 to synthesize the N-N bond-containing compound (1). Next, they showed that another cupin family enzyme, RHS1 whose structure was solved previously, can also catalyze the same isomerization as PyrN to synthesize an N-N bond. Based on the structure, they carried out site-directed mutagenesis, docking modeling, MD and QM calculation. As a result, they proposed the reaction mechanism of the isomerization reaction in which Glu69 plays an important role by capturing the cleaved intermediate. Finally, they analyzed several PyrN homologs discovered from the database and analyzed them in vivo and in vitro. As a result, they discovered several PyrN homologs with different substrate specificities resulting in N-N bond-containing compounds with different amino acids instead of the Glu residue. Most of the experiments seemed to be carried out carefully and most of the results are convincing enough to support their conclusion. Since there are still only few information for the mechanism of N-N bond synthesizing enzymes, this manuscript is important for the field of natural products and enzymatic chemistry.
Response: Thank you for your positive comments.
REVIEWER COMMENTS: I suggest providing other E69 substituted variants for in vitro analysis in addition to E69A (e.g., E69D, N, Q, L) because the E69A also affects the size of the cavity.
Response: Thank you for your useful suggestion. We have generated four more E69-substituted variants (E69D, E69N, E69Q, E69L) and evaluated their protein expression levels and in vitro catalytic activity. We found that the E69Q and E69N variants were also inactive. Surprisingly, both the E69D and E69L variants were expressed exclusively in inclusion bodies, which prevented further activity assays (Supplementary Fig. 26).

REVIEWER COMMENTS:
Please also discuss the amino acid residues which might be important for interacting with the gamma carboxylic acid of Glu, alpha amine and alpha carboxylic acid of Lys in the model of RHS1.
Supplementary Fig. 14a, and described these interactions in detail in the Figure legend.

REVIEWER COMMENTS: Please add more descriptions for the QM calculation in the material and methods. Please describe which residues were used for QM layer.
Response: Following the suggestion of the referee, we have revised the description of the QM region selection in the QM methods section (Supplementary Fig. 16). In addition, we have added a schematic of the QM region used in the QM/MM calculations in Supplementary Fig. 16.

REVIEWER COMMENTS:
For deprotonation of NH2, via carboxylic acid of Glu, is there any amino acid residue that might be interacting with Glu to stimulate this process? How the two water molecules come from in this deprotonation? I could not understand why the authors give these two water molecules. Are these residues stabilized by the protein? Or observed in the crystal structure? If it is not, I am speculative about this process.
Response: The two water molecules cannot be observed in the crystal structure because they are held in place by the bound substrate, which is absent from the crystal structure. However, our long-timescale MD simulation indicated that two water molecules can penetrate into the active site and mediate a persistent H-bonding network between the positively charged -NH3+ group and the -CO2- group of the substrate (Figure 5b, Supplementary Fig. 14d and 18). This water-mediated proton channel is stabilized by the substrate itself and has little interaction with the protein. Indeed, water-mediated proton transfer is ubiquitous in biological systems. For the related discussion, please see Main text Lines 334-346.

REVIEWER COMMENTS:
Please also provide the energy diagrams of the other steps in the supplementary figure.
Response: We have provided energy diagrams for each reaction step of the QM calculation in Supplementary Fig. 17b-d.

REVIEWER COMMENTS: I wonder the authors can try to soak the product (1) into the crystal of RHS1 to solve the structure of the complex of RHS1 with the product. Because the condition for crystallization condition for RHS1 should be available, the authors can at least try this. The result should strengthen their hypothesis.
Response: Thank you for your suggestion. We tried to repeat the crystallization condition (0.2 M potassium iodide, 20% (w/v) PEG 3350) for RHS1, but did not obtain RHS1 crystals under this condition. A molecular docking model of RHS1 with product 1 is shown in Supplementary Fig. 22. Crystallization of other cupin proteins (listed in Supplementary Fig. 27b) is currently underway. We hope to obtain a co-crystal structure of one of these cupins with its corresponding product and to publish the results in the near future.

REVIEWER COMMENTS:
Please also confirm the native function of RHS1 using heterologous expression using MetRS-like domain encoded near its gene. It is useful to confirm the native substrate of it.
Response: Thank you for your suggestion. We found that this MetRS-like protein was expressed exclusively in insoluble form in the E. coli system. However, we were able to overexpress this MetRS-like gene, together with its associated cupin and lysine N6-hydroxylase genes, in the original Rhodococcus strain (Supplementary Fig. 10b). LC-MS analysis of the culture supernatant of this engineered Rhodococcus strain revealed the production of 1, supporting that this MetRS-like domain protein naturally utilizes L-Glu.

REVIEWER COMMENTS:
Please describe the existence of lysine N-hydroxylases in supplementary figure 18b.
Response: We have added this information in the new Supplementary Fig. 27c, showing that all the genomic regions harboring genes/gene pairs selected for synthesis also contain N6-lysine hydroxylase genes.

REVIEWER COMMENTS: I am interested in the substrate specificities of both MetRS and cupin domains described in Supplementary Figure 18. I assume that MetRS domains have high substrate specificities while cupin domains may have rather promiscuous substrate specificities. Can authors elucidate this by shuffling the cupin domains in the heterologous expression system?
Response: Thank you for the insightful suggestion. We have tested the substrate specificities of selected cupins in an E. coli system expressing N6-lysine hydroxylase and the PyrN MetRS-like domain (Supplementary Fig. 32). We found that the cupins that naturally use amino acids with relatively larger side chains (L-tyrosine and L-serine) are also able to accept the unstable ester intermediate 4 to produce 1, supporting the promiscuous substrate specificities of cupin domains/proteins (also see Main text Lines 413-419).

REVIEWER COMMENTS:
The general mechanism shown in Figure 3 is a 5-endo-tet. This is almost certainly not the case and no data (experimental or computational) has been provided to suggest or support this.

Response:
We are sorry about the misunderstanding. We only intended to show the atoms connected in the final product 3/1 after the intramolecular rearrangement. We have made the correction by removing the arrows (see the new Fig. 3e in Main text).

REVIEWER COMMENTS:
Additionally, the mechanistic study has a flaw that invalidates the general pathway. Namely, the authors have not performed accurate transition state calculations. The only data provided is that in Sup. Figure 17 where TS1 was located. It was not confirmed that this is the transition state, and in fact this method of locating a transition state is not expected to find it. Instead the authors need to locate the transition states and determine their energies. The coordinates of these TS's need to be provided and the level of theory needs to be provided and benchmarked (none of these things have been done).
Response: We thank the referee for his/her constructive comments and insightful suggestions. Following the referee's suggestion, the detailed energy diagrams, the frequency data and the coordinates of the transition state for each step of the QM calculation have been added to the manuscript. Please see Supplementary Fig. 17, 24 & 25 and Supplementary_data_1. In addition, we have re-examined all reaction steps with QM/MM calculations, in which all the transition states were located by relaxed potential energy surface scans followed by full TS optimizations using the DL-FIND code. Our QM-predicted mechanism is consistent with the QM/MM calculations, supporting the reliability of the calculations. Following the suggestions of the referee, we have also tested various DFT functionals for the initial N1-O1 cleavage step (see the Table below). We found that B3PW91 and M06-L predict barriers similar to that of B3LYP, whereas M06 and ωB97X-D predict higher barriers. For BP86 and TPSS, the predicted barriers are much lower than that from B3LYP. Indeed, B3LYP has been used extensively to study Zn-containing enzymes and has been demonstrated to be quite reliable (refs: Abdel-Azeim, S. et al. J. Comput. Chem. 32, 3154-3167 (2011); Samanta, P. N. & Das, K. K. J. Mol. Graph. Model. 63, 38-48 (2016); Fu, Y. et al. Front. Chem. 9, 706959 (2021); Sen, A. et al. J. Phys. Chem. B 125, 8814-8826 (2021)).
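To give a sense of why differences in predicted barriers between functionals matter, the sketch below converts a free-energy barrier into a rate constant with the Eyring equation, k = (kB T / h) exp(-ΔG‡ / RT), assuming a transmission coefficient of 1. The barrier values in the usage lines are hypothetical placeholders, not numbers from the benchmark table, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3      # gas constant, kcal mol^-1 K^-1
K_B = 1.380649e-23        # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34 # Planck constant, J s

def eyring_rate(barrier_kcal, temperature=298.15):
    """First-order rate constant (s^-1) for a free-energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1.
    """
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))

def rate_ratio(barrier_a, barrier_b, temperature=298.15):
    """How many times faster the reaction over barrier_a is than over barrier_b."""
    return math.exp((barrier_b - barrier_a) / (R_KCAL * temperature))

# HYPOTHETICAL barriers (kcal/mol), chosen only to show the sensitivity.
low, high = 15.0, 18.0
print(f"k(low)  = {eyring_rate(low):.2e} s^-1")
print(f"k(high) = {eyring_rate(high):.2e} s^-1")
print(f"ratio   = {rate_ratio(low, high):.1f}")
```

Because the rate depends exponentially on the barrier, a difference of about 1.4 kcal/mol at room temperature already corresponds to roughly a ten-fold change in rate, which is why the level of theory is benchmarked for the rate-relevant step.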