Novel Type II and Monomeric NAD+ Specific Isocitrate Dehydrogenases: Phylogenetic Affinity, Enzymatic Characterization, and Evolutionary Implication

NAD+ use is an ancestral trait of isocitrate dehydrogenase (IDH), and the NADP+ phenotype arose through evolution as an ancient adaptation event. However, no NAD+-specific IDHs have been found among type II IDHs and monomeric IDHs. In this study, novel type II homodimeric NAD-IDHs from Ostreococcus lucimarinus CCE9901 IDH (OlIDH) and Micromonas sp. RCC299 (MiIDH), and novel monomeric NAD-IDHs from Campylobacter sp. FOBRC14 IDH (CaIDH) and Campylobacter curvus (CcIDH) were reported for the first time. The homodimeric OlIDH and monomeric CaIDH were determined by size exclusion chromatography and MALDI-TOF/TOF mass spectrometry. All the four IDHs were demonstrated to be NAD+-specific, since OlIDH, MiIDH, CaIDH and CcIDH displayed 99-fold, 224-fold, 61-fold and 37-fold preferences for NAD+ over NADP+, respectively. The putative coenzyme discriminating amino acids (Asp326/Met327 in OlIDH, Leu584/Asp595 in CaIDH) were evaluated, and the coenzyme specificities of the two mutants, OlIDH R326H327 and CaIDH H584R595, were completely reversed from NAD+ to NADP+. The detailed biochemical properties, including optimal reaction pH and temperature, thermostability, and metal ion effects, of OlIDH and CaIDH were further investigated. The evolutionary connections among OlIDH, CaIDH, and all the other forms of IDHs were described and discussed thoroughly.

as a potential genetic modification target towards optimized Z. mobilis strains to produce ethanol 6. The characterization of NADP-IDH from Microcystis aeruginosa may provide new ideas for controlling blue-green algae through biological techniques 7. IDHs from pathogens, such as Plasmodium falciparum, Mycobacterium tuberculosis, and Leptospira interrogans, have been reported as drug targets, because they stand at a branch point of the TCA cycle and glyoxylate shunt [8][9][10]. IDHs are also ideal immuno-diagnostic candidates, due to their highly conserved housekeeping function. For example, M. tuberculosis IDHs elicit strong B-cell responses in tuberculosis (TB)-infected populations and can differentiate between healthy vaccinated and TB populations 11. In addition, Helicobacter pylori IDH can be an immunogen that interacts with the host immune system, possibly following autolytic release, and significantly elicits humoral responses in individuals with invasive H. pylori infection 12.
Besides pathogenic bacterial IDH, human cytosolic NADP-IDH (IDH1) and mitochondrial NADP-IDH (IDH2) have been considered as drug targets. Mutations in IDH1 and IDH2 are frequently identified in various cancers, such as glioblastoma multiforme and acute myeloid leukemia 13,14. Heterozygous IDH mutations are remarkably specific to a single codon: the conserved and functionally important arginine 132 residue (R132) of IDH1 and arginine 172 residue (R172) of IDH2. These mutations result in the loss of the normal IDH catalytic activity, namely the production of α-KG and NADPH. At the same time, they grant mutated IDHs the neomorphic activity of reducing α-KG to 2-hydroxyglutarate (2-HG), which is accompanied by the oxidation of NADPH to NADP+ 15,16. The accumulation of 2-HG competitively inhibits α-KG-dependent enzymes, thus causing cellular alterations in epigenetics, collagen maturation, and hypoxia signaling [17][18][19].
As an ancient enzyme, IDH acquired various primary structures and different oligomeric states through evolution. Four kinds of IDHs have been reported: monomer, homo-dimer, homo-tetramer, and hetero-oligomer. Monomeric IDHs have been characterized from various eubacteria, and all of them are highly specific to NADP+ 20-22. Because the amino acid sequence identities are <10% between monomeric IDHs and other types of IDHs, this group has been recognized as a separate clade that evolved independently 20,23. Dimeric and multimeric IDHs have been divided into three phylogenetic subfamilies [23][24][25]. Subfamily I is a prokaryotic group, in which NAD+ and NADP+ usage is widespread within archaeal and eubacterial homo-dimeric enzymes. Subfamily II is mainly composed of eukaryotic homo-dimeric NADP-IDHs, with a small number of eubacterial homo-dimeric NADP-IDHs. Subfamily III comprises mitochondrial hetero-oligomeric NAD-IDHs and eubacterial homo-tetrameric enzymes with either NAD+ or NADP+ as the cofactor 26.
Because subfamily III IDHs share more than 30% sequence identity with subfamily I members but less than 15% sequence identity with subfamily II counterparts, subfamilies III and I can be combined accordingly. Therefore, we simply divided the IDH protein family into type I IDHs (subfamilies I and III) and type II IDHs (subfamily II) in our previous study 4. In total, three types of IDH can be distinguished: type I IDH, type II IDH, and monomeric IDH. Interestingly, both NAD+ and NADP+ are broadly utilized among type I IDHs. However, homo-dimeric type II IDHs and monomeric IDHs are all NADP+ specific 4. We previously demonstrated that NAD+ use is an ancestral trait of IDH, and the NADP+ phenotype of eubacterial dimeric NADP-IDH arose on or about the time that eukaryotic mitochondria first appeared (about 3.5 billion years ago) to synthesize NADPH for the adaptation of bacterial growth on acetate 4. Consequently, NADP+-specific type II IDHs and monomeric IDHs may also have their own NAD+-specific ancestors. However, no NAD-IDHs that belong to these two subfamilies have been explored.
The rapid growth of the IDH protein "pool", which has been contributed by various genome projects, should allow us to search for putative NAD+-specific type II IDHs and monomeric IDHs. In the present study, four novel NAD-IDHs that belong to the families of type II IDHs and monomeric IDHs were reported for the first time. Enzymatic properties of these NAD-IDHs, including the oligomeric state in solution, optimum pH and temperature for catalysis, thermostability, and metal ion dependency, were characterized thoroughly. Kinetic parameters of the two IDHs towards coenzymes were determined in detail, and their coenzyme-binding sites were evaluated by site-directed mutagenesis. The discovery of these novel NAD-IDHs will refine the chemistry and phylogeny of IDH and provide new insights into the evolution of this ancient protein family.

Results
Phylogenetic Analysis and Sequence Alignment. In our previous study, we divided the IDH protein family into type I IDHs and type II IDHs based on sequence homology 4. Herein, we expanded the classification of this ancient family by incorporating new members (Fig. 1). Monomeric IDHs were included, although their overall sequence homologies with other forms of IDH are relatively low (<10%) 22,27. Consequently, they constituted a monophyletic clade on the phylogenetic tree (Fig. 1). Therefore, three main IDH subfamilies (type I IDH, type II IDH and monomeric IDH) were identified to comprise the whole IDH protein family. Members of each IDH subfamily were further classified into eleven subgroups according to coenzyme specificity, oligomeric state and source organism (Fig. 1). Different tree-building algorithms were also employed, such as the UPGMA method, the Maximum Likelihood method and the Minimum Evolution method. All trees contain three well-supported monophyletic groups: type I IDHs, type II IDHs, and monomeric IDHs (see Supplementary Fig. S3, Fig. S4 and Fig. S5 online). When an outgroup of malate dehydrogenase (MDH) was added into the phylogenetic study, the overall topology of the evolutionary tree remained very similar (see Supplementary Fig. S6 online).
In the type I subgroup, the most common ones are the eubacterial homodimeric NADP- and NAD-IDHs, which are represented by E. coli IDH and Z. mobilis IDH, respectively 6,23. The pairwise amino acid sequence identity among this subgroup was more than 50%. Mitochondrial hetero-oligomeric NAD-IDHs, together with a batch of eubacterial homotetrameric NAD-IDHs, were grouped into a single branch in the type I subfamily. More than 45% identity exists between these eukaryotic and prokaryotic NAD-IDHs 26. Although this branch was designated as subfamily III in other IDH phylogenetic studies 20,24,25,28, it is reasonable to include it in the type I subfamily, because proteins in this branch exhibit more than 30% identity with type I eubacterial homodimeric IDHs. A small group of IDHs, which were represented by Rickettsia IDH and Thermus thermophilus IDH, branched before the mitochondrial group. These IDHs were structurally distinguished from other dimeric IDHs, as they were longer in the C-terminal region. Although they were assigned as subfamily IV IDHs in two recent studies 28,29, they were included as type I IDHs in our study, as they shared considerable sequence identities (>40%) with mitochondrial NAD-IDHs.
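The subfamily assignments above rest on pairwise sequence identity thresholds. As a minimal sketch of how such a percentage can be computed from two already-aligned sequences, the hypothetical helper below counts matching residues over columns where neither sequence has a gap. This gap convention is an assumption; the source does not state which identity definition or alignment program was used, and the sequences shown are toy fragments, not real IDH sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments for illustration only (hypothetical, not IDH data).
identity = percent_identity("MKV-LA", "MRVALA")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different percentages, so thresholds such as 30% or 15% are only comparable when the same definition is used throughout.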
The type II subfamily comprised homodimeric NADP-IDHs from eukaryotes and eubacteria (Fig. 1). A small group of NADP-IDHs, which are represented by Thermotoga maritima IDH (TmIDH) 30, branched clearly before the well-characterized homodimeric NADP-IDH clade. Although the TmIDH-like proteins were highly identical (>50%) to type II homodimeric NADP-IDHs, these two kinds of IDHs branched separately in the type II subtree. This separate branching may be due to the fact that TmIDH-like IDHs can form homotetramers in solution under some conditions 25,30. All previously reported type II IDHs have been NADP+ specific, and no NAD-IDHs had been found in this subfamily. In this study, we identified the type II NAD-IDH group for the first time. This small group of distinctive IDHs was represented by IDH from O. lucimarinus CCE9901 (OlIDH), along with several counterparts that are derived from marine algae, such as Micromonas sp. RCC299 (MiIDH, GenBank accession number: XP_002502450). OlIDH demonstrated substantial sequence identity (>30%) with typical type II homodimeric NADP-IDHs and low identity with type I IDHs (<15%). Therefore, OlIDH clustered into a unique clade among the type II subfamily.
Although OlIDH was recognized as a member of the type II subfamily, its coenzyme specificity seemed to be different from that of the other type II NADP-IDHs. The crystal structure of human cytosolic homodimeric NADP-IDH, which is the most investigated type II IDH, shows that Arg353 and His354 are involved directly in coenzyme discrimination 31. The conservation of these two NADP+-binding residues was confirmed by the structures of type II NADP-IDHs from M. tuberculosis 9, Desulfotalea psychrophila 32, Thermotoga maritima 30, porcine mitochondria 33, and yeast mitochondria 34. However, the corresponding amino acid residues in OlIDH were substituted with Asp326 and Met327 (Fig. 2). The replacement of two positively charged amino acids with one negatively charged amino acid (Asp) and one neutral amino acid with a large side chain (Met) suggests that OlIDH should favor binding to NAD+ over NADP+, because Asp in the coenzyme binding site will repel the 2'-phosphate of NADP+ but can properly contact NAD+ 35,36. Other suspected type II NAD-IDHs that were grouped with OlIDH also had Asp and Met (or Leu) in the corresponding sites (see Supplementary Fig. S1 online), suggesting that they share the same coenzyme-binding mechanism.
All monomeric IDHs were separated into a monophyletic group in the phylogenetic tree (Fig. 1). Two subgroups could be clearly distinguished in the clade, one of which was represented by the well-studied monomeric NADP-IDHs from Azotobacter vinelandii (AvIDH) 21 and Corynebacterium glutamicum (CgIDH) 20. The other, however, was newly discovered and was proposed to be NAD+ specific. The representative member of this special subgroup was the Campylobacter sp. FOBRC14 IDH (CaIDH). CaIDH was 731 amino acids in length, which is typical for monomeric IDH. However, CaIDH shared less than 50% sequence identity with monomeric NADP-IDHs, whereas monomeric NADP-IDHs shared more than 70% sequence identity among themselves. Sequence alignment results show that the key coenzyme-binding residues, His589 and Arg600, in AvIDH 20, which are absolutely conserved in all monomeric NADP-IDHs, had been replaced by Leu and Asp at the corresponding sites of CaIDH (Fig. 2). The presence of negatively charged Asp and neutral Leu with a large side chain argues against NADP+ use by CaIDH, thus making CaIDH the first putative NAD+-specific monomeric IDH. Besides CaIDH, some other homologous monomeric NAD-IDHs derived from Campylobacter species, such as Campylobacter curvus (CcIDH, GenBank accession number: WP_018136314), were also found (see Supplementary Fig. S2 online). These characteristic IDHs shared more than 70% sequence identity with each other, and they all had Leu and Asp (or Leu) at the putative coenzyme-binding sites (see Supplementary Fig. S2 online).
Figure 1 (caption, partial). The percentages of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) are shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. Phylogenetic analyses were conducted in MEGA6. The IDH sequences used are listed in Table S1. OlIDH and CaIDH are each marked with a symbol on the tree.
Overexpression, Purification, and Oligomeric State Determination. The recombinant OlIDH and CaIDH that were tagged with 6×His were successfully produced in E. coli and then purified to homogeneity (Fig. 3a). Purified OlIDH and CaIDH each gave a single band in SDS-PAGE, at around 45 kDa and 80 kDa, respectively, which compared well with the theoretical molecular masses of 6×His-tagged OlIDH (46 kDa) and CaIDH (81 kDa). Size exclusion chromatography (SEC) was then performed to estimate the oligomerization status of OlIDH and CaIDH in solution. A single elution peak was observed for OlIDH, and its native molecular mass was estimated to be 76 kDa (Fig. 3b). These data can be interpreted as the protein being present as a homodimer in solution, since the calculated mass of a monomer is 46 kDa. As for CaIDH, a dominant peak, corresponding to a molecular weight of 81 kDa, was eluted (Fig. 3c). This indicated that the overwhelming majority of CaIDH was present as a monomer in solution. The fraction (<5%) that eluted before the main peak was calculated to be 150 kDa, and this represented a dimeric CaIDH. The dimeric CaIDH was likely induced by factors such as protein concentration, salt, temperature, and pH.
However, molecular weight estimation by SEC is limited by the premise that the protein interacts with the column resin in an ideal way (e.g., no electrostatic or hydrophobic interactions) 37,38. In our experiment, the disagreement between the SEC estimation (76 kDa) and the deduced molecular weight of the dimeric OlIDH (92 kDa) may be due to non-ideal interactions between the protein and the SEC media. In order to accurately assign the molecular masses of the recombinant OlIDH and CaIDH, we performed MALDI-TOF/TOF mass spectrometry. The molecular weights of the recombinant OlIDH and CaIDH were precisely determined to be 93 kDa and 82 kDa, respectively (Fig. 4), demonstrating the homodimeric structure of OlIDH and the monomeric structure of CaIDH.
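The oligomeric-state reasoning above amounts to dividing each measured native mass by the theoretical subunit mass and rounding to the nearest whole number of subunits. The sketch below applies that arithmetic to the masses quoted in the text; the function name and the simple rounding rule are illustrative assumptions, not the authors' stated procedure.

```python
def subunit_count(native_kda: float, subunit_kda: float) -> int:
    """Nearest whole number of subunits implied by a measured native mass."""
    ratio = native_kda / subunit_kda
    return max(1, round(ratio))


# (measured native mass, theoretical tagged subunit mass) in kDa, from the text.
measurements = {
    "OlIDH, SEC": (76, 46),
    "OlIDH, MALDI-TOF/TOF": (93, 46),
    "CaIDH, SEC main peak": (81, 81),
    "CaIDH, SEC minor peak": (150, 81),
    "CaIDH, MALDI-TOF/TOF": (82, 81),
}

states = {name: subunit_count(native, subunit)
          for name, (native, subunit) in measurements.items()}

for name, count in states.items():
    print(f"{name}: {count} subunit(s)")
```

The SEC value for OlIDH gives a ratio of about 1.65, which still rounds to a dimer but illustrates why the text treats SEC as an estimate and relies on mass spectrometry for the final assignment.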
Enzyme Activity and Kinetic Characterization. The specific activity of purified recombinant OlIDH was 72.3 U/mg with NAD+ and only 3.8 U/mg with NADP+. Recombinant CaIDH showed similarly high specific activity with NAD+ (53.2 U/mg) and very low activity with NADP+ (8.7 U/mg). This observation preliminarily confirmed the NAD+ preference of OlIDH and CaIDH, as suggested by the bioinformatic analysis. Kinetic characterization results show that the Km of OlIDH for NADP+ was over 16-fold greater than the Km for NAD+. The coenzyme specificity (kcat/Km) of OlIDH was 99-fold greater for NAD+ than for NADP+ (Table 1). Consequently, OlIDH showed a high preference for NAD+, thus becoming the first NAD+-specific IDH in the type II subfamily. As expected, CaIDH was also characterized as an NAD+-specific IDH by kinetic analysis, and its coenzyme specificity was 61-fold greater towards NAD+ than NADP+ (Table 1). Hence, CaIDH represents the first known monomeric NAD-IDH.
Because the newly defined type II NAD-IDHs and monomeric NAD-IDHs stand for important IDH subfamilies, identification of one enzyme for each is not sufficient to support their distinctiveness. We therefore characterized another two NAD-IDHs that belong to these two novel subfamilies, respectively. MiIDH, an OlIDH analog from Micromonas sp. RCC299, and CcIDH, a CaIDH analog from Campylobacter curvus, were also produced in E. coli and purified to homogeneity. Kinetic analysis showed that both MiIDH and CcIDH are NAD+-specific as expected, because their preference for NAD+ was 224-fold and 37-fold over NADP+, respectively (Table 1). The confirmation of two further NAD+-specific members from the type II IDH and monomeric IDH subfamilies further validated the novelty of our finding.
Coenzyme Binding Site. To evaluate the significance of the putative coenzyme-determining sites (Asp326 and Met327 in OlIDH, Leu584 and Asp595 in CaIDH), mutant enzymes each containing two point mutations, R326H327 for OlIDH and H584R595 for CaIDH, were constructed, based on the protein sequence alignment (Fig. 2). The mutated enzymes were successfully produced in E. coli and purified to homogeneity. CD spectroscopy was performed to determine the secondary structure of wild-type and mutant enzymes (Fig. 5). The results show that the OlIDH R326H327 and CaIDH H584R595 mutants were very similar to the respective wild-type enzymes, thus indicating that the mutations at these key sites do not cause significant changes in protein secondary structure.
The kinetic characterization results are reported in Table 1. The OlIDH R326H327 mutant displayed a 22-fold higher Km value for NAD+ than that of the wild-type enzyme. Meanwhile, the mutant enzyme showed a greatly increased affinity to NADP+, as demonstrated by a 270-fold decrease in Km value. The kcat/Km of OlIDH R326H327 towards NADP+ was 49-fold higher than that of the wild-type enzyme, whereas the kcat/Km for NAD+ underwent a 44-fold decrease. Consequently, the overall specificity of the OlIDH R326H327 mutant was 22-fold greater for NADP+ than that for NAD+.
Therefore, the two point mutations in OlIDH completely altered its coenzyme specificity, which demonstrates that Asp326 and Met327 were key specificity determinants for OlIDH.
The importance of Leu584 and Asp595 in the direct binding of NAD+ to CaIDH was also confirmed by the mutagenesis study. The affinity of the CaIDH H584R595 mutant for NAD+ was weakened, as evidenced by a 17-fold elevation in Km compared to that of the wild-type enzyme. By contrast, the mutant enzyme displayed a 45-fold decrease in Km for NADP+. The kcat/Km of CaIDH H584R595 towards NADP+ was 92-fold higher than that of the wild-type enzyme, whereas the kcat/Km for NAD+ underwent a 13-fold decrease. Consequently, the overall specificity of the CaIDH H584R595 mutant was 19-fold greater for NADP+ than that for NAD+. Thus, the monomeric NAD+-specific CaIDH was converted to an NADP+-dependent enzyme by two mutations in the coenzyme binding sites.
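The reported fold changes can be cross-checked against each other. If coenzyme preference is defined as the ratio of kcat/Km values, then the mutant's preference for NADP+ over NAD+ equals the gain in kcat/Km for NADP+ times the loss in kcat/Km for NAD+, divided by the wild-type preference for NAD+ over NADP+. The sketch below applies this identity to the rounded fold values quoted in the text; the function name is hypothetical, and small deviations from the reported 22-fold and 19-fold figures are expected because the inputs are themselves rounded.

```python
def mutant_preference_for_nadp(wt_nad_over_nadp: float,
                               nadp_efficiency_gain: float,
                               nad_efficiency_loss: float) -> float:
    """Mutant (kcat/Km for NADP+) / (kcat/Km for NAD+), derived from fold changes.

    wt_nad_over_nadp: wild-type preference for NAD+ over NADP+ (fold).
    nadp_efficiency_gain: fold increase in kcat/Km towards NADP+ in the mutant.
    nad_efficiency_loss: fold decrease in kcat/Km towards NAD+ in the mutant.
    """
    return nadp_efficiency_gain * nad_efficiency_loss / wt_nad_over_nadp


# Rounded fold values as quoted in the text.
olidh_mutant = mutant_preference_for_nadp(99, 49, 44)  # reported as 22-fold
caidh_mutant = mutant_preference_for_nadp(61, 92, 13)  # reported as 19-fold

print(f"OlIDH R326H327: {olidh_mutant:.1f}-fold for NADP+ over NAD+")
print(f"CaIDH H584R595: {caidh_mutant:.1f}-fold for NADP+ over NAD+")
```

Both values land close to the figures stated for the mutants, which suggests the quoted fold changes are mutually consistent.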
Biochemical Characterization. The effects of pH on OlIDH and CaIDH activities were determined for the NAD+-linked reaction. OlIDH exhibited slightly different pH activity profiles and optimum pH using Mg2+ or Mn2+ as its cofactor. The optimum pH for OlIDH was pH 9.0 or pH 8.5 in the presence of Mg2+ or Mn2+, respectively (Fig. 6a). Furthermore, the optimum pH for CaIDH was pH 8.0 or pH 7.5 in the presence of Mg2+ or Mn2+, respectively (Fig. 6d). OlIDH showed a pH-activity correlation similar to those of the homodimeric NAD+-specific Z. mobilis IDH (8.5 with Mg2+ and 8.0 with Mn2+) and Streptococcus suis IDH (7.0 with Mg2+ and 8.5 with Mn2+) from the type I subfamily 6,39. CaIDH showed a slightly lower optimum pH when compared to the monomeric NADP-IDH from Streptomyces lividans TK54 (9.0 with Mg2+ and 8.5 with Mn2+) 22.
The optimum reaction temperature for OlIDH was around 40 °C with either Mg2+ or Mn2+ as the cofactor (Fig. 6b). Results from heat inactivation studies demonstrate that recombinant OlIDH was stable below 45 °C, but its activity rapidly declined as the temperature was raised. Incubation at 45 °C for 20 min caused a 28% or 21% loss of activity in the presence of Mg2+ or Mn2+, respectively (Fig. 6c), whereas incubation at 50 °C caused a 91% or 84% loss of activity in the presence of Mg2+ or Mn2+, respectively (Fig. 6c). The optimum temperature for CaIDH activity was around 45 °C or 40 °C in the presence of Mg2+ or Mn2+, respectively (Fig. 6e). Recombinant CaIDH retained the majority of the activity below 45 °C. However, its activity dropped rapidly as the temperature was raised. Incubation at 55 °C for 20 min caused a 55% or 75% loss of activity in the presence of Mg2+ or Mn2+, respectively (Fig. 6f). The effects of different metal ions on the activities of OlIDH and CaIDH were examined (Table 2). Both enzymes needed the presence of a divalent cation for catalysis, although fractions of activities were observed for OlIDH (5.8%) and CaIDH (7.9%) when no metal ions were added. Mn2+ was the most favorable cation for both OlIDH and CaIDH, and its role could be largely replaced by Mg2+ (68.6% for OlIDH and 78.3% for CaIDH). Mn2+ has also been found to be the preferred cation for other homodimeric NAD-IDHs from

Discussion
In the present study, four novel IDHs, belonging to two novel IDH subfamilies, were reported for the first time. Two of them were the NAD+-specific homodimeric IDHs from the marine algae O. lucimarinus (OlIDH) and Micromonas sp. RCC299 (MiIDH), and the other two were the NAD+-specific monomeric IDHs from the pathogens Campylobacter sp. FOBRC14 (CaIDH) and Campylobacter curvus (CcIDH). OlIDH and MiIDH were found to be the first NAD-IDHs in the type II subfamily, as all members of this subfamily were previously thought to be NADP+ specific. CaIDH and CcIDH, however, were found to be the first NAD+-specific monomeric IDHs. These four IDHs, together with their NAD+-specific counterparts, constituted two separate branches on the phylogenetic tree. Most previous studies have proposed that IDH favors NADP+ over NAD+ as a coenzyme 22,25,27,31,34,40. However, as more and more NAD-IDHs from diverse backgrounds are reported, NAD+ appears to be widely used by IDH throughout nature 6,26,39,41. By adding two groups of OlIDH-like and CaIDH-like NAD-IDHs, we have expanded and refined the evolutionary classification of the IDH protein family. The phylogenetic tree presented in Fig. 1 contains three well-supported monophyletic groups: type I IDH, type II IDH, and monomeric IDH. Monomeric IDHs were also included in the present phylogenetic analysis for the integrity of the IDH family, and the same principle was applied by Delbaere et al. 20. The most important advancement was the discovery of type II homodimeric NAD-IDHs and monomeric NAD-IDHs, which completed the classification of the IDH protein family in the view of coenzyme specificity. Neither of these groups of NAD-IDH has been reported in previous studies 4,[23][24][25]. IDHs with NAD+ specificity are ancestral to NADP-IDHs, and this evolutionary hypothesis has been demonstrated by experimental reverse evolution, which was applied to the typical type I E. coli NADP-IDH 4,23,35. The findings of NAD+-specific OlIDH and CaIDH, together with their homologous proteins, will help identify the possible ancestors of type II IDHs and monomeric IDHs, respectively.
By searching the genome of O. lucimarinus CCE9901 (GenBank Assembly ID: GCA_000092065.1), only one copy of the IDH gene could be found, which suggests that OlIDH is likely the only functional IDH isozyme in this marine alga. The genus Ostreococcus is composed of a group of globally distributed, photosynthetic, unicellular green algae, and these cells are the smallest known eukaryotes 42. Because all previously known NAD-IDHs from eukaryotes are hetero-oligomeric and consist of at least two different subunits 3,43, OlIDH represents the first eukaryotic homodimeric NAD-IDH. Furthermore, IDHs from some other marine algae, such as Micromonas sp. RCC299, Emiliania huxleyi and Thalassiosira oceanica (see Supplementary Table S1 online), showed high homology to OlIDH (>70%) and branched together with OlIDH on the phylogenetic tree. Considering the antiquity of oceanic algae, the ancient trait of NAD+ specificity that was possessed by IDHs in these organisms can be fairly explained.
Similar to OlIDH, the monomeric CaIDH is likely the only active IDH isozyme found in Campylobacter sp. FOBRC14 (GenBank Assembly ID: GCA_000287855.1). The monomeric NAD-IDH group seemed to be very small in composition, because only IDHs from the genus Campylobacter were included (see Supplementary Table S1 online). These monomeric IDHs shared very high sequence identity (>60%), and the putative NAD+-binding sites were conserved (Fig. 2). Campylobacter is the most common cause of bacterial foodborne illness and has drawn a lot of attention in recent years 44,45. It is surprising that the glyoxylate bypass is absent in a pathogen such as Campylobacter sp. FOBRC14, as no genes encoding isocitrate lyase and malate synthase can be found in its genome. The glyoxylate bypass ensures the bypass of two oxidative steps of the TCA cycle and permits the net incorporation of carbon during the growth of most microorganisms on acetate or fatty acids as the primary carbon source. The end products of the bypass can be used for gluconeogenesis and other biosynthetic processes 46. Most intracellular human pathogens, such as Salmonella typhimurium 47, Burkholderia pseudomallei 48 and M. tuberculosis 49, need the glyoxylate bypass for their virulence, because fatty acids are the only abundant sources of C2 carbon in mammalian tissues 50. Interestingly, the existence of the glyoxylate bypass in microorganisms has always been accompanied by at least one NADP-IDH isozyme, which provides the majority of NADPH to support bacterial growth on limited carbon sources 4. Because the glyoxylate bypass could not be detected in Campylobacter sp. FOBRC14, it is understandable that an NAD+-specific IDH, rather than an NADP+-specific IDH, was found in this organism. The finding also suggests that the infection mechanism of Campylobacter sp. FOBRC14 may be different from that of pathogens with the glyoxylate bypass.
As a eukaryotic NAD-IDH, OlIDH shared very similar kinetic properties with hetero-oligomeric NAD-IDHs from other eukaryotic cells. The Km value for NAD+ of OlIDH (136.6 µM) was in a range similar to those determined for Yarrowia lipolytica yeast (136 µM) 51 and rats (148.9 µM) 52. It was higher than that for humans (70 µM) 53 and lower than that for budding yeast (210 µM) 54. When compared to homodimeric NADP-IDHs of the type II subfamily, OlIDH showed much lower affinity to its coenzyme than NADP-IDHs do to NADP+, such as in the wild pig (5.6 µM) 55, rat (11.5 µM) 56 and budding yeast (20 µM) 57. Due to the decrease in cofactor affinity, OlIDH has a much lower kcat/Km (0.444 µM⁻¹ s⁻¹) than its type II NADP-IDH counterparts (5.96 µM⁻¹ s⁻¹ for wild pig and 9.1 µM⁻¹ s⁻¹ for rat) 55,57. The poor performance of OlIDH in catalysis may be understood as a latent ancient phenotype, thereby providing more evidence for the age of OlIDH among the type II subfamily. The comparison of kinetic parameters shows that the preference of CaIDH for NAD+ over NADP+ (61-fold) was far lower than the preference of monomeric NADP-IDHs for NADP+ over NAD+, such as S. lividans TK54 (85,000-fold) 22 and C. glutamicum (50,000-fold) 27, which suggests that CaIDH is an old and inefficient enzyme in using NAD+. Both OlIDH and CaIDH represent ancient members in the type II and monomeric subfamily, respectively. The modern, sophisticated type II homodimeric NADP-IDH and monomeric NADP-IDH were very possibly refined from old NAD+-utilizing ancestors through evolution, as partially evidenced by the fact that just two point mutations in the coenzyme-binding sites of OlIDH and CaIDH were sufficient to convert them to NADP+-utilizing enzymes (Table 1).

Conclusion
In the present study, we refined and expanded the phylogenetic classification of the IDH protein family by dividing it into the type I, type II, and monomeric subfamilies and identifying two new groups: one group of type II NAD-IDHs, represented by OlIDH, and one group of monomeric NAD-IDHs, represented by CaIDH. Thus, the classification of the IDH protein family in the view of coenzyme specificity is now complete. OlIDH and CaIDH were heterologously produced and enzymatically characterized in detail. Although the NAD+ specificity of the two enzymes was confirmed by kinetic analysis, both enzymes were demonstrated to be inefficient NAD+-utilizing enzymes. The coenzyme specificity of both enzymes could be completely altered from NAD+ to NADP+ by merely mutating two coenzyme-binding amino acids, thus suggesting the ancestral positions of OlIDH and CaIDH in the type II and monomeric subfamilies, respectively. Further studies are clearly needed to understand these two novel groups of IDH in the areas of structure determination and catalytic mechanism investigation.
www.nature.com/scientificreports SCIENTIFIC REPORTS | 5 : 9150 | DOI: 10.1038/srep09150

Methods
Gene Synthesis. IDHs from Ostreococcus lucimarinus CCE9901 (OlIDH, GenBank accession number: ABP01147), Micromonas sp. RCC299 (MiIDH, GenBank accession number: XP_002502450), Campylobacter sp. FOBRC14 (CaIDH, GenBank accession number: EJP74315) and Campylobacter curvus (CcIDH, GenBank accession number: WP_018136314) were the four targets of this study. Full-length genes encoding these four proteins were synthesized through the gene synthesis service of Generay Biotech Co., Ltd. (Shanghai, China). The coding sequences of the four genes were codon optimized by selecting only the most preferred codons according to the Escherichia coli codon bias. The artificial genes were then inserted into the expression vector pET-28b(+) between the Nde I and Xho I sites, thus generating four recombinant plasmids, pET-OIIDH, pET-MiIDH, pET-CaIDH and pET-CcIDH. The gene sequences were confirmed by sequencing.
Overexpression and Purification of Wild-type and Mutated Enzymes. E. coli Rosetta (DE3) cells were transformed with pET-OIIDH, pET-MiIDH, pET-CaIDH, pET-CcIDH, or recombinant plasmids carrying the mutated IDH genes and grown at 37 °C with vigorous shaking in LB medium containing 30 µg/ml kanamycin and 25 µg/ml chloramphenicol. Then, cells were inoculated in 50 ml fresh LB medium with the same antibiotics. When the OD600 of the culture reached 0.6, isopropyl-1-thio-β-D-galactopyranoside was added to the culture at a final concentration of 0.1 mM with subsequent cultivation overnight at 20 °C. Cells were harvested by centrifugation at 4,000 rpm for 15 min and then resuspended in lysis buffer (50 mM Tris-HCl, pH 7.5, 500 mM NaCl). The insoluble debris was removed by centrifugation at 12,000 × g for 20 min at 4 °C. Then, enzymes with the 6×His-tag were purified by using BD TALON Metal Affinity Resins (Clontech, USA), according to the manufacturer's instructions. The expression abundance and purification homogeneity were verified by sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis (PAGE).
Enzyme Assay. The activities of wild-type and mutant enzymes were assayed by a modification of the method by Cvitkovitch et al. 58. Activity assays were carried out at 25 °C in 1-ml cuvettes (1-cm light path) containing 35 mM Tris-HCl buffer (pH 7.5), 2 mM MgCl2 or MnCl2, 1.5 mM DL-isocitrate, and 1.0 mM NAD+. The increase in NADH was monitored at 340 nm with a thermostated Cary 300 UV-Vis spectrophotometer (Varian, USA), using a molar extinction coefficient of 6.22 mM⁻¹ cm⁻¹. One unit of enzyme activity represented the reduction of 1 µmol of NAD+ per minute. Protein concentrations were determined using the Bio-Rad protein assay kit (Bio-Rad, USA) with bovine serum albumin as the standard.
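To make the unit conversion in this assay explicit, the sketch below converts a rate of absorbance change at 340 nm into specific activity via the Beer-Lambert law, using the extinction coefficient, path length, and cuvette volume stated above. It assumes one unit corresponds to 1 µmol of NADH formed per minute; the function name and the example absorbance slope and enzyme amount are hypothetical, not measured values from the study.

```python
NADH_EXTINCTION_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm, as stated in the text


def specific_activity(delta_a340_per_min: float,
                      enzyme_mg: float,
                      assay_volume_ml: float = 1.0,
                      path_length_cm: float = 1.0) -> float:
    """Specific activity in U/mg, assuming 1 U = 1 umol NADH formed per minute."""
    # Beer-Lambert: rate in mM/min, which equals umol per ml per min.
    rate_mm_per_min = delta_a340_per_min / (NADH_EXTINCTION_MM_CM * path_length_cm)
    units = rate_mm_per_min * assay_volume_ml  # umol/min in the cuvette
    return units / enzyme_mg


# Hypothetical reading: 0.45 absorbance units per minute from 1 ug of enzyme.
example = specific_activity(delta_a340_per_min=0.45, enzyme_mg=0.001)
print(f"{example:.1f} U/mg")
```

The hypothetical inputs were chosen only to give a value of the same order as the specific activities reported in the Results; they do not reproduce any actual measurement.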
Kinetic Analysis. To measure the Michaelis constant (Km) values of the wild-type and mutant enzymes for NAD+ and NADP+, the isocitrate concentration was fixed at 1.0 mM while the cofactor concentration was varied. Apparent maximum velocity (Vmax) and Km values were calculated by nonlinear regression using Prism 5.0 (GraphPad Software, USA). All kinetic parameters were obtained from at least three measurements.
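As an illustration of what such a nonlinear regression does, the sketch below fits the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]), to rate data by least squares. It is not the procedure implemented in Prism: it is a minimal stand-in, assuming a Hanes-Woolf linearization for the starting values followed by Gauss-Newton refinement, and the data points are synthetic.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate (cofactor) concentration s."""
    return vmax * s / (km + s)


def fit_michaelis_menten(s_values, v_values, iterations=50):
    """Least-squares estimate of (Vmax, Km) from paired concentrations and rates."""
    n = len(s_values)
    # Starting values from the Hanes-Woolf line: s/v = s/Vmax + Km/Vmax.
    ys = [s / v for s, v in zip(s_values, v_values)]
    mean_s = sum(s_values) / n
    mean_y = sum(ys) / n
    sxx = sum((s - mean_s) ** 2 for s in s_values)
    sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(s_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_s
    vmax = 1.0 / slope
    km = intercept / slope

    # Gauss-Newton refinement on the untransformed rates.
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_values, v_values):
            denom = km + s
            j_vmax = s / denom                  # d(rate)/d(Vmax)
            j_km = -vmax * s / denom ** 2       # d(rate)/d(Km)
            resid = v - vmax * s / denom
            a11 += j_vmax * j_vmax
            a12 += j_vmax * j_km
            a22 += j_km * j_km
            b1 += j_vmax * resid
            b2 += j_km * resid
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        # Solve the 2x2 normal equations for the parameter update.
        d_vmax = (b1 * a22 - a12 * b2) / det
        d_km = (a11 * b2 - a12 * b1) / det
        vmax += d_vmax
        km += d_km
        if abs(d_vmax) < 1e-12 and abs(d_km) < 1e-12:
            break
    return vmax, km


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Vmax = 10 and Km = 0.25 (mM).
    conc = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
    rates = [mm_rate(s, 10.0, 0.25) for s in conc]
    print(fit_michaelis_menten(conc, rates))
```

Fitting the untransformed rates, rather than relying on the linearized plot alone, avoids the distortion of error weighting that linear transformations introduce; the linearization is used here only to obtain starting values.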
Temperature and pH Effects. The effects of temperature and pH on the activity of recombinant OlIDH and CaIDH were determined using the assay method described above. The activities of purified recombinant OlIDH and CaIDH were assayed in 35 mM Tris-HCl buffer between pH 6.5 and 9.5 in the presence of Mn2+ or Mg2+. The optimum temperature was determined over the range 25°C to 55°C. The thermostability of recombinant OlIDH and CaIDH was determined by heat inactivation: enzyme aliquots were incubated at 25°C to 50°C for 20 min, then immediately cooled on ice, and the residual enzyme activity was measured using the standard enzyme assay.
Metal Ion Effects. The effects of different metal ions on the activities of recombinant OlIDH and CaIDH were determined using the standard assay method, including 2 mM monovalent ions (K+, Li+, Na+, and Rb+) and divalent ions (Ca2+, Co2+, Cu2+, Mg2+, Mn2+, Ni2+, and Zn2+).
Circular Dichroism Spectroscopy. Circular dichroism (CD) spectroscopy was conducted using a Jasco model J-810 spectropolarimeter (Oklahoma City, OK, USA). The ellipticity measurements, as a function of wavelength, were performed as described previously 59 . Briefly, purified protein samples (0.3 mg/mL) were prepared in 50 mM sodium phosphate and 60 mM NaCl (pH 7.5). The ellipticity (θ) was obtained by averaging three scans of the enzyme solution between 200 and 260 nm at 0.5-nm increments. The mean molar ellipticity [θ] (deg cm² dmol⁻¹) was calculated from [θ] = θ/(10nCl), where θ was the measured ellipticity (millidegrees), C was the molar concentration of protein, l was the cell path length in centimeters (0.1 cm), and n was the number of residues per subunit of enzyme (415 for OlIDH and the mutant, 737 for CaIDH and the mutant).
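The conversion above can be written as a small function. In this Python sketch, the path length (0.1 cm), protein concentration (0.3 mg/mL) and residue count (415 for OlIDH) come from the text, whereas the subunit molecular mass and the raw ellipticity reading are hypothetical placeholders, since neither is given in this section.

```python
def molar_concentration(mg_per_ml, subunit_mass_da):
    """Convert mg/mL to mol/L (mg/mL equals g/L; divide by g/mol)."""
    return mg_per_ml / subunit_mass_da


def mean_molar_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Return [theta] in deg cm^2 dmol^-1 from [theta] = theta / (10 n C l)."""
    return theta_mdeg / (10.0 * n_residues * conc_molar * path_cm)


if __name__ == "__main__":
    # Hypothetical subunit mass (Da) and raw reading (mdeg), for illustration only.
    assumed_mass_da = 46000.0
    raw_theta_mdeg = -10.0
    conc = molar_concentration(0.3, assumed_mass_da)
    print(mean_molar_ellipticity(raw_theta_mdeg, conc, path_cm=0.1, n_residues=415))
```

Dividing by the residue count n normalizes the signal per peptide bond, which is what allows spectra of the 415-residue and 737-residue enzymes to be compared on one scale.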
MALDI-TOF/TOF mass spectrometry. Mass spectrometry analyses were conducted using an AB SCIEX MALDI TOF-TOF 5800 Analyzer (AB SCIEX, USA) equipped with a neodymium:yttrium-aluminum-garnet laser (wavelength 349 nm), in linear high-mass positive-ion mode. Sinapinic acid (SA) was used as the matrix, and trifluoroacetic acid (TFA) was applied as an ionization auxiliary reagent. The TOF/TOF calibration mixtures (AB SCIEX) were used to calibrate the spectrum to a mass tolerance within 10 ppm. The MS spectra were processed using TOF-TOF Series Explorer software (V4.0, AB SCIEX).
A typical IDH was used as bait to identify similar IDH sequences in the protein database by performing a BLAST Link search (http://www.ncbi.nlm.nih.gov/sutils/blink.cgi?mode=query). Redundancy in the query results was eliminated by keeping one IDH sequence for each species and removing all other identical IDH sequences from the same species. IDH sequences with relatively high homology to the bait sequence were taken for phylogenetic analysis, and their coenzyme binding sites were first evaluated by sequence alignment to confirm that their coenzyme specificity was identical to that of the query IDH. Other IDH sequences among the search results were discarded either because their coenzyme usage differed from that of the bait IDH, as predicted by alignment of the coenzyme binding sites, or because their sequence identity with the query IDH was relatively low, which may suggest a different position on the phylogenetic tree. By applying this principle, 197 IDH sequences in total, representing the eleven subgroups encompassed by the three IDH subfamilies, were chosen for phylogenetic analysis. IDH sequences from diverse sources were downloaded from GenBank via the National Center for Biotechnology Information web site (http://www.ncbi.nlm.nih.gov/). The bootstrapped neighbor-joining tree was constructed with the MEGA 6 software (http://www.megasoftware.net/), based on the sequence alignment produced by the Clustal X program (ftp://ftp.ebi.ac.uk/pub/software/clustalw2) 60,62 . To improve the accuracy of the phylogenetic analysis, other tree-building algorithms were also employed, namely the UPGMA, Maximum Likelihood and Minimum Evolution methods. Malate dehydrogenase (MDH) was included as an outgroup in the phylogenetic study to examine whether the IDH tree would be disturbed by the addition of a different protein.
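The redundancy-elimination step described above can be sketched in a few lines of Python. This is only one possible reading of the rule, assuming that a record is redundant when another record from the same species carries an identical sequence; a stricter one-per-species filter would key on the species name alone. The record layout, the function name, the "dup_1" label and the toy sequences are hypothetical; only the two accession numbers are taken from the text.

```python
def remove_redundancy(records):
    """Keep the first record for each (species, sequence) pair.

    records: iterable of (species, accession, sequence) tuples.
    Returns a list in the original order with exact duplicates removed.
    """
    seen = set()
    kept = []
    for species, accession, sequence in records:
        key = (species, sequence.upper())
        if key in seen:
            continue  # identical sequence already kept for this species
        seen.add(key)
        kept.append((species, accession, sequence))
    return kept


if __name__ == "__main__":
    # Toy sequences for illustration; these are not real IDH sequences.
    hits = [
        ("Ostreococcus lucimarinus", "ABP01147", "MSKIKV"),
        ("Ostreococcus lucimarinus", "dup_1", "MSKIKV"),
        ("Campylobacter sp. FOBRC14", "EJP74315", "MAKIIY"),
    ]
    for species, accession, _ in remove_redundancy(hits):
        print(species, accession)
```

Keeping the first occurrence preserves the ranking of the search results, so the retained record for each species is the highest-ranked hit.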