Introduction

Bioactive natural products containing reduced phosphorus functional groups are produced by an array of bacteria and fungi. These compounds are chemically categorized according to the redox states of their phosphorus centers, falling into two groups: phosphonates (P valence=+3), characterized by a single carbon to phosphorus (C–P) bond, and phosphinates (P valence=+1), which typically contain two carbon to phosphorus bonds (resulting in C–P–C bond motifs).

The bioactivities of reduced phosphorus compounds are usually attributed to their molecular mimicry of phosphoryl transition state intermediates or phosphate esters, which are central to many biological processes.1 In addition, the C–P bond confers substantial chemical stability and resistance to enzymatic breakdown. Thus, these compounds are resistant to hydrolysis by ubiquitous phosphatases that catabolize structurally similar phosphate esters and anhydrides. In light of these properties, it is not surprising that numerous synthetic compounds and natural products containing C–P bonds have found use in medicine and agriculture as antiparasitics, antibacterials, herbicides and antivirals.1, 2

The useful properties of phosphonate natural products have prompted numerous studies regarding their biosynthetic origins and we now understand much about the production of fosfomycin,3 a clinically utilized antibacterial, FR-900098,4 an antimalarial candidate, and the antimicrobials rhizocticin5 and dehydrophos.6 These bioactive phosphonates share early biosynthetic steps, but diverge significantly in later steps to create end-product complexity. Importantly, the enzyme phosphoenolpyruvate phosphonomutase is, with a single exception, shared by all phosphonate biosynthetic pathways. Conservation of this enzyme has been exploited for identification of the biosynthetic gene clusters for many known C–P compounds, and for the discovery of new phosphonate/phosphinate producers.7

In contrast to the significant molecular diversity seen among the phosphonates, all known phosphinate natural products are structurally similar. Members of this class include phosphinothricin-tripeptide (PTT)8, 9 (also called bialaphos) and phosalacine10 (PAL). Both molecules incorporate the nonproteinogenic phosphino-amino acid phosphinothricin (PT), differing by only one amino-acid substituent (phosphinothricyl-alanyl-leucine in PAL and phosphinothricyl-alanyl-alanine in PTT, Figure 1). PT is a potent glutamine synthetase inhibitor and the disruption of this enzyme in plants leads to ammonia-dependent chloroplast bleaching.11 This activity led to the commercial development of synthetic PT for use in agricultural weed control (for example, Basta and Liberty, Bayer Crop Science).

Figure 1

Comparison of phosphonate and phosphinate gene clusters and associated biosynthetic pathways. (a) Comparison of gene clusters encoding PTT and PAL biosynthetic enzymes with the phosphinate biosynthetic cluster of unknown function from Streptomyces sp. WM6386. Open reading frames shown in black do not have proposed roles in the biosynthesis of PTT or PAL. Most ORFs shown in blue have roles in the synthesis and activation of phosphonoformate, an early intermediate. Also included is phpI (bcpA), which shares sequence and functional similarity to ppm, the gene associated with step I. Green ORFs are similar to those encoding enzymes utilized in glycolysis and the TCA cycle. Red ORFs encode proteins likely involved in non-ribosomal peptide synthesis. Yellow ORFs contain transmembrane domains. Gray ORFs likely encode proteins involved in tailoring reactions or self-resistance. ORFs encoding transcriptional regulators are colored brown. (b) Model for PTT and PAL biosyntheses adapted from multiple references and this work. ORF assignments to biosynthetic steps are based on published work or by inference from gene similarity. Gene nomenclature from the S. viridochromogenes cluster is used throughout. Cofactors and water have been omitted for clarity. Label 3C in step VIII denotes the input of a three-carbon compound such as glycerate or 3-phosphoglycerate. Label ‘AA’ in step XVI indicates the addition of two amino-acid residues. For PTT biosynthesis, two alanine residues are added; for PAL, one alanine and one leucine are incorporated. The brackets surrounding the methyl group in step XVII indicate that the donor for this reaction has not been rigorously shown. (c) Putative phosphonate biosynthetic gene cluster from F. alni. (d) Hypothetical biosynthetic pathway for a predicted phosphonic acid metabolite deduced from F. alni genome mining.
The GenBank locus identifiers are those from the published genome sequence.51 Homologs from early PTT biosynthesis allowing for plausible pathway prediction are noted in brackets.

Studies on PTT biosynthesis were initiated nearly 30 years ago, making it one of the most thoroughly studied of the reduced phosphorus natural products. The current biosynthetic model, which is the culmination of numerous studies carried out in Streptomyces hygroscopicus and Streptomyces viridochromogenes,1 includes over 18 experimentally characterized or postulated steps (Figures 1a and b).12 However, despite a significant body of prior work,1, 12 questions remain regarding PTT biosynthesis, including poorly characterized steps required for conversion of phosphonoformate to carboxyphosphonoenolpyruvate (Figure 1b, steps VIII and IX) and the timing and mechanisms underlying P-methylation (step XVII), peptide bond formation (step XVI) and chain release. The identity of the transporter(s) that secrete the mature, polar tripeptide from producing cells also remains unknown. In addition, based on the experimentally established boundaries for the S. viridochromogenes PTT cluster,13, 14 the S. hygroscopicus locus had not yet been fully sequenced. Thus, only 15 of the 24 ORFs that comprise the minimal S. viridochromogenes locus had sequenced S. hygroscopicus orthologs.13, 14 In contrast to PTT, few studies have targeted PAL biosynthesis. Based on the similarity of PTT and PAL, it is likely that their biosynthetic clusters are similar, with minor differences leading to the incorporation of variant terminal amino acids. However, the biosynthetic locus from the only known producer, Kitasatospora phosalacinea, has yet to be identified or sequenced.

To address these unanswered questions in phosphinate biosynthesis, we completed the sequence of the S. hygroscopicus PTT biosynthetic cluster and also that of the PAL biosynthetic locus from K. phosalacinea. These were systematically compared along with the previously published PTT locus from S. viridochromogenes, providing new insights into the biosynthesis of these important natural products. Using these gene clusters as a model, we also analyzed a novel phosphinate found in Streptomyces sp. WM6386 and S. sviceus, as well as a related cluster found in Frankia alni. Collectively, our data demonstrate the utility of comparative genetic analysis and genome mining for discovery of bioactive reduced phosphorus compounds and characterization of their biosynthetic pathways.

Materials and methods

Fosmid libraries of Streptomyces strain WM6386, Kitasatospora phosalacinea DSM 43860 and Streptomyces hygroscopicus ATCC 21705 were constructed and screened as previously described.7 DNA sequencing was performed at the WM Keck Center for Comparative and Functional Genomics, University of Illinois (Urbana IL, USA). Sequence chromatograms were aligned and edited using Sequencher 4.7 (Gene Codes Corp., Ann Arbor, MI, USA).

Protein homologs encoded within multiple phosphonate gene clusters were compared with ClustalW15 using default parameters. Putative enzyme catalytic domains were identified with InterProScan. Newly sequenced genes were translated in silico and putative translation start sites were selected using GeneMark.hmm16 with Streptomyces coelicolor codon preferences and by comparison against close homologs in GenBank. Hypothetical polypeptides were interrogated against the non-redundant database using FASTA.17 Non-ribosomal peptide synthetase (NRPS) enzyme adenylation domain substrate specificity residues were identified using ‘NRPSpredictor’ online software at http://www-ab.informatik.uni-tuebingen.de/software18 and by using antiSMASH 2.0.19 Known NRPS enzymes with specificity residues similar to those identified in this work were identified using NRPS BLAST20 online at http://nrps.igs.umaryland.edu/nrps and antiSMASH. Similarity scores between NRPS specificity codes were calculated using AlignX software integrated into the VectorNTI Advance 11 analysis suite from Invitrogen (Carlsbad, CA, USA).

Results and discussion

Sequencing the PTT and PAL biosynthetic clusters from S. hygroscopicus and K. phosalacinea

To isolate the S. hygroscopicus PTT and K. phosalacinea PAL biosynthetic gene clusters, we constructed and screened fosmid libraries using the phosphoenolpyruvate phosphonomutase encoding gene, ppm, as a molecular probe. Two overlapping clones carrying the S. hygroscopicus PTT locus were identified and sequenced (GenBank:KP026916), whereas a single clone spanning the entire K. phosalacinea PAL locus was obtained and sequenced (GenBank:KP185121).

Systematic comparisons of the two PTT loci with the PAL locus revealed strict synteny and homologs for every gene within all three phosphinate biosynthetic loci (Figure 1a), suggesting that the biosynthetic pathways for the three organisms are nearly identical. In general, the S. hygroscopicus and S. viridochromogenes proteins were more similar to each other (~60–96% identity) than they were to those from K. phosalacinea (~30–87% identity) (Table 1). The gene cluster boundaries, which have been experimentally defined in S. viridochromogenes,13, 14 are also conserved in the S. hygroscopicus and K. phosalacinea loci. The ORFs located upstream of the first conserved gene (phpA) do not share appreciable similarity or orientation between any of the clusters. The 3′-ends of the gene clusters are marked by the characterized transcriptional regulators phpR in S. viridochromogenes and brpA in S. hygroscopicus21, 22, 23 (Figure 1a). A phpR/brpA homolog, which we assume controls expression of the PAL locus, was found in the corresponding location in K. phosalacinea. These regulatory proteins share the lowest amino-acid identities among homologs found in the three clusters (Table 1). We additionally found two conserved genes of unknown function (orf416 and orf192) downstream of the phpR homologs in the S. viridochromogenes and S. hygroscopicus clusters. The orf416 translation products share 78.9% identity, whereas those of orf192 share 84.3% identity. These ORFs encode putative MFS-family transporters and XRE-family transcriptional regulators, respectively. Interestingly, neither ORF is required for the heterologous expression of the S. viridochromogenes locus in Streptomyces lividans,14 and deletion of orf416 does not affect PTT production in the native strain.13 These genes are absent in the K. phosalacinea PAL locus.
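As an illustration of how pairwise identity values of this kind can be derived from an alignment, the following minimal sketch counts identical residues over the aligned columns in which neither sequence has a gap. The function name, the gap-handling convention and the toy sequences are assumptions made for illustration; the alignment software cited in the Materials and methods may use a different denominator, so this is not a reproduction of the values in Table 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and both strings come from the same
    pairwise (or multiple) alignment, so they have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the two residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (made up, not taken from any sequence in this work).
print(percent_identity("MKT-LV", "MRTALV"))
```

In the toy example, five columns are compared and four match, so the function reports 80% identity.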

Table 1 Identity comparisons of S. hygroscopicus, K. phosalacinea, F. alni and Streptomyces WM6386 translation products against S. viridochromogenes PTT orthologs

Ordering the thiotemplate assembly line in PAL and PTT biosynthesis

Formation of peptide natural products, including PAL and PTT, commonly involves multidomain NRPS-family proteins. These complex enzymes use substrate-specific catalytic domains to determine the structure and sequence of their peptide products. Typical NRPS proteins include amino-acid recognition and adenylation (A) domains that activate and load specific amino acids, peptidyl carrier protein (PCP) domains that carry the activated amino acids via a thioester linkage, and condensation (C) domains that catalyze peptide bond formation between tethered intermediates; additional modification domains can also be present (see review24 for additional detail). NRPS proteins are highly variable, with the number and order of individual domains defining the final peptide products.

Studies on PTT biosynthesis in S. viridochromogenes established the role of PhsA in loading PT precursor N-acetyl(demethyl)phosphinothricin25, 26, 27 and of PhsB and PhsC in loading the two alanine residues.26, 27 The homologous proteins in S. hygroscopicus are likely to have the same activity. However, given the divergent composition of the PAL and PTT peptides, the conservation of PhsA, PhsB and PhsC in K. phosalacinea is somewhat surprising. All three sets of NRPS homologs share identical domain architectures, with A-PCP didomains in the PhsA proteins, PCP-C-A-PCP domains in PhsB homologs and C-A-PCP domains in PhsC proteins. Thus, differences in NRPS architecture are unlikely to underlie the incorporation of divergent amino acids in PTT and PAL. A parsimonious argument suggests that PhsA is responsible for loading the PT precursor in all three organisms. Deciphering the loading specificities of PhsB and PhsC is more complex.

In S. viridochromogenes, both PhsB and PhsC activate alanine,26, 27 although the order in which they act during peptide assembly is not known. However, because PAL incorporates both alanine and leucine, the Kitasatospora PhsB and PhsC enzymes must load different amino acids. Because both PAL and PTT share N-terminal PT-Ala residues, identifying which protein loads alanine in Kitasatospora will also establish whether PhsB or PhsC acts first during PTT biosynthesis. To do this, we examined the so-called specificity codes of the PhsB and PhsC orthologs (Table 2). This code is derived from a set of 10 residues that comprise the amino-acid recognition motif of NRPS adenylation domains, allowing its use as a powerful predictor of amino-acid-loading specificity.28 All three PhsB homologs share nearly identical (100% similarity score) specificity residues, and given experimental data from S. viridochromogenes PhsB,26 it is very likely that the PhsB proteins from S. hygroscopicus and K. phosalacinea also activate alanine. Likewise, the specificity residues from the S. viridochromogenes and S. hygroscopicus PhsC proteins are identical, again predicting the loading of Ala consistent with the structure of the PTT peptide. In contrast, the K. phosalacinea PhsC specificity residues differ substantially from the other PhsC proteins (Table 2). Moreover, these residues predict the loading of Leu (80% identity to Q8G982_m1__leu code), the terminal amino acid in PAL. Taken together, these analyses imply that the functional order of the peptide synthetases used in the biosynthesis of both PAL and PTT is PhsA, followed by PhsB and PhsC (Figure 2).
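The comparison of specificity codes described above can be sketched as a position-by-position identity calculation over the 10 residues, followed by selection of the closest reference code. This is a simplified stand-in for the scoring performed by the tools cited in the Materials and methods, not their actual algorithm. The function names and all code strings below are arbitrary placeholders chosen for illustration; they are not the signatures listed in Table 2.

```python
def code_identity(code_a: str, code_b: str) -> float:
    """Percent of positions at which two equal-length specificity codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("specificity codes must have the same length")
    matches = sum(a == b for a, b in zip(code_a.upper(), code_b.upper()))
    return 100.0 * matches / len(code_a)


def best_match(query: str, references: dict) -> tuple:
    """Return (reference name, percent identity) for the closest reference code."""
    scored = {name: code_identity(query, code) for name, code in references.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]


# Placeholder 10-residue codes (hypothetical, for illustration only).
references = {
    "reference_1": "ACDEFGHIKL",
    "reference_2": "MNPQRSTVWY",
}
query = "ACDEFGHIMN"

print(best_match(query, references))
```

With a 10-residue code, each mismatch lowers the identity by 10 percentage points, so a reported identity of 80% corresponds to agreement at 8 of the 10 positions.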

Table 2 Comparison of selectivity-conferring residues from phosphinothricin-tripeptide and phosalacine NRPS enzymes
Figure 2

Thiotemplate assembly and chain release models for PTT and PAL biosyntheses. Illustration showing NRPS domain architecture common to the PhsA, PhsB and PhsC enzymes of S. viridochromogenes, S. hygroscopicus and K. phosalacinea. PhsA from S. viridochromogenes loads both N-acetyldemethylphosphinothricin and N-acetylphosphinothricin; the desmethyl form is shown. The illustrated PhsA/B/C enzyme order is supported by the analyses in this work. PAL PhsC specificity for leucine awaits biochemical confirmation. (a) shows the use of either or possibly both PhpL and PhpM homologs in assembly line product release. Alternatively, one thioesterase homolog might be involved in product release where the other is involved in an editing role, as previously suggested.30 (b) shows a possible role for PTT/PAL PhpL homologs as PCP-to-PCP domain transacylases. An increasing number of Type II thioesterase homologs containing the variant GXCXG thioesterase motifs have been identified in natural product biosynthetic clusters where they function as transacylases. This model would then leave PhpM homologs containing typical GXSXG thioesterase motifs as product-releasing thioesterases. We note that spontaneous hydrolysis might also contribute to chain release in both models.

Determining the N-acetyl(demethyl)phosphinothricin NRPS specificity code

Studies of PTT biosynthesis showed that N-acetyldemethylphosphinothricin and N-acetylphosphinothricin25, 27 are substrates for PhsA, whereas similar compounds lacking N-acetyl groups are not acceptable substrates.25, 27 This finding is also surprising because most known NRPS enzymes incorporate α-amino acids. To investigate why the N-acetylation of (demethyl)phosphinothricin is required for amino-acid activation we examined the specificity code of PhsA in the PTT and PAL loci. All three PhsA orthologs share an identical specificity code (Table 2) that differs from all others currently available in the NRPS database.19 Interestingly, the PhsA specificity code, which is most similar to those found in A-domains that load ornithine and glutamate (Table 3), lacks the aspartate residue of the canonical D-X8-K motif found in the binding pocket signature in most bacterial NRPS A-domains.28 This Asp residue coordinates the α-amine of amino-acid substrates and is therefore invariant in the majority of NRPS A-domains. In infrequent cases where NRPS A-domains specify a substrate other than an α-amine, the substrate-binding residues reflect this by exchanging the conserved Asp for a different residue.28 Thus the substitution of Val for Asp at the first position likely determines the enzyme’s specificity toward N-acetyl(demethyl)phosphinothricin, and the exclusion of unacylated (demethyl)phosphinothricin. This finding supports the idea that PhsA loads the PT precursor in all three pathways and may prove useful in defining the specificity of as yet uncharacterized NRPS proteins.
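The check described above, whether a 10-residue specificity code retains the Asp and Lys residues of the D-X8-K motif at its first and last positions, can be written as a short sketch. The function name and the placeholder codes (with X standing for unspecified residues) are assumptions for illustration; in particular, the presence of Lys at the last position of the Val-containing placeholder is assumed rather than taken from Table 2.

```python
def describe_anchor_residues(code: str) -> dict:
    """Report the first and last residues of a 10-residue specificity code
    and whether they match the Asp (D) and Lys (K) of the D-X8-K motif."""
    if len(code) != 10:
        raise ValueError("expected a 10-residue specificity code")
    code = code.upper()
    return {
        "first": code[0],
        "last": code[-1],
        "has_asp_anchor": code[0] == "D",
        "has_lys_anchor": code[-1] == "K",
    }


# Placeholder codes: 'X' marks positions left unspecified on purpose.
canonical_like = "DXXXXXXXXK"
val_substituted = "VXXXXXXXXK"  # Val in place of Asp at position 1 (Lys assumed)

print(describe_anchor_residues(canonical_like))
print(describe_anchor_residues(val_substituted))
```

A code flagged as lacking the Asp anchor would be a candidate for a substrate that does not present a free α-amine, which is the reasoning applied to PhsA in the text.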

Table 3 Comparison of PhsA selectivity residues against similar selectivity motifs from NRPS A-domains

Thioesterase function in PAL and PTT biosynthesis

In most NRPS assembly lines, product release is catalyzed by thioesterases (TE) that hydrolyze the mature product from the terminal PCP domain.29 These proteins are easily recognized by their α/β hydrolase fold and conserved GXSXG active site motif. Two types of TE have been observed. Most assembly lines contain integrated (Type I) TE domains fused within larger NRPS proteins. Previous analyses of the S. viridochromogenes PTT cluster revealed that none of the three NRPS proteins include Type I TE domains.26 Based on this observation, it was assumed that one of the two stand-alone (Type II) TEs would serve during product release. However, mutational analyses of the S. viridochromogenes phpL and phpM Type II genes showed that neither is required for PTT production, although production is substantially lowered in their absence.30 This finding led Eys et al. to suggest an alternative model in which a ‘mini-TE’ domain, discovered in the N-terminus of S. viridochromogenes PhsA, would be used in thiotemplate product release.30 In this scenario, PhsA TE would cleave the tripeptide from the PhsABC complex, whereas PhpL/M would serve a substrate-editing role, similar to characterized Type II TEs found in other natural product assembly lines. Significantly, our results revealed that the ‘mini-TE’ residues of S. viridochromogenes PhsA are not conserved in S. hygroscopicus or K. phosalacinea (Figure 3a). In addition, we could not detect conserved GXSXG motifs elsewhere among the PhsA orthologs. Thus, based on the assumption that all three organisms use the same mechanism for product release, the mini-TE model is unlikely to be correct.

Figure 3

Thioesterase signatures extracted from PTT and PAL biosynthetic genes. (a) Alignment of the first 59 N-terminal residues of PhsA from S. viridochromogenes, S. hygroscopicus and K. phosalacinea. The GXSXG motif hypothesized to be part of a mini-thioesterase domain30 in S. viridochromogenes PhsA is boxed. Arrowheads indicate equivalent positions in PhsAs from S. hygroscopicus and K. phosalacinea; the GXSXG motif found in the S. viridochromogenes synthetase is not conserved in the other PhsAs. Highly similar or identical residues are highlighted in gray. (b) The alignment of catalytic motifs extracted from PTT/PAL PhpL homologs with those from Type II thioesterase-like transacylases involved in zorbamycin (ZbmVIId), syringomycin (SyrC), coronamic acid (CmaE) and bactobolin (BtaH) biosyntheses. Each contains a cysteine-centered GXCXG variant thioesterase motif (boxed, with consensus residues indicated by arrowheads). Residues surrounding the conserved catalytic histidine found in most Type II thioesterases (SviridoPhpL His223) are also shown.

Despite their dispensability for PTT production, we believe that one of the Type II TEs, which are conserved in all three gene clusters, is likely to catalyze product release. It should be noted that the S. viridochromogenes double mutant lacking phpL and phpM produced PTT only at very low levels. Incomplete phenotypes such as this are challenging to interpret, and it is useful to recall that deletions of other core PTT biosynthetic genes have also yielded mutants with reduced production titers.31 In these cases, it has been argued that other enzymes in the cell can partially substitute for the pathway-specific gene products. Thus, the low-level PTT production seen in the phpL and phpM mutants could be enabled by other Type II TE homologs (including GenBank: ZP_07302110, ZP_07302075, ZP_07302004) found in the S. viridochromogenes draft genome. Alternatively, spontaneous hydrolysis from the terminal synthetase29 could also allow the low levels of production seen in the mutant.

It has been observed that phpL in both S. viridochromogenes30 and S. hygroscopicus23 encodes a variant (GXCXG) thioesterase motif, whereas phpM encodes the canonical GXSXG motif. Our analyses indicate that these motifs are also conserved in the K. phosalacinea PhpL/M homologs (Figure 3b). Type II TEs containing GXCXG motifs have been identified within various antibiotic gene clusters. Recent data suggest that these proteins act not as thioester hydrolases, but instead as transacylases that shuttle tethered intermediates between PCP domains located within NRPS assembly lines. Biochemically characterized examples of these PCP-to-PCP shuttles are found in the biosynthetic pathways of coronamic acid32 and syringomycin,33 whereas they have been proposed, but not tested, to function in the production of the bactobolins34 and zorbamycin.35 Moreover, it has been shown that modification of the GXSXG motif to GXCXG yields Type II TE variants with increased transacylase function.36 Interestingly, the PhsB homologs found in the PTT and PAL clusters contain two PCP domains, one of which cannot easily be assigned a function based on the standard logic of NRPS assembly lines. Based on the putative transacylase motif, we propose that PhpL catalyzes transfer of N-acetyl(demethyl)phosphinothricin between the PCP domain of PhsA and the N-terminal PCP domain of PhsB (Figure 2b). In this model, PhpM would be responsible for final product release. We recognize that, if this model is correct, neither function can be absolutely required because the S. viridochromogenes mutants described above produce some PTT.
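The distinction drawn above between canonical GXSXG and variant GXCXG motifs can be screened for with a simple pattern search, sketched below. The function name and the toy sequence are assumptions made for illustration; the sequence is made up and is not a PhpL or PhpM fragment. A motif hit alone does not establish thioesterase or transacylase function, which also depends on the protein fold and the surrounding catalytic residues.

```python
import re

# 'X' in the motif names means any residue; lookaheads allow overlapping hits.
MOTIF_PATTERNS = {
    "GXSXG": re.compile(r"(?=(G.S.G))"),  # canonical serine-centered motif
    "GXCXG": re.compile(r"(?=(G.C.G))"),  # cysteine-centered variant
}


def find_te_motifs(protein_seq: str) -> list:
    """Return (motif name, 1-based start position, matched residues) tuples,
    sorted by position in the sequence."""
    seq = protein_seq.upper()
    hits = []
    for name, pattern in MOTIF_PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Made-up sequence containing one example of each motif.
toy_sequence = "MAAGHSMGGTLLGWCAGA"
for hit in find_te_motifs(toy_sequence):
    print(hit)
```

For the toy sequence, the search reports a GXSXG hit starting at residue 4 and a GXCXG hit starting at residue 13.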

Using PTT biosynthesis to decipher a novel phosphinate cluster in Streptomyces sviceus and the environmental Streptomyces isolate WM6386

Until recently, PTT and PAL were the only known phosphinate natural products; however, examination of the phosphonate gene clusters identified by screening environmental actinomycete isolates suggests that additional examples will be found in nature.7 A recently identified gene cluster found in Streptomyces sp. WM6386 encodes homologs of enzymes that catalyze the first 10 steps of PT biosynthesis,7 which, if expressed, would direct the multistep transformation of phosphoenolpyruvate to the hydrogen-phosphinate intermediate phosphinopyruvate (Figure 1, Table 1). A highly similar gene cluster can be found in S. sviceus ATCC 29083 (Table 4). Although the S. sviceus cluster includes gaps and suspected sequencing errors, a comparison of gene order and conservation between the two clusters reveals a contiguous stretch of 25 genes that likely comprise the full putative phosphinate biosynthetic locus. Because biosynthetic genes leading to phosphinopyruvate comprise nearly half of the WM6386 locus (10 of 25), we designated this gene cluster mpbA-mpbR (for modified phosphinopyruvate biosynthesis).

Table 4 Genes clustered with phosphinopyruvate biosynthetic homologs in Streptomyces strain WM6386

In addition to genes predicted to encode phosphinopyruvate biosynthesis, the mpb locus encodes several proteins with Type II PKS domains and accessory proteins. Typical Type II PKS assembly lines consist of a series of trans-acting protein domains, including acyltransferase domains that tether CoA-modified carboxylates to the active site of an acyl carrier protein (ACP) and a trans ketosynthase domain that iteratively adds additional substrates to the ACP-bound substrate (see review24). In the mpb cluster, mpbF appears to encode a stand-alone ACP domain, whereas the adjacent mpbE encodes the phosphopantetheinyl transferase required to convert apo-MpbF to its active holo form. A putative CoA transferase, mpbG, is encoded within the cluster, but no acyltransferase genes were found, raising the question of how the putative CoA derivative produced by MpbG would engage the ACP. In addition, no ketosynthase-like proteins are encoded in the mpb cluster, suggesting that elongation of the phosphinopyruvate-derived backbone might not occur by typical PKS mechanisms. However, incorporation of a two-carbon unit might occur through the action of MpbM, a pyridoxal phosphate-dependent C-acetyltransferase homolog. Finally, the mpb cluster lacks a canonical thioesterase that would be required for releasing ACP-bound products, although it does encode a β-lactamase homolog. A β-lactamase with thioesterase activity was recently described in the fungal PKS-system responsible for atrochrysone biosynthesis,37 suggesting a possible role for mpbN. Several additional genes, including a putative Rieske-domain protein (mpbI), an amidohydrolase/decarboxylase (mpbJ), a short-chain alcohol oxidoreductase (mpbK) and an aminomutase (mpbL), might be involved in product-tailoring reactions. One of the more contextually unusual genes (mpbD) found within the cluster encodes a putative phosphite dehydrogenase. Because the predicted intermediate phosphonoformate is known to spontaneously decompose to phosphite and CO2, we speculate that this protein may be involved in recycling this adventitious side-product. Finally, the remaining mpb genes share similarity to transcriptional regulators (mpbA, mpbR) and transmembrane transport proteins (mpbBC and mpbH) that are often associated with natural product gene clusters.

Relationship of the PTT/PAL and mpb clusters to other reduced phosphorus biosynthetic loci

During our investigations into the biosynthesis of hydroxyethylphosphonate, a conserved intermediate in several phosphonate pathways, we noted a small gene cluster in the F. alni ACN14a genome (Figure 1c) that appears to direct the synthesis of a novel phosphonate.38, 39 Interestingly, this cluster is also related to the PTT/PAL gene clusters, encoding homologs of proteins that catalyze the first six steps of PTT biosynthesis leading to the production of phosphonoformate (Figure 4). Analysis of additional genes in the Frankia cluster suggests that the end product is not phosphonoformate. ORF fraal6371 in the F. alni cluster encodes a predicted ligase similar to pantoate-β-alanine ligase. Related members of the cytidylyltransferase superfamily can form peptide bonds40 and we propose that the F. alni enzyme modifies phosphonoformate via amide bond formation with an unknown amino acid, as in rhizocticin biosynthesis,5 or with a primary amine such as β-alanine (Figure 1d, reaction VII). Finally, a putative SAM-dependent O-methyltransferase (fraal6373) is likely to modify the phosphonate product as in dehydrophos6, 41 and fosfazinomycin,42 whereas a putative transporter (fraal6374) may be responsible for export of the final phosphonate product.

Figure 4

Metabolic branchpoints from studied reduced phosphorus antibiotic biosyntheses. A number of phosphonate and phosphinate biosynthetic pathways share conserved early biosynthetic steps. Illustrated are a number of transformations common to multiple reduced phosphorus antibiotics and how they relate to the F. alni, WM6386 and PTT/PAL biosyntheses discussed in the text.

Reduced phosphorus biosynthetic loci have emerged as rich sources of unusual biosynthetic enzymes, including two that are shared among some of the biosynthetic gene clusters described here: hydroxyethylphosphonate dioxygenase (HEPD), the enzyme encoded by phpD, and the phosphinate methyltransferase encoded by phpK. HEPD is an iron-dependent dioxygenase that catalyzes the unusual cleavage of hydroxyethylphosphonate to hydroxymethylphosphonate and formate.43 When phpD was first sequenced, there were no genes with significant homology found in GenBank13, 14 and its reaction mechanism was deemed biochemically unique.43 Since that time, it has become clear that HEPD forms a mechanistically unified family that includes hydroxypropylphosphonate epoxidase and methylphosphonate synthase.44 We compared the HEPD/PhpD crystallographic data with the peptide sequences of homologs from S. viridochromogenes, S. hygroscopicus, K. phosalacinea, WM6386 and F. alni using ESPript.45 We found that key residues of the S. viridochromogenes enzyme were conserved in each, including those coordinating hydroxyethylphosphonate and metal binding within the enzyme’s active site (Supplementary Figure S1). Because these important residues are conserved across the PhpD ortholog panel, it is likely that these enzymes are functional, and probably substrate-conservative, in their respective pathways. We also note that phpD genes are a useful bioinformatic hallmark for identifying reduced phosphorus pathways that most likely proceed through a hydroxymethylphosphonate intermediate, especially when found in conjunction with ppm.

In contrast to HEPD, much less is known about the phosphinate methyltransferase encoded by phpK and its orthologs. Most of the data establishing its role in PTT biosynthesis have been derived from genetic analyses and the isolation of biosynthetic intermediates from blocked mutants.46, 47, 48 Advances in bioinformatics led to the eventual realization that PhpK is a predicted member of the radical S-adenosyl-L-methionine family of enzymes,49 and the enzyme apparently requires methylcobalamin as a methyl donor. Considering that this enzyme probably utilizes at least two cofactors, proceeds through a radical mechanism and represents a novel path to direct phosphorus alkylation, much research interest surrounds the mechanistic enzymology of PhpK. More recently, a report by Werner et al.50 demonstrated the cloning and in vitro reconstitution of PhpK from K. phosalacinea to enable the study of direct phosphorus methylation. This publication represents the only other study, aside from the current work, to describe any portion of the PAL biosynthetic locus. The authors indicated that K. phosalacinea PhpK (GenBank: AHZ58300.1) shares >99% identity to the previously published S. viridochromogenes homolog. Our sequence comparisons gave a significantly different result, with PhpK from K. phosalacinea sharing only 80% identity with that of S. viridochromogenes. Indeed, no orthologous protein pairs from any of the three PTT/PAL producers shared such a high level of identity (see Table 1). The source of the discrepancy between our K. phosalacinea PhpK sequencing data and that of the prior report remains unclear.

Studies on reduced phosphorus natural products are increasingly important as we struggle to find new drug scaffolds to fight antibiotic-resistant bacterial infections. Given the wide spectrum of biological activities already recognized for compounds of this type, and their already proven use in medicine and agriculture, new family members represent important lead compounds for drug development. The work presented here sets the stage for further mechanistic investigations on known phosphinate peptide natural products and describes two promising biosynthetic loci for the discovery of new and possibly bioactive reduced phosphorus small molecules. Further, our analyses contribute to a growing catalog of reduced phosphorus enzymology, which will help in deconvoluting other reduced phosphorus loci and will aid genome sequence annotation efforts.

Abbreviations

Compound numbers, abbreviations and formal names are as follows: 1 phosphoenolpyruvate, 2 phosphonopyruvate, 3 phosphonoacetaldehyde, 4 hydroxyethylphosphonate, 5 hydroxymethylphosphonate, 6 phosphonoformaldehyde, 7 phosphonoformate, 8 CMP-5’-PF, 9 phosphonoformylglycerate (hypothetical intermediate), 10 carboxyphosphonoenolpyruvate, 11 phosphinopyruvate, 12 phosphinomethylmalate, 13 isophosphinomethylmalate, 14 α-keto-deamino-demethylphosphinothricin, 15 demethylphosphinothricin, 16 N-acetyl-demethylphosphinothricin, 17 N-acetyl-demethylphosphinothricyl-alanyl-alanine or N-acetyl-demethylphosphinothricyl-alanyl-leucine, 18 N-acetylphosphinothricyl-alanyl-alanine or N-acetyl-phosphinothricyl-alanyl-leucine, 19 phosphinothricyl-alanyl-alanine (PTT) or phosphinothricyl-alanyl-leucine (PAL), 20 N-phosphonoformyl-β-alanine.