Introduction

Polyketides and their semi-synthetic derivatives comprise a wide range of therapeutics in today’s pharmaceutical market, with annual sales of over US$20 billion.1 Key examples include antibiotics (erythromycin, tylosin), antifungals (nystatin, amphotericin), immunosuppressants (FK506, rapamycin) and anticancer agents (epothilone, mitomycin). These chemically diverse compounds are made by type I modular polyketide synthases (PKSs) from simple metabolites such as propionyl-CoA, malonyl-CoA and methylmalonyl-CoA. Native PKSs typically comprise five or more modules (a module being a structural subunit of the complete PKS complex) and generate molecules containing 12 or more carbon atoms in the polyketide backbone, which is commonly cyclized into rings that are functionalized with sugars. Since Katz and co-workers2 provided the first evidence that type I PKSs function in the linear order indicated by their DNA sequences, and that each step in the overall biosynthesis is determined by a single, specific catalytic domain, many academic and industrial groups have attempted rational engineering of PKSs to produce natural product analogs with improved or novel bioactivities. Despite the large investment of time and money, however, these efforts have met with very limited success, undoubtedly owing to our insufficient understanding of the enzymes. After more than 25 years of intense study, we still have limited knowledge of the protein−protein interactions, substrate specificities, kinetics and overall structures of these complex enzymes, although Khosla and co-workers3 have recently achieved the first in vitro reconstitution of a naturally occurring synthase, with slight modifications, and Skiniotis and co-workers4, 5 have provided the first structure of an entire PKS module. We strongly believe that, along with structural understanding, a fundamental biochemical understanding of these enzymes is required for the production of novel molecules by PKS engineering.
In this short review, we will discuss newly obtained knowledge of type I PKSs from our recent studies of these enzymes.

Our work, conducted at the Joint BioEnergy Research Institute, a US Department of Energy facility, centers on repurposing pharmaceutically relevant type I PKSs to produce a variety of short-chain compounds that can be useful as biofuels or industrial chemicals. Our standardized approach is first to construct the target PKS gene(s) that could generate the desired product, then to produce the soluble PKS protein(s) in Escherichia coli, and finally to use the purified material to produce detectable levels of the desired product in vitro. This enzyme system can also be employed to determine the kinetics, in particular the relative specificities, for the required substrates.

Substrate specificities of PKS domains

Recombination of various PKS domains or modules allows one, in principle, to produce a vast range of chemicals. We refer to this as repurposing. Initial efforts were dedicated to repurposing a PKS to produce short, highly branched carboxylic acids such as 2,2,4- (or 2,4,4-)trimethylpentanoic acid (isooctanoic acid), which would be bio-based substitutes for petroleum-derived gasoline after chemical reduction. Branched-chain starter substrates in the polyketide chain initiation reaction are required for production of highly branched polyketides. It was known that the acyltransferase (AT)-acyl carrier protein (ACP) loading didomain of the avermectin PKS could use a variety of molecules, including the branched-chain acyl-CoAs isobutyryl- and 2-methylbutyryl-CoA,6, 7 to initiate avermectin biosynthesis, but we were not able to produce a soluble, active loading didomain either as a stand-alone protein or linked to an extender module (module 1) that would give us the required diketide in vitro. Hence we turned to the lipomycin PKS, which was proposed to use isobutyryl-CoA to initiate α-lipomycin biosynthesis, and whose AT-ACP loading didomain is linked to module 1 to form the first polypeptide (LipPks1) of the PKS complex (Figure 1a).8 In our hands, linkage of LipPks1 to the erythromycin PKS thioesterase (TE) gave rise to soluble, active protein (LipPks1+TE) in E. coli. One of the key findings in our analysis was an unexpected substrate profile of the loading AT domain of the lipomycin PKS (Figure 1b).8, 9 AT domains of type I PKSs are the primary determinants of building block specificity in polyketide biosynthesis.
Previously, in vitro kinetic studies of the loading AT domain of the erythromycin PKS revealed its specificity for propionyl-CoA, although acetyl- and butyryl-CoA were also accepted as substrates with about 40-fold lower kcat/KM.10 In contrast, the lipomycin loading AT accepted four different substrates (propionyl-CoA, isobutyryl-CoA, 2-methylbutyryl-CoA and isovaleryl-CoA) within only an eightfold variance in kcat/KM. Because we were interested in generating 2,4,4-trimethylpentanoic acid, we also tested pivaloyl-CoA, which contains a tert-butyl group, as a potential substrate for the loading AT domain, but found very poor incorporation. Interestingly, 2-methylbutyryl-CoA showed a kcat/KM comparable to that of the proposed natural substrate, isobutyryl-CoA. This was not anticipated because an α-lipomycin analog that would use 2-methylbutyryl-CoA as the starter had not been reported in the native producer, Streptomyces aureofaciens Tü117.8 Nonetheless, the in vitro kinetic parameters suggested that an analog of α-lipomycin that used 2-methylbutyryl-CoA to start the biosynthesis should be produced in the native host if the intracellular concentration of 2-methylbutyryl-CoA were raised to a level sufficient to enable incorporation. To achieve this, we added isoleucine, the precursor of 2-methylbutyryl-CoA, to the S. aureofaciens Tü117 culture and subsequently identified the predicted novel analog 21-methyl-α-lipomycin (Figure 1a), which exhibited in vitro antibiotic activity similar to that of lipomycin.11 Substrate specificities of AT domains are usually assumed from the structures of the corresponding segments of polyketide products isolated from the culture fluid, an inference that depends on the particular culture conditions employed; intrinsic substrate specificity can only be determined by in vitro kinetic analysis.
These studies highlight the importance of using kinetic analysis to determine actual substrate specificities of AT domains, which, in this case, led to the production of a novel polyketide. To further explore analogs of known polyketides, we are currently analyzing intrinsic substrate specificities of extender ATs that incorporate rare malonyl-CoA derivatives,12 which may ultimately lead to the discovery of new drug candidates.
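The reasoning behind the isoleucine feeding experiment follows from a standard result for substrates competing for one active site: under Michaelis-Menten assumptions, the rate for each substrate is proportional to its kcat/KM multiplied by its concentration. The sketch below illustrates how raising the pool of one starter shifts the share of chains it initiates. The function name and every numerical value are hypothetical and chosen only for illustration; they are not measurements from the cited studies.

```python
def starter_fractions(specificity, concentration):
    """Return the fraction of chains initiated by each competing starter.

    For substrates competing for a single active site under
    Michaelis-Menten kinetics, v_i is proportional to (kcat/KM)_i * [S_i].
    """
    weights = {s: specificity[s] * concentration[s] for s in specificity}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Hypothetical: comparable kcat/KM for the two starters (arbitrary units)
kcat_over_km = {"isobutyryl-CoA": 1.0, "2-methylbutyryl-CoA": 1.0}

# Hypothetical intracellular pools before and after isoleucine feeding
before = {"isobutyryl-CoA": 100.0, "2-methylbutyryl-CoA": 1.0}
after = {"isobutyryl-CoA": 100.0, "2-methylbutyryl-CoA": 50.0}

for label, pools in (("before", before), ("after", after)):
    fractions = starter_fractions(kcat_over_km, pools)
    print(label, {s: round(f, 3) for s, f in fractions.items()})
```

With comparable specificity constants, the product ratio simply tracks the ratio of the starter pools, which is why an analog can be absent under one culture condition and appear under another.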

Figure 1

Repurposing the α-lipomycin biosynthetic pathway. (a) Proposed model for α-lipomycin and 21-methyl-α-lipomycin biosynthesis. (b) TE from erythromycin PKS (black box) was attached to the C-terminus of LipPks1 to release the product. The resulting engineered PKS (LipPks1+TE) produces 3-hydroxy carboxylic acids from various starter CoAs and methylmalonyl-CoA in the presence of NADPH in vitro. ACP, acyl carrier protein; AT, acyltransferase; DH, dehydratase; KR, ketoreductase; KS, ketosynthase; Nrps, non-ribosomal peptide synthetase; TE, thioesterase.

Unique substrate specificities of various domains in the borrelidin PKS were also uncovered in our work to produce adipic acid, an important diacid monomer used in the polymer industry and one of the most widely used commodity chemicals worldwide. To accomplish adipic acid biosynthesis through PKS engineering, we focused on the borrelidin PKS because it contains the only known loading AT domain that initiates polyketide biosynthesis with a carboxylated substrate (Figure 2a).13 The proposed natural substrate, based on the isolated borrelidin structure, is trans-1,2-cyclopentanedicarboxylic acid (trans-1,2-CPDA) activated to its cognate CoA. Our studies indicated not only that the loading AT requires substrates containing a terminal carboxyl moiety, but also that this loading didomain could use additional acidic substrates beyond trans-1,2-CPDA.14 When the first two protein subunits of the borrelidin PKS (BorA1 and BorA2) were incubated with succinyl-CoA and malonyl-CoA, the reduced condensation product, 3-hydroxyadipic acid, attached to the ACP in BorA2 was generated (Figure 2b). The extended diketide on the BorA2 ACP was detected with the phosphopantetheine (p-pant) ejection assay, in which the PKS is treated with trypsin and the peptide fragment containing the p-pant prosthetic group linked to the acyl chain is subjected to MS ionization, which results in the release of the p-pant-acyl chain from the peptide and subsequent internal cyclization of the p-pant moiety. The exact mass of the cyclized p-pant-acyl chain can be determined by LC/MS.15 Interestingly, replacement of succinyl-CoA with n-butyryl-CoA did not result in production of the corresponding diketide 3-hydroxyhexanoyl-ACP, indicating that either the loading AT or the ketosynthase (KS) domain of BorA2 (module 1) required an acidic acyl starter. To determine which domain imposed the selectivity, apo-BorA1 ACP was produced in an sfp− E. coli host.
Sfp, a phosphopantetheinyl transferase,16 can be used to charge PKS apo-ACP domains with various acyl-CoAs. After apo-BorA1 ACP was purified, it was charged with either succinyl- or n-butyryl-CoA in vitro with Sfp, and used subsequently with BorA2 for production of the corresponding ACP-bound diketide. Both compounds were produced, indicating that only the loading AT domain requires the acidic acyl starter, and that the KS domain of BorA2 does not require but can tolerate an acidic acyl group. It is not known if KS domains of other PKSs that do not generate acid products are tolerant of acidic nascent acyl chains. In addition, our in vitro analysis of the loading AT domain (BorA1 AT) revealed a clear preference for one enantiomer (presumably 1R, 2R).14 It is still unclear if the loading AT evolved to discriminate against one of the stereoisomers that may be present in the cell, or if only one isomer is biosynthesized and the stereospecificity is incidental.
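The p-pant ejection assay described above distinguishes acyl chains by the exact mass of the ejected ion. The sketch below estimates that mass for the two diketides discussed here. It assumes that the ejected ion of an unloaded ACP has the composition C11H21N2O3S+ (the commonly reported pantetheine ejection ion near m/z 261.127) and that thioester formation adds the carboxylic acid minus one water; the helper names are hypothetical and the values are calculated, not taken from the cited work.

```python
# Monoisotopic masses (u) of the elements involved
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "S": 31.9720707}
ELECTRON = 0.00054858

# Assumed composition of the ejected ion from an unloaded (holo) ACP
PANT_ION = {"C": 11, "H": 21, "N": 2, "O": 3, "S": 1}
WATER = {"H": 2, "O": 1}


def mono_mass(formula):
    """Monoisotopic mass of a neutral species given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ejection_mz(acid_formula=None):
    """m/z of the singly charged ejection ion, optionally acylated.

    Forming a thioester with a carboxylic acid adds the mass of the
    acid and removes one water.
    """
    mz = mono_mass(PANT_ION) - ELECTRON
    if acid_formula is not None:
        mz += mono_mass(acid_formula) - mono_mass(WATER)
    return mz


HYDROXYADIPIC_ACID = {"C": 6, "H": 10, "O": 5}    # from succinyl starter
HYDROXYHEXANOIC_ACID = {"C": 6, "H": 12, "O": 3}  # from n-butyryl starter

print(f"holo ACP:            {ejection_mz():.4f}")
print(f"3-hydroxyadipyl-:    {ejection_mz(HYDROXYADIPIC_ACID):.4f}")
print(f"3-hydroxyhexanoyl-:  {ejection_mz(HYDROXYHEXANOIC_ACID):.4f}")
```

Under these assumptions the two diketides differ by the mass of O2 minus H2 (about 29.974 u), so they are readily resolved by accurate-mass LC/MS.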

Figure 2

Repurposing the borrelidin biosynthetic pathway. (a) Proposed model for borrelidin biosynthesis. (b) KR in BorA2 was replaced with DH from BorA3 (dotted box) and KR and ER from various PKS modules (boxes shown in black). TE from the erythromycin PKS (black box) was attached to the C-terminus of BorA2 to release the product. The resulting KR-swapped PKS produces adipic acid from succinyl-CoA and malonyl-CoA in the presence of NADPH in vitro. ER, enoylreductase; all other abbreviations as in Figure 1.

The only reductive domain in BorA2 is a ketoreductase (KR) (Figure 2a). Hence, to produce adipic acid, BorA2 must be engineered to carry the full complement of reductive domains: KR, dehydratase (DH) and enoylreductase (ER). To this end, we replaced the KR domain of BorA2 with KR−DH−ER tridomains from a number of PKS modules, and subsequently introduced additional replacements of the DH domain. Interestingly, we found that the DH domain from the second module of the borrelidin PKS (BorA3) was required for efficient production of adipic acid in vitro (Figure 2b).17 These results indicated that DH domains generally disfavor carboxylated substrates, and that only specialized DH domains will accept a distally carboxylated 3-OH acyl-ACP chain as a substrate for dehydration. The KR and ER domains from a variety of PKS modules appear to be tolerant of acidic substrates.17

Order of reactions in a PKS module

There is great potential utility for incorporation of gem-dimethyl groups in engineered polyketide products, particularly for use as fuels, such as in trimethylpentane (isooctane). Examples of polyketide-derived natural products containing gem-dimethyl groups include epothilone (Figure 3a), pederin, bryostatin, and yersiniabactin (Figure 3b), which are generated via introduction of one or two methyl groups into the nascent acyl chain by a SAM-dependent C-methyltransferase (MT) domain present in the cognate module. The canonical view of gem-dimethyl group generation is that the C-MT acts at the 2-position of the nascent chain immediately after the KS domain mediates the condensation reaction to produce the extended β-ketoacyl-ACP intermediate.18 We have demonstrated, however, that methylation precedes condensation in gem-dimethyl group generation in epothilone biosynthesis (Figure 3c).19 In this in vitro study, we generated malonyl-, methylmalonyl- or dimethylmalonyl-ACPs in epothilone module 8 (M8), which contains a C-MT domain, using Sfp to charge the apo-ACP domain produced from an sfp− E. coli host. Employing isobutyryl-S-N-acetyl-cysteamine to charge the KS domain and the p-pant ejection assay to determine the structure of the diketide generated on the ACP domain, we found that epothilone M8 could condense the isobutyryl starter with dimethylmalonyl-ACP in the absence of SAM but was unable to produce the cognate diketides with malonyl- or methylmalonyl-ACPs in the absence of SAM. Both of the latter substrates were extended in the presence of SAM. This unexpected finding indicates that this PKS module methylates methylmalonyl-ACP (the AT domain is specific for methylmalonyl-CoA) and then uses dimethylmalonyl-ACP as the substrate in the condensation reaction.
Interestingly, the biochemical work also indicated that if the native AT domain of epothilone M8 were exchanged with an AT domain that incorporates malonyl-CoA, the C-MT domain would dimethylate malonyl-ACP to produce the gem-dimethyl group seen in the native product.
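The inference about reaction order can be made explicit by comparing two toy models against the outcomes reported above for epothilone M8. The sketch below is a deliberately simplified logical model, not a kinetic simulation: the function names, the assumption that the KS accepts only dimethylmalonyl-ACP, and the assumption that the C-MT fully methylates any extender when SAM is present are all illustrative.

```python
METHYL_COUNT = {"malonyl": 0, "methylmalonyl": 1, "dimethylmalonyl": 2}


def methylation_first(extender, sam):
    """MT methylates the ACP-bound extender (needs SAM); the KS then
    condenses only dimethylmalonyl-ACP. Returns True if a diketide forms."""
    methyls = METHYL_COUNT[extender]
    if sam:
        methyls = 2  # MT brings the extender to the gem-dimethyl state
    return methyls == 2


def condensation_first(extender, sam):
    """Canonical view: the KS condenses any extender, and the MT would
    methylate the diketide afterwards, so condensation never needs SAM."""
    return True


# Outcomes reported in the text: (extender-ACP, SAM present) -> diketide seen
OBSERVED = {
    ("dimethylmalonyl", False): True,
    ("malonyl", False): False,
    ("methylmalonyl", False): False,
    ("malonyl", True): True,
    ("methylmalonyl", True): True,
}


def consistent(model):
    """True if the model reproduces every reported outcome."""
    return all(model(ext, sam) == seen for (ext, sam), seen in OBSERVED.items())


print("methylation first:", consistent(methylation_first))
print("condensation first:", consistent(condensation_first))
```

Only the methylation-first model reproduces the SAM dependence of condensation, which is the observation that the canonical order cannot explain.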

Figure 3

Order of gem-dimethyl group biosynthesis. (a, b) Schemes showing gem-dimethyl group generation in epothilone and yersiniabactin biosyntheses. (c) Schemes showing the proposed order of gem-dimethyl group generation in epothilone biosynthesis, where methylation precedes condensation. (d) Alternative schemes showing that dimethylation can precede or follow condensation in the yersiniabactin PKS. HMWP1, high-molecular-weight protein 1; MT, methyltransferase; PCP, peptidyl carrier protein; all other abbreviations as in Figures 1 and 2.

A similar experiment was performed employing the yersiniabactin PKS, which carries a C-MT domain as well as a KR domain, and whose AT domain is specific for malonyl-CoA. Hence, the gem-dimethyl group in the product yersiniabactin is the result of dimethylation. Here we found that malonyl- and dimethylmalonyl-ACP could be equally extended in the absence of SAM;19 methylmalonyl-ACP was not examined. Hence, in contrast to the findings with epothilone M8, where methylation precedes condensation, in the yersiniabactin PKS dimethylation can precede or follow condensation in vitro (Figure 3d), but it is likely that a yet-to-be-determined preferred route takes place in vivo. This study not only sheds new light on the mechanism of C-methylation in type I PKSs but also reveals that PKS engineering strategies to incorporate gem-dimethyl groups will likely require a partner KS domain that accepts dimethylmalonyl-ACP as a substrate for the condensation. To our knowledge, the KS domains in epothilone module 8 and the yersiniabactin PKS are the first examples that have been shown to accept dimethylmalonyl-ACP in vitro.

Stereochemical specificity

Type I PKS products are typically rich in stereocenters, and altering the stereochemistry is an important method to increase chemical diversity. The exchange of KR domains in model PKSs has been shown to predictably alter the stereochemistry of both the β-hydroxyl and α-methyl groups.20, 21 A naming convention has been established to describe the β-hydroxy (A or B) and α-substituent (1=nonepimerized, 2=epimerized) stereochemical outcomes, as shown in Figure 4a. In the aforementioned lipomycin LipPks1+TE studies, we could successfully produce various 3-hydroxy carboxylic acids in vitro (Figure 1b).9 LipPks1 contains an A2-type KR domain that specifically yields the (2S,3S) configuration in the acid products,22 which agrees with the recent structural elucidation of α-lipomycin.23, 24 Importantly, these products could potentially be used to produce bio-plastics in the presence of polyhydroxyalkanoate synthases if the stereochemistry of the hydroxyl group were converted from S to R.25 To accomplish this, we exchanged the original A2-type KR domain with three different B-type KR domains (B, B1, and B2) that were inferred to yield 3R-OH groups but, although all of the engineered PKSs were competent in condensation, none of these attempts succeeded in producing a (3R)-hydroxycarboxylic acid.22 The reason is still unclear, particularly given that Leadlay and co-workers21 had successfully altered the stereochemistry of the 3-OH group by exchanging an A1-type KR with a B1-type KR. We did, however, alter the 2-methyl stereochemistry in the 3-hydroxy acid by employing an A1-type KR domain in a KR swapping experiment.
GC/MS analysis clearly demonstrated that we were able to successfully convert the stereochemistry from S to R at C2 (Figure 4b), which provided the first experimental evidence of stereochemical conversion of polyketide products from anti to syn in PKS engineering.22 This is of particular interest in light of recent work hypothesizing that anti products are thermodynamically favored.26
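The KR naming convention used in this section can be read as two independent fields: a letter for the β-hydroxyl configuration and a digit for the α-substituent. The sketch below parses such labels and compares two KR types field by field. The class and function names are hypothetical, and the code encodes only the convention as stated in the text (A or B; 1 = nonepimerized, 2 = epimerized), without assigning absolute R/S descriptors.

```python
from typing import NamedTuple, Optional


class KRType(NamedTuple):
    hydroxyl_class: str         # "A" or "B": the two beta-hydroxyl outcomes
    epimerized: Optional[bool]  # None when the label has no digit (plain "B")


def parse_kr_type(label: str) -> KRType:
    """Parse labels such as "A1", "A-1", "B2" or "B"."""
    cleaned = label.strip().upper().replace("-", "")
    if not cleaned or cleaned[0] not in ("A", "B"):
        raise ValueError(f"unrecognized KR type: {label!r}")
    suffix = cleaned[1:]
    if suffix == "":
        return KRType(cleaned[0], None)
    if suffix not in ("1", "2"):
        raise ValueError(f"unrecognized KR type: {label!r}")
    return KRType(cleaned[0], suffix == "2")


def describe_swap(native: str, donor: str) -> str:
    """Summarize which stereochemical fields a KR swap is expected to change."""
    a, b = parse_kr_type(native), parse_kr_type(donor)
    changes = []
    if a.hydroxyl_class != b.hydroxyl_class:
        changes.append("beta-hydroxyl")
    if a.epimerized is not None and b.epimerized is not None \
            and a.epimerized != b.epimerized:
        changes.append("alpha-substituent")
    return ", ".join(changes) if changes else "no change"


print(describe_swap("A2", "A1"))
print(describe_swap("A2", "B2"))
```

In these terms, the successful A2-to-A1 exchange changes only the α-substituent field, whereas the unsuccessful exchanges with B-type KRs were intended to change the β-hydroxyl field.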

Figure 4

Stereochemical conversions of type I PKS products by KR domain exchange. (a) Each of the KR types that provide different stereochemical outcomes in their natural product biosynthesis is shown. (b) Scheme showing replacement of A2-type KR in LipPks1 with A1-type KR (black box). TE domain from erythromycin PKS (black box) was attached to the C-terminus of LipPks1 to release the product. The resulting KR-swapped PKS produced polyketide products with predicted stereocenters from various acyl-CoA starters and methylmalonyl-CoA in the presence of NADPH.

Conclusions

Although fundamental mechanistic understanding of type I PKSs has continuously expanded over the last decade, a significant amount of knowledge is still needed if we are to achieve combinatorial biosynthesis using engineered PKSs with reasonable success rates. To accomplish this, we believe that kinetic analyses designed to investigate the details of the molecular mechanisms, along with a more complete understanding of the overall structure of modules and PKS complexes, are required. In addition, exploring new heterologous hosts for polyketide production will be important. Combinatorial biosynthesis of polyketides has only been reported in Streptomyces coelicolor27 and E. coli,28 with limited success in terms of product titers compared to their wild-type counterparts. By utilizing insights learned from fundamental analysis of type I PKSs, we hope to contribute to the achievable goal of exploiting these attractive enzymes as a platform for novel molecule production.