Initiating polyketide biosynthesis by on-line methyl esterification

Aurantinins (ARTs) are antibacterial polyketides featuring a unique 6/7/8/5-fused tetracyclic ring system and a triene side chain with a carboxyl terminus. Here we identify the art gene cluster and dissect ART’s C-methyl incorporation patterns to study its biosynthesis. During this process, an apparently redundant methyltransferase Art28 was characterized as a malonyl-acyl carrier protein O-methyltransferase, which represents an unusual on-line methyl esterification initiation strategy for polyketide biosynthesis. The methyl ester bond introduced by Art28 is kept until the last step of ART biosynthesis, in which it is hydrolyzed by Art9 to convert inactive ART 9B to active ART B. The cryptic reactions catalyzed by Art28 and Art9 represent a protecting group biosynthetic logic to render the ART carboxyl terminus inert to unwanted side reactions and to protect producing organisms from toxic ART intermediates. Further analyses revealed a wide distribution of this initiation strategy for polyketide biosynthesis in various bacteria.

27. Page 13, line 258. Suggest, 'However, such systems may yet be discovered by genome mining, while in the meantime, the results reported here should inspire synthetic biology efforts to generate novel polyketides incorporating terminal carboxylate groups.' (And here, a reference to a review on PKS genetic engineering would be appropriate)
28. Page 20, line 460. There is an odd mixture here of cloning methodology, and protein expression/purification. These should be separated, and reference made to the appropriate table in the Supplementary.
29. Supplementary Table 1. The listed domain composition of Art11 isn't consistent with Figure 2, as according to the figure, module 4 contains a tandem of ACP domains. Also, the authors could distinguish between the ECH1 and ECH2 homologs (Art20 and Art21).
30. Supplementary Fig. 6. Can the authors explain why the difference between the observed and calculated masses is larger in part a than in part b, for the same species?
31. Supplementary Fig. 7. Labeling should be added to this figure to identify to which of the multiple species potentially present in each assay the observed HPLC-MS peaks correspond. The legend to a is also incorrect, as in fact, malonyl-CoA methyl ester wasn't detected, but its expected mass is indicated. Same comment for part b of the legend.

Reviewer #2: Remarks to the Author: Li et al present their most recent findings on the biosynthesis of aurantinin and its congeners. The compounds and their biogenesis have been under investigation since the late 1970s, and the biosynthesis has been (vaguely) elucidated before (citation 12, https://doi.org/10.1021/acs.jafc.6b04455) using genome analysis.
The current ms. tackles the assignment of the different (cryptic) ORFs to specific functions in the biosynthesis and elucidates some interesting features. Apart from the possible (but not yet proven!) convergent biosynthetic layout (as also found in a number of other polyketides), the authors report one function that is, to the best of my knowledge, new: the loading unit malonic acid is selectively methylated by a methyltransferase which is essential for the activity of the whole PKS assembly line. Its significance for the self-protection of B. subtilis fmb609 is discussed. The authors study this methyltransferase in vivo and to some extent in vitro and identify some homologs in other PKS gene clusters. Indeed, the biosynthetic mechanism is interesting and worth publication. However, for publication in a high-impact journal more definitive answers to some pressing questions would be necessary. The authors themselves suggest in the discussion some further experiments that would bolster this study. Thus, this article largely rests on the on-line methyl esterification initiation strategy, which indeed is unprecedented but at the same time is more a curiosity than a mechanism of great relevance for natural product research. The more general message of this ms. thus is that researchers should look at least twice at unusual ORFs, both in vivo and in vitro. Overall, I recommend submission to a more specialized journal.
Reviewer #3: Remarks to the Author: The manuscript by Li et al. describes a partial dissection of the Aurantinins (ARTs), which are antibacterial natural products generated by a trans-AT modular polyketide synthase system. The authors conducted a series of gene disruption (and allied complementation) experiments to determine the boundaries of the art biosynthetic gene cluster (BGC) that is comprised of 28 genes across ~80 kb. They also performed precursor incorporation studies to identify the subunits and source of methyl groups within the pathway. The aurantinins are structurally unique metabolites that contain a 6/7/8/5 fused ring system. Two hypotheses currently drive the studies to test whether a single linear polyketide chain is generated that subsequently forms the unique tetracyclic ring system, or two linear chains converge to create the core molecule. Figure 1 provides the data relating to the outer boundaries of the BGC, and a series of comprehensive gene deletion and complementation experiments were conducted to support their characterization. A series of new molecules that represent intermediates in the pathway from gene deletion studies were isolated and characterized by MS and various rigorous NMR methods. The authors made the surprising finding that the starter unit of the pathway is in fact malonyl-CoA methyl ester, which they describe as a "protecting group" that renders the molecule inactive until that late stage of biosynthesis where the methyl group is removed by an esterase to unveil the carboxylate. It is this form of the molecule (ART B) that displays antibiotic activity against Gram positive bacteria. Biochemical studies were conducted on the recombinantly expressed methyltransferase (ART28) to demonstrate directly the formation of malonyl-CoA methyl ester. Direct studies were also performed on the corresponding esterase (ART9) to generate ART B (maximally active) from ART 9B (inactive).
There is significant merit in this study. The finding of a new type of starter unit that appears to "protect" the organism from its own antibiotic (self-resistance?) is important. Using in silico mining of natural product BGCs, the authors were able to identify numerous systems that appear to employ a similar biosynthetic strategy. This aspect of the work is perhaps the single most important contribution. Other aspects of the study (BGC sequencing, gene deletion, complementation) are relatively standard methods, but the authors were able to dissect some important aspects of the pathway.
Major Criticisms: On the other hand, there are numerous aspects of the work that fall short. First, the most significant question is arguably the mode of assembly of the unusual 6/7/8/5 tetracyclic ring system. Although the authors present this as a challenge, the work leaves this question untouched. Second, although the hypothesis is presented that the malonyl-CoA methyl ester starter provides some form of self-protection to the producing microbe, the authors fail to address how this "prodrug" strategy operates at the molecular level on the proposed cellular target (see reference 12), or whether other types of cell targets may be involved in the antibacterial activity. In this respect, the authors' scholarship on this topic is inadequate. What other biosynthetic systems employ transient chemical modification as a mode of self-protection? Indeed, there are numerous examples in the literature and the authors need to cover this topic in the discussion section to enhance scholarship and to place their work in a proper perspective.

Reviewer 1's remarks:
The building blocks employed to initiate polyketide biosynthesis represent an important source of structural diversity in these natural products. In addition, as shown here for the first time by analysis of aurantinin biosynthesis, they can be leveraged to introduce protecting group chemistry into the pathway, which is followed by late-stage unmasking to yield the antibiotic product. This elegant conclusion is supported by generation and analysis of a battery of genetic mutants, as well as reconstruction in vitro of the O-methylation chemistry using purified enzymes, with both types of approach supported by analytical chemistry (HPLC-MS and NMR). While the results described here are novel and will be of substantial interest to the biosynthetic community, there are a number of points that need to be addressed before the ms becomes suitable for publication in Nat. Commun. These are detailed below.
Major comments:
1. The biggest concern is the results of the inactivation mutants (as shown for example in Fig. 1, c and d). The wild type chromatogram exhibits multiple peaks, only some of which can be attributed to mature ART metabolites (A, B and D), while other peaks are also present. However, in the art11, art1, art28 and art17MT* mutants, essentially ALL peaks have disappeared. Do the authors have an explanation for this finding? Also, in the wild type chromatogram, where is ART C?

Response: We thank the reviewer for the careful inspection. The multiple peaks in the HPLC chromatogram of the wild-type strain are inferred to be aurantinin (ART) analogues, based on the characteristic UV-vis spectrum of the unique triene moiety in ARTs and on LC-MS data. During isolation, we noticed that some of the ART analogues are not stable, which may be one reason why only ART A-D have been identified so far.
In the art gene mutant strains, if the inactivated gene is essential for ART polyketide chain assembly, production of all these ART analogues is blocked and the multiple peaks disappear. For the four mutant strains that the reviewer noticed: (i) genes art11 and art17 encode ART polyketide synthases that are essential for assembly of the polyketide chain; (ii) we propose in this manuscript that the art28 gene is involved in polyketide chain initiation; and (iii) the art1 gene encodes a conserved anti-terminator that regulates transcription of the art biosynthetic gene cluster. Once art1 was inactivated, the whole art gene cluster failed to function, and no ARTs were produced. Similar regulatory mechanisms have been reported for several secondary metabolites from Firmicutes (DOI: 10.1038/nmicrobiol.2017.3). In summary, all four genes are either structural genes essential for ART polyketide chain assembly or a regulatory gene that positively controls the art gene cluster. Therefore, all ART-related peaks disappear completely in these four mutants.
In addition, the peak for ART C is marked on the HPLC chromatogram in the revised manuscript, based on high-resolution mass analysis and comparison with the reported data.
2. Page 5, line 82. It is rather unreasonable to state that analysis of the trans-AT PKS provided 'few clues' as to the origin of the two polyketide chains, as there is publicly-available software (TransATor, doi: 10.1038/s41589-019-0313-7; https://github.com/pcm32/transator-container) which is capable not only of analyzing the composition in domains of trans-AT PKS subunits, but of predicting the structure of the resulting intermediates with good confidence. What is the result of analysis using TransATor, as this could provide important validation for the tool, or rather highlight certain deficiencies of interest to the community?

Response: As the reviewer suggested, we tested several different trans-AT PKS clusters with TransATor, including the ART biosynthetic gene cluster. This publicly available analysis tool is user-friendly and highly reliable in predicting the biosynthetic domains of trans-AT PKS clusters. The structures predicted by TransATor were informative in all tested cases.
The TransATor results for the ART biosynthetic gene cluster are shown below. The resulting polyketide intermediate is similar to the proposed core polyketide chain of ART, but the prediction provides limited information on the special starter unit, malonyl-ACP methyl ester, and on the uncommon succinyl-CoA unit. Considering how widely these two uncommon units occur, we believe the accuracy of in silico tools like TransATor could be further increased by incorporating the knowledge about the malonyl-ACP methyl ester starter unit and the succinyl unit described in this manuscript.
The bioinformatic analysis of the ART gene cluster by tools like TransATor was added in the revised manuscript as: 'In silico analysis results of the art gene cluster using bioinformatic tools like TransATor 17
4. Page 12. As this is not the first time that a 'protecting group strategy' has been described for modular natural product biosynthesis, the authors should reference prior art in this part of the discussion (e.g. doi: 10.1038/nchembio.688 and subsequent articles which cite it (doi: 10.1039/c4sc01927j, etc.))

Response:
The content about protecting group strategy in natural product biosynthesis was added to the discussion section as suggested. The related papers were also referenced in the revised manuscript.
The added content in the revised manuscript is given below (page 13, line 257): 'Similar protecting group strategy has been observed during the biosynthesis of a number of natural products and a versatile range of activation strategies have been recruited. For example, the 'pro-drug' of antibiotic xenocoumacin is activated via cleavage of an acylated D-asparagine by a membrane-anchored peptidase XcmG upon secretion 42. In addition, zeamines 43, colibactin 44, and zwittermycin 45 are also activated by post-assembly proteolytic processing; while calyculin 46, oleandomycin 47, and naphthyridinomycin 48 are activated by dephosphorylation, deglycosylation, and oxidative processes, respectively.'
5. Supplementary Fig. 2. It appears that the two large branches of this tree have been artificially superimposed on each other (a vertical line from the green branch overlaps with a horizontal line from the blue branch). Is there a benign explanation for this observation?
Response: We greatly appreciate the reviewer's careful inspection of this figure. We had overlooked that the phylogenetic tree generated by the website was rendered incorrectly because of a display problem, and we apologize for this. To solve the problem, the raw data of the phylogenetic analysis were retrieved and viewed with different viewer software (TreeDyn). The new figure is used in the revised Supplementary Information as the reordered Supplementary Fig. 6 (as shown below).
6. Supplementary Fig. 8. It is remarkable that all of these reactions went to 100%, despite the non-native nature of the ACP domain to which the substrate was tethered. The authors might like to comment on this result, and the fact that it implies that protein-protein interactions contributing to specificity are limited in this case.
Response: Actually, the initial reactions were not as efficient as those shown in Supplementary Fig. 14 (the original Supplementary Fig. 8
[6][7][8]. In principle, the polyketide core structures are consistent with the domain organizations of module PKSs, which was called the co-linearity rule 9. The vast structural diversity of polyketides is introduced by different combinations of initiation, extension, and termination steps as well as versatile tailoring processes. For the initiation process, malonyl-coenzyme A (CoA) is the most frequently used starter unit, which is usually decarboxylated to an acetyl-thioester to begin the extension process. Consequently, most polyketides constructed in this way have a terminal methyl group unless it is modified by tailoring enzymes [10][11][12]
5. Page 3, line 47. The authors describe the apparent origin of the short polyketide chain as 'extraordinary' because if it were generated by classical PKS chemistry, it would require a tail-to-tail condensation. However, as alternative origins are plausible (and indeed they show that it arises from intact incorporation of succinate), the 'extraordinary' nature of this chain is rather exaggerated.
Response: 'Extraordinary' was replaced by 'uncommon' in the revised manuscript.
8. Page 5, line 88. The β-methylation cassette typically includes a discrete ACP domain, and indeed, the authors identified Art7 as a stand-alone ACP. It should thus be mentioned here. (Similarly, page 6, line 99, should be 'five genes')
Response: The art7 gene, which encodes the stand-alone ACP domain, is now included among the β-branching genes. In addition, 'four genes' for β-branching was changed to 'five genes' in the revised manuscript. This change was also made in the subsequent text.
9. Page 5, line 94. Reference should be made at this point to the appropriate Supplementary figures/tables, which should be reordered to be sequential. Also, the authors should comment on the yields, as PKS mutagenesis is typically associated with substantial reductions in titers.
Response: The related Supplementary figures and tables are now referenced (page 6, line 101), and their numbers were reordered sequentially in both the revised manuscript and the Supplementary Information.

A great number of polyketides are constructed by giant modular polyketide synthases (PKSs) with multifunctional domains. After initiation by the loading module, the polyketide chain is elongated by each extension module in turn, in a manner analogous to fatty acid biosynthesis. The acyltransferase (AT) domain loads a specific extender unit onto the acyl carrier protein (ACP), and the ketosynthase (KS) domain performs a two-carbon addition to ACP-tethered acyl-thioesters via head-to-tail Claisen condensation. The newly synthesized β-ketothioester intermediates can be modified by additional domains (e.g. ketoreductase (KR), dehydratase (DH), enoyl reductase (ER), and methyltransferase (MT)) in the extension module before they are released by an offloading module, which usually contains a thioesterase (TE) or a reductase (R) domain 3-5. Specifically, a large family of modular PKSs is trans-AT (or AT-less) PKS, which does not have an AT domain in each extension module as cis-AT PKS does, but shares a standalone AT among multiple extension modules
The comment 'Productions of all the demethylated ART congeners were reduced dramatically in those mutant strains with PKS mutagenesis' was added in the revised manuscript (page 6, line 104).
In addition, the yields of the different demethylated congeners produced by the MT mutants are described individually in the 'Isolation of compounds' part of the Materials and Methods (page 18, line 374).
10. Page 6, line 98. Non-expert readers will not understand what the 'co-linearity rule' refers to. An explanation should therefore be provided (and could be included in the revised introduction).

Response: The concept of the co-linearity rule for PKSs is provided in the first paragraph of the revised manuscript (page 3, line 42):
'In principle, the polyketide core structures are consistent with the domain organizations of module PKSs, which was called the co-linearity rule 9.'
11. Page 6, line 103. A more accurate description is that the β-branching enzymes together incorporate the methyl groups into the intermediates generated on Art11.
Response: We rephrased the sentence as 'implying that the methyl groups at C-5 and C-7 are incorporated into the intermediates generated on Art11 by the β-branching enzymes together' (page 6, line 110).
12. Page 6, lines 105 and 109. It would be better to say that they 'hypothesized' that Art17 assembled the short polyketide chain (but again, this would have been pre-validated by the use of TransATor), and Art2 initiates the biosynthesis.
Response: We predicted the structure with TransATor using the art gene cluster as the query. The results suggested that Art17 could be responsible for assembly of the short polyketide chain, although the prediction did not include the special succinate unit. Consequently, we replaced the word 'proposed' with 'hypothesized' in the revised manuscript.
13. Page 6, line 110. The authors could already point out here that use of succinate as a starter unit would explain the apparently 'extraordinary' tail-to-tail condensation of two acetate units referred to earlier in the text.
Response: We rephrased the sentence as 'by loading an uncommon starter unit succinyl to the first ACP of Art17' (page 6, line 116).
14. Fig. 2. The authors should explain the two colors of ACP in the β-methylation reaction inset (i.e. that the reaction sequence occurs in two different modules of Art11). In addition, for the non-expert reader, the legend should include definitions of all domain name abbreviations (ACP, KS, KSo, etc.). The authors should also explicitly state what the colored carbons refer to (i.e. points at which the methyl groups have been removed by inactivation of the corresponding cMT domains)
Response: The legend of Fig. 2 now explains that the reaction sequence occurs on the differently colored ACP-bound intermediates in module 3 and module 4. The legend also states that the carbons labeled in different colors mark the positions where methyl groups are absent after inactivation of the corresponding MT domains. All domain abbreviations in Fig. 2 are explained in the legend, as shown below.
'Figure 2. The proposed biosynthetic pathway of ARTs based on the two-polyketide-chain assembly model. The inset depicts the process that methyl groups are appended to C-5 and C-7 of the ACP-bound intermediates by β-branching system. The methyl group at C-5 is first installed on the β-carbonyl of nascent polyketide chain tethered on ACP (dark blue) in module 3. Subsequently, C-7 methylation was formed on the intermediate tethered on module 4 ACP (orange) in a same manner. The absence of methyl groups at C-2, C-14, and C-18 were highlighted with points in green, blue, and purple corresponding to the inactivated domains Art11MT, Art13MT, and Art14MT, respectively. ACP, acyl carrier protein; KS, ketosynthase; KS⁰, non-elongating ketosynthase; KR, ketoreductase; DH, dehydratase; ER, enoylreductase; MT, methyltransferase; HMGS, hydroxymethylglutaryl synthase.'
15. Page 7, line 122. It would be better to write that the feeding experiments 'suggested a possible origin for the free terminal carboxylic acid group' as being via oxidative cleavage, as the authors go on to show that it is generated by an alternative route (i.e. 'indicated' is misleading)
Response: We thank the reviewer for the suggestion. The sentence was rephrased as 'suggesting a possible origin for the free terminal carboxylic acid group by oxidative C-C bond cleavage of the polyketide chain and subsequent hydrolysis' (page 7, line 130).
16. Page 7, line 125 (also Page 7, lines 135 and 139 and Page 8, line 145). The polyketide chain is not strictly-speaking 'split', but the BVMOs rather insert oxygen into the chain.
Response: We agree with the reviewer. The word 'split' was removed, and the more appropriate description 'OocK and VdtE inserted an oxygen into the polyketide chain between the two carbons of one acetate unit' is now used (page 7, line 133). Similar descriptions are also used in the subsequent text of the revised manuscript (page 8, lines 143, 147, and 152).
17. Page 7, line 130. Again, appropriate reference needs to be made to the Supplementary materials.
Response: References to the relevant Supplementary Figure 8 and Table 6 for ART 9B were added in the revised manuscript (page 7, line 139).
18. Page 7, lines 120−134. Suggest using a synonym to 'expose' to avoid repetition, for example, 'reveal' or 'liberate'
Response: The word 'exposing' was changed to 'liberating' in the revised manuscript as suggested.
19. Page 8, line 142. By saying that this particular experiment was done 'carefully', the authors imply that the other experiments to which they refer (and those by other researchers) were not.
Response: Thank you for the suggestion. The word 'carefully' was removed to avoid unnecessary misunderstanding.
24. Figure 5. In this figure (and in the accompanying text, page 12), the authors have mixed chain initiation strategies for both cis-AT and trans-AT PKSs. In light of the fact that the two systems apparently have distinct evolutionary origins which likely contribute to the observed diversity, it would be appropriate to separate the strategies by PKS family. Again, definitions of domain/stand-alone enzyme acronyms should be provided. Also, 'loading starter unit onto an ACP'; 'by on-line modification of the starter unit'. Capitalize Transamidation, but not Type I. Also suggest as a title, 'Selected initiation strategies…', as this is not a complete list. Finally, citations to the relevant work demonstrating these strategies should be added.
The discussion part about polyketide initiation strategies was also revised as shown below (page 11, line 226).
'Polyketide chain extension generally starts immediately after the starter unit is loaded; the initiation process includes preparation of the acyl-CoA starter unit (if it is not available from primary metabolism) and transacylation of the starter unit to the loading ACP 10,12,32. However, in some cases the starter unit must be modified after being loaded onto an ACP in order to trigger polyketide chain extension. One classic example is on-line decarboxylation of the malonyl or methylmalonyl starter unit. In cis-AT modular PKSs, it is catalyzed by the N-terminal KSQ domain of PKSs [33][34][35][36] (Fig. 5a), while in trans-AT modular PKSs, it is performed by a GCN5-related N-acetyltransferase (GNAT)-like domain 37,38 (Fig. 5b). In addition, an on-line transamidation strategy has been proposed to initiate biosynthesis of glutarimide-containing compounds such as isomigrastatin and cycloheximide, although this mechanism requires further validation by in vitro studies 39,40 (Fig. 5b).'
25. Page 10, line 199. Suggest, '…build on this work to establish a common…'
Response: Changed as suggested.

Response: We thank the reviewer for the suggestion. In the revised manuscript, we selected several representative examples of polyketide initiation strategies involving on-line modification of the starter unit and separated them into two parts in
26. Page 10, line 201. Having introduced the concept of trans-AT PKSs earlier, it is not necessary to use the terms here (and indeed, the addition of a second one ('AT-less') will be confusing at this stage)
Response: To avoid confusion, 'AT-less' now appears only once in the revised manuscript, where the term 'trans-AT PKS' is first introduced (page 3, line 40).
27. Page 13, line 258. Suggest, 'However, such systems may yet be discovered by genome mining, while in the meantime, the results reported here should inspire synthetic biology efforts to generate novel polyketides incorporating terminal carboxylate groups.' (And here, a reference to a review on PKS genetic engineering would be appropriate)
Response: We rewrote this sentence as suggested and two reviews on PKS genetic engineering were referenced as ref. 50
30. Supplementary Fig. 6. Can the authors explain why the difference between the observed and calculated masses is larger in part a than in part b, for the same species?
Response: In the original version, we simply picked two peaks from the HR-MS data that fall within 5 parts per million (ppm) of the calculated mass of ART B, a tolerance generally accepted by most journals. We thank the reviewer for raising this point. After scrutinizing the mass data, we found an observed mass much closer to the calculated mass of ART B in the HR-MS data of B. subtilis Δart28/Bc-bioC. Thus, a more accurate mass value ([M+H]+ = 781.4159) is used in part a of Supplementary Fig. 12 in the revised manuscript (the original Supplementary Fig. 6).
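For readers less familiar with the 5 ppm criterion mentioned in this response, the mass-accuracy check can be sketched as follows. This is a minimal illustration only: the function names `ppm_error` and `within_tolerance` are hypothetical, and the m/z values in the usage lines are made-up round numbers, not the calculated or observed masses of ART B.

```python
def ppm_error(observed_mz: float, calculated_mz: float) -> float:
    """Signed mass error in parts per million (ppm).

    ppm = (observed - calculated) / calculated * 1e6
    """
    if calculated_mz <= 0:
        raise ValueError("calculated m/z must be positive")
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


def within_tolerance(observed_mz: float, calculated_mz: float,
                     tolerance_ppm: float = 5.0) -> bool:
    """True if the absolute mass error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, calculated_mz)) <= tolerance_ppm


# Hypothetical values for illustration only (not data from the manuscript).
calculated = 500.0000
for observed in (500.0020, 500.0030):
    error = ppm_error(observed, calculated)
    print(f"{observed:.4f}: {error:.2f} ppm, "
          f"accepted={within_tolerance(observed, calculated)}")
```

Two peaks can both satisfy such a tolerance while differing noticeably in their individual errors, which is the situation the reviewer asked about.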
31. Supplementary Fig. 7. Labeling should be added to this figure to identify to which of the multiple species potentially present in each assay the observed HPLC-MS peaks correspond. The legend to a is also incorrect, as in fact, malonyl-CoA methyl ester wasn't detected, but its expected mass is indicated. Same comment for part b of the legend.
Response: We apologize for the confusion in Supplementary Fig. 13 (the original Supplementary Fig. 7). This figure tests Art28-catalyzed methylation of malonyl-CoA. The results show that Art28 cannot use SAM as a methyl donor to form malonyl-CoA methyl ester. In part a, the left LC-MS panel is the extracted ion chromatogram (EIC) for malonyl-CoA; compared with the control assay with boiled Art28, the substrate malonyl-CoA was not clearly consumed in the malonyl-CoA+SAM+Art28 assay. The right LC-MS panel is the EIC for malonyl-CoA methyl ester; no malonyl-CoA methyl ester was generated in either the malonyl-CoA+SAM+Art28 assay or the control assay. In part b, the left LC-MS panel is the EIC for SAM; compared with the control assay with boiled Art28, the substrate SAM was not clearly consumed in the malonyl-CoA+SAM+Art28 assay. The right LC-MS panel is the EIC for SAH; no SAH was generated in either the malonyl-CoA+SAM+Art28 assay or the control assay.
In the revised manuscript, we labeled the peaks of malonyl-CoA and SAM as suggested. In addition, the labels 'EIC for malonyl-CoA', 'EIC for malonyl-CoA methyl ester', 'EIC for SAM', and 'EIC for SAH' were added to the corresponding LC-MS data panels.