Abstract
Comparative analysis of the enediyne biosynthetic gene clusters revealed sets of conserved genes serving as outstanding candidates for the enediyne core. Here we report the crystal structures of SgcJ and its homologue NCS-Orf16, together with gene inactivation and site-directed mutagenesis studies, to gain insight into enediyne core biosynthesis. Gene inactivation in vivo establishes that SgcJ is required for C-1027 production in Streptomyces globisporus. SgcJ and NCS-Orf16 share a common structure with the nuclear transport factor 2-like superfamily of proteins, featuring a putative substrate binding or catalytic active site. Site-directed mutagenesis of the conserved residues lining this site allowed us to propose that SgcJ and its homologues may play a catalytic role in transforming the linear polyene intermediate, along with other enediyne polyketide synthase-associated enzymes, into an enzyme-sequestered enediyne core intermediate. These findings will help formulate hypotheses and design experiments to ascertain the function of SgcJ and its homologues in nine-membered enediyne core biosynthesis.
Introduction
The enediynes represent one of the most fascinating families of natural products for their unprecedented molecular architecture and extraordinary biological activities, and they have had profound impact on modern chemistry, biology and medicine.1, 2, 3, 4 Since the structure of the neocarzinostatin (NCS) chromophore was first elucidated in 1985,5 the enediyne family of natural products has grown steadily with a total of 15 enediynes structurally characterized to date, of which four were isolated in the cycloaromatized form.4 The enediynes are classified into two subcategories according to the size of the enediyne core structures.1, 2 Members of the 9-membered enediyne core subcategory included NCS, C-1027, kedarcidin, maduropeptin, N1999A2, the sporolides, the cyanosporasides (CYA, CYN) and the fijiolides (Figure 1a). Members of the 10-membered enediyne core subcategory included the calicheamicins (CAL), esperamicins (ESP), dynemicin (DYN), namenamicin, shishijimicin and uncialamycin (Supplementary Figure S1). The enediynes have provided an outstanding opportunity to decipher the genetic and biochemical basis for the biosynthesis of complex natural products,1, 2, 3, 4 to explore ways to make novel analogues by manipulating genes governing their biosynthesis,6, 7, 8, 9, 10 and to discover new enediyne natural products by mining microbial genomes for the trademark enediyne biosynthetic machineries.3, 4, 11, 12
The first set of biosynthetic gene clusters for the 9-membered enediyne C-1027 13 and the 10-membered enediyne CAL14 was cloned in 2002. Since then, a total of seven biosynthetic gene clusters for 9-membered enediynes (that is, C-1027,13 NCS,15 MDP,16 kedarcidin,17 sporolides,18 CYA19 and CYN 19) and three biosynthetic gene clusters for 10-membered enediynes (that is, CAL,14 ESP (partial)11 and DYN20) have been reported. Comparative analysis of these gene clusters revealed a set of five genes common to both 9- and 10-membered enediynes (that is, the enediyne polyketide synthase (PKS) cassette consisting of E3/E4/E5/E/E10 (Figure 1b)), characterization of which has unambiguously established (i) the polyketide origin for both 9- and 10-membered enediynes and (ii) a convergent model for enediyne biosynthesis.3, 4, 21, 22 Although significant progress has been made toward elucidating the biosynthesis of the peripheral moieties present in enediynes, little is known about the enediyne core biosynthesis. In vivo and in vitro studies have established that the iterative type I PKS enzyme E initiates both 9- and 10-membered enediyne core biosynthesis via an acyl carrier protein-tethered linear polyene intermediate, which, in the absence of other enediyne PKS-associated enzymes, could be released by the thioesterase E10 to afford a heptaene.21, 22, 23, 24, 25 However, the enzymes and chemistry responsible for converting heptaene, or the nascent acyl carrier protein-tethered linear polyene intermediate, into the 9- and 10-membered enediyne cores remain elusive. Many of the candidate genes, predicted to be associated with enediyne core biosynthesis, are often annotated to encode proteins of unknown function.3, 4 Inactivation of these candidate genes in vivo afforded mutant strains that often failed to accumulate any biosynthetic intermediate, revealing few clues for their function in enediyne core biosynthesis. 
Lack of functional prediction, together with the unavailability of suitable substrates, essentially forfeits any practical attempt to directly characterize these proteins biochemically in vitro.
Here we report the crystal structures of SgcJ and its homologue NCS-Orf16, together with gene inactivation and site-directed mutagenesis studies, to gain insight into enediyne core biosynthesis. We first closely examined the seven gene clusters that encode 9-membered enediyne biosynthesis and uncovered seven genes (E2, E7, E8, E9, E11, M and J), in addition to the five genes encoding the enediyne PKS cassette (E3/E4/E5/E/E10), that are absolutely conserved but whose function could not be predicted on the basis of bioinformatics analysis alone. We then subjected these targets to high-throughput structural biology analysis. This effort resulted in several structures, including SgcJ from the C-1027 and its homologue NCS-Orf16 from the NCS biosynthetic machineries. We next confirmed that SgcJ is absolutely required for C-1027 biosynthesis: its inactivation in the C-1027 overproducer Streptomyces globisporus SB1022 (ref. 10) completely abolished C-1027 production in the resultant ΔsgcJ mutant strain SB1027. We finally showed that SgcJ and NCS-Orf16 share a common structure with the nuclear transport factor 2 (NTF2)-like superfamily of proteins, featuring a hydrophobic pocket in the α+β barrel structure that could constitute a putative substrate binding or catalytic active site. Site-directed mutagenesis of the conserved residues lining this site abolished C-1027 production, suggesting that SgcJ and its homologues may play a catalytic role in the 9-membered enediyne core biosynthesis.
Results and discussion
SgcJ and homologues are conserved among the 9-membered enediyne biosynthetic gene clusters but their function could not be predicted
Inspired by the enediyne PKS cassette, consisting of E3/E4/E5/E/E10, that is conserved among the seven 9-membered and three 10-membered enediyne biosynthetic gene clusters characterized to date, we recently completed a virtual survey of all bacterial genomes available in public databases using the enediyne PKS cassette as a probe.3, 4 This effort resulted in the identification of an additional 77 putative enediyne biosynthetic gene clusters, implying that enediynes are more common than currently appreciated on the basis of structurally characterized enediyne natural products.1, 2, 3, 4, 11, 12 We subsequently constructed an enediyne genome neighborhood network, including both the 10 known and 77 putative enediyne gene clusters, to facilitate cluster annotation and predict 9- and 10-membered enediyne core biosynthesis. The enediyne PKS cassette is present in all 87 gene clusters, suggesting that they may be responsible for biosynthesis of a common intermediate for both 9- and 10-membered enediyne cores. Subsets of genes that are unique to either 9- or 10-membered enediyne gene clusters are also identified, as exemplified by the E2, E7, E8, E9, E11, M and J genes from the seven known 9-membered enediyne biosynthetic gene clusters (Figure 1b), and they may play roles in diversifying the common intermediate into the 9- or 10-membered enediyne cores, respectively.3, 4
Among this set of genes is SgcJ (Figure 1b), and its homologues are present in the 34 putative 9-membered enediyne biosynthetic gene clusters (Supplementary Figure S2). SgcJ and its homologues comprise 140–160 amino acids, with amino acid sequence identities ranging from 30 to 66%. According to the BLASTP search result, SgcJ homologues feature a domain of unknown function (DUF4440) and belong to the NTF2-like superfamily, a large group of related proteins that share a common protein fold. The NTF2-like superfamily proteins are widely found in both prokaryotic and eukaryotic organisms and possess versatile functions.26 Proteins in the NTF2-like superfamily are generally divided into two categories, enzymatically active and non-enzymatically active proteins. The former group includes enzymes with varying activities such as the ketosteroid isomerase,27 scytalone dehydratase28 and polyketide cyclase.29, 30 The latter group includes proteins that could play roles as diverse as facilitating protein transport into the nucleus31 or mediating multimerization of calcium/calmodulin-dependent protein kinase II (CaMKII),32 or may function as a receptor.33 The enediyne variants of SgcJ show less than 18% amino acid sequence identity to functionally characterized NTF2-like proteins. Owing to the diverse functions of the NTF2-like superfamily, bioinformatics analysis alone fell short of predicting the function of SgcJ and its homologues in the 9-membered enediyne core biosynthesis.
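The sequence identities quoted above depend on how aligned positions are counted. As a minimal sketch, assuming identity is reported as the fraction of identical residues over alignment columns in which neither sequence has a gap (other conventions divide by alignment length or by the shorter sequence), the calculation can be written as follows. The function name and the two toy sequences are hypothetical illustrations and are not taken from SgcJ or its homologues.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment,
    use '-' for gaps, and therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real protein sequences).
toy_a = "MTW-DLYRAF"
toy_b = "MSWQDLFRA-"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because the denominator convention changes the result, identities from different tools are only comparable when the same convention is used.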
Gene inactivation reveals that sgcJ is necessary for enediyne biosynthesis
To establish a functional linkage of sgcJ and its homologues with enediyne biosynthesis, we inactivated sgcJ in the C-1027 overproducer S. globisporus SB1022 (ref. 10) by replacing it with the kanamycin resistance cassette through λ-RED-mediated PCR targeting mutagenesis34 (Supplementary Figure S3a). The genotype of the resulting ΔsgcJ mutant strain SB1027 was confirmed by PCR and Southern analysis (Supplementary Figure S3b). SB1027 was fermented under the established conditions for C-1027 production with S. globisporus SB1022 as a positive control.9, 10, 13, 35 Although C-1027 production by SB1022 was readily confirmed by both bioassay against Micrococcus luteus and the biochemical induction assay (BIA), C-1027 production was completely abolished in SB1027, which was unambiguously verified by HPLC and ESI-MS analysis (Figure 2, panels I and II). The requirement for sgcJ in C-1027 biosynthesis was further supported by the fact that the ΔsgcJ mutation in SB1027 could be complemented by expressing a functional copy of sgcJ in trans, restoring C-1027 production in the complementation strain SB1028 to a level comparable to that of SB1022 (Figure 2, panels I and III). Taken together, these data clearly established that SgcJ plays a necessary role in C-1027 biosynthesis and, by analogy, suggest that SgcJ homologues play an essential role in 9-membered enediyne core biosynthesis. However, SB1027 failed to accumulate any biosynthetic intermediate to sufficient levels for isolation and structural characterization, revealing no clues for the exact function of SgcJ. We therefore opted to solve the structures of SgcJ and its homologues in an attempt to elucidate their function in 9-membered enediyne core biosynthesis.
The overall structure of SgcJ and NCS-Orf16 reveals structural similarity to NTF2-like superfamily proteins
The crystals of SgcJ were obtained in the monoclinic space group C2 with unit cell parameters a=72.7, b=86.9 and c=55.3 Å and α=γ=90.0°, and β=121.6°. The asymmetric unit contained two peptide chains, corresponding to a solvent content of 50.9%. The asymmetric unit also contained molecules of citric acid, glycerol, phosphate, pentaethylene glycol and tetraethylene glycol, which were present in the crystallization condition. The final model of SgcJ was refined to a resolution of 1.7 Å with an R factor of 16.9% and an Rfree factor of 19.5%. Ramachandran analysis reveals that 99.6% of the residues were in the favored region with none in disallowed regions. The electron density map was well-defined for residues Ser3-Asp140 and Ala10-Asp140 for the two polypeptide chains in the asymmetric unit. Data collection and refinement statistics are summarized in Table 1.
The NCS-Orf16 crystals were obtained in the monoclinic space group P21 with unit cell parameters a=98.3, b=52.8 and c=131.8 Å and α=γ=90.0°, and β=90.1°. The asymmetric unit contained 10 peptide chains, corresponding to a solvent content of 45.0%. The final model of NCS-Orf16 was refined to a resolution of 2.72 Å with an R factor of 21.6% and an Rfree factor of 25.6%. Ramachandran analysis reveals that 97.2% of the residues were in the favored region with none in disallowed regions. The electron density map was well-defined for residues Thr19-Arg142 for each peptide chain in the asymmetric unit. Data collection and refinement statistics are summarized in Table 1.
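The solvent contents quoted above follow from the unit cell volume and the number of chains it holds. The sketch below shows one common way to estimate them, using the monoclinic cell volume V = a·b·c·sin β, the Matthews coefficient VM = V/(Z·M) and the approximation solvent fraction ≈ 1 − 1.23/VM. The unit cell parameters and chains per asymmetric unit are from the text; the chain mass of 15,000 Da is an assumed round figure for a protein of roughly 140 residues, not a value reported here, so the printed estimates only approximate the published 50.9% and 45.0%.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Unit cell volume in cubic angstroms for a monoclinic cell (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume: float, chains_per_cell: int, chain_mass_da: float) -> float:
    """Matthews coefficient VM in cubic angstroms per dalton."""
    return cell_volume / (chains_per_cell * chain_mass_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from VM, using the usual 1.23 protein factor."""
    return 1.0 - 1.23 / vm


ASSUMED_CHAIN_MASS_DA = 15_000.0  # assumption for illustration, not from the paper

# C2 has 4 asymmetric units per cell; P21 has 2.
crystals = {
    "SgcJ": dict(a=72.7, b=86.9, c=55.3, beta=121.6, asu_per_cell=4, chains_per_asu=2),
    "NCS-Orf16": dict(a=98.3, b=52.8, c=131.8, beta=90.1, asu_per_cell=2, chains_per_asu=10),
}

for name, p in crystals.items():
    volume = monoclinic_cell_volume(p["a"], p["b"], p["c"], p["beta"])
    chains = p["asu_per_cell"] * p["chains_per_asu"]
    vm = matthews_coefficient(volume, chains, ASSUMED_CHAIN_MASS_DA)
    print(f"{name}: V = {volume:.0f} A^3, VM = {vm:.2f} A^3/Da, "
          f"solvent ~ {100 * solvent_fraction(vm):.0f}%")
```

Substituting the actual molecular mass of each construct for the assumed value would bring the estimate closer to the reported figures.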
SgcJ and its homologues show high amino acid sequence homology (Figure 3a), with SgcJ and NCS-Orf16 sharing 45% amino acid sequence identity. The crystal structures of SgcJ and NCS-Orf16 feature a common three-dimensional structural fold (Figure 3b). The structure of SgcJ superimposed well with NCS-Orf16 with a root-mean-square deviation (rmsd) of 0.83 Å for the Cα atoms. The overall structures of SgcJ and NCS-Orf16 form a cone-like α+β barrel, each comprising a long N-terminal α-helix (α1-α2) passing through the curved six-stranded antiparallel β-sheet (β1-β6), with two additional shorter α-helices (α3 and α4) adjacent to the α1-α2 helix (Figure 3b). The β-sheet packs against the three α-helices to form a hydrophobic core within the α+β barrel (Figure 3b). The crystal structures of SgcJ and NCS-Orf16 are packed as homodimers in the asymmetric unit, which are generated via non-crystallographic twofold axes (Figure 3c). The dimer interface is formed via a hydrogen-bonding network and salt bridges between the flat face of the β-sheet from each monomer. Both SgcJ and NCS-Orf16 were indeed found to be homodimers in solution by size exclusion chromatography (Supplementary Figure S4).
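For readers less familiar with the rmsd quoted above, the quantity is the square root of the mean squared distance between equivalent Cα atoms. The sketch below computes it for two coordinate lists that are assumed to be already superimposed and paired atom by atom; it does not perform the structural superposition itself, and the coordinates shown are hypothetical toy values rather than data from the SgcJ or NCS-Orf16 models.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired, pre-superimposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates in angstroms (not taken from any PDB entry).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.5, 0.0), (3.8, -0.5, 0.0), (7.6, 0.5, 0.0)]
print(f"rmsd = {rmsd(model_a, model_b):.2f} A")
```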
Consistent with the BLASTP search result, a search of the PDB using the DALI server36 revealed that SgcJ and NCS-Orf16 belong to the NTF2-like superfamily. This versatile superfamily is a classic example of divergent evolution wherein the proteins have similar overall structures but diverge greatly in their functions.26, 37 Crystal structures have been reported for several functionally characterized NTF2-like superfamily proteins, including the association domain of CaMKII from mouse (PDB entry 1HKX),32 NTF2 from rat (PDB entry 1OUN),38 ketosteroid isomerase (KSI) from Pseudomonas putida (PDB entry 1OPY),27 scytalone dehydratase (mgSD) from Magnaporthe grisea (PDB entry 1STD),28 and polyketide cyclases SnoaL from Streptomyces avidinii (PDB entry 1SJW)29 and Tcm ARO/CYC from Streptomyces glaucescens (PDB entry 2RER).30 Despite low amino acid sequence identities, ranging from 9.8 to 17.5%, SgcJ was found to share similar folds with each of the NTF2-like superfamily proteins listed, with rmsds of 3.0, 2.6, 2.7, 2.7, 3.1 and 3.0 Å for the Cα atoms, respectively (Figure 4a). SgcJ, NTF2, KSI and SnoaL form homodimers, while mgSD and CaMKII form a trimer and tetradecamer, respectively. Most importantly, all these NTF2-like superfamily proteins contain a hydrophobic pocket in the α+β barrel structure (Figure 4a), which forms a cavity that could be adapted to create an enzyme active site or a small molecule/peptide binding site, thereby accounting for the versatile functions of the superfamily.
Putative substrate binding cavity and catalytic residues of SgcJ and its homologues
KSI (isomerase),27 mgSD (dehydratase),28 SnoaL (cyclase)29 and Tcm ARO/CYC (cyclase)30 are enzymatically active proteins within the NTF2-like superfamily. Although their functions are different, they share a common catalytic mechanism: (i) a general base abstracts a proton from Cα of a carbonyl group to form an enolate intermediate, which is stabilized by a general acid; (ii) the enolate intermediate tautomerizes back to the carbonyl group followed by double bond rearrangement or nucleophilic attack (Figure 4b). In KSI, mgSD, SnoaL and Tcm ARO/CYC, the general acid-base pairs that initiate the reactions are Asp40-Tyr16/Asp103, His85-a water bound by Tyr30 and Tyr50, Asp121-Gln105, and Tyr35-Arg69, respectively (Figure 4b). Interestingly, the crystal structures of SgcJ and NCS-Orf16 reveal conserved Asp111-Tyr72 and Asp115-Tyr76 pairs located at the entrance of the pocket, respectively (Figure 4a). Since these amino acids are known to act as the general acid-base pair in catalysis, it is tempting to speculate that SgcJ may play a similar catalytic role in transforming the nascent linear polyene intermediate along with other enediyne PKS-associated enzymes, into the 9-membered enediyne core.
Additionally, both SgcJ and NCS-Orf16 form a hydrophobic cavity within their pockets. Despite less than 50% amino acid sequence identity between SgcJ and NCS-Orf16, the amino acids lining the cavities are conserved: Trp29, Phe37, Tyr72, Trp118 and Tyr132 in SgcJ versus Trp32, Phe40, Tyr76, Trp122 and Tyr136 in NCS-Orf16 (Figures 3a and 5). Intriguingly, in the crystal structure of SgcJ, a molecule of pentaethylene glycol (1PE in chain A) and one of tetraethylene glycol (PG4 in chain B) were found bound in the cavity and surrounded by the conserved amino acid residues (Figure 5a). These polyethylene glycol molecules may mimic the binding of the linear polyene intermediate, which is sequestered and stabilized by the conserved aromatic residues lining the cavity during biosynthesis of the otherwise unstable 9-membered enediyne core intermediates.
The putative general acid-base catalytic pair and the amino acids lining the cavity are partially conserved among the SgcJ homologues in the seven known (Figure 3a) and 34 putative (Supplementary Figure S2) 9-membered enediyne biosynthetic gene clusters. To provide additional experimental data to support the catalytic role SgcJ and its homologues may play in enediyne core biosynthesis, we mutated each of the six conserved residues in SgcJ (that is, D111A and Y72A acting as the general acid-base pair, and W29A, F37A, W118A and Y132A lining the cavity) by site-directed mutagenesis. The expression constructs (pBS1148 to pBS1153) for the mutant variants of sgcJ were otherwise identical to pBS1146, in which the expression of sgcJ or its mutant variants was under the control of the constitutive ErmE* promoter. Introduction of pBS1148-pBS1153 individually into SB1027 afforded SB1030-SB1035, respectively, which were fermented, with SB1028 as a positive control, to examine if they could complement the ΔsgcJ mutation in SB1027. Gratifyingly, none of the six mutants restored C-1027 production (Figure 2, panels III and V-X), consistent with the proposal that these conserved residues are involved in substrate recognition, catalysis or both. Taken together, we now propose that SgcJ plays a catalytic role in transforming the linear polyene intermediate, along with other enediyne PKS-associated enzymes, into an enzyme-sequestered 9-membered enediyne core intermediate.
SgcJ and its homologues are pathway specific for enediyne biosynthesis
We have previously demonstrated that PKSEs and thioesterases from different 9- and 10-membered enediyne machineries are freely interchangeable and that 9- versus 10-membered enediyne core biosynthetic divergence occurs beyond the PKSE-thioesterase chemistry.21, 22 Given the sequence homology among SgcJ and its homologues (Figure 3a and Supplementary Figure S2) and the structural similarity as exemplified by SgcJ and NCS-Orf16 (Figures 3b, 4a and 5), as well as the common catalytic role proposed for SgcJ and homologues in 9-membered enediyne core biosynthesis, we finally asked if SgcJ and its homologues are pathway specific. An expression vector pBS1147 for ncs-orf16 was constructed in the same manner as pBS1146 for sgcJ, with the expression of ncs-orf16 under the control of the constitutive promoter ErmE*. Introduction of pBS1147 into the ΔsgcJ mutant strain SB1027 afforded SB1029, which offered the opportunity to examine if ncs-orf16 could cross-complement the ΔsgcJ mutation in SB1027. SB1029 was fermented with SB1028 as a positive control. Cross-complementation was not observed, as evidenced by HPLC analysis, which showed no C-1027 production in SB1029 (Figure 2, panels III and IV). This result suggests that SgcJ and its homologues are pathway specific for 9-membered enediyne core biosynthesis. Close comparison of the SgcJ and NCS-Orf16 structures indeed showed subtle differences in protein surface electrostatics and the shape of the putative cavities (Figure 5), which may account for unique protein-protein interactions or accommodate varying enediyne core intermediates for different 9-membered enediyne biosynthetic machineries.
Conclusions
The enediynes have served as an outstanding model to study the biosynthesis of complex natural products. Since cloning of the first set of enediyne biosynthetic gene clusters nearly 15 years ago,13, 14 significant progress has been made toward elucidating the biosynthesis of the peripheral moieties present in enediynes, but biosynthesis of the enediyne cores remains elusive.1, 2, 3, 4 Comparative analysis of the enediyne gene clusters clearly revealed sets of genes that are highly conserved among the 9-membered, 10-membered or both enediynes, serving as outstanding candidates to study enediyne core biosynthesis.3, 4 Many of these candidate genes, however, are often annotated to encode proteins of unknown function, inactivation of which in vivo afforded mutant strains that often failed to accumulate any biosynthetic intermediates, thereby revealing few clues for their function in enediyne core biosynthesis. As a result, in spite of the great progress made in the past decade in characterizing the enediyne PKS enzyme E and its cognate thioesterase, culminating in the discovery of a linear heptaene and its variants as the earliest possible intermediates or shunt metabolites for enediyne core biosynthesis,21, 22, 23, 24, 25 the exact nature of the nascent linear polyketide intermediates and their subsequent transformation to 9- and 10-membered enediyne cores remain unknown.
Recent technological advances in X-ray crystallography have made it possible to apply high-throughput structural biology as a practical tool to functionally characterize genes with deduced products that show little sequence homology to proteins of known function.39 While the current study fell short of establishing the exact function for SgcJ and its homologues, the structures of SgcJ and NCS-Orf16 and comparison to the NTF2-like superfamily of proteins allowed us to (i) define a putative substrate binding or catalytic active site, (ii) correlate the function of SgcJ to C-1027 biosynthesis by site-directed mutagenesis of the conserved residues lining this site, and (iii) propose that SgcJ and its homologues may play a catalytic role, along with other enediyne PKS-associated enzymes, in transforming the linear polyene intermediate into an enzyme-sequestered 9-membered enediyne core intermediate. These findings should help formulate hypotheses and design experiments to ascertain the function of SgcJ and its homologues in 9-membered enediyne core biosynthesis in the future.
Materials and methods
Strains, plasmids and culture conditions
Bacterial strains, plasmids and primers used in this study are summarized in Supplementary Tables S1, S2, and S3, respectively. Escherichia coli strains and M. luteus ATCC 9431 were cultured in lysogeny broth or grown on lysogeny broth agar plates. S. globisporus wild-type and recombinant strains were cultivated at 28 °C on ISP Medium 4 (Becton Dickinson, Franklin Lakes, NJ) for sporulation. Antibiotics for selection were used at the following concentrations: 25 μg ml−1 for apramycin and thiostrepton, and 50 μg ml−1 for chloramphenicol and kanamycin.
Construction of the ΔsgcJ mutant strain S. globisporus SB1027
The ΔsgcJ mutant strain SB1027 was constructed in the C-1027 overproducer S. globisporus SB1022 (ref. 10) by gene replacement via homologous recombination. Briefly, the 1.5-kb kanamycin resistance cassette was amplified by PCR from pJTU4659 with primers sgcJtgtF and sgcJtgtR (Supplementary Table S3) and used to replace sgcJ in cosmid pBS1005 35 via λ-RED-mediated PCR targeting mutagenesis34 to generate pBS1143. The ΔsgcJ gene was then excised from pBS1143 as a ~21 kb XbaI-SpeI fragment and inserted into the XbaI site of pSET151 to afford pBS1144. pBS1144 was finally introduced into S. globisporus SB1022 by E. coli-S. globisporus conjugation.40 Exconjugants resulting from the desired double-crossover homologous recombination were selected on the basis of their kanamycin-resistant and thiostrepton-sensitive phenotype, and named SB1027, the genotype of which was confirmed by PCR and Southern analysis (Supplementary Figure S3).
Construction of ΔsgcJ complementation strains S. globisporus SB1028 and SB1029
A 0.8-kb fragment bearing oriT was amplified by PCR from plasmid pSET152 with primers oriT152F and oriT152R (Supplementary Table S3), digested with KpnI, and cloned into the same site of pUWL201pw to generate pBS1145. A 420-bp fragment of sgcJ and a 432-bp fragment of ncs-orf16 were amplified by PCR from cosmids pBS1005 35 and pBS5007 15, with primers sgcJ201NdeIF and sgcJ201EcoRIR, and ncs16NdeIF and ncs16HindIIIR, respectively (Supplementary Table S3). The resultant products were digested with NdeI and EcoRI (for sgcJ), and NdeI and HindIII (for ncs-orf16), and cloned into the same sites of pBS1145 to afford pBS1146 (for sgcJ) and pBS1147 (for ncs-orf16), respectively. Both pBS1146 and pBS1147, in which the expression of sgcJ and ncs-orf16 was under the control of the constitutive ErmE* promoter,40 were finally introduced into the ΔsgcJ mutant strain S. globisporus SB1027 by E. coli-S. globisporus conjugation.40 Exconjugants were selected on the basis of their thiostrepton-resistant phenotype as the desired complementation strains, and named SB1028 (that is, sgcJ expressing) and SB1029 (that is, ncs-orf16 expressing), respectively.
Site-directed mutagenesis of SgcJ
Plasmids of the sgcJ mutants, pBS1148 (W29A), pBS1149 (F37A), pBS1150 (Y72A), pBS1151 (D111A), pBS1152 (W118A) and pBS1153 (Y132A), were constructed by the QuikChange site-directed mutagenesis method, following the manufacturer’s protocol (Agilent Technologies, Santa Clara, CA) and using pBS1146 as a template. The primers used are listed in Supplementary Table S3. The mutations were verified by DNA sequencing. Each of the mutant constructs was then introduced into the ΔsgcJ mutant strain SB1027 by conjugation, yielding the complementation strains SB1030 (that is, SB1027/pBS1148), SB1031 (that is, SB1027/pBS1149), SB1032 (that is, SB1027/pBS1150), SB1033 (that is, SB1027/pBS1151), SB1034 (that is, SB1027/pBS1152) and SB1035 (that is, SB1027/pBS1153), respectively.
Production, isolation and analysis of C-1027
S. globisporus recombinant strains were cultured following a two-step fermentation procedure reported previously, and both stages utilized the same medium (1% glycerol, 2% dextrin, 1% fish meal, 0.5% peptone, 0.2% (NH4)2SO4, 0.1% MgSO4, 0.2% CaCO3, pH 7.0).9, 10, 13, 35 Briefly, fresh spores of the recombinant strains were inoculated into 250-ml baffled flasks containing 50 ml of medium and incubated at 28 °C and 250 rpm for 48 h. The resultant seed cultures (2.5 ml) were then inoculated into 250-ml baffled flasks containing 50 ml of the same medium, and fermentation continued at 28 °C and 250 rpm for 7 days. The C-1027 overproducer SB1022 and the ΔsgcJ mutant strain SB1027 were cultured in medium without any antibiotics. All other recombinant strains used in this study were cultured in medium supplemented with 5 μg ml−1 thiostrepton to retain the introduced plasmids.
Isolation and HPLC analysis of the C-1027 chromophore were carried out by following published procedures.9, 10, 13, 35 Briefly, fermentation broth (50 ml) was adjusted to pH 4.0 with 0.1 N HCl and centrifuged to remove any precipitate. To the supernatant, (NH4)2SO4 was then added to 50% saturation, and the precipitated C-1027 chromoprotein was collected by centrifugation and dissolved in 2 ml of 0.1 M potassium phosphate, pH 8.0. The latter was extracted with 2 ml of EtOAc twice, and the combined EtOAc extract was concentrated in vacuo and re-dissolved in CH3OH. HPLC was carried out on a Beckman ultrasphere-ODS dp analytical column (5 μm, 150 × 4.6 mm) (Beckman Coulter, Indianapolis, IN), eluted isocratically with 20 mM potassium phosphate (pH 6.8)/CH3CN (50:50 v/v) at a flow rate of 1.0 ml min−1 and UV detection at 350 nm on a Varian HPLC system with a Prostar 330 PDA detector (Agilent Technologies). LC-MS analysis of C-1027 was performed on an Agilent 6230 TOF LC-MS instrument (Agilent Technologies).
Determination of C-1027 production by bioassay and biochemical induction assays
Determination of C-1027 production by bioassay against M. luteus ATCC 9431 was carried out as described previously.9, 10, 13, 35 Alternatively, C-1027 production was also followed by the biochemical induction assay according to literature procedures,11 which uses the E. coli BR513 strain as an indicator and specifically detects agents with DNA-damaging activity. Briefly, 10 μl of fermentation supernatant or an agar plug was applied onto agar plates seeded with E. coli BR513 and incubated for 3–4 h at 37 °C. The plates were then overlaid with soft agar containing 0.7 mg ml−1 of X-gal and incubated at 37 °C for an additional 30–60 min to develop the characteristic blue color, indicative of DNA damage, and thus C-1027 production.
Gene expression and protein purification
sgcJ from S. globisporus genomic DNA13 and ncs-orf16 from S. carzinostaticus genomic DNA15 were amplified by PCR with KOD Hot Start DNA polymerase (EMD Millipore, Billerica, MA), following the manufacturer’s protocols and using primers sgcJ-F and sgcJ-R, and orf16-F and orf16-R, respectively (Supplementary Table S3). The amplification buffer was supplemented with betaine to a final concentration of 2.5 M. The PCR products were purified and cloned into pMCSG57, yielding pBS1154 (expressing sgcJ) and pBS1155 (expressing ncs-orf16), by the ligation-independent procedures.41 The expression plasmids were then transformed into E. coli BL21(DE3)-Gold strain (Stratagene, San Diego, CA) for protein production. Production and purification of SeMet-labeled SgcJ and NCS-Orf16 were performed according to the standard protocol.42 Briefly, the cells were cultured at 37 °C in 1 L of enriched M9 medium42 until OD600=1.0. After air-cooling the culture at 4 °C for 60 min, inhibitory amino acids (25 mg each per liter of L-valine, L-isoleucine, L-leucine, L-lysine, L-threonine and L-phenylalanine), selenomethionine (SeMet) and isopropyl-β-D-thiogalactoside (IPTG) were added. The cells were incubated overnight at 18 °C, harvested and re-suspended in lysis buffer (500 mM NaCl, 5% (v/v) glycerol, 50 mM HEPES pH 8.0, 20 mM imidazole and 10 mM β-mercaptoethanol). The SeMet-labeled proteins were purified using Ni-NTA affinity chromatography on the AKTAxpress system (GE Healthcare Life Sciences, Marlborough, MA) and digested with recombinant His6-tagged Tobacco etch virus (TEV) protease to remove the His6-tag. The final pure proteins were concentrated using Amicon Ultra-15 concentrators (Millipore, Bedford, MA) in 20 mM HEPES pH 8.0 buffer, 250 mM NaCl and 2 mM dithiothreitol.
Protein concentrations were determined based on the absorbance at 280 nm using a molar absorption coefficient (ɛ280=19,480 and 20,970 M−1 cm−1 for SgcJ and NCS-Orf16, respectively).43 The concentrations of SgcJ and NCS-Orf16 used for crystallization were both ~50 mg ml−1. Size-exclusion chromatography was performed using a Superdex 200 16/600 column (GE Healthcare Life Sciences) with an Äkta FPLC chromatographic system (GE Healthcare Life Sciences) at 4 °C. The column was calibrated with a size-exclusion calibration kit (GE Healthcare Life Sciences) and developed with the elution buffer (200 mM NaCl, 100 mM Tris, pH 8.0) at flow rate of 0.5 ml min−1 with UV detection at 280 nm.
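The absorbance-based concentration determination described above rests on the Beer-Lambert law, A = ε·c·l. A minimal sketch of the conversion from a diluted A280 reading to mg ml−1 is given below. The molar absorption coefficients are those stated in the text; the molecular mass of 15,000 Da, the 1-cm path length, the 100-fold dilution and the example reading of 0.65 are assumptions chosen for illustration only.

```python
def molar_concentration(a280: float, epsilon: float,
                        path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Concentration in mol/L of the undiluted sample from a (diluted) A280 reading."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 * dilution / (epsilon * path_cm)


def mg_per_ml(molar: float, mass_da: float) -> float:
    """Convert mol/L to mg/ml (mol/L x g/mol = g/L, numerically equal to mg/ml)."""
    return molar * mass_da


EPSILON_280 = {"SgcJ": 19_480.0, "NCS-Orf16": 20_970.0}  # M^-1 cm^-1, from the text
ASSUMED_MASS_DA = 15_000.0  # assumption for illustration, not from the paper

# Hypothetical reading: A280 = 0.65 measured after a 100-fold dilution.
conc_m = molar_concentration(0.65, EPSILON_280["SgcJ"], dilution=100.0)
print(f"SgcJ: {conc_m * 1e3:.2f} mM, ~{mg_per_ml(conc_m, ASSUMED_MASS_DA):.0f} mg/ml")
```

With these assumed inputs the estimate falls in the same range as the ~50 mg ml−1 stocks used for crystallization; the real value depends on the actual molecular mass of each construct.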
Protein crystallization
Both SgcJ and NCS-Orf16 were screened for crystallization conditions using a Mosquito liquid dispenser (TTP Labtech, Melbourn, UK) and the sitting-drop vapor-diffusion technique in 96-well CrystalQuick plates (Greiner Bio-one, Monroe, NC). For each condition, 0.4 μl of protein (52.8 mg ml−1) and 0.4 μl of crystallization formulation were mixed. The mixture was equilibrated against 140 μl of the reservoir in the well. Commercially available crystallization screens were used, including MCSG-1–4 (Microlytic Inc., Burlington, MA) at 24 °C, 16 °C and 4 °C. For SgcJ, crystals were obtained under several conditions, the most promising being 0.1 M Na2HPO4 (adjusted to pH 4.2 with citric acid) and 40% (v/v) PEG 300 at 16 °C. The crystals grew within 1 week and reached sizes of approximately 0.100 mm × 0.020 mm × 0.010 mm. For NCS-Orf16, crystals suitable for X-ray diffraction were grown from a condition containing 0.2 M sodium formate and 20% (w/v) PEG 3350 at 16 °C.
Data collection, structure determination and refinement
Diffraction data were collected at 100 K at the 19-ID beamline of the Structural Biology Center at the Advanced Photon Source, Argonne National Laboratory.44 A single data set was collected near the Se K-edge peak anomalous position (0.9792 Å) from a single protein crystal of SgcJ to a resolution of 1.70 Å. The crystal was exposed for 3 s per 1.0° rotation with a crystal-to-detector distance of 240 mm. The data were recorded on an ADSC Quantum 315r CCD detector. For NCS-Orf16, data collection was the same except that the crystal-to-detector distance was 327 mm, and three data sets were collected and merged. Data collection strategy, integration and scaling were performed with the HKL3000 program package.45 A summary of the crystallographic data can be found in Table 1.
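The data collection geometry above can be checked with two standard relations: the photon energy E = hc/λ and Bragg's law, λ = 2d·sin θ. The sketch below is illustrative only and assumes a flat detector normal to the beam with the beam at its center, and that the NCS-Orf16 data were collected at the same wavelength as the SgcJ data; the function names are hypothetical.

```python
import math

# h*c expressed in keV * Angstrom.
HC_KEV_ANGSTROM = 12.398419843


def photon_energy_kev(wavelength_a):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_a


def spot_radius_mm(d_spacing_a, wavelength_a, distance_mm):
    """Radial distance from the beam center at which a reflection of the
    given d-spacing lands on a flat detector normal to the beam."""
    sin_theta = wavelength_a / (2.0 * d_spacing_a)  # Bragg's law
    if not 0.0 < sin_theta < 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    theta = math.asin(sin_theta)
    # The diffracted beam leaves at an angle of 2*theta from the direct beam.
    return distance_mm * math.tan(2.0 * theta)


if __name__ == "__main__":
    wavelength = 0.9792  # Angstrom, from the text
    print("energy (keV):", round(photon_energy_kev(wavelength), 3))
    # SgcJ: 1.70 Angstrom at 240 mm; NCS-Orf16: 2.72 Angstrom at 327 mm.
    print("SgcJ radius (mm):", round(spot_radius_mm(1.70, wavelength, 240.0), 1))
    print("NCS-Orf16 radius (mm):", round(spot_radius_mm(2.72, wavelength, 327.0), 1))
```

Under these assumptions, the computed radius shows how far from the beam center the highest-resolution reflections fall, which is what ties the chosen crystal-to-detector distance to the reported resolution limit.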
The crystal structures of SgcJ and NCS-Orf16 were determined by SAD phasing, utilizing the anomalous signal from Se atoms with shelxc/d/e,46 mlphare,47 and dm48 in HKL300045 for SgcJ and SOLVE/RESOLVE49 for NCS-Orf16, and refined to 1.7 and 2.72 Å, respectively. For SgcJ, the initial model contained two protein chains, each comprising at least 90% of its residues. For NCS-Orf16, the initial model contained 10 protein chains comprising 67% of the complete model, with side chains assigned for 20% of the residues. Extensive manual model building with COOT50 and subsequent refinement using phenix.refine51 were performed until the R-factors converged to final values of R(Rfree)=0.168(0.195) and 0.217(0.256) for the structures of SgcJ and NCS-Orf16, respectively. The geometrical properties of the models were assessed using PROCHECK52 and Molprobity.53 The atomic coordinates and structure factors have been deposited in the Protein Data Bank with the accession codes 4I4K for SgcJ and 4OVM for NCS-Orf16.
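The R and Rfree values quoted above are the conventional crystallographic residual, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, with Rfree computed over a test set of reflections excluded from refinement. The sketch below shows only that arithmetic on made-up amplitudes; it is not how phenix.refine computes its statistics, it assumes that Fcalc is already on the same scale as Fobs, and the function name is hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor over paired structure-factor amplitudes.

    Assumes f_calc has already been scaled to f_obs; real refinement
    programs apply scaling and bulk-solvent corrections first.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    # Toy amplitudes, invented for illustration only.
    work_obs, work_calc = [100.0, 50.0, 25.0], [90.0, 55.0, 25.0]
    free_obs, free_calc = [80.0, 40.0], [70.0, 46.0]
    print("R     =", r_factor(work_obs, work_calc))
    print("Rfree =", r_factor(free_obs, free_calc))
```

Because the test-set reflections do not drive the refinement, Rfree is normally somewhat higher than R, as in the values reported for both structures.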
References
Van Lanen, S. G. & Shen, B. Biosynthesis of enediyne antitumor antibiotics. Curr. Top. Med. Chem. 8, 448–459 (2008).
Liang, Z. X. Complexity and simplicity in the biosynthesis of enediyne natural products. Nat. Prod. Rep. 27, 499–528 (2010).
Shen, B. et al. Enediynes: exploration of microbial genomics to discover new anticancer drug leads. Bioorg. Med. Chem. Lett. 25, 9–15 (2015).
Rudolf, J. D., Yan, X. & Shen, B. Genome neighborhood network reveals insights into enediyne biosynthesis and facilitates prediction and prioritization for discovery. J. Ind. Microbiol. Biotechnol. 43, 261–276 (2016).
Edo, K. et al. The structure of neocarzinostatin chromophore possessing a novel bicyclo [7,3,0]dodecadiyne system. Tetrahedron Lett. 26, 331–340 (1985).
Kennedy, D. R. et al. Single chemical modifications of the C-1027 enediyne core, a radiomimetic antitumor drug, affect both drug potency and the role of ataxia-telangiectasia mutated in cellular responses to DNA double-strand breaks. Cancer Res. 67, 773–781 (2007).
Kennedy, D. R., Ju, J., Shen, B. & Beerman, T. A. Designer enediynes generate DNA breaks, interstrand cross-links, or both, with concomitant changes in the regulation of DNA damage responses. Proc. Natl Acad. Sci. USA 104, 17632–17637 (2007).
Beerman, T. A., Gawron, L. S., Shin, S., Shen, B. & McHugh, M. M. C-1027, a radiomimetic enediyne anticancer drug, preferentially targets hypoxic cells. Cancer Res. 69, 593–598 (2009).
Chen, Y., Yin, M., Horsman, G. P., Huang, S. & Shen, B. Manipulation of pathway regulation in Streptomyces globisporus for overproduction of the enediyne antitumor antibiotic C-1027. J. Antibiot. 63, 482–485 (2010).
Chen, Y., Yin, M., Horsman, G. P. & Shen, B. Improvement of the enediyne antitumor antibiotic C-1027 production by manipulating its biosynthetic pathway regulation in Streptomyces globisporus. J. Nat. Prod. 74, 420–424 (2011).
Zazopoulos, E. et al. A genomics-guided approach for discovering and expressing cryptic metabolic pathways. Nat. Biotechnol. 21, 187–190 (2003).
Liu, W. et al. Rapid PCR amplification of minimal enediyne polyketide synthase cassettes leads to a predictive familial classification model. Proc. Natl Acad. Sci. USA 100, 11959–11963 (2003).
Liu, W., Christenson, S. D., Standage, S. & Shen, B. Biosynthesis of the enediyne antitumor antibiotic C-1027. Science 297, 1170–1173 (2002).
Ahlert, J. et al. The calicheamicin gene cluster and its iterative type I enediyne PKS. Science 297, 1173–1176 (2002).
Liu, W. et al. The neocarzinostatin biosynthetic gene cluster from Streptomyces carzinostaticus ATCC 15944 involving two iterative type I polyketide synthases. Chem. Biol. 12, 293–302 (2005).
Van Lanen, S. G., Oh, T. J., Liu, W., Wendt-Pienkowski, E. & Shen, B. Characterization of the maduropeptin biosynthetic gene cluster from Actinomadura madurae ATCC 39144 supporting a unifying paradigm for enediyne biosynthesis. J. Am. Chem. Soc. 129, 13082–13094 (2007).
Lohman, J. R. et al. Cloning and sequencing of the kedarcidin biosynthetic gene cluster from Streptoalloteichus sp. ATCC 53650 revealing new insights into biosynthesis of the enediyne family of antitumor antibiotics. Mol. BioSyst. 9, 478–491 (2013).
McGlinchey, R. P., Nett, M. & Moore, B. S. Unraveling the biosynthesis of the sporolide cyclohexenone building block. J. Am. Chem. Soc. 130, 2406–2407 (2008).
Lane, A. L. et al. Structures and comparative characterization of biosynthetic gene clusters for cyanosporasides, enediyne-derived natural products from marine actinomycetes. J. Am. Chem. Soc. 135, 4171–4174 (2013).
Gao, Q. & Thorson, J. S. The biosynthetic genes encoding for the production of the dynemicin enediyne core in Micromonospora chersina ATCC53710. FEMS Microbiol. Lett. 282, 105–114 (2008).
Zhang, J. et al. A phosphopantetheinylating polyketide synthase producing a linear polyene to initiate enediyne antitumor antibiotic biosynthesis. Proc. Natl Acad. Sci. USA 105, 1460–1465 (2008).
Horsman, G. P., Chen, Y., Thorson, J. S. & Shen, B. Polyketide synthase chemistry does not direct biosynthetic divergence between 9- and 10-membered enediynes. Proc. Natl Acad. Sci. USA 107, 11331–11335 (2010).
Belecki, K., Crawford, J. M. & Townsend, C. A. Production of octaketide polyenes by the calicheamicin polyketide synthase CalE8: implications for the biosynthesis of enediyne core structures. J. Am. Chem. Soc. 131, 12564–12566 (2009).
Belecki, K. & Townsend, C. A. Environmental control of the calicheamicin polyketide synthase leads to detection of a programmed octaketide and a proposal for enediyne biosynthesis. Angew. Chem. Int. Ed. 51, 11316–11319 (2013).
Belecki, K. & Townsend, C. A. Biochemical determination of enzyme-bound metabolites: preferential accumulation of a programmed octaketide on the enediyne polyketide synthase CalE8. J. Am. Chem. Soc. 135, 14339–14348 (2013).
Eberhardt, R. Y. et al. Filling out the structural map of the NTF2-like superfamily. BMC Bioinformatics 14, 327 (2013).
Cha, H. J. et al. Rescue of deleterious mutations by the compensatory Y30F mutation in ketosteroid isomerase. Mol. Cells 36, 39–46 (2013).
Lundqvist, T. et al. Crystal structure of scytalone dehydratase—a disease determinant of the rice pathogen, Magnaporthe grisea. Structure 2, 937–944 (1994).
Sultana, A. et al. Structure of the polyketide cyclase SnoaL reveals a novel mechanism for enzymatic aldol condensation. EMBO J. 23, 1911–1921 (2004).
Ames, B. D. et al. Crystal structure and functional analysis of tetracenomycin ARO/CYC: implications for cyclization specificity of aromatic polyketides. Proc. Natl Acad. Sci. USA 105, 5349–5354 (2008).
Paschal, B. M. & Gerace, L. Identification of NTF2, a cytosolic factor for nuclear import that interacts with nuclear pore complex protein p62. J. Cell Biol. 129, 925–937 (1995).
Hoelz, A., Nairn, A. C. & Kuriyan, J. Crystal structure of a tetradecameric assembly of the association domain of Ca2+/calmodulin-dependent kinase II. Mol. Cell 11, 1241–1251 (2003).
Ott, M. et al. Mba1, a membrane-associated ribosome receptor in mitochondria. EMBO J. 25, 1603–1610 (2006).
Gust, B., Challis, G. L., Fowler, K., Kieser, T. & Chater, K. F. PCR-targeted Streptomyces gene replacement identifies a protein domain needed for biosynthesis of the sesquiterpene soil odor geosmin. Proc. Natl Acad. Sci. USA 100, 1541–1546 (2003).
Liu, W. & Shen, B. Genes for production of the enediyne antitumor antibiotic C-1027 in Streptomyces globisporus are clustered with the cagA gene that encodes the C-1027 apoprotein. Antimicrob. Agents Chemother. 44, 382–392 (2000).
Holm, L. & Rosenstrom, P. Dali server: conservation mapping in 3D. Nucleic Acids Res. 38, W545–W549 (2010).
Marchler-Bauer, A. et al. CDD: NCBI's conserved domain database. Nucleic Acids Res. 43, D222–D226 (2015).
Bullock, T. L., Clarkson, W. D., Kent, H. M. & Stewart, M. The 1.6 angstroms resolution crystal structure of nuclear transport factor 2 (NTF2). J. Mol. Biol. 260, 422–431 (1996).
Jacobson, M. P., Kalyanaraman, C., Zhao, S. & Tian, B. Leveraging structure for enzyme function prediction: methods, opportunities, and challenges. Trends Biochem. Sci. 39, 363–371 (2014).
Kieser, T., Bibb, M. J., Buttner, M. J., Chater, K. F. & Hopwood, D. A. Practical Streptomyces Genetics, The John Innes Foundation, Norwich, UK (2000).
Aslanidis, C. & de Jong, P. J. Ligation-independent cloning of PCR products (LIC-PCR). Nucleic Acids Res. 18, 6069–6074 (1990).
Kim, Y. et al. Automation of protein purification for structural genomics. J. Struct. Funct. Genomics 5, 111–118 (2004).
Gill, S. C. & von Hippel, P. H. Calculation of protein extinction coefficients from amino acid sequence data. Anal. Biochem. 182, 319–326 (1989).
Rosenbaum, G. et al. The Structural Biology Center 19ID undulator beamline: facility specifications and protein crystallographic results. J. Synchrotron Radiat. 13, 30–45 (2006).
Minor, W., Cymborowski, M., Otwinowski, Z. & Chruszcz, M. HKL-3000: the integration of data reduction and structure solution—from diffraction images to an initial model in minutes. Acta Cryst. D62, 859–866 (2006).
Sheldrick, G. M. Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Cryst. D66, 479–485 (2010).
Otwinowski, Z. In Isomorphous Replacement and Anomalous Scattering, Proceedings of the CCP4 Study (eds Wolf, W., Evans, P. R. & Leslie, A. G. W.) 80–86 (1991).
Cowtan, K. DM: an automated procedure for phase improvement by density modification. Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography 31, 34–38 (1994).
Terwilliger, T. SOLVE and RESOLVE: automated structure solution, density modification and model building. J. Synchrotron Radiat. 11, 49–52 (2004).
Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Cryst. D60, 2126–2132 (2004).
Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Cryst. D66, 213–221 (2010).
Laskowski, R. A., Macarthur, M. W., Moss, D. S. & Thornton, J. M. Procheck—a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26, 283–291 (1993).
Davis, I. W. et al. MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res. 35, W375–W383 (2007).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
Robert, X. & Gouet, P. Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res. 42, W320–W324 (2014).
Acknowledgements
This work is supported in part by a fellowship of Academia Sinica-The Scripps Research Institute Postdoctoral Talent Development Program (to C-YC), a German Research Foundation postdoctoral fellowship (to IC), US National Institute of General Medical Science Protein Structure Initiative Grants GM094585 (to AJ) and GM098248 (to GNP) and US National Institutes of Health Grants GM109456 (to GNP), CA078747 (to BS) and GM115575 (to BS). The use of Structural Biology Center beamlines at the Advanced Photon Source was supported in part by the US Department of Energy, Office of Biological and Environmental Research, under contract DE-AC02-06CH11357.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Dedicated to Professor David E Cane, Brown University, for his distinguished contributions to natural product biosynthesis and engineering.
Supplementary Information accompanies the paper on The Journal of Antibiotics website
Cite this article
Huang, T., Chang, CY., Lohman, J. et al. Crystal structure of SgcJ, an NTF2-like superfamily protein involved in biosynthesis of the nine-membered enediyne antitumor antibiotic C-1027. J Antibiot 69, 731–740 (2016). https://doi.org/10.1038/ja.2016.88
DOI: https://doi.org/10.1038/ja.2016.88