Main

Lincosamides, including lincomycins A and B (1 and 2) (refs. 1,2), Bu-2545 (3) (ref. 3) and celesticetin (4) (ref. 4), are antibacterial compounds effective against Gram-positive bacteria (Fig. 1a)5,6,7,8. Lincomycin A and its semisynthetic derivative clindamycin are clinically used to treat severe bacterial infections in people allergic to penicillin9. The lincosamide structures feature an unusual thiooctose core connected via an amide bond with N-methyl-trans-4-propyl-l-proline (N-methyl-PPL; in 1) or N-methyl-l-proline (N-methyl-Pro; in 3 and 4) (ref. 10). Lincosamides block bacterial protein synthesis at the early stage of the elongation cycle by targeting the peptidyltransferase domain of the 50S ribosomal subunit, because of their structural similarity to the 3′-end of l-Pro-Met-transfer RNA (tRNA) and deacylated tRNA11,12. The structure–activity relationship of lincosamide compounds indicated that the connection between the N-methyl-PPL/Pro and the thiooctose is essential for the antibacterial activity and the amino acid selectivity (PPL versus Pro) is also important for the level of antibacterial activity13.

Fig. 1: Biosynthetic pathway of lincosamide antibiotics.

a, Structures of lincosamide antibiotics. b, Amide bond formation in the biosynthesis of 1 and 4. c, The enzyme reaction of CcbD with 8.

In the biosynthesis of 1 and 4, the connection between PPL/Pro and thiooctose is catalysed by LmbC, LmbN and LmbD, and CcbC, CcbZ and CcbD, respectively5,6,7,14,15,16. First, the stand-alone adenylation enzymes LmbC/CcbC catalyse the loading of PPL or Pro onto the phosphopantetheine arm of the carrier protein (CP) domain of LmbN/CcbZ14,17. The condensation enzymes LmbD/CcbD then catalyse the condensation of CP-tethered PPL, or Pro and ergothioneine (EGT) S-conjugated thiooctose 5, to form an amide bond in 6 and 7, respectively (Fig. 1b). The amino acid selectivity is determined by the LmbC and CcbC specificities, which strictly recognize PPL and Pro to form LmbN-CP-tethered PPL and CcbZ-CP-tethered Pro, respectively17. In contrast, CcbD shows relaxed substrate specificity towards both of its substrates, CP-tethered Pro and thiooctose14,17. Specifically, as the aminoacyl substrate, CcbD can utilize both CcbZ-CP-tethered Pro (its natural substrate) and LmbN-CP-tethered PPL (intermediate from 1 biosynthesis), and it also accepts methylthiolincosamide (MTL: 8), in which EGT is substituted with a C1-S-methyl group bearing the opposite stereochemistry compared with the natural substrate 5, to generate 9 and 10, respectively (Fig. 1c)15.

Although this CP-dependent condensation system is similar to those of non-ribosomal peptide synthetases (NRPSs), LmbD/CcbD show low sequence similarity to any functionally characterized amide bond-forming enzymes (below 15%)13,14. Furthermore, the model structure of CcbD predicted by AlphaFold2 (ref. 18) does not resemble those of other amide bond-forming enzymes (Supplementary Fig. 1)19,20,21,22,23,24,25. These observations suggest that LmbD/CcbD possess an unusual fold and employ a different mechanism for amide bond formation from other enzymes. Furthermore, although the function of CcbD has been investigated by in vivo and in vitro analyses, its substrate specificities towards CP-tethered acyl-substrates, overall structures, catalytic residues and catalytic mechanisms remain unclear.

In this Article, we performed structure–function analyses of CcbD by in vitro characterization, crystallization and structure-based mutagenesis. The enzyme reaction analyses with l-prolyl-CoA and unnatural acyl substrates revealed that the enzyme recognizes the CP moiety of the substrate, but shows broad substrate specificity towards the acyl moiety to generate various unnatural lincosamide compounds. The crystallographic analyses of CcbD indicated that its overall structure is not related to known structurally characterized amide bond-forming enzymes, while its N-terminal region is weakly similar to those of cysteine proteases26. Furthermore, site-directed mutagenesis experiments indicated that CcbD utilizes Cys-His-Glu residues as a catalytic triad. Lastly, the complex structures of CcbD with 8 and CcbD with CP revealed the molecular basis for the recognition of substrates and CP, and the mechanism of the CcbD-catalysed CP-dependent amide bond formation reaction.

Results

Substrate specificity of CcbD

First, we compared the CcbD reaction between 5 and 8 to understand the substrate preference of CcbD for the C1-S-alkyl moiety. The yield of the product from 8 was comparable to that from 5, indicating that CcbD does not recognize the structure of the C1-S-alkyl moiety in the enzyme reaction (Extended Data Fig. 1). Second, while CcbD shows substrate tolerance towards the alkyl-chain length of PPL17, the requirement of CP and the substrate scope for non-proline acyl-substrates have not been clarified. Therefore, to understand the importance of CP in the enzyme reaction, l-Pro-CoA was incubated with CcbD and 8. We determined that CcbD also uses l-Pro-CoA as a substrate, instead of CP-tethered l-Pro, although the reaction efficiency is significantly lower than that with CP-tethered l-Pro (Extended Data Fig. 2). We next tested various acyl-CPs, including acetyl-CP (9), butanoyl-CP (10), hexanoyl-CP (11), octanoyl-CP (12), alanoyl-CP (13), pipecolyl-CP (14) and phenylalanoyl-CP (15), as substrates of CcbD. Interestingly, CcbD accepted all of them as substrates to generate 16–24 with yields of ~8–34% (Extended Data Fig. 3 and Supplementary Table 1). CcbD produced two compounds each, 18/19 and 20/21, from 11 and 12, respectively. The structure of 19 was determined to be N-hexanoyl-8 by comparison with an authentic compound (Supplementary Fig. 2a–c and Supplementary Table 2). Further MS/MS analysis suggested that 21 is N-octanoyl-8, and 18 and 20 are products resulting from ester bond formation at the C7-hydroxyl group (Supplementary Fig. 2c,d). Moreover, we also tested various amino-containing compounds 25–36 as substrates of CcbD (Extended Data Fig. 4). Surprisingly, CcbD accepted 32–36 as substrates to generate dipeptide compounds 37–41, although the yields are quite low, suggesting that CcbD recognizes the ethanolamine moiety of substrates.

These results indicated that the CP moiety is important for the precise substrate recognition by CcbD, while CcbD shows more relaxed specificity for the proline moiety and amino substrates to generate unnatural-type lincosamides and dipeptides.

Overall structure and active site of CcbD

To understand the structural basis for the CP-dependent amide bond-forming reaction by CcbD, we solved the structure of selenomethionine-labelled CcbD at 2.5 Å resolution (Fig. 2 and Supplementary Table 3). CcbD exists as a homodimer, and the overall structure of CcbD consists of three domains: the α/β-fold domain, which contains five α-helices and six β-strands; the four anti-parallel β-sheet domain; and the C-terminal five α-helix domain (Fig. 2a). The CcbD monomer possesses a large cleft between the α/β-fold domain and the C-terminal α-helix domain. The four anti-parallel β-sheet domain of the adjoining monomer inserts into this cleft to form a dimer. The dimer is stabilized through electrostatic, hydrophilic and hydrophobic interactions between the amino acid residues on each inserted β-sheet domain, and the dimer interface buries ~2,850 Å². The monomers are nearly identical to each other, with a root-mean-square deviation (RMSD) of 0.3 Å. The dimerization creates a large cavity at the dimer interface, consisting of α1, α5, α10 and a loop between α6 and β5 from monomer A, and α10 and a loop between β1 and β2 from monomer B. The estimated total volume and area of the active-site cavity are 790 Å³ and 920 Å², respectively.
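For readers less familiar with the RMSD values quoted throughout, the following is a minimal sketch of the quantity itself. It assumes the two coordinate sets are already superimposed and paired atom by atom; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the CcbD structure or from the software used in this work.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the coordinate sets are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å (illustration only).
monomer_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
monomer_b = [(0.0, 0.3, 0.0), (1.5, 0.3, 0.0), (3.0, 0.3, 0.0)]
print(round(rmsd(monomer_a, monomer_b), 2))
```

In this toy case every atom is displaced by the same 0.3 Å, so the RMSD equals that displacement; in real structures the value averages over unequal per-atom deviations.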

Fig. 2: The structure of CcbD.

a, The homodimer structure of CcbD. The α/β-fold domain, four anti-parallel β-sheets and C-terminal five α-helix domain are coloured green, magenta and cyan, respectively. b, The active site pocket of CcbD and the catalytic triad. c, The active site architecture and binding mode of substrate 8. Dashed yellow lines represent hydrogen bonds. Dashed cyan lines show distances between the amino group of 8 and catalytic residues. Water molecules are shown with red non-bonded spheres. d, Fo–Fc polder map of 8. The electron density maps of 8 (contoured at +3.0σ) in c and d are represented by a grey mesh. The catalytic residues are highlighted with red.

Interestingly, despite the lack of sequence similarity, the α/β-fold domain of CcbD shares weak structural similarity with those of cysteine proteases, including a putative C39-like peptidase protein with unknown function from Bacillus anthracis (PDB code 3ERV) and the phytochelatin synthase (PCS)-like enzyme NsPCS from Nostoc sp. PCC 7120 (PDB code 2BTW), with RMSDs of 3.3 Å and 3.3 Å for the Cα atoms, respectively (13% and 10% identity)26. However, the four anti-parallel β-sheets and the C-terminal five α-helix domain, which are important for the homodimerization of CcbD, were not observed in cysteine proteases and NsPCS (Extended Data Fig. 5a–c). Comparison of the active sites of CcbD and NsPCS indicated that the active site shape and residues are not conserved except for the catalytic triad, as described below. The ligand binding site of NsPCS is located on the surface of the monomer, while CcbD forms a large pocket to specifically accept acyl-tethered CP (Extended Data Fig. 5d,e).

Identification of catalytic residues

Cysteine proteases and PCSs utilize a Cys-His-Asn/Asp catalytic triad for the enzyme reaction27,28,29,30. CcbD also possesses a similar triad composed of residues Cys17, His131 and Glu148 in the putative active site cavity (Fig. 2b). The C17A, H131A and E148A variants completely lost the amide bond-forming activity, indicating that these residues play a catalytic role in the enzyme reaction of CcbD (Fig. 3a,b). The structural similarities and conservation of the catalytic triad between cysteine proteases and CcbD suggested that the first step of the CcbD reaction is the formation of the CcbD–PPL/Pro complex, by the nucleophilic attack of Cys17 on the thioester group of CP-tethered PPL/Pro.

Fig. 3: Mutagenesis experiment of residues in the active site of CcbD.

a, LC–MS charts of the reaction products of CcbD WT and variants. b, Relative activities of CcbD variants. The bars are means of n = 3 independent experiments, and the error bars indicate s.d. All experiments were repeated independently three times with similar results.

Binding mode of the thiooctose substrate

We solved the complex structure of CcbD with 8 at 2.25 Å resolution to investigate the binding mode of thiooctose. Compound 8 binds near Cys17 in the catalytic centre through hydrogen bond interactions (Fig. 2c,d). The C2-hydroxy group interacts with Arg263′ from the adjoining monomer, and the C3-hydroxy group also forms a hydrogen bond network with Arg263′ and Asp264′ via a water molecule. Furthermore, the C4-hydroxy group interacts with Tyr270 via a water molecule and the main chain amides of Met128 and Leu129. The position of the C7-hydroxy group is fixed by the hydrogen bond with the Asp114 side chain and the main chain carbonyl of Leu129. The C6-amino group is positioned close to the catalytic residues Cys17 and His131, at distances of 5.2 Å and 3.1 Å, respectively, suggesting that His131, once activated by Glu148, abstracts the hydrogen atom to activate the amino group of 8, and the activated amino group attacks the thioester of the Cys-tethered PPL/Pro intermediate. The D114A and R263′A variants, whose side chains interact with the sugar substrate, abolished the amide bond-forming activity, suggesting that the hydrogen bond networks with the sugar moiety of the substrate are important for substrate recognition (Fig. 3). The S-methyl group of 8 is oriented towards the entrance of the active site cavity, and there is sufficient space to accept the EGT moiety of the natural substrate 5. The ethanolamine moiety of 32–36 would bind in similar positions to those of the C6-amino and C7-hydroxy groups of 8 and is likely to be recognized by Asp114, Leu129 and the catalytic His131, as in the binding of 8. Furthermore, this sugar binding site is large enough to bind other amino-containing compounds, including bulky phenylalanol and tryptophanol.

Cross-linking reaction of CcbD and CcbZ-CP

To understand how CcbD selectively recognizes CP during the reaction, we attempted to crystallize CcbD and the CcbZ-CP complex. First, we performed a site-specific cross-linking reaction of CcbD and CcbZ-CP by using a bifunctional maleimide reagent, 1,2-bis(maleimido)ethane (BMOE), to obtain covalently cross-linked CcbD/CcbZ-CP31. While CcbZ-CP does not have any cysteine residues, CcbD possesses five, with Cys216 and Cys249 located on the surface of the enzyme (Supplementary Fig. 3). Therefore, we substituted these cysteines with serine to prevent non-specific cross-linking reactions, and the resulting C216S/C249S variant was used for cross-linking to the sulfhydryl group of the 4′-phosphopantetheine in CcbZ-CP. The incubation of the C216S/C249S variant and holo-CcbZ-CP in the presence of BMOE efficiently afforded the covalent complex (Extended Data Fig. 6). In contrast, no apparent cross-linking was observed between the C17S/C216S/C249S variant and CcbZ-CP (Extended Data Fig. 6). These results clearly suggested that the specific cross-linking reaction occurred between the sulfhydryl groups of the 4′-phosphopantetheine of CcbZ-CP and Cys17 via BMOE in the active site of CcbD.

We purified and crystallized the cross-linked CcbD/CcbZ-CP complex. The structure of the CcbD/CcbZ-CP complex was solved at 2.3 Å resolution. In the complex structure, CcbZ-CP is located between the α/β-fold domain and the C-terminal α-helical domain, and covers the active site entrance of CcbD (Fig. 4a). The structure of CcbD in the complex is almost identical to that in the wild type (WT), with an RMSD of 0.6 Å for the Cα atoms. The interface area between CcbD and CcbZ-CP comprises 770 Å², which is 4.8% of the surface area of CcbD and 17% of the surface area of CcbZ-CP. This contact area is similar to those of other complex structures of CPs and enzymes, such as the polyketide synthase and NRPS machineries32,33,34.
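As a back-of-envelope check on the interface figures above, the total surface areas implied by the stated percentages can be recovered by simple division. This is only a sketch: the resulting totals are not reported in the text, the percentages are rounded, and the helper name `implied_total_area` is hypothetical.

```python
def implied_total_area(interface_area, fraction):
    """Total surface area implied by an interface area and its fractional share."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return interface_area / fraction

interface = 770.0  # Å², interface area stated in the text

# Fractions as stated in the text (4.8% and 17%); the totals are derived, not reported.
ccbd_surface = implied_total_area(interface, 0.048)
cp_surface = implied_total_area(interface, 0.17)

print(f"CcbD surface    ~ {ccbd_surface:,.0f} Å²")
print(f"CcbZ-CP surface ~ {cp_surface:,.0f} Å²")
```

Under these assumptions CcbD presents roughly 16,000 Å² of surface and CcbZ-CP roughly 4,500 Å², which illustrates why the same interface is a far larger share of the small carrier protein than of the enzyme.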

Fig. 4: The structure of CcbD/CcbZ-CP complex.

a, The overall structure of the CcbD/CcbZ-CP complex. The CcbD and CcbZ-CP domains are represented by grey and yellow cartoons, respectively. b, The active site of CcbD in the complex structure. c, The interaction between CcbD and CcbZ-CP. The residues on CcbD and CcbZ-CP are coloured grey and yellow, respectively, in b and c. Dashed yellow lines represent hydrogen bonds. The Fo–Fc polder map of covalently bonded Cys17 in CcbD, Ser36 in CcbZ-CP and the 4′-phosphopantetheine arm (contoured at +3.0σ) is represented by a grey mesh. The mutated residues are highlighted in red.

The 4′-phosphopantetheine moiety, which is covalently tethered to Ser36 of CcbZ-CP, is located in the same cavity as 8. The maleimide groups of BMOE form covalent bonds with the sulfhydryl group of the catalytic Cys17 of CcbD and the sulfhydryl group of the 4′-phosphopantetheine of CcbZ-CP (Fig. 4b). The pantetheine moiety does not form tight interactions with the active site residues, probably because BMOE is larger than the natural substrate, proline. The phosphate group of the 4′-phosphopantetheine forms a hydrogen bond network with Tyr322, Arg323 and Glu260′ of CcbD. The C1-carbonyl group of the maleimide, which forms a covalent bond with Cys17, also interacts with the main chain amide of Cys17 and the side chain of His150. Considering that the Cys17-tethered PPL/Pro should be in a similar position to that of maleimide, His150 and the main chain amide group of Cys17 would form an oxyanion hole to stabilize the tetrahedral intermediate during the thioester bond formation and subsequent amide bond formation reactions. The activity of the H150A variant is significantly reduced to 23%, indicating the importance of the oxyanion hole (Fig. 3 and Supplementary Table 4). The hydrophilic residues, including His16, Asp277 and Glu278, are located close to the catalytic Cys17 and form a hydrogen bond network via water molecules. Their H16A, D277A and E278A variants also had dramatically decreased activity, suggesting that the amino group of PPL/Pro would interact with this hydrophilic environment (Figs. 3 and 4b and Supplementary Table 4). Notably, a large space is observed deeper inside the cavity, beyond the maleimide binding site (Fig. 4b and Extended Data Fig. 7). This space is constructed by hydrophobic residues and large enough to accept long alkyl-chains and large side chains of amino acids. These observations suggest that the alkyl-chain of PPL and the other acyl substrates bind in this space, and explain why CcbD shows broad substrate specificity. 
The model structure of LmbD indicates that the catalytic triad, active site residues and hydrophobic space are well conserved, suggesting that LmbD would show properties similar to those of CcbD (Fig. 5).

Fig. 5: Comparison of the active sites of CcbD and LmbD.

a, The active site of CcbD crystal structure. b, The active site of LmbD model structure. The catalytic residues and the important residues for interaction with CP are highlighted with red.

The superimposition of the CcbD/8 and CcbD/CcbZ-CP structures and the docking model of 5 in CcbD/CcbZ-CP suggest that the binding sites for 5/8 and 4′-phosphopantetheine overlap and there is no space for the simultaneous binding of 5/8 and Pro-tethered 4′-phosphopantetheine (Extended Data Fig. 8). To further investigate the formation of CcbD-proline intermediate, we also incubated 8 with the presumed Pro-tethered CcbD, which was prepared by incubating CcbD, CcbC, holo-CcbZ-CP and proline (without 8) and removing the substrate by desalting. As a result, even after removal of the proline substrates, 10 was still produced (Supplementary Fig. 4a). Moreover, when Pro-CoA was incubated with or without CcbD, the non-enzymatic release of proline was not significantly increased in the presence of CcbD, suggesting that CcbD does not catalyse the hydrolysis of either Pro-CoA or Pro-conjugated Cys to release proline in the active site (Supplementary Fig. 4b). This could be due to the accessibility of water molecules in the active site. In the active site of NsPCS, which catalyses deglycination of glutathione, a water molecule interacting with catalytic His183 is located close to the acylated Cys70 (Extended Data Fig. 5f). In contrast, no activated water molecules interacting with basic or acidic amino acids were observed within 4 Å of Cys17 in the CcbD structures. On the basis of these observations, we believe that the formation of the CcbD-PPL/Pro complex and the ping-pong mechanism are more likely.

Interaction between CcbD and CcbZ-CP

CcbD recognizes CcbZ-CP through salt bridges, hydrogen bond interactions and hydrophobic interactions (Fig. 4c). The Arg312, Arg316 and Arg324 residues on α12 in CcbD form salt bridges with Glu47, Glu40 and Glu41 on α2 in CcbZ-CP, respectively, and Arg323 and Gln319 in CcbD interact with Gln43 in CcbZ-CP via a hydrogen bonding network. Moreover, Trp152 in CcbD is inserted into the hydrophobic region of CcbZ-CP (Fig. 4c and Supplementary Fig. 5).

To confirm the importance of these interactions, we constructed the CcbD R312A, R316A and R324A variants and their triple variant. The variants of the counterpart residues in CcbZ-CP, including E40A, E41A and E47A and their triple variant, were also constructed. Cross-linking assays to evaluate the interactions of each CcbD variant with CcbZ-CP and CcbD with CcbZ-CP variants revealed that CcbD R316A and the triple variant of CcbD, and E40A, E47A and the triple variant of CcbZ-CP significantly reduced the cross-linking efficiency (Extended Data Fig. 9). Furthermore, the amide bond formation activities of the CcbD R316A and CcbD R312A/R316A/R324A variants and the CcbZ-CP E40A/E41A/E47A variant were significantly decreased by 50%, 5% and 38%, respectively, while the activities of CcbD R312A and R324A were comparable to that of the WT (Extended Data Fig. 10 and Supplementary Table 4). Arg316 forms salt bridges with both Glu40 and Glu47 and a hydrogen bond interaction with Gln43, which is the reason why Arg316 is the most important residue for the activity. These results clearly indicated that the salt bridge and hydrophilic interactions between CcbD and CcbZ-CP are important for CP recognition. The sequence alignment between LmbN-CP and CcbZ-CP (68% identity) revealed that the hydrophilic and hydrophobic residues, observed in the interactions between CcbD and CcbZ-CP, are conserved except for Leu60. This would be the reason why CcbD accepts LmbN-CP-tethered PPL in addition to its natural substrate (Supplementary Fig. 6).

Discussion

Amide bond formation is one of the most ubiquitous reactions in nature, and a variety of amide bond-forming enzyme families have been identified, including the condensation (C)-domain of NRPS, ATP-dependent amide bond synthetase, N-acyltransferase, aminoacyl-tRNA synthetase, PCS and ATP-grasp enzymes35. Among them, the CP-dependent amide bond formation is known only in the C-domains of NRPSs and some Gcn5-related N-acetyltransferase (GNAT) enzymes35,36,37,38. The structure of the NRPS C-domain consists of two domains that form V-shaped pseudo-dimers (Supplementary Fig. 1c)19,39,40,41. The active site, formed in the interface tunnel of the two domains, contains the conserved HHxxxDG catalytic motif for positioning the α-amino group of the acceptor substrate to facilitate the nucleophilic attack to the thioester in the donor substrate. Donor and acceptor peptide CPs bind on the front and back faces of the active site tunnel. Although the structures of CP-dependent GNAT enzymes have not been reported, GNAT superfamily enzymes share the conserved GNAT-fold comprising the central β-strands flanked by α-helices (Supplementary Fig. 1h)42.

In contrast, the structure of the N-terminal domain of CcbD is weakly related to those of cysteine proteases and PCSs. The active site of CcbD is constructed on the dimer interface, while those of other cysteine proteases and PCSs are formed in the monomer cavity26. The CcbD-specific C-terminal α-helix domain is involved in dimerization, active site formation and interactions with CP, mainly through hydrophilic interactions and salt bridges. Furthermore, the C-terminal α-helix domain in the adjoining monomer also forms the active site and interacts with the 4′-phosphopantetheine.

Despite the lack of similarity in sequence and active site architecture, CcbD has a Cys17-His131-Glu148 catalytic triad similar to the catalytic triads of cysteine proteases and PCSs26,27,28,29,30,43,44,45. In the reaction of PCSs, which catalyse deglycination of glutathione and condensation between γ-Glu-Cys and glutathione to generate phytochelatins, the catalytic Cys residue is initially acylated by the first glutathione to form enzyme-associated γ-Glu-Cys. Then, the amino group of another glutathione attacks the thioester bond of the enzyme-γ-Glu-Cys intermediate to generate phytochelatins. Similarly, the chemoenzymatic peptide synthesis by cysteine proteases has been demonstrated as the reverse reaction of hydrolysis, through thermodynamically or kinetically controlled synthesis46,47. In these peptide syntheses, the acyl donor substrate, activated with an ester, amide or nitrile bond, is attacked by a nucleophilic cysteine residue to form a thioester bond with the protease. Subsequently, the thioester bond is cleaved by the amino acid substrate to generate the amide bond.

Considering the similarity of the catalytic residues, CcbD is expected to share a catalytic mechanism comparable to those of cysteine proteases and PCSs. On the basis of these observations, we propose the following mechanisms for the condensation between thiooctose and CP-tethered Pro/PPL (Fig. 6). The enzyme reaction is initiated by the binding of the carbonyl group of Pro in the CP-tethered substrate into the oxyanion hole of the active site, through interactions with the backbone of Cys17 and the imidazole ring of His150. The nucleophilic attack from Cys17 to the thioester bond in the Pro-connected CP then generates the CcbD-Pro complex, with the release of CP. Subsequently, the thiooctose binds to the active site of the CcbD-Pro complex, and the thioester at Cys17 is attacked by the amino group of thiooctose, with nucleophilicity enhanced by the proximity of His131 and Glu148, to generate an amide bond. The high sequence similarity (56% identity) and the conservation of the active site architectures between CcbD and LmbD suggest that the reaction mechanism of LmbD is the same as that of CcbD (Fig. 5). Thus, the CcbD/LmbD-catalysed condensation reaction between the amino sugar substrate and Pro/PPL is structurally and mechanistically different from those of the NRPS system but resembles those of cysteine proteases and PCSs.

Fig. 6: Proposed mechanism of the CcbD-catalysed amide bond-formation reaction.

Dashed lines represent hydrogen-bond interactions. The reaction is initiated by loading of Pro onto catalytic Cys17 to generate the CcbD-Pro complex. Further nucleophilic attack from the amino group of thiooctose, which is activated by His131 and Glu148, generates an amide bond.

In conclusion, our structural analysis of CcbD revealed that CcbD/LmbD catalyses the formation of the pharmacophore of lincosamide antibiotics via a mechanism that is structurally and mechanistically significantly distinct from the previously identified CP-dependent amide bond-forming enzymes, and thus provided insights into the diversity of amide bond-forming reactions in nature. Future structure-based engineering of CcbD/LmbD will develop the enzymes as biocatalysts to generate unnatural lincosamides with various acyl and sugar moieties for future drug discovery.

Methods

General

Oligonucleotide primers (Supplementary Table 5 and Supplementary Data 1) and DNA sequencing services were provided by Eurofins Genomics. The restriction enzymes and PrimeSTAR GXL DNA polymerase were purchased from Takara Bio. Solvents and chemicals were purchased from Wako Chemicals, Merck KGaA and Hampton Research, unless noted otherwise. PCR was performed using a TaKaRa PCR Thermal Cycler Dice Gradient (Takara Bio). The nuclear magnetic resonance spectra of compounds were recorded on ECX-500 MHz (JEOL) spectrometers.

Construction of plasmids for CcbD expression

The ccbD gene was PCR amplified from the chromosomal DNA of the celesticetin-producing type strain Streptomyces caelestis ATCC 15084, using the primers Fw: CCGCATATGGCCCAATCCAAGGGTTCGGTTGAT and Rv: CCGCTCGAGGAGTTCCTTGAGCAATCGCCG. The ccbD gene was inserted into the pET42b vector (Novagen) via the NdeI and XhoI restriction sites. The resulting plasmid was used to produce C-terminally His8-tagged CcbD. The plasmid for CcbC expression was used as previously reported17.
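The cloning design above can be sanity-checked with a short script. The sketch below confirms that the forward and reverse primers carry the NdeI (CATATG) and XhoI (CTCGAG) recognition sequences, respectively, and that neither primer contains the other enzyme's site. The primer sequences are those given in the text; the helper name `site_position` is hypothetical.

```python
NDEI = "CATATG"  # NdeI recognition sequence
XHOI = "CTCGAG"  # XhoI recognition sequence

# Primer sequences as given in the text (5' to 3').
FW = "CCGCATATGGCCCAATCCAAGGGTTCGGTTGAT"
RV = "CCGCTCGAGGAGTTCCTTGAGCAATCGCCG"

def site_position(primer, site):
    """Return the 0-based index of the first occurrence of site, or -1 if absent."""
    return primer.upper().find(site.upper())

fw_ndei = site_position(FW, NDEI)
rv_xhoi = site_position(RV, XHOI)

print("NdeI site in Fw at index", fw_ndei)
print("XhoI site in Rv at index", rv_xhoi)
print("Fw free of XhoI:", site_position(FW, XHOI) == -1)
print("Rv free of NdeI:", site_position(RV, NDEI) == -1)
```

Both sites sit directly after a three-nucleotide 5′ overhang, and the ATG within the NdeI site of the forward primer is followed in frame by the remaining primer sequence, as expected if it serves as the start codon of ccbD.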

Construction of plasmids for expression of CcbD and CcbZ-CP variants

The primers used for the construction of plasmids for site-directed mutagenesis studies are listed in Supplementary Table 5. The plasmid for the expression of WT CcbD or WT CcbZ-CP was used as the template for PCR-based site-directed mutagenesis, which was performed with a QuikChange Site-Directed Mutagenesis Kit (Stratagene) according to the manufacturer’s protocol.

Expression and purification of CcbD, CcbZ-CP, holo-CcbZ-CP and their variants

The pET42a plasmids for the expression of CcbD and its variants were transformed into Escherichia coli BL21(DE3). The resulting strains were cultured in LB medium supplemented with 34 mg l−1 chloramphenicol and 50 mg l−1 kanamycin sodium at 37 °C, with shaking at 160 rpm. Plasmids for the expression of CcbZ-CP and its mutants and sfp were transformed into E. coli BL21(DE3) or E. coli BLR. The resulting strains were cultured in LB medium supplemented with 100 mg l−1 ampicillin sodium and 50 mg l−1 kanamycin sodium, respectively, at 37 °C, with shaking at 160 rpm. When the OD600 reached 0.6, the cell cultures were cooled on ice for 30 min, and then isopropyl β-d-1-thiogalactopyranoside (0.3 mM) was added to induce the target protein expression and the cultures were continued at 16 °C, 160 rpm. After 18 h of post-induction incubation, cells were collected by centrifugation at 5,500g for 10 min and suspended in lysis buffer, containing 20 mM Tris–HCl (pH 8.0), 100 mM NaCl, 5 mM imidazole and 5% glycerol. The cell suspension was sonicated for 5 min on ice. After the cell debris was removed by centrifugation at 20,000g for 30 min, the supernatant was mixed with 1 ml Ni-NTA agarose resin and loaded onto a gravity flow column. Unbound proteins were removed with 50 ml lysis buffer containing 30 mM imidazole, and then the His-tagged protein was eluted with lysis buffer containing 300 mM imidazole. Buffers for CcbD H16A, C17A, D114A, H131A, E148A, H150A, R263A, D277A and E278A purification were supplemented with 1 mM dithiothreitol. For the in vitro assay, the eluates of CcbD and its variant proteins were concentrated to 10 mg ml−1 after the removal of imidazole, using a 30 kDa Amicon Ultra-15 filtration unit (Merck Millipore). The eluates containing holo-CcbZ-CP and its variant proteins were concentrated to 20 mg ml−1 after the imidazole was removed, using a 3 kDa or 10 kDa Amicon Ultra-15 filtration unit (Millipore). 
For crystallization, the eluate from the Ni-NTA agarose was applied to a 6 ml RESOURCE Q anion exchange chromatography column (4 °C, Cytiva) and a HiLoad 16/60 Superdex 200 pre-packed gel filtration column (4 °C, Cytiva), and eluted with a solution containing 20 mM Tris–HCl (pH 8.0), 100 mM NaCl, 5% glycerol and 1 mM dithiothreitol. The resulting eluate was concentrated to 15 mg ml−1, using an Amicon Ultra-4 (molecular weight cut-off 30 kDa) filter at 4 °C. The purity of the proteins was monitored by SDS–PAGE, and the protein concentrations were determined with a SimpliNano microvolume spectrophotometer.

In vitro assays of CcbD and its variants

The standard enzymatic reaction of CcbD with substrate was pre-incubated for 30 min in a 50 μl reaction mixture, containing 50 mM Tris–HCl (pH 7.5), 2 mM l-proline, 2 mM dithiothreitol, 5 mM ATP (pH 7.5), 10 mM MgCl2, 2 μM CcbC, 50 μM holo-CcbZ-CP or its variants, and 2 mM substrate 5 or 8. After this pre-incubation, 10 μM CcbD (WT or variants) was added to the reaction and incubated for 5 min at 30 °C. To measure the relative activity between 5 and 8, the consumption of substrates was calculated by liquid chromatography–mass spectrometry (LC–MS) analysis. For the analysis of substrate specificity towards acyl substrates, 10 μM CcbD was incubated in 50 mM HEPES buffer (pH 7.5), containing 10 mM MgCl2, 100 μM acyl-CoA, 100 μM 8, 30 μM Sfp and 50 μM apo-CcbZ-CP on a 25 μl scale, overnight at 30 °C. For the substrate specificity analysis for amino compounds, the reaction buffer containing 50 mM HEPES (pH 7.5), 100 μM l-proline, 5 mM ATP (pH 7.5), 5 mM MgCl2, 2 μM CcbC, 100 μM holo-CcbZ-CP and 10 μM CcbD was pre-incubated for 30 min in a 25 μl reaction mixture, then 100 μM of amino substrates were added and allowed to react overnight at 30 °C.

To measure total turnover number and conversion rate towards l-proline reaction, the reaction buffer containing 50 mM HEPES (pH 7.5), 1 mM l-proline, 1 mM 8, 5 mM ATP (pH 7.5), 5 mM MgCl2, 20 μM CcbC and 100 μM CcbZ-CP was pre-incubated for 30 min in a 25 μl reaction mixture. After this pre-incubation, 2 μM CcbD was added to the reaction and incubated for 13 h at 30 °C. For the total turnover number and conversion rate analysis towards acyl-substrates reactions, the reaction buffer containing 5 mM MgCl2, 100 μM acyl-CoA, 100 μM MTL, 30 μM Sfp and 100 μM apo-CcbZ-CP was pre-incubated for 30 min in a 25 μl reaction mixture. After this pre-incubation, 10 μM CcbD was added to the reaction and incubated for 13 h at 30 °C. Each sample was then subjected to LC–MS analyses and the consumption of 8 was calculated.
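The concentrations listed above set an upper bound on the turnover numbers that these assays can report. The sketch below assumes the usual definition of total turnover number (substrate consumed per enzyme, both in the same concentration units) and a conversion defined from the consumption of 8; the function names are hypothetical and the values printed are theoretical maxima, not measured results.

```python
def total_turnover_number(substrate_consumed_um, enzyme_um):
    """Substrate molecules converted per enzyme molecule (both inputs in µM)."""
    if enzyme_um <= 0:
        raise ValueError("enzyme concentration must be positive")
    return substrate_consumed_um / enzyme_um

def conversion_percent(initial_um, remaining_um):
    """Percentage of the initial substrate that was consumed."""
    if initial_um <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * (initial_um - remaining_um) / initial_um

# Upper bounds if all of 8 were consumed (concentrations taken from the text).
max_ttn_proline = total_turnover_number(1000.0, 2.0)  # 1 mM 8, 2 µM CcbD
max_ttn_acyl = total_turnover_number(100.0, 10.0)     # 100 µM 8, 10 µM CcbD

print("Max TTN, l-proline reaction:", max_ttn_proline)
print("Max TTN, acyl-substrate reactions:", max_ttn_acyl)
```

Because the acyl-substrate reactions use a fivefold higher enzyme concentration and tenfold less 8, their turnover numbers are capped fifty times lower than in the l-proline reaction, which should be kept in mind when the two sets of values are compared.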

LC–MS methods used for analysing in vitro reactions

For the analysis of the CcbD WT with various substrates, LC–MS samples were injected into a Shimadzu Labsolution LCMS 8045 system. A HILICpak VG-50 2D column (Shodex) or C18-MS-II column (Nacalai Tesque) was used for separation. The gradient elution was performed with solvent A (50 mM NH4Ac) and solvent B (CH3CN), with a flow rate of 0.2 ml min−1 (T = 0 min, 95% B; T = 3 min, 95% B; T = 20 min, 50% B; T = 25 min, 95% B; and T = 40 min, 95% B: for the HILIC column) or with solvent A (1 mM NH4 formate) and solvent B (CH3CN), with a flow rate of 0.2 ml min−1 (T = 0 min, 2.5% B; T = 4 min, 2.5% B; T = 20 min, 100% B: for the C18 column). For the analysis of the CcbD R312A, R316A, R324A and R312A/R316A/R324A variants and the E40A/E41A/E47A variant of CcbZ-CP, LC–MS samples were injected into a Shimadzu Labsolution LCMS 8045 system. A HILICpak VG-50 2D column was used for separation. The gradient elution was performed with solvent A (50 mM NH4Ac) and solvent B (CH3CN), with a flow rate of 0.2 ml min−1 (T = 0 min, 95% B; T = 3 min, 95% B; T = 20 min, 50% B; T = 25 min, 95% B; and T = 40 min, 95% B). For the analysis of CcbD C17A, D114A, H131A, E148A, H150A, R263A, and the Asp277 and Glu278 variants, LC–MS analyses were performed on an Agilent 1260 Infinity II LC coupled to an Agilent 6546 LC/Q-TOF MS equipped with a Jet Stream electrospray ion source (Agilent Technologies). A 3 µl portion of the sample was loaded on the ACQUITY PREMIER BEH Amide (2.1 mm × 100 mm, 1.7 µm) LC column, which was maintained at 40 °C. A two-component mobile phase, B and A, containing acetonitrile (ACN) with 20 mM ammonium formate, pH 4.75, at 9:1 (v/v), and ACN with 20 mM ammonium formate, pH 4.75, at 1:1 (v/v), respectively, was used for separation. The elution was performed at a flow rate of 0.4 ml min−1 with the following conditions (min/%B) 0/99.0; 1/99.0; 7/1.0; 9/1.0; and 10/99.0, with 1.0 min column clean-up (99.0% B) and 2 min equilibration (99% B).
The mass spectrometer was used in the positive mode and the data storage was centroid, with a dual AJS electrospray ion source under the following conditions: the drying gas temperature, 250 °C; drying gas, 8 l min−1; nebulizer, 35 psi; sheath gas temperature, 400 °C; sheath gas flow, 12 l min−1; capillary voltage set at +3,500 V, nozzle voltage +200 V; acquisition rate 5 spectra s−1 and time 200 ms per spectrum. MS chromatograms for ions corresponding to [M + H]+ were extracted with a 0.05 Da tolerance window. The data acquisition and post-run processing were performed with the MassHunter Workstation software, version 10.1 (Agilent). The Q-TOF MS was tuned in the positive mode for the mass range of 50–1,700 m/z, using Swarm Autotune. Purine (121.050873 m/z) and Agilent HP-0921 (922.009798 m/z) were used as the references to maintain the mass accuracy. The data were further processed with the Qualitative Analysis 10.0 software (Agilent).
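The extraction of an [M + H]+ chromatogram with a fixed mass window can be illustrated with a short sketch. It is not the MassHunter implementation: the scan data structure and function names are hypothetical, the neutral mass in the example is invented, and the 0.05 Da window is assumed here to be the full width (that is, ±0.025 Da around the target), which the text does not specify.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def mz_protonated(monoisotopic_mass):
    """m/z of the singly charged [M + H]+ ion for a neutral monoisotopic mass."""
    return monoisotopic_mass + PROTON_MASS


def extract_ion_chromatogram(scans, target_mz, window_da=0.05):
    """Sum centroid intensities falling inside the mass window for each scan.

    scans: list of (retention_time_min, [(mz, intensity), ...]) tuples
           (a hypothetical in-memory layout of centroid data).
    window_da: ASSUMED to be the full window width, so peaks within
               +/- window_da / 2 of target_mz are included.
    Returns a list of (retention_time_min, summed_intensity).
    """
    half_width = window_da / 2.0
    trace = []
    for retention_time, peaks in scans:
        summed = sum(
            intensity for mz, intensity in peaks
            if abs(mz - target_mz) <= half_width
        )
        trace.append((retention_time, summed))
    return trace


# Invented example: a neutral mass of 300.0 Da gives a target m/z of 301.007276.
target = mz_protonated(300.0)
scans = [
    (1.0, [(301.010, 100.0), (301.050, 50.0)]),  # second peak lies outside the window
    (1.1, [(301.000, 200.0), (302.000, 10.0)]),
]
trace = extract_ion_chromatogram(scans, target)
print(trace)
```

If the 0.05 Da value is instead meant as a half-width, passing `window_da=0.10` reproduces that interpretation.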

Analysis of CcbD-Pro-conjugated intermediate

To investigate the generation of the CcbD-Pro intermediate, the substrates were removed after incubation of the enzyme reaction components without 8. The reactions were performed at 30 °C for 1 h in a 100 μl reaction mixture containing 50 mM HEPES (pH 7.0), 5 mM ATP, 5 mM MgCl2, 1 mM proline, 10 μM CcbC, 50 μM holo-CcbZ-CP and 50 μM CcbD. Then, the small molecules were removed by three passes through Zeba Micro Spin Desalting Columns (7K molecular weight cut-off). After removing the substrates, MTL was added to a final concentration of 200 μM and the mixture was further incubated at 30 °C for 1 h. As a negative control, a reaction without CcbD was also performed, and 10 μM CcbD and 200 μM MTL were added after the desalting process to confirm that neither proline nor Pro-tethered CcbZ-CP remained in the solution. Each sample was then subjected to LC–MS analyses.

Hydrolysis of Pro-conjugated cysteine in the reaction

To analyse whether CcbD generates proline via hydrolysis of the Pro-conjugated catalytic cysteine, Pro-CoA was incubated with or without CcbD. The reactions were performed in buffer containing 50 mM HEPES (pH 7.0), 200 μM l-proline and, where included, 20 μM CcbD for 2 h at 30 °C in a 25 μl reaction mixture. Each sample was then subjected to LC–MS analyses.

Production and purification of natural thiooctose substrate of CcbD (5)

The seed culture of S. lincolnensis lmbN_ΔCP strain13 was prepared by inoculating spores into 50 ml of YEME medium in 500 ml flat-bottom boiling flasks, which were incubated at 28 °C for 30 h at 180 rpm. A 2 ml portion of the seed culture was then inoculated into 40 ml of AVM medium13 in 500 ml flat-bottom boiling flasks and incubated at 28 °C for 120 h. The supernatants from 30 flat-bottom boiling flasks were used in the next steps. The cells were centrifuged at 4,000g at 4 °C for 10 min, and the supernatant was stored at −20 °C. The supernatant was adjusted to pH 2–3 with formic acid and extracted in two steps. First, Amberlite XAD-4 in a glass column was used, and the amount of the sorbent was approximately 5 cm in diameter and 10 cm in height. Methanol (MeOH) followed by 0.1% formic acid was used to equilibrate the column before loading the supernatant. The eluate, which contained the compounds not interacting with Amberlite and included the compound of interest, was collected. Second, MCX 35 cc (6 g) (Waters) cartridges were used. The cartridge was conditioned and equilibrated with MeOH followed by 2% formic acid, and then the collected eluate from the previous extraction was applied on the cartridge, which was washed with 2% formic acid and MeOH. Thereafter, 200 ml MeOH with 5% of an aqueous solution of ammonium hydroxide (29%) was loaded to elute the compound of interest. For further purification of the natural substrate, two-step preparative high-performance liquid chromatography was performed. The first step was carried out using a Triart Diol-HILIC column (250 × 20 mm, 5 μm particles; YMC) with a two-component mobile phase: A, 50 mM ammonium acetate, pH 4.7, and B, ACN. The flow rate was 8 ml min−1 with a linear gradient (min/% of B) 0/95; 5/95; 28.6/30; and 32/95 with equilibration before the next analysis.
The second step was performed on a Luna column (C18, 250 × 15 mm, 5 μm; Phenomenex) using a two-component mobile phase, with A as 0.1% formic acid and B as MeOH. The flow rate was 3 ml min−1 with a gradient of 5% to 90% B, for 70 min. The fractions with the natural thiooctose substrate were monitored using an Acquity UPLC system with a 2996 PDA detection system (194–600 nm), connected to an LCT Premier XE time-of-flight mass spectrometer (Waters). The sample (5 µl) was loaded onto the ACQUITY PREMIER BEH Amide (2.1 mm × 50 mm, 1.7 µm) LC column, maintained at 40 °C. A two-component mobile phase, B and A, containing ACN with 20 mM ammonium formate, pH 4.75, 9:1 (v/v) and ACN with 20 mM ammonium formate, pH 4.75, 1:1 (v/v), respectively, was used for separation. The elution was performed at a flow rate of 0.4 ml min−1 with the following gradient (min/%B) 1/99.0; 7/1.0; 9/1.0; and 10/99.0, with 1.0 min column clean-up (99.0% B) and 2 min equilibration (99% B). The mass spectrometer was used in the positive mode (W) with the cone voltage, +40 V; the capillary voltage, +2,800 V; ion source block temperature, 120 °C; desolvation gas temperature, 350 °C; desolvation gas flow, 800 l h−1; and cone gas flow, 50 l h−1; with an inter-scan delay of 0.01 s and a scan time of 0.15 s. The mass accuracy was maintained using lock spray technology with leucine-enkephalin as the reference compound (2 ng μl−1, 5 μl min−1). Diode array detection under the same conditions was used to quantify the natural substrate. The chromatograms were extracted at the absorption maxima of the compounds, and the area of the peaks was used to construct a five-point calibration curve (R2 = 0.9999) of the EGT standard (Cayman Chemical), which was used to quantify the natural thiooctose substrate (5).
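The calibration step can be sketched as an ordinary least-squares line through the standard points, followed by back-calculation of an unknown from its peak area. This is an illustration only: the five concentrations and peak areas below are invented (chosen to lie exactly on a line), the fitting procedure actually used is not described in the text, and quantifying 5 against EGT implicitly assumes that both give the same detector response per mole at the chosen wavelength.

```python
def fit_calibration(concentrations, areas):
    """Least-squares fit of area = slope * concentration + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(concentrations)
    if n < 2 or n != len(areas):
        raise ValueError("need at least two paired points")
    mean_x = sum(concentrations) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(concentrations, areas))
    ss_tot = sum((y - mean_y) ** 2 for y in areas)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def quantify(area, slope, intercept):
    """Back-calculate a concentration from a peak area using the fitted line."""
    return (area - intercept) / slope


# Hypothetical five-point EGT standard series (uM) and peak areas (arbitrary units).
standards_um = [10.0, 25.0, 50.0, 100.0, 200.0]
peak_areas = [21.0, 51.0, 101.0, 201.0, 401.0]  # exactly area = 2 * conc + 1

slope, intercept, r2 = fit_calibration(standards_um, peak_areas)
unknown_um = quantify(121.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={r2:.4f}, unknown={unknown_um:.1f} uM")
```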

Preparation and purification of the cross-linked CcbD-CcbZ-CP complexes

For the cross-linking assays of CcbD-CcbZ-CP and their variants, 20 µM CcbD or CcbD variant protein was mixed with 60 µM holo-CcbZ-CP or holo-CcbZ-CP variant proteins and 0.2 mM BMOE, in buffer containing 40 mM NaH2PO4/K2HPO4 (pH 7.0), 5 mM EDTA and 5% glycerol. At 10, 30 and 60 min after the reaction was initiated, a 50 µl aliquot of the reaction was removed, quenched with SDS–PAGE loading buffer and monitored by SDS–PAGE. To obtain the CcbD-CcbZ-CP complex proteins, 60 µM His-tag free CcbD C249S/C216S variant was mixed with 600 µM His-tagged holo-CcbZ-CP protein, in buffer containing 40 mM NaH2PO4/K2HPO4 (pH 7.0), 5 mM EDTA and 5% glycerol. The cross-linking reaction was initiated by adding 0.2 mM BMOE, and incubated on ice for 1 h. The BMOE was removed by passage through a PD-10 column, and the complexes were monitored by SDS–PAGE. The remaining CcbD variant proteins and holo-CcbZ-CP proteins were removed by chromatography on nickel affinity resin and a HiLoad 16/60 Superdex 200 pre-packed gel filtration column (4 °C, Cytiva).

Crystallization and structure determination

Crystals of selenomethionine-labelled CcbD were obtained after 3 days at 20 °C, while crystals of CcbD-apo and CcbD-CcbZ-CP complex were obtained after 3 days at 10 °C. All crystals were obtained by using the sitting-drop vapour-diffusion method with the following reservoir solutions: selenomethionine-labelled CcbD: 0.02 M Tris–HCl (pH 6.7), 0.16 M MgCl2 and 10% w/v PEG 8000; and CcbD WT: 0.02 M Tris–HCl (pH 6.7), 0.16 M MgCl2 and 10% w/v PEG 8000. The complex structures were prepared by incubating CcbD crystals at 10 °C for 1 h with 50 mM 8 in the crystallization drop. The crystallization conditions to prepare the CcbD-CcbZ-CP complex were 0.1 M MES (pH 6.7), 0.16 M MgCl2, 0.2 M sodium thiocyanate and 8% w/v PEG 8000. The crystals were transferred into the cryoprotectant solution (reservoir solution with 25% (v/v) glycerol), and then flash-cooled at −173 °C in a nitrogen gas stream. The X-ray diffraction datasets were collected at BL-1A (Photon Factory), using a beam wavelength of 1.1 Å. The diffraction datasets for selenomethionine-labelled CcbD, CcbD with 8 and CcbD-CcbZ-CP were processed and scaled using the XDS program package48 and Aimless in CCP4 (ref. 49). The determination of Se sites and the generation of the initial model were performed with Crank2 in CCP4 (ref. 50). The dataset for selenomethionine-labelled CcbD was slightly twinned. The analysis by Xtriage in PHENIX51 suggested that the correlation between the intensities related by the twin law h, -k, -h-l, with an estimated twin fraction of 0.15, is most likely due to a non-crystallographic symmetry axis parallel to the twin axis. Further processing was carried out without detwinning. The initial phases of CcbD with 8 and the CcbD-CcbZ-CP structures were determined by molecular replacement, using the selenomethionine-labelled CcbD structure as the search model. Molecular replacement was performed with Phaser52 in PHENIX51. The initial phases were further calculated with AutoBuild in PHENIX51.
The structures were modified manually with Coot53 and refined with PHENIX.refine54. The cif parameters of 8 for the energy minimization calculations were obtained by using the PRODRG server55. The final crystal data and intensity statistics are summarized in Supplementary Table 3. The Ramachandran statistics are as follows: 97.7% favoured, 2.3% allowed for selenomethionine-labelled CcbD; 98.0% favoured, 1.9% allowed, 0.1% outliers for CcbD with 8; and 98.4% favoured, 1.6% allowed for CcbD-CcbZ-CP. A structural similarity search was performed, using the Dali program server56. All crystallographic figures were prepared with PyMOL (DeLano Scientific, http://www.pymol.org). For the docking model structure with 5, initial docking models were constructed using the AutoDock Vina plugin in UCSF Chimera57. Then, the conformation of the ligands was manually modified with Coot on the basis of the binding mode of 8 in the complex structure of CcbD/8, to avoid close contacts between the ligands and the active site residues.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.