During the biosynthesis of all known complex carbapenem natural products, the assembly of the C6 alkyl side chain (Fig. 1a) is accomplished by a cobalamin (Cbl or B12)-dependent radical SAM enzyme3,4,5. These catalysts can perform serial methyl transfers with control of the stereochemical outcome of each reaction. The Cbl-dependent RSMT ThnK performs two sequential methyl transfers to its (2R)-pantetheinylated carbapenam substrate 1 during the biosynthesis of the paradigm carbapenem antibiotic thienamycin (2) (ref. 6). An orthologue of ThnK, TokK from Streptomyces tokunonensis (ATCC 31569), constructs the C6 isopropyl chain of the carbapenem asparenomycin A (3) by deploying three sequential methylations of 1 (Fig. 1b). This biosynthetic approach allows the producing organism to make a small ‘library’ of alkylated analogues, which may deter the development of resistance in susceptible bacteria. This strategy may also be used in the biosynthesis of cystobactamids, in which a similar Cbl-dependent radical SAM (RS) enzyme, CysS, performs successive methyl transfers7 (Extended Data Fig. 1). Notably, despite low sequence identity (around 29%) between CysS and TokK or ThnK, all three proteins are located within the same cluster of a sequence similarity network (SSN) composed of approximately 11,000 Cbl-binding RS enzymes obtained from networks provided by ( and the UCSF structure–function linkage database (SFLD) (Fig. 1c, Supplementary Fig. 1). It is tempting to speculate that this colocalization might be driven by mechanistic similarities.

Fig. 1: Cbl-dependent radical-mediated methylations in carbapenem biosynthesis.

a, Structure of (2R)-pantetheinylated carbapenam precursor substrate 1 modified by ThnK and TokK. Related carbapenem natural products containing C6-alkyl substituents include thienamycin and asparenomycin A. b, Proposed mechanism describing the three sequential methylations catalysed by TokK. c, An abbreviated SSN of Cbl-binding RS enzymes, highlighting selected sequence clusters and nodes. The full network (Supplementary Fig. 1) was generated from around 11,000 annotated Cbl-dependent RS enzymes with an alignment score of 65. Each node represents a single sequence or a set of sequences with more than 40% sequence identity. Sequence clusters and nodes are coloured by predicted reaction mechanism (see key in the figure). Nodes containing functionally annotated sequences are indicated in colour. Structurally characterized enzymes are represented by enlarged nodes and labelled in boldface.

Carbapenem C6 alkyl chain construction requires stereoselective formation of carbon–carbon bonds between unactivated sp3-hybridized carbons. Cbl-dependent RSMTs are the only known biological catalysts capable of such transformations. The Cbl-containing subfamily, depicted as an SSN in Fig. 1c, is also one of the largest in the RS superfamily, a diverse group that functions in the biosynthesis of chlorophyll, lipids and natural products with antiproliferative biological activity8,9,10,11. Although most Cbl-dependent RS enzymes have unknown functions, those that have been characterized are generally—but not exclusively—methylases that act on carbon or phosphorus centres by using methylcobalamin (MeCbl) as an intermediate methyl donor. All RS enzymes, with a single known exception12,13,14, reductively cleave SAM to generate methionine (Met) and a 5′-deoxyadenosyl 5′-radical (5′-dA•) (Fig. 1b). The latter reactive intermediate typically initiates catalysis with a target substrate by abstracting a hydrogen atom. In B12-dependent RSMTs, the substrate radical attacks the methyl group of MeCbl, inducing homolytic cleavage of the cobalt–carbon bond to yield cob(II)alamin and the methylated product (Fig. 1b). After dissociation of the methylated product, Met and 5′-deoxyadenosine (5′-dAH), and rebinding of another molecule of SAM, cob(II)alamin is reduced to cob(I)alamin. Co(I) is a supernucleophile, which acquires a methyl group from SAM to regenerate MeCbl (refs. 10,11) (Fig. 1b).

Two Cbl-dependent RS enzymes have been structurally characterized, TsrM and OxsB (refs. 12,15) (Fig. 1c, Supplementary Fig. 1), which are involved in the biosynthesis of the antibiotics thiostrepton A and oxetanocin A, respectively. Both enzymes are mechanistic outliers among Cbl-dependent RS enzymes and are found in SSN clusters distinct from each other and from TokK (Fig. 1c). OxsB uses Cbl in an unknown manner to catalyse a complex ring contraction of 2′-deoxyadenosine monophosphate (dAMP)15 (Extended Data Fig. 2a). TsrM methylates an sp2-hybridized carbon, C2, of l-tryptophan (Trp) by a polar mechanism12 (Extended Data Fig. 2b). TsrM is distinctive among all RS enzymes because it does not catalyse the formation of 5′-dA• during catalysis12. Instead, TsrM uses SAM’s carboxylate moiety as an acceptor of the N1 proton of Trp during C2 electrophilic substitution by MeCbl12,16. In addition, the structures of TsrM and OxsB have limitations that prevent full understanding of Cbl-dependent RS catalysis. The structure of OxsB lacks the dAMP substrate15. TsrM has been co-crystallized with aza-SAM (a SAM analogue) and Trp, but the Trp substrate is bound in an unproductive conformation, requiring computational docking to understand the structural basis for the reaction outcome.

TokK was crystallized under anoxic conditions in the presence of 5′-dAH and Met, the products of reductive SAM cleavage (Supplementary Fig. 2). Structures of this complex were solved in the absence and presence of substrate 1 to resolutions of 1.79 Å and 1.94 Å, respectively (Extended Data Table 1). TokK shares in common with TsrM and OxsB an N-terminal Cbl-binding domain and a central RS domain containing a [4Fe–4S] cluster (Extended Data Figs. 3, 4). In both TokK structures, Met binds to the unique iron of the [4Fe–4S] cluster, and the position of the 5′-carbon of 5′-dAH suggests that the binding of SAM in TokK is almost identical to that in OxsB, but quite different from that observed in the TsrM from Kitasatospora setae (KsTsrM) (Extended Data Fig. 4d). A third, C-terminal domain is distinct to TokK12,15 (Fig 2a, Extended Data Fig. 5). Although as-isolated TokK contains MeCbl, hydroxycobalamin (OHCbl) and adenosylcobalamin (AdoCbl), only OHCbl is observed bound to the N-terminal domain in the X-ray crystal structures. This assignment was confirmed by high-resolution mass spectrometry of dissolved crystals (Supplementary Fig. 3). In the structure of the TokK–OHCbl–5′-dAH–Met–substrate complex, which mimics the complex immediately before reaction with substrate, we observed clear Fo − Fc electron density consistent with the shape and size of 1 in one of two monomers in the asymmetric unit (Fig. 2b). In the second monomer, this electron density was also present, but it was of insufficient intensity for substrate modelling (Supplementary Fig. 4). In the chain with substrate bound, the pantetheine tail of 1 occupies a channel that leads from the surface of the protein into the active site (Fig. 2a, Supplementary Fig. 5). This cavity is formed at the interface of all three domains of TokK. The N-terminal Cbl-binding domain and unique C-terminal domain contribute most interactions with the pantetheine unit (Fig. 2a, Extended Data Fig. 6). 
These include hydrophobic contacts, water-mediated H-bonding interactions and direct polar contacts. For example, Asn515 in the C-terminal domain H-bonds to the terminal -OH of the pantetheine moiety, suggesting that this domain has a key role in substrate recognition. In the N-terminal domain, the Cbl cofactor itself participates in a water-mediated contact to an amide carbonyl of 1. This network involves one of the Cbl propionamide substituents and Tyr410 of the RS domain. The β-lactam ring of 1 is buried deep within the RS domain and anchored by direct and water-mediated contacts to the C7 carbonyl and C3 carboxylate substituents. This mode of β-lactam interaction resembles that of carbapenem synthase, the enzyme responsible for inversion of stereochemistry at C5 in simple carbapenems17 (Extended Data Fig. 7). Both enzymes share the use of H-atom abstraction chemistry to selectively target an unactivated C–H bond within the bicyclic β-lactam core, consistent with their conserved substrate-anchoring strategies. The C3 carboxylate of 1 H-bonds to Arg280 in the RS domain and Tyr652 in the C-terminal domain. These side chains move considerably from their positions in the structure without substrate bound (Fig. 2a), and substitution of Arg280 with Gln results in near complete loss of activity (Fig. 2c), confirming the importance of these side chains in substrate binding.

Fig. 2: TokK binds its carbapenam substrate at the interface of three domains.

a, The overall structure of TokK (chain A) is illustrated as a ribbon diagram and coloured by domain. The Cbl-binding Rossmann fold is shown in teal with the Cbl cofactor in stick format and coloured by atom type. The RS domain is shown in light blue. A [4Fe–4S] cluster is shown in orange and yellow spheres. 5′-dAH and Met coproducts are shown in stick format. The C-terminal domain is shown in pink. Carbapenam substrate (1) is shown in light blue sticks, coloured by atom type. b, An Fo − Fc omit electron density map is shown for 1 (blue mesh, contoured at 3.0σ). Substrates, cofactors and coproducts are shown in stick format. Distances between reactive groups are given in units of Å. c, Product formation for the R280Q TokK variant performed in triplicate (each replicate shown as a symbol). Substrate is shown in black spheres and the first methylated product is shown in pink squares. All activity assays were conducted with substrate 1, shown in Fig. 1a. d, Projections of the additional methyl groups added to C6 of 1 and their respective distances (in Å) from the 5′ carbon of 5′-dAH and the hydroxyl moiety of OHCbl, the latter of which serves as a surrogate of the active MeCbl cofactor. Spheres labelled Me and Et represent the suggested positions of the newly installed carbon atoms in the mono-methylated and dimethylated products.

The interactions between TokK and substrate 1 position the β-lactam appropriately both for activation of C6 by 5′-dA• and for subsequent methyl addition by the Cbl cofactor18 (Fig. 2d). C6 of 1 is located directly in front of the 5′-carbon of 5′-dAH, 3.7 Å away, as in other RS enzyme–substrate complexes that initiate reactions by H-atom abstraction. The orientation of these two groups in the structure does not reveal whether the pro-R or pro-S H-atom is removed from C6 of 1 by 5′-dA•, as these substituents project equally above and below the 5′-carbon of 5′-dAH (Fig. 2d). However, the structure does provide insight into the trajectory of methyl addition. C6 of 1 is located 4.2 Å above the axial ligand of Cbl (Fig. 2d) at an angle of around 85° relative to the 5′-carbon of 5′-dAH. This arrangement suggests that the methyl group adds to the bottom face of the β-lactam ring, consistent with the absolute configurations observed in the TokK products and thienamycin18,19. The distance and orientation of reactant functional groups in TokK also compare favourably to those in other enzymes that catalyse radical-mediated activation and functionalization of a substrate C–H bond, such as iron-dependent hydroxylases in the cytochrome P450 and iron(II)- and 2-oxoglutarate-dependent (Fe-2OG) superfamilies (Extended Data Fig. 7). These systems orient their reactive groups similarly, but over a slightly shorter distance range20,21. This comparison highlights an important distinction between RSMTs and other radical functionalization enzymes. In P450s and Fe-2OG enzymes, a single reactive entity (a high-valent iron(IV)-oxo or iron(III)-hydroxo group) must both activate substrate and functionalize it. This strategy is inherently limiting because the enzyme can only activate and functionalize substrate from the same side. 
The Cbl-dependent RS radical functionalization platform is more versatile because the radical activation step is separated from methylation, which allows for more diverse stereochemical outcomes.
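
The distances and the angle quoted above follow from elementary vector geometry on atomic coordinates. The following Python sketch illustrates the calculation. The coordinates are hypothetical (they are not taken from the deposited model) and were chosen only to reproduce the reported values of 3.7 Å, 4.2 Å and roughly 85°; treating C6 as the vertex of the angle is likewise an assumption about how the angle was measured.

```python
import math

def distance(a, b):
    """Euclidean distance between two 3D points (in Å)."""
    return math.dist(a, b)

def angle_deg(vertex, p, q):
    """Angle p-vertex-q in degrees."""
    v1 = [pi - vi for pi, vi in zip(p, vertex)]
    v2 = [qi - vi for qi, vi in zip(q, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Hypothetical coordinates (Å), NOT from the crystal structure.
c6 = (0.0, 0.0, 0.0)                      # C6 of substrate 1
c5_prime = (3.7, 0.0, 0.0)                # 5'-carbon of 5'-dAH
theta = math.radians(85.0)
axial_ligand = (4.2 * math.cos(theta),    # hydroxyl of OHCbl
                4.2 * math.sin(theta),
                0.0)

d_dah = distance(c6, c5_prime)
d_cbl = distance(c6, axial_ligand)
angle = angle_deg(c6, c5_prime, axial_ligand)
print(f"C6-C5' = {d_dah:.1f} Å, C6-OH = {d_cbl:.1f} Å, angle = {angle:.0f}°")
```

With real data, the three points would be read from the atom records of the refined model rather than constructed by hand.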

Notably, the structure of TokK in complex with the substrate has marked structural similarities to another well-characterized RS methylase that does not rely on MeCbl, RlmN (ref. 22) (Extended Data Fig. 8). RlmN uses an S-methyl cysteinyl (methylCys) residue as an intermediate methyl carrier during the methylation of the sp2-hybridized C2 atoms of adenosine 2503 in ribosomal RNA and adenosine 37 in several transfer RNAs (tRNAs). When the structure of RlmN crosslinked to an Escherichia coli tRNAGlu substrate is compared to that of substrate-bound TokK, the methylCys residue in the RlmN structure is in a position similar to that of the hydroxyl group of OHCbl in the TokK structure. Moreover, their respective substrates occupy similar positions in the active site (Extended Data Fig. 8). Although the catalytic mechanisms of these two enzymes are distinct, both obey a ping-pong kinetic model, in which one SAM molecule is used to methylate the intermediate methyl carrier, while a second SAM molecule is used to generate a 5′-dA•.

Cbl is multifunctional in TokK, mediating both the polar methylation of Co(I) by SAM and the transfer of a methyl radical to C6 of the substrate. It is bound at the interface of the Cbl and RS domains with its dimethylbenzimidazole base tucked into the Rossmann fold of the N-terminal domain, a conformation termed ‘base-off’ (Fig. 3a, Extended Data Fig. 5). OxsB and TsrM use a similar base-off approach to interact with their Cbl cofactors (Fig. 3a, Extended Data Fig. 4), a binding mode that allows for extensive modulation of the reactivity of the Co(III) ion of MeCbl by the local protein environment23. In TsrM, this structural feature is essential for the atypical polar methylation of its substrate, Trp, which requires heterolytic cleavage of the Co(III)-carbon bond of MeCbl. The bottom face of the Co ion in TsrM is adjacent to Arg69 but not directly coordinated, which is likely to promote nucleophilic attack of MeCbl by Trp by blocking coordination of a sixth ligand and destabilizing the Co(III)–C bond owing to charge–charge repulsion24,25,26 (Fig. 3a). In TokK, a different side chain, Trp76, occupies the lower axial face of the Cbl, residing 3.8 Å from the metal ion (Fig. 3a). Neither Arg69 nor Trp76 lie in a canonical DXHXXG motif exemplified by methionine synthase, wherein the His residue in the motif ligates to the Co, although both are found in the loop following β3 in the Rossmann fold. To investigate the role of this residue, Trp76 was substituted with Phe and Ala. Rates of methylation slightly increased for both site-specific substitutions, and analysis by electron paramagnetic resonance (EPR) spectroscopy suggested that both variants exhibited the same four-coordinate geometry as wild-type TokK (Supplementary Fig. 6). These data suggest that even when the steric bulk of Trp76 is reduced, water does not coordinate the Cbl (Extended Data Fig. 9). To perturb the local environment of the Cbl cofactor further, Trp76 was also substituted with His and Lys. 
The activity of Trp76His TokK resembles the activities of the Phe and Ala replacements, but the activity of Trp76Lys TokK was reduced by a factor of around 50 for all three methylation steps (Fig. 3b). The substitution tolerance of Trp76 in TokK contrasts with that of Arg69 of TsrM, which when substituted with Lys was unable to transfer a methyl group from MeCbl to substrate12. Although Trp76 is not widely conserved among other well-characterized Cbl-dependent RS methylases, it is found in the same sequence context in the CysS primary structure (Extended Data Fig. 1).

Fig. 3: The Cbl- and substrate-binding sites influence overall activity and the relative rates of each TokK methylation step.

a, Comparative analysis of the side chains proximal to the Co ion in two Cbl-binding RS enzymes, TokK and KsTsrM (Protein Data Bank (PDB) ID: 6WTF). Selected amino acid side chains are shown in stick format and the Co ion is shown as a pink sphere. b, Top, schematic of the three sequential methylations performed by TokK (SPant, pantetheine (Fig. 1a)). Forty-eight-hour time-course experiments performed in triplicate (each replicate represented as a symbol), tracking the consumption of substrate (black spheres) and the formation of the methyl (pink squares), ethyl (purple triangles) and isopropyl (blue diamonds) products. Bottom, each product was estimated using COPASI (irreversible mass action model using the reaction scheme shown above) and simulated using Virtual Cell (shown as lines) for wild type (WT), W76F, W76A, W76K, W76H, L383F, W215F, E19A/Y20V, W215Y and W215A.

The structure of 1 bound to TokK also rationalizes established differences in rate constants for each of the three methyl transfers catalysed by this enzyme (Fig. 3b). The second methylation to form the ethyl-containing carbapenam product 5 proceeds at least threefold faster than the formation of 4—a pattern that runs counter to known differences in the reactivity of secondary and primary C–H bonds. Although we do not report a structure containing the singly methylated intermediate 4, if we presume that 4 remains anchored to Arg280, the methyl group at C6 would be positioned closer to the Cbl axial ligand and potentially at a more optimal angle than the original C6 C–H target. A third methylation to form the isopropyl carbapenem product 6 requires hydrogen atom abstraction from the same carbon, but the newly added ethyl carbon restricts the population of conformers accessible to 5′-dA•. This steric demand could help to explain why the estimated first-order rate constant for the third methylation, k3, is slower than the first two methyl transfers, k1 and k2. The buried location of 1, 5′-dAH, Met and the Cbl cofactor suggests that dissociation of the methylated carbapenam products must occur before dissociation of the SAM cleavage products. This arrangement is consistent with the non-processive kinetic model used to fit the time-course kinetics of each TokK methylation18. A similar mechanism was proposed for CysS (ref. 7) (Extended Data Fig. 1). Although CysS is only 29% identical both to TokK and to ThnK, all three proteins potentially contain a Trp side chain adjacent both to the Cbl and to the substrate (Extended Data Fig. 1). Substitution of Trp215 with Phe, Ala, or Tyr markedly slows substrate methylation by TokK, which suggests that it could have a role in catalysis (Fig. 3b).
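
The sequential, irreversible first-order scheme referred to above (substrate → 4 → 5 → 6, with rate constants k1, k2 and k3) can be sketched numerically. The Python example below is only an illustration of that kind of model, not the COPASI or Virtual Cell workflow used here. The rate constants are hypothetical values chosen to mimic the reported ordering (k2 at least threefold larger than k1, with k3 the slowest), and the fourth-order Runge–Kutta integrator is our own choice.

```python
import math

def simulate(k1, k2, k3, s0=1.0, t_end=48.0, dt=0.01):
    """Integrate S -> 4 -> 5 -> 6 (irreversible, first order) with RK4.

    Returns the final (substrate, methyl, ethyl, isopropyl) amounts.
    """
    def deriv(y):
        s, m, e, _ = y
        return (-k1 * s, k1 * s - k2 * m, k2 * m - k3 * e, k3 * e)

    def shifted(y, slope, h):
        return tuple(yi + h * si for yi, si in zip(y, slope))

    y = (s0, 0.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        a = deriv(y)
        b = deriv(shifted(y, a, dt / 2))
        c = deriv(shifted(y, b, dt / 2))
        d = deriv(shifted(y, c, dt))
        y = tuple(
            yi + dt / 6 * (ai + 2 * bi + 2 * ci + di)
            for yi, ai, bi, ci, di in zip(y, a, b, c, d)
        )
    return y

# Hypothetical rate constants (h^-1); NOT the fitted values from this work.
K1, K2, K3 = 0.05, 0.15, 0.02
final = simulate(K1, K2, K3)
for name, amount in zip(("substrate", "methyl", "ethyl", "isopropyl"), final):
    print(f"{name:>9}: {amount:.3f}")
```

Because every step is irreversible and first order, the total amount of carbapenam species is conserved, which provides a simple sanity check on any such simulation.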

ThnK and TokK share 79.3% sequence identity and act on the same substrate, (2R)-pantetheinylated carbapenam (Supplementary Fig. 7), but ThnK performs two sequential methylations whereas TokK catalyses three18,27. Nearly all residues in proximity to the active site are identical in the two orthologues. However, three non-conserved amino acids near the active site were examined to determine their role in controlling the extent of methylation. Leu383 is positioned deep in the active site and near 5′-dAH (Supplementary Fig. 7). When this residue is substituted with Phe, which is found at the same position in the primary structure of ThnK (Supplementary Fig. 7), the rate constants for all three methyl transfers are reduced (2.3-, 2.5- and 4.5-fold for k1, k2 and k3, respectively) compared to those of wild-type TokK (Fig. 3b). Two adjacent residues at the entrance to the pantetheine-binding tunnel, Glu19 and Tyr20, (Fig. 2a, Supplementary Fig. 7) were replaced with the cognate residues in ThnK to generate an E19A/Y20V double substitution. In this variant, the rate constant for the first methylation is increased 1.4-fold compared to that of the wild type, and the rate constants for the second and third methylations are decreased 1.4- and 3.4-fold, respectively, therefore shifting the kinetic profile closer to the pattern observed with ThnK (k1 > k2, k3 = 0) (ref. 18) (Fig. 3b).

The structure of TokK solved in the absence of substrate reveals very few differences in overall fold or domain organization relative to the TokK–1 complex (root mean square deviation (RMSD) of 0.53 Å over 603 residues by Cα atoms). The substrate-binding channel, located at the interface of the three domains of TokK, remains intact without substrate with only modest alterations in size caused by the aforementioned conformational changes in the side chains of Arg280 and Tyr652 (Fig. 2a, Supplementary Fig. 5). The preformed nature of the substrate-binding tunnel in TokK contrasts with observations from structures of TsrM in the absence and presence of its substrate, in which a loop from the C terminus moves to cap the active site in the presence of Trp12. Although the C-terminal domain of TokK is considerably larger than that of TsrM, these domains appear to share a common role in substrate interaction (Extended Data Fig. 4).
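
For reference, an RMSD over matched Cα atoms is the square root of the mean squared distance between equivalent atoms. The Python sketch below assumes that the two coordinate sets have already been optimally superposed (the superposition step itself is omitted), and the coordinates are hypothetical rather than taken from the TokK models.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equally long lists of pre-superposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Hypothetical C-alpha positions (Å); each atom is displaced by 0.53 Å along x.
without_substrate = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
with_substrate = [(x + 0.53, y, z) for x, y, z in without_substrate]

value = rmsd(without_substrate, with_substrate)
print(f"RMSD = {value:.2f} Å over {len(without_substrate)} atoms")
```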

The overwhelming majority of known Cbl-dependent RSMTs operate by the radical mechanism used by TokK, producing an equivalent of 5′-dAH and SAH for each methylated product molecule. The structure of the TokK active site reveals a scaffold for positioning the cofactors responsible for substrate activation and methyl transfer and is consistent with the non-processive mechanism of sequential methylations observed for the enzyme, which requires the release of each partially alkylated intermediate and both SAM coproducts before reloading the active site for subsequent methylation. Moreover, the structure reveals that there is little active participation in catalysis from other amino acids in the active site, apart from substrate or cofactor binding. This scaffolding approach, which is underscored by the notable absence of conformational changes after substrate binding, may be shared by the only other Cbl-dependent RS enzyme that has, to our knowledge, been structurally characterized in complex with its substrate, TsrM. Although the electrophilic substitution mechanism used by TsrM requires a general base to accept the N1 proton of the indole ring of Trp during catalysis, biochemical studies and models of the functional enzyme–substrate complex suggest that the carboxylate group of a cosubstrate, SAM, functions in this capacity instead of an active-site amino acid. Elucidation of additional structures and mechanisms for Cbl-dependent RS enzymes will reveal how a seemingly common approach for catalysis may be further elaborated within this large and diverse group of enzymes.


Overexpression and purification of TokK from Streptomyces tokunonensis

TokK (Uniprot ID: A0A6B9HEI0) was produced heterologously in Escherichia coli BL21 (DE3) by overexpression from a pET29b:tokKTev construct18. To facilitate production of soluble TokK protein with maximal occupancy of [4Fe–4S] cluster and Cbl cofactors, the strain was transformed with two additional plasmids, pDB1282 and pBAD42-BtuCEDFB18,28,29,30. A 100-ml LB starter culture with 50 µg ml−1 kanamycin (pET29b containing tokK), 50 µg ml−1 spectinomycin (pBAD42-BtuCEDFB) and 100 µg ml−1 ampicillin (pDB1282) was inoculated from a single colony and incubated for 18 h at 37 °C while shaking at 250 rpm. A 12-ml aliquot of the starter culture was used to inoculate a 4-l culture of LB medium supplemented with OHCbl (1.3 µM) and allowed to shake at 180 rpm at 37 °C. Four of these cultures were grown to an optical density at 600 nm (OD600 nm) of 0.3, at which point induction of genes on pDB1282 and pBAD42-BtuCEDFB was initiated with the addition of arabinose to a final concentration of 0.2%. To facilitate iron–sulfur cluster incorporation, 25 µM FeCl3 and 150 µM cysteine were added at the initiation of induction with arabinose. The cultures were then grown to an OD600 nm of 0.6 followed by incubation in an ice-water bath for 1 h. Isopropyl β-d-1-thiogalactopyranoside (IPTG) was added to a final concentration of 1 mM, and the cultures were incubated at 18 °C for an additional 18 h before the cells were collected by centrifugation at 6,000g. The resulting cell paste (around 50 g) was flash-frozen in liquid N2 and stored in liquid N2 before protein purification.

All purification steps and subsequent manipulations of TokK were performed in a Coy Laboratory Products vinyl anaerobic chamber. Cell paste was resuspended in 100 ml lysis buffer (50 mM HEPES, pH 7.5, 300 mM KCl, 10% glycerol, 5 mM imidazole and 10 mM β-mercaptoethanol (BME)). The cell suspension was incubated with lysozyme (1 mg ml−1), DNase (0.1 mg ml−1) and phenylmethylsulfonyl fluoride (PMSF) (0.18  mg ml−1) for 30 min at room temperature and then cooled to 4 °C (refs. 18,30). Cells were then lysed by sonic disruption (70% amplitude, 45 s on, 59 s off, around 15 min) and then centrifuged at 50,000g for 1 h to separate insoluble material. The resulting supernatant was loaded onto a pre-equilibrated column of Ni-NTA resin and purified by immobilized-metal affinity chromatography. The protein-loaded resin was washed with 50 ml lysis buffer before the addition of elution buffer (lysis buffer supplemented with 500 mM imidazole). Dark-coloured elution fractions were pooled and concentrated to 1 ml in an Amicon 10 kDa MWCO ultrafiltration device (EMD Millipore). The protein fractions were then exchanged into TEV protease cleavage buffer (50 mM HEPES, pH 7.5, 300 mM KCl, 15% glycerol and 10 mM BME) using a PD-10 pre-poured gel-filtration column from GE Biosciences. The exchanged protein sample was allowed to react with 10 units of TEV protease (around 2 mg ml−1) in 25 mM Tris-HCl pH 8.0, 50 mM NaCl, 1 mM TCEP and 50% glycerol (Millipore Sigma) for two days on ice to generate TokK containing only six additional amino acids (ENLYFQ) on its C terminus. On the second day, the [4Fe–4S] cluster and Cbl cofactors were reconstituted as previously described30. The reaction mixture was then reapplied to the Ni-NTA resin to capture any remaining His-tagged protein, and purified TokK-ENLYFQ was collected in the flow-through fraction. 
TokK was concentrated and exchanged into gel-filtration buffer (50 mM HEPES, pH 7.5, 300 mM KCl, 1 mM DTT and 15% glycerol) for size-exclusion chromatography. In this step, the protein was applied to a HiPrep 16/60 S200 column using an ÄKTA fast protein liquid chromatography (FPLC) system (GE Biosciences) housed in the anaerobic chamber. TokK elutes as an apparent monomer. Fractions were pooled on the basis of UV-vis absorption at 280 and 410 nm and concentrated to 38 mg ml−1.

Synthesis of substrate

Synthesis of the TokK substrate, (2R,3R,5R)-3-((2-(3-((R)-2,4-dihydroxy-3,3-dimethylbutanamido)propanamido)ethyl)thio)-7-oxo-1-azabicyclo[3.2.0]heptane-2-carboxylic acid (1), was carried out as previously described27. Characterization matched that previously reported.

Determination of the X-ray crystal structure of TokK

General crystallographic methods

X-ray diffraction datasets were collected at the General Medical Sciences and Cancer Institutes Collaborative Access Team (GM/CA-CAT) beamlines at the Advanced Photon Source, Argonne National Laboratory, and at the Berkeley Center for Structural Biology (BCSB) beamlines at the Advanced Light Source, Lawrence Berkeley National Laboratory. All datasets were processed using the HKL2000 or HKL3000 package, and structures were determined by single-wavelength anomalous dispersion (SAD) phasing using Autosol/HySS or by molecular replacement using the program PHASER31,32,33,34. Model building and refinement were performed with Coot and phenix.refine31,35. Figures were prepared using PyMOL36,37. Substrate channel figures were prepared using Hollow38. Ligplot was used to visualize the binding of substrate 1 (ref. 39).

Crystallization and structure solution of 5′-dAH–Met–TokK

Purified TokK protein aliquots were diluted to 8 mg ml−1 in 34 mM HEPES, pH 7.5. Then 5′-dAH and Met (Millipore Sigma) were added to the resulting solution to final concentrations of 2 mM each. The mixture was incubated for 30 min at room temperature. In hanging-drop vapour-diffusion trials with 100 mM magnesium chloride, 100 mM calcium chloride, 20% PEG 8000 and 10% 1,6-hexanediol as the precipitating reagent, brown plate-shaped crystals appeared within three days. Trials were initiated by adding 1 µl of protein with 1 µl precipitating solution, followed by equilibration against 500 μl of a 0.5 M LiCl well solution at room temperature. Crystals were prepared for data collection by mounting on rayon loops followed by a brief soak in cryoprotectant solution (50% (v/v) precipitating reagent and 50% (v/v) ethylene glycol) and flash-freezing in liquid nitrogen.
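
The dilution described above follows the usual relation c(stock) × V(stock) = c(final) × V(final). A minimal Python sketch is given below; it assumes that the 38 mg ml−1 concentrate from the purification is the stock being diluted, and the 100 µl final volume is a hypothetical value chosen for illustration.

```python
def stock_volume(c_stock, c_final, v_final):
    """Volume of stock needed so that c_stock * v_stock == c_final * v_final."""
    if c_final > c_stock:
        raise ValueError("cannot dilute to a higher concentration")
    return c_final * v_final / c_stock

V_FINAL_UL = 100.0  # hypothetical final volume (µl)
v_stock = stock_volume(c_stock=38.0, c_final=8.0, v_final=V_FINAL_UL)
v_buffer = V_FINAL_UL - v_stock
print(f"{v_stock:.1f} µl protein stock + {v_buffer:.1f} µl buffer")
```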

Diffraction datasets for single-wavelength anomalous diffraction phasing were collected at the iron K-edge X-ray absorption peak (λ = 1.72194 Å) with 360° of data measured using a 0.5° oscillation range to 2.52 Å resolution. In addition, a 1.79 Å-resolution native dataset was collected at λ = 1.03313 Å (Extended Data Table 1). Heavy-atom sites were identified using HySS implemented within Phenix Autosol31. The initial overall figure-of-merit (FOM) was 0.265 and the Bayes CC was 18.2 (ref. 31). Phenix Autobuild was used to generate an initial model of 541 residues out of 687 in chain A and 569 residues out of 687 in chain B with Rwork/Rfree of 0.25/0.31. Iterative manual model building and refinement were performed in Coot and Phenix35. This model was used to obtain phase information for the 1.79 Å-resolution native dataset by molecular replacement using Phenix Phaser-MR31. Geometric restraints for 5′-dAH and Cbl were obtained from the Grade Web Server (Global Phasing). Rfree flags were maintained so that the same 5% of the reflections were used as the test set. The final model consists of residues 9–412, 416–672, one [4Fe–4S] cluster, one OHCbl cofactor, one Met and one 5′-dAH in chain A; residues 7–413, 416–672, one [4Fe–4S] cluster, one OHCbl cofactor, one Met and one 5′-dAH in chain B. The final model also contains 2 chloride ions, 2 potassium ions, 21 molecules of ethylene glycol and 887 waters. The Ramachandran plot shows that 97.4% of residues are in favoured regions with the remaining 2.6% in allowed regions36. Data collection and refinement statistics are shown in Extended Data Table 1.
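
For readers less familiar with the Rwork/Rfree statistics quoted above, the crystallographic R factor is conventionally R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, evaluated over the working reflections (Rwork) or over the reserved test set (Rfree). The Python sketch below is a simplified illustration and does not reproduce what Phenix computes internally: the structure-factor amplitudes are hypothetical, the scale factor between observed and calculated amplitudes is assumed to be 1, and one reflection in twenty is flagged to mimic a 5% test set.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor for matched lists of amplitudes (scale assumed 1)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

def split_by_flag(values, free_flags):
    """Split values into (work, free) lists according to boolean test-set flags."""
    work = [v for v, free in zip(values, free_flags) if not free]
    free = [v for v, free in zip(values, free_flags) if free]
    return work, free

# Hypothetical amplitudes for 20 reflections; the last one is the test set (5%).
f_obs = [100.0] * 20
f_calc = [90.0] * 19 + [80.0]
free_flags = [False] * 19 + [True]

obs_work, obs_free = split_by_flag(f_obs, free_flags)
calc_work, calc_free = split_by_flag(f_calc, free_flags)
r_work = r_factor(obs_work, calc_work)
r_free = r_factor(obs_free, calc_free)
print(f"Rwork = {r_work:.2f}, Rfree = {r_free:.2f}")
```

Keeping the same flagged reflections across datasets, as described above, ensures that Rfree remains an unbiased cross-validation statistic when a model is transferred from one dataset to another.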

Crystallization and structure solution of substrate-bound 5′-dAH–Met–TokK

Purified TokK was diluted to 8 mg ml−1 in 34 mM HEPES, pH 7.5. Then 5′-dAH, Met and substrate were added to final concentrations of 2 mM each. The solution was incubated for 15 min at room temperature. In hanging-drop vapour-diffusion crystallization trials with 0.2 M lithium sulfate, 0.1 M Tris-HCl, pH 8.5 and 18% PEG 8000 as the precipitant, brown plate-shaped crystals appeared within two weeks. Trials were initiated by mixing 1 µl of protein and 1 µl of precipitant followed by equilibration at room temperature against a 500-μl reservoir of the precipitating solution. Before looping, the concentration of substrate 1 was increased to 2.7 mM. Crystals were prepared for data collection by mounting on rayon loops followed by a brief soak in perfluoropolyether oil (Hampton Research) for cryoprotection and flash-freezing in liquid nitrogen.

The structure containing substrate was solved by molecular replacement (Phaser-MR in Phenix) using the coordinates of the 5′-dAH–Met–TokK complex as the search model. Iterative manual model building and refinement were performed in Coot and Phenix31,35. Initial geometric restraints for the (2R)-pantetheinylated carbapenam substrate were generated by eLBOW in Phenix31. Final geometric restraints for the carbapenam substrate were created by using the PRODRG 2 server40. The final model consists of residues 8–566, 571–672, one [4Fe–4S] cluster, one OHCbl cofactor, one Met, one 5′-deoxyadenosine (5′-dAH) and one (2R)-pantetheinylated carbapenam substrate in chain A; residues 9–672, one [4Fe–4S] cluster, one OHCbl cofactor, one Met and one 5′-dAH in chain B. The final structure also contains 2 glycerol molecules, 4 potassium ions and 1,193 waters. The Ramachandran plot showed that 97.7% of residues are in favoured regions with the remaining 2.3% in allowed regions36. Data collection and refinement statistics are shown in Extended Data Table 1.

Cbl ligand assignment in protein crystals

The protocol was adapted from previously published studies28. Approximately 15 TokK + 5′-dAH + Met crystals were looped from the crystallization drop into 40 μl of mother solution in a darkly coloured Eppendorf tube. In the dark, 50 mM H2SO4 was added to the tube. The tube was vortexed and then centrifuged for 20 min. Five microlitres of this solution was injected into a Thermo Fisher Scientific UHPLC/QExactive HF-X mass spectrometer equipped with a C18 column (2.1 × 100 mm) equilibrated in 5% solvent A (0.1% formic acid) and 95% solvent B (0.1% formic acid in acetonitrile). The solvent B composition was increased to 98% from 1 to 7 min. Cbl forms were detected by ESI+, scanning from m/z 150 to 1,700 with a resolution of 120,000. A calibration curve (0.1 µM–5 µM) of Cbl standards was run concurrently to quantify the Cbl forms in the sample.
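Quantification against a calibration curve of the kind described above can be sketched as a linear least-squares fit followed by inverse prediction. The standard concentrations below span the stated 0.1–5 µM range, but the peak areas and the sample reading are invented for illustration, and the assumption of a linear detector response is ours rather than the source's.

```python
# Sketch only: peak areas are invented; a linear response is assumed.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical Cbl standards (µM) and their integrated peak areas.
standard_conc_um = [0.1, 0.5, 1.0, 2.5, 5.0]
standard_area = [2.0e4, 1.0e5, 2.0e5, 5.0e5, 1.0e6]

slope, intercept = fit_line(standard_conc_um, standard_area)


def area_to_conc(area):
    """Invert the calibration line to estimate concentration (µM)."""
    return (area - intercept) / slope


# Hypothetical peak area measured for one Cbl form in the crystal extract.
sample_conc_um = area_to_conc(3.0e5)
print(f"Estimated concentration: {sample_conc_um:.2f} µM")
```

Readings that fall outside the range of the standards should not be extrapolated from such a fit.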

Generation of SSNs

A pool of annotated Cbl-dependent RS enzymes was generated by merging representative annotated B12-dependent RS enzyme sequences from megacluster 2-1, subgroup 5 (accessed March 2021) with those from the UCSF structure–function linkage database (SFLD) (accessed April 2021)41,42. Additional sequences were included for the following functionally characterized enzymes (with UniProt accession codes): SlTsrM (C0JRZ9), KsTsrM (E4N8S5), GenK (Q70KE5), GenD1 (Q2MG55), PhpK (A0A0M3N271), Fom3 (Q56184), CysS (A0A0H4NV78), OxsB (O24770), PoyC (J9ZXD6), swb7 (D2KTX6), ArgMT (Q8THG6), ThnK (F8JND9), TokK (A0A6B9HEI0), BchE (Q7X2C7), CouN6 (A0A1H2F7M3) and CloN6 (Q9F8U1)7,12,14,15,18,43,44,45,46,47,48,49,50,51. The pool of B12-dependent RS enzyme sequences from megacluster 2-1, subgroup 5 contains 9,724 representative entries that reflect 53,470 unique sequences. The class B and B12-binding RS enzyme sequence groups from the SFLD contain 1,525 and 5,920 representatives, which reflect total pools of 4,232 and 15,983 unique sequences, respectively41. The final representative sequence set used for SSN generation contained approximately 11,000 sequences after removal of duplicates. This dataset represents a larger sequence pool of more than 50,000 unique entries.
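The merging and duplicate-removal step can be illustrated with a small sketch that combines several pools keyed by accession code and keeps the first copy of each entry. Treating the accession code as the duplicate key is an assumption, since the source does not state the criterion used. The accession codes are taken from the list above, but the pool contents and the sequence strings are placeholders.

```python
# Sketch only: pool contents and sequences are placeholders; deduplication by
# accession code is an assumption, not the documented procedure.

def merge_pools(*pools):
    """Merge accession -> sequence mappings, keeping the first copy of each accession."""
    merged = {}
    for pool in pools:
        for accession, sequence in pool.items():
            merged.setdefault(accession, sequence)
    return merged


# Hypothetical miniature pools standing in for the three sources in the text.
pool_one = {"C0JRZ9": "MPLACEHOLDERA", "F8JND9": "MPLACEHOLDERB"}
pool_two = {"F8JND9": "MPLACEHOLDERB", "Q56184": "MPLACEHOLDERC"}
added_characterized = {"A0A6B9HEI0": "MPLACEHOLDERD", "C0JRZ9": "MPLACEHOLDERA"}

merged = merge_pools(pool_one, pool_two, added_characterized)
total_before = len(pool_one) + len(pool_two) + len(added_characterized)

print(f"{total_before} entries before merging, {len(merged)} after removing duplicates")
```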

The Enzyme Function Initiative enzyme similarity tool (EFI-EST) was used to perform an all-by-all BLAST analysis of the representative sequence dataset described above to create an initial SSN42,52,53,54 with an alignment score threshold of 65. To eliminate protein fragments, sequence length was restricted to greater than 300 amino acids. The final SSN (Fig. 1c, Supplementary Fig. 1) is depicted as a representative node network in which each node reflects sequences with more than 40% identity. All networks were visualized with the Organic layout in Cytoscape55. The SSN in Fig. 1c represents 36 sequence clusters extracted from the full SSN shown in Supplementary Fig. 1. The clusters in Fig. 1c were selected on the basis of the number of nodes or the presence of functionally annotated or structurally characterized sequences.
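The logic of thresholding a similarity network can be sketched as follows: sequences at or below the length cutoff are dropped, an edge is kept when the pairwise alignment score meets the threshold, and clusters are the connected components of the resulting graph. This is not the EFI-EST implementation. The sequence names, lengths and scores are invented, and treating a score equal to the threshold as an edge is an assumption.

```python
# Sketch only: not EFI-EST code. Names, lengths and scores are invented, and
# "score >= threshold" as the edge criterion is an assumption.

def build_clusters(lengths, pair_scores, min_length=300, score_threshold=65):
    """Group sequences into connected components of the thresholded network."""
    # Keep only sequences longer than the length cutoff.
    kept = {name for name, length in lengths.items() if length > min_length}
    parent = {name: name for name in kept}

    def find(node):
        # Union-find root lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Join any two retained sequences whose score meets the threshold.
    for (a, b), score in pair_scores.items():
        if a in kept and b in kept and score >= score_threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in kept:
        clusters.setdefault(find(name), []).append(name)
    # Largest clusters first, with members sorted for a stable order.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


lengths = {"seqA": 687, "seqB": 650, "seqC": 640, "seqD": 120, "seqE": 500}
pair_scores = {
    ("seqA", "seqB"): 80,
    ("seqB", "seqC"): 70,
    ("seqA", "seqE"): 40,  # below threshold, so no edge
    ("seqC", "seqD"): 90,  # seqD is removed as a fragment
}

clusters = build_clusters(lengths, pair_scores)
print(clusters)
```

Raising the threshold in such a scheme splits large clusters into smaller ones, which is why the choice of alignment score affects which enzymes appear to colocalize.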

Preparation of EPR samples

TokK was diluted to a final concentration of approximately 250 µM for each EPR sample. The samples were photolysed on ice for 45 min to generate the cob(II)alamin state. All samples were flash-frozen in cryogenic liquid isopentane in an anaerobic chamber. The resulting samples were stored in liquid nitrogen before analysis. EPR measurements were taken on a Magnettech 5000 X-band ESR spectrometer equipped with an ER 4102ST resonator. Temperature was controlled by an ER 4112-HV Oxford Instruments variable-temperature helium-flow cryostat. All measurements were taken at 70 K, with 1 mT modulation amplitude and 1 mW power.

Construction of TokK variants

TokK variants were generated by overlap extension PCR using pET29b:tokKTev as a template and the primers described in Supplementary Table 1. TokK_F_s was used as the forward primer for all constructs except the E19A/Y20V double variant, for which TokK_F_EYmut was used instead. After amplification, PCR products were digested with NdeI and XhoI and ligated into a similarly digested pET29b vector. Sequence-verified constructs were used to transform E. coli BL21 (DE3) along with helper plasmids pDB1282 and pBAD42-BtuCEDFB as described above for overexpression.

Methylation assays of TokK variants

Expression and purification of TokK variants were carried out as previously described for wild-type TokK18, except that reconstitution30 was done concurrently with overnight TEV protease cleavage, and an additional buffer exchange was done using an Econo-Pac 10DG column (Bio-Rad) to remove excess reconstitution reagents. Methylation assays were carried out in triplicate and contained 100 mM HEPES, pH 7.5, 200 mM KCl, 1 mM SAM, 1 mM methyl viologen, 2 mM NADPH, 0.5 mM MeCbl, 100 μM substrate and 100 μM enzyme. At each time point, an aliquot of the reaction mixture was diluted 5×, filtered through a 10-kDa Amicon ultrafiltration device and analysed for product formation using ultraperformance liquid chromatography-high resolution mass spectrometry (UPLC-HRMS) as previously described18. First-order rate constants were determined using the COPASI parameter estimation tool56, and curves were simulated using VCell57.
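To illustrate what estimating a first-order rate constant involves, the sketch below fits a single-step decay, C(t) = C0·exp(−kt), by linear regression of ln C against time. This is a simplified stand-in, not the COPASI parameter estimation used in the study, which can fit multistep models directly. The time points and the rate constant are invented, and only the 100 μM starting concentration is taken from the assay conditions.

```python
# Sketch only: a log-linear fit to synthetic single-step data, not the COPASI
# workflow. Time points and the "true" rate constant are invented.
import math


def fit_first_order(times, concentrations):
    """Fit ln(C) = ln(C0) - k * t by least squares; return (k, C0)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_log) for t, v in zip(times, logs))
    slope = sxy / sxx
    return -slope, math.exp(mean_log - slope * mean_t)


# Synthetic, noise-free decay of 100 µM substrate with k = 0.05 per minute.
true_k_per_min = 0.05
times_min = [0, 10, 20, 40, 60]
substrate_um = [100.0 * math.exp(-true_k_per_min * t) for t in times_min]

k_fit, c0_fit = fit_first_order(times_min, substrate_um)
print(f"k = {k_fit:.4f} per min, C0 = {c0_fit:.1f} µM")
```

With real triplicate data, a log-linear fit weights low concentrations heavily, so a nonlinear fit of the untransformed concentrations is generally preferable.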

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.