Main

Epoxides are highly versatile building blocks for the synthesis of many organic molecules. In the presence of strong Lewis or Brønsted acids, epoxides isomerize to carbonyl compounds by Meinwald rearrangement1, which generates multifunctional active aldehyde and ketone intermediates that are widely applied in fine chemical and pharmaceutical syntheses. However, this chemical reaction usually requires the use of corrosive acids as catalysts and protective anhydrous and inert atmospheric conditions, as the substrates are moisture- and air-sensitive2. Furthermore, the Meinwald rearrangement suffers from low yields due to side reactions and poor regio-selectivity and stereo-specificity, typically resulting in the formation of a mixture of compounds3.

In contrast, styrene oxide isomerase (SOI), an integral membrane protein found in the styrene-degradation pathway of microbes4,5,6,7, catalyses the isomerization of aryl epoxides such as styrene oxide derivatives to carbonyl compounds under physiological conditions by Meinwald rearrangement. SOI possesses several characteristics that make it an attractive enzyme for biocatalytic applications and an alternative to chemical synthesis. First, there are very few unique enzymes that can catalyse the isomerization of an epoxide to an aldehyde. Apart from SOI8,9 and quinolone epoxide rearrangement protein (PenF)10 from the penigequinolone synthesis pathway, no other enzymes have been reported to catalyse Meinwald rearrangements. Second, SOI has a broad substrate range and may therefore be used for the production of a variety of organic molecules, including new compounds (Extended Data Fig. 1a) for synthetic applications11. With this goal in mind, SOI has been integrated into cascade biotransformations to produce natural chemicals and enantiopure compounds such as alcohols, amines and acids, starting from styrene derivatives, glucose, glycerol and l-phenylalanine11,12,13,14,15. Chemo-enzymatic cascades have also been reported12,16. Third, SOI is a highly efficient enzyme17,18,19. Its high productivity can be even further enhanced by fusion to a small ubiquitin-like modifier (SUMO) tag, which results in a more than twofold higher protein yield. This engineering approach has resulted in the highest reported biocatalytic production of phenylacetaldehyde to a concentration of 3.4 M (ref. 18). Fourth, SOI shows high regio-selectivity and stereo-specificity, which is a prerequisite for the production of many enantiopure compounds with biological activity. The regio-selectivity and stereo-specificity of SOI were recently demonstrated by the discovery of a 1,2-methyl shift in the enzyme-catalysed Meinwald rearrangement reaction of internal epoxides11,17. A 1,2-hydrogen shift has been reported for the SOI-catalysed isomerization of the natural terminal epoxide substrate styrene oxide14,15.

Although these examples demonstrate the capability of SOI for biocatalytic applications, the molecular details of the mechanism of enzyme action are unknown and thus limit the full exploitation of its potential. To this end, we determined two high-resolution structures of SOI with and without the competitive inhibitor benzylamine by single-particle cryo-electron microscopy (cryo-EM) and elucidated its unique catalytic mechanism by electron paramagnetic resonance (EPR) measurements, biophysical methods, mutagenesis, functional assays and docking experiments.

Results

SOI is a trimer with a ferric haem b prosthetic group

SOI is the most active enzyme and the only membrane-bound enzyme in the styrene side chain degradation pathway (Extended Data Fig. 1b), and it has become a promising candidate protein for large-scale applications of Meinwald rearrangements in industrial biocatalysis. To identify an enzyme homologue suitable for structural studies, we screened SOI proteins from several bacterial species for their recombinant protein expression levels as well as their solubility and stability in different detergents (Extended Data Fig. 2). These experiments revealed Pseudomonas sp. VLB120 SOI (UniProt ID O50216) as the candidate enzyme with the highest protein yields and the best detergent-solubility and detergent-stability properties (Extended Data Fig. 3a–c).

In the literature, SOI is described as a cofactor-independent enzyme9,19; however, our studies revealed an unexpected red colour (Extended Data Fig. 3a) associated with SOI expression and purification, immediately suggesting the presence of iron. Chemical analysis of purified SOI by inductively coupled plasma optical emission spectroscopy (ICP-OES) confirmed that iron is bound to the recombinant enzyme in a 1:1 molar ratio. Consistent with these results, UV–vis spectra showed a peak maximum at ~421 nm. The spectrum shifted from 421 nm to 419 nm upon mixing the sample with the reducing agent sodium dithionite and an additional peak emerged at 558 nm, indicating the presence of a reducible haem b prosthetic group tightly bound to SOI (Extended Data Fig. 3d). We therefore provide experimental evidence that ferric haem b is a prosthetic group of SOI.

Circular dichroism (CD) spectroscopy was used to assess the folding and thermal stability of SOI. The far-ultraviolet CD spectrum recorded from the recombinant enzyme showed a substantial amount of α-helical structure at 25 °C, with characteristic minima near 208 and 220 nm. A mean molar residue ellipticity Θ [222] value of ~24,000 deg cm2 dmol−1 indicates a degree of α-helicity of ~60–70% for the protein. A ratio of >1 for the CD signals at 222 nm/208 nm might be indicative of supercoiling of the enzyme’s transmembrane (TM) α-helices, a feature that is characteristic of coiled coils20 (Extended Data Fig. 4a). The temperature-induced CD unfolding profile recorded from SOI at 222 nm exhibited a sigmoid shape typical of a two-state transition with a melting temperature of ~55 °C (Extended Data Fig. 4b).

The oligomeric state of the purified SOI was assessed by size-exclusion chromatography coupled with multi-angle laser light scattering (SEC-MALS). For the recombinant enzyme, a molecular weight of 140 kDa was obtained, a value that suggests the presence of an SOI trimer (calculated molecular weight of 59 kDa) plus an n-dodecyl-β-d-maltopyranoside (DDM) micelle (calculated molecular weight of 76 kDa) (Extended Data Fig. 4c). Oligomerization of SOI is also consistent with the presence of bands migrating at 17 and 32 kDa on SDS–PAGE gels (Extended Data Fig. 3b). Additional bands revealed by SDS–PAGE analysis that were approximately two and four times the size of the monomeric SOI have also been reported previously18.

Next, we indirectly measured the isomerization activity of wild-type SOI (SOI WT) and its mutant Y103A in the presence and absence of the inhibitor benzylamine by means of a coupled enzyme assay with excess phenylacetaldehyde dehydrogenase (EcALDH; Extended Data Fig. 5a,c). EcALDH was used because the Pseudomonas phenylacetaldehyde dehydrogenase is only poorly expressed and therefore results in poor yields. The kinetic parameters KM, Kcat and Kcat/KM of SOI WT and SOI in complex with the nanobody used for structure determination (see next section) are listed (Supplementary Table 1a). These experiments also revealed that benzylamine acts as a competitive inhibitor, increasing the apparent KM for styrene oxide in its presence while not having any effect on the maximum rate of reaction Vmax (Extended Data Fig. 5b).

Structures of SOI reveal a unique substrate binding pocket

Next, we aimed at structure elucidation of SOI by single-particle cryo-EM analysis. Because the small molecular weight of SOI of only 59 kDa is expected to represent a considerable challenge for a cryo-EM approach, we raised conformational nanobodies to increase its molecular mass. Conformational nanobodies were generated as described in ref. 21. We characterized all nanobody/SOI complexes using analytical SEC and activity assays. Of particular interest was a nanobody that, as judged by analytical SEC, mediated the formation of a bigger complex, possibly a SOI–nanobody (SOI–NB) hexamer, and led to a threefold higher catalytic efficiency (kcat/KM) compared to the wild-type enzyme (Supplementary Table 1a,b). The higher catalytic efficiency seems to be the result of a lower KM of the enzyme for styrene oxide caused by the nanobody. We reconstituted the SOI–NB complex with and without the competitive inhibitor benzylamine into nanodiscs (MSP1D1-filled Escherichia coli polar 169 lipids) and subjected the samples to extensive cryo-EM data collection and image processing. We obtained three-dimensional (3D) reconstructions of SOI–NB and SOI–NB–benzylamine (SOI–NB–BA) complexes at resolutions of 2.05 Å and 2.12 Å. A local resolution analysis of the cryo-EM density maps showed that the core transmembrane domain of SOI can reach a resolution of 1.5–1.6 Å (Extended Data Figs. 6 and 7, and Table 1) with resolved fine structural details including ordered water molecules (Extended Data Fig. 8). The two structures are almost identical. Accordingly, superimposition of SOI–NB and SOI–NB–BA over 1,572 Cα atoms yielded a root-mean-square deviation (r.m.s.d.) value of 0.044 Å.

The structures revealed an extended complex consisting of a dimer of two SOI–NB trimers in which the electron densities of SOI, the nanobody and the ferric haem b prosthetic group are well defined (Extended Data Fig. 8). Both structures have a length of 156 Å and a width of 15 Å (Fig. 1a). Formation of the dodecameric complex is mediated by three main interfaces. First, SOI trimer formation is directed by the ferric haem b prosthetic group that is located at the subunit interface. The location of haem b at the interface of subunits possibly explains the high thermal stability of SOI (Extended Data Fig. 4c). Second, all three variable regions CDR1, CDR2 and CDR3 of the nanobody interact with the periplasmic loops of two different SOI protomers (Fig. 2a,d). Amino-acid residues R28, F30, V31 and P33 of nanobody CDR1 (NB-CDR1, G25-A34) and R100, G101, S103, G104, E107 and Y108 of CDR3 (NB-CDR3, S99-Y108) interact with V32, G33, I42, E44, S50, P51 and E52 of periplasmic loop 1 (PL1, V32-E52) (Fig. 2b,c) of one SOI protomer (Fig. 2d). Amino acids T51, N53, W54, H55, H58 and S60 of nanobody CDR2 (NB-CDR2, T51-S60) form interactions with F108, S109, P110, R112, P118, N119, F121 and P123 of periplasmic loop 2 (PL2, F108-I126) (Fig. 2b,c) of a second SOI protomer (Fig. 2d). Third, two nanobodies interact with each other via their conserved C-terminal β-strand in an anti-parallel complementary manner engaging L16, V94, P114 and T116 (Fig. 2e).

Fig. 1: Cryo-EM structure of membrane-bound SOI containing a ferric haem b prosthetic group.
figure 1

a, Cryo-EM map and model of the SOI–NB complex structure at 2.05 Å resolution. Each subunit of the SOI trimer (blue, light blue, cyan) is bound to a nanobody (NB, orange or yellow) via the periplasmic loops. Densities for the MSP1D1 nanodisc and for the disordered regions of SOI (C terminus and N-terminal 6xHis-tag, are coloured in light grey. b, Views of the SOI trimer from the periplasm (top), in plane with the lipid bilayer (middle) and from the cytosol (bottom). The three ferric haem b molecules bound at the subunit interfaces of the SOI trimer are coloured in pink. c, Electron density features of the ferric haem b prosthetic group visualized as a surface map (upper panel, map contoured to σ = 15) and schematic representation of the ferric haem b prosthetic group.

Fig. 2: Structural organization of the SOI–NB complex.
figure 2

a, Side view of the SOI–NB complex. b, One nanobody binds to the periplasmic loops of two adjacent protomers. Periplasmic loop 1 (PL1) of one subunit is shown in grey and periplasmic loop 2 (PL2) of a neighbouring SOI protomer is shown in blue. c,d, Details of the interaction between the SOI and nanobody. c, Interacting amino-acid residues of PL1 and PL2 are shown in white and cyan, respectively. The nanobody is shown as a space-filling model in orange. d, Interacting amino acids of NB-CDR1, NB-CDR2 and NB-CDR3 are labelled in yellow, brown and orange, respectively. e, The interaction between nanobodies is mediated by four residues that are conserved among nanobodies.

The structure of the SOI trimer is reminiscent of a classical transmembrane channel with an ‘open’ conformation towards the periplasmic space and a ‘closed’ state towards the cytosol (Fig. 1a,b). The functional importance of this conformation is currently not known.

To assess whether the nanobody has an influence on the structure of SOI, we aimed to elucidate the enzyme structure alone. Despite numerous attempts, we were not able to obtain its structure in the absence of the nanobody. We therefore used an SOI model predicted by AlphaFold2 to identify potential conformational changes in the enzyme upon nanobody binding22. There is convincing evidence in the literature that AlphaFold2 can accurately predict haem proteins in the absence of the haem cofactor. For example, lack of the haem cofactor or its tetramerization partners, which are essential for folding, does not stop AlphaFold2 from perfectly predicting the fold of the haemoglobin α-chain23. AlphaFold2 learns structure prediction at the amino-acid residue contact level, without the need for folding information, and can therefore accurately predict a single-chain haemoglobin fold that would never exist on its own or in the absence of the haem cofactor in nature. Furthermore, haem proteins in general are accurately predicted by AlphaFold in the absence of the haem cofactor24,25. Based on these findings, we believe that the use of an AlphaFold2-generated model as a reference for the state without nanobody is justified. Overall, the per-residue confidence score (pLDDT) was >90, indicating the high quality of the predicted structure. Low pLDDT values were only obtained for the C terminus of SOI, indicating that it is mostly unstructured. The model indicated that there is sufficient space for a haem b molecule between the subunits of the enzyme. Cα superimposition of the AlphaFold2 model onto the cryo-EM structure of the SOI–NB complex resulted in an r.m.s.d. of less than 1 Å over 169 Cα atoms, indicating that there exists only one predominant conformation of this enzyme. This conclusion is strongly supported by the identical structures observed for SOI–NB and SOI–NB–BA complexes.

A DALI search with SOI did not result in any substantial structural similarities to other proteins, a feature that also reflects the uniqueness of the enzyme26.

SOI substrate binding mode supports its broad substrate scope

The cryo-EM structures revealed that 17 amino acids from five TM helices originating from two adjacent monomers form a 5.4-nm3-sized catalytic centre cavity containing the ferric haem b prosthetic group (Fig. 3a–f). Two separate hydrogen-bond networks around the cavity are seen in the structures. Network 1 is composed of 11 amino acids from one monomer (coloured in marine). Hydrogen bonds formed by amino acids N64 (TM2), D95, N99, Y103 and L104 (TM3) are shown in Fig. 3c. The hydroxyl group of Y103 also forms a 2.9-Å-long hydrogen bond (Fig. 3e) to the nitrogen atom of benzylamine, suggesting that network 1 plays an important role in substrate positioning. Consistent with this conclusion, mutation of amino-acid residues N64 and D95 to A substantially impaired the function of the enzyme, whereas substitution of N99 and Y103 by A resulted in complete inactivation of the SOI variants, demonstrating their importance in catalysis (Supplementary Table 1b).

Fig. 3: Substrate binding pocket of SOI.
figure 3

a,b, Side (a) and top (b) view of the ferric haem b binding pocket formed at the interface of two SOI subunits. c, The active centre Fe(III) is coordinated by an axial H58 at the fifth coordination site and the competitive inhibitor benzylamine at the sixth coordination site. d,e, Network of residues from two adjacent SOI subunits in apo (d) and benzylamine inhibitor (e) bound states. The electron densities for haem b and benzylamine are shown as surfaces. The electron densities of haem b and benzylamine are shown in a surface representation (map contoured at σ = 15). In the apo state, an additional unknown density bound to haem b (shown as a white surface) is observed, which might correspond to a water molecule that mediates an interaction of Fe(III) and Y103 in the apo state. f, A superimposition of the SOI–NB (apo) and SOI–NB–BA (competitive inhibitor) bound states of SOI shows no structural changes associated with binding a competitive substrate inhibitor. g, Ordered structural water molecules in the catalytic centre facilitate the interaction of SOI with haem b. h,i, Network 1 (cyan) is important for positioning the substrate (h) and network 2 (blue) is required for the orientation of H58 (i). Hydrogen bonds and distances of Fe(III) to coordination sites are indicated by dashed lines. Shown values are in Å.

Network 2 consists of eight amino acids from the adjacent monomer (coloured in yellow). The most prominent features of network 2 are two hydrogen bonds formed between the first nitrogen atom of the H58 side chain (ND1 of H58) and the main-chain carbonyl group of T20 and between the H58 main-chain carbonyl group and the OG (oxygen atom) of the T20 side chain. Both interactions appear to lock the conformation of H58 (Fig. 3f). The H58A variant could not be assessed because its mutation resulted in a colourless, inactive and presumably monomeric protein that was very prone to aggregation. This result suggests H58 as the key residue for coordinating the Fe (III) and that the presence of the ferric haem b prosthetic group is crucial for proper enzyme folding and stability. Thus, the ferric haem b prosthetic group represents a key structural and functional element of SOI.

Our structural findings suggest that the ferric haem b prosthetic group plays an essential role in substrate binding and catalysis. Fe(III) has six coordination sites that are occupied by four equatorial ligands in the porphyrin ring and the side chain of H58 of a neighbouring subunit oriented perpendicular to the porphyrin ring and anchoring the ferric haem b prosthetic group at the lateral trimer interface (Fig. 1b). The distance of Fe(III) to H58 is 2.2 Å (Fig. 3e). The sixth ligand site of Fe(III) serves as interaction site with suitable groups of substrate molecules. As seen in the SOI–NB–BA complex structure (Fig. 1a,b), the substrate binding site positions the aryl group plane of benzylamine above and parallel to the porphyrin ring plane. The nitrogen atom of benzylamine occupies the sixth coordination site of Fe(III) at a distance of 2.3 Å (Fig. 3e). The ferric haem b acts as a Lewis acid, and its interaction with the epoxide oxygen atom should be sufficient to promote epoxide ring-opening. The position of the iron ligand atom together with the orientation of the aryl group, determined by its contact with the porphyrin ring, constrains the torsion angle of the benzylic C–C bond of benzylamine and likewise for bound styrene oxide. The size and property of the binding site controls the size and chemical nature of possible epoxide substrates.

Next, we investigated the catalytic centre using EPR studies. The EPR spectrum of purified SOI WT is dominated by a low-spin (LS) signal with g values of gzyx = 2.97, 2.28, 1.45 that we designate as LS1 (Extended Data Fig. 9a). The LS1 signal is characteristic of a ferric haem protein with a bis-His or His-imidazole coordination27. It is therefore possible that LS1 represents an imidazole adduct of SOI that is still present after purification. SOI WT also exhibits high spin signals (small features between 1,000 and 2,000 G), which are attributed to 5-coordinate haem (Extended Data Fig. 9b). These results suggest that the iron ion of haem b of SOI WT is a mixture of 5-coordination and 6-coordination. This is consistence with our X-ray absorption near-edge structure (XANES) measurement of SOI WT at room temperature (Extended Data Fig. 10a,b). EPR, XANES and cryo-EM data therefore suggest that, in the absence of the substrate, the product or an inhibitor (see below), the 6-coordination site of iron can be occupied by small ligand.

The SOI Y103A mutant exhibits a different EPR spectrum compared to SOI WT (Extended Data Fig. 9b). The Y103A spectrum is dominated by a highly anisotropic LS (HALS) signal (gz = 3.32), which is absent in SOI WT, and only a minor LS1 signal remained.

The dominant HALS signal (gz = 3.32) for the benzylamine adduct that is seen with wild-type and mutant enzyme variants (Extended Data Fig. 9a) cannot be explained by the perpendicular imidazole ligand planes, but is consistent with previous observations of a HALS signal with a primary amine ligand28,29. We therefore attribute the HALS signal as the N-coordinated benzylamine adduct, which is consistent with the structure showing direct coordination of the haem iron by the ligand (Fig. 3c).

Addition of the substrate styrene oxide and product phenylacetaldehyde to SOI resulted in very similar spectra, which show the appearance of a new LS species (LS2, gzyx = 2.60, 2.17, 1.84) characteristic of an oxygen sixth ligand (Extended Data Fig. 9a), with a presumed His/O-substrate or product coordination. For the SOI Y103A variant, LS2 was not observed and only a very dominant LS1-like signal was observed at the expense of the HALS signal (not shown). The absence of LS2 in the Y103A variant is consistent with the role of Y103 to form a hydrogen bond with the coordinating oxygen atom of the substrate/product. This suggests that the substrate and product still bind the active site, but in a less productive fashion.

The oxidation state of the iron of the haem b prosthetic group is Fe(III), as determined by EPR (Extended Data Fig. 9). The Fe(III) state acting as a strong Lewis acid is crucial for SOI function. HS and LS signals originated from Fe(III) disappeared after reduction with excess sodium dithionite (Extended Data Fig. 9a). As a result, we do not know whether the substrate/product will still bind under reducing conditions.

Regio-selectivity and stereo-specificity of SOI

Docking experiments, guided by the benzylamine binding mode (Fig. 4a), with (S)- and (R)-styrene oxide (Fig. 4b,c), (S)-α-methylstyrene oxide (Fig. 4d) and (1S,2S)-trans-2-methyl-3-phenyloxirane (Fig. 4e) confirmed that the bound substrates are conformationally highly restricted, which is the basis of the high regio-selectivity and stereo-specificity of the subsequent reaction. The shown epoxides (and many more known to be good substrates) can be fit well in the substrate binding site with a distance of the epoxide oxygen to the iron atom around 2.3 Å and in hydrogen bond distance to the hydroxyl group of Y103. The binding mode is always similar to that of benzylamine (Fig. 4a) and is maintained with moderate positional adjustments when modelling a ring-opened sp2-hybridized carbocation intermediate.

Fig. 4: Docking experiments explain the substrate range and regio-selectivity and stereo-specificity of SOI.
figure 4

a, The substrate binding pocket occupied by benzylamine, as seen in the cryo-EM structure (left) and key interactions (right). be, Different substrates docked to the binding pocket. Left: (S)-styrene oxide (b), (R)-styrene oxide (c), (S)-α-methylstyrene oxide (SAMO) (d) and (1S,2S)-trans-2-methyl-3-phenyloxirane (MPO) (e). Right: key interactions.

Both (S)- and (R)-styrene oxide can be well fitted in positions essentially related by a mirror plane through their oxygen atoms and perpendicular to their overlapping phenyl rings (Fig. 5). The R enantiomer has been shown to react more quickly, but the difference is too small to be rationalized by static structural considerations. Similarly, the ring-opened sp2-carbocations of both enantiomers, assumed to be stabilized by co-planarity with the aryl ring and with their oxygens still ligated to the haem iron, can be modelled without causing substantial repulsive interactions and maintaining the approximate mirror relationship. Fixed in this relatively rigid conformation, only one of the two hydrogens of Cβ of the epoxide is in a favourable position to shift to Cα, explaining the high specificity of the reaction and the conserved chirality at Cα. At the same time, it explains why internal epoxides methylated at Cβ, such as (1R,2R)-2-methyl-3-phenyloxirane, gave only one diastereomer by shifting the methyl group in the trans-position, but not the other product by transferring the hydrogen in the cis-position11 (Supplementary Fig. 2a–f).

Fig. 5: Proposed SOI reaction mechanism for the isomerization of styrene oxide.
figure 5

a,b, Mechanisms for the isomerization of (S)-styrene oxide (a) and (R)-styrene oxide (b). Y103 positions the oxygen atom of the epoxide ring optimally for the Fe(III) of haem b to act as a Lewis acid, resulting in epoxide ring-opening, carbocation formation and a stereo-specific 1,2-hydride shift.

The mechanism of the Lewis acid-catalysed Meinwald rearrangement is commonly presented as being initiated by epoxide ring-opening (involving C–O bond-breaking and relaxation of the strained geometry), leading to a carbocation intermediate followed by a 1,2-hydride (alkyl) shift and product formation. Alignment and partial overlap of the occupied bonding orbital of the shifting nucleophile with the empty p-orbital of the carbocation is thought to be needed to permit the shift. However, as the carbocation is achiral, the stereo-specificity observed for the SOI-catalysed reaction would have to be due to an enzyme environment-mediated restriction of transfer to only one face of the carbocation plane. With α-methylstyrene oxide as substrate, one would thus expect that only one enantiomer is formed, independent of the chirality of the educt. Instead, both enantiomers react with retention of chirality, as reported by two groups15,16. Meza and colleagues16 therefore precluded the carbocation hypothesis and instead proposed a concerted Meinwald rearrangement where stereo-specificity is under substrate control. To us, a concerted C–O bond-breaking/hydride (or alkyl) shift appears stereochemically highly unfavourable (H–C–C–O torsion angle of ~110° rather than 180°) and is not compatible with the established antiperiplanar geometry of concerted bond-breaking/bond-forming rearrangements. As a 1–2 (equivalent to Cα–Cβ) bond rotation is not possible before the oxirane ring opens, an antiperiplanar orientation is not accessible. Furthermore, the very low stereo-specificity observed for the chemical Meinwald rearrangement of these substrates15 would imply that substrate control is only taking place in the enzyme environment. Our structural and modelling results reveal a different but very elegant solution to this problem that is fully consistent with the carbocation hypothesis. The key finding is that (R)- and (S)-styrene oxide bind in two different ways (related by an approximate local mirror plane) to the active site, as already described. As a result, the shifting group attacks from the same side with respect to the protein environment but from opposite sides if we take the prochiral carbocation as the reference frame. Further support for a carbocation intermediate is provided by the observation that aryl electron-donating and electron-withdrawing substituents in the para position lead to an accelerated and strongly reduced reaction rate, respectively30, consistent with their expected inductive effect on benzyl carbocation stability. With respect to the final electronic rearrangement, the conformation of our modelled carbocation intermediate shows a favourable, nearly antiperiplanar arrangement of the moving electron pairs (involved in hydride shift and C=O double-bond formation, respectively). We thus clearly favour the carbocation intermediate hypothesis as it fulfils established (stereo) chemical principles and is fully consistent with the observed enzyme stereo-specificity.

Discussion

Biocatalysts are attractive tools for fine chemical and pharmaceutical syntheses and offer several advantages over traditional chemical syntheses. Typically, they catalyse reactions under milder conditions, thus consuming less energy and producing lower greenhouse-gas emissions. Furthermore, they generate less waste and show better compatibility with sustainable resources. Another advantage is their high stereo-specificity, which is a prerequisite for the production of many compounds with biological activity. SOI has several properties that are essential for biocatalysis applications. It is a very stable protein with a melting temperature Tm value of more than 55 °C. This high thermal stability is most likely the result of ferric haem b binding at the subunit interface. Furthermore, it has been demonstrated that SOI is highly active under rather harsh conditions, including organic solvents18. Finally, the enzyme is expressed at high levels in E. coli, and we have demonstrated that its expression can be substantially increased by fusion to the SUMO protein18. These features together make SOI an ideal biocatalyst.

The value of SOI for the production of carbonyl compounds from aryl epoxides by means of the Meinwald rearrangement reaction has been demonstrated8. In an example, SOI was used for the biocatalytic synthesis of aldehydes via single-step reactions using cell-free extracts or whole cells. The use of a fusion protein greatly enhanced SOI biotransformation to reach phenylacetaldehyde concentrations as high as 3.4 M without enzyme inhibition18. Furthermore, SOI-catalysed epoxide isomerization has been introduced as a key step in several cascade biotransformation reactions to produce high-value natural chemicals such as alcohols, acids and esters from renewable substrates11,12,13,14,15. The size of the catalytic cavity of SOI explains the broad range of substrates. These structural findings provide a solid basis for further extension of the range of epoxides in SOI-based biocatalytic syntheses. Structure-guided mutagenesis of the hydrophobic pocket of SOI should allow for the isomerization of different or bigger epoxide substrates and is therefore expected to set the stage for the development of novel SOI-based applications.

Haem enzymes are among the most versatile of catalysts found in nature. Accordingly, SOI could be used to catalyse new reactions. Consistent with this suggestion, it has been shown that certain haem enzymes can catalyse reactions for which there exist no biological counterparts. Well-documented examples are cyclopropanation reactions, as demonstrated in ref. 31 with a first biocatalysis. Cyclopropanation is an important reaction in modern chemistry because many invaluable compounds harbour this motif, including insecticides and certain antibiotics31. The authors of ref. 31 screened existing haem enzymes for cyclopropanation activity and repurposed the most promising candidate, P450BM3, by protein engineering to optimize the reaction. Based on these findings, SOI should be an attractive enzyme for the catalysis of Fe-based chemical reactions for which no biological pathways exist.

Methods

Construct design and cloning

A codon-optimized synthetic DNA fragment encoding wild-type SOI was cloned into the pRSFDuet-1 vector, as described in ref. 15. Mutants H58A, N64A, D95A, N99A and Y103A were generated from the plasmid encoding the full-length, wild-type SOI using a modified Quikchange method based on the protocol of ref. 32. All DNA constructs were sequence-verified (Eurofins). All SOI variants contain an N-terminal 6xHis tag.

Expression and purification of SOI, nanobody and MSP1D1

Wild-type and mutant SOI proteins were expressed in E. coli strain C41(DE3) (NEB). Bacteria were cultured at 37 °C in 2xYT medium containing 50 µg ml−1 kanamycin until an optical density at 600 nm (OD600) of 0.6 was reached. The temperature was then lowered to 20 °C, expression was induced with 0.1 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) and incubation continued at 20 °C for ~16 h. The cells were collected by centrifugation (4,000g, 4 °C, 15 min) and stored at −80 °C until further use.

The N-terminally 6xHis-tagged SOI proteins were purified by cobalt affinity chromatography (GE Healthcare) using a lysis buffer containing 50 mM NaH2PO4, pH 8.0, 300 mM NaCl, 10 mM imidazole and 0.05% DDM. After a washing step with ten column volumes (CVs) containing 20 mM imidazole, the proteins were eluted with a high-imidazole elution buffer (50 mM NaH2PO4, pH 8.0, 300 mM NaCl, 250 mM imidazole and 0.05% DDM). Pooled fractions of eluted protein were subjected to SEC on a Superdex 200 column (GE Healthcare) in buffer containing 20 mM HEPES pH 7.5 and 150 mM NaCl.

Nanobody proteins were expressed in E. coli strain BL21(DE3) (NEB). Bacteria were cultured at 37 °C in 2xYT medium containing 50 µg ml−1 ampicillin until an OD600 of 0.8 was reached. The temperature was then lowered to 28 °C, expression was induced with 1 mM IPTG, and incubation continued at 20 °C for ~16 h. The cells were collected by centrifugation (4,000g, 4 °C, 15 min) and stored at −80 °C until further use.

The C-terminally 6xHis-tagged nanobody proteins were purified by Ni-NTA affinity chromatography using a periplasmic extraction method33 by osmotic shock through the addition of ice-cold TES buffer (50 mM Tris-HCl pH 7.2 at 4 °C, 0.53 mM ethylenediaminetetraacetic acid (EDTA) and 20% sucrose) to the cell pellets in a 2:1 ratio. The cell suspension was shaken overnight at 4 °C and 200 r.p.m., then 5 mM ice-cold MgSO4 was added in a ratio of 4:1 to the cell suspension the next day. The cell suspension was shaken for another 2 h at 4 °C and at 200 r.p.m. The periplasmic extract (supernatant) was collected by centrifugation (11,305g; 30 min, 4 °C) and kept aside for purification. The periplasmic extract was applied to a Ni-NTA column. After the washing step with 10 CVs of buffer A containing 50 mM NaH2PO4 pH 8.0 and 300 mM NaCl, a linear gradient from 0 to 300 mM imidazole was applied during elution with 20 CVs of a combination of buffer A and buffer B containing 50 mM NaH2PO4 pH 8.0, 300 mM NaCl and 1 M imidazole. The eluted nanobody proteins were pooled and subjected to SEC on a Superdex-75 column (GE Healthcare) in buffer containing 20 mM HEPES pH 7.5 and 150 mM NaCl.

The protocol for expression and purification of membrane scaffold protein MSP1D1 was adapted from a previous study34. MSP1D1 was expressed in E. coli BL21 (DE3) (NEB). Bacteria were cultured at 37 °C in Terrific broth (TB) until an OD600 of ~2–3 was reached, then expression was induced with 1 mM IPTG and incubation continued at 37 °C for 3 h. Cells were collected by centrifugation at 4,000g, then resuspended in lysis buffer containing 50 mM Tris-HCl pH 8.0, 200 mM NaCl, 25 mM imidazole, 1% Triton-X100, 1 mM phenylmethylsulfonyl fluoride (PMSF) and 10 μg ml−1 DNase I. MSP1D1 was purified by incubating the cell lysate with Ni-NTA resin for 30 min. The column was washed, first with 10 CVs of buffer A containing 50 mM Tris-HCl pH 8.0, 150 mM NaCl, 25 mM imidazole and 1% Triton-X100, second with 5 CVs of buffer B containing 50 mM Tris-HCl pH 8.0, 150 mM NaCl, 25 mM imidazole and 2% sodium cholate, and third with 5 CVs of buffer C containing 50 mM Tris-HCl pH 8.0, 150 mM NaCl and 50 mM imidazole. The protein was eluted in buffer D containing 50 mM Tris-HCl pH 8.0, 200 mM NaCl and 350 mM imidazole. Pooled fractions of MSP1D1 were desalted in buffer E containing 20 mM Tris-HCl pH 8.0 and 200 mM NaCl and flash-frozen in liquid N2 and stored at −80 °C until further use.

Nanobody library generation and selections

A male alpaca was immunized with purified SOI in 20 mM HEPES pH 7.5 and 150 mM NaCl that was mixed before injection with GERBU Fama adjuvant (GERBU Biotechnik) in a 1:1 (vol/vol) ratio and injected subcutaneously in 100-μl aliquots into the shoulder and neck region. The procedure was done four times in two-week intervals, each time using 200 µg of target antigen, until the development of a high titre of the heavy-chain-only immunoglobulin-G (IgG subclasses IgG2 and IgG3) needed for nanobody production. Ten days after the last injection, 60 ml of anticoagulated blood was collected from the jugular vein for the isolation of lymphocytes (Ficoll-Paque PLUS, GE Healthcare Life Sciences; Leucosep tubes, Greiner). Approximately 25 million lymphocytes were used to isolate messenger RNA (RNeasy Mini Kit, Qiagen), which was reverse-transcribed into complementary DNA (Applied Biosystems High-Capacity cDNA Reverse Transcription Kit) using the VH gene-specific primer. The VHH (nanobody) repertoire was amplified by two-step polymerase chain reaction (PCR) and a phage library was generated by ligation into a SapI-digested pDX phagemid vector35 using 336 ng of the VHH repertoire and 1 μg of the plasmid DNA. The resulting nanobody library (2.14e9) was screened by biopanning against directly immobilized SOI target at 1 µl per well in a 96-well Maxisorp plate (Nunc); two selection rounds were performed until high positive enrichment of phages. One hundred and ninety single clones from the enriched nanobody library were induced to express 6xHis-tagged soluble nanobodies in the bacterial periplasm and analysed by enzyme-linked immunosorbent assay (ELISA) for binding to the target. Sanger sequencing of 96 ELISA-positive clones identified 20 different nanobody families according to their CDR3 length and sequence diversity36.

The immunizations of alpaca were conducted strictly according to the guidelines of the Swiss Animals Protection Law and were approved by the Cantonal Veterinary Office of Zurich, Switzerland (licence no. ZH028/2021).

Biophysical characterizations of SOI

SEC-MALS measurements were carried out using fractions eluted from cobalt beads containing the highest amount of SOI. The samples were centrifuged for 10 min at 15,000 r.p.m. to remove large aggregates. For SEC-MALS measurements, the samples were gel-filtered using a Superdex 200 Increase 10/300 column on a Thermo Scientific UltiMate 3000 HPLC system. Gel filtration was performed with 50 mM HEPES pH 7.5 and 150 mM NaCl containing 0.03% DDM (Anagrade) through a 0.1-μm filter. Following elution from the column, the samples were analysed in line by the UV absorbance detector of the Thermo Scientific UltiMate 3000 HPLC system, followed by TriStar miniDawn light scattering (LS) and OptiLab Rex refractive index detectors in series. Protein conjugate analysis (PCA) was performed using ASTRA 6.1 software (Wyatt Technology) to determine the protein mass of the protein–detergent complexes.

CD spectroscopy was used to assess the folding and thermal stability of the purified SOI37. All CD measurements were performed on a Chirascan Plus CD spectrometer (Applied Photophysics) equipped with a computer-controlled Peltier element. All experiments were performed in phosphate-buffered saline at pH 7.6 without NaCl in a 1-mm cuvette. CD spectra were obtained at 25 °C by scanning from 190 to 260 nm in 0.2-nm steps using a protein concentration of 8 μM. A ramping rate of 1 °C min−1 was used to record the thermal unfolding profiles from 20 to 90 °C.

Identification of the ferric haem b prosthetic group

XANES data for SOI was collected at the Swiss Light Source at beamline SuperXAS, using a Si(111) monochromator, Si mirrors for collimation and harmonic rejection, a toroidal Rh-coated mirror for focusing, ionization chambers for incident intensity detection and a five-element silicon drift detector (SDD) for X-ray fluorescence measurements. A 200–300-μl sample of 3 mg ml−1 SOI was filled into a 4–5-cm kapton capillary (diameter of 2 mm) using a Hamilton syringe. The following measurement strategy was used. First, a series of XANES spectra were measured to check the influence of X-ray-induced damage at room temperature After that, the sample was translated to fresh sections multiple times, and scans were performed during the time when the X-ray-induced damage was small. Measured XANES spectra were compared with theoretical XANES calculations performed using finite-difference method near-edge structure (FDMNES) code using the full multiple scattering method38. Our data show a mixture of 5- and 6-coordinated SOI at room temperature in solution (Extended Data Fig. 2). The fifth axial ligand of iron is a nitrogen atom of histidine with ~2.3 Å and the sixth distal ligand of iron can be an oxygen or nitrogen atom.

EPR spectroscopy

The EPR samples contained 305 µM SOI, 545 µM SOI–NB complex or 309 µM SOI Y103A in 200 µl of buffer consisting of 20 mM HEPES, 150 mM NaCl and 0.03% DDM pH 7.5, unless stated otherwise. Reduced samples were prepared by addition of 10 mM sodium dithionite in an anaerobic glovebox. Stock solutions of 200 mM styrene oxide and phenylacetaldehyde were prepared in methanol. A 200 mM benzylamine stock solution was prepared in water. Styrene oxide, phenylacetaldehyde and benzylamine were added to a final concentration of 10 mM to the three different SOI samples, and the EPR samples were frozen in liquid nitrogen. EPR spectra were recorded on a Bruker EMXplus spectrometer with a helium-flow cryostat at 17 K (refs. 39,40) using the following EPR parameters: microwave frequency of 9.410 GHz, microwave power of 20 mW, modulation frequency of 100 kHz and a modulation amplitude of 20 G. The magnetic field was calibrated using the Bruker BDPA (1,3-bis(diphenylene)-2-phenylallyl radical) standard with a g-value of 2.00254 ± 0.00003.

Functional characterization

For isomerization activity measurements, the following coupled enzyme assay was carried out (Extended Data Fig. 5c). In a 1.5-ml cuvette, a 1-ml reaction for (S)-styrene oxide was performed at 25 °C in reaction buffer (0.05 M potassium phosphate buffer pH 8, 0.01% DDM) containing 2 mM NAD+ and 6 U of EcALDH. The change in absorbance at 340 nm was recorded at regular time intervals using a Hitachi U2900 spectrophotometer, and absorbance units were converted to concentration using the extinction coefficient at 340 nm (ɛ340 nm) value of NADH (6.22 Abs mM−1 cm−1). The slope of the concentration–time plot for the first 15 s was calculated to determine activity, with 1 U defined as the activity of SOI that gives 1 µmol of product in 1 min under assay conditions. To determine the kinetic parameters KM and kcat, the isomerization activity was determined with (S)-styrene oxide concentrations of 0.1–5.0 mM. SOI (0.14 μg, 7.26 pmol) or 0.07 μg (3.63 pmol) of SOI–NB complex (containing an equimolar mixture of SOI and nanobody) was added to the assay mixture at 25 °C in reaction buffer (0.05 M potassium phosphate buffer, pH 8, 0.01% DDM) containing 2 mM NAD+, 6 U EcALDH, to start the reaction. The slope of the concentration–time plot for the first 15 s was used to determine the initial rate of each reaction in mM min−1, to obtain a relationship between initial activity and concentration. Each reaction was performed in triplicate.

AlphaFold structure prediction

AlphaFold2 (ref. 22) was implemented by locally running an adapted code written by ColabFold41. All runs used only the sequence of SOI, with no templates or Amber relaxation. We assessed monomeric to tetrameric models.

Reconstitution of SOI in a lipid nanodisc

The lipid nanodisc reconstitution of SOI was performed using E. coli polar lipids and an MSP1D1 nanodisc. The concentrations of SOI for nanodisc reconstitution were in the 300–400 μM range, with a molar ratio of SOI to MSP1D1 to lipid of 3:2:75. Initially, 5 mg of E. coli polar lipid (EPL, Avanti) was dissolved in 100 μl of chloroform and dried under a stream of N2. The resulting lipid film was then mixed with 600 μl of buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl and 1% DDM), and sonicated in a bath sonicator (Bandelin SONOREX SUPER) until the mixture turned translucent. The detergent-solubilized EPL extract was combined with freshly purified SOI (in a 1:25 molar ratio), then gently mixed for 30 min at room temperature. Following this incubation, MSP1D1 was introduced to the protein–lipid mixture and incubated at room temperature for another 30 min. Incorporation of SOI in nanodisc was triggered by adding 300 mg of wet Bio-beads (precleaned with 100% methanol and Milli-Q water). The reconstitution solution was then incubated overnight at 4 °C with gentle mixing. The supernatant was cleared of beads, and the sample was spun before loading onto a Superdex 200 Increase 10/300 GL column (GE Healthcare) equilibrated in 20 mM Tris pH 8.0 and 150 mM NaCl. The peak fractions corresponding to SOI in MSP1D1 (elution volume, 10.3–11.8 ml) were collected, concentrated with a 100-kDa-cutoff Amicon concentrator (Millipore) and used for cryo-EM grid preparation.

Cryo-EM sample preparation and data collection

The cryo-EM samples of SOI–NB were prepared using freshly reconstituted SOI–EPL–MSP1D1 nanodisc complex. SOI–benzylamine was prepared by adding 1 mM benzylamine (100 mM stock concentration in 100% DMSO) to SOI–NB, then incubating the samples on ice for 10–15 min. The final concentration of SOI–NB and SOI–NB–BA complexes used for freezing grids was ~5–7 mg ml−1. The concentration of protein was estimated based on absorbance at 280 nm and a normalized extinction coefficient of 154,685 M−1 cm−1 (considering one molecule of MSP1D1 per three molecules of SOI and nanobody molecule, respectively). All the grids used for cryo-EM grid preparation were freshly glow-discharged in a PELCO easiGlow (Ted Pella) glow-discharge cleaning system for 25 s at 30 mA in air, then 3.5 μl of protein solution was applied to Quantifoil 1.2/1.3 grids (300 mesh), blotted for 3 s with blot force 20, and plunged into liquid ethane using a Vitrobot Mark IV system (Thermo Fisher Scientific) with 100% humidity and an ambient temperature of 4 °C. The frozen grids were stored in liquid nitrogen for subsequent cryo-EM data collection.

Cryo-EM data collection and processing

All the cryo-EM datasets of SOI were collected using EPU software on a 300-kV Titan Krios system (Thermo Fisher Scientific) equipped with a Gatan K3 direct electron detector and a Gatan Quantum-LS GIF, at ScopeM, ETH Zurich. All movies were acquired in super-resolution mode with a defocus range of −0.5 to −3 μm and were binned twofold after acquisition in EPU. The dataset of SOI apo was composed of 8,942 movies with an average dose of 65 e2 and final pixel size of 0.66 Å. The cryo-EM processing was performed in Relion (version 3.1.3 and 4.0.0)42,43. A flow chart of cryo-EM processing of the SOI–NB complex is provided in Extended Data Fig. 6. In brief, all movie stacks were motion-corrected using MotionCorr2 (version 1.4.0)44, then CTF-corrected using Gctf (version 1.0.6)45. A total of 2,715,264 particles were autopicked, then subjected to several rounds of 2D classifications, yielding 955,410 particles. The best set of 2D classes was then used to generate an initial model to be used in 3D classification. After multiple rounds of 2D and 3D clarifications, a set of 396,760 particles were selected for masked 3D refinement (using masks excluding the nanodisc density), resulting in a 3D reconstruction at a resolution of 2.54 Å. These refined particles were then subjected to CTF refinement and particle polishing, and another round of 3D classification without alignment and with masking of the nanodisc. The particles from the best 3D class were subjected to several iterative cycles of 3D refinement, CTF refinement and particle polishing, yielding a final post-processed density map at a resolution of 2.05 Å.

For the SOI–NB–BA complex, 11,905 movies with 56 e2 were collected. The data-collection strategy and image processing of the SOI–NB–BA complex were similar to those used for the SOI–NB complex. The 3D projections from the best 3D class from SOI apo were used as templates for autopicking and 3D classification jobs. The detailed steps for processing of the SOI–NB–BA complex are shown in Extended Data Fig. 7. The cryo-EM density features of the transmembrane helices of SOI, nanobodies and ferric haem b are shown in Extended Data Fig. 8. The resolutions of the cryo-EM maps of the SOI–NB complex and SOI–NB–BA complex are shown by Fourier shell correlation (FSC) curves (Supplementary Fig. 1).

Details of cryo-EM data collection and the statistical analysis are shown in Table 1.

Table 1 Cryo-EM data collection and analysis statistics

Model building and refinement

As starting models, the AlphaFold22 predicated model of SOI and a homology model of the nanobody generated using SwissModel46 (with PDB 5VAK as template) were docked on the final post-processed density map and manually refined using Coot47. The ligands, the ferric haem b prosthetic group and benzylamine were generated from SMILE codes using the eLBOW program in Phenix48. The structures were finally refined using phenix.real_space_refine48. The quality of the final models was assessed using MolProbity49. Local resolution maps were calculated by ResMap50 implemented in Relion 4.0.0 (ref. 42). All figures were generated using PyMOL 2.5.2 (ref. 51) and ChimeraX52.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this Article.