Structural basis for enzymatic terminal C–H bond functionalization of alkanes

Alkane monooxygenase (AlkB) is a widely occurring integral membrane metalloenzyme that catalyzes the initial step in the functionalization of recalcitrant alkanes with high terminal selectivity. AlkB enables diverse microorganisms to use alkanes as their sole carbon and energy source. Here we present the 48.6-kDa cryo‐electron microscopy structure of a natural fusion from Fontimonas thermophila between AlkB and its electron donor AlkG at 2.76 Å resolution. The AlkB portion contains six transmembrane helices with an alkane entry tunnel within its transmembrane domain. A dodecane substrate is oriented by hydrophobic tunnel-lining residues to present a terminal C–H bond toward a diiron active site. AlkG, an [Fe–4S] rubredoxin, docks via electrostatic interactions and sequentially transfers electrons to the diiron center. The archetypal structural complex presented reveals the basis for terminal C–H selectivity and functionalization within this broadly distributed evolutionary class of enzymes. Alkane monooxygenase (AlkB) enables diverse microorganisms to hydrolyze alkanes as their sole carbon and energy source. Liu and colleagues reveal how AlkB orients an alkane to its active site and performs a selective terminal C–H bond hydroxylation.


Structure determination
We screened 21 AlkB homologs for production in Escherichia coli and identified a natural AlkB-AlkG fusion from Fontimonas thermophila (FtAlkBG) that was expressed well and could be purified using the detergent n-dodecyl-β-d-maltoside (DDM) (Extended Data Fig. 2a). We reconstituted the protein in amphipol PMAL-C8 nanodiscs and further purified it using size-exclusion chromatography (Extended Data Fig. 2b). The molecular mass of FtAlkBG on SDS-PAGE is less than 50 kDa (Extended Data Fig. 2a), which is relatively small for structure determination by cryo-electron microscopy (cryo-EM). To enhance the signals of cryo-EM images, we used a camera pixel size of 0.333-Å resolution under super-resolution mode, with an energy filter slit width at 15 eV, and collected data from holes with the thinnest-possible ice capable of encapsulating particles. Under these conditions, we were able to obtain well-identifiable particles on micrographs (Extended Data Fig. 2c). Two-dimensional (2D) class averaging showed recognizable secondary structural features and Article https://doi.org/10.1038/s41594-023-00958-0 region of FtAlkBG during sample vitrification. In addition, the relatively small size of the PMAL-C8 amphipol may be well suited for cryo-EM structure determination of small membrane proteins. The final reconstruction used only 1.6% (46,953) of the total picked particles to reach a resolution of 2.76 Å, estimated using the gold-standard Fourier shell correlation at 0.143 (Extended Data Fig. 4d). These particles have well-distributed orientation angles, with only slightly more particles viewed from the periplasmic side (Extended Data Fig. 4e). The cryo-EM map is of high quality and allowed the building and refinement of an atomic model (Extended Data Fig. 5).

Overall structure
The solved FtAlkBG structure is a monomer (Fig. 1a), consisting of both AlkB (residues 13-387) and AlkG (residues 415-467) domains (Fig. 1b,c), with a total molecular mass of 48.6 kDa. The FtAlkB structure is mostly α-helical, with six transmembrane α-helices (TM1-6), two partially membrane-embedded short α-helices α2 and α3 (Fig. 1d), a long membrane-associated amphiphilic helix α4 (Fig. 1b,d) and six other α-helices outside the membrane (Fig. 1c,d). TM2 is kinked within the membrane, making an angle of about 100°. The protein is narrower on the periplasmic side than on the cytoplasmic side, forming a wedge-shaped structure that may be responsible for bending the membranes to facilitate the formation of membrane vesicles 22 . In terms of sequence, TM1-4 are clustered and separated from TM5 and TM6 by α2-α4. The soluble domain follows TM6 and connects to the FtAlkG domain through a 27-residue, disordered linker (residues 388-414).

The diiron active site
FtAlkB binds two iron ions, which form a diiron center that is essential for C-H bond activation (Fig. 2) 23 . The diiron center is coordinated by nine conserved histidines and one carboxylate residue (Extended Data Fig. 6) with His137, His141, His167, His172 and His314 coordinating Fe1 and four histidines (His171, His272, His311 and His315) and glutamate Glu271 coordinating Fe2. The coordination geometry of Fe1 is octahedral, such that four histidine nitrogen atoms are coplanar with Fe1, analogous to the heme in cytochrome P450 enzymes that can also catalyze C-H bond hydroxylation (Extended Data Fig. 7) 24 . The octahedral geometry of Fe1 allows a fifth ligand on one side of the plane. His141 is on the other side of the plane to further stabilize Fe1. Fe2 is 6.1 Å away from Fe1, in line with Fe1 and His141, but too far to coordinate with Fe1. The four histidines (His171, His272, His311 and His315) are on the same side of the Fe2 site, thus allowing additional coordination to Fe2.
Additional density was present in the active site of the cryo-EM structure, allowing a dodecane substrate (C-12) to be modeled with its terminal methyl group in close proximity of the diiron center ( Fig. 2b and Extended Data Fig. 8). Experimentally, FtAlkBG hydroxylates dodecane, as demonstrated by the decrease in NADH fluorescence compared to a nonsubstrate control (Extended Data Fig. 9). No alkanes were provided during protein production or sample preparation. Therefore, the dodecane was likely derived from the E. coli which was used as the expression host for FtAlkBG. The linear dodecane is oriented by hydrophobic residues, such that its terminal methyl group (C(sp 3 )-H bonds) is 5.3 Å from Fe1 and 4.8 Å from Fe2. The methyl group is also located 3.2 Å away from Glu271, suggesting a possible role for Glu271 in hydroxylation of the C-H bond (Fig. 2a). The methyl group is not in line with Fe1 and His141. Instead, it is closer to Fe2 and lies on the noncoordinated side of Fe2. The proximity of the substrate terminal methyl group to Fe2 suggests the Fe2 center may be more directly involved in C-H bond activation.

Substrate selectivity
FtAlkBG has a long, hydrophobic substrate binding channel starting from near the middle of transmembrane helices TM1 and TM2 and ending at the diiron center (Fig. 2b). The hydrophobic residues lining the channel show increasing degrees of conservation the nearer they are amphipol densities around the protein transmembrane region (Extended Data Fig. 2d).
We determined the structure of FtAlkBG, using the workflow illustrated in Extended Data Fig. 3, and achieved a cryo-EM map at a resolution of 2.76 Å (Table 1). FtAlkBG is wrapped tightly by amphipols that cover approximately half of the protein (Extended Data Fig. 4a-c). Amphipols appear to form a shield that stabilizes the transmembrane to the diiron site. Dodecane occupies roughly half the tunnel length, suggesting the enzyme could accommodate longer alkanes. Because the tunnel entrance is narrower than the dodecane-occupied region, binding of longer substrates would presumably be accommodated by minor conformational changes. The side chains of residues Ile27, Ile54 and Ala127 form a restriction site. Sequence alignment shows that Ile54 in FtAlkBG is equivalent to Trp55 in AlkB from P. oleovorans (PoAlkB) (Extended Data Fig. 6). Trp55 in PoAlkB determines alkane chain-length selectivity because mutating it to a serine or cysteine extended its alkane chain-length selectivity from C-10 to C-13 (ref. 25). Conversely, AlkB in Mycobacterium tuberculosis has a leucine at position 69 (equivalent to Trp55 in PoAlkB and Ile54 in FtAlkBG); mutating it to a bulky side chain residue phenylalanine or tryptophan reduces the chain-length selectivity from C-16 to C-11 (ref. 25). Similarly, in AlkBG from Dietzia cinnamea, a V91W mutant (equivalent to Ile54 in FtAlkBG) rendered the enzyme less active for alkanes longer than C-9 (ref. 26). Collectively, these hydrophobic residues appear to mediate alkane chain-length selectivity.

Electron transfer mechanism
C-H bond activation requires the transfer of electrons to the Fe-Fe diiron center in AlkB by AlkG 11 . FtAlkG has an [Fe-4S] cluster coordinated by Cys418, Cys421, Cys451 and Cys454 (Extended Data Fig. 10a). FtAlkG sits on a positively charged surface of FtAlkB, adjacent to the diiron center (Fig. 3a). On the surface of AlkG are four negatively charged residues (that is, Glu434, Asp453, Asp458 and Asp461; Extended Data Fig. 10b) that form electrostatic interactions with six positively charged residues (Arg143, Arg148, Arg169, Arg215, Arg284 and Arg287) on the surface of FtAlkB. However, we did not observe specific salt-bridge interactions between carboxylate-guanidinium pairs in the cryo-EM density. We interpret this to mean that nonspecific electrostatic interactions may facilitate fast on and off rates of FtAlkG association with FtAlkB that would act as a shuttle to relay pairs of electrons one at a time from FtAlkT to FtAlkB. This mechanism is also consistent with the disordering of the linker between FtAlkG and FtAlkB (Fig. 1b). AlkG alone cannot supply electrons; it functions as a shuttle of electrons originating from its electron donor NADH through NADH oxidation by AlkT, a rubredoxin reductase 12 . Electron transfer between AlkT and AlkG has been proposed on the basis of the structure of a complex between rubredoxin reductase and rubredoxin from P. aeruginosa (PaAlkT and PaAlkG, respectively) (Extended Data Fig. 10c) 27 . We superimposed the FtAlkBG and PaAlkG-PaAlkT structures on the basis of AlkG orientation (Extended Data Fig. 10d). We found that AlkB and AlkT bind to the same face on AlkG (Fig. 3a,b), indicating AlkG must dissociate from AlkB to associate with AlkT to enable its reduction and subsequently transfer the electron to AlkB, thus forming a sequential electron transfer model shuttling one electron at a time.
Although there are two irons in FtAlkB, the [Fe-4S] cluster in FtAlkG is closer to Fe1, that is, at 13.6 Å, than to Fe2, at 17.1 Å (Fig. 3c). The unequal distance raises a question as to which iron ion the electrons are transferred. In natural redox proteins, physiological tunneling electron transfer can happen 6-14 Å between redox centers 28 . Therefore, we suggest that, in FtAlkBG, an electron is transferred from the [Fe-4S] cluster to Fe1. Considering the close distance of 6.1 Å between Fe1 and Fe2, Fe1 may transfer an electron to Fe2. Fe1 may function as a redox center to relay electrons from [Fe-4S] in FtAlkG to Fe2 in FtAlkB.

Discussion
Alkanes are highly hydrophobic and are almost immiscible with water, but they can diffuse into hydrophobic membranes. A structural feature of the transmembrane protein AlkB is its alkane-accessible tunnel with an entrance in the transmembrane region, allowing substrate access. We expressed the protein in the absence of reductant and in the presence of supplemented ferric chloride and purified it under ambient atmospheric oxygen conditions. Our previous Mossbauer spectroscopy findings identified that the 'as-isolated AlkB' from P. oleovorans was in the Fe(III)-Fe(III) state 22 ; therefore, we postulate that, in an isolated state, the iron ions in the FtAlkBG structure are also in the oxidized Fe(III)-Fe(III) state (Fig. 4a), consistent with the observed long Fe-Fe at 6.1 Å (Fig. 2a). Alkanes diffuse into the membranes where they interact with the tunnel-forming hydrophobic residues that lead toward the diiron center. The shape of the tunnel means that only its terminal C-H bonds can reach the cavity adjacent to the diiron center (Figs. 2b and  4b). Binding of a substrate to the active site triggers conformational changes that induce the docking of AlkG to AlkB (Fig. 4b). When AlkT and NADH are available, AlkG will be reduced and will transfer two electrons derived from oxidation of NADH in AlkT to the diiron center in AlkB. The process requires two single electron transfer cycles from AlkT to AlkG and from AlkG to AlkB (Fig. 4c,d) (Fig. 4e). The diamond core is reactive and abstracts a hydrogen from the terminal C-H bond followed by oxygen rebound 33 to form the terminal hydroxyl group. One of the two oxygen atoms is utilized for the production of the hydroxyl group during substrate oxidation, whereas the other is likely protonated by Glu271 to form a water molecule (Figs. 2a and 4f). Biocatalysts provide environmentally benign alternatives to high-temperature and high-pressure precious metal-based catalysis for alkane C-H bond activation and functionalization 6,34 . The FtAlkBG structure may provide a structural platform, in a similar way to the P450 system 35 , for creating novel biocatalysts for super-selective production of chemical feedstocks and value-added products from abundant alkanes.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41594-023-00958-0.

Protein expression and purification
The gene encoding the full length of FtAlkBG was synthesized and cloned into the pET16b expression vector by GenScript. The vector contains an N-terminal 10×His tag followed by a tobacco etch virus (TEV) cleavage site. Protein expression was performed in Escherichia coli BL21-Gold (DE3) pLysS cells (Agilent) growing in LB medium with an induction of 0.4 mM IPTG at 28 °C for 4.5 h. One milliliter of 0.1 M FeCl 3 in 0.12 M HCl was added to every 1 L of culture 15 min before induction. Cells were harvested by centrifugation at 5,000g for 10 min at 4 °C, and were resuspended in lysis buffer containing 30 mM HEPES, pH 7.5, 150 mM NaCl and 2 mM MgCl 2 . Cells were lysed using an EmulsiFlex-C3 homogenizer (Avestin) at 15,000 psi. Cell lysates were cleaned by centrifuging at 25,000g for 25 min at 4 °C, and the supernatants were collected for pelleting membranes by ultracentrifugation at 360,000g for 2 h at 4 °C. The membranes were frozen in liquid nitrogen and stored at −80 °C until further use. Membranes were resuspended in solubilization buffer (30 mM HEPES, pH 7.5, 150 mM NaCl, 20% glycerol (wt vol −1 ), 0.5 mM Tris(2-carboxyethyl)phosphine (TCEP), 8% DDM (wt vol −1 ) and Roche proteases inhibitor cocktail). After solubilization for 4 h at 4 °C, insoluble components were removed by ultracentrifugation at 360,000g for 30 min at 4 °C. The supernatant was applied to pre-equilibrated Ni 2+ -nitrilotriacetic acid (NTA) column (Anatrace) and washed by 8 column volumes of wash buffer (30 mM HEPES, pH 7.5, 150 mM NaCl, 20% glycerol, 0.25 mM TCEP, 50 mM imidazole, 0.05% DDM). The protein was eluted by 500 mM imidazole and digested by TEV protease overnight at 8 °C. The His tag-removed protein was clarified by ultracentrifugation at 190,000g for 30 min at 4 °C, desalted by Econo-Pac 10DG desalting column (Bio-Rad) and passed through the pre-equilibrated Ni 2+ -NTA column. The flow-through was concentrated to 10 mg ml −1 and further purified by size-exclusion chromatography using Superdex 200 Increase (GE Healthcare) in buffer containing 20 mM HEPES, pH 7.5, 100 mM NaCl, 5% glycerol, 0.25 mM TCEP and 0.01% DDM. Fractions containing the protein were pooled and concentrated to 10 mg ml −1 using a 50-kDa molecular cutoff concentrator (Millipore Sigma).

Amphipol reconstitution preparation
Reconstitution of detergent-solubilized FtAlkBG (10 mg ml −1 ) into amphipol nanodiscs was performed on the basis of the previously published method 37 . Purified protein in a buffer (20 mM HEPES, pH 7.5, 100 mM NaCl, 5% glycerol, 0.25 mM TCEP, 0.01% DDM (wt vol −1 )) was mixed with amphipol PMAL-C8 (catalog no. P5008, Anatrace) in a protein:amphipol mass ratio of 1:5 (wt wt −1 ). The mixture was incubated at 4 °C for 20 h and concentrated using a 50-kDa molecular cutoff concentrator. Then, SM-2 Bio-Beads (40 mg wet; catalog no. 1523920, Bio-Rad) were added to the concentrated mixture, and the mixture was nutated at 4 °C for 3 h. Before use, Bio-Beads were soaked with buffer containing 20 mM HEPE, pH 7.6, 100 mM NaCl overnight at 4 °C. The Bio-Beads were removed by passing through a Micro Bio-Spin chromatography column. The flow-through containing the reconstituted particles was centrifuged at 20,000g for 1 h at 4 °C; the supernatant was used for size-exclusion chromatography through a Superdex 200 Increase column (GE Healthcare) in a buffer containing 20 mM HEPES, pH 7.6, 100 mM NaCl. The fractions containing the FtAlkBG-PMAL-C8 nanodiscs were collected and concentrated to 5 mg ml −1 for cryo-EM analyses.

Cryo-EM sample preparation and data collection
Three microliters of the FtAlkBG-PMAL-C8 nanodiscs were applied to a glow-discharged (15 mA current for 15 s), 300-mesh R 0.6/1 UltrAuFoil Holey Gold grid (catalog no. Q350AR1A, Electron Microscopy Sciences). After waiting for 60 s, vitrification was performed using a Mark IV Vitrobot (Thermo Fisher Scientific) with a blotting condition of 5 s blot time, 0 blot force and 100% humidity at 6 °C.
Single-particle cryo-EM data were collected at the Laboratory for Biomolecular Structure (LBMS) facility at Brookhaven National Laboratory using a Thermo Fisher Titan Krios electron microscope (G3i) equipped with a Gatan K3 camera and a BioQuantum energy filter. With a physical pixel size of 0.666 Å (0.333 Å in super-resolution mode), a total dose of 60 e − Å −2 was fractioned to 50 frames under the super-resolution mode using the Thermo Fisher data acquisition program EPU. A total of 11,757 movies were collected with an energy filter width of 15 eV throughout the data acquisition. Data collection statistics are listed in Table 1.

Cryo-EM data processing
Dose-fractioned movies were corrected and averaged using Motion-Corr2 (ref. 38) with a bin factor of two. Averaged movie frames were further corrected by contrast transfer function estimation using Gctf 39 . Micrographs with an estimated resolution better than 4.5 Å were selected for particle picking. Particles were initially picked and cleaned up by 2D class averaging in Relion3 (ref. 40). The cleaned particles were then used for training and particle picking using Kpicker 41 . A total of 2,950,051 particles were picked, extracted at 256 pixels and binned to 64 pixels with a pixel size of 2.664 Å.
We used CryoSPARC 42 for additional 2D class averaging, which resulted in 1,738,105 selected particles. With four classes each, selected particles were used for three cycles of ab initio reconstruction followed by three-dimensional (3D) heterogenous refinements with a particle size of 64 pixels (2.664 Å). The process selected 369,194 particles, which were centered and re-extracted at 256 pixels and binned to 128 pixels with a pixel size of 1.332 Å (Extended Data Fig. 3).
Re-extracted particles were auto-refined to convergence in Relion3, followed by nonalignment 3D classification into four classes (Extended Data Fig. 3). Particles from the 3D class with the best structural feature (ɑ-helices and side chains) as visualized in Chimera 43 were selected. These particles (13%) from two such classes were selected for contrast transfer function refinement and Bayesian polishing in Relion3, followed by 2D classification and nonuniform refinement in CryoSPARC to reach a refined reconstruction at 2.76 Å. Local resolutions were estimated using BlocRes 44 . Reconstruction statistics are listed in Table 1.

Model building and refinement
To boost side chain features of the map, the masked and filtered cryo-EM map was sharpened using PHENIX 45 with a B factor of −66 Å 2 . COOT 46 was used to build the atomic model, including irons and a substrate dodecane. All refinements were performed in real space in PHENIX. The refined model was validated using MolProbity 47 and the refinement statistics are listed in Table 1.

Model visualization
The cryo-EM density map and atomic models were visualized using either PyMOL (http://www.pymol.org/) or Chimera 43 .

Expression of AlkT
AlkT is a flavin adenine dinucleotide-dependent rubredoxin reductase that transfers electrons from NADH to FtAlkBG for C-H activation and functionalization 10,12 . The gene encoding AlkT from P. oleovorans (PoAlkT) was amplified using PCR and cloned into the pETDuet expression vector. Protein expression was performed in E. coli BL21-Gold (DE3) pLysS cells (Agilent) growing in TB medium with an induction of 0.4 mM IPTG at 16 °C for 17 h. Cells were harvested by centrifugation at 5,000g for 10 min at 4 °C, and were then resuspended in lysis buffer containing 50 mM Tris-HCl, pH 8.0, 100 mM KCl, 16% glycerol, 1 mM TCEP, 2 mM K 3 PO 4 and 2 mM MgCl 2 . Cells were lysed by passing through a French press twice at 13,000 psi. Cell lysates were cleaned up by centrifugation at 25,000g for 30 min at 4 °C, and the supernatants were applied to pre-equilibrated Ni 2+ -NTA resin in a buffer containing

Enzyme activity
For the activity assay, FtAlkBG was overexpressed in E. coli BL21 Star cells (Thermo Fisher Scientific). Cells grew in LB medium supplemented with 100 μM FeCl 3 . After growing for 14-19 h at 16 °C, cells were harvested by centrifugation at 5,000g for 10 min at 4 °C. Cells were resuspended in lysis buffer (30 mM Tris-HCl, pH 7.75, 150 mM NaCl, 0.2 mM PMSF, 100 μM ferrous ammonium sulfate hexahydrate and 100 μM sodium hydrosulfite) and lysed using a French press at 13,000 psi. Lysates were centrifuged at 10,000g for 25 min at 4 °C. Pellets were resuspended in lysis buffer without PMSF, frozen in liquid nitrogen and stored at −80 °C until further use.
The dodecane assay was performed on the basis of a modified protocol from McKenna and Coon 31,48 using a Spark 20 M multimode microplate reader (Tecan). In a reaction volume of 100 μl, we added 20 μg FtAlkBG, 10 μg PoAlkT and 500 μM NADH in a buffer of 50 mM Tris-HCl, pH 7.5, 100 mM NaCl. The reaction was initiated by addition of 400 μM dodecane dissolved in acetone at 24 °C. The reaction was monitored by measuring the decrease of NADH fluorescence (excitation/emission at 360/460 nm with a bandwidth of 10 nm). Consumption of NADH, that is, decreased NADH fluorescence, is proportional to dodecane hydroxylation to dodecanol. For the control experiment, we added all components except we used acetone instead of dodecane.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The 3D cryo-EM density map has been deposited in the Electron Microscopy Data Bank under the accession number EMD-28890. Atomic coordinates have been deposited in the Protein Data Bank under the accession number PDB 8F6T. Source data are provided with this paper. Article https://doi.org/10.1038/s41594-023-00958-0 Extended Data Fig. 1 | Organization of the components of the AlkBGT biocatalysts. The three components (AlkB, AlkG, and AlkT) in the alkane hydroxylation biocatalyst are encoded as separate genes in P. oleovorans. AlkG is an Fe-sulfur cluster-containing redox protein that transfers electrons to the diiron center in AlkB. AlkT is a flavin adenine dinucleotide (FAD)-dependent rubredoxin reductase that transfers two electrons from NADH singly to the Fe-sulfur cluster of AlkG, which then relays two of them singly to AlkB to enable biocatalysis 10,12 . The membrane-bound AlkB is a nonheme diiron-containing monooxygenase that specifically oxidizes the terminal methyl group of alkanes to produce the corresponding primary alcohols 8,10,13,14,17,22,30 . There are microbes that have evolved to have the AlkB and AlkG fused together, presumably for improved electron transfer 15,26 . Additionally, in some species (Leptospira, Limnobacter, and Polaromonas), AlkB is fused with multiples proteins of an N-terminal ferredoxin and a ferredoxin reductase 15 .