Biogenic manganese oxide nanoparticle formation by a multimeric multicopper oxidase Mnx

Bacteria that produce Mn oxides are extraordinarily skilled engineers of nanomaterials that contribute significantly to global biogeochemical cycles. Their enzyme-based reaction mechanisms may be genetically tailored for environmental remediation applications or bioenergy production. However, significant challenges exist for structural characterization of the enzymes responsible for biomineralization. The active Mn oxidase in Bacillus sp. PL-12, Mnx, is a complex composed of a multicopper oxidase (MCO), MnxG, and two accessory proteins, MnxE and MnxF. MnxG shares sequence similarity with other, structurally characterized MCOs. MnxE and MnxF have no similarity to any characterized proteins. The ~200 kDa complex has been recalcitrant to crystallization, so its structure is unknown. Here, we show that native mass spectrometry defines the subunit topology and copper binding of Mnx, while high-resolution electron microscopy visualizes the protein and nascent Mn oxide minerals. These data provide critical structural information for understanding Mn biomineralization by such unexplored enzymes.

The ab initio models of MnxE and MnxF were docked into a MnxEF dimer and then a hexamer. The MnxEF hexamer model was docked onto the homology model of MnxG to generate a model for the complex. Specific residues (residue numbers listed above structures) were blocked from being at protein-protein interfaces in docking based on covalent labeling data. CCSs for most models calculated with two different algorithms (CCS_PA and CCS_PSA, details in supplementary method) are reasonably consistent with experimental values (CCS_exp), except for the MnxEF dimer and hexamer, which may have collapsed significantly after being released from the complex. I-TASSER 1, 2 was used to generate the model of MnxG based on its sequence similarity to the characterized MCO, human ceruloplasmin. 3 The structures of MnxE and MnxF were modeled de novo and docked using online servers. 4

These monomers were released from the intact Mnx complex in the gas phase by (a) SID and (b) CID. The CCSs (in nm²) from ion mobility measurement are annotated next to the peaks (based on the apex of the drift time peak). While most of the released MnxE was bound to one copper in both SID and CID, the copper load on MnxF differs significantly between SID and CID. The overall metal load on MnxE is 70-80% for one copper and 10-20% for two coppers. For MnxF, about half of the protein is bound to one copper and ~25% is bound to two coppers, regardless of charge state (4+ to 6+ in view). However, on average the MnxF released in CID showed less copper loading, with higher charge states (5+, 6+) binding less copper than the lower charge state (4+). The higher charged monomers also showed larger CCSs than the lower charged monomers. In Figure 3b, these same monomers created the spots that are enclosed by the parallelograms labeled with "extended" and "compact" near m/z 2000. Most of the monomers in CID have CCSs and drift times that place them in the "extended" parallelogram, implying extended and unfolded conformations.
In contrast, most of the ions with m/z higher than 2000 in SID showed compact conformations (also in Figure 3d, parallelograms labeled with "MnxE/MnxF 1mer"). The MnxF 6+ ion in SID showed a lower copper load than the 5+ and 4+ ions in the spectrum above, probably because it is partially unfolded. Even though the MnxF ions at 5+ and 4+ from CID showed compact CCSs similar to those in SID at the same charge states, the lower copper load suggests they might have been unfolded during the dissociation process and then refolded after release from the complex. This observation is consistent with previous studies showing that significant subunit unfolding occurs during CID, which may trigger metal loss. In contrast, the energy-sudden SID activation minimizes unfolding and preserves the native ligands/metals bound to the subunits.

Collision cross section (CCS) was determined experimentally following a previously described protocol based on the drift time measured in the ion mobility mass spectrometry spectra 7. Relative error was calculated from multiple measurements using different ion mobility parameters. TEM diameter is the average diameter of particles in TEM images. The reported error is the standard deviation. More than 1000 Mnx particles and 100 MnxEF hexamer particles were measured. Unlike the MS experiments, the MnxEF hexamer particles examined here were purified from a separate plasmid that did not contain any MnxG. A C-terminal strep tag was attached to MnxF for purification. The CCS of the Mnx complex was measured by IM-MS on the intact protein complex shown in Figure 1 without intentional activation. The CCSs for all other subunits and subcomplexes were measured on the released species in SID (Figure 3d). The details of the EDTA treatment are further described in Supplementary Note 3.
The released subcomplexes from the EDTA-treated Mnx indicated that most of the copper on MnxE and MnxF was removed (a small amount of residual copper remained on MnxE, as shown in Supplementary Fig. 4). The experimental mass is calculated from the peak apex of the smoothed mass spectra; the standard deviation is from the residual error in fitting the charge state distribution of the mass values at the peak apex and does not consider peak widths, which significantly increase uncertainties in mass measurement for large ions. Most of the species from the Mnx with EDTA treatment showed good agreement with theoretical mass within experimental error, except for MnxG, which has extra mass relative to the expected sequence and may have been modified. Many of the high mass ions have broad peak widths due to variable attachment of other species (copper, sodium, potassium, etc.) and variable stoichiometry of modified forms of MnxE/MnxF (e.g. a hexamer of E3F3 can have 1 of the E modified, or 2 of the E modified, etc.), which generates uncertainties in mass determination that are not represented by the error listed in the table. Some of the MnxEF multimeric species listed showed variable copper binding ratios; only the most abundant species are listed here. The copper binding stoichiometry of larger species cannot be confidently determined due to limited resolution and complication from nonspecific salt adducts (such as sodium ions) and protein modifications. Theoretical mass is calculated from the confirmed sequence, considering truncation for MnxE/MnxF based on top-down LC-MS data (Supplementary Fig. 1); the theoretical mass for MnxG is calculated from the predicted sequence. Potential bound metal is not added to the theoretical mass. The Delta Mass column lists the mass difference between the experimental and theoretical values. The collision voltage in SID is defined as the potential difference between the trap cell and the SID surface.
The voltages on the SID device were controlled by an external power supply (Ardara Technologies, Ardara, PA, USA). The other voltages and gas flows were directly controlled by the instrument software. The experimental mass of MnxF is 11.2 kDa, which corresponds to a truncated form of the predicted sequence (truncated region highlighted in red). Top-down mass spectrometry showed that MnxF started from the 2nd methionine in the sequence (Supplementary Fig. 1). The average MWs were obtained from a freeware Molecular Weight Calculator (https://omics.pnl.gov/software/molecular-weight-calculator) based on the sequences. The MWs for MnxE (noted with asterisks) were reduced by 2 Da to account for hydrogen losses, because top-down MS detected a mass shift assigned to disulfide bond formation.
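The mass bookkeeping described in these notes can be illustrated with a minimal Python sketch. It is not the authors' software: the function names are hypothetical, the residue masses are standard average values, and the spread across charge states is used here only as a simple stand-in for the fitting residual reported in the table.

```python
from statistics import mean, stdev

PROTON = 1.007276   # proton mass in Da
WATER = 18.01528    # average mass of H2O in Da
HYDROGEN = 1.00794  # average atomic mass of H in Da

# Standard average residue masses in Da (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}

def average_mass(sequence, n_disulfides=0):
    """Average MW of an unmodified chain; each disulfide removes two hydrogens."""
    chain = sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER
    return chain - 2 * HYDROGEN * n_disulfides

def mass_from_charge_states(peaks):
    """peaks: list of (m/z at peak apex, charge z), at least two entries.

    Returns (mean neutral mass, standard deviation across charge states),
    assuming every charge is carried by a proton.
    """
    masses = [z * (mz - PROTON) for mz, z in peaks]
    return mean(masses), stdev(masses)

def delta_mass(experimental, theoretical):
    """Experimental minus theoretical mass, as in the Delta Mass column."""
    return experimental - theoretical

# Hypothetical peptide with one disulfide bond (not a Mnx sequence).
print(round(average_mass("CGGC", n_disulfides=1), 2))
```

Adducts such as sodium or bound copper shift the apparent mass upward, so a positive delta mass from this kind of calculation does not by itself identify a modification.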
Supplementary Note 2. Impurity in the Mnx sample (Supplemental details for Figure 1)
The impurity species in Figure 1b was isolated and activated by SID in a manner similar to that described for Mnx in the main text. The dissociation pattern suggested that it was a symmetric 166 kDa hexamer consisting of monomers at 27 kDa 8. Bottom-up LC-MS analysis of the tryptic peptides identified the presence of a putative GTP cyclohydrolase 1 type 2 from E. coli (UniProt ID: P0AFP7, data not shown).
The known structure of this protein is a hexamer with a monomer mass of 26.9 kDa. Therefore, we believe the impurity species is the putative GTP cyclohydrolase from the host cell that co-purified with wild-type Mnx following the heat denaturation protocol. 9 When using the affinity purification protocol for the construct in which the C-terminus of MnxG is strep-tagged, we did not observe this impurity protein in the final sample by mass spectrometry.
Supplementary Note 3. Additional details of EDTA treated Mnx for Supplementary Figure 4.
The EDTA treatment was performed by incubating ~3 mg/mL Mnx complex in 100 mM NH4OAc buffer for 2-3 days at 4°C and then incubating it in 2 mM EDTA ammonium salt (balanced to pH 7) for ~10 minutes at ambient temperature. The buffer was then exchanged to 100 mM NH4OAc to remove EDTA. The spectra were acquired with a protein concentration of ~1 mg/mL. After EDTA treatment, the majority of the bound Cu on MnxE and MnxF was removed, as manifested by the downward mass shift of the EDTA spectra relative to the untreated spectra. In untreated samples (Supplementary Fig. 4a-e), the combination of bound Cu and protein modifications gives rise to complex spectra with significantly broadened peaks, which can hardly be resolved for MnxEF multimers larger than the 3mer. The numbers of bound Cu shown are estimates based on mass shifts. Removing the bound Cu with EDTA greatly simplified the spectra, allowing the subunit stoichiometry of MnxE/F in the subcomplexes to be confidently determined. The most abundant species in the EDTA-treated samples exclusively match the theoretical mass of apo MnxEF multimers. Additional peaks at higher mass correspond to MnxEF multimers containing modified MnxE and/or MnxF monomers, and to species carrying nonspecifically bound salt/solvent. For MnxG (Supplementary Fig. 4f), the relatively small mass shift after EDTA treatment did not allow confident determination of the change in the number of bound Cu (peak apex shifted by ~130 Da).
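The note states only that the numbers of bound Cu are estimates based on mass shifts. One way such an estimate could be made is sketched below, assuming each bound copper adds the average atomic mass of Cu; the function name, the optional proton-displacement correction, and the example masses are hypothetical and are not taken from the data.

```python
CU_AVG_MASS = 63.546  # average atomic mass of copper in Da
HYDROGEN = 1.00794    # average atomic mass of hydrogen in Da

def estimate_cu_count(holo_mass, apo_mass, protons_displaced_per_cu=0):
    """Estimate bound Cu from the mass shift between holo and apo species.

    protons_displaced_per_cu is an assumed correction for protons lost on
    metal binding; the default of 0 ignores it.
    """
    shift = holo_mass - apo_mass
    per_cu = CU_AVG_MASS - protons_displaced_per_cu * HYDROGEN
    return shift / per_cu

# Hypothetical masses in Da, for illustration only.
n_cu = estimate_cu_count(holo_mass=11327.1, apo_mass=11200.0)
print(round(n_cu, 2))
```

For broad peaks the uncertainty in the apex can exceed the mass of a single copper, which is consistent with the caution above about interpreting the ~130 Da shift observed for MnxG.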

Supplementary Note 4. Structural characterization of MnxEF (Supplemental details for main manuscript)
Native mass spectrometry was attempted on solutions of MnxEF hexamer that were overexpressed and purified in the absence of MnxG. However, the experiment was not reproducible because MnxEF is highly unstable, especially in the ammonium acetate buffer that is optimal for mass spectrometry. Instead, MnxEF was separated from the Mnx complex in situ after SID and ion mobility so that information on the copper binding affinities of MnxE and MnxF could be obtained. MnxF exhibits more variable Cu loading than MnxE. MnxF binds 1-3 Cu atoms per subunit, on average more than one, while MnxE appears to bind only one Cu per subunit. The binding of Cu on MnxE was even maintained in CID where the MnxE partially unfolded (Supplementary Fig. 6). We hypothesize that this variability results from the location of these metals' binding sites within the secondary and/or tertiary structure of their subunits. The major metal binding site in MnxE is likely localized at several residues at the N-terminus which may not require a specific tertiary fold, therefore the metal binding is not directly affected by unfolding. 10 In contrast, metal binding on MnxF is affected by protein unfolding, with more highly charged and more extended conformations retaining less copper (Supplementary Fig. 6). If this hypothesis is true, then the compact MnxE/MnxF monomers released in SID can better resemble the native structure and metal binding properties in the Mnx complex than the unfolded monomers.
Previous reports found that the copper content of Mnx varies 11,12,13. The ability to extract metals from the protein with chelators (such as EDTA, Supplementary Fig. 4, Supplementary [...]

Top-down MS (Experimental details for Supplementary Fig. 1)
The purified protein sample was denatured and analyzed with top-down LC-MS. The Waters NanoAcquity liquid chromatography system was equipped with a C5 reversed-phase column (5 µm, 300 Å, inner diameter 100 µm, length ~50 cm; solvent A: 2.5% isopropanol, 5% acetonitrile, 0.58% acetic acid, 0.01% trifluoroacetic acid; solvent B: 45% isopropanol, 45% acetonitrile, 0.58% acetic acid, 0.01% trifluoroacetic acid), and was coupled to a Thermo Orbitrap Elite mass spectrometer for standard data-dependent acquisition of the intact proteins with higher-energy collisional dissociation (HCD) fragmentation at a normalized collision energy of 25 V. The major species detected corresponded to the MnxE and MnxF subunits (Supplementary Fig. 1a, b, c). The MnxG subunit was not detected, but a peak in the chromatogram around 41 min showed a high baseline with unresolved species in MS (Supplementary Fig. 1d). It is speculated that this early eluting species, not resolved in MS, originated from MnxG that was not fully denatured. Therefore, it was not retained strongly on the column and did not ionize well within the detection range of the mass spectrometer. Even when the protein was denatured in 50% organic solvent for direct infusion into a mass spectrometer, the MnxG subunit could not be detected (data not shown). The MnxG subunit can be detected in a standard bottom-up experiment from its tryptic peptides and can also be detected when it is released from the complex by SID in native MS. The cause of this unusual behavior remains to be explored.
The data were analyzed with MSAlign 15 for protein identification. The protein database included reviewed E. coli K-12 proteins in UniProt, appended with the MnxD, MnxE, MnxF, and MnxG sequences. MnxD was not detected in top-down or native MS, consistent with a previous report 9 that MnxD is not part of the active complex even though the gene was included in the construct for protein expression. The coverage map of MnxE is shown in Supplementary Fig. 1b and showed limited fragmentation on the latter half of the sequence with a modification of -2 Da. This implies that there is a disulfide linkage between the two cysteines in the protein which protected the protein from fragmentation, leading to limited coverage in the region with the cysteine residues and a mass shift of -2 Da from the expected sequence. For MnxF (Supplementary Fig. 1c), an N-terminal truncation of the first 7 residues was detected consistently across all the samples prepared from different batches. The experimentally confirmed MnxE and MnxF sequences, including the modifications, were used for interpreting all the other experimental data.

Covalent labeling and Computational modeling (Experimental details for Supplementary Fig. 5)
Mnx complex was reacted with acetic anhydride or N-hydroxysuccinimidobiotin (NHS-biotin) to label surface-exposed lysine residues. For labeling with acetic anhydride, a final concentration of 10 mM labeling reagent was added to 10 μL protein ( [...]

Some of the structures had very small contact and large exposed surfaces and were therefore discarded. The theoretical CCS of the top-scoring model was in good agreement with the experimental CCS and is shown in Supplementary Fig. 5 along with the models of the subcomplexes and the monomeric subunits in Mnx. The theoretical CCS values for the models were calculated using IMPACT 17 with the Projection Approximation (PA), scaled by an empirical factor of 1.14 to fit experimental data 18. The theoretical CCS values were also calculated by the Projected Superposition Approximation (PSA), which uses a more sophisticated algorithm than PA to better capture the molecular shape while keeping the calculation at a reasonable speed for large molecules 19. For the MnxEF dimer and hexamer in particular, the CCS values of the models are significantly larger than the experimental values. The discrepancy could be explained by the collapse of the released subcomplexes, which are still folded but may have rearranged after losing their binding partners. Similar behavior has been observed for other large subcomplexes released in SID 20. Therefore, the CCS values of these subcomplexes could not be used as experimental constraints for selecting the models.
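The comparison between model and experimental CCS described above amounts to a scaling step followed by a relative deviation. A minimal Python sketch is given below, assuming the raw PA value has already been computed by a tool such as IMPACT; the function names and all numerical inputs are hypothetical, and only the factor 1.14 comes from the text.

```python
PA_SCALE = 1.14  # empirical scaling factor for PA values quoted in the text

def scaled_pa_ccs(ccs_pa_raw):
    """Scale a raw Projection Approximation CCS (e.g. in nm^2)."""
    return PA_SCALE * ccs_pa_raw

def percent_deviation(ccs_model, ccs_exp):
    """Signed deviation of the model CCS from experiment, in percent."""
    return 100.0 * (ccs_model - ccs_exp) / ccs_exp

# Hypothetical values in nm^2, for illustration only.
model = scaled_pa_ccs(80.0)
print(round(percent_deviation(model, ccs_exp=90.0), 1))
```

A large positive deviation, as reported for the MnxEF dimer and hexamer, means the model is more extended than the ion that was measured, which fits the proposed collapse of the released subcomplexes.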
It is important to note that this is only one of the plausible models that fit the limited number of experimental constraints. Some of the models other than the top hit from the output can also match the experimental CCS with reasonable subunit connectivity and similar overall topology to the structure shown here, but with different structural details and contacts. In particular, the lack of known structures for MnxE and MnxF monomers poses significant challenges for building high-confidence models of the Mnx complex, because differences in the structures of the fundamental units can lead to pronounced differences in the structure of the final protein complex. However, even with the limited information, it can be concluded that the structural model proposed, where MnxE and MnxF form a symmetric hexamer attached to MnxG, is the most reasonable structure given the subunit connectivity from SID results and the experimental size of the Mnx complex. A recent computational study of protein structure proposed two major topologies for protein complexes with two unique subunits and C3 symmetry: a ring structure consisting of three dimers of the two unique subunits, or a triangle structure with a homo-trimer of the unique subunit in the center 21. The lack of strong MnxE3 or MnxF3 trimers in the released subcomplexes favors the topology of a ring structure with alternating repeats of MnxE and MnxF, which is in agreement with the highly scored models from the docking simulations. To generate models at higher confidence in the absence of high-resolution data from conventional techniques, it is essential to obtain more constraints from experiments. We will pursue this information through additional labeling experiments, crosslinking, and computational efforts exploring larger conformational space.
Our initial crosslinking experiments with bis(sulfosuccinimidyl) 2,2,4,4-glutarate (BS2G) and bis(sulfosuccinimidyl)suberate (BS3) did not generate confident identifications for inter-subunit crosslinks due to low efficiency of crosslinking. Crosslinking experiments with paraformaldehyde and glutaraldehyde were also not able to effectively capture multimers of MnxE and MnxF, and using high concentrations of these reagents resulted in excessive crosslinking and caused precipitation.