Identification and characterization of multiple rubisco activases in chemoautotrophic bacteria

Ribulose-1,5-bisphosphate carboxylase/oxygenase (rubisco) is responsible for almost all biological CO2 assimilation, but forms inhibited complexes with its substrate ribulose-1,5-bisphosphate (RuBP) and other sugar phosphates. The distantly related AAA+ proteins rubisco activase and CbbX remodel inhibited rubisco complexes to effect inhibitor release in plants and α-proteobacteria, respectively. Here we characterize a third class of rubisco activase in the chemolithoautotroph Acidithiobacillus ferrooxidans. Two sets of isoforms of CbbQ and CbbO form hetero-oligomers that function as specific activases for two structurally diverse rubisco forms. Mutational analysis supports a model wherein the AAA+ protein CbbQ functions as the motor and CbbO is a substrate adaptor that binds rubisco via a von Willebrand factor A domain. Understanding the mechanisms employed by nature to overcome rubisco's shortcomings will increase our toolbox for engineering photosynthetic carbon dioxide fixation.

The key photosynthetic CO2-fixing enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (rubisco) is renowned for its slow kinetics and poor substrate specificity, making it a long-standing target for crop improvement efforts 1,2. It is accepted that the efficiency of photosynthesis could be improved by taking advantage of the natural variability in rubisco kinetic properties 3. However, attempts so far to produce 'superior' enzymes in plant chloroplasts have been foiled by an incomplete appreciation of the complex suite of accessory proteins that govern rubisco's biogenesis and maintenance 4,5. Hence a thorough understanding of such factors will be essential 6-8.
Plants encode one highly conserved so-called green-type form I enzyme consisting of eight ~52 kDa large subunits and eight ~15 kDa small subunits (L8S8) 9. In contrast, proteobacteria often possess a multitude of diverse rubiscos, including both red- and green-type form I enzymes (L8S8) and form II rubisco, which lacks small subunits and generally forms an (L2)x oligomer 10,11.
All rubiscos have a propensity to catalyse a side reaction with oxygen, which necessitates the energetically wasteful photorespiratory pathway 12. One strategy that evolved to minimize the oxygenation reaction involved increasing the CO2/O2 specificity, albeit at the cost of a reduced catalytic turnover rate 13. The increased structural rigidity associated with higher specificity rubiscos correlates with a tendency of the enzyme to form dead-end inhibited complexes with sugar phosphates 14,15. Physiologically relevant inhibitors include the substrate ribulose-1,5-bisphosphate (RuBP), misfire products such as xylulose-1,5-bisphosphate, and regulatory compounds such as carboxyarabinitol 1-phosphate 16.
To become functional, a conserved active-site lysine needs to become carbamylated by a non-substrate CO2, followed by co-ordination of a Mg2+ ion 17. Inhibitors bind to the active sites of both the activated (ECM) and inactive form (E) of the enzyme. To maintain photosynthetic CO2 fixation, a group of molecular motors known as the rubisco activases evolved to conformationally remodel inhibited rubisco complexes, leading to release of the inhibitors. So far two distantly related AAA+ (ATPases associated with various cellular activities) proteins, rubisco activase (Rca) and CbbX, have been described to act on plant green-type and α-proteobacterial red-type form I rubisco, respectively 7,18-21. Like most AAA+ proteins, both Rca and CbbX can form ring-shaped hexamers, but they function via distinct mechanisms. The current model for CbbX function involves destabilization of the inhibited rubisco active site via transient threading of the rubisco large subunit C terminus through the pore of the hexamer, analogous to the mechanism of the intensely studied AAA+ proteins ClpB or ClpX 22,23. Whereas CbbX function is completely dependent on a conserved pore-loop tyrosine and the rubisco large subunit C terminus, Rca does not possess a typical pore-loop 1 and activase activity is tolerant of rubisco large subunit C-terminal modifications 24. Instead an interaction between two surface-displayed large subunit residues (positions 89 and 94) and a helix insertion in the α-helical subdomain of the activase has been established 20,25-27. Nevertheless, mutational analysis of predicted pore loops does support a role in Rca function 20. Following active-site remodelling, the released inhibitors can then be degraded by specific sugar phosphatases 28,29.
Chemoautotrophic proteobacteria that oxidize sulphur and other inorganic substrates use rubisco to perform 'dark' CO2 fixation in diverse habitats including the dark ocean 30, but their genomes generally do not encode Rca or CbbX. Instead two open reading frames, cbbQ and cbbO, are regularly found directly downstream of the rubisco genes in multiple operons 11 (Fig. 1a). Co-expressing rubisco with CbbQ and/or CbbO in Escherichia coli was reported to lead to higher rubisco activities in cell extracts and more active purified rubisco, suggesting the proteins' involvement in some form of posttranslational rubisco modification 31,32. Subsequent similar co-expression experiments did not observe such effects 33. CbbQ sequences are highly conserved (Supplementary Fig. 1) and belong to the large, but poorly studied MoxR group of AAA+ ATPases. CbbQ proteins are very short (~260 residues), essentially comprising a single AAA+ module. The MoxR group is widely distributed in Archaea and bacteria and believed to represent a new class of molecular chaperones 36-38. Incidentally, the best-studied MoxR protein, RavA, has been shown to relieve inhibition of lysine decarboxylase by the small-molecule ppGpp 37. In many cases this group of proteins is associated with a downstream gene encoding a protein containing a von Willebrand Factor A (VWA) domain. Accordingly the ~770 residue CbbO proteins contain a C-terminal VWA domain with a perfect metal-ion-dependent adhesion site (MIDAS) 39 (Supplementary Fig. 2). This well-studied domain generally mediates protein-protein interactions via a carboxylate sidechain 40,41. The sequence preceding the VWA domain (~550 residues) is poorly conserved and has no discernible homology to any characterized proteins 33.
Here we demonstrate using pure components that CbbQ and CbbO isoforms form hetero-oligomeric complexes that function as genuine and specific activases of two distinct rubisco enzymes. Extensive site-directed mutagenesis indicates a conserved mechanism of activation exhibiting both similarities and differences to the Rca and CbbX systems. Understanding the diversity of accessory proteins that has evolved to overcome intrinsic limitations of the key enzyme rubisco will empower synthetic biology approaches aimed at increasing the productivity of crops and other photosynthetic systems.

Results
CbbQ and CbbO isoforms form hetero-oligomeric complexes. We wanted to test the hypothesis that the cbbQ and cbbO genes encode a third convergently evolved rubisco activation system. Therefore, E. coli was used to recombinantly produce the proteins encoded by form I and form II rubisco operons in Acidithiobacillus ferrooxidans (Fig. 1a).
Pure form I (AfLS) and form II rubisco (AfM) were obtained in functional form. Gel filtration and native PAGE supported a typical L8S8 hexadecamer for AfLS, whereas AfM appeared significantly larger than the dimeric form II rubisco from Rhodospirillum rubrum (RrM) (Fig. 1b,c). Satagopan et al. 10 have recently described a form II rubisco from Rhodopseudomonas palustris (RpM) that crystallized as a hexamer. We found that RpM and AfM behaved identically in gel-filtration and sedimentation analysis (Fig. 1c, Supplementary Figs 3 and 4). Using a combined Stokes radius/sedimentation analysis 42,43, masses of 313 kDa (AfM) and 308 kDa (RpM) were calculated, supporting a hexameric oligomeric state for both enzymes (Table 1). The elution volume of AfM from the sizing column was independent of protein concentration and of whether the tight-binding inhibitor carboxyarabinitol-1,5-bisphosphate (CABP) 44 was bound (Fig. 1d). These results indicate that in solution AfM occupies a single well-defined oligomeric state, in contrast to the homologue from Methanococcoides burtonii, which has been shown to populate multiple oligomeric states 45.
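The inference from native mass to oligomeric state can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: the native masses (313 and 308 kDa) come from the text, whereas the subunit mass of 51 kDa is an assumed round figure for a form II large subunit, and the function name `estimate_subunit_count` is hypothetical.

```python
def estimate_subunit_count(native_mass_kda: float, subunit_mass_kda: float) -> float:
    """Return the ratio of native complex mass to single-subunit mass."""
    if native_mass_kda <= 0 or subunit_mass_kda <= 0:
        raise ValueError("masses must be positive")
    return native_mass_kda / subunit_mass_kda


# Native masses reported in the text (kDa).
NATIVE_MASSES_KDA = {"AfM": 313.0, "RpM": 308.0}

# ASSUMPTION: approximate mass of one form II large subunit (kDa).
# This value is illustrative and is not stated in the text.
ASSUMED_SUBUNIT_MASS_KDA = 51.0

if __name__ == "__main__":
    for name, mass in NATIVE_MASSES_KDA.items():
        ratio = estimate_subunit_count(mass, ASSUMED_SUBUNIT_MASS_KDA)
        # Round to the nearest whole number of subunits.
        print(f"{name}: ratio {ratio:.2f}, nearest integer {round(ratio)}")
```

Under this assumed subunit mass, both ratios round to six, in line with the hexameric state proposed above. A different assumed subunit mass would shift the ratio, so the estimate is only as good as that input.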
Whereas it was possible to purify both isoforms of CbbQ separately (AfQ1 and AfQ2), CbbO isoforms were insoluble unless co-expressed with their respective CbbQ isoform (Supplementary Fig. 5). Both CbbQ and CbbO isoforms were produced as His6-ubiquitin fusion proteins 46, which resulted in higher yields of soluble protein. This approach permitted an initial immobilized metal ion affinity chromatography capture step to be incorporated. Precise cleavage of the ubiquitin moiety was possible using a specific protease 46. Following cleavage, a significant proportion of the recombinant protein precipitated in all cases; however, the soluble fraction could be purified further using ion-exchange chromatography and gel filtration (a representative purification of Q2O2 is shown in Supplementary Fig. 6a). CbbQ and CbbO co-purified using ion-exchange and size-exclusion chromatography, indicating the formation of a stable hetero-oligomeric complex (Fig. 2a). To test whether the presence of nucleotide altered the relative proportion of CbbQ and CbbO subunits in the isolated complexes, we performed a small-scale purification of Q2O2, which included 1 mM Mg-ATP in all buffers. The purification profile was identical in the presence and absence of nucleotide (Supplementary Fig. 6b).
In EM, AfQ2 appeared as ring-shaped particles displaying a hexameric arrangement typical for AAA+ ATPases (Fig. 2b, Supplementary Fig. 7), and Stokes radius/sedimentation analysis yielded a molecular mass of 173 kDa, consistent with a monodisperse hexamer (Fig. 2a, Supplementary Figs 3 and 4, Table 1). AfQ1 eluted in the void volume of the sizing column and likely formed soluble aggregates (Fig. 2a). EM of Q1O1 and Q2O2 resulted in heterogeneous particles that could not be processed (data not shown). However, EM analysis of a complex formed with AfQ2 and C-terminally truncated O2 (Q2O2ΔC444, residues 1-444) allowed identification and selection of particles (Supplementary Fig. 8a,b). Reference-free 2D classification revealed an AfQ2 ring-shaped particle with an additional density emerging radially from the ring in a fraction of unbiased class averages (Fig. 2c), consistent with the interaction of an AfQ2 hexamer with the O2ΔC444 fragment. Both combined Stokes radius/sedimentation analysis (Table 1) and quantitative densitometry (Fig. 2d and Supplementary Fig. 8c) supported a CbbQ6CbbO1 stoichiometry for Q1O1 and Q2O2. Densitometry of Q2O2ΔC444 was not consistent with this stoichiometry, but yielded a ratio of 6 CbbQ subunits to 3.5 CbbO fragments assuming equal binding of the
The CbbQO complexes function as rubisco activases. For each rubisco form we prepared and compared the activities of three forms: the fully activated enzyme (ECM); the apoenzyme inhibited by its substrate RuBP (ER); and the activated enzyme bound to the tight-binding transition state analogue CABP (ECMC) 44 (Fig. 3a).
As expected for an activase-requiring rubisco, the form I (AfLS) ER and ECMC complexes were inactive and ECM displayed a linear activity corresponding to 5.2 s−1 (Fig. 3b,c). When either ER or ECMC was assayed in the presence of Q1O1, rubisco activity increased, demonstrating a rubisco activase function (Fig. 3b,c). Inclusion of 5% v/v of the crowding agent polyethylene glycol 3,350 (PEG) approximately doubled activase activity (Fig. 3b,c), as observed for the plant Rca system previously 47. Due to loose inhibitor binding, form II rubiscos generally do not form stable complexes with RuBP or other physiologically relevant inhibitors, but readily activate in assay buffer 15,48,49, which would make an activase unnecessary. Accordingly, no activase has been described for form II rubiscos. Consistent with these considerations, when assaying AfM ER at 20 mM NaHCO3 the inhibited complex spontaneously activated to reach full activity after 100 s (Supplementary Fig. 9a). However, when performing rubisco assays at low CO2 concentrations (5 mM NaHCO3) we observed that the activities of both AfM ECM and ER rapidly converged to a low rate (0.55 s−1), suggesting the establishment of an equilibrium of ECM and inactive complexes (Fig. 3d). Assaying either ER or ECM in the presence of Q2O2 resulted in an approximately fivefold increase in linear rubisco activity to achieve a carboxylation rate of 2.7 s−1 (Fig. 3d). This is consistent with an active removal of inhibited ER complexes from the equilibrium, which then favours the active ECM form. In contrast to RuBP, AfM bound to the transition state analogue CABP (ECMC) formed stable and inactive complexes (Fig. 3e). Inclusion of Q2O2 in the assay resulted in rapid activation and a rubisco activity equivalent to ECM in the presence of Q2O2 (Fig. 3e). Q2O2 was also functional at high CO2 (20 mM NaHCO3) as judged by its ability to activate the stably inhibited ECMC complex (Supplementary Fig. 9b).
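The "approximately fivefold" figure for AfM follows directly from the two rates quoted above. As a minimal sketch of that arithmetic (the helper name `fold_change` is hypothetical, and only the two rates are taken from the text):

```python
def fold_change(stimulated_rate: float, basal_rate: float) -> float:
    """Return how many times larger the stimulated rate is than the basal rate."""
    if basal_rate <= 0:
        raise ValueError("basal rate must be positive")
    return stimulated_rate / basal_rate


# Rates quoted in the text for AfM at 5 mM NaHCO3 (units: s^-1).
AFM_RATE_WITHOUT_ACTIVASE = 0.55
AFM_RATE_WITH_Q2O2 = 2.7

if __name__ == "__main__":
    ratio = fold_change(AFM_RATE_WITH_Q2O2, AFM_RATE_WITHOUT_ACTIVASE)
    print(f"Q2O2 increases AfM activity {ratio:.1f}-fold")
```

The ratio comes to about 4.9, which matches the rounded value given in the text.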
In contrast to Q1O1, activation of form II rubisco by Q2O2 was too fast to resolve the gradual increase in rubisco activity (Fig. 3d,e). Reducing the concentration of Q2O2 resulted in rapid activation of a fraction of active sites, followed by linear rubisco activity (Supplementary Fig. 9c). This observation can be explained by the continual loss of newly formed ECM sites by decarbamylation and/or renewed inhibition (by CABP or RuBP) when activase concentrations are sub-saturating.
The ability of both Q1O1 and Q2O2 to remove the nonphysiological tight-binding CABP from the active site of their respective target rubiscos is remarkable, as Rca from spinach is unable to activate CABP-inhibited spinach rubisco 50 .
AfQ2 and Q2O2ΔC444 complexes were unable to activate rubisco, highlighting the importance of the hetero-oligomer in contrast to the homo-oligomeric CbbX and Rca activases (Supplementary Fig. 9d). Q1O1 was specific for AfLS and likewise Q2O2 could only activate AfM (Supplementary Fig. 9e,f). Having established the QO complexes as rubisco activases we then examined their ATPase function.
Both AfQ2 and Q2O2ΔC444 displayed a low ATPase activity of <1 min−1 protomer CbbQ−1 (Supplementary Fig. 9g). In contrast both Q1O1 and Q2O2 hydrolysed ATP with a specific activity of ~3 min−1 protomer CbbQ−1. Therefore, the interaction of the full-length CbbO subunit with the CbbQ hexamer induces a conformational change that increases the basal ATP hydrolysis rate of CbbQ. The ATPase activity of both QO systems (but not AfQ2 or Q2O2ΔC444) was specifically stimulated by their respective ER and ECMC complexes in a concentration-dependent manner, resembling the behaviour of CbbX 19 (Fig. 4a,b, Supplementary Fig. 9g). In contrast, the ATPase activity of Rca is not stimulated by inhibited rubisco complexes 51.
Activated forms (ECM) of both rubiscos did not result in stimulation (Fig. 4c). The magnitude of the stimulation varied greatly, with saturating concentrations of stably inhibited rubisco increasing Q1O1 and Q2O2 ATPase activity 2- and 20-fold, respectively (Fig. 4a,b). These results demonstrate that the CbbQO complexes specifically recognize inhibited rubisco complexes and that, similar to other AAA+ ATPases 19,52, a productive interaction with their protein substrate results in conformational changes that accelerate ATP hydrolysis and thus assist remodelling.
The mechanism of QO-mediated rubisco activation is conserved.
To probe the mechanism of QO-mediated rubisco activation we purified and characterized a series of amino-acid exchanges in both the rubisco and activase components of the system (Supplementary …). In all cases the fully activated (ECM) rubisco activity of the rubisco mutants tested was at least 70% of the corresponding wild-type activity, indicating catalytic performance was largely unaffected (Supplementary Figs 11d,f and 12d). For the AAA+ ATPase AfQ2 we mutated the Walker A and Walker B motifs of the nucleotide-binding domain. Consistent with a motor function of CbbQ, these mutations abolished both ATPase and activase activities of the Q2O2 mutants (Fig. 5a,b).
For CbbX and other AAA+ ATPases the so-called pore-loop 1, which lines the pore of the ring-shaped hexamer, has been shown to be important for remodelling protein substrates 19,22,23. This is generally achieved by threading a peptide through the pore. CbbQ has a conserved pore-loop 1 motif, but the aromatic residue attributed to threading in other systems is replaced by a leucine (Supplementary Fig. 10a). A number of predicted AfQ2 pore-loop 1 mutants maintained activase functionality (Supplementary Fig. 10c). The W83L and W83F mutants displayed three- and twofold increased basal ATPase activities, respectively. Although the D78A substitution resulted in a loss of activase function, a concomitant abolition of ATPase activity makes this result inconclusive (Supplementary Fig. 10b). Analysis of less-conservative amino-acid exchanges in this region was hampered by the insolubility of such mutants.
In both CbbX and Rca, pore-loop mutations leading to ATPase-active, but activase-compromised proteins have been identified 19,20. In contrast, our data so far do not support a role for the central pore of the AAA+ hexamer in the mechanism of CbbQO-mediated rubisco activation.
To understand the contribution of the CbbO subunit we mutated all five predicted CbbO MIDAS motif residues in the VWA domain of Q2O2 to alanine (Supplementary Fig. 2). In other systems, such as the integrins, the MIDAS motif mediates protein-protein interactions via a metal ion (usually Mg2+), which is co-ordinated by the five conserved residues 40. All MIDAS mutants tested were successfully purified as the Q2O2 hetero-oligomeric complex (Supplementary Fig. 11a). Consistent with an important role in rubisco-activase interaction, all mutants but one (Q2O2(T656A)) were ATPase active, but unable to activate rubisco (Fig. 5a,b). In addition their ATPase activity was no longer stimulated by inhibited AfM (Fig. 5a), which indicates that the mutant activases could no longer sense their substrate.
The MIDAS motif generally interacts with the side chain of an acidic residue on the interacting protein, which provides a sixth co-ordination site for the MIDAS ion 39,40. Following extensive mutagenesis of surface-exposed AfLS Glu and Asp residues, we discovered that activation of AfL(D82P)S by Q1O1 was eliminated (Fig. 5d). Inhibited AfL(D82P)S did not stimulate the ATPase activity of Q1O1 (Fig. 5c). Remarkably, when the corresponding residue in AfM (E75) was mutated to alanine, activation of the resultant form II enzyme was reduced by 80% and stimulation of Q2O2 ATPase activity was abolished (Fig. 5e,f). It is striking that the corresponding residue in higher plant large subunits is 89, which is essential for rubisco activation by Rca 26 and interacts with a helical insertion in the small subdomain of the Rca AAA+ module 20,25,53. Mutation to alanine of directly adjacent acidic residues (E81 and D74, respectively) did not affect the enzymes' ability to be activated by their activases (Fig. 5d,f) and underlined the specificity of the interaction.
These results support a model where CbbO functions as an essential substrate adaptor that binds a specific surface-exposed carboxylate group of inhibited rubisco via its VWA domain.
A sequence alignment of rubisco large subunit sequences associated with cbbQ and cbbO revealed the presence of an HK/R motif at their C terminus not present in other form I and form II sequences (Supplementary Fig. 11b). For both AfLS and AfM, deleting one or two residues (ΔC1 and ΔC2), or adding one alanine residue to the C terminus of the large subunit (HKA and HRA), abolished activase function and ATPase stimulation (Fig. 5c-f). In contrast, inserting an alanine upstream of the HK/R motif reduced, but did not eliminate, the ability of rubisco to be activated. Therefore, the presence and relative position of the C-terminal HK/R motif are critical. Additional analysis showed that the final residue could be substituted without a marked loss of function, whereas the penultimate histidine residue could only be partially replaced by aromatic residues (Fig. 5c-f, Supplementary Fig. 12).
A peptide corresponding to the final 11 residues of the AfM C terminus was found to strongly stimulate the ATPase activity of Q2O2, albeit at much higher concentrations (in the mM range) than required for stimulation by inhibited rubisco complexes (Supplementary Fig. 12e). In contrast, an 11-mer including the identified interacting acidic surface residue of AfM (E75) did not result in stimulation. Hence the C-terminal peptide in isolation can interact productively with the CbbQO complex, but additional contacts are required for a high-affinity interaction.
These results show that the CbbQO activases, like CbbX 19 , also function via manipulation of the large subunit C terminus. However, the interaction between the rubisco large subunit C terminus and CbbX is less specific than for CbbQO, since in the former system both the deletion of one C-tail residue or substitution of the only conserved residue was tolerated 19 . The relatively stringent requirements exhibited by CbbQO activation concerning the architecture of the C terminus may indicate that in this instance the C terminus binds to a specific pocket in the CbbQO complex.

Discussion
In this work we demonstrate the existence of a third class of rubisco activase in chemoautotrophic bacteria and for the first time extend this phenomenon to the distantly related form II rubiscos. Our observations suggest that both sugar-phosphate-mediated inhibition and, consequently, the requirement for activase proteins are remarkably widespread among rubiscos, raising the possibility of universality. If so, then we can expect more convergently evolved rubisco activation systems to be discovered in organisms whose genomes do not encode Rca, CbbX or CbbQO.
A. ferrooxidans possesses multiple rubisco-encoding operons: the two studied here and an additional gene cluster encoding carboxysomal gene products, form I rubisco and a third set of cbbQ and cbbO genes 54. In Hydrogenovibrio marinus it has been convincingly shown that three similar operons are regulated in response to CO2 concentrations 55, and this also appears to be the case in A. ferrooxidans 56. At high CO2 concentrations form II rubisco prevails, whereas when CO2 is limiting, the form I enzymes are present. On the basis of our biochemical data, Q2O2 function is predicted to be less important at high CO2 concentrations, since the ER complex is unstable (Supplementary Fig. 9a). However, it is likely, although so far unexplored, that other inhibitors, such as rubisco misfire products 16, exist in chemoautotrophic bacteria. Some of these may form complexes that resemble ECMC and thus necessitate Q2O2 function at high CO2 concentrations.
In spite of the different quaternary structure and low sequence identity (33%) between form I and form II rubisco, the similar mutant phenotypes demonstrate that the mechanism of QO-mediated rubisco activation is conserved. Our results support an initial model where a CbbQ6O1 complex interacts with the inhibited rubisco via the MIDAS motif of the VWA domain of CbbO, by engaging the identified acidic surface residue Glu 75 (form II) or Asp 82 (form I) (Fig. 6). The residue in plant rubisco that corresponds to the identified acidic surface residue, Pro 89, has been demonstrated to be necessary for activation by Rca 26, which indicates mechanistic similarities in rubisco activation. As observed for CbbX, a productive interaction between the activase and rubisco results in a stimulation of ATPase activity (Fig. 4), which powers conformational remodelling of the inhibited enzyme. As a result the active site opens and the inhibitory sugar is released. As in CbbX, manipulation of the C terminus of the large subunit is necessary (Fig. 5), but so far no evidence for an involvement of the CbbQ pore has been obtained (Supplementary Fig. 10). It remains to be established whether direct CbbQ-rubisco contacts are involved in the mechanism, which would allow direct force transmission from the ATPase to the substrate and permit a passive, binding role for CbbO. Alternatively the mechanical force leading to active site disruption could be transmitted through the CbbO linker to the VWA domain. We favour a mechanism similar to that proposed for the ribosome maturation protein Rea1 or midasin 57. Rea1, the largest yeast protein, possesses six AAA+ modules connected to one VWA domain at its C terminus via a long linker and thus resembles the organization of the CbbQO hetero-oligomer. In comparative genomics, it has been noted that fusion proteins of different domains can predict the subunit organization of related protein complexes 58.
Since the AAA+ module of Rea1 has been placed in the same group as CbbQ in two independent classification studies 59,60, it may predict the subunit organization of CbbQO. Rea1 removes the preribosomal factor Rsa4 and the Rix1-subcomplex from the pre-60S subunit. The Rea1 VWA domain binds Rsa4 via its MIDAS motif, whereas the AAA moiety is bound to the Rix1-subcomplex. A force generated by ATP hydrolysis then disrupts the complex, leading to release of Rsa4, the Rix1-subcomplex and Rea1 from the pre-60S particle 57. It will be interesting to see whether disruption of the rubisco active site by CbbQO proceeds via such a mechanism. Since CbbQO activation involves conserved contact points that have been shown to be important for activation by Rca and CbbX respectively (residue 89 and the C terminus), it is tempting to speculate that the active site of the rubisco large subunit is subjected to a very similar force application during activation by the three different systems. Since rubisco large subunit secondary structure is highly conserved 9, the conformational changes that lead to an opening of the inhibited active site are likely to be similar despite large primary sequence divergence. In contrast, the detailed mechanism of force generation and propagation by the three activases appears to be dissimilar, as could be expected for convergently evolved systems.
The MoxR AAA+ family is poorly studied, but in most cases is associated with a VWA-domain-containing protein 38. On the basis of our results we hypothesize that the association of an AAA+ hexamer with one VWA protein subunit will be a general theme. This hetero-oligomeric arrangement requires an asymmetric binding mode during complex assembly. It can be envisaged that interaction of a CbbO monomer with an AAA+ hexamer leads to a conformational change that prevents the binding of additional CbbO subunits.
We predict that in similar systems the VWA protein will bind a target protein via its MIDAS motif, and that the energy of ATP hydrolysis will then be employed to conformationally remodel the substrate. Consistent with these ideas the VWA protein ViaA stimulates the ATPase activity of its associated MoxR AAA+ protein RavA, suggesting an interaction 61. Likewise the viral MoxR AAA+ protein p618 has been shown to interact with the associated VWA protein p892 (ref. 62).
Early work by Hayashi et al. 31,32 suggested increased rubisco activity in E. coli lysates co-expressing rubisco and associated CbbQ and CbbO proteins. This increased activity may possibly be explained by removal of so far unidentified inhibitors present in the E. coli cytoplasm. However, the increased activity was reported for the CbbQ and CbbO proteins expressed separately 32 and specificity was not observed (for example, both CbbQ1 and CbbQ2 co-expression led to increased form I and form II rubisco activity 31 ). In contrast our in vitro data show the entire CbbQO complex is required for functionality. So far, we have not been able to identify CbbO proteins that express solubly in E. coli in the absence of CbbQ.
In summary, our work illuminates another facet of the evolutionary innovations developed by organisms to overcome the well-documented shortcomings of rubisco. A thorough mechanistic understanding of these processes will allow engineering strategies towards optimized CO2 fixation to be implemented, with consequences in fields such as agriculture, bioenergy and CO2 sequestration.

Plasmids.
A. ferrooxidans ATCC 23270 (ref. 63) genes were amplified from genomic DNA obtained from the ATCC. The cbbL (AFE_3051) and cbbS (AFE_3052) genes are positioned in tandem and were amplified together and cloned between the NdeI-HindIII sites of the vector pET30b to yield pET30bAfcbbLS. The form II rubisco gene cbbM (AFE_2155) and the activase genes cbbQ1 (AFE_3053) and cbbQ2 (AFE_2156) were cloned between the SacII/HindIII sites of the vector pHue, allowing cleavage of His6-ubiquitin fusions at their native N termini 46. This resulted in plasmids pHueAfcbbM, pHueAfcbbQ1 and pHueAfcbbQ2, respectively. pHueRrcbbM and pHueRpcbbM, encoding form II rubisco of Rhodospirillum rubrum (RrM) and Rhodopseudomonas palustris (RpM), were constructed similarly, but the former used pTrcrbcM 64 as template and was cloned between the BamHI/HindIII sites of pHue, and RpcbbM was synthesized by Genscript. The cbbO1 (AFE_3054) and cbbO2 (AFE_2157) genes were first amplified and cloned into the pHue vector using SacII and HindIII restriction sites, followed by positioning the XbaI-HindIII fragments into the pBAD33 vector 65 to give pBAD33UbAfcbbO1 and pBAD33UbAfcbbO2. pBAD33Ub-HA-AfcbbO2 and pBAD33Ub-FLAG-AfcbbO1 were constructed in the same way, but using pHueHA and pHueFLAG instead of pHue (these vectors have sequences encoding the HA and the FLAG epitopes inserted 5′ of the SacII site and 3′ of the protease cleavage site, resulting in N-terminal epitope tags). The QuikChange protocol (Stratagene) was used to introduce point mutations, deletions or insertions into the expression plasmids as desired. The plasmids and primers used are listed in Supplementary Tables 1 and 2, respectively. All protein-encoding sequences were verified by DNA sequencing.
Protein expression and purification. All protein concentrations were quantified using the Bradford assay with bovine serum albumin as the standard. Immunoblotting used monoclonal mouse anti-FLAG M2 (Sigma Aldrich, F1804) and anti-HA (Proteintech, 66006) antibodies at 1:1,000 dilution.
Plasmids encoding His6-ubiquitin fusion proteins (AfM and mutants, RpM, RrM, AfQ1 and AfQ2) were transformed into E. coli BL21 (DE3) cells and grown in Luria-Bertani (LB) medium at 37°C followed by induction with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at 23°C. The collected cells were lysed in buffer A (20 mM Tris-HCl, pH 8.0, 50 mM NaCl) containing 10 mM imidazole and 0.3 mg ml−1 lysozyme for 30 min on ice. Cells were disrupted by ultrasonication after addition of 1 mM phenylmethanesulfonyl fluoride. The supernatant obtained from centrifugation (40,000 × g, 45 min, 4°C) was applied to Ni2+-nitrilotriacetic acid resin (Pierce) followed by cleavage of the ubiquitin moiety to yield the native N terminus, as described previously 46. Any precipitated protein was removed at this stage by centrifugation. The protein solution was applied to a pre-equilibrated Mono Q 10/100 GL column, and eluted with a linear salt gradient to 0.5 M NaCl. Fractions containing the protein of interest were combined, concentrated and applied to a Superdex 200 gel-filtration column equilibrated with buffer A. The purest protein fractions (>95% pure as judged by SDS-PAGE) were concentrated, supplemented with 5% glycerol, flash-frozen in liquid nitrogen and stored at −80°C.
E. coli BL21 cells transformed with pET30bAfcbbLS encoding wild-type and mutant proteins were grown and induced as described above. The cells were resuspended in buffer A, incubated with lysozyme and lysed by ultrasonication. The soluble supernatant was applied to a Source30Q column (GE Healthcare) pre-equilibrated with buffer A and proteins were eluted with a linear salt gradient to 0.5 M NaCl. Fractions containing AfLS were combined, concentrated and applied to a Superdex 200 16/60 size-exclusion column. The purest fractions were pooled, concentrated, supplemented with 5% v/v glycerol, flash-frozen in liquid nitrogen and stored at −80°C.
AfCbbQ1 and AfCbbO1 were co-expressed in E. coli and co-purified as Q1O1 His6-ubiquitin fusion proteins. The cells harbouring both pHueAfcbbQ1 and pBAD33UbAfcbbO1 were grown in LB medium and induced with both 0.5 mM IPTG and 0.2% w/v L-arabinose at 23°C. Soluble lysate was produced and Ni-NTA affinity chromatography was performed as described above. After overnight cleavage of the ubiquitin moiety at 23°C, the soluble proteins were applied to a MonoQ column followed by gel filtration using a Superdex 200 column. AfCbbQ2 and AfCbbO2 (and all Q2O2 mutants) were co-expressed and co-purified in the same way as the Q1O1 protein, except that 5 mM MgCl2 and 5% glycerol were added to the buffers used for the Ni-NTA agarose and anion-exchange steps. Purified proteins were concentrated to ~5 mg ml⁻¹, supplemented with 5% glycerol, flash-frozen in liquid nitrogen and stored at −80°C.
Determination of native molecular weight. Analytical gel filtration was performed using a Superdex 200 PC3.2/30 column with buffer A as the eluant. For glycerol gradient sedimentation analysis, 10 ml 5-30% glycerol gradients in buffer A were overlaid with 100 µl of protein solution in buffer A (the concentration of each complex was 1 mg ml⁻¹). The gradients were centrifuged (35,000 r.p.m., 16 h, 4°C) using a SW 41 Ti rotor (Beckman Coulter). Fractions of 500 µl were collected and analysed by SDS-PAGE and silver staining. Standards for both gel filtration and sedimentation analysis were obtained from the high-molecular-weight gel filtration calibration kit (GE Healthcare). Native molecular weight was calculated using the equation M = 4,205 (S·Rs), where S is the sedimentation coefficient in Svedberg units and Rs is the Stokes radius in nanometers 43.
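As a minimal sketch of how the relation M = 4,205 (S·Rs) is applied, the snippet below computes a native molecular weight from a sedimentation coefficient and a Stokes radius. The function name and the input values are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
# Illustrative sketch only: native molecular weight from M = 4,205 * S * Rs.
# The function name and the example inputs are hypothetical.

SIEGEL_MONTY_FACTOR = 4205.0  # constant from the equation quoted in the text


def native_molecular_weight(s_svedberg: float, stokes_radius_nm: float) -> float:
    """Return the native molecular weight in Da.

    s_svedberg       -- sedimentation coefficient in Svedberg units
    stokes_radius_nm -- Stokes radius in nanometres
    """
    if s_svedberg <= 0 or stokes_radius_nm <= 0:
        raise ValueError("S and Rs must both be positive")
    return SIEGEL_MONTY_FACTOR * s_svedberg * stokes_radius_nm


# Hypothetical inputs: S = 10 Svedberg, Rs = 5 nm
example_mw = native_molecular_weight(10.0, 5.0)
print(f"Estimated native molecular weight: {example_mw / 1000:.1f} kDa")
```

In practice S would be read off the glycerol gradient and Rs off the gel-filtration column, each by interpolation against the calibration standards.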
Enzymatic assays. All ATPase activity and rubisco activation assays were performed spectrophotometrically at 25°C. ATPase activity was assayed using a coupled enzymatic assay that monitored the oxidation of NADH as described previously 19,66. Rubisco activity was assayed using the spectrophotometric assay essentially as described 67 with minor modifications. Briefly, the measurements were carried out in 100-µl reactions containing 100 mM Tricine-NaOH pH 8.0, 5 mM MgCl2, coupling enzymes 67, 0.5 mM NADH, 10 mM phosphocreatine, 5 or 20 mM NaHCO3, 1 mM each of DTT, ATP and RuBP, 0.27 µM CbbQO oligomer, and ECM, ER and ECMC as stated in the figure legends. Ribulose-1,5-bisphosphate was synthesized enzymatically from ribose-5-phosphate 68 and purified using anion-exchange chromatography 69. ECM was obtained by incubating rubisco (20-100 µM active sites depending on the desired final concentration) in buffer A supplemented with 40 mM NaHCO3 and 10 mM MgCl2 for 10 min at 25°C, while ECMC was prepared by further addition of a fourfold molar excess of carboxypentitol bisphosphate (a 1:1 mixture of the tight-binding inhibitor CABP and the loose-binding stereoisomer carboxyribitol-1,5-bisphosphate) to the ECM complex for 10 min. The ER complex was obtained by incubating rubisco in buffer A with 4 mM EDTA for 10 min and then adding 0.8 mM RuBP. Peptides (>95% purity) were synthesized by the Peptide Synthesis Core Facility of the School of Biological Sciences, NTU. Relative activase activities were determined by measuring the rubisco activity of ECMC complexes exposed to activase 1 min after assay initiation.
Electron microscopy. Purified proteins at concentrations indicated in the figure legends in 20 mM Tris-HCl pH 8.0, 50 mM NaCl were applied to a carbon-coated TEM grid and stained with 2% (w/v) uranyl acetate. Micrographs were recorded on a FEI T12 transmission electron microscope equipped with a 4K CCD camera (FEI) at a magnification of × 66,350, resulting in a pixel size of 2.11 Å per pixel, under low-dose conditions. Single particles were selected and processed with the EMAN2 image-processing package 70. Particles were CTF corrected with EMAN2 before processing.
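The relationship between magnification and pixel size quoted above can be checked with a short calculation. The sketch below assumes a physical detector pixel of 14 µm; that value is an assumption not stated in the text, chosen because it reproduces the reported 2.11 Å per pixel at × 66,350. The function and variable names are hypothetical.

```python
# Illustrative sketch only: pixel size at the specimen level from the
# calibrated magnification. The 14 um detector pixel is an ASSUMPTION
# (not given in the text); names are hypothetical.

ANGSTROM_PER_MICRON = 1.0e4


def specimen_pixel_size(detector_pixel_um: float, magnification: float) -> float:
    """Return the pixel size at the specimen in angstroms per pixel."""
    return detector_pixel_um * ANGSTROM_PER_MICRON / magnification


pixel_size = specimen_pixel_size(14.0, 66_350)  # assumed 14 um detector pixel
nyquist_limit = 2.0 * pixel_size                # finest resolvable spacing

print(f"Pixel size:    {pixel_size:.2f} A per pixel")
print(f"Nyquist limit: {nyquist_limit:.2f} A")
```

The Nyquist limit of twice the pixel size is the finest spacing the sampling can represent, so it sets a lower bound on any resolution estimate from these micrographs.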
Particles were manually picked for the AfQ2 data set (defocus range 0.8-1.8 µm) because of the preferred orientation of particles on the grid (with top views dominating). A total of 958 particles were used for reference-free unbiased 2D classification and 3D reconstruction. Unbiased 2D classification and initial models clearly suggested a six-fold symmetry similar to other known hexameric AAA+ proteins. The highest quality initial model (based on the EMAN2 criterion) was used for further refinement to obtain a 3D EM map of AfQ2 at 23 Å resolution using the gold-standard criterion (Fourier shell correlation of 0.143) with an imposed six-fold symmetry.
A data set of Q2O2ΔC444 (defocus range 0.6-1.6 µm) containing 12,486 particles was used for reference-free unbiased 2D classification. Selected 2D classes showed an additional density of O2 radiating from one of the subunits of the AfQ2 hexamer when compared with 2D classes of AfQ2 alone. Figure 6 was prepared using PyMOL (www.pymol.org).