Modelling the pyrenoid-based CO2-concentrating mechanism provides insights into its operating principles and a roadmap for its engineering into crops

Many eukaryotic photosynthetic organisms enhance their carbon uptake by supplying concentrated CO2 to the CO2-fixing enzyme Rubisco in an organelle called the pyrenoid. Ongoing efforts seek to engineer this pyrenoid-based CO2-concentrating mechanism (PCCM) into crops to increase yields. Here we develop a computational model for a PCCM on the basis of the postulated mechanism in the green alga Chlamydomonas reinhardtii. Our model recapitulates all Chlamydomonas PCCM-deficient mutant phenotypes and yields general biophysical principles underlying the PCCM. We show that an effective and energetically efficient PCCM requires a physical barrier to reduce pyrenoid CO2 leakage, as well as proper enzyme localization to reduce futile cycling between CO2 and HCO3−. Importantly, our model demonstrates the feasibility of a purely passive CO2 uptake strategy at air-level CO2, while active HCO3− uptake proves advantageous at lower CO2 levels. We propose a four-step engineering path to increase the rate of CO2 fixation in the plant chloroplast up to threefold at a theoretical cost of only 1.3 ATP per CO2 fixed, thereby offering a framework to guide the engineering of a PCCM into land plants.

While previous works have identified essential molecular components for the PCCM 16,[20][21][22][23][24][25][26][27][28][29] , key operating principles of this mechanism remain poorly understood due to a lack of quantitative and systematic analysis. At the same time, there is growing interest in engineering a PCCM into C 3 crops to improve yields and nitrogenand water-use efficiency 30,31 . Key questions are: (1) What is the minimal set of components necessary to achieve a functional PCCM? (2) What is the energetic cost of operating a minimal PCCM?
We model the above enzymatic activities and Ci transport in a spherical chloroplast. We assume that carbonic anhydrases catalyse the bidirectional interconversion of CO 2 and HCO 3 − , producing a net flux in one direction where the two species are out of equilibrium. We consider three chloroplast compartments at constant pH values: a spherical pyrenoid matrix (pH 8, ref. 41 ) in the centre, a surrounding stroma (pH 8, ref. 41,42 ), and thylakoids (luminal pH 6, ref. 43 ) traversing both the matrix and stroma ( Fig. 1b and Supplementary Fig. 2). The flux balance of intracompartment reaction and diffusion and intercompartment exchange sets the steady-state concentration profiles of Ci species in all compartments (Methods). To account for the effect of Ci transport across the cell membrane, we simulate a broad range of surrounding cytosolic Ci pools from which the chloroplast can uptake Ci. We characterize the performance of the modelled PCCM with two metrics: (1) its efficacy, quantified by the computed CO 2 fixation flux normalized by the maximum possible flux through Rubisco; and (2) its efficiency, quantified by the ATP cost per CO 2 fixed (Methods).

A baseline PCCM driven by intercompartmental pH differences.
To identify the minimal components of a functional PCCM, we build a baseline model (Fig. 1c,d), with the carbonic anhydrase LCIB diffuse throughout the stroma, BST channels for HCO 3 − uniformly distributed across the thylakoid membranes, the carbonic anhydrase CAH3 localized to the thylakoid lumen within the pyrenoid, and Rubisco condensed within the pyrenoid matrix. This model lacks the HCO 3 − transporter LCIA and potential diffusion barriers to Ci. We first analyse modelled PCCM performance under air-level CO 2 (10 μM cytosolic); lower CO 2 conditions are discussed in later sections. CO 2 diffusing into the chloroplast is converted to HCO 3 − in the high-pH stroma where the equilibrium CO 2 :HCO 3 − ratio is 1:80 (Methods). Since passive diffusion of HCO 3 − across the chloroplast envelope is very slow, this concentrated HCO 3 − becomes trapped in the stroma. The BST channels equilibrate HCO 3 − across the thylakoid membrane, so HCO 3 − also reaches a high concentration in the thylakoid lumen (Fig. 1c). The low pH in the thylakoid lumen favours a roughly equal equilibrium partition between CO 2 and HCO 3 − ; however, HCO 3 − is not brought into equilibrium with CO 2 immediately upon entering the thylakoid outside the pyrenoid, since no carbonic anhydrase (CA) is present there. Instead, HCO 3 − diffuses within the thylakoid lumen towards the pyrenoid, where CAH3 localized within the pyrenoid radius rapidly converts HCO 3 − back to CO 2 (Fig. 1d). This CO 2 can diffuse across the thylakoid membrane into the pyrenoid matrix. This baseline model, driven solely by intercompartmental pH differences, achieves a pyrenoidal   3 − is transported across the chloroplast membrane by LCIA and across the thylakoid membranes by BST1-3 (referred to as BST henceforth for simplicity). In the acidic thylakoid lumen, a carbonic anhydrase CAH3 converts HCO 3 − into CO 2 , which diffuses into the pyrenoid matrix where the CO 2 -fixing enzyme Rubisco (Rbc) is localized. CO 2 leakage out of the matrix and the chloroplast can be impeded by potential diffusion barriers-a starch sheath and stacks of thylakoids-and by conversion to HCO 3 − by a CO 2 -recapturing complex LCIB/LCIC (referred to as LCIB henceforth for simplicity) in the basic chloroplast stroma. b, A schematic of the modelled PCCM, which considers intracompartment diffusion and intercompartment exchange of CO 2 and HCO 3 − , as well as their interconversion, as indicated in the inset. Colour code as in a. The model is spherically symmetric and consists of a central pyrenoid matrix surrounded by a stroma. Thylakoids run through the matrix and stroma; their volume and surface area correspond to a reticulated network at the centre of the matrix extended by cylinders running radially outward. c, Concentration profiles of CO 2 and HCO 3 − in the thylakoid (dashed curves) and in the matrix/stroma (solid curves) for the baseline PCCM model that lacks LCIA activity and diffusion barriers. Dotted grey line indicates the effective Rubisco K m for CO 2 (Methods). Colour code as in a. d, Net fluxes of inorganic carbon between the indicated compartments. The width of arrows is proportional to flux; the area of circles is proportional to the average molecular concentration in the corresponding regions. The black dashed loop denotes the major futile cycle of inorganic carbon in the chloroplast. Colour code as in a. For c and d, LCIA C -mediated chloroplast membrane permeability to HCO 3 − κ H − chlor = 10 −8 m s −1 , BST-mediated thylakoid membrane permeability to HCO 3 − κ H − thy = 10 −2 m s −1 , LCIB rate V LCIB = 10 3 s −1 and CAH3 rate V CAH3 = 10 4 s −1 (Methods). Other model parameters are estimated from experiments (Supplementary Table 2). CO 2 concentration approximately 2.5 times that found in a model with no PCCM.
The baseline PCCM suffers from pyrenoid CO 2 leakage. The substantial CO 2 leakage out of the matrix in the baseline model ( Fig. 1d) is in part due to the relatively slow kinetics of Rubisco. During the average time required for a CO 2 molecule to be fixed by Rubisco in the pyrenoid, that CO 2 molecule can typically diffuse a distance larger than the pyrenoid radius (Supplementary Note I). Therefore, most of the CO 2 molecules entering the pyrenoid matrix will leave without being fixed by Rubisco ( Supplementary Fig. 3). One might think that adding LCIA C as a passive channel to enhance HCO 3 − diffusion into the chloroplast could overcome this deficit (Fig. 2a). However, even with the addition of LCIA C to our baseline PCCM model, no combination of enzymatic activities and channel transport rates achieves an effective PCCM, that is, more than half-saturation of Rubisco with CO 2 (Fig. 2b and Supplementary  Fig. 4). Thus, the pH-driven PCCM cannot operate effectively without a diffusion barrier.

Barriers to pyrenoidal CO 2 leakage enable a pH-driven PCCM.
To operate a more effective PCCM, the cell must reduce CO 2 leakage from the pyrenoid matrix. A barrier to CO 2 diffusion has been regarded as essential for various CO 2 -concentrating mechanisms [44][45][46][47] . Although the matrix is densely packed with Rubisco, our analysis suggests that the slowed diffusion of CO 2 in the pyrenoid matrix due to volume occupied by Rubisco can only account for a 10% decrease in CO 2 leakage (Supplementary Note VI.C). Thus, we consider alternative barriers in our model.
We speculate that thylakoid membrane sheets and the pyrenoid starch sheath could serve as effective barriers to decrease leakage of CO 2 from the matrix. Thylakoid membrane sheets could serve as effective barriers to CO 2 diffusion because molecules in the stroma must diffuse between and through the interdigitated membranes 45 . Indeed, our first-principle simulations suggest that the thylakoid stacks, modelled with realistic geometry 48 , effectively slow the diffusion of Ci in the stroma ( Supplementary  Fig. 5). Evidence on the role of the starch sheath in the PCCM is limited and mixed. While early work suggested that a starchless Chlamydomonas mutant had normal PCCM performance in air 49 , the phenotype was not compared to the appropriate parental strain. A more recent study found that a mutant (sta2-1) with a thinner starch sheath than wild-type strains displays decreased PCCM efficacy at very low CO 2 50 . On the basis of the latter work, we hypothesize that the starch sheath that surrounds the matrix may act as a barrier to CO 2 diffusion. Since the starch sheath consists of many lamellae of crystalline amylopectin 51-53 , we model it as an essentially impermeable barrier equivalent to 10 lipid bilayers; in its presence, most CO 2 leakage out of the matrix occurs through the thylakoid tubules ( Supplementary Fig. 6).
We next test whether the above two realistic diffusion barriers allow for an effective pH-driven PCCM. Adding either thylakoid stacks or a starch sheath to the baseline PCCM model above drastically reduces CO 2 leakage from the matrix to the stroma ( Supplementary Fig. 7). The resulting PCCM is highly effective under air-level CO 2 (10 μM cytosolic) conditions: pyrenoidal CO 2 concentrations are raised above the effective half-saturation constant K m of Rubisco (Methods) using only the intercompartmental pH differential and passive Ci uptake (Fig. 2e,h). PCCM performance with both barriers present closely resembles the impermeable starch sheath case ( Supplementary  Fig. 8); for simplicity, we omit such a combined model from further discussion.
Optimal passive Ci uptake uses cytosolic CO 2 , not HCO 3 − . In addition to the requirement for a diffusion barrier, the efficacy of the pH-driven PCCM depends on the LCIB rate and the LCIA C -mediated chloroplast membrane permeability to HCO 3 − (Fig. 2b,e,h). Depending on LCIB activity, our model suggests two distinct strategies to passively uptake Ci. If LCIB activity is low, CO 2 fixation flux increases with higher LCIA C -mediated permeability to HCO 3 − , which facilitates the diffusion of cytosolic HCO 3 − into the stroma (Fig. 2c,f,i). In contrast, if LCIB activity is high, CO 2 fixation flux is maximized when LCIA C -mediated permeability is low; in this case, a diffusive influx of CO 2 into the chloroplast is rapidly converted by LCIB into HCO 3 − , which becomes trapped and concentrated in the chloroplast. Under this scenario, permeability of the chloroplast membrane to HCO 3 − due to LCIA C is detrimental, since it allows HCO 3 − converted by LCIB in the stroma to diffuse back out to the cytosol (Fig. 2c,f,i).
Interestingly, the highest CO 2 fixation flux is achieved by passive CO 2 uptake mediated by the carbonic anhydrase activity of LCIB, not by passive HCO 3 − uptake via LCIA C channels (Fig. 2), even though HCO 3 − is more abundant than CO 2 in the cytosol. The key consideration is that the stroma (at pH 8) is more basic than the cytosol (at pH 7.1, ref. 54 ), which allows LCIB to equilibrate passively acquired CO 2 with HCO 3 − to create an even higher HCO 3 − concentration in the stroma than in the cytosol.
The PCCM requires active Ci uptake under very low CO 2 . While the passive CO 2 uptake strategy can power the pH-driven PCCM under air-level CO 2 (10 μM cytosolic), its Ci uptake rate is ultimately limited by the diffusion of CO 2 across the chloroplast envelope. Indeed, our simulations show that under very low CO 2 conditions (1 μM cytosolic) 55 , a chloroplast using the passive CO 2 uptake strategy can only achieve at most 20% of its maximum CO 2 fixation flux, even in the presence of barriers to Ci diffusion (Fig. 3). Since passive HCO 3 − uptake cannot concentrate more Ci than passive CO 2 uptake (Fig. 2), we hypothesize that active Ci transport is required for an effective PCCM at very low CO 2 . To test this idea, we consider a model employing active LCIA HCO 3 − pumps (LCIA P ) without LCIB activity (Fig. 3a,d,g). We find that, indeed, HCO 3 − pumping enables . a,d,g, Schematics of the modelled chloroplast employing LCIB for passive CO 2 uptake (red), or employing active LCIA P -mediated HCO 3 − pumping across the chloroplast envelope and no LCIB activity (blue). PCCM performance under air-level CO 2 (10 µM cytosolic) (b,e,h) and under very low CO 2 (1 µM cytosolic) (c,f,i) are shown, as measured by normalized CO 2 fixation flux versus ATP spent per CO 2 fixed, for the two inorganic carbon uptake strategies in a, d and g. Solid curves indicate the minimum energy cost necessary to achieve a certain normalized CO 2 fixation flux. Shaded regions represent the range of possible performances found by varying HCO 3 − transport rates and LCIB rates. Colour code as in a. In h and i, dashed black curves indicate the optimal PCCM performance of a simplified model that assumes fast intracompartmental diffusion, fast HCO 3 − diffusion across the thylakoid membranes, and fast equilibrium between CO 2 and HCO 3 − catalysed by CAH3 in the thylakoid tubules inside the pyrenoid (Methods).

Both passive and active Ci uptake can have low energy cost.
According to our model, both passive CO 2 uptake and active HCO 3 − pumping can support an effective PCCM under air-level CO 2 . However, the latter directly consumes energy to achieve non-reversible transport. What is the total energy cost of a PCCM that employs active HCO 3 − uptake, and how does this cost compare to that of the passive CO 2 uptake strategy? To answer these questions, we used a nonequilibrium thermodynamics framework to compute the energy cost of different Ci uptake strategies (Supplementary Note II and Fig. 13) 56 . First, a PCCM without diffusion barriers is energetically expensive regardless of the Ci uptake strategies employed (Fig. 3a-c). Second, in the presence of diffusion barriers, we find that the passive CO 2 uptake strategy can achieve similar energy efficiency (~1 ATP cost per CO 2 fixed) to the active HCO 3 − uptake strategy ( Fig. 3d-i). Thus, both strategies can achieve high PCCM performance at air-level CO 2 ; however, active HCO 3 − uptake is necessary to achieve high efficacy under lower CO 2 .

The PCCM depends on cytosolic Ci and its chloroplast uptake.
How does Ci transport across the cell's plasma membrane impact the feasible Ci uptake strategies at the chloroplast level? To explore this question in our chloroplast-scale model, we assess PCCM performance under a broad range of cytosolic CO 2 and HCO 3 − concentrations ( Supplementary Fig. 15). Unsurprisingly, we find that the performance of a particular chloroplast Ci uptake strategy increases with the cytosolic level of its target Ci species. Thus, it is important to replenish cytosolic Ci species taken up by the chloroplast. Moreover, regardless of the makeup of the cytosolic Ci pool, a chloroplast lacking both passive CO 2 uptake and active HCO 3 − uptake fails to achieve high PCCM efficacy, unless the cytosolic CO 2 concentration is 100 μM or higher. Creating such a pool would presumably result in substantial CO 2 leakage across the plasma membrane and thus high energy cost. Therefore, effective mechanisms for Ci uptake from the external environment to the cytosol and from cytosol to the chloroplast are both essential for high PCCM performance.
Carbonic anhydrase localization alters modelled Ci fluxes. So far, we have only considered the carbonic anhydrase localization patterns that are thought to exist in Chlamydomonas under air-level CO 2 40,57 . To assess the benefits of such localization, we vary the localization of CAH3 and LCIB while maintaining the total number of molecules of each carbonic anhydrase (Fig. 4a). We find that ectopic carbonic anhydrase localization compromises PCCM performance. First, LCIB mislocalized to the basic pyrenoid matrix (pH 8) converts Rubisco's substrate CO 2 into HCO 3 − , and hence decreases CO 2 fixation ( Fig. 4b-f, region i). Second, when CAH3 is distributed in the thylakoids outside the pyrenoid, CO 2 molecules produced by this CAH3 can diffuse directly into the stroma, making them less likely to be concentrated in the pyrenoid and thus decreasing the efficacy of the PCCM (Fig. 4b-f, region ii, and Supplementary Fig. 16). Moreover, CAH3 mislocalization outside the pyrenoid decreases PCCM efficiency as it leads to increased futile cycling of Ci between the stroma and thylakoid, increasing the energetic cost required to maintain the intercompartmental pH differences. Finally, concentrating CAH3 to a small region of thylakoid lumen in the centre of the pyrenoid increases the distance over which HCO 3 − needs to diffuse before it is converted to CO 2 , thus lowering the CO 2 production flux by CAH3 (Fig. 4b-f, region iii). All these results hold true both at air-level CO 2 employing passive CO 2 uptake (Fig. 4) and at very low CO 2 employing active HCO 3 − uptake ( Supplementary Fig. 17). Thus, our model shows that proper carbonic anhydrase localization is crucial to overall PCCM performance.

Effects of LCIB activity and localization at very low CO 2 .
When shifted from air levels to very low levels of CO 2 (~1 μM dissolved), Chlamydomonas relocalizes LCIB from diffuse throughout the stroma to localized around the pyrenoid periphery 57 . To better understand the value of LCIB localization to the pyrenoid periphery under very low CO 2 , we vary both the end radius of stromal LCIB,   which defines how far LCIB extends towards the chloroplast envelope, and the total number of LCIB molecules in a model employing a starch sheath barrier and active HCO 3 − uptake (Fig. 5a). Our analysis shows that it is energetically wasteful to allow concentrated CO 2 to leak out of the chloroplast ( Supplementary Fig. 13). Consequently, LCIB relocalized near the starch sheath increases energy efficiency by recapturing CO 2 molecules that diffuse out of the matrix and trapping them as HCO 3 − in the chloroplast ( Fig. 5b-c, region i). The energy cost is higher without any LCIB for CO 2 recapture (Fig. 5b-c, region iii), or with diffuse stromal LCIB, which allows incoming HCO 3 − to be converted into CO 2 near the chloroplast membrane at which point it can leak back to the cytosol (Fig. 5b-c, region ii, and Supplementary Fig. 19). Our model thus suggests that under very low CO 2 and in the presence of a strong CO 2 diffusion barrier around the pyrenoid, localizing LCIB at the pyrenoid periphery allows for efficient Ci recycling, therefore enhancing PCCM performance.
Intercompartmental pH differences are key to PCCM function. To determine the impact of thylakoid lumen and stromal pH on PCCM function, we vary the pH values of the two compartments ( Fig. 6 and Supplementary Fig. 20). We find that regardless of Ci uptake strategy, the modelled PCCM achieves high efficacy only when the thylakoid lumen is much more acidic than the stroma (Fig. 6a,d). Indeed, carbonic anhydrase activity in a low-pH stroma (Fig. 6, region i) or in a high-pH intrapyrenoid tubule lumen (Fig. 6, region ii) would lead to low concentrations of HCO 3 − or CO 2 , respectively, in those compartments; both would be detrimental to the PCCM. Interestingly, variation in pH differentially influences the energy efficiency of the PCCM employing passive CO 2 uptake (Fig. 6a-c) and the PCCM employing active HCO 3 − pumping (Fig. 6d-f). Specifically, only the latter shows a dramatically increased energy cost when the stroma has a relatively low pH; in this case, most HCO 3 − pumped into the stroma is converted to CO 2 and is subsequently lost to the cytosol (Fig. 6e,f, regions i and ii). Thus, our results suggest that high PCCM performance requires maintenance of a high-pH stroma and a low-pH thylakoid lumen.
The model recapitulates Chlamydomonas PCCM mutant phenotypes. We next explore whether our model can account for the phenotypes of known Chlamydomonas PCCM-deficient mutants. We select model parameters to best represent the effect of each mutation, assuming that the Chlamydomonas PCCM switches from passive CO 2 uptake under air-level CO 2 to active HCO 3 − uptake under very low CO 2 ( Supplementary Figs. 23 and 24). Our simulation results show semi-quantitative agreement with experimental results for all published mutants (Supplementary Table 5) and provide mechanistic explanations for all recorded phenotypes. For example, our model captures that the lcib mutant fails to grow in air, presumably due to a defect in passive CO 2 uptake. This phenotype implies that Chlamydomonas does not pump HCO 3 − into the chloroplast under air-level CO 2 because a modelled lcib mutant employing HCO 3 − pumping has a PCCM effective enough to drive growth in air. Notably, the lcib mutant recovers growth under very low CO 2 , which we attribute to the activation of an HCO 3 − uptake system under this condition 22,57,58 . Indeed, knockdown of the gene encoding the LCIA HCO 3 − transporters in the lcib mutant background results in a dramatic decrease in CO 2 fixation and growth under very low CO 2 57 . More broadly, our model recapitulates phenotypes of Chlamydomonas mutants lacking the HCO 3 − transporter HLA3 or the CO 2 transporter LCI1 at the plasma membrane. Indeed, knockdown of the gene encoding HLA3 (simulated as a lower level of cytosolic HCO 3 − ) leads to a dramatic decrease in PCCM efficacy under very low CO 2 , presumably due to reduced HCO 3 − import into the cell and thus into the chloroplast 23,24 . In contrast, the lci1 single mutant shows a moderate decrease in PCCM efficacy under air-level CO 2 , presumably due to a reduced CO 2 influx into the cytosol and thus into the chloroplast, but no effect on the PCCM under very low CO 2 , presumably due to the activation of an active HCO 3 − uptake system under this condition 34 . Finally, our model captures the phenotypes of Chlamydomonas starch mutants, which survive under both air-level and very low CO 2 conditions presumably because thylakoid stacks can effectively block CO 2 leakage from the pyrenoid in the absence of a starch sheath. The existence of non-starch diffusion barriers, such as the thylakoid stacks, may also help explain why some other pyrenoid-containing algae do not have a starch sheath 59 .

Various thylakoid architectures can support PCCM function.
The analysis of Ci fluxes in our model supports the long-held view that the thylakoid tubules traversing the pyrenoid in Chlamydomonas can deliver stromal HCO 3 − to the pyrenoid, where it can be converted to CO 2 by CAH3 32,60 . However, is a Chlamydomonas-like thylakoid architecture necessary to a functional PCCM? Certainly, eukaryotic algae display a variety of thylakoid morphologies, such as multiple non-connecting parallel thylakoid stacks passing through the pyrenoid, a single disc of thylakoids bisecting the pyrenoid matrix, or thylakoid sheets surrounding but not traversing the pyrenoid 61-64 . Our calculations show that different thylakoid morphologies could in principle support the functioning of an effective PCCM, as long as HCO 3 − can diffuse into the low-pH thylakoid lumen and the thylakoid carbonic anhydrase is localized to the pyrenoid-proximal lumen ( Supplementary Fig. 25).

An effective PCCM needs Ci uptake, transport and trapping.
Our model identifies a minimal PCCM configuration sufficient to effectively concentrate CO 2 . Next, we ask: can alternative configurations of the same minimal elements achieve an effective PCCM? We restrict our focus to PCCMs employing passive Ci uptake strategies. We measured the efficacy and energy cost of 216 partial PCCM configurations in air, varying the presence and localization of Rubisco, thylakoid and stromal carbonic anhydrases, HCO 3 − channels on the thylakoid membranes and the chloroplast envelope, and diffusion barriers ( Supplementary Fig. 26).
Our results summarize three central modules of an effective pH-driven PCCM (Fig. 7a): (i) a stromal carbonic anhydrase (LCIB) to convert passively acquired CO 2 into HCO 3 − , (ii) a thylakoid membrane HCO 3 − channel (BST) and a luminal carbonic anhydrase (CAH3) that together allow conversion of HCO 3 − to CO 2 near Rubisco, and (iii) a Rubisco condensate surrounded by diffusion barriers. We find that PCCM configurations lacking any one of these modules show a compromised ability to concentrate CO 2 (Fig. 7b). The Chlamydomonas-like PCCM configuration is the only configuration possessing all three modules; thus, this configuration is not only sufficient but also necessary to achieve an effective PCCM using the considered minimal elements.

Possible strategies for engineering a PCCM into land plants.
Many land plants, including most crop plants, are thought to lack any form of CCM. Our analysis shows that a typical plant chloroplast configuration can only support ~30% of the maximum CO 2 fixation flux through Rubisco (Supplementary Table 6). Engineering a PCCM into crops has emerged as a promising strategy to increase yields through enhanced CO 2 fixation 30,31 . Despite early engineering advances including expressing individual PCCM components 65 and reconstituting a pyrenoid matrix in plants 66 , the optimal order of engineering steps needed to establish an effective PCCM in a plant chloroplast remains unknown. Here we leverage our partial PCCM configurations to propose an engineering path that results in monotonic improvement of efficacy and avoids excessive energy costs. To the best of our knowledge, the plant chloroplast contains diffuse carbonic anhydrase and diffuse plant Rubisco in the stroma, and lacks HCO 3 − channels and diffusion barriers 67 . We note that plant Rubisco has a lower K m for CO 2 than Chlamydomonas Rubisco; our engineering calculations account for this and employ values from plant Rubisco. Studies have also suggested that native plant carbonic anhydrases are diffuse in the thylakoid lumen 68 , which we therefore assume in our modelled plant chloroplast configuration (Fig. 8, starting configuration). This configuration contains only one of the three essential modules for an effective PCCM (Fig. 7a), that is, the passive CO 2 uptake system.
After exploring all possible stepwise paths to install the remaining two modules to achieve the Chlamydomonas-like PCCM configuration (Fig. 8, desired configuration), we suggest the following path consisting of four minimal engineering steps (Fig. 8b,  arrows). The first step is the localization of plant Rubisco to a  Fig. 1a). In Chlamydomonas, LCIB can be used for passive uptake of CO 2 , which is then trapped in the stroma as HCO 3 − (module i); BST allows stromal HCO 3 − to diffuse into the thylakoid lumen where CAH3 converts HCO 3 − into CO 2 (module ii); and a starch sheath and thylakoid stacks could act as diffusion barriers to slow CO 2 escape out of the pyrenoid matrix (module iii). b, Histograms of normalized CO 2 fixation flux for CCM configurations without (left, grey) or with (right, coloured) the respective module. We tested 216 CCM configurations by varying the presence and/or localization of enzymes, HCO 3 − channels and diffusion barriers in the model (see Supplementary Fig. 26). transporters and diffusion barriers. Bottom: the desired configuration representing a Chlamydomonas chloroplast that employs the passive CO 2 uptake strategy and a starch sheath (as in Fig. 2g). b, Venn diagram showing the normalized CO 2 fixation flux (circle, area in proportion to magnitude) and ATP spent per CO 2 fixed (square, area in proportion to magnitude) of various configurations after implementing the designated changes. Arrows denote the proposed sequential steps to transform the starting configuration into the desired configuration (see text). The starting configuration has a normalized CO 2 fixation flux of 0.31 and negligible ATP cost. All costs below 0.25 ATP per CO 2 fixed are represented by a square of the minimal size.
pyrenoid matrix, which we assume would inherently exclude the plant stromal carbonic anhydrase, as the tight packing of Rubisco in the matrix appears to exclude protein complexes greater than ~80 kDa 26,69 . The second step is the localization of the thylakoid carbonic anhydrase to thylakoids that border or traverse the matrix. These first two steps do not yield notable changes to either the efficacy or the efficiency of the PCCM. The next step is to introduce HCO 3 − channels to the thylakoid membranes, which increases the CO 2 fixation flux to ~175% of that of the starting configuration. This step also increases the cost of the PCCM to around 4 ATPs per CO 2 fixed. Such a high-cost step cannot be avoided, and all other possible paths with increasing efficacy at each step have more costly intermediate configurations (Fig. 8b and Supplementary Table 6). Importantly for engineering, the increased CO 2 fixation flux resulting from this step would provide evidence that the installed channels are functional. The final step of the suggested path is to add a starch sheath to block CO 2 leakage from the pyrenoid matrix, which triples the CO 2 fixation flux compared with the starting configuration and reduces the cost to only 1.3 ATPs per CO 2 fixed.
Selecting an alternative implementation order for the four minimal engineering steps leads to decreased performance of the PCCM in intermediate stages. For example, adding HCO 3 − channels on the thylakoid membranes before the stromal and thylakoid carbonic anhydrases are localized (Fig. 8b, blue oval) leads to futile cycling generated by overlapping carbonic anhydrases (Fig. 4, region ii). Additionally, adding a starch sheath before HCO 3 − channels are added to the thylakoids could decrease CO 2 fixation (Fig. 8b, grey oval); without channels, HCO 3 − cannot readily diffuse to the thylakoid carbonic anhydrase to produce CO 2 , and the starch sheath impedes diffusion of CO 2 from the stroma to Rubisco. Thus, our suggested path avoids intermediate configurations with decreased efficacy or excessive energy cost.

Discussion
To better understand the composition and function of a minimal PCCM, we developed a multicompartment reaction-diffusion model on the basis of the Chlamydomonas PCCM. The model not only accounts for all published Chlamydomonas PCCM mutants, but also lays the quantitative and biophysical groundwork for understanding the operating principles of a minimal PCCM. Systematic analysis of the model suggests that keys to an effective and energetically efficient PCCM are barriers preventing CO 2 efflux from the pyrenoid matrix and carbonic anhydrase localizations preventing futile Ci fluxes. The model demonstrates the feasibility of passive CO 2 uptake at air-level CO 2 , and shows that at lower external CO 2 levels, an effective PCCM requires active import of HCO 3 − . Both uptake strategies can function at a low energy cost.
While not explicitly considered in our model, protons are produced in Rubisco-catalysed CO 2 -fixing reactions 5 and are consumed in CAH3-catalysed HCO 3 − -to-CO 2 conversions. Protons must then be depleted in the pyrenoid matrix and replenished in the intrapyrenoid thylakoid lumen to maintain physiological pH values 41,43 . However, our flux-balance analysis shows that the concentrations of free protons are too low to account for the expected proton depletion/replenishment fluxes by free proton diffusion (Supplementary Note VI.D and Fig. 27). Thus, efficient transport of protons must employ alternative mechanisms. One possibility, suggested by recent modelling work 70 , is that proton carriers such as RuBP and 3-PGA could be present at millimolar concentrations 71 and hence could enable sufficient flux to transport protons between compartments. Understanding the molecular mechanisms underlying proton transport will be an important topic for future studies.
Another class of CCM is the carboxysome-based CCM (CCCM) employed by cyanobacteria 13 . In the CCCM, HCO 3 − becomes concentrated in the cytosol via active transport 72 and diffuses into carboxysomes-compartments that are typically 100 to 400 nm in diameter, each composed of an icosahedral protein shell enclosing Rubisco 73 . The protein shell is thought to serve as a diffusion barrier, which is necessary for an effective CCCM 46,47 . Whereas the pyrenoid matrix does not appear to have a carbonic anhydrase, the carboxysome matrix contains a carbonic anhydrase that converts HCO 3 − to CO 2 to locally feed Rubisco. Recent studies suggest that protons produced during Rubisco's carboxylation could acidify the carboxysome, which in turn favours the carbonic anhydrase-catalysed production of CO 2 70 . One may ask: what are the benefits of operating a PCCM versus a CCCM? One possibility is that the PCCM uses more complex spatial organization to segregate Rubisco from the thylakoid lumen carbonic anhydrase, which allows the two enzymes to operate at pH values optimal for their respective catalytic functions. Thus, the PCCM may require a smaller Ci pool than the CCCM to produce sufficient CO 2 in the vicinity of Rubisco. Indeed, cyanobacteria appear to accumulate roughly 30 mM intracellular HCO 3

−74,75
, while Chlamydomonas creates an internal HCO 3 − pool of only 1 mM 76 . Future experimentation comparing the performance of the PCCM and the CCCM will advance our understanding of the two distinct mechanisms.
The PCCM has the potential to be transferred into crop plants to improve yields. Our model provides a framework to evaluate overall performance, considering both the efficacy and the energetic efficiency of the PCCM (Supplementary Fig. 28), and allows us to propose a favoured order of engineering steps. Moreover, we expect that our model will help engineers narrow down potential challenges by providing a minimal design for a functional PCCM. If the native plant carbonic anhydrases are inactive or absent, it might be favourable to express and localize other carbonic anhydrases with known activities. Additionally, a key step will be to test whether heterologously expressed Chlamydomonas BST channels function as HCO 3 − channels and to verify that they do not interfere with native ion channels in plants. We hope that our model provides practical information for engineers aiming to install a minimal PCCM into plants, and that it will serve as a useful quantitative tool to guide basic PCCM studies in the future.

Methods
Reaction-diffusion model. To better understand the operation of the PCCM, we developed a multicompartment reaction-diffusion model on the basis of the postulated mechanism in Chlamydomonas. The model takes into account the key PCCM enzymes and transporters and the relevant architecture of the Chlamydomonas chloroplast 48 . For simplicity, our model assumes spherical symmetry and considers a spherical chloroplast of radius R chlor in an infinite cytosol. Thus, all model quantities can be expressed as functions of the radial distance r from the centre of the chloroplast (Fig. 1b). The modelled chloroplast consists of three compartments: a spherical pyrenoid matrix of radius R pyr (pH 8) in the centre, surrounded by a stroma (pH 8), with thylakoids (luminal pH 6) traversing both the matrix and stroma (Fig. 1) [41][42][43] . At steady state, flux-balance equations set the spatially dependent concentrations of CO 2 , HCO 3 − , and H 2 CO 3 in their respective compartments (indicated by subscripts; see Supplementary Table 2 and Note I): Here, C denotes the concentration of CO 2 , and H denotes the combined concentration of HCO 3 − and H 2 CO 3 , which are assumed to be in fast equilibrium 77 . Thus, their respective concentrations are given by H − = η 1+η H for HCO 3 − and H 0 = 1 1+η H for H 2 CO 3 , where η = 10 pH−pKa 1 is a pH-dependent partition factor and pKa 1 = 3.4 is the negative log of the first acid dissociation constant of H 2 CO 3 78 .
The first terms in equations (1a-1f) describe the diffusive fluxes of inorganic carbon (Ci) within compartments. D C and D H respectively denote the diffusion coefficients of CO 2 , and HCO 3 − and H 2 CO 3 combined, in aqueous solution. In a model with thylakoid stacks slowing Ci diffusion in the stroma, the effective diffusion coefficients D str C/H are obtained using a standard homogenization approach (see Supplementary Fig. 5 and Note I.G); D C/H str = D C/H otherwise. The other flux terms (j X ) in equations (1a-1f) describe enzymatic reactions and intercompartment Ci transport, and the factors f s and f v describe the geometry of the thylakoids. Their expressions are provided in subsequent sections.
The boundary conditions at r = R pyr are determined by the diffusive flux of Ci across the starch sheath at the matrix-stroma interface, that is, where ∂ r denotes derivative with respect to r, and the starch sheath is assumed to have the same permeability κ starch for all Ci species. κ starch →∞ when there is no starch sheath and Ci can diffuse freely out of the matrix. κ starch = 0 describes an impermeable starch sheath (see Supplementary Note I.F). Similarly, Ci transport flux across the chloroplast envelope yields the boundary conditions at r = R chlor , that is, where κ H − chlor and γ denote the rate and reversibility of inward HCO 3 − transport from the cytosol, representing the action of the uncharacterized chloroplast envelope HCO 3 − transporter LCIA 24,37 ; γ = 1 corresponds to a passive bidirectional channel and γ < 1 corresponds to an active pump. The external CO 2 conditions are specified by cytosolic CO 2 concentration C cyt . We set C cyt = 10 μM for air-level CO 2 conditions, and C cyt = 1 μM for very low CO 2 conditions. Unless otherwise specified, all cytosolic Ci species are assumed to be in equilibrium at pH 7.1 54 .
Thylakoid geometry. The thylakoid geometry has been characterized by cryo-electron tomography in Chlamydomonas 48 . In our model, we account for this geometry by varying the local volume fraction f v and surface-to-volume ratio f s of the thylakoids. These fractions describe a tubule meshwork at the centre of the pyrenoid (r ≤ R mesh ), extended radially by N tub cylindrical tubules, each of radius a tub (see Supplementary Note I.C), that is, for r > R mesh , and fs = 2/a tub .
In the baseline model, the thylakoid tubules are assumed to extend to the chloroplast envelope, that is, the outer radius of tubules R tub = R chlor . In a model with shorter tubules, we choose R tub = 0.4 R chlor , and set f v = 0 and f s = 0 for r > R tub . Thus, the Laplace-Beltrami operators in equation (1) are given by ∇ 2 thy = r −2 f −1 v ∂rfvr 2 ∂r for the thylakoid tubules, and by Enzyme kinetics. The model considers three key Chlamydomonas PCCM enzymes, that is, the carbonic anhydrases (CAs) CAH3 and LCIB and the CO 2 -fixing enzyme Rubisco. The interconversion between CO 2 and HCO 3 − is catalysed by both CAs and follows reversible Michaelis-Menten kinetics 79 . The rate of CA-mediated CO 2 -to-HCO 3 − conversion is given by where V C max,CA denotes the maximum rate of CA, K C m and K H − m respectively denote the half-saturation concentrations for CO 2 and HCO 3 − , and V C max,CA /K C m denotes the first-order rate constant which we refer to as the 'rate' of the CA (Fig. 2). Finally, K eq = 10 pK eff −pH denotes the equilibrium ratio of CO 2 to HCO 3 − , where the effective pKa is given by pK eff = 6.1 80,81 . The localization function L CA is equal to one for r where CA is present and zero elsewhere. The uncatalysed spontaneous rate of CO 2 -to-HCO 3 − conversion, with a first-order rate constant k C sp , is given by 82 . Note that negative values of j CA and j sp denote fluxes of CO 2 -to-HCO 3 − conversion.
The rate of CO 2 fixation catalysed by Rubisco is calculated from Here, V C max,Rbc denotes the maximum rate, and the effective K m (Rubisco K m in Fig. 1 In our baseline model, we assume that CAH3 is localized in the thylakoid tubules traversing the pyrenoid 40 , LCIB is distributed diffusely in the stroma 57 and Rubisco is localized in the pyrenoid matrix 16 . To explore the effect of enzyme localization, we vary the start and end radii of the enzymes while maintaining a constant number of molecules (Figs. 4 and 5, and Supplementary Note III).
Transport of Ci across thylakoid membranes. The flux of CO 2 diffusing across the thylakoid membrane from the thylakoid lumen to the matrix or stroma is given by where κ C denotes the permeability of thylakoid membranes to CO 2 . Similarly, the cross-membrane diffusive flux of HCO 3 − and H 2 CO 3 , j H mem , is given by where κ H − and κ H 0 respectively denote the baseline membrane permeability to HCO 3 − and H 2 CO 3 , and κ H − thy denotes the additional permeability of thylakoid membranes to HCO 3 − due to bestrophin-like channels 25 . Note that the final terms of equations (1a) and (1a-1c) differ by a factor of fv 1−fv because the cross-membrane fluxes have a larger impact on the concentrations in the thylakoid compartment, which has a smaller volume fraction.
Choice of parameters and numerical simulations. The model parameters were estimated from experiment (see Supplementary Table 2 and references therein), except for the rates of LCIB and CAH3 and the kinetic parameters of the HCO 3 − transporters, which are not known. We performed a systematic scan for these unknown parameters within a range of reasonable values ( Fig. 2 and Supplementary  Fig. 4). The numerical solutions of equation (1) were obtained by performing simulations using a finite element method. Partial differential equations were converted to their equivalent weak forms, computationally discretized by first-order elements 85 and implemented in the open-source computing platform FEniCS 86 . A parameter sensitivity analysis was performed to verify the robustness of the model results ( Supplementary Fig. 30). A convergence study was performed to ensure sufficient spatial discretization ( Supplementary Fig. 31).
Energetic cost of the CCM. We computed the energetic cost using the framework of nonequilibrium thermodynamics 56 (see Supplementary Note II.B for details). In brief, the free-energy cost of any nonequilibrium process (reaction, diffusion, or transport) is given by (j + −j − )ln(j + /j − ) (in units of thermal energy RT), where j + and j − denote the forward and backward flux, respectively. Summing the energetic cost of nonequilibrium processes described in equation (1) − in the stroma, with 4πr 2 (1 − f v )dr being the geometric factor. J C chlor = 4πR 2 chlor κ C (Cstr|r=R chlor − Ccyt) denotes the flux of CO 2 diffusing from the stroma back out into the cytosol. J Rbc = ∫ Rchlor 0 4πr 2 (1 − fv)j Rbc dr integrates the flux of CO 2 fixation by Rubisco. The lnγ −1 and ln(K eq thy /K eq str ) terms denote the free-energy cost of pumping HCO 3 − across the chloroplast envelope and pumping protons across the thylakoid membranes, respectively. Using ATP hydrolysis energy |ΔG ATP | = 51.5 RT 87 , we compute the equivalent ATP spent per CO 2 fixed as Ẇ PCCM /J Rbc /|ΔG ATP |.
Well-mixed compartment model. To better understand the biophysical limit of the PCCM, we consider a well-mixed compartment simplification of the full model. Specifically, we assume that (i) the diffusion of Ci is fast in the matrix and stroma, and therefore the concentrations of CO 2  (iii) HCO 3 − and CO 2 are in equilibrium (catalysed by CAH3) in the thylakoid tubules inside the pyrenoid, and thus the CO 2 concentration therein is given by C thy = K eq thy H − pyr ; and (iv) the concentration of CO 2 in the thylakoid tubules approaches C str toward the chloroplast envelope. Thus, the flux-balance conditions are described by a set of algebraic equations of 4 variables, Cpyr, C thy , Cstr and H − str (see Supplementary Notes IV and V). The algebraic equations are solved using the Python-based computing library SciPy (version 1.5.0) 88 . The energetic cost of the well-mixed compartment model is computed similarly as above.
Engineering paths. We are interested in how adding and removing individual components affects the overall functioning of the PCCM. We thus measured the efficacy and energy efficiency of 216 PCCM configurations, modulating the presence and localization of enzymes, HCO 3 − channels and diffusion barriers. Each configuration was simulated using the reaction-diffusion model above, with the appropriate parameters for that strategy (Supplementary Fig. 26).
To find all possible engineering paths between these configurations, we considered a graph on which each possible configuration is a node. Nodes were considered to be connected by an undirected edge if they were separated by one engineering step. Thus, by taking steps on the graph, we searched all possible engineering paths, given a start node with poor PCCM performance and a target node with good performance. A single engineering step could be the addition or removal of an enzyme, a channel, or a diffusion barrier, as well as the localization of a single enzyme. The exception is the localization of Rubisco, which we assumed can exclude LCIB from the matrix as it forms a phase-separated condensate 26 . We did not consider strategies employing both a starch sheath and thylakoid stacks as diffusion barriers. We used a custom depth-first search algorithm in MATLAB (R2020a) to identify all shortest engineering paths between a start and a target node.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All data generated or analysed during this study are included in this Article and the supplementary tables. The raw datasets have been deposited in the Zenodo repository at https://doi.org/10.5281/zenodo.6406849.

March 2021
Corresponding author(s): Martin Jonikas Last updated by author(s): Apr 5, 2022 Reporting Summary Nature Portfolio wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Portfolio policies, see our Editorial Policies and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A description of any restrictions on data availability -For clinical datasets or third party data, please ensure that the statement adheres to our policy All data generated or analysed during this study are included in this article, the extended data, and the supplementary tables. The raw datasets have been deposited in the Zenodo repository, https://doi.org/10.5281/zenodo.6406849.