Main

The production of renewable low-carbon liquid fuels is vital in a net-zero transition1. Second-generation cellulosic biofuel has been repeatedly shown to have environmental benefits over first-generation, grain-based biofuel2,3,4,5,6. Furthermore, a substantial amount of cellulosic biomass is needed to meet liquid fuel demand and the demand for chemicals manufacturing1,7.

Large-scale economic and environmental analyses of cellulosic biofuel production typically place biofuels within the broader context of total energy systems; however, these large models sacrifice detail related to biofuel supply chains (SCs)8,9. The biofuel SC connects biomass feedstock, pre-processing depots (depots) and biorefineries and includes complex logistics networks needed to produce biofuel at a large scale. Because cellulosic biomass has yet to be planted in large quantities, and cellulosic biorefineries have yet to be constructed, effective designs and accurate outcomes from system analysis depend on spatially explicit trade-offs between three levels of design: the landscape, the SC and the biorefinery. Previous studies have developed detailed models studying the biofuel SC; however, most focus on only one aspect of the system or consider a small geographic area, which may neglect to identify detailed spatially explicit interactions between the three levels of design at larger scales.

Landscape-level studies have shown that biomass siting, the type of crop that is planted and the land management (for example, fertilization, irrigation) influence the environmental and economic performance of crops planted at a specific location. The type of crop planted on marginal lands has also been shown to have a large impact on soil carbon sequestration, a measure of the negative emissions from crop growth and a key contributor to the negative greenhouse gas (GHG) balance of the system3. Previous studies used the DayCent crop model to optimize landscape design around a single, pre-existing biorefinery and showed that careful crop establishment and management can increase the amount of soil carbon sequestration, complementing industrial carbon capture and storage (CCS)10,11. Additionally, researchers presented multi-objective optimization models that considered several land-management practices to analyse their Pareto trade-offs when designing the landscape around a biorefinery12.

Other important studies have focused on developing modelling methodologies for biofuel SC design. Relevant studies include a multi-objective framework for biofuel SC optimization accounting for economic, environmental and social objectives13, a model for switchgrass SCs showing that costs increase substantially if biorefineries are not optimally located14 and a compact neighbourhood-flow-based formulation for hybrid first-/second-generation bioethanol SCs that considerably reduced the problem size and was extended to consider both economic and environmental objectives over longer time horizons15,16. Many biofuel SC studies make three limiting assumptions. First, they assume a pre-existing uniform distribution of biomass across their study area. Second, they use a coarse resolution to model the yield, collection and transportation of biomass. Third, they typically assume no control over design and operation of the bioenergy landscape. Recent studies have made progress addressing these three limiting assumptions to better understand the spatial features of the system. A framework for designing high-resolution biofuel SCs was introduced considering pre-processing depots and the uneven distribution of biomass17. Researchers analysed the competition between food and fuel considering the design of the landscape around a biorefinery over a longer time horizon18 and a similar study focused on the economics of biofuels using a two-stage optimization model that included the high-resolution design of both the landscape and the SC19. Recently, a framework was constructed that incorporated high-resolution, realistic, biomass data to simultaneously optimize the bioenergy landscape (including fertilization) and a detailed logistics model minimizing the cost and environmental impact of bioethanol production under biomass yield uncertainty20.

At the biorefinery, detailed models are available for single conversion technologies (particularly from researchers at the National Renewable Energy Laboratory, who performed detailed process design for biorefineries using microbial conversion, pyrolysis and gasification21,22,23); however, most SC studies consider simplified models of the biorefinery, and only some have compared the economics of different conversion technologies simultaneously24,25. To achieve further GHG mitigation, researchers have examined using CCS to capture CO2 from the different high-concentration process streams at biorefineries26,27. Installing CCS technologies requires additional capital and operational investment and in the USA is incentivized by carbon sequestration credits (currently a US$85 Mg−1 tax incentive)28. In our previous work, a single biorefinery was studied producing either liquid fuels, hydrogen or electricity with the option for CCS from multiple process streams29. For each feedstock, the study assumed an even distribution of biomass around the biorefinery. These works provide valuable information about the detailed biorefinery design but assume a fixed delivered cost of biomass and a simplified SC model.

To better understand large-scale biofuel production, both the spatially explicit SC and detailed biorefinery design must be considered simultaneously. Importantly, at larger scales, additional parameters should be considered in a spatially explicit context. For example, biomass availability, local electricity price and CO2 impact (the indirect impact from electricity generation and transmission), distance to CO2 storage, CO2 sequestration credit, fuel yield and fuel demand may all impact the choice of liquid fuel conversion and CCS technology installed at biorefineries.

Here, we use modelling and optimization methods to design and analyse an integrated system producing liquid cellulosic biofuel from field to product in the US Midwest. Importantly, we consider systems of a meaningful scale and simultaneously account for design and operation of the SC, landscape and biorefinery (including CCS) at a high spatial resolution. We use realistic crop simulations to determine crop productivity and soil carbon sequestration for biomass grown on ‘bioenergy lands’6 to design a landscape that accounts for downstream trade-offs within the SC and at the biorefinery. We analyse multiple technologies and their associated CCS configurations under different sequestration credits, CO2 valuations and fuel demands. The results provide insights into how CO2 mitigation incentives influence spatial trends related to the bioenergy landscape and the bioenergy SC including what design and operational decisions can produce economically attractive fuel with net-negative emissions. Applied broadly, the results of this study improve the understanding of cellulosic biofuel production systems and can help to inform decision and policymakers with strategies to incentivize and implement economically sound net-negative-emissions biofuels at a large scale.

System analysis and spatial trends

We study incentives for installing CCS, expected outcomes from an economic and environmental perspective and the spatially explicit SC changes needed to accommodate an optimal system under different conditions. We use a mixed-integer linear programming (MILP) model with the objective to minimize the total annualized cost of the system. Details are available in the Methods section. The modelling approach is not intended as a detailed decision support tool. Instead, we aim to identify, quantify and understand the trade-offs that should be considered during large-scale deployment of cellulosic biofuel systems.

To accurately compare the energy content of liquid fuels, we consider the fuel yields from technologies in Table 1 in gallons of gasoline equivalents (GGE) instead of volume (l). GGE is the preferred unit of the US National Renewable Energy Laboratory, which is the primary source for process data for this study21. For clarity, we also include standard (SI) units when applicable. Table 1 provides abbreviations used for biorefinery technologies. The chosen technologies represent relatively mature conversion routes with published techno-economic analyses. Whereas there has been important progress in light-duty electrification, liquid fuels are expected to remain a critical element of the transportation sector for decades to come30. Figure 1 compares the capital and operating costs of the technologies and their relative fuel yields and CO2 flows, with further detail provided in Supplementary Table 1. Figure 2 shows the SC network structure and potential material flows and includes important model variables for the biofuel SC, with formulation details presented in Methods.
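The relationship between GGE and SI energy units used throughout the results can be sketched as follows. This is a minimal illustration, not the authors' conversion: the factor `GJ_PER_GGE` is an assumption inferred from the paired values quoted in the text (8 × 10^8 GGE year−1 alongside 9.7 × 10^7 GJ), and the function names are hypothetical.

```python
# Energy content of one gallon of gasoline equivalent (GGE), in GJ.
# Assumption: inferred from the paired values quoted in the text
# (8 x 10^8 GGE per year alongside 9.7 x 10^7 GJ), not taken from Methods.
GJ_PER_GGE = 9.7e7 / 8e8  # roughly 0.121 GJ per GGE


def gge_to_gj(gge: float) -> float:
    """Convert a fuel quantity from GGE to GJ."""
    return gge * GJ_PER_GGE


def cost_per_gge_to_cost_per_gj(usd_per_gge: float) -> float:
    """Convert a unit cost from US$ per GGE to US$ per GJ."""
    return usd_per_gge / GJ_PER_GGE


if __name__ == "__main__":
    # Base-case demand and the maximum cost reported for the constrained case.
    print(f"{gge_to_gj(8e8):.2e} GJ")
    print(f"{cost_per_gge_to_cost_per_gj(6.34):.2f} US$ per GJ")
```

With this inferred factor, a cost of US$6.34 GGE−1 converts to roughly US$52 GJ−1, close to the value reported in the text; small differences are expected because the factor is back-calculated from rounded numbers.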

Table 1 The abbreviations used for biorefinery technology and CCS combinations
Fig. 1: An overview of the biorefinery technologies considered in this study.

The widths of the GGE and CO2 flows are proportional to energy and mass flows, respectively. The capital cost is expressed in US$ Mg−1 for a 2,000 Mg day−1 biorefinery, and the operational cost is in units of US$ Mg−1 biomass processed. Green boxes show capital costs (CAPEX) and orange boxes show operational costs (OPEX), with details in Supplementary Table 1.

Fig. 2: Material flow through the biofuel supply chain network.

Modelling notation showing the potential ways biomass \(i \in \mathbf{I}^\mathrm{F}\), intermediates \(i \in \mathbf{I}^\mathrm{ID}\), byproducts \(i \in \mathbf{I}^\mathrm{B}\) and biofuel \(i \in \mathbf{I}^\mathrm{P}\) flow through the three types of node in the biofuel supply chain: harvesting sites j, pre-processing depots k and biorefineries l. Solid lines represent shipments \(F_{i,*,*,q,t}\), dash-dotted lines represent material consumption and dotted lines represent material production. \(Y_{i,f,t}\) represents the biomass yield as a result of land management (details in Methods) for fields within a harvesting site. \(H_{i,j,t}\) is the actual amount harvested and \(C_{l,m,t}^\mathrm{SEQ}\) is the amount of CO2 sequestered with CCS technology.

In the following sections, we first consider the current policy in the USA, with a carbon sequestration credit only for CO2 captured with CCS. The spatial features of the system, the technology portfolio and system-wide metrics (cost and GHG emissions) are presented as a function of biofuel demand and sequestration credit. Because a main motivation for producing cellulosic biofuel is its potential as a low-carbon or carbon-negative fuel, we next consider the implication of treating all mitigated CO2-equivalent emissions equally. That is, we treat avoided emissions the same as captured CO2 in our economic valuation, something that is not incentivized by the current policy in the USA. Finally, we study system limits by enforcing a constraint on total GHG emissions. We discuss how optimal SC and biorefinery decisions change as a function of the constraint on emissions and biofuel demand and how they relate to the spatial distribution of biomass and other important parameters.

Effect of carbon sequestration credits and demand

In the USA, facilities that capture and store CO2 receive a monetary tax credit of US$85 Mg−1 of CO2 captured and stored according to the 2022 updates to the US tax code IRS §45Q28. Figure 3a shows the system cost and GHG impact as a function of sequestration credit for a base-case demand of 8 × 10^8 GGE year−1 (9.7 × 10^7 GJ). Cost per GGE decreases with increasing sequestration credit once CCS is incentivized, and increases with increasing fuel demand due to the requirement for larger logistics networks. Technology selection is not a strong function of liquid fuel demand (Supplementary Fig. 4). Supplementary Figs. 11–15 show representative SC configurations at a variety of sequestration credits and demands. Figure 3b shows representative metrics describing spatial trends as a function of liquid fuel demand. At low demand, biomass can be transported directly to the biorefinery, and fields with higher yields are preferred. However, as the demand increases, larger logistics networks are required and the spatial distribution of biomass requires less favourable fields to be established and biomass to be densified and transported further. When additional biorefineries are constructed as demand increases, a new optimal spatial layout for the biorefineries is found and a sharp drop in the average transportation distance is observed because the fields can be established closer to the larger number of biorefineries.

Fig. 3: System performance when the sequestration credit is only applied to CO2 captured with CCS.

a, The cost (solid line) and emissions in CO2 equivalents (CO2e) per gallon of gasoline equivalent (dashed line) as a function of the sequestration credit for CO2 captured with CCS for a demand of 8 × 10^8 GGE year−1 (9.7 × 10^7 GJ). Background colour indicates fuel and CCS technology portfolio. Bottom area plot shows the percentage of each installed technology. b, Representative spatial metrics as a function of liquid fuel demand. Average biomass transportation distance (blue), fraction of biomass processed by pre-processing depots (red) and average biomass yield for established fields (green). The vertical lines and corresponding numbers to the right of the line indicate the total number of biorefineries installed.

Below a credit of US$60 Mg−1 CO2, CCS is not incentivized and the system has positive net emissions. The emissions per GGE and technology selection are not a strong function of liquid fuel demand. Costs increase rapidly at higher fuel demands because biomass needs to be established and transported from further away. All biorefineries are installed with pyrolysis with hydrogen purchase and flue gas CCS (PH+) above a sequestration credit of US$72 Mg−1 CO2 and without CCS (PH) below US$60 Mg−1 CO2. PH is the preferred technology because of its high fuel yield, meaning less biomass needs to be planted, leading to lower land-level and logistics costs. Interestingly, there is a transition region from US$60–72 Mg−1 CO2 (Supplementary Fig. 16) where the preferred technology depends on the location of the biorefinery because of two main factors. First, in states with a high electricity price, CCS is not incentivized because it is more cost effective to receive a credit for selling excess electricity than to make the capital investment to install CCS and receive sequestration credits; however, in states with cheap electricity, the opposite is true and additional CCS technology results in the biorefinery requiring electricity purchase. Second, the transportation and injection cost of CO2 depends on the proximity of the biorefinery to the injection site and its geological features. Supplementary Fig. 4 shows the technology portfolio as a function of sequestration credit.
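The two location-dependent factors in the transition region can be illustrated with a simple break-even calculation. The sketch below is not the MILP used in this study; it is a back-of-the-envelope comparison under stated assumptions, and every number and name in it (`break_even_credit`, the cost and electricity figures) is hypothetical and chosen only to show the direction of the effect.

```python
def break_even_credit(
    annualized_ccs_cost_usd: float,        # capital plus fixed operating cost per year
    ccs_electricity_mwh: float,            # electricity diverted to (or bought for) CCS per year
    electricity_price_usd_per_mwh: float,  # local electricity price
    co2_captured_mg: float,                # Mg of CO2 captured per year
    transport_injection_usd_per_mg: float, # CO2 transport and injection cost
) -> float:
    """Credit (US$ per Mg CO2) at which installing CCS breaks even.

    Electricity used by CCS is valued at the local price, because it is
    either no longer sold to the grid or has to be purchased.
    """
    electricity_cost = ccs_electricity_mwh * electricity_price_usd_per_mwh
    plant_cost_per_mg = (annualized_ccs_cost_usd + electricity_cost) / co2_captured_mg
    return plant_cost_per_mg + transport_injection_usd_per_mg


if __name__ == "__main__":
    # Hypothetical biorefinery; only the electricity price differs by location.
    common = dict(
        annualized_ccs_cost_usd=10e6,
        ccs_electricity_mwh=100_000,
        co2_captured_mg=400_000,
        transport_injection_usd_per_mg=10.0,
    )
    cheap = break_even_credit(electricity_price_usd_per_mwh=40.0, **common)
    costly = break_even_credit(electricity_price_usd_per_mwh=100.0, **common)
    print(f"cheap-electricity site: {cheap:.1f} US$/Mg")
    print(f"high-price site:        {costly:.1f} US$/Mg")
```

In this simplified picture, a higher local electricity price or a longer distance to the injection site raises the credit at which CCS becomes worthwhile, which is consistent with the location-dependent technology choice described above.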

The sequestration credit threshold of US$60 Mg−1 CO2 is similar to other studies on CCS incentives for biorefineries, although slightly higher because we account for upstream SC and landscape costs29. From a spatial perspective, at low demands and sequestration credits the largest driver of biorefinery location is biomass yield and concentration. The bulk of the fuel conversion capacity is installed in areas with the highest average biomass yield (that is, Iowa, northern Indiana) and biomass is established as close to the biorefineries as possible to limit transportation costs, with fertilization used extensively to increase crop yields. Targeted crop establishment decisions become important at the edge of the biomass establishment area where only the highest-yielding fields are selected. Higher fuel demands require more biomass and, in turn, less selective crop establishment; however, the quality of marginal land plays an important role. The poor-yielding and low-concentration land in northern Minnesota and Michigan is avoided even at high demand. Depots are preferred only at high demand when biomass is transported over larger distances. It is more economical in most cases to ship baled biomass directly to the biorefinery.

Valuing all CO2 emissions

As the results above show, a sufficiently high sequestration credit incentivizes CCS at cellulosic biorefineries. However, by only incentivizing the CO2 captured at the biorefineries, the current policy neglects to account for emissions from the upstream SC. This viewpoint and policy results in an economically optimal SC, but from a global perspective, may work against the primary goal of creating low-emission fuel by being unable to incentivize further emissions reductions at the system level. We consider a second case where we value CO2 captured with CCS and CO2 equivalents from SC activities equally. By considering the net emissions of the entire system, the spatial trade-offs between the landscape and SC have a larger impact on the optimal design, operation and technology portfolio.

Figure 4 shows the cost and GHG impact as a function of the sequestration credit applied to all CO2-equivalent emissions. Sources of emissions, such as transportation and the electricity use for CCS and process operations, are penalized, and emissions sinks, such as soil carbon sequestration and offset electricity emissions (due to excess electricity production), are credited.
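The difference between the two valuation cases can be sketched as a small accounting example. This is an illustrative simplification, not the objective function of the model: the class `EmissionFlows`, the function names and all flow values are hypothetical, and the list of sources and sinks is limited to those named in the text.

```python
from dataclasses import dataclass


@dataclass
class EmissionFlows:
    """Annual CO2-equivalent flows in Mg; all values used below are hypothetical."""
    captured_ccs: float           # sink: CO2 captured and stored at biorefineries
    soil_sequestration: float     # sink: soil carbon sequestered by the crops
    offset_electricity: float     # sink: grid emissions avoided by excess electricity
    transportation: float         # source: biomass and fuel transport
    purchased_electricity: float  # source: electricity for CCS and process operations
    other_sources: float          # source: e.g. fertilization, hydrogen purchase


def net_emissions(flows: EmissionFlows) -> float:
    """Net system emissions: sources minus sinks (negative means net removal)."""
    sources = flows.transportation + flows.purchased_electricity + flows.other_sources
    sinks = flows.captured_ccs + flows.soil_sequestration + flows.offset_electricity
    return sources - sinks


def ghg_cost_ccs_only(flows: EmissionFlows, credit: float) -> float:
    """First case: only CO2 captured with CCS earns the credit (negative cost)."""
    return -credit * flows.captured_ccs


def ghg_cost_all_valued(flows: EmissionFlows, credit: float) -> float:
    """Second case: every Mg of net CO2e is penalized or credited at the same rate."""
    return credit * net_emissions(flows)


if __name__ == "__main__":
    flows = EmissionFlows(100.0, 50.0, 20.0, 30.0, 40.0, 60.0)
    print(net_emissions(flows))
    print(ghg_cost_ccs_only(flows, 85.0), ghg_cost_all_valued(flows, 85.0))
```

Under the first case the upstream sources and sinks do not affect the monetary term at all, whereas under the second case a change in any flow shifts the cost, which is why landscape and SC decisions carry more weight when all CO2 is valued equally.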

Fig. 4: System performance when all CO2 is valued the same as the CCS sequestration credit.

a,b, The system emissions (a) and cost (b) per GGE as a function of the sequestration credit and liquid fuel demand for the case when all CO2 is valued equally. Markers indicate the base case (red) and representative (white) solutions for which plots of the SC configuration are available in Supplementary Figs. 17–21.

The credit needed to begin incentivizing the installation of CCS technology is notably higher when all CO2 is valued equally (US$79 Mg−1 CO2) than when the credit is only applied to captured CO2 (US$60 Mg−1 CO2). It is more favourable at lower credits to offset grid emissions with excess electricity produced from combusting byproducts than to make capital investments for CCS. The optimal technology mix is more sensitive to both the biofuel demand and sequestration credit. Only above a credit of US$100 Mg−1 CO2 is CCS installed at every biorefinery. Costs increase with increasing sequestration credit before CCS is incentivized because the system has net-positive emissions and those emissions are being penalized. At demands below 5 × 10^8 GGE year−1, the costs are fairly constant because the transportation distances are small and the landscape design is selective enough to result in net-zero emissions. Figure 5 shows the biofuel SC configuration for the base-case credit and demand (US$85 Mg−1 CO2, 8 × 10^8 GGE year−1 (9.7 × 10^7 GJ), red diamond marker). Supplementary Figs. 17–21 show configurations for other representative cases (white diamond markers), and Supplementary Fig. 10 provides a legend. Interestingly, the emissions at the base-case credit and demand are slightly higher in the second case than the first because the sequestration credit is insufficient to offset the penalty from indirect electricity GHG emissions and indirect emissions from hydrogen purchase except in locations where electricity is cheap and clean; therefore, CCS is not installed at all biorefineries.

Fig. 5: Base-case supply chain configuration.

SC configuration for demand = 8 × 10^8 GGE year−1, sequestration credit applied to all CO2 = US$85 Mg−1. Objectives: US$3.19 GGE−1, −2.64 g CO2 equivalent per GGE. Blue triangles represent pre-processing depots, green biorefinery icons correspond to PH+, purple biorefinery icons correspond to PH, orange fields are established and fully fertilized, green fields are established but not fertilized and black fields are not chosen for establishment. Credit: biorefinery icon, Flaticon.com.

The optimal technology portfolio is also very sensitive to fuel demand and sequestration credit (Fig. 6). At low credits, PH and PH+ are the primary technologies, but as the credit increases, the penalty to indirect emissions from purchasing hydrogen and the energy requirement for CCS favours gasification instead. Technologies G+ and G++ have a lower fuel yield but a lower net electricity usage and do not have indirect emissions from purchasing hydrogen. At high demands and high credits (≥9 × 10^8 GGE year−1, ≥US$110 Mg−1 CO2), the low fuel yield of gasification results in biorefineries with PH+ installed to meet demand without planting distant biomass with poor yield.

Fig. 6: Technology portfolio when all CO2 is valued the same as the CCS sequestration credit.

The optimal technology portfolio (using the abbreviations introduced in the model) as a function of sequestration credit and liquid fuel demand for the case when all CO2 is valued the same. Markers indicate the base case (red) and representative solutions (white) for which plots of the SC configuration are available in Supplementary Figs. 17–21.

From a spatial perspective, at low credits and low demands (≤US$85 Mg−1 CO2, ≤8 × 10^8 GGE year−1), biofuel yield and biomass concentration play a large role in determining the optimal biorefinery locations. However, SC design shifts slightly to favour biorefineries constructed in Indiana instead of Illinois due to high grid electricity emissions in Indiana. Because PH produces excess electricity, there is a credit for offsetting the emissions from the grid. When comparing gasification and pyrolysis, spatially explicit GHG emissions from electricity determine the preferred technology at each biorefinery. Furthermore, when both technologies are installed at different biorefineries, gasification is installed in areas of high biomass yield, which reduces the footprint for biomass establishment and, in turn, logistics costs. At all credits and demands, landscape design decisions depend on the soil carbon sequestration. At low credits, the biorefineries constructed in Minnesota when only CCS is credited (Supplementary Fig. 14) are not present when all CO2 is valued equally (Supplementary Fig. 20), largely because of poor soil carbon sequestration in the region. Additionally, fertilization decisions are more targeted at higher credits where there is a trade-off between indirect fertilization emissions and additional biomass yield.

To isolate the interactions between the inputs and decisions of the model, the results above are limited to a single definition of marginal land, historically abandoned land (HAL), described in the Methods section. To evaluate the generality of our findings with respect to the land definition, we include results with an alternative marginal land definition, low-capability land (LCL), in Supplementary Note 1 and Supplementary Figs. 25–31. We report similar findings to those for HAL.

Opportunity for further CO2 mitigation

The previous two sections have dealt with systems that result from financial incentives to reduce GHG emissions under different CO2 penalties and credits. In the following section, we study the mitigation potential of the system beyond what financial incentives alone achieve.

In equation (1) we introduce a constraint on the total system emissions, where β is the fuel demand, \(GT^\mathrm{TOT}\) is the net emissions of the system and ϵ is the upper bound imposed on emissions per unit of fuel. We solve the integrated optimization model to minimize cost for varying levels of ϵ. This approach, the epsilon-constraint method, is commonly used to study multi-objective systems.

$$\frac{{G{T}}^\mathrm{TOT}}{\beta }\le \epsilon$$
(1)
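The epsilon-constraint procedure can be sketched on a toy problem. In the sketch below, enumeration over a handful of candidate designs stands in for solving the integrated MILP, which is an assumption made purely for illustration; the `Design` records, their costs and emissions, and the function names are all hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Design(NamedTuple):
    name: str
    cost: float       # total annualized cost (arbitrary units)
    emissions: float  # net system emissions, playing the role of GT^TOT


def cheapest_feasible(
    designs: List[Design], demand: float, epsilon: float
) -> Optional[Design]:
    """Minimize cost subject to emissions / demand <= epsilon (equation (1))."""
    feasible = [d for d in designs if d.emissions / demand <= epsilon]
    if not feasible:
        return None  # corresponds to the infeasible region
    return min(feasible, key=lambda d: d.cost)


def epsilon_sweep(
    designs: List[Design], demand: float, epsilons: List[float]
) -> List[Tuple[float, Optional[Design]]]:
    """Re-solve the cost minimization for each emissions bound."""
    return [(eps, cheapest_feasible(designs, demand, eps)) for eps in epsilons]


if __name__ == "__main__":
    # Hypothetical candidate systems for a demand of 100 fuel units.
    candidates = [
        Design("A", cost=300.0, emissions=-100.0),
        Design("B", cost=330.0, emissions=-200.0),
        Design("C", cost=500.0, emissions=-300.0),
    ]
    for eps, best in epsilon_sweep(candidates, 100.0, [0.0, -1.5, -2.5, -4.0]):
        print(eps, best.name if best else "infeasible")
```

Tightening ϵ forces the selection of progressively more expensive designs until no design satisfies the bound, which mirrors the transition from the low-cost region to the high-cost and infeasible regions discussed for the emissions-constrained case.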

Figure 7 shows the cost as a function of biofuel demand and emissions constraint. We use the current policy, where only CO2 captured at the biorefinery is credited (US$85 Mg−1), with the system GHG emissions taken into account by the emissions constraint. The low-cost region on the right corresponds to the economically incentivized system where PH+ is selected. Once the emissions constraint is binding, emissions decrease linearly (enforced by the epsilon constraint) while costs increase only slightly. The region in the upper left corner with high demands and restrictive emissions constraints is infeasible. High demand leads to unfavourable lands located far from biorefineries being planted and harvested. There is no combination of landscape and SC design that can meet the emissions constraint. The solid red region bordering the infeasible region is an area of high cost where the marginal price of reducing GHG emissions increases rapidly. The maximum cost in the region reaches US$6.34 GGE−1 (US$52.27 GJ−1).

Fig. 7: System performance under an emissions constraint.

The cost per GGE as a function of liquid fuel demand and the emissions constraint. Lines and corresponding labels are guides that indicate the boundary between the preferred technology portfolios in each region of the figure. Markers indicate representative solutions available in Supplementary Figs. 22–24.

Large reductions in GHG emissions are still possible for moderate cost increases. At the base-case demand, system GHG emissions can be reduced by a factor of 2 relative to the economically favourable solution for only a 10% increase in cost. This is achieved through careful selection of bioenergy lands and design and operation of the SC.

Figure 7 also shows the technology portfolio as a function of the emissions constraint and biofuel demand. As the emissions constraint becomes more restrictive (moving right to left on the plot), technologies with fewer indirect emissions are installed with CCS to meet the emissions constraint (that is, G+, G++ and P+). However, there is a trade-off whereby gasification, having low indirect emissions but a lower fuel yield than pyrolysis, cannot meet high fuel demands at restrictive emissions constraints, leading to the infeasible region. Supplementary Figs. 22–24 show SC configurations for representative solutions indicated by the white diamond markers in Fig. 7.

From a spatial perspective, landscape design becomes even more important at very restrictive emissions constraints. As seen in Supplementary Fig. 22, some fields close to the biorefineries are not chosen for crop establishment despite being economically preferred (due to low transportation and handling costs) because of poor soil carbon sequestration. In addition, fewer fields are fertilized due to the trade-off between indirect fertilizer emissions and increased productivity. As in the previous cases, the preferred biorefinery technologies are influenced by the local electricity price and emissions, and the bulk of the capacity (the largest biorefineries) is located near the most productive and highest-concentration bioenergy lands in northern Missouri, southern Michigan and northern Ohio/Indiana.

Discussion

In this paper we used an integrated field-to-product optimization model to analyse interactions between the bioenergy landscape, the SC and the biorefinery design, including CCS, in the US Midwest. Importantly, we used realistic biomass yield and soil carbon sequestration data, studied technologically mature liquid fuel conversion routes and captured the spatially explicit interactions at a regional scale. We find that the current CO2 sequestration credit of US$85 Mg−1 incentivizes PH+, leading to biofuels with net-negative GHG emissions. However, we also find that the GHG emissions, cost and technology portfolios, when valuing avoided emissions from the SC equally to CO2 captured with CCS, are much more sensitive to the credit and biofuel demand. By increasing the sequestration credit to above US$100 Mg−1, the net emissions of the optimal solution decrease substantially. The spatially explicit price and GHG emissions from electricity influence the optimal technology installed at each biorefinery, and G++ is preferred at higher credits due to its lower electricity use and fewer indirect emissions. The location, productivity and soil carbon sequestration of bioenergy lands also have a large impact on the location of biorefineries. Landscape design becomes increasingly important at high credits and at restrictive GHG emissions constraints when additional biomass is planted to meet the fuel demand using technologies with lower fuel yields (that is, G++). The high-resolution modelling captures the system-wide trade-offs of CO2 mitigation strategies and incentive structures and highlights the sensitivity of cellulosic biofuel landscapes, SCs and biorefinery technology portfolios to those incentives.

Although the system studied is limited to the US Midwest and a single definition of available bioenergy lands, the detailed high-resolution data and spatial analysis reveal field-to-product insights for other systems producing biofuel from dedicated bioenergy crops. For example, using a different definition of bioenergy lands (Supplementary Note 1) or considering alternative bioenergy crops could change the amount and distribution of biomass, but biorefineries should still be constructed in areas of high biomass concentration while respecting the local price and CO2 impact of electricity to determine if CCS is economical. Additionally, motivating stakeholders to make landscape and SC design decisions that complement the installed technology portfolio can disproportionately reduce system-wide GHG emissions at reasonable cost.

At a larger scale, integrated assessment models provide estimates of the scale and carbon mitigation potential for bioenergy as a whole in the context of the total energy system but usually do not include detailed insights into where and how dedicated bioenergy systems would be designed and operated and do not identify the high-resolution trade-offs between the different parts of the system. By contrast, the model and results presented in this paper are specific to the bioenergy system and take a field-to-product view, revealing detailed design decisions and trade-offs. The granular interactions between the availability and quality of bioenergy lands and the crop productivity for a variety of bioenergy crops, not to mention the many variations of fuel conversion technologies, make it highly difficult to perform optimization-based design at a regional scale for every combination of crop and available land simultaneously. Therefore, this analysis of cellulosic biofuel in the US Midwest is not intended to be a decision support tool but rather to identify important trade-offs that must be considered when designing distributed energy systems at large scale and to quantify the impacts of these trade-offs for the specific system we study. Finally, the motivation to produce biofuels from dedicated bioenergy crops is largely driven by the desire to mitigate GHG emissions by replacing traditional fossil fuels. Therefore, system design and analysis must consider GHG emissions as part of the objective, and because biofuels are unlikely to be an economically attractive way to produce liquid fuels, the approach to balancing cost and GHG emissions is extremely important. To this end, and as shown in the results above, incentives that motivate stakeholders to reduce GHG emissions from SC activities in addition to installing CCS at biorefineries can result in biofuels with a lower carbon footprint than the systems incentivized by the status quo.

Methods

Field to product modelling and analysis approach

We use a mixed-integer linear programming (MILP) model that minimizes the total annualized cost of the system20,29. The constraints defining the SC model follow the ones presented in our previous work20 with key modifications described in detail below. For clarity of presentation we categorize decision variables into three levels, but the model is formulated and solved as one large integrated MILP so that decisions are considered simultaneously. The regional study area includes the eight states in the US Midwest: Ohio, Michigan, Indiana, Illinois, Wisconsin, Missouri, Iowa and Minnesota. This region is the topic of interest for several analyses of US biofuel production, which showed that switchgrass and native grasses grown on marginal land in the region can support large amounts of carbon-negative biofuel3,27.

Landscape-level decision variables are primarily related to crop establishment and operations at the biomass fields; that is, choosing in which fields to plant biomass, how much to fertilize each field, and which fields to leave uncultivated. Harvest planning decisions for a representative year are also considered. The large study area and high-resolution data introduce considerable computational complexity. We use a linear formulation to model landscape design and include a spatial aggregation procedure that reduces the number of transportation arcs in the optimization model. The formulation details are available below.

The SC network design decisions, related to the structure and operation of the SC, include the number, location and capacity of biorefineries and depots in addition to inventory, transportation and production planning at depots for a representative year. We consider economies of scale for biorefineries and depots using a piecewise linear approximation of the capital costs that follow a nonlinear scaling rule17.
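The piecewise linear treatment of economies of scale can be illustrated with a minimal sketch. The power-law form, the exponent of 0.6, the reference cost and capacity, and the breakpoints below are illustrative assumptions and are not values taken from this study.

```python
from bisect import bisect_right


def power_law_cost(capacity, ref_cost=100.0, ref_capacity=2000.0, exponent=0.6):
    """Hypothetical nonlinear scaling rule: cost grows sub-linearly with capacity."""
    return ref_cost * (capacity / ref_capacity) ** exponent


def build_segments(breakpoints):
    """Evaluate the nonlinear curve at each (assumed) capacity breakpoint."""
    return [(x, power_law_cost(x)) for x in breakpoints]


def piecewise_cost(capacity, segments):
    """Linearly interpolate the capital cost between adjacent breakpoints."""
    xs = [x for x, _ in segments]
    if not xs[0] <= capacity <= xs[-1]:
        raise ValueError("capacity outside the approximated range")
    # Index of the segment containing `capacity` (last segment at the upper end).
    k = min(bisect_right(xs, capacity) - 1, len(xs) - 2)
    (x0, y0), (x1, y1) = segments[k], segments[k + 1]
    return y0 + (y1 - y0) * (capacity - x0) / (x1 - x0)


segments = build_segments([2000.0, 4000.0, 6000.0, 8000.0])
approx = piecewise_cost(5000.0, segments)
exact = power_law_cost(5000.0)
print(f"approx={approx:.2f} exact={exact:.2f}")
```

Because the assumed curve is concave, each chord lies on or below it, so the interpolated cost never exceeds the nonlinear cost. Inside a MILP the same interpolation would typically be expressed with segment-selection binaries or special ordered sets rather than evaluated directly.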

The biorefinery and CCS model decision variables determine the optimal liquid fuel conversion technology and CCS to install at each biorefinery selected in the SC network design level. Furthermore, production planning decisions are considered for a representative year, taking into account the cost of CCS at the biorefineries.

Important model constraints are presented below. We use uppercase italic Latin letters for variables, bold uppercase Latin letters for sets, lowercase italic Latin letters for indices and lowercase Greek letters for model parameters. Important sets include materials \(i\in {{{\bf{I}}}}\), potential harvesting sites \(j\in {{{\bf{J}}}}\), potential depots \(k\in {{{\bf{K}}}}\), potential biorefineries \(l\in {{{\bf{L}}}}\), technologies \(m\in {{{\bf{M}}}}\), transportation modes \(q\in {{{\bf{Q}}}}\) and time periods \(t\in {{{\bf{T}}}}\) during a representative year (we consider \(| {{{\bf{T}}}}| =4\), corresponding to the four seasons). Figure 2 shows the network structure and important variables for the biofuel SC. The objective function is to minimize the total annualized cost. GHG emissions are included in the objective (\({C}^\mathrm{GHG}\)) with a sequestration credit applied to the CO2 captured at the biorefinery. Model statistics are given in Supplementary Note 2.

$$\mathrm{OBJ}={C}^\mathrm{CAP}+{C}^\mathrm{LND}+{C}^\mathrm{INV}+{C}^\mathrm{TRA}+{C}^\mathrm{PRD}+{C}^\mathrm{GHG}+{C}^\mathrm{ELC}$$
(2)

The objective function in equation (2) includes annualized capital costs \({C}^\mathrm{CAP}\), the landscape-level costs \({C}^\mathrm{LND}\) (the annualized cost of crop establishment on bioenergy lands and the cost of harvesting and handling biomass), inventory cost \({C}^\mathrm{INV}\), transportation cost \({C}^\mathrm{TRA}\), the operational cost to produce intermediates and liquid fuels \({C}^\mathrm{PRD}\), the costs and credits associated with GHG emissions \({C}^\mathrm{GHG}\) and the net cost \({C}^\mathrm{ELC}\) of electricity that is purchased from or sold to the grid.

System-wide GHG emissions (\(G{T}^\mathrm{TOT}\)) are calculated similarly (equation (3)), where a clear distinction is made between two types of mitigated CO2: (1) CO2 captured by CCS technology at the biorefinery (\(G{T}^\mathrm{CCS}\)) and (2) all direct and indirect emissions from logistic and processing activities, namely transportation (\(G{T}^\mathrm{TRA}\)), soil carbon sequestration and harvesting and handling (\(G{T}^\mathrm{LND}\)), direct and indirect emissions at the biorefinery (\(G{T}^\mathrm{PRD}\)) and electricity usage (\(G{T}^\mathrm{ELC}\)).

$$G{T}^\mathrm{TOT}=(G{T}^\mathrm{TRA}+G{T}^\mathrm{LND}+G{T}^\mathrm{PRD}+G{T}^\mathrm{ELC})-G{T}^\mathrm{CCS}$$
(3)

The objective value contribution from GHG emissions is given by equation (4).

$${C}^\mathrm{GHG}={\pi }^\mathrm{SCC}(G{T}^\mathrm{TRA}+G{T}^\mathrm{LND}+G{T}^\mathrm{PRD}+G{T}^\mathrm{ELC})-{\pi }^\mathrm{CCS}G{T}^\mathrm{CCS}$$
(4)

where \({\pi }^\mathrm{SCC}\) is the GHG penalty applied to the net SC emissions and \({\pi }^\mathrm{CCS}\) is the tax credit for CO2 captured with CCS. Because one of the main motivations for cellulosic biofuel is its potential to be a low-carbon or carbon-negative fuel, in our analysis we also consider the implication of treating all mitigated CO2-equivalent emissions equally. That is, we treat avoided emissions the same as captured CO2 in our economic valuation, something that is not incentivized by the current policy in the USA (which corresponds to \({\pi }^\mathrm{SCC}=0\)). However, we show that the system design is greatly impacted by the CO2 valuation.
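Equations (3) and (4) can be evaluated directly, as in the minimal sketch below. All emission quantities and the price of 85 are placeholder values chosen for illustration (negative entries stand for net sequestration or offsets); they are not results or parameters from this study.

```python
def total_emissions(gt_tra, gt_lnd, gt_prd, gt_elc, gt_ccs):
    """Equation (3): net system emissions, with captured CO2 subtracted."""
    return (gt_tra + gt_lnd + gt_prd + gt_elc) - gt_ccs


def ghg_cost(gt_tra, gt_lnd, gt_prd, gt_elc, gt_ccs, pi_scc, pi_ccs):
    """Equation (4): penalty on SC emissions minus credit on captured CO2."""
    return pi_scc * (gt_tra + gt_lnd + gt_prd + gt_elc) - pi_ccs * gt_ccs


# Illustrative flows only (Mg CO2e).
flows = dict(gt_tra=10.0, gt_lnd=-40.0, gt_prd=5.0, gt_elc=-2.0, gt_ccs=60.0)

net = total_emissions(**flows)
# Status quo: only captured CO2 is valued (pi_scc = 0).
status_quo = ghg_cost(**flows, pi_scc=0.0, pi_ccs=85.0)
# Equal valuation: avoided and captured CO2 carry the same price.
equal_value = ghg_cost(**flows, pi_scc=85.0, pi_ccs=85.0)
print(net, status_quo, equal_value)
```

When the two prices are equal, the cost term reduces to the common price multiplied by the net emissions of equation (3), which is what treating all mitigated CO2 equally means in the objective.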

Bioenergy landscape

The land available to plant dedicated bioenergy crops depends on the definition used to identify it. Bioenergy land is land that is economically, socially or environmentally more suitable for perennial bioenergy crops than for annual food crops31. In this study, we assume that historically abandoned croplands are available to establish bioenergy crops. We define abandoned lands as lands from which food crops were harvested for at least five years, that were subsequently taken out of food crop production for at least five years and that were not converted to urban or water use32,33. Refs. 34,35,36 provided the underlying land availability data, including the data used to generate maps of biomass land distribution and availability. This definition is more conservative than other definitions such as the commonly used set of Land Capability Classifications V–VIII, which may not explicitly account for the current land use37.

The biogeochemical process model SALUS was used to simulate the growth of native bioenergy grasses on the lands identified as historically abandoned38,39,40. Biomass growth was simulated at a daily time step under two fertilization conditions (0 and 50 kg N ha−1) over a period of 40 years. The primary outputs of the simulations were average biomass yield (Mg ha−1 year−1) and soil carbon sequestration (Mg ha−1 year−1). Soil carbon sequestration represents the amount of CO2 equivalents mineralized and stored long term in the soil and is one of the primary reasons perennial cellulosic crops are preferred over annual cellulosic crops such as corn stover or sorghum3. Both biomass yield and soil carbon sequestration depend on the soil quality and local weather at each individual field, which leads to an uneven spatial distribution of biomass across the landscape.

The study area contains a total area of roughly \(3\times {10}^{6}\) ha. The maximum amount of harvestable biomass (if every field were planted and fully fertilized) is \(1.48\times {10}^{7}\) Mg of biomass per year. Most of the biomass is concentrated in the south west and the south east of the study area. Supplementary Methods and Supplementary Figs. 1–3 provide maps and detail related to the distribution of bioenergy land, biomass yield and soil carbon sequestration. Raw biomass data are also available for download as a part of the data that accompany this study41. The Supplementary Information also includes an additional analysis involving an alternative definition of marginal land, LCL (Supplementary Note 1). LCLs are non-forested, non-wetland, non-cropland fields/soils that are classified by the US Department of Agriculture as non-arable (classes V–VIII; see also ref. 3). The LCL present in the study area is also roughly \(3\times {10}^{6}\) ha, with \(1.55\times {10}^{7}\) Mg of available biomass if all land were planted and fully fertilized. Biomass grown on LCL is more localized than biomass grown on historically abandoned land, with the majority concentrated in the western portion of the study area in southern Iowa and northern Missouri and a lower amount in southern Michigan.

Spatial aggregation procedure

The study area includes over 410,000 individual land parcels of historically abandoned land for the potential establishment of dedicated bioenergy crops. This large number of SC nodes can make the MILP model computationally intractable if we were to consider landscape design decisions and/or transportation arcs for each field. In the biogeochemical crop model SALUS, fields with the same soil type that are within the same local weather grid result in the same model outputs of yield and soil carbon sequestration40. Therefore, fields with the same soil type that are within the same ‘harvesting site’ grid cell \(j\in {{{\bf{J}}}}\) are aggregated so that an index \(f\in {{{\bf{F}}}}\) refers to all the fields of the same soil type that are within a harvesting site grid cell, and variable \({E}_{i,f}\) determines the fraction of the total area to establish with bioenergy crops. The aggregation procedure leads to a set of 56,698 fields \(f\in {{{\bf{F}}}}\).
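The aggregation step amounts to grouping parcels by harvesting-site grid cell and soil type and summing their areas. A minimal sketch follows; the record layout (`site`, `soil`, `area_ha`) and the sample parcels are hypothetical and do not reflect the format of the study's data.

```python
from collections import defaultdict


def aggregate_fields(parcels):
    """Group parcels by (harvesting site, soil type) and sum their areas.

    Each resulting key plays the role of one aggregated field f.
    """
    area = defaultdict(float)
    for parcel in parcels:
        area[(parcel["site"], parcel["soil"])] += parcel["area_ha"]
    return dict(area)


# Hypothetical parcels: two share a site and a soil type and are merged.
parcels = [
    {"site": "j1", "soil": "loam", "area_ha": 3.0},
    {"site": "j1", "soil": "loam", "area_ha": 2.5},
    {"site": "j1", "soil": "clay", "area_ha": 1.0},
    {"site": "j2", "soil": "loam", "area_ha": 4.0},
]
fields = aggregate_fields(parcels)
print(len(parcels), "parcels ->", len(fields), "aggregated fields")
```

Since parcels in the same group share identical simulated yield and soil carbon responses, deciding a planted fraction for the group loses no information relative to deciding it parcel by parcel.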

Landscape design formulation

The added computational complexity of a large study area and the additional variables and constraints from CCS technology options require a modified formulation for the landscape design model. We use a formulation that assumes a linear yield and soil carbon sequestration response to nitrogen fertilization. The formulation maintains a high spatial resolution for landscape design but no longer requires the binary variables and special ordered set of type 2 (SOS2) variables presented previously20,42.

We introduce a set of fields \(f\in {{{\bf{F}}}}\) and a continuous variable \({E}_{i,f}\in [0,1]\) that represents the fraction of field f to establish with crop i. We also introduce continuous variable \({R}_{i,f}\in [0,1]\), which represents the fraction of the maximum nitrogen fertilization to apply to field f planted with crop i. We consider a maximum fertilization amount of 50 kg N ha−1. The yield and soil carbon sequestration of a crop i planted on a specific field f are calculated as:

$${Y}_{i,f,t}={E}_{i,f}{\alpha }_{i,f,t}\!^{1}+{E}_{i,f}{R}_{i,f}{\varsigma }_{i,f,t}\!^{1}$$
(5)
$$G{T}_{i,f,t}\!^\mathrm{SOC}={E}_{i,f}{\alpha }_{i,f,t}\!^{2}+{E}_{i,f}{R}_{i,f}{\varsigma }_{i,f,t}\!^{2}$$
(6)

where \({\alpha }_{i,f,t}\!^{1}\) and \({\alpha }_{i,f,t}\!^{2}\) are the yield and soil carbon sequestration (Mg) at zero fertilization and \({\varsigma }_{i,f,t}\!^{1}\) and \({\varsigma }_{i,f,t}\!^{2}\) are the excess yield and soil carbon sequestration (Mg) from full fertilization, respectively. This formulation introduces a bilinear term \({E}_{i,f}{R}_{i,f}\), which can be replaced with an auxiliary variable \({D}_{i,f}={E}_{i,f}{R}_{i,f}\) that is bounded above by \({E}_{i,f}\).

$$0\le {D}_{i,f}\le {E}_{i,f}$$
(7)
$${Y}_{i,f,t}={E}_{i,f}{\alpha }_{i,f,t}\!^{1}+{D}_{i,f}{\varsigma }_{i,f,t}\!^{1}$$
(8)
$$G{T}_{i,f,t}\!^\mathrm{SOC}={E}_{i,f}{\alpha }_{i,f,t}\!^{2}+{D}_{i,f}{\varsigma }_{i,f,t}\!^{2}$$
(9)

The amount of field area planted, \({A}_{i,f}\), and the amount of fertilization applied, \({F}_{i,f}\), are given in equations (10) and (11), respectively, where \(\omega\) is the maximum amount of fertilization (kg N ha−1) and \({\sigma }_{f}\) is the available land area for field f.

$${A}_{i,f}={E}_{i,f}{\sigma }_{f}$$
(10)
$${F}_{i,f}={D}_{i,f}\omega {\sigma }_{f}$$
(11)
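The substitution in equations (5)–(11) can be checked numerically. The sketch below is an illustration under assumed numbers: the field parameters are hypothetical, and the parameter name `varsigma` is used for the full-fertilization increment to avoid confusion with the field area.

```python
def yield_bilinear(e, r, alpha, varsigma):
    """Equation (5): planted fraction E and fertilization fraction R multiply."""
    return e * alpha + e * r * varsigma


def yield_linear(e, d, alpha, varsigma):
    """Equation (8): the product E*R is replaced by D, with 0 <= D <= E."""
    if not 0.0 <= d <= e <= 1.0:
        raise ValueError("requires 0 <= D <= E <= 1 (equation (7))")
    return e * alpha + d * varsigma


def fertilization_fraction(e, d):
    """Recover R = D / E after solving; R is irrelevant when nothing is planted."""
    return d / e if e > 0.0 else 0.0


def planted_area_and_fertilizer(e, d, field_area_ha, max_n_rate=50.0):
    """Equations (10) and (11): area planted (ha) and nitrogen applied (kg N)."""
    return e * field_area_ha, d * max_n_rate * field_area_ha


# Hypothetical field: 80% planted, half of the maximum fertilization rate.
e, r = 0.8, 0.5
d = e * r
print(yield_bilinear(e, r, alpha=100.0, varsigma=20.0))
print(yield_linear(e, d, alpha=100.0, varsigma=20.0))
print(planted_area_and_fertilizer(e, d, field_area_ha=10.0))
```

The reformulation is exact rather than a relaxation: every pair (E, D) with 0 ≤ D ≤ E corresponds to a valid R = D/E in [0, 1] whenever E > 0, and E = 0 forces D = 0.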

The landscape cost (\({C}^\mathrm{LND}\)) and GHG impact (\(G{T}^\mathrm{LND}\)) are given by equations (12) and (13).

$${C}^\mathrm{LND}=\mathop{\sum}\limits_{i,j,t}{\lambda }_{i}{H}_{i,j,t}+\mathop{\sum}\limits_{i,f}\rho {F}_{i,f}+\mathop{\sum}\limits_{i,f}{A}_{i,f}{\phi }_{i}$$
(12)
$$G{T}^\mathrm{LND}=\mathop{\sum}\limits_{i,j,t}{\gamma }_{i}\!^\mathrm{MG}{H}_{i,j,t}+\mathop{\sum}\limits_{i,f}{\gamma }^\mathrm{N}{F}_{i,f}+\mathop{\sum}\limits_{i,f}{\gamma }_{i}\!^\mathrm{HA}{A}_{i,f}+\mathop{\sum}\limits_{i,f}G{T}_{i,f}\!^\mathrm{SOC}$$
(13)

where \({\gamma }_{i}\!^\mathrm{MG}\) are the per Mg emissions from harvesting biomass (Mg CO2e Mg−1 biomass), \({\gamma }^\mathrm{N}\) are the indirect emissions from fertilization (Mg CO2e kg−1 N) and \({\gamma }_{i}\!^\mathrm{HA}\) are the annualized fixed emissions (Mg CO2e ha−1) from establishing and harvesting biomass. Finally, the amount of harvested biomass \({H}_{i,j,t}\) at harvesting site \(j\in {{{\bf{J}}}}\) is bounded by equation (14).

$${H}_{i,j,t}\le \mathop{\sum}\limits_{f\in {{{{\bf{G}}}}}_{{{{\bf{j}}}}}}{Y}_{i,f,t}$$
(14)

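where the set \({{{\bf{G}}}}\subseteq {{{\bf{F}}}}\times {{{\bf{J}}}}\) is a two-dimensional set defining the membership of fields f in harvesting sites j, and \({{{{\bf{G}}}}}_{j}\) denotes the fields that belong to site j.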
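The right-hand side of equation (14) is a sum of field-level yields over the fields assigned to each harvesting site. A minimal sketch, assuming each aggregated field belongs to exactly one site and using hypothetical identifiers and yields:

```python
def harvest_upper_bounds(field_yield, site_of_field):
    """Sum field yields per (crop, site, period) to bound the harvest H."""
    bounds = {}
    for (crop, field, period), y in field_yield.items():
        key = (crop, site_of_field[field], period)
        bounds[key] = bounds.get(key, 0.0) + y
    return bounds


# Hypothetical membership of fields in harvesting sites.
site_of_field = {"f1": "j1", "f2": "j1", "f3": "j2"}
# Hypothetical yields (Mg) keyed by (crop, field, period).
field_yield = {
    ("grass", "f1", "autumn"): 120.0,
    ("grass", "f2", "autumn"): 80.0,
    ("grass", "f3", "autumn"): 50.0,
}
bounds = harvest_upper_bounds(field_yield, site_of_field)
print(bounds)
```

In the optimization model the yields are themselves functions of the planting and fertilization variables, so the bound is a linear constraint rather than a precomputed number.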

Biofuel supply chain network

We consider 30 potential biorefinery locations, which are screened from the US Environmental Protection Agency (EPA) Re-Powering dataset43. We also consider 800 potential pre-processing depots that can be installed near existing rail infrastructure to enable the rail transportation of densified intermediates44. Once harvested, biomass can be transported via truck to a pre-processing depot or directly to a biorefinery. From a pre-processing depot, densified biomass pellets can be shipped via truck or rail to a biorefinery (Fig. 2).

We introduce binary variable \({U}_{l,m}\), which equals 1 if technology m is installed at biorefinery l and 0 otherwise. We also constrain the technology selection so that only one technology can be installed at a biorefinery (equation (15)). Similarly, drying and densification is installed at pre-processing depots, enforced by equation (16) using the binary variable \({U}_{k,m}\) (ref. 45).

$$\mathop{\sum}\limits_{m}{U}_{l,m}\le 1$$
(15)
$$\mathop{\sum}\limits_{m}{U}_{k,m}\le 1$$
(16)

We assume that the refineries must have access to rail and road transportation within 5 km and must have enough potential biomass within 300 km to support a minimum capacity of 2,000 Mg day−1 of dry biomass17. We also assume the potential locations must have a land area of 500 acres (ref. 21). Potential sites within 5 km of one another were treated as a single biorefinery, with the location taken as the mean longitude and latitude of those sites. The locations of the resulting 30 potential biorefineries can be seen in Supplementary Fig. 8.

Depots were screened similarly by taking existing rail stations and filtering them to 800 potential sites ensuring there is at least one depot per county and that depots have enough biomass within 50 km to support a minimum capacity depot17. Supplementary Fig. 8 also shows the 800 potential depot locations that were used in our analysis.

The liquid fuel conversion technologies and associated CCS options each require heat and electricity. For certain technology options with a high level of CCS, the utility requirement cannot be met completely by combusting byproducts, and the biorefinery must purchase grid electricity. The biorefinery location determines the local price and GHG impact of that electricity. We include the spatially explicit price and CO2 impact of the grid electricity46 (details in Supplementary Fig. 9).

If CCS technologies are installed at a biorefinery, the captured CO2 must be fed into a pipeline and transported to an injection site for long-term storage. The location and injection costs for each potential sequestration site are obtained from the US National Energy Technology Laboratory (NETL)47. The cost of CO2 transportation and injection depends on the proximity of the capture site to the injection site and on the well geology accounted for in the NETL model. We pre-calculate the minimum cost of injection for each potential biorefinery location. First, injection sites are screened and removed if the capacity is too low to support a maximum capacity biorefinery with CCS. Next, we calculate a distance matrix \({D}_{l,e}\) between the biorefineries \(l\in {{{\bf{L}}}}\) and the injection sites \(e\in {{{\bf{E}}}}\) and use the distance matrix to calculate the cost of injection for every biorefinery (equation (17)). The CO2 injection cost \({\kappa }_{l}\!^\mathrm{P}\) (US$ Mg−1) for each biorefinery is taken as the minimum pre-calculated cost.

$${\kappa }_{l}\!^\mathrm{P}=\mathop{\min }\limits_{e\in {{{\bf{E}}}}}\left({\kappa }_{e}\!^\mathrm{V}{D}_{l,e}+{\kappa }_{e}\!^\mathrm{F}\right)$$
(17)

where \({\kappa }_{l}\!^\mathrm{P}\) is the pipeline transportation cost, \({\kappa }_{e}\!^\mathrm{V}\) is the variable portion (US$ (Mg km)−1) and \({\kappa }_{e}\!^\mathrm{F}\) is the fixed portion (US$ Mg−1), which depends on the geology of the injection well47. Injection sites and the corresponding biorefineries can be seen in Supplementary Fig. 8.
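The pre-calculation in equation (17) can be sketched as a screening step followed by a minimum over the remaining sites. The great-circle (haversine) distance, the field names and all numbers below are assumptions made for illustration; the source does not state how distances were computed.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (an assumed stand-in for the distance matrix)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def min_injection_cost(refinery, sites, min_capacity):
    """Equation (17): cheapest variable-plus-fixed cost over eligible sites."""
    eligible = [s for s in sites if s["capacity"] >= min_capacity]
    if not eligible:
        raise ValueError("no injection site passes the capacity screen")
    return min(
        s["kappa_v"] * haversine_km(refinery["lat"], refinery["lon"], s["lat"], s["lon"])
        + s["kappa_f"]
        for s in eligible
    )


# Hypothetical refinery and injection sites (costs in US$ per Mg, capacity in arbitrary units).
refinery = {"lat": 40.0, "lon": -90.0}
sites = [
    {"lat": 40.0, "lon": -90.0, "kappa_v": 0.01, "kappa_f": 12.0, "capacity": 5.0},
    {"lat": 40.0, "lon": -90.0, "kappa_v": 0.01, "kappa_f": 8.0, "capacity": 1.0},
    {"lat": 41.0, "lon": -90.0, "kappa_v": 0.01, "kappa_f": 10.0, "capacity": 5.0},
]
cost = min_injection_cost(refinery, sites, min_capacity=2.0)
print(round(cost, 2))
```

In this example the cheapest site is screened out for insufficient capacity, and a more distant site with a lower fixed cost beats the co-located one, which illustrates why both cost components and the screen matter.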

Biorefinery with carbon capture and storage

Microbial conversion, pyrolysis and gasification are considered as potential conversion technologies in this study. For liquid fuel production via microbial conversion, the biomass is first pretreated to access the sugars, which are hydrolysed before fermentation into ethanol. The remaining unconverted oligomeric sugars and lignin are combusted to meet the energy demands of the biorefinery with the excess sold to the grid21. Pyrolysis consists of converting the biomass into a combination of bio-oil and non-condensable gases. The non-condensable gases are typically combusted to meet the energy demands of the biorefinery, whereas the bio-oil is upgraded into gasoline and diesel fuel. This upgrading requires hydrogen, which could be purchased, or a portion of the bio-oil can be converted into hydrogen to upgrade the remaining bio-oil22. In gasification, the biomass is vapourized into syngas, which can be converted into liquid fuels using Fischer–Tropsch synthesis23. The yields, costs and operating conditions for all technologies are taken from techno-economic analysis performed by researchers at the National Renewable Energy Laboratory21,22,23.

Each of these conversion technologies produces a different number of distinct CO2 streams. Microbial conversion produces nearly pure (>95 wt%) CO2 from fermentation, 73 wt% CO2 in biogas from anaerobic digestion of wastewater and 20 wt% CO2 in flue gas from combustion of waste lignin and residues. Gasification also produces a >95 wt% CO2 stream from the cleaning of syngas before Fischer–Tropsch synthesis, and both gasification and pyrolysis produce CO2 in flue gas (32 wt% for pyrolysis, 15 wt% for gasification)29. Depending on the technology portfolio, the utility requirements can be met by combusting byproducts produced at the biorefinery (that is, lignin from fermentation residue, non-condensable gases from pyrolysis, fuel-gas from gasification) with the excess energy sold to the grid, offsetting grid emissions. The capital and operating costs, relative fuel yield and relative CO2 flows for each combination of conversion technology and carbon capture are shown in Fig. 1. Additional details on the technologies considered can be found in Supplementary Table 1 and Supplementary Figs. 5–7.

Selected model constraints related to the biorefinery are presented in equations (18)–(23). We explicitly model the amount of CO2 available for capture at the refinery and the amount of electricity produced or consumed.

$$\begin{array}{l}{P}_{{i}^{{\prime} },l,m,t}\!^\mathrm{C-CB}=\mathop{\sum}\limits_{i\in {{{{\bf{I}}}}}^{{{{\rm{F}}}}}}{\eta }_{i,{i}^{{\prime} },m}{G}_{i,l,m,t}\!^\mathrm{F-CB}\\+\mathop{\sum}\limits_{i\in {{{{\bf{I}}}}}^{{{{\rm{ID}}}}}}{\eta }_{i,{i}^{{\prime} },m}{G}_{i,l,m,t}\!^\mathrm{ID-CB}\quad \forall t\in {{{\bf{T}}}},m\in {{{{\bf{M}}}}}^{{{{\rm{CB}}}}},{i}^{{\prime} }\in \{\mathrm{C{O}}_{2}\}\end{array}$$
(18)

where \({P}_{{i}^{{\prime} },l,m,t}\!^\mathrm{C-CB}\) is the amount of CO2 available for sequestration produced by technology m at biorefinery l in time period t, \({\eta }_{i,{i}^{{\prime} },m}\) is the conversion coefficient for technology m producing CO2 available for sequestration per tonne of biomass consumed, and \({G}_{i,l,m,t}\!^\mathrm{F-CB}\) and \({G}_{i,l,m,t}\!^\mathrm{ID-CB}\) are the consumption of feedstocks and intermediates by technology m, respectively. The amount of CO2 that is actually sequestered, \({C}_{l,m,t}\!^\mathrm{SEQ}\), is bounded by the amount that is available, and the balance is assumed to be vented (equation (19)). We assume the vented CO2 (if any) is biogenic CO2 and does not contribute to the GHG balance of the system.

$${C}_{l,m,t}\!^\mathrm{SEQ}\le {P}_{\mathrm{C{O}}_{2},l,m,t}\!^\mathrm{C-CB}\quad \forall l\in {{{\bf{L}}}},m\in {{{{\bf{M}}}}}^{{{{\rm{CB}}}}},t\in {{{\bf{T}}}}$$
(19)
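Equations (18) and (19) can be illustrated with a small calculation for a single biorefinery, technology and time period. The conversion coefficients and consumption values below are hypothetical, and the clamp in `split_co2` only mimics the feasibility bound; in the model the sequestered amount is a decision variable chosen by the optimizer.

```python
def co2_available(consumption, eta):
    """Equation (18): capturable CO2 as a coefficient-weighted sum of inputs."""
    return sum(eta[material] * amount for material, amount in consumption.items())


def split_co2(available, requested_capture):
    """Equation (19): sequestered CO2 cannot exceed what is available.

    Returns (sequestered, vented); the vented balance is treated as biogenic.
    """
    sequestered = min(max(requested_capture, 0.0), available)
    return sequestered, available - sequestered


# Hypothetical inputs: Mg of raw biomass and pellets consumed in one period.
consumption = {"raw_biomass": 1000.0, "pellets": 500.0}
# Hypothetical coefficients: Mg of capturable CO2 per Mg of input.
eta = {"raw_biomass": 0.5, "pellets": 0.6}

available = co2_available(consumption, eta)
print(available, split_co2(available, 300.0), split_co2(available, 1000.0))
```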

The transportation cost constraint includes the spatially explicit cost of CO2 pipeline transportation and injection.

$$\begin{array}{rcl}{C}^\mathrm{TRA}&=&\mathop{\sum}\limits_{i,(j,k,q)\in {{{\bf{A}}}},t}\left[\left({\kappa }_{i,q}\!^\mathrm{F}+{\tau }_{j,k,q}{\kappa }_{i,q}\!^\mathrm{V}\right){F}_{i,j,k,q,t}\right]+\mathop{\sum}\limits_{i,(j,l,q)\in {{{\bf{B}}}},t}\left[\left({\kappa }_{i,q}\!^\mathrm{F}+{\tau }_{j,l,q}{\kappa }_{i,q}\!^\mathrm{V}\right){F}_{i,j,l,q,t}\right]\\ &&+\mathop{\sum}\limits_{i,(k,l,q)\in {{{\bf{C}}}},t}\left[\left({\kappa }_{i,q}\!^\mathrm{F}+{\tau }_{k,l,q}{\kappa }_{i,q}\!^\mathrm{V}\right){F}_{i,k,l,q,t}\right]+\mathop{\sum}\limits_{l,m,t}{\kappa }_{l}\!^\mathrm{P}{C}_{l,m,t}\!^\mathrm{SEQ}\end{array}$$
(20)

where \({\kappa }_{i,q}\!^\mathrm{F},{\kappa }_{i,q}\!^\mathrm{V},{\kappa }_{l}\!^\mathrm{P}\) are the fixed (US$ Mg−1), variable (US$ (Mg km)−1) and CO2 pipeline costs (US$ Mg−1), respectively, and \(\tau\) is the transport distance (km) along each arc. Note that the pipeline transportation cost depends on the location of the installed biorefineries. \({F}_{i,j,k,q,t}\) are the flow variables and \({{{\bf{A}}}}\), \({{{\bf{B}}}}\), \({{{\bf{C}}}}\) are the sets of valid transportation arcs. Finally, equations (21)–(23) account for the production or consumption of electricity and the associated indirect emissions at the biorefineries and pre-processing depots, which depend on the location of the facilities. Note that the ‘production’ of electricity at the refinery \({P}_{{i}^{{\prime} },l,m,t}\!^\mathrm{E-CB}\) can be negative, indicating a surplus of electricity that is sold back to the grid to offset the GHG emissions of local grid electricity.

$$\begin{array}{l}{P}_{{i}^{{\prime} },l,m,t}\!^\mathrm{E-CB}\\=\mathop{\sum}\limits_{i\in {{{{\bf{I}}}}}^{{{{\rm{F}}}}}}{\eta }_{i,{i}^{{\prime} },m}{G}_{i,l,m,t}\!^\mathrm{F-CB}+\mathop{\sum}\limits_{i\in {{{{\bf{I}}}}}^{{{{\rm{ID}}}}}}{\eta }_{i,{i}^{{\prime} },m}{G}_{i,l,m,t}\!^\mathrm{ID-CB}\quad \forall t\in {{{\bf{T}}}},m\in {{{{\bf{M}}}}}^{{{{\rm{CB}}}}},{i}^{{\prime} }\in\{{\mathrm{ELC}}\}\end{array}$$
(21)
$${P}_{{i}^{{\prime} },k,m,t}\!^\mathrm{E-PD}=\mathop{\sum}\limits_{i\in {{{{\bf{I}}}}}^{{{{\rm{F}}}}}}{\eta }_{i,{i}^{{\prime} },m}{G}_{i,k,m,t}\!^\mathrm{F-PD}\quad \forall t\in {{{\bf{T}}}},m\in {{{{\bf{M}}}}}^{{{{\rm{PD}}}}},{i}^{{\prime} }\in \{{\mathrm{ELC}}\}$$
(22)
$$G{T}^\mathrm{ELC}=\mathop{\sum}\limits_{i,l,m,t}{\gamma }_{l}{P}_{i,l,m,t}\!^\mathrm{E-CB}+\mathop{\sum}\limits_{i,k,m,t}{\gamma }_{k}{P}_{i,k,m,t}\!^\mathrm{E-PD}$$
(23)

In equations (21)–(23), \({\eta }_{i,{i}^{{\prime} },m}\) are the conversion coefficients for the electricity requirement or surplus for technology m. Variables \({G}^\mathrm{F-CB}\), \({G}^\mathrm{ID-CB}\) and \({G}^\mathrm{F-PD}\) are the consumption variables and \({\gamma }_{l}\), \({\gamma }_{k}\) are the local GHG impacts of electricity at the biorefineries and depots, respectively.
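The sign convention in equation (23) can be illustrated as follows: positive net electricity use adds emissions at the local grid intensity, whereas a negative value (a surplus sold to the grid) earns an offset. The facility names, units (MWh and Mg CO2e per MWh) and numbers are assumptions made for this sketch.

```python
def electricity_emissions(net_use, grid_intensity):
    """Equation (23): location-specific grid intensity times net electricity use."""
    return sum(grid_intensity[site] * use for site, use in net_use.items())


# Hypothetical facilities: biorefinery l1 exports power, depot k1 imports it.
net_use = {"l1": -2000.0, "k1": 500.0}        # MWh; negative means surplus sold
grid_intensity = {"l1": 0.4, "k1": 0.6}       # Mg CO2e per MWh at each location

gt_elc = electricity_emissions(net_use, grid_intensity)
print(gt_elc)
```

Because the intensity is indexed by location, the same surplus is worth a larger offset where the local grid is more carbon intensive, which is one way facility siting interacts with the GHG term of the objective.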