Life cycle assessment needs predictive spatial modelling for biodiversity and ecosystem services

International corporations in an increasingly globalized economy exert a major influence on the planet's land use and resources through their product design and material sourcing decisions. Many companies use life cycle assessment (LCA) to evaluate their sustainability, yet commonly-used LCA methodologies lack the spatial resolution and predictive ecological information to reveal key impacts on climate, water and biodiversity. We present advances for LCA that integrate spatially explicit modelling of land change and ecosystem services in a Land-Use Change Improved (LUCI)-LCA. Comparing increased demand for bioplastics derived from two alternative feedstock-location scenarios for maize and sugarcane, we find that the LUCI-LCA approach yields results opposite to those of standard LCA for greenhouse gas emissions and water consumption, and of different magnitudes for soil erosion and biodiversity. This approach highlights the importance of including information about where and how land-use change and related impacts will occur in supply chain and innovation decisions.


Agricultural crop demand scenarios
In this use case, we apply conversion pathways from the literature (as given in Supplementary Figure 1) to determine how much raw material (sugarcane or maize) is required to meet the HDPE demand scenarios. The volumes of feedstock given in Supplementary Table 1 correspond to differing bio-HDPE demand scenarios. However, recognizing the importance of geographical influences on the results of such assessments, two different feedstocks in two locations are considered to demonstrate the new LUCI-LCA approach. Different scenarios are also informed by an understanding that impacts could differ with greater volume requirements due to the different spatial patterns of land-use change that result (Chaplin-Kramer et al. 1 ). For this reason, the first two volume scenarios are set at the largest scales that could be induced directly by Unilever, with a subsequent scenario set to represent broader sectorial uptake of the bio-HDPE.

Allocation
Economic allocation is applied to allocate inputs and emissions to the main product and any co-products: distiller's dried grains with solubles during maize ethanol production and electricity during sugarcane ethanol production, following the default approach in the ecoinvent database 2.2.
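As a minimal sketch of how economic allocation works in general, each product's share of the inputs and emissions can be taken as its revenue (quantity times price) divided by the total revenue of all outputs. The quantities and prices below are hypothetical placeholders, not values from the ecoinvent database.

```python
def economic_allocation(outputs):
    """Return each product's share of the burden, proportional to revenue.

    outputs maps product name -> (quantity, unit_price); the units only need
    to be consistent within each product.
    """
    revenue = {name: qty * price for name, (qty, price) in outputs.items()}
    total = sum(revenue.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return {name: value / total for name, value in revenue.items()}


# Hypothetical quantities and prices, for illustration only.
maize_ethanol_plant = {
    "ethanol": (1.0, 0.60),
    "DDGS": (0.8, 0.15),
}
shares = economic_allocation(maize_ethanol_plant)
```

The shares sum to one, so multiplying any inventory flow by a product's share assigns that flow fully across the main product and co-products.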

Impact Assessment Methodology
The following impact categories and methods are used:
• Global warming potential (GWP): IPCC AR5 5 100 years, excluding biogenic carbon method. The GWP associated with land use change (LUC) is considered as given in section 2.4.2.
• Marine eutrophication potential: ReCiPe (Goedkoop et al. 6 ).
• Biodiversity damage potential: Mean Species Abundance (MSA; GLOBIO 7 ), according to de Baan et al. 8 The MSA observations of selected pressure factors for 'Conventional farming' and 'Perennial tree crop' are chosen for the land use and land use change areas for annual cropland and perennial cropland respectively in the assessment of biodiversity damage potential. For land use change from forest, an MSA factor of 1 is used.
• Erosion regulation potential (ERP): Saad et al. 9
• Consumed water: no impact assessment is performed for the main results; the indicator used is water consumption (Mekonnen et al. 10 ).

Agricultural production
The inputs and emissions for the production of sugarcane in Brazil and maize in the USA are obtained from the ecoinvent 2.2 datasets 'BR: sugar cane, at farm' and 'US: corn, at farm' (Ref. 2) respectively and the inventories are revised to take into account updated crop irrigation figures (Supplementary Table 2). The volume of consumed (or blue) water for irrigation is derived from the Water Footprint Network (WFN) database (Ref. 10). Irrigation efficiency factors are taken from Rohwer et al. 11 The ecoinvent 2.2 dataset 'US: irrigating' 12 is used to model the irrigation of sugarcane, but the US electricity mix is replaced with a Brazilian one. In the maize system, we also replace the inventory 'CH: irrigating' with the 'US: irrigating'. These values for irrigation correspond to other sources (see 3

Land-Use Change (LUC)
The amount of land use change (transformation) and carbon dioxide emissions resulting from land transformation are estimated using the direct land use change assessment tool (Blonk Consultants 2014) 13 , following the standards PAS2050-1 14 , Greenhouse Gas (GHG) Protocol 15 and EnviFood Protocol 16 for the "country known, previous land use unknown" situation. The areas of land transformation used in the assessment of Erosion Potential and Biodiversity Damage Potential are given below (Supplementary Table 3), and the GHG emissions from land transformation for sugarcane and maize are 10.81 T CO2eq/ha*year and 0.02 T CO2eq/ha*year respectively. The figures for both the area of land transformation and GHG emissions from land transformation are derived in the tool as annual figures.

Ethanol Production
The resource use and emissions for the production of ethanol are obtained from the ecoinvent 2.2 datasets 'BR: ethanol, 95% in H2O, from sugar cane, at fermentation plant' and 'US: ethanol, 95% in H2O, from corn, at distillery' respectively. These are representative of ethanol production in Brazil and the United States (Ref. 2).

Ethanol Dehydration
The dehydration of hydrated ethanol 95% to anhydrous ethanol is considered using the datasets 'BR: ethanol, 99.7% in H2O, from biomass, at distillation' and 'US: ethanol, 99.7% in H2O, from biomass, at distillation'. The Brazilian ethanol dataset considers the production from both sugarcane and molasses (by-product from sugar production). For the purposes of this study, we assume that 100% of new ethanol is produced from new sugarcane. We adjust the ecoinvent dataset to consider 100% production from sugarcane.

Ethylene Production
The LCI for the production of ethylene from ethanol is based on the processing requirements given in Kochar et al. (Ref. 3) shown in Supplementary Table 4. The steam is modelled using the dataset 'RER: steam, for chemical processes, at plant' 17 , power is modelled using the ecoinvent country specific electricity mix 'BR: electricity, high voltage, production BR, at grid' 18 and 'US: electricity, medium voltage, at grid' for Brazil and USA respectively, and fuel is assumed to be gas and modelled using the ecoinvent dataset 'RER: natural gas, burned in industrial furnace >100kW' 19 .

Polymerization
The polymerization of ethylene to HDPE is modeled following the approach of Tsiropoulos et al. (Ref. 4).
Monomer consumption in kilograms per metric tonne of product is 1027 kg/T (according to the European Commission 20 ).
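As a small worked example of the monomer consumption figure, the ethylene needed for a given HDPE volume follows directly from the 1027 kg/T ratio. The function name is hypothetical; the two volumes are the scenario 1 and scenario 3 volumes quoted in this document.

```python
MONOMER_KG_PER_TONNE_HDPE = 1027  # kg ethylene per metric tonne of HDPE


def ethylene_required_tonnes(hdpe_tonnes):
    """Ethylene (T) consumed to polymerize the given mass of HDPE (T)."""
    return hdpe_tonnes * MONOMER_KG_PER_TONNE_HDPE / 1000


ethylene_scenario_1 = ethylene_required_tonnes(23_000)   # about 23,621 T
ethylene_scenario_3 = ethylene_required_tonnes(321_000)  # about 329,667 T
```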

Transport
Transportation is modeled for relevant life cycle phases where it is not already included in the existing datasets. Transport includes the trucking of product from the ethanol production plant to the ethylene production plant (500 km by road for both regions) and the shipping of HDPE produced from sugarcane in Brazil to the US market (10,100 km according to Sea-Distances.org 21 ). Transport is modelled using ecoinvent datasets 'RER: transport, lorry >16t, fleet average' and 'OCE: transport, transoceanic freight ship'. 22

End-of-Life
All the carbon stored in the bio-HDPE is assumed to be released back to the atmosphere at the end of the product life in the form of carbon dioxide (CO2) with no contribution towards global warming potential, since the CO2 released was sequestered in crop growth.

Sensitivity analysis
A sensitivity analysis is provided for scenario 3 (321,000 T HDPE). The effects of pumping water for irrigation are varied using lower and upper irrigation water volumes based on the lower and upper consumed water volumes (Supplementary Table 5) considering irrigation efficiencies (as given in Supplementary Table 2). Similarly, the N-fertilizer application rates and yields are varied using the lower and upper N-fertilizer application rates (as given in Supplementary  Table 7). For Water Consumption, the yield changes associated with the upper and lower N-fertilizer application rates are applied and combined with the lower and upper consumed water values as estimated using the LUCI-LCA values; the relative differences in consumed water from the base case in the LUCI-LCA were applied to the LCA base case.
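The transfer of relative differences from the LUCI-LCA to the LCA base case can be sketched as follows. This is an illustrative reading of that step, with a hypothetical function name and hypothetical consumed-water values in arbitrary but consistent units.

```python
def scale_to_lca_base(lca_base, luci_base, luci_bound):
    """Apply the relative change seen in LUCI-LCA to the standard LCA base case."""
    relative_difference = (luci_bound - luci_base) / luci_base
    return lca_base * (1.0 + relative_difference)


# Hypothetical consumed-water values, for illustration only.
lca_base = 200.0
luci_base, luci_lower, luci_upper = 100.0, 80.0, 130.0

lca_lower = scale_to_lca_base(lca_base, luci_base, luci_lower)
lca_upper = scale_to_lca_base(lca_base, luci_base, luci_upper)
```

With these placeholder numbers, a 20% decrease and a 30% increase in the LUCI-LCA translate into the same percentage changes around the LCA base case.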

Land Use Change Improved (LUCI) Life Cycle Assessment Methodological Details
To produce the LUCI-LCA, we first develop predictive land change models (LCM) to translate the demand scenarios into maps of agricultural expansion and intensification (section 3.1). We feed the resulting land-use change maps into models for biodiversity and ecosystem services (InVEST) in order to assess the environmental impacts of the additional product demand in a spatially explicit way (section 3.2). Finally, we integrate the results to substitute for key elements of the land-use change impacts in standard LCA, as illustrated in Fig. 1 in the main text (and described in detail in section 3.3).

Land Use Change Modeling to Spatialize Demand Scenarios
This section summarizes the process for generating spatial scenarios of agricultural expansion and intensification resulting from changes in commodity demand in a region. This approach can be applied with public, globally available data and limited land change modeling expertise. The approach we developed has three steps, described in detail below:
1. Derive the potential land area for expansion required to achieve the increase in production based on expansion only.
2. Adjust the total land expansion to account for intensification by creating a spatially explicit yield map to partition production into amounts met through expansion and intensification.
3. Allocate the expansion area spatially within the region of interest.

Derive the area of expansion
The potential expansion area is derived through an extrapolation of past trends in production and the harvested crop area. In order to allow for a globally replicable approach that can be used across supply chains, we prioritized globally-available data to maintain consistency between study regions. We use national FAO data 23 to relate production and area of a particular crop over a time series, regressing production in each year against harvest area in each year. The slope of this line provides a more appropriate estimate of agricultural expansion to meet a production target than yields alone because it incorporates past gains through both intensification and expansion. If past production from one year to the next has increased more than can be accounted for by multiplying past increases in harvested area by yield in the first year, then the remaining production must be attributable to intensification. If yields are used instead of this time series relationship (i.e., dividing production target by current yields), the total area predicted for expansion could be higher than is likely to occur in reality because no intensification would be assumed. Since the slope of the regression we are using is the expected increase in production for an expected increase in area, the production target is simply divided by the regression slope in order to solve for the area of expansion needed. We are assuming the rate of intensification will continue on its previous trajectory.
To illustrate this in the context of our use case, the solid line in Supplementary Figure 3 represents the regression: the production increase per area increase based on past trends, which reflects a mixture of intensification and expansion. The dotted line represents the expansion required if there is no intensification, that is, if current yields are applied to reach the production target (i.e., the slope of the dotted line is the most recent yield, or the average yield from the past several years if yields fluctuate strongly). In this case study, Mato Grosso shows evidence of intensification based on past trends, whereas Iowa does not. That is, changes in production in recent years can be attributed solely to changes in harvested area in Iowa. In Brazil we find the slope of the regression between production and area to be approximately 82 (y = 81.993x - 5E07); thus for every 82 metric tonnes of sugarcane production, 1 hectare of land was converted to sugarcane production in Brazil. In order to calculate the expansion area predicted by past trends for each demand scenario, the production target is then divided by the slope of this regression (Supplementary Table 8).
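The calculation described in this section can be sketched as an ordinary least-squares fit of production against harvested area, followed by a division of the production target by the fitted slope. The time series and the production target below are hypothetical (constructed to give a slope near 82 T/ha); the actual inputs are the FAO data and the targets in the supplementary tables.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def expansion_area_ha(production_target_t, slope_t_per_ha):
    """Area of expansion implied by past production-versus-area trends."""
    return production_target_t / slope_t_per_ha


# Hypothetical national time series: harvested area (ha) and production (T).
harvested_area_ha = [5.0e6, 5.5e6, 6.0e6, 6.5e6, 7.0e6]
production_t = [82.0 * a - 5.0e7 for a in harvested_area_ha]

slope, intercept = ols_slope_intercept(harvested_area_ha, production_t)
area_needed_ha = expansion_area_ha(1.0e6, slope)  # hypothetical 1,000,000 T target
```

Dividing by the most recent yield instead of the regression slope would give a larger area whenever past trends include intensification, which is the contrast between the dotted and solid lines described above.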

Partition production into expansion and intensification
To derive the amount of crop produced through expansion, average yields for the most recent year (Ref. 23) are multiplied by the expanded area (determined in 3.1.1). This value is then subtracted from the demand target to derive the amount of commodity produced through intensification (Supplementary Table 9). If, in the last 10 years of production, increases can be attributed solely to an increase in harvested area (as was the case for Iowa, where the slope of the line determined in 3.1.1 was the same as the current yield, 11.1 T/ha), the amount of production achieved through expansion should be equal to the demand target.
For simplicity, the yield increase for intensification is specified as exactly that required for all of the target production to occur on the land predicted for agricultural expansion. The area of intensification is therefore equal to the area of expansion, and the intensified yield is calculated as the slope of the line presented in 3.1.1.
In the case of Brazil, this requires a yield of 82 T/ha for all scenarios (up from a current average yield of 75.3 T/ha). Thus, according to this estimate, the amount of increased production met through intensification is less than 10% of the total increase. The effects of intensification are assumed to apply to all of the sugarcane expansion area. This is not necessarily a realistic prediction (i.e., yields could be expected to rise both in current and future sugarcane production areas, not exclusively in, and evenly across, all future sugarcane production). However, it is not possible with currently globally available data to spatially attribute past intensification, because maps of fertilizer use and crop-specific land-uses are not available at the resolution necessary for our ecosystem services modeling (500 m or less).
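The partitioning arithmetic can be checked with a short sketch. The regression slopes and current yields are those quoted in the text (82 and 75.3 T/ha for Brazil, 11.1 T/ha for Iowa); the production targets and the function name are hypothetical.

```python
def partition_production(target_t, regression_slope_t_per_ha, current_yield_t_per_ha):
    """Split a production target into expansion and intensification components."""
    area_ha = target_t / regression_slope_t_per_ha
    from_expansion_t = current_yield_t_per_ha * area_ha
    from_intensification_t = target_t - from_expansion_t
    return {
        "area_ha": area_ha,
        "from_expansion_t": from_expansion_t,
        "from_intensification_t": from_intensification_t,
        "intensification_share": from_intensification_t / target_t,
        # Yield needed for the whole target to be grown on the expansion area.
        "intensified_yield_t_per_ha": target_t / area_ha,
    }


# Hypothetical targets; slopes and yields as quoted in the text.
brazil = partition_production(820_000.0, 82.0, 75.3)
iowa = partition_production(111_000.0, 11.1, 11.1)
```

For Brazil the intensification share is 1 - 75.3/82, roughly 8%, consistent with the statement that intensification supplies less than 10% of the increase. For Iowa, where slope and current yield coincide, the share is zero.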

Spatially allocate the expansion area
Our approach for spatially allocating the agricultural expansion required to meet the production target involves creating a suitability layer that assigns eligible pixels an index value which reflects biophysical suitability for conversion to agriculture. Specifically, the approach relies on estimating suitability using a logistic regression where the 0/1 dependent variable indicates whether a pixel is classified as agriculture in a particular year. After regressing the dependent variable against multiple driver variables on a subset of the data, the suitability layer is generated by interpreting predicted probabilities from the logistic regression as a suitability value for each pixel capable of conversion to agriculture. These values are then sorted, with the pixels that have the highest ranking values chosen for conversion to agriculture until the area requirements of the demand scenario have been met. 25 This approach makes the fundamental assumption that factors that have historically determined the location of agriculture will also determine the location of agricultural expansion driven by increases in demand for the commodities specified in the demand scenario (i.e. in our case, maize and sugarcane). Part of this assumption is related to the static nature of the regression: that is, we predict where agriculture exists at a point in time.
The other part of this assumption is due to coarseness in the representation of agriculture: we are unable to identify datasets with global coverage that resolve agriculture at more refined classifications (i.e. crop species). Therefore, we are unable to account for the fact that expansion of certain crops may be driven by different variables than those that drive "agriculture" generally. The benefit of taking this general agriculture perspective is that we are representing total land-use change, which includes not only the direct land-use change that results from a particular crop replacing natural habitat, but also a proxy for the indirect land-use change that results from one crop displacing another crop, which ultimately causes conversion of natural habitat. That is, it will likely be the case that meeting a demand scenario will not cause conversion of natural habitat to farming of the desired feedstock at the areas specified in the demand scenarios, since crop-to-crop conversion on farmland may occur. But this crop-to-crop conversion may cause other commodities to drive expansion into natural lands. We believe this makes the LUCI-LCA approach a more conservative (in the sense of assessing worst-case impacts) way to consider the full change that will likely be provoked within the regions of study, whether directly or indirectly.
We do not account for leakage effects outside the geographic boundaries of the study system (in this case, the states of Mato Grosso or Iowa); instead we assume that all commodity demand and associated agricultural expansion will occur within the study regions. This is because the effects of crop displacement are more profound at the local level. It is more likely that a shift from cattle ranching to sugarcane cultivation will lead to deforestation for cattle ranching within the same country and region. A direct link between a shift in types of cultivation in one part of the globe and land use changes in other parts of the globe is difficult to demonstrate.
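The ranking step of the spatial allocation described in this section can be sketched as follows. The suitability values and eligibility mask are hypothetical, and rounding the required area up to whole pixels is an assumption of this sketch; the 25 ha pixel area corresponds to a 500 m by 500 m pixel.

```python
import math


def allocate_expansion(suitability, eligible, area_needed_ha, pixel_area_ha=25.0):
    """Return indices of the pixels converted to agriculture.

    Eligible pixels are ranked from most to least suitable and converted in
    that order until the required area is covered (rounded up to whole pixels,
    an assumption of this sketch).
    """
    n_pixels = math.ceil(area_needed_ha / pixel_area_ha)
    candidates = [i for i, ok in enumerate(eligible) if ok]
    ranked = sorted(candidates, key=lambda i: suitability[i], reverse=True)
    return ranked[:n_pixels]


# Hypothetical suitability scores; pixel 3 is ineligible (e.g. urban or water).
suitability = [0.9, 0.2, 0.7, 0.95, 0.4, 0.6]
eligible = [True, True, True, False, True, True]

converted = allocate_expansion(suitability, eligible, area_needed_ha=60.0)
```

Here 60 ha requires three 25 ha pixels, so the three most suitable eligible pixels are converted even though the ineligible pixel has the highest score.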

The Logistic Regression Framework
Land cover (MODIS 26 ) data from a reference year (in this case, 2007) are reclassified into a binary variable indicating whether each pixel is classified as agriculture. Areas that are assumed to be unable to convert to agriculture (urban, barren, water) are omitted from the regression. Given a set of K driver variables $x_k$ (discussed below), a regression (with intercept $\beta_0$ and slope coefficients $\beta_k$ for each driver variable) is then run with the following form:

$$\operatorname{logit}(p) = \beta_0 + \sum_{k=1}^{K} \beta_k x_k$$

where $\operatorname{logit}(p)$ is the logistic transform that translates the probability ($p$) of the outcome variable taking on a value of unity to a variable suitable for linear regression: 27

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1 - p}\right)$$

The driver variables $x_k$ can be transformed based on theory or empirical knowledge. For example, topography may be an important predictor of suitability for agriculture, but variations in a relatively shallow slope may be unimportant. We found this to be true in our case, and created a binary indicator variable for "shallow" and "steep," based on whether slope is below or above a specific value (see 3.1.3.2).
Logistic regressions are estimated using maximum likelihood estimators, which identify the combination of parameter values that are most likely to produce the observed data. When working with relatively large rasters, the potential for spatial autocorrelation can typically be addressed by fitting the regressions to a relatively sparse sample of the rasters (between 1% and 10% in different model runs). This was the approach used for generating the scenarios used for our analysis. Logistic regression can be implemented in most common statistical packages; we used the R package "lulcc", which provides a workflow to connect raster data to the glm function for generalized linear models.
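To make the framework concrete, the following is a minimal, self-contained sketch of a maximum-likelihood logistic fit on a sparse sample of pixels. It is not the lulcc/glm workflow used for the analysis: plain gradient ascent is swapped in for illustration, and the pixels, driver variables, and coefficients are synthetic and hypothetical.

```python
import math
import random


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Inverse of logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(beta, x):
    """Predicted probability for one pixel; beta = [b0, b1, ..., bK]."""
    return inv_logit(beta[0] + sum(b * xi for b, xi in zip(beta[1:], x)))


def fit_logistic(rows, labels, steps=1000, rate=0.5):
    """Maximum-likelihood fit by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            err = y - predict(beta, x)
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic pixels (hypothetical): drivers are [steep (0/1), scaled driver in 0..1].
rng = random.Random(0)
true_beta = [1.0, -2.0, 1.5]
pixels = [[float(rng.random() < 0.4), rng.random()] for _ in range(2000)]
is_ag = [1 if rng.random() < predict(true_beta, x) else 0 for x in pixels]

# Fit on a sparse 10% sample of pixels, then score every pixel.
sample = rng.sample(range(len(pixels)), k=len(pixels) // 10)
beta_hat = fit_logistic([pixels[i] for i in sample], [is_ag[i] for i in sample])
suitability = [predict(beta_hat, x) for x in pixels]
```

The predicted probabilities in `suitability` play the role of the suitability layer: they are interpreted as an index for ranking pixels rather than as calibrated conversion probabilities.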

Data Sources and Selection of Driver Variables
A key component of this approach is to develop a process that can function with only global data.
Supplementary Table 10 provides an overview of globally available data relevant for modeling agricultural expansion. Additional global data layers of potential relevance could be included (e.g., the Global Roads Open Access Data Set 28 ), but our initial small set is sufficient to represent the types of ecosystem impacts that occurred from historic agricultural expansion (see Fig. 3, main manuscript).

Supplementary Table 10. Data sources considered for land change model
For land cover data, MODIS years 2001, 2007 and 2012 are used as the primary years for exploratory analysis and model testing, 30 with the model ultimately generated using 2007 data to allow validation against datasets from later years (see Section 3.1.3.3). Land cover data specifying agriculture are always required to generate the dependent variable layer, but may also be used to derive alternative driver variables such as distance from the current agricultural frontier or from urban centers. All data are resampled to the 500 meter resolution of the MODIS land cover, using bilinear resampling for continuous data, and nearest neighbor for categorical.
The slope variable was transformed (after resampling) to a binary variable indicating slope as greater than five percent (though testing reveals that results are insensitive to thresholds between four and twelve percent). All other variables described in Supplementary Table 11 are untransformed when tested for inclusion in the linear logistic regression. After applying screening tests for correlation using Cramér's V statistic, and subsequently inspecting the logistic regression results, we arrived at the final functional forms of the regression (with intercept $\beta_0$ and slope coefficients $\beta_k$) used for the two states. The specific values of the model fit are listed in Supplementary Table 11.
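As an illustration of the two operations mentioned here, the sketch below thresholds slope at five percent and computes Cramér's V between two categorical variables from their contingency table. It uses the uncorrected chi-square statistic (no bias or continuity correction), which is an assumption of this sketch; the slope values and comparison variables are hypothetical.

```python
import math
from collections import Counter


def to_steep(slope_percent, threshold=5.0):
    """Binary indicator: 1 where slope exceeds the threshold, else 0."""
    return [1 if s > threshold else 0 for s in slope_percent]


def cramers_v(a, b):
    """Cramér's V association between two equal-length categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    chi2 = 0.0
    for va, na in count_a.items():
        for vb, nb in count_b.items():
            expected = na * nb / n
            chi2 += (joint[(va, vb)] - expected) ** 2 / expected
    k = min(len(count_a), len(count_b)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical slope values in percent.
steep = to_steep([1.0, 3.0, 6.0, 12.0])

v_identical = cramers_v(steep, [0, 0, 1, 1])    # perfectly associated
v_independent = cramers_v(steep, [0, 1, 0, 1])  # no association
```

Values near 1 flag pairs of candidate drivers that carry nearly the same information, which is the kind of redundancy a correlation screen is meant to catch.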

Model Performance
In the context of our application and data limitations, no single test or metric provides a good summary of model predictive performance. A first consideration is how well the model predicts the transitions from different types of habitat into agriculture. We considered this by comparing the actual land use change that occurred between 2007 and 2012 in both regions (maize in Iowa and sugarcane in Mato Grosso) with the output from the logistic land change model. This was accomplished by isolating the agricultural expansion land use change only (pixels that switched from non-agricultural non-urban to agricultural) for MODIS 2007 and MODIS 2012 land cover data. Supplementary Figure 4 summarizes the transitions resulting from this land-use change model, classified by habitat type. Very different patterns emerge in the two regions, with maize expansion in Iowa predicted to occur predominantly on forested land, and secondarily on grassland, while sugarcane expansion is predicted to occur almost exclusively on savanna in Mato Grosso. Interestingly, the larger the change considered, the more accurately the logistic LCM predicts the proportion of change in different habitats. That is, there is a closer match to actual historical changes, in terms of the relative proportions of each type of habitat converted, for scenario 3 with a volume of 321,000 T HDPE than for scenario 1 with a volume of 23,000 T. In both regions and for all scenarios, the proportion of each habitat predicted to be converted to agriculture by the logistic LCM matches actual change (red bars in Supplementary Figure 4) much better than the changes assigned in the standard LCA (grey bars).
In standard LCA, the habitat converted for the additional maize and sugarcane production required to reach the demand targets is based on national-level estimates from the previous 20 years (according to the Direct Land Use Change Assessment Tool), 13 which only tracks annual and perennial cropland, grassland, and forest. As previously noted, it assumes much of the land for both crops will come from existing cropland, and in fact nearly all of the "conversion" in Iowa is counted as coming from other crops (which is why no yellow bars appear in the Iowa plot for Supplementary Figure 4).
For comparison, we also ran the InVEST Scenario Generator: Proximity Based model for the total hectares of expansion predicted for Scenario 3 (321,000 T HDPE), to generate maps of agriculture expanding out from current cropland. This essentially counts the pixels closest to current agriculture as most "suitable". The proximity-based agricultural expansion often matches the actual compositional changes more closely than the logistic model for Mato Grosso, but not for Iowa (light blue bars in Supplementary Figure 4). However, in both regions the proximity to agriculture land change model yields results that more closely align to actual change than those achieved using the Direct Land Use Change Assessment Tool for standard LCA (Ref. 13). At the pixel level, the logistic model correctly predicts new agriculture (i.e. changes in land cover pixels from 2007 to 2012) on a majority of the landscape in Iowa, though only about one quarter of new agriculture in Mato Grosso is correctly predicted. The proximity-based model performs as well in Iowa and slightly better in Mato Grosso (35% correct in pixel-specific changes). However, despite this similar performance, the agreement between new agricultural pixels for the two models is low (55% in Iowa and 31% in Mato Grosso). Thus, the pixels that are identified correctly by each model are different pixels, for the most part.
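The pixel-level comparison can be illustrated with a toy example. The metric definitions below (share of actual new-agriculture pixels that a model also converts, and overlap between two models as intersection over union) are assumptions of this sketch, since the exact definitions are not stated here; the land cover sequences are hypothetical.

```python
def new_agriculture(before, after, ag="ag", excluded=("urban",)):
    """Indices of pixels that switched from non-agricultural, non-urban to agriculture."""
    return {
        i
        for i, (b, a) in enumerate(zip(before, after))
        if b != ag and b not in excluded and a == ag
    }


def share_correct(predicted, actual):
    """Fraction of actual new-agriculture pixels that the model also converts."""
    return len(predicted & actual) / len(actual)


def overlap(a, b):
    """Agreement between two sets of converted pixels (intersection over union)."""
    return len(a & b) / len(a | b)


# Hypothetical six-pixel landscape: reference year, observed later year, two models.
cover_2007 = ["forest", "grass", "ag", "grass", "forest", "savanna"]
cover_2012 = ["ag", "grass", "ag", "ag", "forest", "ag"]
logistic_map = ["ag", "ag", "ag", "ag", "forest", "savanna"]
proximity_map = ["forest", "grass", "ag", "ag", "ag", "ag"]

actual = new_agriculture(cover_2007, cover_2012)
logistic = new_agriculture(cover_2007, logistic_map)
proximity = new_agriculture(cover_2007, proximity_map)

logistic_share = share_correct(logistic, actual)
proximity_share = share_correct(proximity, actual)
model_overlap = overlap(logistic, proximity)
```

In this toy case both models recover two of the three actual conversions, yet they share only one converted pixel, mirroring the observation that similar accuracy can come from largely different pixels.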
Despite this error in individual pixel-level conversion, the more important question for environmental impact assessment is whether the land use change model captures the types of land-use change trends that are important to ecosystem services. We therefore assess the difference between impacts on ecosystem services modeled from past agricultural expansion and LCM-generated agricultural expansion, and in this case find the magnitude of impacts to be relatively robust to pixel-level errors in land-change model prediction (see Figure 3, main manuscript).
To create a validation layer for the ecosystem service impacts predicted by the LCM model, we generate a binary map of where conversion to agriculture occurred between 2007 and 2012, and then overlay these pixels as new agriculture onto the 2007 landscape. The only impact for which we are unable to assess LCM uncertainty in this way is Water Consumption. Impacts resulting from past land-use change cannot be modelled because change in irrigation resulting from land-use change during this period is unknown. For the remaining impacts, predictions that show complete alignment with this map represent total land change model accuracy (assuming no error in the MODIS classifications). In this case our validation is only concerned with conversion from vegetated land to agriculture, and not any other categorical transitions. We then use the absolute amount of conversion between 2007 and 2012 as a new "demand scenario" to feed into the LUCI-LCA and calculate the impact per T of HDPE. This normalized impact can then be compared to the logistic regression model, as well as the proximity-based scenario generator (abbreviated as "SG:PB" for "Scenario Generator: Proximity Based"). SG:PB essentially creates a suitability layer as well, except this layer is derived purely based on distance to or from the frontier of certain land cover classes (in this case, agriculture). When running SG:PB, we also apply the same restrictions for the types of land that can be converted -specifically, omitting barren, urban, and water.

Modeling Biodiversity & Ecosystem Services Impacts from Agricultural Expansion and Intensification
Here we describe the methods, data, assumptions, and results from the ecosystem services modeling to assess impacts from the increased production to meet the different scenario demand targets. The spatially-explicit effects of agricultural expansion are modeled in Iowa and Mato Grosso, for carbon loss (InVEST Carbon Storage and Sequestration and Forest Carbon Edge Effects models), nitrogen export (InVEST Nutrient Delivery Ratio model), water consumption from irrigation (InVEST beta model for blue water consumption), sediment export (InVEST Sediment Delivery Ratio model), and biodiversity (MSA) reduction (InVEST GLOBIO model).
However, when it comes to modeling the impacts of intensification, our approach is only partial. We believe the science and data are not adequate to model the impacts of intensification on biodiversity, sediment export or carbon loss / sequestration, for two reasons. First, there are currently no globally available maps of crop-specific composition at the resolution needed for the InVEST models. It is therefore not possible to make predictions of the spatial effects of intensification of existing crops. Second, even if such data were available, we do not yet have a mechanistic model or even consistent scientific evidence linking yields to specific changes in tilling and soil management that may also affect erosion and carbon sequestration, or linking yields to hospitability to biodiversity. We thus limit our analysis of impacts of intensification to nitrogen export and water consumption, via models predicting the relationship between yields, nutrient application, and irrigation.
In the following sections, we describe the individual InVEST models used to estimate ecosystem impacts to substitute key elements (inventory data and characterization factors) in the agricultural stage of standard LCA. The ecosystem impacts from the land use change scenarios, modeled using InVEST or simple GIS approaches, align to the LCA impact categories as follows:
1. Carbon Loss (input to Global Warming Potential)
2. Nutrient Export (input to Eutrophication Potential)
3. Water Consumption from Irrigation (input to Water Consumption and Global Warming Potential: energy used for pumping in irrigation)
4. Sediment Export (input to Erosion Potential)
5. MSA Reduction (input to Biodiversity Damage Potential)

Carbon Loss

Overview
We use the InVEST (v 3.2) carbon edge effects model to estimate loss of carbon storage for each scenario. The carbon edge effects model is an extension of the InVEST carbon model, which incorporates our recent work documenting the effects of fragmentation on carbon storage in tropical forest edges. 31 The model follows the typical inventory approach 32 for all habitat types other than tropical forest.

Inputs and Assumptions
The full description of the InVEST carbon edge model can be found in the InVEST User's Guide online. 33 For forest carbon edge, we included only estimates for below-ground, because above-ground carbon was predicted by the model based on a pixel's distance from forest edge. For consistency with the standard LCA approach (Ref. 13), wherever possible we use global estimates for carbon in different vegetation classes from IPCC (Ref. 32) and FAO Global Forest Resource Assessment 34 defined in the land use-land cover map we are using (MODIS, IGBP classification). Where coverage of certain classes is missing in those sources (e.g. savanna, shrubland), we use the dataset developed by Ruesch and Gibbs 35 . We summarize the input data sources and assumptions in Supplementary Table 13.
The effects of intensification practices, specifically normal and low tillage, on soil carbon storage are not well understood, and this remains an open area of research. The majority of studies on this topic have only measured soil carbon within the first 20-30 cm, and meta-analyses 36,37 have shown that when lower depths of the soil profile are included, the effect of tilling is negligible. Furthermore, we have no quantitative relationship between tilling and yields; including the effects of management would only have been useful in terms of providing an upper and lower bound for the impacts on carbon storage. Therefore, we do not include the impacts of mechanization or intensification in our assessment of the effects of land use change on carbon storage. We do not report results for soil carbon here, because this parameter is already considered within the LCA and does not require substitution since we do not expect effects to be spatially explicit. Biophysical tables used in the model are reported in Supplementary Note 3.

Model Sensitivity
We test parameter sensitivity for scenario 3 in the predominant land covers into which agriculture expanded in our land-change model for each region. In Mato Grosso, the predominant land cover transformed is woody savanna (<80% of converted habitat for scenario 3; Supplementary Figure 4); in Iowa, it is forest (60%) and grassland (20%).
For Mato Grosso, the globally-available estimate for carbon stored in woody savanna is 53 T/ha (  41 Carbon estimates for grassland in the region range from 11.9 T/ha 42 to 4.9 T/ha. 43

Nutrient Export

3.2.2.1. Overview
We use the InVEST (v 3.2) nutrient model to estimate Nitrogen (N) export for each scenario. The full description of the InVEST nutrient model can be found in the InVEST User's Guide online. 44 For each pixel, the model computes the nitrogen load, i.e. the amount of nitrogen running off the pixel (either by surface or subsurface flow), and the transport coefficient, termed nutrient delivery ratio (NDR). NDR is a factor between 0 and 1 that represents the amount of nitrogen that actually reaches the stream, based on the landscape properties (slope, land cover, etc.) between the pixel and the stream.

Inputs and Assumptions
Generally, values for N loads and efficiencies are sourced from the InVEST parameter database (Supplementary Table 14). For the land-use change scenarios to meet the required production increase, N loads for new agriculture (sugarcane or maize) are computed as the product of fertilizer application rates and the fraction of applied N not taken up by the crop (1 minus the N use efficiency). Fertilizer application rates are based on ecoinvent 2.2 or local literature (Supplementary Table 14). N use efficiency is set to a global average value of 0.6 45 . Biophysical tables used in the model are reported in Supplementary Note 3.

Impacts of Intensification
In reviewing the effect of intensification practices on nutrient dynamics (cf. Supplementary Note 4), we find impacts to be very heterogeneous. To represent intensification in Mato Grosso, we calculate the increase in fertilizer application necessary to reach the intensified yield (0), relative to baseline (average current) yields. While the availability of other nutrients (notably Phosphorus and Potassium) along with pesticides, seeds, machinery and knowledge can also limit the crop yields, we focus on N because N availability is considered the main limiting factor for crops in these regions. 46 For each production scenario, we then increase N loads for intensification areas by that percentage. This equation is applied to baseline yields as well as intensified yields under the different volume production scenarios in order to derive the N required to achieve the production level in each scenario (Supplementary  Table 15). We take an area-weighted average of the N required to reach the yield target across the entire agricultural expansion area (which for simplicity is also where all intensification is assigned), based on the number of pixels in each climate bin (Supplementary Note 5). The application rates for the different scenarios are divided by the application rates for current yields to arrive at the percent increase in N application required for each scenario (Supplementary Table 15).
For example, in Mato Grosso, the baseline application rate for sugarcane in Brazil is 55 kg/ha (Ref. 46) and for scenario 3 the relative increase in fertilizer application to achieve the intensified yield is 28.1% (or increased by a factor of 1.281). With the N use efficiency of 0.6, this gives: load_n = 55 x 1.281 x (1-0.6) = 28.2 kg/ha.
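The arithmetic in this example can be reproduced with a minimal sketch. The function and argument names are illustrative assumptions, not part of InVEST; the numerical values are those quoted in the text.

```python
def n_load(application_rate_kg_ha, relative_increase, n_use_efficiency):
    """N load (kg/ha): applied N, scaled for intensification, that the crop does not take up.

    Hypothetical helper for illustration only; not an InVEST function.
    """
    intensified_rate = application_rate_kg_ha * (1.0 + relative_increase)
    return intensified_rate * (1.0 - n_use_efficiency)


# Sugarcane in Mato Grosso, scenario 3: 55 kg/ha baseline, +28.1% N, efficiency 0.6
load = n_load(55.0, 0.281, 0.6)
print(f"load_n = {load:.1f} kg/ha")
```

Setting `relative_increase` to zero gives the load for non-intensified new agriculture under the same assumptions.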

Sensitivity analyses
A major uncertainty in assessing nutrient impacts is related to N loads, i.e. sources of nutrients in the landscape. For agricultural land, these are driven by the fertilizer inputs and the amount of leaching. In the LUCI-LCA approach, we assess the effect of errors in this input by running the model for Scenario 3 with lower and upper bounds for these parameters.

Sugarcane in Mato Grosso
We use a weighted-average N application rate of 55 kg N/ha for sugarcane in Brazil based on an FAO report (Ref. 46). According to this report, the rates can vary from 14 kg N/ha in the North to 76 kg N/ha in the South. Other reports 49 suggest that N application rates are about 78 kg N/ha. In a subsequent study 50 , the author used an average of 60 kg N/ha, with a sugarcane yield of 72.52 T/ha. In a sensitivity analysis, the author used minimum and maximum values of 35 kg N/ha and 97 kg N/ha, respectively, with a standard deviation of 16 kg N/ha. This range represents an error of approximately 45%, which we use in our analysis. Based on these uncertainty bounds, the additional N export from Scenario 3 changes by -42% and 37% relative to the original estimate.

Maize in Iowa
According to IFA 2006 51 , the N application rate can range from 145 kg N/ha in the western part of the maize belt to 179 kg N/ha in the eastern part. This corresponds to a range of approximately 10%. In a US study by Grassini et al. 52 , average N application rates ranged from 158 kg N/ha in Iowa, to 183 kg N/ha in Nebraska. Similar ranges are found in recommendations by Iowa State University 53 (140 to 190 kg N/ha), which leads us to consider a relative error of 15%. Based on these uncertainty bounds, the additional N export from Scenario 3 changed by -7% and 23% relative to the original estimate.

Model verification
To assess the credibility of our results, we compared InVEST nutrient export predictions with estimates from a global model (NEWS2) and from local empirical data (Supplementary Table 16). Comparison with NEWS2 and local sources suggests that InVEST underestimates total N export. This could be due to the omission of other sources of nutrients in InVEST (e.g. point loads, especially in IA) and the simplified representation of transport: in particular, the model simplifies the complex processes that drive nutrient degradation in surface and subsurface flows. However, the model correctly predicts Iowa as the location with the higher standardized N exports (i.e. per ha), which suggests that the relative difference between the scenarios is credible.
56 , which was calibrated against observations and accounts for both point sources and non-point sources of nutrients. See calculation details in the study by The Nature Conservancy available at: http://nature.ly/TNC-Dow-Brazil ("verification of hydrologic model predictions"). Here, we use the range for "pristine catchments" given the small proportion of agriculture in the baseline scenario.

Water Consumption

3.2.3.1. Methods
To assess the impact of agricultural expansion on water availability, we compute the irrigation water requirements and the resulting water consumption per tonne of product.
Net irrigation requirements, i.e. the amount of water needed by specific crops to grow, are computed based on a water balance at the monthly time scale 57 , for each month of the growing period (April to August in Iowa, May to October in Mato Grosso). For each cropland pixel, and each month, the net irrigation requirements (Irrn) are:

Irrn = kc × ET0 - Pavail

where kc is the crop factor for the crop of interest, ET0 is the reference evapotranspiration, and Pavail is the available precipitation, i.e. the amount of precipitation that did not leave as quick flow and is available to crops. Pavail is computed at the monthly time step with the InVEST index water model (cf. Supplementary Table 17, "Quickflow"), based on mean monthly precipitation and number of rain events. Given that the amount of quick flow was small (<5% relative to precipitation), with little variation between months and pixels, we set it to a constant value of 5%.
Net irrigation requirements are converted into predicted water consumption for irrigation. This is achieved by accounting for two factors: the irrigation efficiency, since some water extracted for irrigation is lost before reaching the crops; and the current irrigation rate, which is based on existing irrigation rates in the states of interest (some areas need irrigation in theory but are either restricted or lack equipment).
Predicted irrigation volumes, i.e. the amount of water likely to be used by farmers in the field, are based on current irrigation rates. Due to lack of infrastructure, water regulations, or a decision not to irrigate, farmers do not always irrigate at the theoretical rates calculated above (Irr_g). We use current irrigation rates for all the expansion areas from global statistics: specifically, local data and a gridded dataset from Aquastat 58 , which estimates the percentage of area actually irrigated as a function of areas equipped with irrigation (resolution of ~12km, see details in Supplementary Table 17). For each site, we compute the average percentage of area actually irrigated, AAI, and compute predicted irrigation volume (Irrp) as:

Irrp = Irr_g × AAI
The water consumption (WC, in m 3 /T product) only accounts for water evapotranspired or incorporated in a product and is therefore calculated as: Where Prod is the amount of HDPE in tonnes.
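The chain from net to predicted irrigation can be illustrated with a minimal sketch. It assumes that the net requirement is the crop water demand (kc × ET0) minus available precipitation, floored at zero, and that the gross requirement is the net requirement divided by the irrigation efficiency; both forms, the function names, and all input values are assumptions made for illustration rather than the exact implementation used in the study.

```python
def net_irrigation_mm(kc, et0_mm, precip_mm, quickflow_fraction=0.05):
    """Monthly net irrigation requirement (mm) for one pixel.

    Assumed form: crop demand minus available precipitation, never negative.
    The 5% quick flow default follows the constant value stated in the text.
    """
    p_avail = precip_mm * (1.0 - quickflow_fraction)
    return max(0.0, kc * et0_mm - p_avail)


def predicted_irrigation_mm(irr_net_mm, efficiency, aai):
    """Predicted irrigation: gross requirement scaled by the share of area actually irrigated.

    Gross = net / efficiency is an assumption of this sketch.
    """
    irr_gross_mm = irr_net_mm / efficiency
    return irr_gross_mm * aai


# Hypothetical monthly inputs for a single pixel
irr_n = net_irrigation_mm(kc=1.2, et0_mm=150.0, precip_mm=100.0)
irr_p = predicted_irrigation_mm(irr_n, efficiency=0.7, aai=0.05)
print(f"net = {irr_n:.1f} mm, predicted = {irr_p:.2f} mm")
```

In a wet month, where available precipitation exceeds crop demand, the sketch returns zero rather than a negative requirement.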

Inputs and Assumptions
A summary of model inputs and assumptions is provided below. Given the coarseness of the data, we average these data at the watershed scale to estimate the regional irrigation level.

Withdrawals
Total withdrawals per basin. Source: Aqueduct. 61 In the Aqueduct model, water is routed from one subwatershed to another, such that the blue water should be obtained from only the most downstream subwatershed (not summing blue water values across the watershed area). In IA, however, some watershed areas belong to a different basin (HUC4 #10 vs #7 for most of IA) and therefore do not drain to the same outlet. The blue water contribution from these subwatersheds is thus added to the main outlet (in basin #7) to obtain total blue water.

Model Sensitivity and Verification
Regional assessments of irrigation water consumption comprise a number of uncertainties. We summarize and discuss the main sources of uncertainty in Supplementary Table 18 below. An important consideration is the time scale: in our calculations, hydrologic data are long-term averages to reduce the effect of climate variability, whereas some datasets represent recent years only (e.g., withdrawals). We do not assess the effect of these uncertainties individually.

Source of uncertainty: Level of uncertainty
Precipitation inputs: Medium. Precipitation is from the 1950-2000 period to produce long-term results. Impacts of climate change are not considered in this analysis; they are likely to increase actual irrigation water consumption in Iowa.
Reference evapotranspiration inputs: Low. Reference evapotranspiration, similar to precipitation, is computed for 1950-2000. See precipitation inputs for comments on climate change.
Quick flow (runoff assumed to be unavailable to plants): Low. Quick flow is estimated from InVEST based on monthly data (disaggregated to daily). Rerunning the IA assuming 0 quickflow (i.e. the extreme opposite), yields a difference in baseline requirements of 10%.
Error in crop coefficients*: Medium. Trials for IA yield a difference of 38% for June.
Error in total withdrawals in total blue water (from Aqueduct): Medium. Aqueduct data are based on 1950-2008. These estimates are from disaggregated country-level data using regression based on proxies for industrial, agricultural, and domestic uses. Data for global analysis may be uncertain.
*We note that model outputs are a linear function of these factors so the effect of their uncertainty can be assessed by propagating the error linearly.

Irrigation and water yield volumes
To verify predicted irrigation volumes in Iowa, we compare our estimates with USGS data. 62 In 2010, daily irrigation use in Iowa was 42.8 Mgal or 0.00016 km3. Correcting for area since the watershed considered was larger than the state of Iowa, and multiplying over a year, this yields an annual irrigation water volume of 0.09 km3. We note that this correction is approximate since areas in the North-east, outside the state of Iowa, are responsible for a large proportion of the irrigation requirements in this region. Similarly, we compare the total irrigated areas (116,000 ha vs. 115,000 ha, for USGS and our estimates, respectively), and the total withdrawals (6.5 km3/yr vs. 9.7 km3/yr, for USGS and our estimates, respectively). This verification suggests that the values for Iowa are reasonable first-order estimates.
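The unit conversion behind the USGS comparison can be checked with a short sketch. The constant and function names are illustrative; the area correction applied in the text is not reproduced because its factor is not stated, so the annual figure below is for the state of Iowa only.

```python
US_GALLON_M3 = 0.003785411784  # one US liquid gallon in cubic metres
M3_PER_KM3 = 1.0e9


def mgal_per_day_to_km3_per_day(mgal_per_day):
    """Convert million US gallons per day to cubic kilometres per day."""
    return mgal_per_day * 1.0e6 * US_GALLON_M3 / M3_PER_KM3


daily_km3 = mgal_per_day_to_km3_per_day(42.8)
annual_km3_uncorrected = daily_km3 * 365  # before the watershed area correction
print(f"daily = {daily_km3:.5f} km3, annual (uncorrected) = {annual_km3_uncorrected:.3f} km3")
```

The uncorrected annual volume is smaller than the 0.09 km3 reported above, which is consistent with the watershed being larger than the state.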
In Mato Grosso, similar data are not available, so we simply compare the amount of available blue water (from Aqueduct) with the output of the uncalibrated InVEST annual water yield model (which uses the precipitation and reference evapotranspiration inputs). The error is 35% for Mato Grosso (and 16% for Iowa), indicating potential errors in model inputs but suggesting that estimates are credible.

Comparison with other studies' results
Although water consumption values (in m3/T crop) are similar across scenarios, they are very sensitive to assumptions about actual water requirements. We compare our results to several other sources, as summarized in Supplementary Table 19. Our results differ from those of Mekonnen and Hoekstra (Ref. 10) in terms of magnitude and ranking (water consumption for sugarcane is higher than maize in our study, but lower according to Ref. 10). This could be due to the differences in data sources and processing (see 3.3.3), and points to large uncertainties in the calculations of water consumption metrics (Ref. 10).
64 This study suggests 5% irrigation in Iowa.

Sediment Export

3.2.4.1. Overview
We use the InVEST (v 3.2) sediment model to estimate sediment export for each scenario. The full description of the InVEST sediment model can be found in the InVEST User's Guide online. 65 The model computes the soil loss, i.e. the amount of sediment produced on each pixel, and the transport coefficient, termed sediment delivery ratio (SDR). SDR is a factor between 0 and 1 that represents the amount of soil loss that actually reaches the stream, based on the landscape properties (slope, LULC) between the pixel and the stream.

Inputs and Assumptions
To inform parameter selection, we review the effect of intensification practices on sediment dynamics (cf. Supplementary Note 4), finding the impacts to be very heterogeneous. Sediment export is more related to Best Management Practices than to the level of intensification. For this reason, we model all agricultural land with the same generic coefficients (the C and P factors of the sediment model, controlling soil loss in a pixel), thereby ignoring the distinction between standard and intensified production.
We summarize the input data sources and assumptions in Supplementary Table 20. Empirical values for the C (crop management) and P (practice) factors are derived from the InVEST database (see Supplementary Table  20 for details). When insufficient data is available for the state (either Mato Grosso or Iowa), we use regional data (South or North America). As noted above, C and P factors for both the baseline and expanded agricultural areas are set to the average of current agricultural land for the area (not specific to sugarcane or maize). Final biophysical tables used in the model are reported in Supplementary Note 3.

Sensitivity analyses
A major uncertainty in assessing sediment impacts is related to the C factors, which are empirical parameters representing the amount of soil loss relative to bare soil. As detailed in Supplementary Table 20, the LUCI-LCA approach uses regional parameters derived from the peer-reviewed literature. We assess the effect of errors in this input by running the model for Scenario 3 with an uncertainty bound of 50% around baseline values, which corresponds to a typical error around these parameters (Hamel et al. 67 , Chaplin-Kramer et al. 68 ). Based on these uncertainty bounds, the additional sediment export from Scenario 3 varies from -53% to 71%, and from -64% to 60%, respectively, in Iowa and Mato Grosso, relative to the original estimate.

Model verification
We verify the magnitude of InVEST predictions by comparing them with one global model (BQART 69 ) and local empirical studies (Supplementary Table 21). Given the large uncertainties in sediment and nutrient modeling, especially for ungauged basins, 70 it is outside the scope of this study to reduce modeling uncertainty in the sediment export estimates. The accuracy of these values impacts the results of the LUCI-LCA to the extent that absolute predictions are used. However, relative difference in sediment exports is more robust. Sediment yield in Mato Grosso is consistent with values from the BQART model (Supplementary Table 21). In Iowa, InVEST seems to overestimate the sediment yield; however, BQART is limited in the representation of agriculture (human modifications to a basin are represented by a factor that equals 1 for Mato Grosso, vs. 2 for Iowa), which certainly underestimates the impact of agriculture (97% of LULC). Overall, the order of magnitude predicted by InVEST is in line with BQART and regional studies.

MSA Reduction

3.2.5.1. Overview
We use the InVEST (v 3.2) GLOBIO model to estimate biodiversity impacts. GLOBIO uses a meta-analysis of studies around the world to derive changes in mean species abundance (MSA) resulting from different anthropogenic threats. We examine differences in MSA between baseline (2007) and scenario landscapes, and summarize the area-weighted average reduction in MSA and total affected area (through occupation or fragmentation) where biodiversity is impacted by land-use change.

Inputs and Assumptions
The full description of the InVEST GLOBIO model can be found in the InVEST User's Guide online. 71 The three sources of threat considered in this model are land-use (affecting only the pixel of the new expanded agricultural land), fragmentation (affected by changes in nearby pixels), and infrastructure (affected by changes in nearby pixels; but not altered in the scenarios explored here). We summarize the data sources and assumptions in Supplementary Table 22.
InVEST GLOBIO model: Based on meta-analysis in Alkemade et al. 75 ; std. errors of mean effects included in uncertainty analysis.
An MSA estimation ranges from 0 to 1, indicating the average proportional change in abundance of individual species in a location compared to the average abundance of the species within a pristine ecosystem. An MSA of 1.0 implies that, on average, species abundances are the same as in pristine land while an MSA of 0.0 implies that average species abundance is zero (i.e. locally extinct).

The typical use of MSA is to report an average value for a region, but when considering the impacts of localized agricultural expansion, relatively large local changes can be masked by the overall size of the landscape that is not changing. In our case study, the production scenarios convert <1% of the overall landscape. Biodiversity impacts should be considered in terms of their local rather than aggregate effects, in order to form a more conservative estimate of impacts especially for species whose ranges may be limited. We therefore subtract the scenario MSA maps from the baseline (2007) MSA maps for Iowa and Mato Grosso, and report results as averaged only over those pixels whose MSA values changed due to the agricultural expansion scenarios. This extends beyond the pixels that are actually converted from natural habitat to agriculture because the remaining habitat's configuration is altered by the conversion. Thus, MSA within unconverted habitat may still decline due to fragmentation resulting from agricultural expansion.
Because the method used in standard LCA multiplies the characterization factor in MSA by the area occupied or transformed (amortized for 20 years), we follow the same method to keep the two approaches as comparable as possible. However, we multiply not only by the area of agricultural expansion but by the total impacted area, which includes both areas that have been converted (from natural habitat to crop) and areas that have been affected by fragmentation through their proximity to converted areas (Supplementary Table 23).
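The summary statistics described above can be sketched as follows: the MSA reduction is averaged only over pixels whose value changed, then multiplied by the total impacted area and amortized over 20 years. The function name, the tiny grids, and the pixel area are hypothetical and serve only to illustrate the calculation; this is not the GLOBIO implementation.

```python
def msa_impact(baseline, scenario, pixel_area_ha, amortization_years=20, tol=1e-9):
    """Return (mean MSA reduction on changed pixels, impacted area in ha, amortized impact)."""
    reductions = [
        b - s
        for row_b, row_s in zip(baseline, scenario)
        for b, s in zip(row_b, row_s)
        if abs(b - s) > tol  # keep only pixels whose MSA changed
    ]
    if not reductions:
        return 0.0, 0.0, 0.0
    mean_reduction = sum(reductions) / len(reductions)
    impacted_area_ha = len(reductions) * pixel_area_ha
    return mean_reduction, impacted_area_ha, mean_reduction * impacted_area_ha / amortization_years


# Hypothetical 2 x 3 landscape: one converted pixel and two fragmented neighbours
baseline = [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]
scenario = [[0.9, 0.8, 0.1], [0.9, 0.9, 0.8]]
mean_red, area_ha, impact = msa_impact(baseline, scenario, pixel_area_ha=25.0)
print(mean_red, area_ha, impact)
```

Averaging over the whole grid instead would dilute the reduction by the unchanged pixels, which is the masking effect the local summary is meant to avoid.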

Supplementary Table 23. Area of impact for Mean Species Abundance (MSA)
Impacted area could also be smaller than converted area if the area studied had very little habitat remaining, with some of that remaining habitat registering MSA values as low as the agriculture replacing it. In this case study, Mato Grosso has a larger area impacted than converted, as a result of fragmentation, while Iowa has a slightly smaller area impacted than converted.

Model Sensitivity
To test the model's sensitivity to the error in MSA values for each land use related impact (land transformation, fragmentation, and infrastructure (e.g. roads)), we run the model with upper and lower bounds set at the mean MSA value plus or minus the standard error given in the meta-analysis by Alkemade et al. (Ref. 75). Based on these uncertainty bounds, the MSA by area impact from Scenario 3 varies from -51% to 68%, and from -32% to 11%, respectively, in Iowa and Mato Grosso, relative to the original estimate.

Integrating the predictive, spatially explicit information into LCA
The outputs of the individual InVEST models described in Section 3.2 are used to estimate ecosystem impacts and to directly substitute key elements (inventory data) in the agricultural stage of standard LCA. Adaptations, which transform the standard LCA into LUCI-LCA, are described next, for each of the LCA impact categories considered in this study.

Global Warming Potential
In LUCI-LCA, results of spatially explicit modelling substitute elements of the LCA, changing the estimates of carbon dioxide emissions from land use change. We also consider the spatially explicit impacts of agricultural intensification and irrigation on Global Warming Potential.

Greenhouse gas emissions from land use change
In LUCI-LCA, the output from the InVEST carbon edge effects model replaces results from the Direct Land Use Change Assessment tool (Ref. 13) that are used in the standard LCA. Specifically we replace the "CO2 emissions from transformation" component of the life-cycle inventory (Figure 1, main manuscript).
The land use change (carbon loss) results provided from the InVEST model are based on the total amount of crop required to meet the demand for bio-HDPE for the different scenarios prior to allocation. In order to use these results in the LCA, they are allocated to HDPE, amortized over 20 years and converted into carbon dioxide equivalents (CO2-eq.). The carbon dioxide emissions from land use change in the LCA are substituted with these updated results from the InVEST model.
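The three steps named above (allocation, amortization, conversion to CO2-eq.) can be sketched as a single calculation. The 44/12 factor is the molar mass ratio of CO2 to carbon; the function name, the carbon loss, and the allocation share are hypothetical values chosen for illustration, not results from the study.

```python
C_TO_CO2 = 44.0 / 12.0  # molar mass of CO2 divided by molar mass of C


def annual_luc_emissions_t_co2(carbon_loss_t_c, allocation_to_hdpe, amortization_years=20):
    """Annual CO2 emissions (t) from land use change attributed to HDPE."""
    allocated_carbon = carbon_loss_t_c * allocation_to_hdpe
    return allocated_carbon * C_TO_CO2 / amortization_years


# Hypothetical inputs: 1,000,000 t C lost, half of the burden allocated to HDPE
emissions = annual_luc_emissions_t_co2(1_000_000.0, 0.5)
print(f"{emissions:.0f} t CO2 per year")
```

Because every step is a multiplication or division, the order in which allocation, amortization and conversion are applied does not affect the result.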
The key points of difference between the approaches are: 1) emissions induced by land use change are spatially explicit in LUCI-LCA; 2) trends in land use change are evaluated on a regional (state) level (rather than country level); and 3) impacts are based on the difference between current and predicted future change (rather than historical change over the last 20 years). The amounts of the different habitat types considered as changed differ between the two approaches, with standard LCA suggesting much more forest loss in Mato Grosso than is predicted by the logistic LCM, and no forest loss in Iowa, in contrast to LUCI (Supplementary Figure 4).

Greenhouse gas emissions from irrigation
Updated irrigation water volumes modelled as described in 3.2.3 are used to estimate the greenhouse gas emissions from the electricity required for pumping water during irrigation.
The sugarcane and maize datasets are updated to include irrigation based on 2.66% and 0.53% of the crop areas being irrigated, respectively. The average volumes of water for irrigation of the sugarcane and maize used in the LUCI-LCA are calculated from irrigation volume (km3) as given in Supplementary Table 24.

Greenhouse gas emissions resulting from intensification (sugarcane)
Life cycle inventories for sugarcane cultivation are updated to consider the impact from intensification, although as noted previously maize is considered to be close to its maximum yield (See section 3.1). The intensification includes increase in yield and additional nitrogen fertilizer application.

Increase in yield
The percentage increase from the current spatially explicit weighted average yield to reach the theoretical intensified yield for each scenario is considered as given in Supplementary Table 26. In the LUCI-LCA, the yield increase is applied to the total results from the agricultural stage of the life cycle.

Increase in nitrogen fertilizer application
A factor representing the relative increase in N is derived from the nutrient model (section 3.2.2.3) and applied to derive the additional amount of N-fertilizers used (ammonium nitrate phosphate, ammonium sulphate, diammonium phosphate, potassium nitrate and urea); the additional greenhouse gas emissions from their production are then derived linearly. The increase in transport requirements to deliver the additional quantities of fertilizers to the farms is also considered, as well as the additional emissions of nitrous oxide, ammonia, nitrate and nitrogen oxides at the farm. Supplementary Table 27 provides a summary of inputs from InVEST used to estimate the intensification from an increase in nitrogen fertilizer application.

Additional Transportation of N-Fertilizer
The additional transport steps required for the additional N-fertilizer are estimated using the following data from ecoinvent: 'RER: transport, freight, rail,' 'RER: transport, lorry >16t, fleet average' and 'RER: transport,  Table 15).

Additional N related emissions associated with fertilizer production and application
The factors applied to calculate emissions of ammonia, nitrous oxide and nitrate in sugar cane and maize inventories in ecoinvent assume a linear relationship between the levels of inputs per hectare and the level of emissions per hectare (Ref. 2; Nemecek et al. 76 ). For consistency with the ecoinvent approach and IPCC approach (Ref. 32), the increase of N-related emissions per hectare is assumed to be directly proportional to the increase in N-fertilizer use per hectare. This means that the additional emissions from the application of additional fertilizer are calculated as a fraction of total emissions per ha, based on the factor describing the increase of fertilizer rate (Supplementary Table 27). These emissions are later attributed to the production volume, based on the estimated volume of crop that is coming from intensification (Supplementary Table 27).
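Under this proportionality assumption, the additional emissions per hectare follow directly from the fertilizer increase factor. The sketch below uses the factor of 1.281 quoted for scenario 3 in Mato Grosso; the function name and the baseline emission value are hypothetical placeholders, not inventory data.

```python
def additional_emissions_per_ha(baseline_emissions_kg_ha, fertilizer_factor):
    """Extra N-related emissions (kg/ha), assuming emissions scale linearly with N input."""
    return baseline_emissions_kg_ha * (fertilizer_factor - 1.0)


# Hypothetical baseline of 2.0 kg/ha for one N-related emission, fertilizer rate raised by 28.1%
extra = additional_emissions_per_ha(2.0, 1.281)
print(f"additional emissions = {extra:.3f} kg/ha")
```

A factor of 1.0 (no increase in fertilizer rate) yields no additional emissions, as expected for a linear relationship.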

Eutrophication Potential
The spatially explicit modelling of nutrient loss influences the values of nitrate leaching from the fields, which affects the Eutrophication Potential, along with intensification and irrigation.
Nitrogen export from the InVEST NDR model is substituted into standard LCA in place of the nitrate emissions to water in the agricultural inventories. The life cycle inventory for Brazilian sugarcane in ecoinvent 2.2 (Ref. 12) contains only a rough estimation of nitrate leaching, calculated with an emission factor of 2.5% of the N contained in the fertilizer, following work conducted by Stewart et al. for sugarcane fields in Australia. 77 Nitrate emissions for maize in ecoinvent 2.2 are based on the emission factor of 32% of the N contained in the fertilizer. This is based on field measurements from 1987 to 1994 according to Randall et al. 78 Nitrogen loss results provided from the InVEST model are based on the total amount of crop required to meet the demand for bio-based HDPE for the different scenarios prior to allocation. In order to use these results in the LCA, they are allocated to the main product (HDPE).
There are several key points of difference between the standard and LUCI-LCA. In standard LCA, all the N that has the potential to leach to groundwater is assumed to reach the surface water. The LUCI-LCA approach considers the configuration of landscape and its effects on N leaching. Standard LCA inventories are based on single yield figures and N application rates, while LUCI-LCA uses spatially differentiated yield and N application relationships, based on climate. Additionally, there can be some inconsistency between data in some standard life cycle inventories depending on data availability (e.g., in this case, estimates for sugarcane are based on a modelling study, while those for maize are based on direct measurements of drained fields).
The updated impact for irrigation as given in section 3.3.1.2, and the additional impact from intensification, which includes increase in yield and increase in nitrogen fertilizer application as described in section 3.3.1.3 are added to the total result.

Water Consumption
The water consumption considered here is for irrigation only, and thus replaces the life-cycle inventory for the volume of consumed water during the irrigation process for crop production.
The main differences when compared to the LCA data, obtained from the WFN database 79 , concern the data sources and processing.  81 in LCA vs. the yield data described in Supplementary Note 5 for LUCI-LCA). The difference in sources is necessary for the spatially explicit modeling conducted in the LUCI-LCA approach.
The water consumption values for both standard and LUCI-LCA were also evaluated according to the AWARE impact assessment methodology to determine if the results changed when water scarcity of the two basins was included. In this case, it did not change the direction of the difference between the two feedstocks (see Supplementary Note 6 for more details).

Erosion Potential
In LUCI-LCA, the InVEST method for calculating sediment export (T/yr) is a direct replacement for the Saad  Note 3) that are specific to sugarcane and maize. Finally, and perhaps most importantly, standard LCA does not consider landscape configuration surrounding the occupied land. The LUCI-LCA approach models the retention of soil by the vegetation between the occupied land and the river, thus attenuating much of the potential soil eroded.

Biodiversity Damage Potential
In LUCI-LCA, the InVEST/GLOBIO method for calculating MSA impact is a direct replacement for the De Baan et al. 8 approach used to calculate Biodiversity Damage Potential within standard LCA for the agricultural stage of the life cycle. De Baan et al. also use MSA to estimate this biodiversity damage, but the method uses potential natural vegetation as the baseline from which to measure an effect. In LUCI-LCA we employ the current (2007) land use for the baseline. This is the greatest difference, and may in large part explain the orders of magnitude difference between the impacts estimated by the two methods. The difference between potential natural vegetation and currently occupied land is 0.84, many times larger than the average reduction in MSA on the impacted pixels found in both regions using the LUCI-LCA approach (~0.11-0.16).
The standard LCA method also adds the effects of transformation to those of occupation, both of which are compared to the same potential natural vegetation state (though transformation is amortized by 20 years to derive an annual figure, and occupation by the number of months out of the year the land is used for the crop in question). Thus, transformation impacts may be small compared to the occupation impacts for biodiversity damage when using the standard LCA method. In contrast, the LUCI-LCA method only considers transformation impacts of the future land use change predicted by the LCM (also amortized over 20 years, to follow convention). The difficulty with the interpretation of the standard LCA method for calculating biodiversity damage is the implication that occupation of land converted many years ago is linked to current impacts, when in reality the loss of biodiversity happened at the time of the change and cannot easily be reinstated. The greater current threat to biodiversity is the agricultural expansion, not the occupation of already converted land, hence our focus on land transformation only in LUCI-LCA.
Finally, the spatially explicit nature of the LUCI-LCA method allows several advances beyond the standard LCA approach. As previously mentioned, area impacted considers land that is reduced in quality due to fragmentation, not only transformation. Also, greater precision in current habitat types is possible due to the remotely sensed land cover data used, as compared to national or regional averages for standard LCA.

Table S1i n/a
16 Barren or sparsely vegetated 0 0 0 n/a n/a
17 Maize/Sugarcane expansion 0 n/a 0 n/a n/a
*Note: all values listed in Total C are carbon stocks for above- and below-ground combined, for all land covers except forest in Mato Grosso, for which below is listed separately, to be added to above-ground estimates generated by edge effects model.

Supplementary Note 4 Effect of intensification practices on ES and their modeling
Note: the assumptions summarized below synthesize the information in the references and indicate the value selected for the InVEST model.

Increase in load
Information on fertilizer management suggests that a decrease of 10% is a conservative assumption with precision agriculture. 93 Assumption: load is reduced by 90% with respect to the "conventional practice" value.

Irrigation

Increase in leaching
Empirical evidence. 91,92 Assumption: leaching (proportion of subsurface flow) is reduced by ~90%, based on Table S3 and the effect of irrigation in the Metamodel.

Decrease in water recharge
Additional irrigation may help increase yields. Amount is proportional to the plant water deficit (difference between precipitation and potential evapotranspiration)
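The proportionality described above can be sketched as follows. This is a minimal illustration, not the InVEST implementation: the function names, the proportionality constant `k`, the clamping of the deficit at zero and the example values in millimetres are all assumptions introduced here.

```python
def plant_water_deficit(precip_mm, pet_mm):
    """Water deficit as potential evapotranspiration minus precipitation.

    Clamped at zero (an assumption): no deficit when precipitation
    meets or exceeds potential evapotranspiration.
    """
    return max(pet_mm - precip_mm, 0.0)


def irrigation_amount(precip_mm, pet_mm, k=1.0):
    """Irrigation proportional to the deficit; `k` is a hypothetical constant."""
    return k * plant_water_deficit(precip_mm, pet_mm)


# Hypothetical annual values in mm, for illustration only.
print(irrigation_amount(precip_mm=500, pet_mm=800))         # dry site: irrigation needed
print(irrigation_amount(precip_mm=900, pet_mm=800))         # wet site: no deficit
print(irrigation_amount(precip_mm=500, pet_mm=800, k=0.5))  # partial compensation
```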

Increase in water scarcity
Water balance approach to increase yields.

Supplementary Note 5

Comparison of Water Consumption Results Using AWARE Methodology
The AWARE characterisation factors are given at two spatial resolutions (watershed and country) and two temporal resolutions (monthly and annual). We use the watershed values, aligned with the spatially explicit prediction of agricultural expansion from our LCM. We use the annual agricultural average because we do not know the exact months of irrigation. The annual agricultural average is an average of the monthly values based on the agricultural water consumption that usually occurs in the watershed, so it provides a general picture of the region (assuming the crop of interest is not irrigated at completely different times from other crops).
The water scarcity footprint is calculated by multiplying the water consumption (inventory), in m³, by the AWARE characterisation factor for the specified time and place (in m³-eq/m³), and is expressed in m³-eq. The methodology is based on the WULCA consensus paper submitted by Boulay et al. 98 The AWARE methodology and data are available at: http://wulca-waterlca.org/project.html.
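Written as an equation, the calculation described above takes the following form. The symbols are introduced here for illustration and are not notation from the WULCA documentation: $W$ is the inventory water consumption, $\mathrm{CF}_{\mathrm{AWARE}}$ the characterisation factor for the chosen watershed and time resolution, and $\mathrm{WSF}$ the resulting water scarcity footprint.

```latex
\mathrm{WSF}\ [\mathrm{m^3\text{-}eq}]
  \;=\; W\ [\mathrm{m^3}] \times \mathrm{CF}_{\mathrm{AWARE}}\ [\mathrm{m^3\text{-}eq}/\mathrm{m^3}]
```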

Iowa
There are two watersheds in Iowa in the AWARE dataset. The watershed that covers most of the state and intersects with the agricultural expansion in our model has an agricultural annual average of 1.2. Therefore, the water consumption in all Iowa scenarios (in m³) is multiplied by this factor of 1.2 to obtain results in m³-eq.

Mato Grosso
There are three watersheds in Mato Grosso in the AWARE dataset. One watershed, which accounts for 75% of the agricultural expansion, has an agricultural annual average of 0.5, while another watershed, which accounts for the remaining 25% of the agricultural expansion, has a value of 1.1. Weighting the average by area produces a characterization factor of 0.65. Therefore, the water consumption in all Mato Grosso scenarios (in m³) is multiplied by this factor of 0.65 to obtain results in m³-eq.
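The area weighting for Mato Grosso can be reproduced with a short sketch. The shares and characterization factors are those stated above. The function names and the 1,000 m³ consumption figure are hypothetical and serve only to show how the factor is applied.

```python
def area_weighted_cf(shares_and_cfs):
    """Average the characterization factors, weighted by share of expansion area."""
    total_share = sum(share for share, _ in shares_and_cfs)
    return sum(share * cf for share, cf in shares_and_cfs) / total_share


def water_scarcity_footprint(consumption_m3, cf):
    """Footprint in m3-eq: inventory consumption (m3) times the AWARE factor."""
    return consumption_m3 * cf


# (share of agricultural expansion, agricultural annual average CF), from the text
mato_grosso_cf = area_weighted_cf([(0.75, 0.5), (0.25, 1.1)])
iowa_cf = 1.2  # single watershed intersecting the modelled expansion

consumption_m3 = 1000.0  # hypothetical inventory value, for illustration only
print(f"Mato Grosso CF: {mato_grosso_cf:.2f}")
print(f"Mato Grosso footprint: {water_scarcity_footprint(consumption_m3, mato_grosso_cf):.0f} m3-eq")
print(f"Iowa footprint: {water_scarcity_footprint(consumption_m3, iowa_cf):.0f} m3-eq")
```

The weighted factor evaluates to 0.75 × 0.5 + 0.25 × 1.1 = 0.65, matching the value used for all Mato Grosso scenarios.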

Water scarcity footprint results (Scenario 3 volumes):
While the characterization factor suggests a higher impact per unit of water used in Iowa than in Mato Grosso, this weighting is not enough to change the overall LUCI-LCA result (i.e., Mato Grosso has a higher water scarcity footprint than Iowa for the case study scenarios). In LUCI-LCA, use of AWARE moderates the results such that sugarcane goes from being 10 times worse than maize to being ~6 times worse. In standard LCA, however, use of AWARE results in the water scarcity impact of maize changing from being roughly on par with that of sugarcane to being nearly twice as large.
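The direction and approximate size of these shifts follow from the two characterization factors alone, as the sketch below shows. It assumes that the maize scenarios use the Iowa factor (1.2) and the sugarcane scenarios use the Mato Grosso factor (0.65), and it takes the rounded pre-AWARE ratios quoted above (10 and about 1) as inputs. The function name is hypothetical.

```python
CF_IOWA = 1.2          # maize scenarios (from the Iowa paragraph)
CF_MATO_GROSSO = 0.65  # sugarcane scenarios (from the Mato Grosso paragraph)


def sugarcane_to_maize_ratio_with_aware(ratio_before):
    """Rescale a sugarcane:maize water consumption ratio by the two AWARE factors."""
    return ratio_before * CF_MATO_GROSSO / CF_IOWA


luci_ratio = sugarcane_to_maize_ratio_with_aware(10.0)     # LUCI-LCA: 10x before AWARE
standard_ratio = sugarcane_to_maize_ratio_with_aware(1.0)  # standard LCA: on par before

print(f"LUCI-LCA: sugarcane {luci_ratio:.1f}x maize")
print(f"Standard LCA: maize {1.0 / standard_ratio:.2f}x sugarcane")
```

With these rounded inputs, sugarcane remains roughly 5 to 6 times worse than maize in LUCI-LCA, while maize becomes a little under twice as impactful as sugarcane in standard LCA. Small differences from the reported figures are expected, because the inputs here are rounded ratios rather than the underlying scenario volumes.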