Global process-based characterization factors of soil carbon depletion for life cycle impact assessment

Regionalization of land use (LU) impact in life cycle assessment (LCA) has gained relevance in recent years. Most regionalized models are statistical, using highly aggregated spatial units and LU classes (e.g. one unique LU class for cropland). Process-based modelling is a powerful characterization tool but so far has never been applied globally for all LU classes. Here, we propose a new set of spatially detailed characterization factors (CFs) for soil organic carbon (SOC) depletion. We used SOC dynamic curves and attainable SOC stocks from a process-based model for more than 17,000 world regions and 81 LU classes. Those classes include 63 agricultural sub-classes (depending on 4 types of management/production), 16 forest sub-classes, 1 grassland class and 1 urban class. We matched the CFs to LU elementary flows used by LCA databases at country-level. Results show that CFs are highly dependent on the LU sub-class and management practices. For example, transformation into cropland in general leads to the highest SOC depletion but SOC gains are possible with specific crops.

Measurement(s) soil organic carbon depletion Technology Type(s) process-based modelling Factor Type(s) land use class • world region • soil carbon Sample Characteristic - Environment soil Sample Characteristic - Location global

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.15163911


Background & Summary
Land use (LU) and LU change are important drivers of change in the state of ecosystems globally 1 . Life cycle assessment (LCA) is increasingly used for estimating and comparing the environmental impact of products and commodities throughout their supply chain, and for highlighting potential areas to reduce it [2][3][4][5] . Life cycle inventories (LCI) compile elementary flows, which are resources required in a unit process and emissions into the environment after production. Areas occupied and transformed, measured in m²·year and m², respectively, are two of those LCI flows. In the last decade, different models were proposed to classify and characterize LCI flows into impacts, through life cycle impact assessment (LCIA). LCIA uses characterization factors (CFs) to determine the contribution of each inventory flow to each environmental indicator of interest. Soil organic carbon (SOC) depletion has been one of the most used indicators related to LU and LU change (among others, such as biodiversity loss 6 ) because it is a good proxy for LU damages to the biotic primary production potential of soils 7 and other ecosystem services 8,9 . SOC depletion is included in environmental effects connected with the area of protection of "Natural Environmental" 1 . The Joint Research Centre of the European Commission recommends SOC depletion as the indicator for midpoint LU impacts 10 .
All published methods that use the SOC depletion indicator are proxy-based and rely on a combination of statistical analysis and geographical information systems. They have varying levels of regionalization (i.e. spatial differentiation) and LU class differentiation. Among global methods, the first widely accepted method that proposed CFs was developed by Milà i Canals et al. 11 , a method without regionalization, i.e., for the same LU, a single CF is used globally. Other methods introduced regionalization at different levels. For example, Brandão and Milà i Canals 12 developed CFs at the climate region scale and Teixeira et al. 13 used a combination of climate region and soil type. Nevertheless, the number of regions and LU classes of these models is limited. For example, Teixeira et al. 13 considered 96 regions and 4 LU classes. This is a consequence of requiring actual SOC measurements, which need to be aggregated over wider geospatial scales and broader LU classes to be statistically representative.
Process-based modelling (PBM) is an approach based on formulating biogeochemical processes into mathematical-ecological theory. These models consider site soil conditions, soil management practices and climatic data 14,15 . They consider temporal and spatial scales based on scenarios that characterize intra- and inter-annual dynamics. They generally require more data than proxy-based models, but allow a higher level of detail and have the possibility of reducing uncertainties because they are based on processes and not on statistics 16 . For example, the Rothamsted Carbon (RothC) Model 17 is a well-accepted soil process model that simulates SOC turnover [18][19][20][21] . PBMs have been used before to obtain CFs with a higher level of regionalization and a larger number of LU classes, but those were local/regional models only or involved only one type of LU system (e.g. cropland) [22][23][24][25] .
Here, we propose a set of LU-LCIA CFs using SOC depletion as an indicator, based on recently published data by Morais et al. 26 that provide highly regionalized and LU-specific results from a global application of RothC. We considered 81 foreground LU classes (63 individual cropland classes, 16 forest classes and 1 grassland class, plus an urban LU class) and 17,203 regions. This is a new paradigm for how global CFs in LCIA can be calculated, combining PBM with LCA. Data resulting from this paper will enable LCA practitioners to increase the accuracy of their LCA studies in the "Natural Environmental" area of protection 1 , and will serve as a demonstration that it is possible to use PBM globally and for all useful LU classes.
In this paper, we use specific terminology to separate two distinct methods for referring to the land transformation CFs calculated. We refer to "foreground" and "background" CFs for LU impacts. Background transformation CFs are equivalent to the "traditional" formulation used in the LCA community, i.e. CFs are defined with an unknown initial LU and a known final LU (e.g. "transformation to cropland"). The term "background" is due to the fact that these factors are mostly useful in combination with background LCI processes, as databases typically only include the final LU state and not the initial state prior to transformation. We define foreground CFs as those that have two known LU classes, i.e. when both initial and final LU classes are known (e.g. "transformation from irrigated tomato to irrigated cabbage").

Methods
The RothC model. The Rothamsted Carbon Model (RothC) estimates carbon turnover in non-waterlogged soils 17 . It was developed for arable soils in the United Kingdom, but it has been expanded and successfully applied to model soil carbon dynamics also in grassland 18,27 and forestry 28,29 LUs in other regions of the world. It takes into account the effects of temperature, moisture content and soil type. SOC is divided into five compartments or pools, depending on decomposability: inert organic matter, easily decomposable plant material, resistant plant material, microbial biomass and humified organic matter. The inert organic matter pool is resistant to decomposition and does not receive C inputs 30 . Each compartment, except inert organic matter, decomposes according to a first-order decomposition process. The model uses a monthly step. Here, we used the RothC model to estimate the dynamics of SOC stock accumulation and loss after LU change.
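As a rough illustration of first-order decomposition with a monthly step, the following Python sketch decays each active pool independently. It is a deliberate simplification and not the RothC implementation used in this work: it omits the monthly plant inputs, the transfer of decomposed carbon to the microbial biomass and humified pools, and the separate modifiers for temperature, moisture and soil cover (collapsed here into one hypothetical `rate_modifier`). The rate constants are the values commonly cited for RothC and should be treated as assumptions; the pool sizes are hypothetical.

```python
import math

# Annual decomposition rate constants (1/year) commonly cited for the RothC
# active pools. They are treated as assumptions in this sketch.
RATE_CONSTANTS = {"DPM": 10.0, "RPM": 0.3, "BIO": 0.66, "HUM": 0.02}


def decay_one_month(pools, rate_modifier=1.0):
    """Apply one monthly first-order decay step to every active pool.

    pools: dict mapping pool name to carbon stock (t C/ha). Pools without a
        rate constant (here "IOM", inert organic matter) are left unchanged.
    rate_modifier: hypothetical combined climate/cover modifier (1.0 = none).
    """
    updated = {}
    for name, stock in pools.items():
        k = RATE_CONSTANTS.get(name)
        if k is None:
            updated[name] = stock  # inert pool does not decompose
        else:
            updated[name] = stock * math.exp(-rate_modifier * k / 12.0)
    return updated


# Hypothetical starting pools (t C/ha), decayed for 12 monthly steps
pools = {"DPM": 0.5, "RPM": 5.0, "BIO": 1.0, "HUM": 30.0, "IOM": 3.0}
for _ in range(12):
    pools = decay_one_month(pools)
```

With no inputs, the easily decomposable pool is almost exhausted after one year while the humified pool barely changes, which is why long simulations are needed to reach an attainable SOC stock.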
Land use characterization model. Here, we use the characterization model proposed by Milà i Canals et al. 31 and updated by Koellner et al. 32 with some modifications for transformation CFs. The model treats land "occupation" and land "transformation" as the basic types of land use elementary flows that affect ecosystem quality. Land occupation refers to the use of a given area for human purposes during a certain period, while land transformation refers to the conversion of a certain area to a new occupation.
The occupation CF, as defined by Koellner et al. 32 , is the difference between the attainable SOC (ASOC) for potential natural vegetation (PNV), designated as ASOC PNV , and the ASOC for LU2 (ASOC LU2 ). ASOC is the potential maximum SOC stored under a given LU given constant climate and soil conditions. PNV is the vegetation type that the LU system would revert to if human occupation ceased. The CF was calculated according to

$$\mathrm{CF}_{\mathrm{occ},\,LU2} = \mathrm{ASOC}_{PNV} - \mathrm{ASOC}_{LU2} \qquad (1)$$

The model expressed by Eq. (1) assumes that the impact of occupying land is the foregone ecosystem services provided by SOC, because LU2 delays the regeneration of land to PNV. The impact is measured as the difference in ASOC between LU2 and PNV.
For transformation, we considered an exponential transition (given by RothC) between the SOC of initial and final states while Koellner et al. 32 considered a linear transition. The impact of transformation is the difference in the accumulated SOC deficit during revegetation with PNV between two cases: regeneration starting without transformation to LU2, and regeneration starting after occupation with LU2. The transformation CF is therefore the area comprised between the SOC curve during regeneration from LU2 and ASOC at PNV for the period between t f and t reg,LU2 (Impact LU2 in Fig. 1), minus the area comprised between the SOC curve for regeneration from LU1 and the ASOC at PNV for the period between t ini and t reg,LU1 (Impact LU1 in Fig. 1). ASOC is a characteristic of each LU type, and we assume that regeneration to PNV starts from LU systems in equilibrium (i.e. with SOC level at the start of the transition equal to ASOC). Occupation and transformation CFs express SOC depletion, which means that a positive CF implies higher SOC loss in the transition to LU2 (and vice-versa for a negative CF).
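The occupation and transformation definitions above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that regeneration to PNV follows a single exponential, SOC(t) = ASOC PNV - (ASOC PNV - ASOC LU) exp(-k t), where k is a hypothetical rate constant; the SOC dynamic curves actually used in this work come from RothC and need not have this exact form. Under that assumption the accumulated deficit has the closed form (ASOC PNV - ASOC LU)/k. All numerical values are hypothetical.

```python
def occupation_cf(asoc_pnv, asoc_lu2):
    """Occupation CF (t C/ha): ASOC under PNV minus ASOC under the land use."""
    return asoc_pnv - asoc_lu2


def regeneration_deficit(asoc_pnv, asoc_lu, k):
    """Accumulated SOC deficit (t C.year/ha) while regenerating to PNV.

    Assumes SOC(t) = asoc_pnv - (asoc_pnv - asoc_lu) * exp(-k * t), so the
    integral of (asoc_pnv - SOC(t)) from 0 to infinity is
    (asoc_pnv - asoc_lu) / k. The rate k (1/year) is a hypothetical parameter.
    """
    return (asoc_pnv - asoc_lu) / k


def transformation_cf(asoc_pnv, asoc_lu1, k1, asoc_lu2, k2):
    """Transformation CF from LU1 to LU2: Impact LU2 minus Impact LU1."""
    impact_lu2 = regeneration_deficit(asoc_pnv, asoc_lu2, k2)
    impact_lu1 = regeneration_deficit(asoc_pnv, asoc_lu1, k1)
    return impact_lu2 - impact_lu1


# Hypothetical values: stocks in t C/ha, rates in 1/year
cf_occ = occupation_cf(100.0, 40.0)
cf_trans = transformation_cf(100.0, 70.0, 0.05, 40.0, 0.05)
cf_trans_reverse = transformation_cf(100.0, 40.0, 0.05, 70.0, 0.05)
```

In this toy case LU2 stores less carbon than LU1, so the transformation CF is positive (SOC depletion), and the reverse transformation gives the same magnitude with the opposite sign.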
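Foreground characterization factors. To calculate Impact LU1 and Impact LU2 (Fig. 1), we calculated the integral between ASOC at PNV and each SOC dynamic curve starting at the beginning of the transformation (t ini and t f for LU1 and LU2, respectively) and ending when ASOC is achieved (t reg,LU1 and t reg,LU2 for LU1 and LU2, respectively). In this approach both LUs are known, and therefore the CF is calculated according to

$$\mathrm{CF}_{\mathrm{transf},\,LU1 \to LU2} = \mathrm{Impact}_{LU2} - \mathrm{Impact}_{LU1} = \int_{t_f}^{+\infty} \left[ \mathrm{ASOC}_{PNV} - \mathrm{SOC}_{LU2}(t) \right] dt - \int_{t_{ini}}^{+\infty} \left[ \mathrm{ASOC}_{PNV} - \mathrm{SOC}_{LU2}(t) \right]_{LU2 \to LU1} dt \qquad (2)$$

where SOC LU1 (t) and SOC LU2 (t) are the SOC dynamic curves during regeneration to PNV from LU1 and LU2, respectively, the second integrand being the first with SOC LU1 (t) in place of SOC LU2 (t). The upper limits are infinite because, under an exponential transition, ASOC at PNV is approached asymptotically.

Background characterization factors. For background transformation CFs LU1 is undetermined. Impact LU1 was calculated as the average of impacts of transformations from LU classes within each region according to a LU map 33 . For example, if a certain region is divided in 50% cropland and 50% forest, Impact LU1 is the average impact of the individual crops feasible in that region multiplied by 50% plus the average impact of the individual forest types feasible in the region multiplied by 50%.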
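The averaging in the 50% cropland and 50% forest example can be sketched in Python as follows. The function name, the data layout and all impact values are hypothetical and serve only to illustrate the weighting.

```python
def background_impact_lu1(lu_fractions, impacts_by_group):
    """Average Impact LU1 over the LU groups present in a region.

    lu_fractions: dict mapping LU group to its area fraction in the region,
        as read from a LU map (fractions are expected to sum to 1).
    impacts_by_group: dict mapping LU group to the list of impacts
        (t C.year/ha) of the individual LU classes feasible in the region.
    """
    total = 0.0
    for group, fraction in lu_fractions.items():
        impacts = impacts_by_group[group]
        total += fraction * sum(impacts) / len(impacts)
    return total


# Hypothetical region: 50% cropland (three feasible crops) and 50% forest
# (two feasible forest types)
impact_lu1 = background_impact_lu1(
    {"cropland": 0.5, "forest": 0.5},
    {"cropland": [800.0, 1000.0, 1200.0], "forest": [100.0, 300.0]},
)
```

Here the cropland average is 1000 and the forest average is 200, so the background Impact LU1 is 0.5 × 1000 + 0.5 × 200 = 600 t C·year/ha.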
Here, we only calculated background transformation CFs at country-level because these CFs are meant to be used in background inventory flows that are at country-level (or even a higher level of aggregation).

Data used. ASOC and SOC dynamic curve data were obtained from Morais et al. 26 and are available in Zenodo 34 . Their work covers 80 LU classes, including 1 grassland class, 16 forest classes, and 63 agricultural classes. The 63 agricultural classes correspond to 28 individual crops that differ according to management practices. First, cropland SOC curves were determined for rainfed and irrigated production. Then, for cereal classes, they considered two options for residue management: residues are left on the field, or residues are removed from the field. All analyses were repeated for three organic fertilization scenarios.
We used the spatial aggregation proposed by Morais et al. 26 , where the world is divided into 17,203 unique homogeneous territorial units (UHTUs). UHTUs are geographical regions where the local characteristics (i.e. soil type and texture, climate type and current LU) are uniform. Thus, UHTUs were obtained by overlaying thermal zones, land cover, soil type, soil texture and country. The UHTU map resolution is 0.083 decimal degrees (approximately 10 km × 10 km at the equator) and is also available in Zenodo 34 .
LU maps used in the background transformation CFs were obtained from Erb et al. 35 and can be downloaded from Erb et al. 36 . These maps consider four classes (cropland area, forestry area, grazing land and urban) for the whole world. The resolution of all LU maps was also 0.083 decimal degrees. Each pixel has the fraction of each class of LU present (e.g., x% of cropland and y% of urban). The 63 agricultural LU classes from Morais et al. 26 correspond to the cropland class, the 16 forest classes to the forestry class, and the grassland class to the grazing land class.
Calculation procedure. First, we defined the PNV LU class (among forest and grassland classes) using the simplification that ASOC at PNV should be the maximum achievable ASOC in each UHTU, which was also the approach used by Teixeira et al. 13 . When the PNV was a forest and the initial SOC stock was significantly different from the initial SOC used by Morais et al. 26 , the fourth-degree polynomial obtained by Morais et al. 26 led to implausible results. For example, using the parameters provided by Morais et al. 26 for the forest growth period, if the initial SOC stock was significantly lower than the one used by Morais et al. 26 , the SOC stock after forest growth sometimes reached zero or even negative values. To correct for this issue, we ran the RothC model for the forest growth period in each UHTU for all possible transitions between forest and other LU classes. For the period between the end of the forest growth and SOC stabilization, we used the exponential fit from Morais et al. 26 . The initial SOC stock used for each LU class was the ASOC stock obtained by Morais et al. 26 . All the other input data required to use RothC (soil, vegetation and climatic data) were also the same as those used by Morais et al. 26 . Soils were characterized with the soil cover period, initial SOC stock and clay content. The soil cover period is a binary monthly variable, where 1 means that the soil was covered with vegetation during that month and 0 means that the soil was bare. The initial SOC stock was obtained from the European Soil Data Centre 37 . Clay content was obtained from the Harmonized World Soil Database 38 (available from https://dare.iiasa.ac.at/44/). We used the IPCC methods [39][40][41] and crop yields obtained from the Food and Agriculture Organization of the United Nations (FAO) 42 to calculate C inputs from annual plant residues.
Fig. 1 (notation) ASOC LU1 - attainable soil organic carbon content before transformation; ASOC LU2 - attainable soil organic carbon content in the actual land use; ASOC PNV - attainable soil organic carbon content in natural vegetation; t ini - the instant when the LU1 occupation ends; t f - the instant when the LU2 occupation ends; t reg,LU1 - instant when SOC has reverted to the potential after LU1; t reg,LU2 - instant when SOC has reverted to the potential after LU2; Impact LU1-PNV - impact of transformation from LU1 to potential natural vegetation; Impact LU2-PNV - impact of transformation from LU2 to potential natural vegetation.

Precipitation was obtained from the database of the "Global Precipitation Climatology Project" 43 and monthly average air temperature was obtained from MODIS 44 . Potential evapotranspiration was calculated using the Thornthwaite equation 45 , which uses monthly average air temperature, average day length, in hours, and number of days per month obtained from MODIS 44 .
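For reference, the Thornthwaite calculation of potential evapotranspiration can be sketched in Python as below. This uses the standard published form of the equation and is not taken from the MATLAB scripts of this work; the function name and the monthly input values are hypothetical.

```python
def thornthwaite_pet(monthly_temp_c, day_length_h, days_in_month):
    """Monthly potential evapotranspiration (mm/month), Thornthwaite form.

    monthly_temp_c: 12 monthly average air temperatures (degrees Celsius).
    day_length_h: 12 average day lengths (hours).
    days_in_month: 12 day counts.
    """
    # Annual heat index: only months with positive temperature contribute
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temp_c if t > 0)
    if heat_index == 0:
        return [0.0] * len(monthly_temp_c)
    exponent = (6.75e-7 * heat_index ** 3 - 7.71e-5 * heat_index ** 2
                + 1.792e-2 * heat_index + 0.49239)
    pet = []
    for t, hours, days in zip(monthly_temp_c, day_length_h, days_in_month):
        if t <= 0:
            pet.append(0.0)  # no evapotranspiration at or below freezing
        else:
            correction = (hours / 12.0) * (days / 30.0)
            pet.append(16.0 * correction * (10.0 * t / heat_index) ** exponent)
    return pet


# Hypothetical temperate site with constant day length and month length
temps = [-2.0, 0.0, 4.0, 9.0, 14.0, 18.0, 21.0, 20.0, 16.0, 10.0, 5.0, 0.0]
pet = thornthwaite_pet(temps, [12.0] * 12, [30] * 12)
```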
We used a Monte Carlo method 46 considering 100 unique sets of SOC dynamic curves. For each SOC dynamic curve a different set of input parameters was used, i.e. for each LU class in a certain UHTU, RothC is run 100 times, and in each of the runs the climate data and soil inputs vary according to a normal distribution depicting intra-UHTU variability (see details in Morais et al. 47 ). Thus, the final CFs per LU and UHTU are equal to the average of the CFs obtained from the 100 runs. Sampling from a normal distribution ensured that the average results of all simulations were approximately equal to results obtained using the most representative data for each UHTU, while allowing for some outlier samples to be modelled, thus representing expected heterogeneity within each region.
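The Monte Carlo procedure can be sketched in Python as follows. This is only an outline under stated assumptions: the `toy_occupation_cf` function is a hypothetical stand-in for a full RothC run followed by the CF calculation, and the means and standard deviations are invented for illustration.

```python
import random
import statistics


def monte_carlo_cf(cf_function, mean_inputs, sd_inputs, n_runs=100, seed=0):
    """Return the mean and standard deviation of a CF over n_runs samples.

    Each input is drawn from a normal distribution with the given mean and
    standard deviation, mimicking intra-region variability.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        sample = {name: rng.gauss(mean_inputs[name], sd_inputs[name])
                  for name in mean_inputs}
        results.append(cf_function(**sample))
    return statistics.mean(results), statistics.stdev(results)


def toy_occupation_cf(asoc_pnv, asoc_lu2):
    """Hypothetical stand-in for a model run: occupation CF in t C/ha."""
    return asoc_pnv - asoc_lu2


mean_cf, sd_cf = monte_carlo_cf(
    toy_occupation_cf,
    {"asoc_pnv": 100.0, "asoc_lu2": 40.0},
    {"asoc_pnv": 10.0, "asoc_lu2": 5.0},
)
```

Fixing the seed makes the 100 draws reproducible, which helps when comparing CFs between LU classes computed in separate runs.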
Regarding transitions between the urban land use and PNV, we simulated this LU class in the RothC model by considering the soil covered all year and no carbon inputs in the soil, in 10 different UHTUs. After 100 years, there was almost no difference between the SOC stock and the inert organic matter pool (i.e. all other pools were close to zero) calculated by Morais et al. 26 using the method by Weihermüller et al. 48 . Therefore, in each UHTU we set the initial SOC stock equal to the inert organic matter pool from Morais et al. 26 and ran the RothC model to obtain the SOC dynamic curve between urban LU and the PNV. All the other input data required were also the same as used by Morais et al. 26 , using again a Monte Carlo approach 46 .

Integration with LCI elementary flows. In most LCI databases, occupation and transformation flows are not at UHTU level or LU-specific. They are usually at country, continental or other representative scales and in aggregated LU classes. To ensure wide usability, we calculated occupation and transformation (for the background approach) CFs per country at aggregated LU classes for the elementary flows proposed by Koellner et al. 49 , which are used in the most common LCI databases (e.g. ecoinvent 50 and GaBi 51 ). CFs were aggregated at country level as the area-weighted average of all UHTUs in each country. LU aggregation was performed according to the classification key shown in Table 1. Wetlands, bare areas and all water-related elementary flows do not have CFs for SOC depletion (as in other methods, e.g. Milà i Canals et al. 52 , which is the method used in the International Reference Life Cycle Data System - ILCD 10 ), thus they were omitted from Table 1. We assumed that an "Unspecified" elementary flow has the highest CFs (i.e. CFs for urban LU class in this paper), except for "Unspecified, natural" where we considered the forest LU class with lowest ASOC (the highest CFs).
All elementary flows related to human activities and unrelated to agriculture were assigned to the urban class in this paper. All grassland and pasture elementary flows were attributed to the grassland class in this paper. Agriculture-related elementary flows were mainly divided by crop type (annual crop/permanent crop) and management practice (rainfed/irrigated).
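The country-level aggregation described above, an area-weighted average of the CFs of all UHTUs in a country, can be sketched in Python as follows. The function name, the data layout and the values are hypothetical.

```python
def area_weighted_cf(units):
    """Area-weighted average CF for one country and one LU class.

    units: list of (area, cf) pairs, one per UHTU in the country. Areas can
        be in any consistent unit; CFs keep their original unit.
    """
    total_area = sum(area for area, _ in units)
    if total_area == 0:
        raise ValueError("total area must be positive")
    return sum(area * cf for area, cf in units) / total_area


# Hypothetical country with two UHTUs: (area in km2, CF in t C/ha)
country_cf = area_weighted_cf([(1000.0, 50.0), (3000.0, 70.0)])
```

In this example the larger UHTU dominates, giving (1000 × 50 + 3000 × 70) / 4000 = 65 t C/ha.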

Validation of characterization factors.
We compared the results of this study with two proxy-based models that use SOC depletion or foregone carbon sequestration as the indicator for LU-LCIA and calculated CFs at global scale. The models are: Teixeira et al. 13 (CFs can be downloaded from https://pubs.acs.org/doi/suppl/10.1021/acs.est.8b00721/suppl_file/es8b00721_si_002.xlsx) and Brandão and Milà i Canals 12 (CFs downloaded from https://static-content.springer.com/esm/art%3A10.1007%2Fs11367-012-0381-3/MediaObjects/11367_2012_381_MOESM1_ESM.xlsx). In both models, croplands and forest LU classes are depicted as a single class. We compared the different methods in terms of regionalization (i.e. number of UHTUs), LU disaggregation (i.e. number of LU classes) and the absolute value of CFs.

Data Records
We calculated 370,760 foreground occupation CFs and 5,198,763 foreground transformation CFs. Online-only Table 1 presents the number of occupation and transformation foreground CFs per LU class. For transformation we only calculated one CF for each pair of LU classes, e.g. we only considered "transformation from X to Y" and not "transformation from Y to X", because the latter CF can be obtained simply by multiplying the former CF by −1. The full list of occupation and transformation CFs is available in the Zenodo repository 53 .
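Because only one direction of each LU pair is stored, a lookup has to handle the reverse direction. A minimal Python sketch, assuming a hypothetical dictionary keyed by (initial LU, final LU) for a single UHTU and invented CF values, could look like this:

```python
def get_transformation_cf(cf_table, lu_from, lu_to):
    """Look up a transformation CF, deriving the reverse direction if needed.

    cf_table: dict mapping (lu_from, lu_to) to a CF, holding only one
        direction per pair of LU classes.
    """
    if lu_from == lu_to:
        return 0.0  # no transformation, so both impacts cancel
    if (lu_from, lu_to) in cf_table:
        return cf_table[(lu_from, lu_to)]
    # Reverse direction: same magnitude, opposite sign
    return -cf_table[(lu_to, lu_from)]


# Hypothetical table for one UHTU (CF values are invented)
cf_table = {("grassland", "rainfed maize"): 420.0}
forward = get_transformation_cf(cf_table, "grassland", "rainfed maize")
reverse = get_transformation_cf(cf_table, "rainfed maize", "grassland")
```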
Unlike most studies calculating CFs, we also quantified uncertainty for all CFs. This is the first paper that considers SOC depletion at global level that provides CFs uncertainty (mean and standard deviation).
In order to facilitate application by LCA practitioners, we additionally calculated occupation and transformation CFs using the background approach. In this approach the foreground CFs were aggregated per country and to the LU classes proposed by Koellner et al. 49 , which are used in the most common LCI databases (e.g. ecoinvent 50 and GaBi 51 databases). The key between LU classes used in this paper and elementary flows proposed by Koellner et al. 49 is shown in Table 1. In total we calculated 9,464 background occupation and transformation CFs. Background CFs also have an associated mean and standard deviation.

Figure 2 shows the foreground transformation CF for four different transitions. The highest SOC depletions are found in the urban LU class (Fig. 2b). This occurs because urban systems have no C inputs into the soil, and thus in the long term the ASOC stock is equal to the inert organic matter (in all other LU classes there are active organic matter pools). Among croplands, leaving crop residues on the field leads, in general, to an increase in SOC stock, and thus a negative CF when the initial LU is the same crop but without residues left on the field (Fig. 2a). Transitions to croplands from forest result in SOC depletion/loss globally with rare exceptions (Fig. 2d). Transitions from grassland (Fig. 2c) lead, in general, to SOC depletion/loss; however, some crops under specific management practices (leaving residues on the field and irrigation) can result in SOC gains (negative SOC depletion) due to higher C input into the soil. An example of this is North American irrigated wheat maintaining residues on the field. Among croplands, SOC depletion/gain is highly dependent on the specific crop type. For example, for irrigated maize, the effect of maintaining residues on the field results in a negative CF for about 87% of the UHTUs (Fig. 2a).
SOC stock increases when cropland is converted to grassland in most of the UHTUs and initial LU classes. The exceptions are crop classes that have higher C inputs than grasslands, as is the case of cereals with high production of sub-products/residues that are incorporated into the soil. Results also illustrate that SOC dynamics are mostly influenced by crop residues and not precipitation and temperature, e.g. most of the UHTUs gain SOC when transformed from rainfed maize removing residues from the field to irrigated maize maintaining residues on the field, regardless of the region (Fig. 2a). This is even more evident when the final LU is urban (Fig. 2b), where in all UHTUs the transformation leads to SOC losses. This means that the CFs are more affected by C inputs into the soil than by mineralization rates (which depend on the climatic conditions). For example, less than 10% of the UHTUs have a positive CF for transformation to grassland from irrigated maize without residues left on the field (Fig. 2c). However, the percentage increases to 30% when the initial LU is irrigated maize with residues left on the field, due to the increase of C inputs in the soil.

Comparison between land uses.
On average, occupation CFs when crop residues are left on the field are 60% lower (less SOC depletion) than the CFs for the same crop when residues are removed. The difference is largest for wheat (80% less SOC depletion) and smallest for barley (48% less SOC depletion). The average effect of irrigation on all agricultural classes is a decrease of 30% in SOC depletion. The difference is smallest for sweet potato (less than 5% reduction) and largest for maize (about 70% reduction). For both management practices, SOC depletion is affected the most in temperate regions, which is where crops have the highest potential yields and the highest need for irrigation. This result is a consequence of the ASOC stocks in Morais et al. 26 . For transformation CFs, the differences already found for occupation CFs are amplified due to the non-linearity of SOC regeneration, i.e. the differences between the SOC curves for LU1 and LU2 are integrated to infinity (see details in the Methods section), while other models consider linear regeneration in finite time.
Most of the transformations from forest classes result in a SOC stock loss (positive transformation CF). Similar to the case of transformations from grassland, only few cases result in a negative transformation CF. For example, only 5% of the UHTUs have negative CFs for the transformation from broadleaf deciduous forest in the climate zone "warm temperate, dry" to irrigated maize without residues left on the field.
In general, forest LUs have higher uncertainty than agricultural LU classes, ±95 t C/ha and ±50 t C/ha, respectively (average SOC stock: 163 t C/ha and 43 t C/ha, respectively). "Needleleaf Evergreen - Cold temperate, dry" is the class with the highest uncertainty for occupation CFs, i.e. average confidence interval ±200 t C/ha. "Needleleaf Evergreen - Tropical" is the LU class with the lowest confidence interval (i.e. ±27 t C/ha). Among croplands, confidence intervals range between about 30 t C/ha ("Rainfed Olives") and 120 t C/ha ("Irrigated Sugarcane").

Fig. 2 Graphical representation of characterization factors for transformation (a) from rainfed maize removing residues from the field to irrigated maize maintaining residues on the field, (b) from rainfed maize (maintaining residues on the field) to urban, (c) from grassland to rainfed maize (maintaining residues on the field), and (d) from needleleaf evergreen forest (in warm temperate and dry region) to rainfed maize (maintaining residues on the field). A positive value means a SOC depletion/loss (and conversely for SOC gain). SOC - Soil organic carbon.

Technical Validation
Figure 3 presents occupation CFs for the "agriculture/arable" LU class for the background CFs obtained in this paper, and the comparable CFs from Teixeira et al. 13 and Brandão and Milà i Canals 12 . Summary statistics about the aggregation of those factors at country scale are shown in Table 2. The geographic applicability of CFs is dependent on the number of UHTUs. CFs proposed in this paper used about 17,000 UHTUs and therefore the variation range is the largest (between 30 and 600 t C/ha, Fig. 3a). Teixeira et al. 13 used only 96 UHTUs (about 0.6% of the total number of UHTUs used in this paper), and CFs range between 10 and 40 t C/ha (Fig. 3b depicts consensus CFs using a simple average approach). Brandão and Milà i Canals 12 used the lowest number of UHTUs, only 10 UHTUs, and CFs range between 7 and 30 t C/ha (Fig. 3c).
Despite these differences, in general the hotspots of SOC depletion are similar for all models. The highest SOC depletions are found at higher latitudes (i.e. North America and Northern Europe). There is a mismatch for specific countries such as India and Australia, which are hotspots of SOC loss in the work of Teixeira et al. 13 and, to a lesser degree, Brandão and Milà i Canals 12 , but not in the CFs obtained in this paper. The main factor that explains the differences is the variability in the data sources, particularly for characterizing ASOC at PNV. Here, ASOC before and after regeneration are both calculated using the same model and therefore are quantitatively consistent. Other characterization models in the literature use different sources for quantifying SOC at PNV. ASOC at PNV is highly variable between sources. For example, the IPCC 39 indicates that the maximum ASOC at PNV is 146 t C/ha (for volcanic soils and boreal climate region), while according to the Global Soil Organic Carbon Map 54 from the FAO the maximum SOC stock is about 750 t C/ha in the boreal climate region. Another example of this is Brazil, one of the countries with the highest occupation CF for Brandão and Milà i Canals 12 but not in the CFs proposed in this paper or in the CFs proposed by Teixeira et al. 13 . Again, this is ultimately due to differences in the quantification of ASOC at PNV. The PNV LU class in Brazilian UHTUs is "Tropical forest", which according to the IPCC 39 has lower C inputs than other forest LU classes, which combined with high mineralization produces fast organic matter turnover and therefore a lower ASOC stock.

Usage Notes
One commonly cited problem regarding the use of advanced CFs from modelling is that there is insufficient resolution in inventories to use them 55 . This problem can be easily overcome for the CFs presented in this paper due to the distinction we introduced between background and foreground at LCIA level. At LCI level, it is already common to think in terms of background/foreground, but for LCIA the same distinction is useful. Put plainly, the idea is to apply aggregated/simplified CFs when there is no field-level data, and more specific CFs when sufficient information is available. To use these CFs in an LCA study, we propose joint use of foreground and background CFs as they are compatible and were obtained consistently through the application of the same model and data sources. During data collection for the life cycle inventory (LCI) stage, data regarding LU occupation and transformation should be compiled (including information for the initial and final LU classes). In the LCIA stage, foreground CFs should be applied for foreground LU inventory flows. Background CFs should be applied for background LU elementary flows obtained from LCA databases (or other sources), which just describe LU occupation and transformation for aggregated LU classes (as in Koellner et al. 49 ). Following this procedure, practitioners will obtain more accurate impact assessment for the foreground processes/elementary flows (which usually represent most of the impact), instead of using the same highly generic CFs for both background and foreground elementary flows as is current practice in LCIA.
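The joint use of foreground and background CFs described above can be sketched in Python as a simple selection rule. The record keys (`uhtu`, `lu_from`, `lu_to`, `country`, `lu_aggregated`), the table layouts and the CF values are hypothetical and do not reflect the structure of the files in the repository.

```python
def select_transformation_cf(flow, foreground_cfs, background_cfs):
    """Pick a foreground CF when the flow is fully specified, else background.

    flow: dict describing one inventory flow. A foreground flow carries the
        UHTU and both the initial and final LU classes; a background flow
        only carries the country and an aggregated final LU class.
    Returns a (source, cf) pair so the choice stays traceable.
    """
    key = (flow.get("uhtu"), flow.get("lu_from"), flow.get("lu_to"))
    if None not in key and key in foreground_cfs:
        return "foreground", foreground_cfs[key]
    return "background", background_cfs[(flow["country"], flow["lu_aggregated"])]


# Hypothetical CF tables (values are invented)
foreground_cfs = {(101, "grassland", "irrigated wheat"): -150.0}
background_cfs = {("PT", "arable"): 900.0}

farm_flow = {"uhtu": 101, "lu_from": "grassland", "lu_to": "irrigated wheat",
             "country": "PT", "lu_aggregated": "arable"}
database_flow = {"country": "PT", "lu_aggregated": "arable"}

farm_choice = select_transformation_cf(farm_flow, foreground_cfs, background_cfs)
database_choice = select_transformation_cf(database_flow, foreground_cfs, background_cfs)
```

Returning the source label alongside the CF makes it easy to report which share of the impact was characterized with specific rather than generic factors.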
To facilitate the work of LCA practitioners, we provide all the CFs produced in this paper in a Zenodo repository 53 . This repository includes all the foreground and background CFs, in multiple data formats. All the original SOC dynamic curves used for calculating the CFs are also included in this Zenodo repository 53 (including transitions for crop, grassland, forest and urban LU classes) in the file "SOC_dynamics.zip".
The foreground occupation and transformation CFs (at UHTU level and at country level) are available in raster format (TIFF file) and table format (Excel file). The foreground occupation CFs in raster format are zipped in the file "Raster_Foreground_Occupation.zip" (one file per LU class) and foreground transformation CFs in raster format are zipped in the file "Raster_Foreground_Transformation.zip" (one file per LU class transition). The CFs available in raster format (TIFF files) can be opened using geographic information systems (GIS) tools. Both foreground occupation and transformation CFs in table format are in the file "Table_CFs_foreground.xlsx" (all LU classes and transitions in the same file).
The background occupation and transformation CFs (at country level) can also be downloaded in raster format (TIFF file) and table format (Excel file). The background occupation CFs in raster format are zipped in the file "Raster_Background_Occupation.zip" (one file per LU class) and background transformation CFs in raster format are zipped in the file "Raster_Background_Transformation.zip" (one file per LU transition). Both background occupation and transformation CFs in table format are in the file "Table_CFs_background.xlsx" (all LU classes and transitions in the same file).
Finally, we also provide the background CFs produced in this paper in an OpenLCA 56 impact assessment method file "LCIA_OpenLCA_file.zip", available in the same Zenodo repository 53 . This file can be used without adaptation for the elementary flows in the PEF database (zip file compatible with JSON-LD).

Code availability
All code necessary to calculate the CFs is freely available from a Zenodo repository 57 . The ASOC and SOC data used to calculate CFs for crop LU classes were obtained from Morais et al. 34 (Zenodo repository 34 ). We used MATLAB release R2018a to calculate the CFs, including the new RothC runs for transitions to forests, which are available in Morais et al. 57 (Zenodo repository). The script to run the MATLAB version of the RothC model is "RothC_TMorais.m", and the script to calculate foreground and background CFs is "CFs_calculation.m". The code is not commented, but detailed instructions for how to use the MATLAB scripts are in the Zenodo repository or can be provided by the authors upon e-mail contact. The Zenodo repository also indicates the list of data needed to run the model (references for collecting the data can be found throughout the paper). Morais et al. 53 (Zenodo repository) includes all the SOC dynamics and obtained CFs.