A global dataset on phosphorus in agricultural soils

Numerous drivers, such as farming practices, erosion, land-use change, and soil biogeochemical background, determine the global spatial distribution of phosphorus (P) in agricultural soils. Here, we revise an approach published earlier (called here GPASOIL-v0), in which several global datasets describing these drivers were combined with a process model for soil P dynamics to reconstruct the past and current distribution of P in cropland and grassland soils. The objective of the present update, called GPASOIL-v1, is to incorporate recent advances in the process understanding of soil inorganic P dynamics, in the datasets describing the different drivers, and in the regional soil P measurements used for benchmarking. We trace the impact of the update on the reconstructed soil P. After the update, we estimate a globally averaged inorganic labile P of 187 kgP ha−1 for cropland and 91 kgP ha−1 for grassland in 2018 for the top 0–0.3 m soil layer, but these values are sensitive to the mineralization rates chosen for the organic P pools. Uncertainty in the driver estimates leads to coefficients of variation of 0.22 and 0.54 for cropland and grassland, respectively. This work makes the methods for simulating the agricultural soil P maps more transparent and reproducible than previous estimates and increases the confidence in the new estimates, while the evaluation against regional datasets still suggests room for further improvement.

Additional conditions when a random value is chosen
Normal distribution defined by a mean and a standard deviation. For each grid-cell, the mean is equal to 0.22 multiplied by N in chemical fertilizer provided by (Xu et al., 2019), and a standard deviation equal to 25% of the mean was assumed. The value of 25% was arbitrarily chosen.
For each grid-cell, the 1st estimate is equal to 0.20 multiplied by N in manure provided by (Xu et al., 2019) and the 2nd estimate is equal to the 1st one multiplied by a scaling factor based on the country-scale estimate of P manure produced by livestock (Demay et al., 2023).
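The "mean and standard deviation" strategy above can be illustrated with a minimal sketch. The input value `n_fertilizer`, its units, and the function name are hypothetical and used for illustration only; only the factor 0.22 and the 25% relative standard deviation come from the table.

```python
import random


def sample_relative_normal(mean, rel_std, rng):
    """Draw one value from a normal distribution whose standard
    deviation is a fixed fraction (rel_std) of the mean."""
    return rng.gauss(mean, rel_std * abs(mean))


# Hypothetical grid-cell input: N applied as chemical fertilizer (arbitrary units).
n_fertilizer = 50.0

# Mean of the distribution: 0.22 multiplied by N, as stated in the table.
p_mean = 0.22 * n_fertilizer

rng = random.Random(0)  # fixed seed so the draws are reproducible
draws = [sample_relative_normal(p_mean, 0.25, rng) for _ in range(10000)]
sample_mean = sum(draws) / len(draws)
```

With many draws, `sample_mean` should lie close to `p_mean`, which is a quick sanity check on the sampling.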

NPP involved in plant uptake computation
Normal distribution with four standard deviations between two estimates. The two estimates are both based on the (Kastner et al., 2021) spatial distribution but with the global average of either (Kastner et al., 2021) or (Sun et al., 2021).
P concentration of aboveground plant involved in the computation of P plant uptake
Normal distribution with four standard deviations between two estimates. First estimate of 2.5e-2 gP (100 gFM)−1 (Wang et al., 2018) and second estimate of 1.5e-1 gP (100 gFM)−1 (Lun et al., 2021).
Random value > 2.5e-2 gP (100 gFM)−1
No uncertainty considered (i.e. for each grid-cell, once the random value for P uptake was computed, P in residues was deduced by keeping the same (P uptake : P residues) ratio as with the mean values of uptake and residues).
Two estimates equal to the mean - 50% and the mean + 50%, with a mean composition of 0.4 (inorganic labile), 0.4 (organic labile), 0.2 (stable organic).
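One possible reading of "four standard deviations between two estimates" is sketched here. It rests on two assumptions that the table does not state: the mean is taken as the midpoint of the two estimates, and the condition "random value > 2.5e-2" is enforced by redrawing until it holds. The function names are hypothetical.

```python
import random


def normal_from_two_estimates(est_a, est_b, n_std=4.0):
    """Return (mean, std) of a normal distribution such that the two
    estimates are n_std standard deviations apart.
    Assumption: the mean is the midpoint of the two estimates."""
    lo, hi = min(est_a, est_b), max(est_a, est_b)
    mean = 0.5 * (lo + hi)
    std = (hi - lo) / n_std
    return mean, std


def sample_above(mean, std, lower, rng, max_tries=1000):
    """Draw from N(mean, std^2), redrawing until the value exceeds `lower`.
    Assumption: the lower bound is enforced by rejection sampling."""
    for _ in range(max_tries):
        value = rng.gauss(mean, std)
        if value > lower:
            return value
    raise RuntimeError("no admissible value found")


# Estimates for the P concentration of aboveground plant, in gP (100 gFM)^-1.
mean, std = normal_from_two_estimates(2.5e-2, 1.5e-1)

rng = random.Random(42)
values = [sample_above(mean, std, 2.5e-2, rng) for _ in range(1000)]
```

Under these assumptions the lower estimate sits two standard deviations below the mean, so only a small fraction of draws is rejected.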

Ensure consistency between the different fractions
For each grid-cell, the 1st estimate is equal to 0.20 multiplied by N in manure provided by (Zhang et al., 2017) and the 2nd estimate is equal to the 1st one multiplied by a scaling factor based on the country-scale estimate of P manure produced by livestock (Demay et al., 2023).
Normal distribution defined by a mean and a standard deviation. For each grid-cell, the mean is equal to the value computed following Eq. 37, and a standard deviation equal to 25% of the mean was assumed. The value of 25% was arbitrarily chosen.
For each grid-cell, once the random value for P uptake was computed, P in residues was deduced by keeping the same (P uptake : P residues) ratio as with the mean values of uptake and residues.
Two estimates equal to the mean - 50% and the mean + 50%, with a mean composition of 0.4 (inorganic labile), 0.4 (organic labile), 0.2 (stable organic).

Ensure consistency between the different fractions
Normal distribution defined by a mean and a standard deviation. For each grid-cell, the mean is equal to the value derived from the combination of (Wang et al., 2015) and (Wang et al., 2017), and a standard deviation equal to 60% of the mean was assumed. The value of 60% was derived from values provided at the global scale by (Wang et al., 2017).

Composition of P in atmospheric deposition
No uncertainty considered (i.e. once the random value for P total deposition was computed, the contribution of each source (mineral dust, and others) to the total deposition was deduced by keeping the same contribution as the one computed with the mean value).
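Keeping the contribution of each source fixed amounts to rescaling the mean components by the ratio of the random total to the mean total. A minimal sketch follows; the numerical values and the function name are hypothetical.

```python
def split_by_mean_shares(random_total, mean_components):
    """Distribute a randomly drawn total among its components while
    keeping each component's share equal to its share in the mean."""
    mean_total = sum(mean_components.values())
    if mean_total == 0:
        # No deposition in the mean: every component stays at zero.
        return {name: 0.0 for name in mean_components}
    return {
        name: random_total * value / mean_total
        for name, value in mean_components.items()
    }


# Hypothetical mean P deposition by source for one grid-cell (arbitrary units).
mean_deposition = {"mineral_dust": 0.6, "others": 0.4}

# Hypothetical random value drawn for the total P deposition.
parts = split_by_mean_shares(1.5, mean_deposition)
```

The same rescaling applies wherever the table keeps a composition or a ratio fixed, for example the (P uptake : P residues) ratio.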

Normal distribution defined by a mean and a standard deviation
For each grid-cell, the mean is equal to the value derived from (van Puijenbroek et al., 2019) and (Demay et al., 2023), and a standard deviation equal to 15% of the mean was assumed.
The value of 15% was arbitrarily chosen.

Random value > 0
Normal distribution defined by a mean and a standard deviation. For each grid-cell, the mean is equal to the mean value provided by (Borrelli et al., 2017) and a standard deviation equal to 16% of the mean was considered. The value of 16% corresponds to the upper uncertainty range found in (Borrelli et al., 2017). Note: in (Borrelli et al., 2017), the uncertainty was not centered: -6.68% / +15.6%.
Random value > 0
Near-surface air temperature, soil temperature and soil water content (absolute and relative to the field capacity)
Normal distribution defined by a mean and a standard deviation. For each grid-cell, the mean and standard deviation were computed by using 9 CMIP-6 simulations.
The random value varies between the spatial minimum and spatial maximum of the mean value.
Soil texture, soil water pH, and soil carbon concentration
Normal distribution with 3.75 standard deviations between two estimates. For each grid-cell, the two estimates correspond to the 5% and 95% quantiles provided by SoilGrids 2.0 (Poggio et al., 2021).
The random value varies between the spatial minimum of the 5% quantile and the spatial maximum of the 95% quantile.
Normal distribution defined by a mean and a standard deviation. For each grid-cell, the mean is equal to the mean value provided by (He et al., 2023) and the standard deviation was approximated by the standard error provided by (He et al., 2023). The standard error is preferred here to the standard deviation because, with a random forest (as used to generate the dataset in (He et al., 2023)), the standard error is a measure of the probability of the true value, while the standard deviation is a measure of the probability of samples, which increases with the number of trees used in the random forest. The standard error of the 0.1–0.2 m horizon was used to approximate the standard error of the 0–0.3 m horizon.
The random value varies between (the spatial minimum - the spatial mean of the standard error) and (the spatial maximum + the spatial mean of the standard error).
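The quantile-based strategy with spatial bounds can be sketched as follows. Two points are assumptions rather than statements from the table: the mean is taken as the midpoint of the 5% and 95% quantiles, and the bounds are enforced by clipping (redrawing would be another option). All numerical values and function names are hypothetical.

```python
import random


def quantile_normal_params(q05, q95, n_std=3.75):
    """Return (mean, std) of a normal distribution with n_std standard
    deviations between the two quantile estimates.
    Assumption: the mean is the midpoint of the two quantiles."""
    return 0.5 * (q05 + q95), (q95 - q05) / n_std


def sample_bounded(mean, std, lower, upper, rng):
    """Draw from N(mean, std^2) and clip the value to [lower, upper].
    Assumption: bounds are enforced by clipping, not by redrawing."""
    value = rng.gauss(mean, std)
    return min(max(value, lower), upper)


# Hypothetical soil water pH quantiles for one grid-cell.
mean, std = quantile_normal_params(5.5, 7.0)

# Hypothetical spatial minimum of the 5% quantile and maximum of the 95% quantile.
lower, upper = 3.5, 9.5

rng = random.Random(1)
values = [sample_bounded(mean, std, lower, upper, rng) for _ in range(1000)]
```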

Contribution of each pool to the total P in unmanaged soils
No uncertainty considered (i.e. the contribution of each pool to the total soil P is kept the same for each random value as the contribution to the mean).

Table S1:
Summary of the strategy used to consider the uncertainty related to the different drivers in GPASOIL-v1, i.e. the strategy used to compute a random value for each variable considered.