Mapping peat thickness and carbon stocks of the central Congo Basin using field data

The world’s largest tropical peatland complex is found in the central Congo Basin. However, there is a lack of in situ measurements to understand the peatland’s distribution and the amount of carbon stored in it. So far, peat in this region has been sampled only in largely rain-fed interfluvial basins in the north of the Republic of the Congo. Here we present the first extensive field surveys of peat in the Democratic Republic of the Congo, which covers two-thirds of the estimated peatland area, including from previously undocumented river-influenced settings. We use field data from both countries to compute the first spatial models of peat thickness (mean 1.7 ± 0.9 m; maximum 5.6 m) and peat carbon density (mean 1,712 ± 634 MgC ha−1; maximum 3,970 MgC ha−1) for the central Congo Basin. We show that the peatland complex covers 167,600 km2, 36% of the world’s tropical peatland area, and that 29.0 PgC is stored below ground in peat across the region (95% confidence interval, 26.3–32.2 PgC). Our measurement-based constraints give high confidence of globally significant peat carbon stocks in the central Congo Basin, totalling approximately 28% of the world’s tropical peat carbon. Only 8% of this peat carbon lies within nationally protected areas, suggesting its vulnerability to future land-use change.

inundated 14 to depths up to 1.5 m during the main wet season 15 , suggesting seasonal river flooding and/or upland run-off as key sources of water. Whether peat accumulates under these river-influenced conditions is currently unknown.
In this Article, we present new in situ data on peat presence, thickness and carbon density (mass per unit area) from the central Congo Basin in DRC. We specifically investigated the river-influenced swamp forests along the Congo River and its Ruki, Busira and Ikelemba tributaries, in contrast to previous data collection from interfluvial basins 9 (Fig. 1a). Every 250 m along 18 transects, we recorded vegetation characteristics, peat presence and peat thickness. We targeted a first group of ten transects in locations highly likely to contain peat, to help test hypotheses (detailed in Supplementary Table 1) about the role of vegetation, surface wetness, nutrient status and topography in peat accumulation. To improve mapping capabilities, we sampled a second group of eight transects specifically to test preliminary maps that gave conflicting results or suspected false predictions of peat presence (detailed in Supplementary Table 1). We combine these new field measurements from DRC with previous transect records in ROC using the same protocols 9 and other ground-truth data (Supplementary Table 2) to produce (1) a second-generation map of peatland extent, (2) a first-generation map of peat thickness and (3) a first-generation map of below-ground peat carbon density for the central Congo Basin. These maps enable us to compute the first well-constrained estimate of total below-ground peat carbon stocks in the world's largest tropical peatland complex.

Mapping peatland extent
We found peat along all ten hypothesis-testing transects in DRC that were predicted to be peatlands 9 . Our new field data show that extensive carbon-rich peatlands are present in the forested wetlands of the DRC's Cuvette Centrale, including in geomorphologically distinct river-influenced regions predicted as peatlands by ref. 9 .
Fig. 1 | a, Points indicate transects, coloured by region. The Congo and Ruki River regional groups appear to be in largely river-influenced peatlands, predominating in DRC, sampled for this study. The Likouala-aux-Herbes River and Ubangi River regional groups are in largely rain-fed interfluvial basins, predominating in ROC, from ref. 9. The base map, in green, shows the first-generation peat swamp forest map 9. Inset: location of the central Congo Basin peatlands. b, Predicted land-cover classes across the central Congo Basin from this study as the most likely class per pixel (>50%), using a legend identical to ref. 9 to facilitate comparison. In both panels, national boundaries are black lines; sub-national boundaries are grey lines; non-peat-forming forest includes both terra firme and non-peat-forming seasonally inundated forests. Panel a adapted from ref. 9, Springer Nature Limited.
The best-performing algorithm to map the peatlands was the maximum likelihood (ML) classifier, because of its ability to most accurately predict in regions with no training data (Methods). ML was run 1,000 times on nine remotely sensed datasets using a random two-thirds of 1,736 ground-truth data points each time (Extended Data Fig. 1), giving a median total peatland area for the central Congo Basin of 167,600 km2 (95% confidence interval (CI), 159,400-175,100 km2). This is 15% higher than the previous estimate 9. We found that 90% of all pixels that are predicted as peat in the median map result were predicted as peat in at least 950 out of 1,000 runs (that is, with ≥95% probability, either as hardwood- or palm-dominated peat swamp forest). Of the 382 locations assessed across DRC, 77.7% were correctly classified as either being peat swamp or not by the first-generation map 9. Comparing our new map with the first-generation map 9 shows large areas of agreement (white in Fig. 2).
However, we predict areas of peat that were previously not mapped 9 , particularly around Lake Mai-Ndombe and the Ngiri and upper Congo/Lulonga rivers in DRC (red in Fig. 2). In addition, small areas of previously predicted peat deposits 9 are no longer predicted by our new model, particularly along the Sangha and Likouala-Mossaka rivers in ROC (blue in Fig. 2). These areas of difference are probably areas of high uncertainty and should therefore be priorities for future fieldwork.
More formally, we compare our new second-generation map with the original map 9 using balanced accuracy (BA), which is similar to the Matthews correlation coefficient (MCC) but better suited for comparison across different datasets 16. For our new map, median BA is 91.9% (95% CI, 90.2-93.6%), compared with 89.8% (86.0-93.4%) for the first-generation map 9. The substantially smaller BA interval indicates improved confidence in our new peatland map, despite only a small increase in median BA. The small increase is probably due to the effect of our larger sample size being partly offset by an increase in its spatial extent and ecological diversity, particularly data from the Congo River region, where all algorithms that we tested underperform (Supplementary Table 3). Overall, our in situ data from DRC, including from river-influenced settings that are being reported for the first time, confirm the central Congo Basin peatlands as the world's largest tropical peatland complex, and that DRC and ROC are the second and third most important countries in the tropics for peatland area after Indonesia 1, respectively (Extended Data Fig. 2).

Mapping peat thickness and carbon density
We measured peat thickness at 238 locations in DRC (including 59 laboratory-verified measurements; Extended Data Fig. 3), finding a mean (±s.d.) thickness of 2.4 (±1.6) m and a maximum of 6.4 m. This shows that river-influenced peatlands can attain similar peat thickness as rain-fed interfluvial basins reported in ROC 9 (Table 1). There is no uniform increase in peat thickness with distance from the peatland margin (Extended Data Fig. 4), with linear regression being only a modest fit (R 2 = 41.0%; root-mean-square error (RMSE) = 1.21 m). Thus, we developed a random forest (RF) regression to estimate peat thickness, using 463 thickness measurements across both countries. Our final RF model includes four predictors after variable selection (Methods): distance from the peatland margin, precipitation seasonality, climatic water balance (precipitation minus potential evapotranspiration) and distance from the nearest drainage point (R 2 = 93.4%; RMSE = 0.42 m). The RF model outperforms multiple linear regression with interactions using the same four variables (adjusted R 2 = 73.6%, RMSE = 0.80 m; Extended Data Fig. 5).
Spatially, we predict thick peat deposits in the centres of the largest interfluvial basins (far from peatland margins), and in smaller, river-influenced valley-floor peatlands along the Ruki/Busira rivers (Fig. 3a). The river valleys' thick deposits are probably driven by greater climatic water balance and lower precipitation seasonality in the eastern part of the Cuvette Centrale region (Extended Data Fig. 6), plus potentially greater water inputs from nearby higher ground, which offsets the shorter distances from peatland margins. Our modelled results are consistent with our field data, as the two deepest peat cores are from the interfluvial Centre transect in ROC (5.9 m) and the river-influenced Bondamba transect on the Busira River in DRC (6.4 m). Overall, mean (±s.d.) modelled peat thickness (1.7 ± 0.9 m) is lower than our field measurements (2.4 ± 1.5 m; Table 1), as expected given our linear transects, which oversample deeper peat at the centre relative to the periphery in approximately ovoid peatlands. Areas of high uncertainty in peat thickness occur where distance from the margin is uncertain (Fig. 3b). Our results contrast strongly with an 'expert system approach' that assigned peat-thickness values on the basis of hydrological terrain relief alone and estimated a mean thickness of 6.5 ± 3.5 m for the central Congo Basin peatlands 17, compared with our field-derived estimate of 1.7 ± 0.9 m (Fig. 3a). After distance from the margin, precipitation seasonality and climatic water balance are the most important predictors of peat thickness in the RF model, reflecting the relative importance of rainfall inputs in peat accumulation in central Congo. This appears to differ from smaller-scale assessments in temperate 18 or other tropical peatlands 19, where surface topography (elevation and slope) is the primary predictor of peat thickness.
However, this is potentially merely an artefact of the spatial scale of the studies, as climate varies only over large scales. Alternatively, the relatively low rainfall in the central Congo Basin (~1,700 mm yr−1), compared with other tropical peatland regions (for example, ~2,500-3,000 mm yr−1 in Northwest Amazonia and Southeast Asia) 9,20, may mean that peat thickness is more strongly related to climate in central Congo, as it implies greater exposure to (seasonal) drought conditions that may cross thresholds that negatively impact peat accumulation rates.
Peat bulk density measured across the central Congo Basin is 0.17 ± 0.06 g cm−3 (mean ± s.d.; n = 80 cores), and mean carbon concentration is 55.7 ± 3.2% (n = 80; 56.6 ± 4.5% for the 22 well-sampled cores). While peat bulk density is significantly lower in largely river-influenced sites than in rain-fed interfluvial basins (P < 0.01), no significant difference between these peatland types is found for either peat carbon concentration or carbon density (mass per unit area; Table 1).
Fig. 4 | Modelled peat carbon density and its uncertainty. a, Regressions (Extended Data Fig. 7) applied to 100 peat-thickness estimates (Fig. 3a), generating 2,000 carbon-density estimates. b, Relative uncertainty (%) of the carbon-density estimate, expressed as ± half the width of the 95% CI as percentage of the median. Black lines represent national boundaries; grey lines represent sub-national administrative boundaries.
We used the peat-thickness, bulk density and carbon concentration measurements to construct a linear peat-thickness-carbon-density regression (Extended Data Fig. 7). We applied this regression model to our peat-thickness map to spatially model carbon stocks per unit area (Fig. 4a). Modelled below-ground peat carbon density for the central Congo Basin is 1,712 ± 634 MgC ha−1, similar to the field-measured mean of 1,741 ± 1,186 MgC ha−1 (mean ± s.d., n = 80; Table 1). This carbon density is approximately nine times the mean carbon stored in above-ground live tree biomass of African tropical moist forests (~198 MgC ha−1) 21. Compared with recently mapped peatlands in the lowland Peruvian Amazon (mean 867 MgC ha−1) 22, the central Congo peatlands store almost twice as much carbon per hectare. Spatial patterns of peat carbon density (Fig. 4a) and uncertainty (Fig. 4b) follow patterns similar to those of peat thickness (Fig. 3a,b).

Estimating basin-wide peat carbon stocks
Median estimated total peat carbon stock in the central Congo Basin is 29.0 Pg (95% CI, 26.3-32.2; Extended Data Fig. 8a), based on bootstrapping the area estimate and peat-thickness-carbon-density regression. This is similar to the median 30.6 PgC reported by ref. 9, but their lower 95% CI was 6.3 Pg, which our study increases to 26.3 Pg. This constraint on the carbon-stock estimate is possible because our larger field-based dataset allows a spatial modelling approach so that we can sum carbon density across all peat pixels. Therefore, the possibility of low values of carbon storage in the central Congo peatlands can now confidently be discarded.
Our new results show that the central Congo Basin peatlands are a globally important carbon stock. About two-thirds of this peat carbon is in DRC (19.6 PgC; 95% CI, 17.9-21.9), and one-third in ROC (9.3 PgC; 95% CI, 8.4-10.2; Extended Data Fig. 2), which is equivalent to approximately 82% and 238% of each country's above-ground forest carbon stock, respectively 23 . The high peat carbon stocks are found across several administrative regions in both countries, with the largest stocks in DRC's Équateur province (Extended Data Fig. 2). Sensitivity analysis shows that uncertainty in total peat carbon stock is now driven mostly by uncertainty in peatland area (Extended Data Fig. 8b).
Because the central Congo peatlands are relatively undisturbed 24,25 , our new maps of peatland extent, thickness and carbon density form a baseline description for the decade 2000-2010, given the remotely sensed data used. Today the peatlands of the central Congo Basin are threatened by hydrocarbon exploration, logging, palm oil plantations, hydroelectric dams and climate change 24,26 . While the peatlands are largely within a UN Ramsar Convention transboundary wetland designation, we estimate that only 2.4 PgC in peat, just 8% of total stocks, currently lies within formal national-level protected areas (Extended Data Figs. 9 and 10). Meanwhile, logging, mining or palm oil concessions together overlie 7.4 PgC in peat, or 26% of total stocks (Extended Data Figs. 9 and 10), while hydrocarbon concessions cover almost the entire peatland complex 24,26 .
Our results show that the central Congo Basin peatlands cover approximately 36% of the world's tropical peatland area, and store approximately 28% of the world's tropical peat carbon 5 . Therefore, keeping the central Congo Basin peatlands wet is vital to prevent peat carbon being released to the atmosphere. The identification of extensive river-influenced peatlands suggests that there is more than one geomorphological setting where peat is found in the central Congo Basin. Further work is required to understand both the sources and flows of water in these river-influenced peatlands, specifically the relative contributions of water from precipitation, riverbank overflow and run-off from higher ground to peat formation and maintenance. Given the current areas of formal protection of peatlands are largely centred around interfluvial basins, we suggest that additional protective measures will be needed to safeguard the newly identified river-influenced peatlands of the central Congo Basin. Keeping the central Congo peatlands free from disturbance would also help protect the rich biodiversity, including forest elephants, lowland gorillas, chimpanzees and bonobos 24,27,28 , that form part of this globally important but threatened ecosystem.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41561-022-00966-7.

Methods
Field-data collection. Fieldwork was conducted in DRC between January 2018 and March 2020. Ten transects (4-11 km long) were installed, identical to the approach in ref. 9, in locations that were highly likely to be peatland. These were selected to help test hypotheses about the role of vegetation, surface wetness, nutrient status and topography in peat accumulation (Fig. 1a and Supplementary Table 1). A further eight transects (0.5-3 km long) were installed to assess our peat mapping capabilities (Fig. 1a and Supplementary Table 1).
Every 250 m along each transect, land cover was classified as one of six classes: water, savannah, terra firme forest, non-peat-forming seasonally inundated forest, hardwood-dominated peat swamp forest or palm-dominated peat swamp forest. Peat swamp forest was classified as palm dominated when >50% of the canopy, estimated by eye, was palms (commonly Raphia laurentii or Raphia sese). In addition, several ground-truth points were collected at locations in the vicinity of each transect from the clearly identifiable land-cover classes water, savannah and terra firme forest.
Peat presence/absence was recorded every 250 m along all transects, and peat thickness (if present) was measured by inserting metal poles into the ground until the poles were prevented from going any further by the underlying mineral layer, identical to the pole method of ref. 9 . In addition, a core of the full peat profile was extracted every kilometre along the ten hypothesis-testing transects, if peat was present, with a Russian-type corer (52 mm stainless steel Eijkelkamp model); these 63 cores were sealed in plastic for laboratory analysis.
Peat-thickness laboratory measurements. Peat was defined as having an organic matter (OM) content of ≥65% and a thickness of ≥0.3 m (sensu ref. 9 ). Therefore, down-core OM content of all 63 cores was analysed to measure peat thickness. The organic matter content of each 0.1-m-thick peat sample was estimated via loss on ignition (LOI), whereby samples were heated at 550 °C for 4 h. The mass fraction lost after heating was used as an estimate of total OM content (% of mass). Peat thickness was defined as the deepest 0.1 m with OM ≥ 65%, after which there is a transition to mineral soil. Samples below this depth were excluded from further analysis. Rare mineral intrusions into the peat layer above this depth, where OM < 65% for a sample within the peat column, were retained for further analysis. In total, 59 out of 63 collected cores had LOI-verified peat thickness ≥0.3 m.
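The thickness rule above can be sketched as a short function over a down-core LOI profile. This is a minimal illustration, assuming one OM value per consecutive 0.1-m sample, listed from the surface downwards; the function name, the return value of 0.0 for cores that do not qualify as peat, and the lack of any check for a clean transition to mineral soil are simplifications of this sketch, not part of the published protocol.

```python
OM_THRESHOLD = 65.0      # minimum organic matter content (% of mass) for peat
MIN_THICKNESS_M = 0.3    # minimum thickness (m) for a core to count as peat


def peat_thickness_m(om_percent):
    """Peat thickness (m) from LOI organic matter of consecutive 0.1-m samples.

    om_percent lists samples from the surface downwards. Thickness is taken
    at the deepest sample with OM >= 65%; shallower samples with OM < 65%
    (mineral intrusions) remain inside the peat column. Returns 0.0 when the
    column is thinner than 0.3 m (an assumption of this sketch).
    """
    deepest = 0
    for position, om in enumerate(om_percent, start=1):
        if om >= OM_THRESHOLD:
            deepest = position
    thickness = deepest / 10  # each sample is 0.1 m thick
    return thickness if thickness >= MIN_THICKNESS_M else 0.0


# A core with one mineral intrusion at 0.2-0.3 m depth.
print(peat_thickness_m([90, 85, 60, 70, 40, 20]))
```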
The pole method used to estimate peat thickness in the field was calibrated against LOI-verified measurements by fitting a linear regression model between all LOI-verified and pole-method peat-thickness measurements sampled at the same location (93 sites across ROC and DRC, including 37 from ref. 9). Three measurements from DRC with a Cook's distance >4× the mean Cook's distance were excluded as influential outliers. Mean pole-method offset was significantly higher along the DRC transects (0.94 m) than along those in ROC (0.48 m; P < 0.001) due to the presence of softer alluvium substrate in river-influenced sites in DRC. We therefore added this grouping as a categorical variable to the regression. The resulting model (adjusted R2 = 0.95; P < 0.001; Extended Data Fig. 3) was used to correct all pole-method measurements in each group for which no LOI-verified thickness was available: corrected peat thickness = −0.1760 + 0.8626 × (pole-method thickness) − 0.3284 × (country), with country dummy coded as ROC (0) and DRC (1).
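The calibration equation above can be applied directly. The sketch below uses the published coefficients; the function name and the string codes for the two countries are illustrative choices, and thickness is assumed to be in metres.

```python
# Coefficients of the pole-method calibration quoted in the text.
INTERCEPT = -0.1760
SLOPE = 0.8626
DRC_TERM = -0.3284   # applied when the country dummy is 1 (DRC)


def correct_pole_thickness(pole_m, country):
    """Corrected peat thickness (m) from a pole-method reading (m).

    country is "ROC" (dummy 0) or "DRC" (dummy 1); these string codes are
    an assumption of this sketch.
    """
    if country not in ("ROC", "DRC"):
        raise ValueError("country must be 'ROC' or 'DRC'")
    dummy = 1 if country == "DRC" else 0
    return INTERCEPT + SLOPE * pole_m + DRC_TERM * dummy


for site in ("ROC", "DRC"):
    print(site, round(correct_pole_thickness(2.0, site), 4))
```

For a 2.0 m pole reading this gives a larger downward correction in DRC than in ROC, in line with the larger mean offset reported for the softer substrate there.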

Carbon-density estimates. To calculate carbon density (mass per unit area), estimates of carbon storage in each 0.1-m-thick peat sample (thickness × bulk density × carbon concentration) were summed to provide an estimate of total carbon density per core (MgC ha−1), identical to ref. 9. We estimated carbon density for 80 peat cores (OM ≥ 65%, thickness ≥ 0.3 m), located every other kilometre along 18 transects, including 37 cores from the ten transects used for hypothesis testing in DRC and 43 cores from eight transects in ROC 9.
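The per-core summation can be written out with explicit unit conversion. This is a minimal sketch, assuming bulk density in g cm−3 (numerically equal to Mg m−3) and carbon concentration in % of dry mass for each consecutive 0.1-m sample; the function name is illustrative.

```python
SAMPLE_THICKNESS_M = 0.1   # thickness of each peat sample (m)
M2_PER_HA = 10_000         # square metres per hectare


def core_carbon_density(bulk_density_g_cm3, carbon_percent):
    """Carbon density of one core (MgC per ha).

    Sums thickness x bulk density x carbon fraction over all samples.
    1 g cm-3 equals 1 Mg m-3, so each term is in MgC per m2 before the
    conversion to hectares.
    """
    if len(bulk_density_g_cm3) != len(carbon_percent):
        raise ValueError("one bulk density and one carbon value per sample")
    total_mgc_m2 = 0.0
    for bd, c in zip(bulk_density_g_cm3, carbon_percent):
        total_mgc_m2 += SAMPLE_THICKNESS_M * bd * (c / 100)
    return total_mgc_m2 * M2_PER_HA


# Two samples (0.2 m of peat) at 0.2 g cm-3 and 50% carbon.
print(round(core_carbon_density([0.2, 0.2], [50, 50]), 1))
```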
Peat thickness of the 80 cores was obtained by laboratory LOI. To estimate peat bulk density, every other 0.1 m down-core, samples of a known peat volume were weighed after being dried for 24 h at 105 °C (n = 906). Bulk density (g cm −3 ) was then calculated by dividing the dry sample mass (g) by the volume of the sample taken from the peat corer dimensions (cm 3 ). Within each core, linear interpolation was used to estimate bulk density for the alternate 0.1-m-thick samples of the core that were not measured.
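Filling the unmeasured alternate samples by linear interpolation might look like the sketch below. Unmeasured samples are marked with None; holding the nearest measured value at the top or bottom of a core, where there is no measurement on one side, is an assumption of this sketch and is not stated in the text.

```python
def fill_alternate(values):
    """Fill None entries by linear interpolation between measured neighbours.

    values is ordered down-core, one entry per 0.1-m sample. Gaps at either
    end of the core take the nearest measured value (sketch assumption).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("at least one measured sample is required")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        above = [k for k in known if k < i]   # shallower measured samples
        below = [k for k in known if k > i]   # deeper measured samples
        if above and below:
            lo, hi = above[-1], below[0]
            weight = (i - lo) / (hi - lo)
            filled[i] = values[lo] + weight * (values[hi] - values[lo])
        else:
            nearest = above[-1] if above else below[0]
            filled[i] = values[nearest]
    return filled


# Bulk density (g cm-3) measured in every other sample.
print(fill_alternate([0.10, None, 0.20, None]))
```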
For total carbon concentration (%), only the deepest core per transect, plus additional deep cores from the Lokolama transect (1) in DRC and Ekolongouma transect (3) in ROC (22 in total, 11 from DRC and 11 from ROC 9 ) were sampled down-core. Every other 0.1-m-thick sample was measured using an elemental analyser (Elementar Vario MICRO Cube with thermal conductivity detection for all cores, except those from Boboka, Lobaka and Ipombo transects, which were analysed using Sercon ANCA GSL with isotope-ratio mass spectrometer detection, due to COVID-19 disruption). All samples (n = 422) were pre-dried for 48 h at 40 °C and ground to <100 μm using an MM301 mixer mill. Again, linear interpolation was used within each core for the alternate samples that were not measured.
The remaining 58 cores had less-intensive carbon concentration sampling. We therefore interpolated the carbon concentration for each 0.1-m-thick sample because well-sampled cores show a consistent pattern with depth: an increase to a depth of about 0.5 m followed by a long, very weak decline and finally a strong decline over the deepest approximately 0.5 m of the core 9 . We used segmented regression on the 22 well-sampled cores (segmented package in R, version 1.3-1) to parameterize the three sections of the core, using the means of these relationships to interpolate carbon concentrations for the remaining 58 cores, following ref. 9 .
To estimate carbon density from modelled peat thickness across the basin, we developed a regression model between peat thickness and per-unit-area carbon density using the 80 sampled cores. We compared linear regressions for normal, logarithmic- and square-root-transformed peat thickness, selecting the model with the lowest corrected Akaike information criterion (AICc) and highest R2. A linear model with square-root-transformed peat thickness was found to provide the best fit (R2 = 0.86; P < 0.001; Extended Data Fig. 7). Bootstrapping was applied (boot package in R, version 1.…) to assess uncertainty around the regression.
Modelling peatland extent. Satellites cannot detect peat directly. We therefore mapped vegetation and used field-based associations between peat and vegetation to infer peat presence 9,29. Five land-cover classes were used for the purpose of peatland mapping: water, savannah, palm-dominated peat swamp forest, hardwood-dominated peat swamp forest and non-peat-forming forest. In this classification, field recordings of non-peat-forming seasonally inundated forest (<30 cm thickness of ≥65% OM) were grouped with field recordings of terra firme forest, which also does not form peat, to form the non-peat-forming forest class. Our field recordings of hardwood- or palm-dominated peat swamp forest, by definition, consist of all forest sites that form peat, including any seasonally inundated forest that forms peat (≥30 cm of ≥65% OM).
A total of 1,736 ground-truth data points were used: 172 in water, 476 in savannah, 632 in non-peat-forming forest (97 non-peat-forming seasonally inundated forest and 535 terra firme forest), 188 in palm-dominated peat swamp forest and 268 in hardwood-dominated peat swamp forest (Extended Data Fig. 1). These data come from eight sources (Supplementary Table 2): first, ground-truth locations collected for this study using a GPS (Garmin GPSMAP 64s) at all transect sites in DRC for which a land-cover class was determined (382 points); second, published ground-truth data from nine transects in ROC (292 points) 9; third, 299 GPS locations of known savannah and terra firme forest land-cover classes from archaeological research databases across the basin 30,31; fourth, 191 GPS locations from permanent long-term forest inventory plots of the African Tropical Rainforest Observation Network, mostly from terra firme forest 32, retrieved from the ForestPlots database 33,34; fifth, 229 GPS data points from terra firme forest or savannah locations in and around Lomami National Park (R. Batumike, G. Imani and A. Cuní-Sanchez, personal communication); sixth, 24 published savannah data points in and around Lomami National Park 35; seventh, 23 published locations of savannah, terra firme forest and palm- or hardwood-dominated peat swamp forest in DRC 11; eighth, 296 data points from Google Earth for unambiguous savannah and water sites (middle of lakes or rivers) distributed across the region.
We used nine remote-sensing products to map peat-associated vegetation (Supplementary Fig. 1). Eight of these are identical to those used by ref. 9: three optical products (Landsat 7 ETM+ bands 5 (SWIR 1), 4 (NIR) and 3 (Red)); three L-band synthetic-aperture radar products (ALOS PALSAR HV, HH and HV/HH); and two topographic products (SRTM DEM (digital elevation model) void-filled with ASTER GDEM v.2 data and slope; acquisition date 2000). To these, we added a HAND (height above nearest drainage point) index, which significantly improved model performance (median Matthews correlation coefficient (MCC) 79.7%, compared with 77.8% or 75.6% for just DEM or HAND alone, respectively; P < 0.001).
HAND was derived from the SRTM DEM with the algorithm from ref. 36 , using the HydroSHEDS global river network at 15 s resolution as reference product 37 . Alternative NASADEM-or MERIT DEM-derived 38-40 combinations of DEM, HAND and slope were tested with an initial subset of data in R, while keeping all other remote-sensing products the same (median MCC: 79.0% and 75.1%, respectively), but did not significantly improve model performance compared with SRTM-derived products (80.9% median MCC; P < 0.001).
The Landsat bands are pre-processed, seamless cloud-free mosaics for ROC (composite of three years, 2000, 2005 and 2010) and DRC (composite of six years, 2005-2010) 41. These mosaics performed better than more recent basin-wide automated cloud-free Sentinel-2 mosaics that we developed (bands 5, 8A and 11; composite of five years, 2016-2020), probably because they contain fewer directional reflectance artefacts (the median MCC of 80.9% for the pre-processed Landsat mosaics is significantly higher than the 78.1% for the Sentinel-2 mosaics, P < 0.005).
The ALOS PALSAR radar bands are mosaics of mean values of annual JAXA composites for the years 2007-2010 (ref. 9 ). More recent radar data (ALOS-2 PALSAR-2 HV, HH, HV/HH; 2015-2017) did not significantly improve model performances (median MCC 80.9% and 80.6%, respectively; P < 0.01). All remote-sensing products were resized to a common 50 m grid using a cubic convolution resampling method.
We then tested which classification algorithm to use, as more sophisticated algorithms might improve overall accuracy against our training dataset but might also reduce regional accuracy of the map in areas far from test data, critical in this case given large areas of the central Congo peatland region remain unsampled.
Three supervised classification algorithms were tested in order of increasing complexity: ML, support-vector machine (SVM) and RF. We assessed each classifier using both a random and a spatial cross-validation (CV) approach 42-44. Random CV was implemented using stratified two-thirds Monte Carlo selection, whereby 1,000 times, we randomly selected two-thirds of all data points per class as training data, to be evaluated against the remaining one-third per class as testing data. Spatial CV was implemented by grouping all transect data points in four distinct hydro-geomorphological regions: (1) transects perpendicular to the black-water Likouala-aux-Herbes River (n = 179 data points); (2) transects perpendicular to the white-water Ubangi River (n = 113); (3) transects perpendicular to the Congo River, intermediate between black and white water (n = 123); and (4) transects perpendicular to the black-water Ruki, Busira and Ikelemba rivers, plus other nearby transects (collectively named the Ruki group; n = 258). To each group we added ground-truth data points from other non-transect data sources (Supplementary Table 2) that belonged to the same map regions (n = 82, 27, 20 and 113, respectively). We then tested 1,000 times how well each classifier performs in each of the four regions when trained only on a stratified two-thirds Monte Carlo selection of the remaining data points (data points from the three other regional transect groups) plus ground-truth data points not associated with or near any transect group (n = 821; for example, the savannah and terra firme forest data points in Lomami National Park in DRC, which are far (>300 km) from any transect group).
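One draw of the stratified two-thirds Monte Carlo selection described above can be sketched as follows. The function name, the use of index lists and the rounding of the per-class training size are assumptions of this sketch; the study ran its classifiers in GEE, IDL-ENVI and R, not in Python.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=2 / 3, rng=None):
    """Split point indices into train/test sets, class by class.

    Within every land-cover class, a random train_frac of the points goes
    to training and the rest to testing, so class proportions are kept.
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train_frac)
        train.extend(shuffled[:n_train])
        test.extend(shuffled[n_train:])
    return sorted(train), sorted(test)


# Toy data: 9 peat points and 6 non-peat points, one Monte Carlo draw.
labels = ["peat"] * 9 + ["non-peat"] * 6
train_idx, test_idx = stratified_split(labels, rng=random.Random(0))
print(len(train_idx), len(test_idx))
```

Repeating the draw 1,000 times with fresh random states gives the random CV; for spatial CV, the points of the held-out region would first be removed from the pool and used only for testing.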
Model performance was based on MCC for binary peat/non-peat predictions (hardwood- and palm-dominated peat swamp forest classes combined into one peat class; water, savannah and non-peat-forming forest combined into one non-peat class). We compared MCC, rather than popular metrics such as Cohen's kappa, F1 score or accuracy, because it is thought to be the most reliable evaluation metric for binary classifications 45,46. We also computed BA from random CV to compare with the first-generation map. While less robust than MCC, BA is independent of imbalances in the prevalence of positives/negatives in the data, thus allowing better comparison between classifiers trained on different datasets 16. The best estimate of each accuracy metric or area estimate per model or region is the median value of 1,000 runs, alongside a 95% CI.
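Both metrics follow from the binary peat/non-peat confusion matrix. The sketch below uses the standard definitions of MCC and BA; the function names and the toy counts are illustrative and are not taken from the study.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (peat recall) and specificity (non-peat recall)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy test set: 100 peat points (90 found), 100 non-peat points (80 found).
print(round(mcc(tp=90, fp=20, tn=80, fn=10), 3))
print(round(balanced_accuracy(tp=90, fp=20, tn=80, fn=10), 3))
```

Because BA averages the two per-class recall rates, it does not change when one class is simply over-represented in the test data, which is the property the text relies on when comparing maps trained on different datasets.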
In the case of SVM and RF, random CV models were implemented in Google Earth Engine (GEE) 47 using all nine remote-sensing products. However, because ML is currently not supported by GEE, random CV with this algorithm was implemented in IDL-ENVI software (version 8.7-5.5), using a principal component analysis to reduce the nine remote-sensing products to six uncorrelated principal components to reduce computation time. All spatial CV models were implemented in R (superClass function from the RStoolbox package, version 0.2.6), with principal component analysis also applied in the case of ML only. All RF models were trained using 500 trees, with three input products used at each split in the forest (the default, the square root of the number of variables). All SVM models were implemented with a radial basis function kernel, with all other parameters set to default values.
Comparison of the ML, SVM and RF models with the model performance of ref. 9 , using balanced accuracy from random CV, shows improved results only in the case of the ML classifier (Supplementary Table 3). Comparing MCC using the spatial CV approach, we found that the ML algorithm is also most transferable to regions for which we lack training data. While RF gives slightly better MCC with random CV, when no regions are omitted, spatial CV shows particularly poor predictive performance of this algorithm for the Congo and Ruki regions when trained on data from the other regions. SVM has the lowest MCC of all three classifiers with random CV and performs worst of all three in the Congo region with spatial CV.
In addition, applying spatial CV to the largely interfluvial basin region (ROC transects; n = 401) and the largely river-influenced region (DRC transects; n = 514) also shows RF performs poorly (Supplementary Table 3). This further supports selecting the ML algorithm to produce our second-generation peat-extent map of the central Congo peatlands. The final peatland-extent estimate is then obtained as the median value (alongside 95% CI) of the combined hardwood- and palm-dominated peat swamp forest extent from 1,000 ML runs, each time trained with two-thirds of the ground-truth data.
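Summarising the 1,000 per-run area estimates as a median with a 95% interval could be done as below. Taking the 2.5th and 97.5th percentiles of the runs is an assumption of this sketch, since the text does not state how the interval was computed; the function names are illustrative.

```python
def percentile(sorted_values, q):
    """Percentile q (0-100) of pre-sorted values, by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


def summarise_runs(values):
    """Return (median, lower, upper) with a 2.5-97.5 percentile interval."""
    ordered = sorted(values)
    return (
        percentile(ordered, 50),
        percentile(ordered, 2.5),
        percentile(ordered, 97.5),
    )


# Placeholder run results: the integers 1..1000 stand in for area estimates.
median, lower, upper = summarise_runs(range(1, 1001))
print(median, lower, upper)
```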
Modelling peat thickness. A map of distance from the peatland margins was developed in GEE using the median ML peat probability map, that is, the ML map with a 50% peat probability threshold (>500 hardwood- or palm-dominated peat swamp predictions out of 1,000 runs). For each peat pixel in this binary classification, a cost function was used to calculate the Euclidean distance to the nearest non-peat pixel after speckle and noise were removed using a 5 × 5 square-kernel majority filter. Using this distance map, transects were found to have markedly different relationships between peat thickness and distance from the peatland margin, that is, different slopes (n = 18; P < 0.001; Extended Data Fig. 4). The modest linear fit (R2 = 41.0%; RMSE = 1.21 m) cautions against a uniform regression between peat thickness and distance from the margin across the basin.
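The three raster steps described above (thresholding at 50% of runs, majority filtering and distance to the nearest non-peat pixel) can be illustrated on a small grid. This is a brute-force sketch in plain Python, not the GEE cost-function implementation used in the study; the function names, the handling of edge windows and the default 50 m pixel size are assumptions made for illustration.

```python
import math


def threshold_counts(counts, n_runs=1000):
    """Binary peat mask: 1 where more than half of the runs predicted peat."""
    return [[1 if c > n_runs / 2 else 0 for c in row] for row in counts]


def majority_filter(grid, size=5):
    """Assign each cell the majority class of its size x size window.

    Windows are clipped at the grid edge (an assumption of this sketch).
    """
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - half), min(rows, r + half + 1))
                      for j in range(max(0, c - half), min(cols, c + half + 1))]
            out[r][c] = 1 if sum(window) * 2 > len(window) else 0
    return out


def distance_to_margin(peat, pixel_size_m=50.0):
    """Euclidean distance (m) from each peat pixel to the nearest non-peat pixel.

    Non-peat pixels get 0.0. Brute force, so only suitable for small grids.
    """
    rows, cols = len(peat), len(peat[0])
    non_peat = [(i, j) for i in range(rows) for j in range(cols) if peat[i][j] == 0]
    dist = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if peat[r][c] == 1 and non_peat:
                dist[r][c] = pixel_size_m * min(
                    math.hypot(r - i, c - j) for i, j in non_peat)
    return dist


if __name__ == "__main__":
    # Toy 5 x 5 mask: a 3 x 3 peat patch surrounded by non-peat.
    mask = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
            for r in range(5)]
    for row in distance_to_margin(mask):
        print(row)
```

For rasters of realistic size, a dedicated distance transform would replace the brute-force search.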
Instead, we developed a spatially explicit RF regression model to predict peat thickness, derived from 14 remotely sensed potential covariates that may explain variation in peat thickness. These 14 variables included the nine optical, radar and topographic products used in the peatland-extent analysis, as well as distance from the peatland margin, distance from the nearest drainage point (same reference network as for HAND) 37, precipitation seasonality 48, climatic water balance (mean annual precipitation 48 minus mean annual potential evapotranspiration 49) and live woody above-ground biomass 50. Ten of these variables were found to be significantly correlated with peat thickness (Kendall's τ, P < 0.01): all three optical bands, all three radar bands, distance from the peatland margin, distance from the nearest drainage point, precipitation seasonality and climatic water balance. Applying stepwise backwards selection, we tested combinations of these ten predictors by each time dropping one predictor out of the model in order from low to high variable importance, selecting as the best model the one with highest median R2 and lowest median RMSE obtained from 100 random (two-thirds) CVs. The importance of each variable was assessed by calculating mean decrease impurity, the total decrease in the residual sum of squares of the regression after splitting on that variable, averaged over all decision trees in the RF. Median mean decrease impurity was calculated for each variable on the basis of 100 random (two-thirds) CVs of the overall model containing all ten significant predictors.
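The backwards selection loop can be sketched as follows. This is an illustrative outline only: `cv_score` is a hypothetical callable standing in for the 100 random two-thirds CVs of the RF model, and the rule used to rank candidate models (highest median R2, ties broken by lowest median RMSE) is an assumption, since the text does not state how the two criteria were combined.

```python
def backward_selection(predictors_by_importance, cv_score):
    """Drop predictors one at a time, least important first.

    predictors_by_importance: names ordered from lowest to highest importance.
    cv_score: hypothetical callable mapping a list of predictor names to
        (median_r2, median_rmse) from repeated cross-validation.
    Returns the best subset and its (median_r2, median_rmse).
    """
    remaining = list(predictors_by_importance)
    best_subset, best = list(remaining), cv_score(remaining)
    while len(remaining) > 1:
        remaining = remaining[1:]  # remove the least important predictor
        score = cv_score(remaining)
        # Assumed ranking: higher R2 first, then lower RMSE.
        if (score[0], -score[1]) > (best[0], -best[1]):
            best_subset, best = list(remaining), score
    return best_subset, best


if __name__ == "__main__":
    # Toy scores keyed by subset size (made-up values, not study results).
    toy_r2 = {5: 0.70, 4: 0.75, 3: 0.80, 2: 0.78, 1: 0.60}

    def toy_cv_score(subset):
        r2 = toy_r2[len(subset)]
        return r2, 1 - r2

    print(backward_selection(["a", "b", "c", "d", "e"], toy_cv_score))
```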
The best model contained four predictors: distance from the peatland margin, distance to the nearest drainage point, climatic water balance (all positively correlated with peat thickness; Kendall's τ coefficient = 0.49, 0.15 and 0.13, respectively; P < 0.001 for all) and precipitation seasonality (negatively correlated with thickness; Kendall's τ = −0.11; P < 0.01); see Extended Data Fig. 6 for their spatial variability.
The RF regression was implemented in GEE with 500 trees and all other parameters set to default values. Predictor variables were resampled to 50 m resolution. As training data, we included all LOI-verified and corrected pole-method thickness measurements that fell within the masked map of >50% peat probability (n = 463), including thickness >0 and <0.3 m from non-peat sites that could improve predictions of shallow peat deposits near the margins (n = 12).
Our final RF model (R2 = 93.4%, RMSE = 0.42 m) had consistently smaller residuals compared with a multiple linear regression model containing the same four predictors with interaction effects (adjusted R2 = 73.6%, RMSE = 0.80 m; Extended Data Fig. 5). It also performed better when testing out-of-sample performance, using 100 random two-thirds CVs of training data (median R2 = 82.2%, RMSE = 0.68 m and median adjusted R2 = 73.6%, RMSE = 0.85 m for RF model and multiple linear regression, respectively).
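For readers comparing these fit statistics, a minimal Python sketch of the three quantities is given below. It assumes the common definitions (R2 as one minus the ratio of residual to total sum of squares, and adjusted R2 penalized by the number of model terms k); the text does not state which formulation was used, and the function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the observations (here metres)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjust R2 for k model terms fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


if __name__ == "__main__":
    # Made-up thickness values (m), for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [1.1, 1.9, 3.2, 3.8]
    r2 = r_squared(obs, pred)
    print(f"R2 = {r2:.3f}, RMSE = {rmse(obs, pred):.3f} m")
    print(f"adjusted R2 (k = 1) = {adjusted_r_squared(r2, len(obs), 1):.3f}")
```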
For uncertainty on our thickness predictions, we first estimated area uncertainty by creating 100 different maps of distance from the peat margin by randomly selecting (with replacement) a minimum peat probability threshold >0% and <100%, removing speckle and noise, and re-calculating the closest distance to the nearest non-peat pixel. We then combined the 100 distance maps each time with the three other selected predictors (precipitation seasonality, climatic water balance and distance from nearest drainage point) as input in an RF model to develop 100 different peat-thickness maps. For these model runs, we included all available thickness measurements (>0 m) that fell within each specific distance map. Each output map was masked to an area ≥0.3 m thickness, consistent with our peat definition. A map of median peat thickness (Fig. 4a) and relative uncertainty (± half the width of the 95% CI as percentage of the median; Fig. 4b) was then calculated for each pixel on the basis of the 100 available thickness estimates.
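The per-pixel summary described above (median and half the width of the 95% CI as a percentage of the median) can be written compactly. The sketch below assumes an empirical percentile interval with linear interpolation, which the text does not specify; the function names and the input values are illustrative.

```python
import statistics


def percentile(sorted_values, q):
    """Empirical quantile (0 <= q <= 1) with linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def pixel_summary(estimates):
    """Return (median, relative uncertainty in %) for one pixel.

    Relative uncertainty is half the width of the 95% interval,
    expressed as a percentage of the median.
    """
    values = sorted(estimates)
    median = statistics.median(values)
    half_width = (percentile(values, 0.975) - percentile(values, 0.025)) / 2
    return median, 100 * half_width / median


if __name__ == "__main__":
    # Made-up thickness estimates (m) for a single pixel across model runs.
    runs = [1.0 + 0.01 * i for i in range(101)]
    median, rel_unc = pixel_summary(runs)
    print(f"median = {median:.2f} m, relative uncertainty = ±{rel_unc:.1f}%")
```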
Carbon-stock estimates. We mapped carbon density across the central Congo Basin in GEE by applying 20 bootstrapped thickness-carbon regressions that were normally distributed around the best fit (Extended Data Fig. 7) to the 100 peat-thickness maps from the RF regression model, generating a map of median carbon density out of 2,000 estimates (Fig. 4a), together with relative uncertainty (± half the width of the 95% CI as percentage of the median; Fig. 4b).
Total peat carbon stocks were computed in GEE by summing carbon density (Mg ha−1) over all 50 m grid squares defined as peat. To assess uncertainty around this estimate, we again combined the 100 peat-thickness maps (uncertainty from area and thickness) with 20 bootstrapped thickness-carbon regressions (uncertainty from carbon density, including bulk density and carbon concentration). We thus obtained 2,000 peat carbon-stock estimates for the total central Congo Basin peatland complex, which were used to estimate the mean, median and 95% CI (Extended Data Fig. 8a).
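The arithmetic behind the 2,000 stock estimates can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GEE workflow: the thickness-carbon relationship is taken to be linear with hypothetical slope and intercept values, each map is represented as a flat list of pixel thicknesses, and each 50 m grid square is converted to 0.25 ha before summing.

```python
from itertools import product

PIXEL_AREA_HA = 50 * 50 / 10_000  # a 50 m x 50 m grid square covers 0.25 ha
MG_PER_PG = 1e9                   # 1 Pg = 10^15 g = 10^9 Mg


def carbon_stock_mg(thickness_map, slope, intercept, min_thickness=0.3):
    """Total carbon (Mg C) for one thickness map and one regression.

    thickness_map: iterable of pixel peat thicknesses (m).
    slope, intercept: hypothetical linear thickness-carbon coefficients
        giving carbon density in Mg C per ha.
    Pixels thinner than min_thickness are not counted as peat.
    """
    total = 0.0
    for thickness in thickness_map:
        if thickness >= min_thickness:
            total += (intercept + slope * thickness) * PIXEL_AREA_HA
    return total


def all_stock_estimates(thickness_maps, regressions):
    """One estimate per (map, regression) pair, e.g. 100 x 20 = 2,000."""
    return [carbon_stock_mg(m, slope, intercept)
            for m, (slope, intercept) in product(thickness_maps, regressions)]


def mg_to_pg(mg):
    return mg / MG_PER_PG


if __name__ == "__main__":
    # Made-up inputs: 100 tiny maps and 20 regressions.
    maps = [[1.0 + 0.01 * i, 2.0, 0.1] for i in range(100)]
    regs = [(1000.0 + j, 0.0) for j in range(20)]
    print(len(all_stock_estimates(maps, regs)))
```

As a rough consistency check using figures reported in the abstract, 167,600 km2 is 16.76 million ha, and multiplying by the mean carbon density of 1,712 MgC ha−1 gives about 28.7 PgC, close to the reported 29.0 PgC.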
Regional carbon-stock estimates were similarly obtained for each sub-national administrative region (departments in ROC and provinces in DRC; Extended Data Fig. 2), as well as national-level protected areas (national parks and nature/biosphere/community reserves) 51 and logging 52,53, mining 54,55 and palm oil 56–58 concessions (Extended Data Figs. 9 and 10). As hydrocarbon concessions cover almost the whole peatlands area 24,26, they cover almost 100% of the central Congo peat carbon stocks.
Sensitivity analysis was performed by bootstrapping the area, thickness or carbon-density component, while keeping the others constant (Extended Data Fig. 8b). For area, we bootstrapped 100 randomly selected peatland area estimates; for thickness, 100 randomly selected two-thirds subsets of all thickness measurements; for carbon density, 20 normally distributed regression equations from the bootstrapped thickness-carbon relationship.

Data availability
All map results from this study are available for download as raster files from https://congopeat.net/maps/. The supporting ground-truth data, peat-thickness measurements and carbon-density measurements are available from https://github.com/CongoPeat/Peatland-mapping.git. The remote-sensing datasets are available for download from https://www.eorc.jaxa.jp/ALOS/en/dataset/fnf_e.htm (ALOS PALSAR and ALOS-2 PALSAR-2 25 m HV and HH data), http://osfac.net/ (OSFAC ROC and DRC 60 m Landsat ETM+ bands 5, 4 and 3 mosaics) and http://earthexplorer.usgs.gov/ (SRTM DEM 1-arc second and ASTER GDEM v2 1-arc second data).
Extended Data Fig. 4 | Relationships between field-measured peat thickness (LOI + corrected pole-method measurements) and distance from the peatland margin. Distance from the peatland margin is calculated as the shortest distance to a non-peat pixel in any direction, based on a smoothed median Maximum Likelihood map of peatland extent (>50% peat probability threshold). Transects are ordered by increasing regression slope (in m km−1; upper left corner of each panel), with colours indicating the four transect regions. Note that the horizontal axes are different for each panel. Shaded grey shows 95% confidence intervals around each regression.
Extended Data Fig. 9 | Distribution of national protected areas and industrial concessions across the central Congo Basin peatland complex. The base map shows below-ground peat carbon density (shaded grey; Fig. 4a), overlaid with protected areas at national level (national parks and nature/biosphere/community reserves; adapted with permission from ref. 51), or industrial logging (adapted with permission from refs. 52,53), mining (adapted with permission from refs. 54,55) and palm oil (adapted with permission from refs. 56–58) concessions. Black lines represent national boundaries; grey lines represent sub-national administrative boundaries. Images from refs. 52–55 and 57 adapted under a CC BY licence.