Boreal–Arctic wetland methane emissions modulated by warming and vegetation activity

Wetland methane (CH4) emissions over the Boreal–Arctic region are vulnerable to climate change and linked to climate feedbacks, yet understanding of their long-term dynamics remains uncertain. Here, we upscaled and analysed two decades (2002–2021) of Boreal–Arctic wetland CH4 emissions, representing an unprecedented compilation of eddy covariance and chamber observations. We found a robust increasing trend of CH4 emissions (+8.9%) with strong inter-annual variability. The majority of emission increases occurred in early summer (June and July) and were mainly driven by warming (52.3%) and ecosystem productivity (40.7%). Moreover, a 2 °C temperature anomaly in 2016 led to the highest recorded annual CH4 emissions (22.3 Tg CH4 yr−1) over this region, driven primarily by high emissions over Western Siberian lowlands. However, current-generation models from the Global Carbon Project failed to capture the emission magnitude and trend, and may bias the estimates in future wetland CH4 emission driven by amplified Boreal–Arctic warming and greening.


Table of contents
Table S1.Information of eddy covariance sites in this analysis.
Table S2.Information of chamber sites used to constrain the model in this study.
Table S3.Site id, name, and location of the atmospheric methane content observation sites.
Table S4.Sites that cover the year 2016 and its adjacent years, and show anomaly high temperature in 2016.
Table S5.Annual wetland methane emissions and trends estimated by different models.
Text S1.Combination of the BAWLD and WAD2M datasets.

Text S1. Combination of the BAWLD and WAD2M datasets
The percentage of each wetland type within each upscaled wetland grid cell was determined by combining BAWLD and WAD2M datasets.The WAD2M dataset provided the temporal dynamics of wetland extent information but without differentiating wetland types 1 , while the BAWLD dataset recorded the wetland extent for each wetland type but was temporally static 2 .Here we used the wetland typespecific extent information in BAWLD to calculate the fraction of each wetland type relative to its total wetland area.Then we derived the temporally varied wetland extent for each wetland type by multiplying the wetland extent of WAD2M by the wetland type-specific fraction calculated from the first step (Eq.1).By doing so, we assume that the fraction of each wetland type relative to its total wetland area does not change over time because we do not have information regarding temporally varying wetland types.For the upscaling, we considered all wetland grid cells in the BAWLD dataset that provided wetland type information. (1) where  !,# () is the wetland extent of the i-th wetland type used for upscaling CH4 in the s-th grid cell at time t;  $%&'(,# () is the wetland extent for the WAD2M dataset at time t;  )%$*&,!() and  )%$*&,+ () are the wetland extent of the i-th and j-th wetland types in the BAWLD dataset, respectively, and n is the total number of wetland types within the s-th grid cell of the BAWLD dataset.

Text S2. Inferring causal relationships
In the Causal-ML approach, we first used a data-driven causality inference method to identify the causal relationships between CH4 emission and its abiotic and biotic drivers 3,4 .A causal relationship is present between two variables if changes in one variable (e.g., temperature) directly result in changes in another variable (e.g., wetland CH4 emissions) 4-7 , all else being equal.The literature describes three ways to infer causal relationships: interventional experiments 8 , process-oriented model simulations 9,10 , and data-driven causality inference 4,11,12 .Interventional experiments intervene on the variable of interest while simultaneously maintaining all other factors equal, and then tests the impact of this intervention on the target variable 8 .Interventional experiments across large spatial scale remain challenging 13 .Processoriented model simulations (i.e., computer simulations) control confounding factors and focus on the causal relationship of interest, but the model itself requires substantial expert knowledge and reasonable model structure and parameterization.Unfortunately, the current generation of bottom-up biogeochemical models are highly uncertain and poorly constrained with observed wetland CH4 emissions (see the main text).Therefore, a data-driven causality inference method was selected for analyzing causal processes in wetland CH4 emissions.
Specifically, we used the PCMCI method to identify the causal relationships of wetland CH4 emissions.Wetland CH4 emissions are controlled by multiple abiotic and biotic factors, and these processes can be asynchronous 3,14-16 .The PCMCI method is particularly suitable for inferring such multi-variate controlled and time-lagged causal relationships 4,6,17,18 , and has been used for understanding temporal dynamics of Earth system processes beyond wetland CH4 emissions 3 , such as climate systems 4,11,18,19 and land-atmosphere interactions 12,20 .The method contains two steps, PC (named after Peter Spirtes and Clark Glymour 21 ) and MCI (i.e., momentary conditional independence 18 ).The first step, PC, iteratively identifies a smaller size of relevant necessary confounders that affected the inferred causal relationship between two variables of interest (e.g., temperature and CH4 flux) through conditional independence tests.Subsequently, the momentary conditional independence tests are used to detect and quantify the causal strength between the two variables of interest by removing the confounding effects from the identified necessary confounders in the PC step 17,18 .By avoiding conditioning on high-dimensional variables, the PCMCI improves the detection power of causal relationships 17,18 .We considered abiotic and biotic variables (see the input datasets in the main text) that have reliable global observations or estimates and have been suggested to be mechanistically related to wetland CH4 emissions by previous studies 14,22 .The high predictability of the Causal-ML demonstrated that the input variables we selected explained the majority of the variance in wetland CH4 dynamics.While we acknowledge that factors beyond the ones considered here could also affect wetland CH4 emissions (e.g., oxygen availability, soil pH, and ferric iron and sulfate reducers 22 ), the paucity of observations from site to global scale impedes us from including those additional variables for the causality inference and upscaling.

Text S3. Sensitivity of CH4 dynamics to wetland extent datasets
We used three additional wetland datasets to discuss the sensitivity of wetland CH4 dynamics to uncertain wetland extent, including the static wetland dataset derived from Global Lakes and Wetlands Database (GLWD) 23 , and temporally dynamic wetlands 24 estimated by the TOPography-based hydrological MODEL (TOPMODEL).The GLWD data was derived by merging multiple surveys, maps, and inventories of lake and reservoirs, small water bodies, and wetlands.We used GLWD after excluding lakes, rivers, and reservoirs.In addition to GLWD, we used the temporally varied wetland extent derived from the TOPMODEL model 24 .The TOPMODEL model treated continuous or regularly saturated grid-cells as wetlands.By assuming that water pathways were mainly determined by topography, the model used a compound topographic index to estimate the water table depth and thus the wetland fraction in each grid-cell 24,25 .The modeled wetland extent was temporally varied with a monthly resolution, and was calibrated by observation-based data of Global Inundation Extent from Multi-Satellites (GIEMS-2) and Regularly Flooded Wetland (RFW) 26 , respectively.The two modeled wetland datasets constrained by different observations were referred to as TOPGIEMS-2 and TOPRFW, respectively.We did not directly use the GIEMS-2 data due to its limited temporal coverage 27 .To focus on the sensitivity of wetland CH4 emission dynamics to wetland extent changes, we kept the CH4 emission intensity the same as in our baseline analysis, while replacing WAD2M with GLWD, TOPGIEMS-2, and TOPRFW, respectively.The upscaled wetland CH4 emissions using different wetland extent datasets during 2002-2021 were compared in Fig. S12.

Text S4. Sensitivity of CH4 dynamics to inputting datasets
We conducted four groups of sensitivity experiments to confirm that the increasing trend and strong interannual variations of wetland CH4 emissions were robust given uncertainty in the input datasets.Specifically, we first replaced the originally-used input variables of air temperature, air pressure, precipitation, and wind speed with those of the CRU JRA dataset, while keeping the other factors including soil conditions, GPP, snow, and wetlands the same.Then we upscaled the wetland CH4 emissions during 2002-2021 using the updated dataset and compared the regional trend and interannual variations of wetland CH4 emissions with those in the main text.Similarly, we also used the datasets from GLDAS and MERRA-2 which further provided soil temperature and soil wetness for wetland CH4 upscaling.Additionally, we replaced the GPP used in the main text with that of PML-GPP, which was derived from an empirical process-model that considered the CO2 fertilization effects, and was constrained by multi-source remote sensing observations 28 .The upscaled wetland CH4 emissions using different sources of input variables during 2002-2021 were compared in Fig. S13, and had consistent increasing trends in wetland CH4 emissions.

Figure S1 .
Figure S1.Temporal coverage of chamber and eddy covariance observations, and the locations of atmospheric methane content observation sites.

Figure S4 .
Figure S4.Wetland CH4 emissions in the Boreal-Arctic area from 2002-2021 during each season.

Figure S5 .
Figure S5.Contribution of abiotic and biotic drivers to wetland CH4 emissions.

Figure S6 .
Figure S6.Wetland methane emission anomaly, and surface air temperature anomaly in 2005 and 2020.

Figure S9 .
Figure S9.Model performance and increasing trend of wetland CH4 emissions in the Boreal-Arctic during 2002-2021 using the leave-one-out validation and the temporalcross-validation schemes.

Figure S10 .
Figure S10.Wetland CH4 emissions with increasing trends in the Boreal-Arctic during 2002-2021 using different wetland extent datasets and different forcing datasets.

Text S2 .
Inferring causal relationships.Text S3.Sensitivity of CH4 dynamics to wetland extent datasets.Text S4.Sensitivity of CH4 dynamics to inputting datasets.

Fig. S1 .
Fig. S1.(a) Temporal coverage of chamber and eddy covariance observations, and (b) the locations of atmospheric methane content observation sites.

Fig. S4 .
Fig. S4.Wetland CH4 emissions in the Boreal-Arctic area from 2002-2021 during each season.The shaded blue area indicates the standard deviation in estimated wetland CH4 emissions due to model parameter uncertainty; the dashed blue lines indicate the linearly regressed trends of wetland CH4 emissions.

Fig. S6 .FCH4
Fig. S6.Wetland methane flux (FCH4) anomaly, and surface air temperature anomaly in 2005 and 2020.The anomalies were calculated relative to the multi-year annual mean value from 2002 to 2021.The dashed boxes marked regions in (a)-(d) are two wetland hotpots: Western Siberian lowland and Hudson Bay lowland.
Fig. S8.Inter-annual wetland CH4 emissions in the Boreal-Arctic estimated by topdown models (blue lines), bottom-up models (gold lines), and this study (the red line).

Table S1 .
Information of eddy covariance sites in this analysis.The observations of Interior, Scotty, Stordalen, Churchill, and Denali are obtained from BAWLD-CH4 dataset 66 , and the rest sites are obtained from Fluxnet-CH4 dataset67,68.

Table S2 .
Information of chamber sites used to constrain the model in this study.The datasets are obtained from BAWLD-CH4 dataset 66 and the datasets inBao et al. 69

Table S4 .
Sites that cover the year 2016 and its adjacent years, and show anomaly high temperature in 2016.