Microscale drivers of summer CO2 fluxes in the Svalbard High Arctic tundra

High-Arctic ecosystems are strongly affected by climate change, and it is still unclear whether they will become a carbon source or sink in the next few decades. In turn, such knowledge gaps on the drivers and the processes controlling CO2 fluxes and storage make future projections of the Arctic carbon budget a challenging goal. During summer 2019, we extensively measured CO2 fluxes at the soil–vegetation–atmosphere interface, together with basic meteoclimatic variables and ecological characteristics in the Bayelva river basin near Ny Ålesund, Spitzbergen, Svalbard (NO). By means of multi-regression models, we identified the main small-scale drivers of CO2 emission (Ecosystem Respiration, ER), and uptake (Gross Primary Production, GPP) in this tundra biome, showing that (i) at point scale, the temporal variability of fluxes is controlled by the classical drivers, i.e. air temperature and solar irradiance respectively for ER and GPP, (ii) at site scale, the heterogeneity of fractional vegetation cover, soil moisture and vegetation type acted as additional source of variability for both CO2 emissions and uptake. The assessment of the relative importance of such drivers in the multi-regression model contributes to a better understanding of the terrestrial carbon dioxide exchanges and of Critical Zone processes in the Arctic tundra.

. Location of the study site (red circle) with indication of toponyms. Aerial/satellite image modified from TopoSvalbard (courtesy of Norwegian Polar Institute, https:// topos valba rd. npolar. no/). Top-left inset: location of study site (red star) in Spitsbergen, Svalbard Islands (NO). Bottom-right inset: distribution of the sampling areas for point-scale (black dot), site-scale (red dashed line) and species-specific (solid lines) samplings. Satellite basemap available in MATLABR2020a, hosted by Esri.

Results
Point-scale samplings. Point-scale samplings were performed for 24 h across 2 days of clear sky and stable meteorological conditions at a fixed point, mostly covered with Carex spp. The CO 2 fluxes and the meteorological variables displayed a hump-shaped temporal behavior during the measurement days (see Fig. 2; S2 in the Supplementary Material), with the only exceptions of air pressure, that showed a slightly decreasing trend with average value of 1023 hPa and fluctuations comparable with measurement errors, and soil moisture, that varied irregularly preventing the identification of a clear temporal evolution. The pattern of Ecosystem Respiration (ER) and Gross Primary Production (GPP) matched those of their classical drivers, i.e. air temperature ( T a ) and solar irradiance (rs). A net uptake (i.e. negative Net Ecosystem Exchange, NEE) was observed during the day, between about 3:00 and 18:00 UTC, and net emission (i.e. positive NEE) in the remaining hours, when solar irradiance dropped below 100 W/m 2 . Total night darkness is absent during Arctic summer and therefore null irradiance was never attained, with a minimum measured irradiance of 37.12 W/m 2 over the whole campaign.
The classical dependence of carbon emission on temperature is modelled as an exponential 29 , while the dependence of carbon uptake by photosynthesis is modelled as a double-hyperbolic function of solar radiation 30 . These are expressed as: where a is a free parameter, corresponding to ecosystem respiration at 0 °C, b is related to the usual Q 10 factor 29 as Q 10 = exp (10 b), F max is the maximum photosynthetic flux for infinite light supply and α is the apparent quantum (1) ER = a exp (bT a ) + ε, (2) GPP = F max αrs F max + αrs + ε Figure 2. Upper panels: Plots of (a) ER versus T a and (b) GPP vs rs for point-scale samplings, with red lines corresponding to best-fit curves, according to Eqs. (1) and (2). Lower panels: Plots of the measured versus modelled ER (c) and GPP (d) for site-scale samplings. Modelled values as obtained from Eqs. (3,4). Parameters and statistics as in Table 1. Inset: explained variance for models including only the listed drivers.
Scientific Reports | (2022) 12:763 | https://doi.org/10.1038/s41598-021-04728-0 www.nature.com/scientificreports/ yield. These regression parameters are to be fitted from data, while ε represents the residual of the fit, that includes all the unmodelled dependences and the stochastic contribution of Gaussian noise. At point-scale, the temporal variations of ER and GPP were reproduced by the classical dependences (1) and (2), with large values of the explained variance (0.73 for ER and 0.86 for GPP), as shown in Fig. 2a, b. Best-fit parameter values were a = 1.06 µmol/m 2 /s and b = 0.036 • C −1 in Eq. (1), and F max = −3.23 µmol/m 2 /s and α = −0.021 µmol/W/s in Eq. (2). For both emission and uptake, the introduction of additional variables did not improve the performances of the models in terms of AIC (Akaike Information Criterion 33,34 , see the section "Regression Models" in "Methods"), which was − 13.21 for ER and − 35.25 for GPP. Residuals were gaussian (Lilliefors' test 35 ), and heteroscedastic (Bartlett's test 36 ). Therefore, the hourly temporal evolution of CO 2 fluxes at point-scale was best explained by the temporal variation of air temperature and solar irradiance, respectively.  Table 1. Here, ε represents the (random) residuals of the model. The temperature response parameter in the ER model, b 0 , corresponds to Q 10 = 1.6, well within the range of published estimates for the Arctic tundra [37][38][39][40] . Figure 2c, d shows the modelled versus measured values for the best-fit model, respectively for ER and GPP, and the variance explained by different combinations of the flux drivers. The GFC caused the largest enhancement of σ 2 expl compared to the classical models ( σ 2 expl increased of + 0.58 and + 0.72 respectively for ER and GPP), while the model enhancement due to VWC was smaller ( σ 2 expl increased by + 0.03 in both ER and GPP models). Residuals were gaussian, according to Lilliefors' test 35 , and heteroscedastic, according to Bartlett's test 36 .

Site-scale samplings.
As shown in Fig. 3a, the distribution of the GFC was dominated by small values, mostly confined below 0.20, with sporadic samplings characterized by a larger green vegetation cover. Large values of GFC were mostly associated with sampling points where the cover was dominated by vascular species, class V, while moss, lichens and bacterial soil crust, included in class NV, showed the lowest GFC values (Fig. 3c). Only 4 points (i.e., 2% of the total sample size) were classified as bare soil, class BS, and therefore were not included in the following analysis, being a minor component in this environment (see CAVM-classification in the Method section). Significant differences were found between the mean GFC of V and NV classes (p = 0.001) and larger variability was associated with V compared to NV (boxes and bars in Fig. 3c). Sampling points where a clear predominance of V or NV species was not present (class MIX) showed a significantly different mean value of GFC compared to both V (p = 0.03) and NV (p = 0.02) classes, with mean value and variability in between the V and NV ones. Significant differences between vascular, non-vascular and mixed classes were obtained also in the ER and GPP mean values (Fig. 3b-d), where again we observed larger variability and stronger average fluxes associated with  41 , and the variability range of both fluxes of NV (bars in Fig. 3b-d) agrees with that observed at the Eddy Covariance site of Lloyd 42 , which is dominated by non-vascular vegetation. Empirical models separately derived for the V, NV and MIX cases led to the identification of the same driving variables indicated in Eqs. (3,4). As expected, the maximum photosynthetic rate, F, was higher for V compared to NV and MIX 42 . However, the explained variance drastically lowered passing from V to MIX to NV for both ER and GPP ( Table 2; Fig. 4a-d). Vascular plants were characterized by explained variances that were similar to those identified for the whole set of measurements (' All' in Table 2). In contrast, non-vascular and mixed vegetation showed explained variances lower than 0.50 and none of the other measured variables enhanced the model representativeness. The presence of GFC as one of the drivers for NV fluxes is presumably due to the presence of (green) mosses, and to the fact that assigning a point measurement to one of the vegetation classes was based on the prevailing cover type, which cannot exclude the occurrence of small subdominant individuals of vascular species in points classified as NV. In any case, the increase in explained variance generated by the introduction of GFC in the multi-regression model decreased from V to MIX to NV (Fig. 4c, d). Conversely, the contribution of VWC increased from V to MIX to NV (Fig. 4c, d).
Species-specific samplings. Given     Comparing the meteo-climatic variables and CO 2 fluxes for each couple of species-specific measurements (see Table S1 in the Supplementary Material for differences and their significance), solar irradiance showed no significant differences between classes, with the only exception of the couple CX-SI. Significantly larger atmospheric pressure and significantly lower air humidity were recorded for CX samplings, which were indeed performed during fair, clear sky whether conditions (https:// sekli ma. met. no/ obser vatio ns). DR showed significant differences in the mean soil temperature and soil moisture (also shown in Fig. 5d), compared to other species. The average green fractional cover (Fig. 5c) was significantly different between most species, except for the couples CX-SL, CX-SX, SL-SX and DR-SI. Fluxes were significantly different between most of the species, except for the couples of DR-SI and SL-SX that did not show significant differences in ER, NEE and GPP. ER was comparable (p > 0.05) between CX-DR, CX-SI, DR-SI and SL-SX, and NEE was comparable (p > 0.05) between CX-SL, CX-SX, SL-SX and DR-SI. As already observed for site-scale samplings, more intense (lower) GPP matched higher ER (Fig. 5a,  b), and, across species, the magnitude (absolute values) of both fluxes was higher for DR and SI, although being also highly variable, while SX showed the less intense mean ER and GPP, as well as the lowest variability. Such patterns are consistent with other species-specific studies in the same catchment 44 .
Following the same procedure discussed above, we estimated a suitable empirical model for each vascular species. First, we tested the classic models (1-2), which explained flux variability to only a limited extent ( σ 2 expl ranging from 0.32 to 0.49 for ER and from 0.08 to 0.38 for GPP), although larger explained variances were obtained in these species-specific samplings compared to site-scale samplings. Then, we turned to multi regression models, obtaining the same models as for the site-scale samplings (Eqs. 3, 4), where again GFC was responsible for the largest enhancement of the explained variance. Estimated parameters and statistics are reported in Table 2. Patterns of F among species closely follow the ones of 45 . Significant differences between some of the species were obtained in parameters related to GFC (i.e. a 1 and A 1 , as shown in Table S2 in the Supplementary Material). In particular, www.nature.com/scientificreports/ the SI and DR parameters did not show any significant difference, as well as SL and SX, while CX was in between those two groups, showing significant differences in a 1 compared to DR, but no differences with respect to SI. No significant correlations were found between GFC and VWC, neither for any of the species (see also Fig. S4 in the Supplementary Material), nor for the site-scale samplings. This excluded cross-correlations in the model, potentially resulting from the effect of the biomass on the local soil moisture (e.g., larger biomass could have higher water demand, resulting in the reduction of soil moisture in the root zone).
Merging all vascular vegetation species (column ' All Spp' in Table 2), the best models were again those including GFC and VWC, with explained variance comparable with that obtained for site-scale samplings (Table 1). Significant differences between site-scale samplings (' All' in Table 1) and the merged vegetation case (' All Spp' in Table 2) were observed not only in the mean value of the fluxes (p = 0.003 for ER and p = 0.005 for GPP), but also in model parameters related to GFC (p = 0.04 for a 1 and p = 0.01 for A 1 ), presumably owing to the contribution of non-vascular vegetation to the fluxes in the full site-scale samplings. This suggests that the total flux and its variability at site scale cannot be explained solely by the vascular species contribution, despite the fact that the fluxes for vascular vegetation were larger than for non-vascular vegetation. Conversely, the models estimated for the merged species-specific samplings and the vascular class in site-scale samplings did not show significant differences, confirming that the selected species properly represented the behavior of the vascular plants in this area.

Discussion
Biological CO 2 emissions by plant and soil respiration are known to depend on temperature 29 , while CO 2 uptake by photosynthesis is a function of solar radiation 30 . However, other additional drivers modulate the carbon fluxes in the soil-vegetation system. Several studies proved the influence of soil moisture on CO 2 emission and uptake in polar and Alpine tundra [46][47][48][49]49 . Soil properties, such as the active layer depth 39,46,50 , were sometimes included in the models. Biomass, green vegetated area or leaf area index (LAI) are recognized to be important drivers, describing the contribution of vegetation to the carbon fluxes 37,51 . Finally, plant functional types or speciesspecific regressions were used to explore the carbon fluxes resulting from diversified vegetation physiology 37,45,52 .
Here, we identified the main drivers of CO 2 emission and uptake at the peak of the growing season, developing empirical multi-regression models able to explain a large portion of the flux variance with a minimal set of parameters. At point-scale (small circular plots with a diameter of about 0.29 m), most of the temporal variability of the fluxes at a fixed location was explained by the classical univariate dependences of ER and GPP on air temperature and solar irradiance, respectively. When zooming out to the site-scale (of the order of 150 × 150 m 2 , representative of the local vegetation community), the standard functions explained only a small fraction of the flux variability, dictating the need for multi-regression models, in which soil moisture and green fractional vegetation cover of plants emerged as essential drivers besides classical ones. In addition, the CO 2 fluxes depended on the vegetation composition within the sampling area. This indicates that at site-scale most of the variability is related to the spatial heterogeneity of soil moisture and vegetation. It is worth noticing that the site-scale samplings were performed between roughly 7:00 and 19:00 UTC in different days (Fig. S3 in Supplementary material). During the site-scale sampling time the air temperature and light intensity spanned respectively between 8.2 and 17.5 °C, and 95 and 651 W/m 2 , which are comparable with the variations observed in point-scale samplings (Fig. 2). Hence, the relevance of the additional drivers in the site-scale study is not just due to a possibly small variability of temperature and light intensity. Conversely, these additional drivers account for surface heterogeneity and act to modify the magnitude of the fluxes' dependence on their classical drivers.
Vegetation effects on fluxes are expected to depend on plant characteristics, such as the leaf area 51,53,54 , where plant-atmosphere gas exchange occurs via plant stomata, on the green biomass 55 , that contains chlorophyll, and on species-specific response to environmental drivers at comparable biomass 44,45 . Here, the green fractional vegetation cover (GFC) was used as a bulk descriptor of the vegetation area prone to gas exchanges. The GFC estimated from digital images provides a speditive and accurate information, useful for exploration studies. Clearly, GFC could be replaced by other on-site measurements, such as the Normalized Difference Vegetation Index (NDVI) estimated by portable NDVI sensors 37 . In general, GFC may underestimate vegetation compared to the green or leaf area index, because it does not account for wilting biomass, reddish plant elements or leaf layering. Nevertheless, this site is characterized by very limited stratification and vertical development of vegetation 45 . Only sporadic measurements were performed on wilting vegetation and solely the margin of Saxifraga oppositifolia showed reddish nuances, which are therefore not included in the GFC computation. On the other hand, GFC has the advantage of accounting also for moss cover, which is neglected by LAI estimates 37 .
The GFC resulted to be the strongest additional predictor of the fluxes, because it was associated with the largest enhancement in the explained variance, for both single vascular vegetation species (species-specific samplings) and mixed vascular and non-vascular vegetation (site-scale samplings). This suggests that a large portion of ER is generated by autotrophic respiration. Soil moisture (VWC) was the second significant additional predictor, combining effects from precipitation and active layer thawing (similarly to what was observed for the Alpine tundra 56 , and the Antarctic tundra 49 ). Soil moisture is a recognized constraint for fluxes of poikilohydric vegetation such as mosses and lichens 39,51,57 . This was for instance suggested to be the main limit of the Eddy Covariance (EC) modelling study of Lloyd 42 in the same site at peak season. Furthermore, soil moisture can impact carbon and nitrogen mineralization, thereby affecting CO 2 assimilation 58 . The predictors identified here agreed with those highlighted by Li et al. 40 and Cannone et al. 45 , for the same sampling period, including the drivers in the non-linear modelling framework. Moreover, our assessment of the drivers' weight in the estimated models represents a unique insight in the modelling effort for the Arctic environment.
An important point concerns the role of vascular versus non-vascular vegetation. The mosaic structure of the Arctic tundra is characterized by isolated vascular species nested inside a matrix of non-vascular elements, such as lichens, mosses or bacterial soil crust 59  www.nature.com/scientificreports/ observed that the contribution of vascular species was the most relevant to the site carbon fluxes, and vascular species showed, on average, ER and GPP that were significantly larger than the non-vascular ones. Random sampling of the surface indicated that most of our measurements involved vascular vegetation, or a surface partially covered with vascular vegetation (class 'MIX' in Fig. 3), suggesting that the spread of vascular plants was not negligible at this site, in keeping with Yoshitake et al. 60 . Vascular plants are late successional species in the tundra biome 57,60 and are expected to be facilitated by the effects of climate change 14 . Higher temperatures, broadened growing period and larger water availability 1,61 may possibly result in vascular species outcompeting pioneering vegetation 14,57 , such as lichens and mosses, and driving the carbon dioxide exchanges of the Arctic tundra in the next future. For these reasons, we explicitly explored the differences between vascular vegetation classes that were representative of the local species pool. For each of the vegetation classes, the same drivers of site-scale samplings were obtained. Significant differences between model parameters related to the green fractional cover ( a 1 and A 1 in Eqs. 3, 4) estimated for different vegetation classes suggested a further distinction of vascular classes into functional groups, hinging in particular on the different response of carbon fluxes to the same green fractional vegetation cover. Different values of the model parameters for the same GFC indicated a species-specific response. Such results mirrored the significant differences between species, which were comparable between SI-DR and SX-SL, while CX showed hybrid characteristics between these two groups.
The observed flux differences may depend on the vegetative cycle of the species along the growing season, as for instance the flowering and seed dispersal of SL and SX occur early in the season (https:// svalb ardfl ora. no/), possibly resulting in an anticipated vegetative peak compared to other species. Our species grouping (i.e. SI-DR vs SX-SL) agrees with the results of a cluster analysis based on microhabitat for the same site 62 , possibly suggesting that plants belonging to the same group not only favour similar substrates but also display similar functioning. Previous studies also suggested that differences in fluxes can be correlated with the successional status and nitrogen content along a deglaciated transect in this catchment 44 and with functional types related to ecosystem processes 63 . Nevertheless, all the above interpretations are based on a classification that relies on plant functioning rather than on their morphology, and a unifying modelling framework is still missing.
The main outcome of this analysis was a data-driven model of carbon dioxide emission and uptake that accounts for all the above drivers. The work presented here is one of the few attempts to build empirical models for ER and GPP variability at fine scale in the High-Arctic tundra. Interestingly, in the site-scale samplings we found larger explained variance for GPP compared to ER model. Similar modelling efforts focussing on the Arctic tundra also detected this gap in the ability of models to capture ER variability 37,39,46 , despite including also drivers such as thaw depth and nitrogen content. The explained variance of Eqs. (3,4) drops for the case of nonvascular vegetation, consistently with the findings of Segal & Sullivan 39 . This may suggest that, to date, models at site to landscape scale are still limited in their ability to reproduce the processes that regulate CO 2 fluxes from lichens and mosses. By contrast, the modelling of fluxes associated with vascular species resulted in much larger explained variance compared to non-vascular case, both at site scale and in species-specific samplings, with values comparable for instance with 39 . Overall, the explained variance of our model for both fluxes and in site-scale and species-specific samplings are comparable or higher than other published models for the Arctic region 37,39 , with the advantage of relying on solid theoretical assumptions on the role of additional drivers in Eqs. (3,4). Specifically, Eqs. (3,4) are derived from the hypothesis that additional drivers act to perturb the parameters in Eqs. (1,2). Such hypothesis is supported by observations showing that the light-saturated photosynthetic rate, F max in Eq. (2), can depend on nitrogen leaf content 44 or LAI 53,64 , which may result from the photosynthetic capacity of plants being influenced by environmental constraints.
Clearly, the empirical models developed in this study can not be automatically applied to other Arctic sites. Here, we focused on the peak growing season ER and GPP, in the high Arctic Svalbard tundra. We remark that the tundra communities of Svalbard (P1-P2 CAVM) are also found in the northern part of Siberia and Canada 65 , a non-negligible part of the Arctic area and possibly sharing similar carbon flux dynamics. In addition, we stress that other drivers, such as soil or plant nitrogen content 39 or thaw depth 46 , could become important over longer periods and/or in different sites. Interestingly, for a close-by site and the same vascular species Cannone et al. 45 identified soil temperature, photosynthetic active radiation, LAI and species-specific photosynthetic capacity as the most relevant drivers during edge seasons (i.e. beginning and end of the growing season), despite their importance changing along the season and possibly between years. At a different location in Svalbard, Cannone et al. 50 found primary drivers of ER and NEE that changed along the growing season, with LAI, surface temperature and soil moisture being the most important drivers at the season peak, in agreement with our results. Together, the similarities of our results with the findings from other sites and in different sampling periods, and the comparable explained variance with similar modelling efforts, suggest that our approach could be extended to other Arctic sites. Further studies of this type would help in representing the processes that drive the fluxes over the entire Arctic region.
Process-based models could make use of the knowledge gained from data-driven models to achieve more reliable projections. At watershed scale, data-driven models could become a useful parameterization to be incorporated in process-based models, as attempted by Uchida et al. 66 . Comparing regressive models obtained in different locations would also give information on processes and variables to be included and coupled in regional process-based models simulating the terrestrial carbon cycle [67][68][69][70] . Regional models would be suitable for longterm forecast, despite different models using different equations and different drivers for CO 2 fluxes. In this context, we suggest that vegetation dynamics (represented for instance by the green fractional vegetation cover) and the land water cycle should be coupled with carbon dynamics in order to achieve realistic representations of the Arctic system. Projections under climate change scenarios would in particular benefit from accounting for at least two vegetation classes, namely vascular and non-vascular plants, as we showed that CO 2 fluxes strongly depended on the relative cover of those two. In fact, quantifying the spatial distribution of vegetation types and www.nature.com/scientificreports/ their fractional cover, and combining this information with assessments of the vegetation respiration and photosynthetic capacity, may help in the assessment of carbon exchanges in Arctic ecosystems.
In conclusion, we recall that it is still debated whether the high Arctic tundra will behave as a carbon sink or source under changed climatic conditions. Uncertainties are not consistent between different process-based models and field data, thus leaving open issues in carbon budget assessments 71 . The origin of those uncertainties can be explored by small-scale studies that aim to unveil the processes underlying carbon fluxes in such a fast-changing landscape mosaic. To pursue this aim we built data-driven regression models of CO 2 fluxes in the High-Arctic tundra of Svalbard. For this area, we showed that different, spatially heterogeneous drivers (mainly soil moisture and green fractional vegetation cover) acted to modify the dependences of GPP and ER on their classical drivers, namely solar irradiance and environmental temperature, thus determining the flux variability at site scale. Further studies should aim to create a solid framework of plant classifications according to their relationship with CO 2 fluxes. A plant classification based on carbon fluxes would in particular benefit studies based on the Eddy Covariance method 42 . Maps of vegetation cover have been shown to be a crucial point in relating Eddy Covariance and chamber measurements, allowing for the explanation of measurement variations in relation to the characteristics of the footprint area, and for upscaling the results of chamber measurements to the footprint-scale 72,73 . This will help in small-scale assessment of carbon fluxes and associated climatic feedbacks.

Methods
Study area. The experimental site is located in the foremost part of the Bayelva river catchment, West of Ny Ålesund, in the Brøgger peninsula, Spitsbergen, Norway (78° 55′ 24″ N, 11° 55′ 15″ E), and can be classified as moderate snow-bed habitat 74 . The catchment, covering 32 km 2 and ranging from 4 to 742 m a.s.l. 75 , is surrounded on the southern border by steep mountains that host the Austre and Vestre Brøggerbreen glaciers. The measurement site (Fig. 1) is located on a slight hill slope, degrading towards South to a small lake, on the Signehamna formation 76 . In 2016-2019 the ground was snow-free from June to September, the maximum snow depth occurred in April (27 cm) and July-August daily air temperature and relative humidity were 5.7 °C and 78.3% (see https:// sekli ma. met. no/ obser vatio ns/). The mean annual (water equivalent) precipitation is 588 mm/year. The vegetation cover is heterogeneous, typical of the bioclimate subzone B-C 77 with vascular plants constellating the matrix of mosses and lichens, and can be classified as prostrate dwarf-shrub and herbs tundra (P1) 65 . The soils are characterized primarily as haplorthels, with high lithic and low nutrient content 78 and the underlying permafrost has typical active layer depths of ∼ 0.5 to 1.5 m 79 .
Scopes and sampling design. We designed three different sampling strategies, aimed at investigating: Flux measurements. In all sampling sets we simultaneously measured CO 2 fluxes and basic meteoclimatic and environmental variables. Fluxes were measured by the non-steady state, closed dynamic flux chamber method 80 using a LI-COR LI-840 IRGA (InfraRed Gas Analyser) spectrophotometer and a circular stainlesssteel collar (661 cm 2 area, well within the range used for this site 40,41,45 ) inserted into the soil just prior to the measurement (to a depth of about 2 cm) where to place the transparent chamber (10 cm height). Gasses inside the chamber volume were mixed and homogenized through a coiled, pierced tube connected to the outlet of the gas analyzer and a pressure equalizer kept the pressure inside the chamber at equilibrium with the environment. The tendency of CO 2 concentration change inside the chamber was linearly interpolated (R 2 varied between 0.73 and 0.99) to obtain the flux value (see Fig. S5 in the Supplementary Material and also 56 ). At each sampling point we consecutively measured the NEE with the transparent chamber, and ER by covering the chamber with a black cover, and purging all the apparatus between the two consecutive measurements. GPP was obtained by subtracting the two measurements (GPP = NEE-ER), assuming that the environmental conditions varied negligibly during the sampling time (about 3 min). Conventionally, GPP has negative values (GPP ≤ 0), ER is positive (ER ≥ 0), while NEE can be either positive, negative or null, depending on the prevailing component (ER or GPP). www.nature.com/scientificreports/ was measured by a sensor installed above the flux chamber. The raw dataset was quality controlled by statistical analysis of the frequency distribution and by the Rosner tests 81 to identify potential outliers. At each sampling point, we also collected 10 Mpixel RGB pictures at nadir, in order to monitor the composition and fractional cover of vegetation inscribed within the collar (see Fig. S1 in the Supplementary Material), by using a HUAWEI P20 PRO CLT-L29, equipped with Leica dual camera and CMOS sensor placed at an height of about 50 cm above the soil surface. The GFC was estimated from the pictures as the ratio of the vertical projection of the green vegetated area to the ground (total) area of measurement, by an ad-hoc algorithm using MatlabR2020a. Pictures were low-pass filtered by using 'wiener2' Matlab intrinsic function to reduce the signal-to-noise ratio and RGB channels allowed to obtain a greenness index (g = 2G-B-R, with R, G and B the brightness of red, green and blue channel at each pixel) to discriminate green, photosynthetic vegetation form background (following 82 ) inside the collar. Such an estimate does not account for vegetation layering, which is negligible in the Arctic tundra, and therefore GFC ranges between 0 and 1. Here, the area of the single measurements, corresponding to the round area included within the steel collar, will be called sampling point. Sampling site will refer to the area over which 177 measurements were performed in points having random spatial distribution. Points were georeferenced and pseudoreplicates having the same coordinates within the GPS uncertainty were discarded from the dataset of site-scale and species-specific samplings. Regression models. CO 2 drivers were selected using multi regression models for the site samplings set and for the species-specific set, where Eqs. (1) and (2) reproduced only a small part of fluxes variability. We started from the known drivers of carbon emissions and uptake: air temperature, T a

29
, and solar irradiance, rs 30 , respectively, as in Eqs. (1) and (2). We assumed that the rs/PAR (Photosynthetically Active Radiation) ratio is constant 30 and that wavelengths outside the PAR band are photosynthetically inactive 83 . Here, we chose the same model class selected by Magnani et al. 56 for the Alpine tundra, based on the hypothesis that additional variables in the model cause small modifications of the parameters of Eqs. (1) and (2). Hence, using Taylor expansions, a general formulation of the multi-regression models can be written as The additional variables, represented by x i and y i in (M1) and (M2), were selected among the measured ones, plus the hour of sampling (h), the sampling date (DOY) and the green vegetation cover (GFC), as described in 'Dataset' . A partial-correlation analysis on all the candidate drivers was performed before building the model and candidate drivers having significant correlations were not allowed to be simultaneously included in the model. Different letters for the additional variables in Eqs. (M1) and (M2) have been used to highlight that the two sets of predictors were selected independently of each other. We used the set of meteoclimatic variables recorded during NEE measurements, as humidity and temperature do not significantly vary during the consecutive NEE and ER measurements, while it is important to use the rs records obtained simultaneously with NEE measurements, when photosynthesis is active. Parameters that gave non-significant (p > 0.05, see also 'Statistical analysis') contributions to the model were statistically pruned. In multi regression modelling, the use of an increasing number of variables could enhance model representativeness to the detriment of model simplicity. We quantified model representativeness by the explained variance, defined as σ expl 2 = (σ flux 2 − σ res 2 )/σ flux 2 , with σ flux 2 the variance of measured fluxes (either ER of GPP) and σ res 2 the model residual variance (i.e. the variance of residuals, ε ). The most efficient model was selected by means of the Akaike Information Criterion (AIC, 33 ), which identifies the model that minimizes the residual variance, with a penalty for the number of parameters. For regressive models, AIC is expressed as AIC = Nlog i ε 2 i − 2k 34 , with k the number of estimated parameters, including the residual variance itself, and N the number of data. The smallest value of AIC identifies the most efficient model. Finally, we used the Lilliefors' test 35 for assessing the null hypothesis of gaussian distribution of the residuals, and Bartlett's test 36 for assessing their heteroscedasticity. Once the model was identified, the weight of each additional drivers included in the regression was estimated by including one additional driver at a time in the regression and refitting the model, thus obtaining the explained variance for the reduced model, to be compared with the explained variance of the classical model (1) or (2). We performed the statistical analysis in MatlabR2020a, using intrinsic functions for nonlinear regressions, Lilliefors' and Bartlett's test.

Statistical analysis.
In the species-specific sampling, a preliminary analysis of the variables averaged over each vegetation species highlighted whether different species were associated with characteristic environmental and flux values. Significant differences (i.e. not ascribed to random fluctuations, p < 0.05) between species were assessed using a randomization technique that follows Jacobson et al. 84 : two series of species-specific data were (M1) ER = (a 0 + a 1 x 1 + a 2 x 2 + · · ·) exp (b 0 T a ) + ε, (M2) GPP = [Fα 0 rs/(F + α 0 rs)][ A 0 + A 1 y 1 + A 2 y 2 + · · · + (Fα 0 rs/(F + α 0 rs)) B 0 + B 1 y 1 + B 2 y 2 + B 3 y 1 rs + B 4 y 2 rs + · · · ] + ε Scientific Reports | (2022) 12:763 | https://doi.org/10.1038/s41598-021-04728-0 www.nature.com/scientificreports/ shuffled, by mixing the individual values across the series, several times, then the difference of the means of the shuffled series (which is theoretically null) gave an estimate of the probability (P) of obtaining a difference larger than the unshuffled one. This allowed us to test the distribution of the difference of the means against the null hypothesis of statistically different means, by performing a double-tailed test. Randomization was also used to assess (i) the significance of regression coefficients, by shuffling pairs of dependent-independent variables and testing the null hypothesis of statistically uncorrelated variables, and (ii) the significance of the differences observed between species-specific models, by shuffling the independent variables for two series of measurements (e.g. dependent-independent pairs measured for two species), using the shuffled series to fit the models and testing the hypothesis that the observed difference was induced by random fluctuations of the variables. In the last case, the comparison of the parameters estimated for different vegetation classes was used to assess whether the species-specific response of the system to the same drivers was comparable (e.g. different species may have different specific productivity for the same GFC resulting in different CO 2 uptake). We implemented ad hoc functions in MatlabR2020a for estimating P-values with the shuffling techniques. A detailed description of the statistical tests can also be found in Magnani et al. 56 .