Robust and perfectible constraints on human-induced Arctic amplification

The Arctic near-surface warming is much faster than its global counterpart. Yet, this Arctic amplification occurs at a rate that is season-, model- and forcing-dependent. The present study uses temperature observations and reanalyses to constrain the projections of Arctic climate during the November-to-March season. Results show that the recently observed four-fold warming ratio is not entirely due to human influence and will decrease with increasing radiative forcing. Global versus regional temperature observations lead to complementary constraints on the projections. When Arctic amplification is defined as the additional polar warming relative to global warming, model uncertainties are narrowed by more than 30% after constraint. Similar results are obtained for projected changes in the Arctic sea ice extent (40%) and when using sea ice observations to constrain the polar warming (37%), thereby confirming the key role of sea ice as a positive but model-, season- and time-dependent surface feedback.

These early findings have been partly corroborated by CMIP5 models. Yet, the underlying mechanisms and the quantification of their relative contributions to AA have been a matter of intense debate. Besides the central role of diminishing sea-ice 2,3,4 , other important feedbacks have been highlighted, such as large contributions from longwave feedbacks 5,6 and significant changes in atmospheric and oceanic heat transport 7,8 . A single model study 4 highlighted that the water vapor feedback contributes negatively to AA because its induced surface warming is stronger in lower latitudes. A radiative kernel method was applied to analyze the polar warming factors in a large subset of CMIP5 projections based on an intermediate RCP4.5 scenario 2 . Several feedbacks were reported to contribute to surface warming both over land and ocean, such as the water vapor and lapse rate feedbacks (LRF), but also a positive longwave cloud feedback in winter. In contrast, the positive radiative sea-ice feedback is by definition only active over the ocean and is maximum in summer although it peaks at different times of the year depending on the latitude. Yet, the sea ice retreat is not a purely radiative feedback and is also important in winter when it is associated with enhanced surface evaporation over the Arctic Ocean 2 .
Idealized climate change simulations in which the CO2 forcing is prescribed in distinct geographical regions 8,9 showed that AA is dominated by local rather than extra-polar forcings, specifically through a positive LRF. This finding was further supported through a decomposition of the Arctic LRF into "upper" and "lower" troposphere contributions, showing that this feedback arises primarily as an atmospheric response to the local sea-ice loss, although it is reduced in subpolar latitudes by an enhanced poleward atmospheric energy transport 10 . The uncertainty in Arctic climate projections is dominated by model differences (rather than internal climate variability or even scenario uncertainty) and shifts gradually from autumn to winter over the 21 st century, in line with a dominant influence of the sea ice retreat that also shows a similar seasonal shift 11 .
The latest-generation CMIP6 models broadly capture the observed near-surface pattern of winter-dominated Arctic warming and lead to similar conclusions about the main feedbacks 3,12,13. Abrupt four times CO2 experiments compared to preindustrial climate revealed that the LRF and surface albedo feedback contribute most to AA 3. In comparison with CMIP5, stronger polar warming in CMIP6 was first attributed to a larger surface albedo feedback, combined with less-negative cloud feedbacks. However, scaling the NH polar warming by the concomitant global warming yields a similar degree of AA 3. A radiative kernel method applied to CMIP6 models highlighted that AA does not primarily arise from the surface albedo feedback but rather from the LRF, especially in winter 13. This positive Arctic LRF is a noteworthy regional exception given the overall negative LRF at the global scale. It is a seasonal phenomenon, triggered by sea surface temperature changes after sea-ice loss, and not by the degree of atmospheric stratification 14. The key role of the declining sea ice cover may also explain why the inter-model spread of AA was found to decrease with increasing radiative forcings 12,14.

Confidential manuscript
To sum up, there are still inconsistencies regarding the drivers and magnitude of AA in global climate models. Such discrepancies can be due to multiple methodological issues: AA definition, single versus multi-model studies, idealized increased-CO2 versus scenario experiments, choice of the emission scenario and of the radiative kernel for feedback decomposition, focus on either annual or seasonal means. Moreover, polar feedbacks are tightly coupled to changes in oceanic and atmospheric energy transport, so that their contributions to AA should not be considered in isolation 15 . Clearly, our assessment of these feedbacks remains limited and other, top-down rather than bottom-up, approaches may be needed to constrain the projections in this highly sensitive region.

Quantifying uncertainties in CMIP6 projections of the Arctic climate
The inter-model spread in AA has not been reduced in the latest CMIP6 global climate projections 3,12,14 (Fig. 1). According to the latest assessment report delivered by the first working group of the Intergovernmental Panel on Climate Change (IPCC AR6 WG1), "there remains substantial uncertainty in the magnitude of projected AA, with the Arctic warming ranging from two to four times the global average in models". This finding is supported by our own analysis of 37 CMIP6 models. Our baseline period for present-day climate is 1995-2014, as also chosen by the AR6 WG1. The focus is on the SSP5-8.5 high-emission scenario and on an extended winter season (ONDJFM), when the Arctic warming is the highest in both models and observations. This choice allows us to maximize the signal-to-noise ratio and, thus, to use a single realization for each model. Similar results are obtained with the previous-generation CMIP5 models despite their overall lower climate sensitivity (Fig. S1). In line with former studies 12,14, a higher degree of amplification over the Arctic Ocean is found in the mid-21st century (Fig. S2) or in a weaker emission scenario (Fig. S3). Yet, our AA definition is based on both land and sea surface temperature (SST) north of 60°N and may thus also involve a snow feedback component, which will not be explored in detail in the present study but may show a different forcing-dependence.
If one does not scale the projected climate change by the corresponding global warming in each CMIP6 model, the inter-model spread also includes the contrasting climate sensitivities of the CMIP6 models, which add to the previously shown AA uncertainty (Fig. S4). The latitudinal distribution of this spread is consistent with the pattern of uncertainties in the response of sea-ice concentration 15 (Fig. S5), snow cover (Fig. S6) and total precipitable water (Fig. S7), among other variables that may contribute to the high-latitude positive feedbacks on polar warming. Not surprisingly, the projected response of total cloud cover (Fig. S8) ranges from slightly negative to clearly positive values over the Arctic across the CMIP6 multi-model ensemble. According to these models and in line with our results, the Arctic will become cloudier in a warmer climate, especially in winter when the associated positive radiative feedback occurs primarily in the longwave portion of the spectrum 16. Given this fairly robust response, cloud feedbacks are important but have not so far been identified as a major contribution to the inter-model spread in AA.
Several metrics 17 have been used to quantify the degree of AA in observations. Some are based upon the ratio of linear trends or of interannual variability between polar (here north of 60°N) and global mean near-surface air temperatures (hereafter PSAT and GSAT, respectively). Another relies on the regression between polar and global warmings. Such definitions are sensitive to the selected time interval and cannot easily be used for continuous monitoring given the weak signal-to-noise ratio at the beginning of the instrumental record or even in the mid-20th century, which leads to undefined or very unstable values. This is the reason why another proposed definition 18 will be used in the present study, where AA is simply estimated as the difference between PSAT and GSAT. Such a definition is suitable for us since our statistical package (see Methods) allows us to constrain the polar warming with several observations, including GSAT and PSAT, so that the ratio of their constrained warming rates can be considered as a by-product of the method. Moreover, such a definition is also convenient to assess the possible evolution of AA, which is a time-evolving and bias-dependent metric as will be discussed later on.
Figure 1. [...] as well as the 10% and 90% local percentiles (bottom panels). All stereopolar maps are based on a set of thirty-six CMIP6 models with available monthly mean tas outputs. All anomalies are estimated as the differences between the 2081-2100 and 1995-2014 climatologies and scaled by the corresponding global warming.
Figure 2a shows a scatterplot of AA in individual CMIP6 models against the corresponding projected global warming at the end of the 21st century relative to the 1995-2014 baseline period. Global warming only accounts for two thirds of the total spread in AA (i.e., PSAT - GSAT anomalies) among the CMIP6 ensemble.
This result suggests that regional feedbacks also contribute to significant uncertainties in AA. This suggestion is further supported by panels b and c in Figure  2, showing a stronger correlation between AA and ONDJFM anomalies in NH sea ice extent and NH total precipitable water respectively. In contrast, there is no obvious link between AA and projected changes in the polar total cloud cover (Figure 2d), thereby suggesting that cloud feedbacks are mainly contributing to uncertainties in polar warming through their global rather than high-latitude footprints (mainly via uncertainties in the response of GSAT).
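As a minimal sketch of the two families of AA metrics discussed above (subtractive versus trend-ratio), the following Python snippet computes both from a pair of anomaly series. The function names and the numerical values are hypothetical and chosen only for illustration; they are not taken from the study or from the KCC package.

```python
def linear_trend(values):
    """Ordinary least-squares slope of `values` against their index (units per time step)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    return cov / var


def aa_difference(psat, gsat):
    """Subtractive AA: polar minus global anomaly, defined at every time step."""
    return [p - g for p, g in zip(psat, gsat)]


def aa_trend_ratio(psat, gsat):
    """Multiplicative AA: ratio of linear trends (unstable when the global trend is near zero)."""
    return linear_trend(psat) / linear_trend(gsat)


# Hypothetical anomalies (K) relative to a common baseline, for illustration only.
gsat = [0.0, 0.1, 0.2, 0.3, 0.4]
psat = [0.0, 0.4, 0.8, 1.2, 1.6]

print(aa_difference(psat, gsat))   # one AA value per time step
print(aa_trend_ratio(psat, gsat))  # a single value for the whole interval
```

The subtractive form returns a value at every time step, which is what makes it usable for continuous monitoring, whereas the ratio requires a well-defined global trend over the chosen interval.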

Because our subtractive rather than multiplicative AA definition is not the most popular, Fig. S2 repeats the same analysis as in Figure 2 but for polar warming rather than AA (i.e., without subtracting the GSAT from the PSAT anomalies). In this case, the slope found in Fig. S2a is the ratio between PSAT and GSAT warmings and shows a best fit of 2.6 K per 1 K of global warming at the end of the 21st century. Uncertainties in the projected global warming account for 84% of the inter-model spread in PSAT anomalies. Yet, and in line with our previous results, the correlation is even higher when considering NH sea ice extent or total precipitable water anomalies. This again highlights that AA is generally stronger in models that project a large reduction in sea ice extent and a large percentage increase in tropospheric humidity. One key task is therefore to compare the effect of different observational constraints, not only on different metrics of Arctic climate change (to check whether they lead to consistent results about the regional sensitivity of CMIP6 models), but also on polar warming (to check whether some variables may be more efficient than GSAT to constrain this specific metric).

Constraining and attributing recent changes in Arctic climate
Beyond climate models, AA was also unveiled by paleoclimate reconstructions 19,20 , instrumental and satellite records 21,22,23 , as well as state-of-the-art atmospheric reanalyses 23,24 . An early observational study 21 based on a 125-year instrumental record did not support the simulated AA of global warming, but highlighted that the Arctic climate variability is dominated by multidecadal fluctuations which may obscure long-term changes. The key role of sea-ice internal multidecadal variability could not be conclusively identified, but was highlighted by several subsequent studies 25,26 .
In the early 21st century, a two-fold ratio between the Arctic and global near-surface warming (slightly lower than the simulated 2.6 ratio found in Fig. S9 at the end of the 21st century) was observed, leading to further investigation of the role of changes in sea ice, atmospheric and oceanic circulation, cloud cover and atmospheric water vapour 23. As in models, the observed Arctic warming was shown to be strongest at the surface and primarily consistent with the concomitant sea-ice retreat. No clear evidence was found for a significant radiative impact of changes in cloud cover (despite a recent increase in Arctic cloudiness 27), but an increase in atmospheric water vapour content was reported and may have contributed to enhancing the Arctic warming during summer and early autumn.
More recently 24, a much stronger four-fold AA was reported over 1979-2021 from multiple temperature datasets including the state-of-the-art ERA5 reanalysis 28,29. Such a warming ratio is extremely rare in the CMIP5 and CMIP6 simulations, thereby suggesting that it is either an extremely unlikely event or that AA is systematically underestimated by climate models 24. This result calls for a better understanding and consideration of internal climate variability when constraining the Arctic climate projections with the observed trends 30. Our analysis is based on the KCC statistical package 31,32 and on the combined use of GSAT and PSAT reconstructions (see Methods). This Bayesian technique allows us to derive a posterior distribution of the projected anomalies from the prior distribution derived from the raw model outputs. In doing so, it accounts for both internal variability and observational uncertainties to constrain recent and future simulated climate changes in a consistent way. Yet, and unlike for GSAT, we did not include observational errors in our PSAT dataset, which is simply derived from ERA5, whose quality has been improved in the Arctic compared to previous global atmospheric reanalyses 29. Similarly, our reconstructions of sea ice extent, total precipitable water and total cloud cover north of 60°N have been derived from ERA5 and should be considered cautiously given the evolution of the global observing system since 1959, especially for cloud and water variables.
Regarding the near-surface warming rate, our results support an observed four-fold ratio between the Arctic (north of 60°N) and the globe (cf. black cross in Figure 3a). Yet, such a high value is outside the range of simulated values found in CMIP6 models and of the related prior joint distribution (blue ellipse fitted on raw model outputs). All CMIP6 models seem to underestimate the observed Arctic amplification factor, but a significant part of this underestimation may be due to internal variability, which contributes to the overlap between the blue and black ellipses in Figure 3a. In the end, the uncertainty in the forced response of the CMIP6 models (red ellipse) is much smaller than in the prior distribution (blue ellipse). Yet, our constrained estimate of the polar warming amplification factor (around 3.5) is not much different from the unconstrained estimate. This value is much higher than the 2.6 ratio found at the end of the 21st century (Fig. S9a), which is fully consistent with the fact that the regional Arctic feedbacks are time-evolving and forcing-dependent 12,14. This finding suggests that some Arctic feedbacks may also be sensitive to the base state and thus to biases in climate models. Such a link can be obscured by a few outliers, but is clear in the case of the cloud response (Fig. S10). Some CMIP6 models show a stronger than observed Arctic cloudiness 27, which is so close to 100% that it cannot increase across the 21st century. This result highlights the need to further improve climate models and/or to constrain their projections with reliable observations.

Beyond the recent AA, the KCC method allows us to constrain and attribute simulated changes in PSAT since 1850 (Figure 3b). Not surprisingly, the results show that the simulated changes cannot be explained by natural forcings and are thus mostly due to human activities. Yet, the polar warming induced by our greenhouse gas (GHG) emissions was partly offset by anthropogenic aerosols across the 20th century. The unconstrained aerosol cooling effect appears to be overestimated compared to the effect constrained with KCC, which suggests that some aerosol effects may not be properly accounted for in most CMIP6 models, such as a possible warming effect due to the deposition of black carbon aerosols on sea ice and a related decrease in surface albedo. Similarly, the unconstrained polar warming due to GHG seems to be slightly overestimated before the mid-20th century, although its recent contribution seems to be quite realistic compared to ERA5. The observed increase in PSAT (black crosses) is thus mostly attributable to the GHG emissions and is reasonably captured by the ensemble mean of the CMIP6 historical simulations. Yet, this result does not mean that the CMIP6 projections are also reliable, given the possible evolution of the Arctic feedbacks and the large inter-model spread found around the ensemble mean. This is the reason why we will now use KCC to constrain future changes in the Arctic climate.

Constraining future changes in Arctic climate
As discussed earlier, AA was found in both paleoclimate and future climate simulations, thereby suggesting that ice-core-based reconstructions may provide quantitative insights on global climate changes 19,33 . Yet, it may be inappropriate to simply scale an observational estimate of past temperature changes to predict the future climate sensitivity 33 . Moreover, the documented dependence of both Arctic and tropical feedbacks on control climate 15,33 may challenge the feasibility of constraining climate projections with paleo data.
A possible alternative to narrow model uncertainties in climate projections is to use the so-called "emergent constraint" (EC) technique 34,35. This empirical statistical method consists of linking future climate changes to observable metrics that can be more or less accurately simulated by CMIP-class models. The relevance of these metrics is usually supported by the existence of a correlation with the projected future climate response within an ensemble of models. ECs are generally applied in a simple regression framework, where the ensemble is used to define a predictive relationship that can be combined with observations to produce an estimate of constrained projections. While this method has drawn growing interest and has been applied to multiple variables, there are so far only a few studies related to the Arctic 30,36. Both relate to the projection of sea ice and rely on observed trends rather than on model biases. This strategy is consistent with our results, which show no apparent link between sea ice sensitivity and sea ice mean state across the CMIP6 models (Fig. S11). It also supports our choice to use the full historical record (rather than just a climatology or a linear trend estimate) to constrain the projections.
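The simple regression framework behind emergent constraints, as described above, can be sketched as follows. This is an illustrative sketch only: the function names, the ensemble values and the observed value are hypothetical, and the approach shown is the generic EC regression, not the KCC method used in this study.

```python
def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x; also returns the squared correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


def emergent_constraint(model_observable, model_future, observed_value):
    """Use the inter-model relationship to turn an observed metric into a constrained projection."""
    slope, intercept, r2 = fit_line(model_observable, model_future)
    return intercept + slope * observed_value, r2


# Hypothetical ensemble: one (observable metric, future change) pair per model.
model_observable = [0.2, 0.4, 0.6, 0.8]   # e.g. a simulated historical trend
model_future = [2.0, 3.0, 4.0, 5.0]       # e.g. a projected end-of-century change
estimate, r2 = emergent_constraint(model_observable, model_future, observed_value=0.5)
print(estimate, r2)
```

A complete EC application would also propagate the regression residuals and the observational uncertainty into the constrained estimate; this sketch returns only the central value and the fraction of inter-model variance explained.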
Interestingly, a recent EC study 37 was aimed at constraining the large CMIP6 scatter in AA with a broad set of recent observations co-located to model data. The results suggested that the lower thermodynamic structure of the atmosphere is more realistically depicted in climate models with limited AA (weakly positive polar LRF) in the recent past. In contrast, remote influences that can shape the warming structure in the free troposphere are more realistically captured by models with a strong AA (strongly positive Arctic LRF). The two contrasted findings highlight the difficulty to define and combine relevant ECs.
Moreover, there is increasing evidence that most ECs that have been proposed to constrain CMIP5 projections are generally less efficient when applied to CMIP6 models 35. This spurious behavior can arise from model interdependency 29,36,37 through common structural model assumptions and can lead to overconfident constrained projections. Our KCC technique does not build on empirical linear regression schemes; it has been tested successfully in a perfect model framework 31 and already applied at both global 31 and local scales 32. Beyond temperature, KCC has also been used to constrain other variables, such as global total precipitable water 38 or global land surface relative humidity 39, leading to consistent results for both CMIP6 and CMIP5 models.
Here, the KCC method is first applied to the SSP5-8.5 high-emission scenario of CMIP6. In line with the limited GSAT influence on our AA metric (Fig. 1a), our results show the added value of PSAT observations for constraining the Arctic warming (Fig. S12) and AA (Fig. 4). Using only observations of global mean surface temperature (GMST, a combination of SAT over land and sea surface temperature over the ocean) (Fig. 4b), KCC leads to a narrowing and downward shift of the AA projections, in line with the well-known 31 overestimation of global warming by some CMIP6 models (Fig. 4a). In contrast, the ERA5 constraint on PSAT does not change the ensemble mean projection and mostly excludes the lowest range of projected AA. Moreover, the resulting reduction of the 90% confidence interval at the end of the century is slightly stronger (~29%) than for the GMST constraint (~26%). Finally, and not surprisingly, the combination of both observational constraints within KCC leads to an even stronger narrowing (31.4%) and suggests that the most extreme AA responses found across CMIP6 models are not compatible with the observations. Note that KCC leads to even more confident mid-century projections, so that the reported narrowing of the 5-95% confidence interval is here a minimum value. Not surprisingly given our AA definition, similar results are obtained when focusing on PSAT projections (Fig. S12) since GSAT and PSAT are not independent metrics. Moreover, our results are not very sensitive to the choice of the prior distribution since qualitatively similar results have been obtained with the previous-generation CMIP5 models, although the narrowing of uncertainties is then smaller given the lower inter-model spread (Fig. S13). KCC can also be used to constrain other projected changes in the Arctic climate.
It is for instance even more efficient at narrowing uncertainties in the projected ONDJFM sea ice extent, with an overall 40% reduction of the 5-95% confidence interval at the end of the century (Fig. S14). Note that the ERA5 sea ice concentrations used here as an observational constraint (in addition to GMST) are derived from satellite data, since ERA5 is a global atmospheric reanalysis driven by observed oceanic boundary conditions.
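The narrowing percentages quoted above compare the width of the 5-95% confidence interval before and after constraint. A minimal sketch of that arithmetic is given below, assuming Gaussian prior and posterior distributions; the function names and the standard deviations are hypothetical and are not the values from this study.

```python
from statistics import NormalDist


def interval_5_95(mean, std):
    """5-95% interval of a Gaussian with the given mean and standard deviation."""
    dist = NormalDist(mean, std)
    return dist.inv_cdf(0.05), dist.inv_cdf(0.95)


def narrowing_percent(prior_interval, posterior_interval):
    """Relative reduction (in %) of the interval width after applying the constraint."""
    prior_width = prior_interval[1] - prior_interval[0]
    posterior_width = posterior_interval[1] - posterior_interval[0]
    return 100.0 * (1.0 - posterior_width / prior_width)


# Hypothetical end-of-century distributions (K), for illustration only.
prior = interval_5_95(mean=6.0, std=2.0)
posterior = interval_5_95(mean=6.0, std=1.4)
print(narrowing_percent(prior, posterior))
```

Because the metric depends only on interval widths, a constraint can narrow the range without shifting the ensemble mean, as reported above for the ERA5 PSAT constraint.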
Similarly, ERA5 can also be used to constrain the average total precipitable water (Fig. S15) or total cloud cover (Fig. S16) projected north of 60°N. The KCC results should here be considered with more caution, since ERA5 has been shown to exhibit spurious global trends in tropospheric humidity given the evolution of the assimilated satellite data 38. Yet, the results are physically consistent with those obtained for PSAT and the northern hemisphere sea ice extent. GMST observations lower the upper bound of the projected Arctic warming and, consistently, have similar effects on the projected high-latitude total precipitable water and cloud cover. In contrast, our regional ERA5 constraints exclude the lower-bound values of the CMIP6 ensemble. This is also consistent with the opposite effects on the projected sea ice extent. This robust finding confirms that regional climate change does not scale accurately with global warming across different models 40, and that local observations are also very important to constrain regional climate change 32,40,41. The results also suggest that more reliable reconstructions of atmospheric humidity and cloudiness would be very useful to better constrain the projections.
Finally, and given the partial redundancy between global and regional temperature variations, we can use the ERA5 sea ice extent rather than GMST to constrain the surface polar warming (Fig. S17). Results are then fully consistent with our previous attempt to constrain PSAT (Fig. S12). Yet, the double constraint based on both ERA5 sea ice extent and ERA5 PSAT leads to a slightly better performance of KCC, with a 36.6% (instead of 35.4%) narrowing of the 5-95% confidence interval (but again no significant change in the ensemble mean response). This result confirms the significant correlation between GSAT and PSAT (as indicated in Fig. 2a) and emphasizes the added value of reliable gridded observations for constraining climate change projections at the regional scale. Note however that we assumed here no observational uncertainty in the ERA5 reanalysis, so that only internal variability is accounted for in KCC. Results are not very sensitive to the introduction of a 20% random measurement error as white noise in the observed time-series (not shown). Clearly, KCC is a robust method for constraining both past and future climate change, and can provide even more tightly constrained projections if we use more reliable observations or longer time-series. It could thus be used in a semi-operational context where climate projections are constrained on a regular basis (every year), using the best quality-checked and updated datasets.

Methods
The observational constraint method, called Kriging for Climate Change (KCC), has been previously applied to global and local warming 31,32, and can be easily applied to other climate variables 38,39 as long as their internal variability can be fitted with a simple mix of auto-regressive processes (Fig. S18). KCC consists of three steps. First, the forced response of each climate model is estimated over the whole 1850-2100 period, and the response to GHG forcings is estimated separately for the attribution component of the study. Second, the sample of the forced responses from the available climate models (CMIP5 or CMIP6) is used as a prior of the real-world forced response. Finally, observations are used to derive a posterior distribution of the past and future forced response given observations. The method can be summarised using the following equation: y = Hx + ε, where y is the time-series of observations (a vector), x is the time-series of the forced response (a vector), H is an observational operator (a matrix), and ε is the random noise associated with internal variability and measurement errors (a vector), with ε ∼ N(0, Σ_y), where N stands for the multivariate Gaussian distribution. Climate models are used to construct a prior on x: π(x) = N(μ_x, Σ_x). Then the posterior distribution given observations y can be derived as p(x|y) = N(μ_p, Σ_p). Remarkably, μ_p and Σ_p are available in closed-form expressions.
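The closed-form expressions are not written out in the text. For a linear observation model with Gaussian noise and a Gaussian prior, as described above, the textbook conditioning result reads as follows. It uses the symbols defined in this section and is given as a sketch, assuming that KCC follows this standard form.

```latex
\begin{aligned}
\mu_p &= \mu_x + \Sigma_x H^{\top}\left(H \Sigma_x H^{\top} + \Sigma_y\right)^{-1}\left(y - H\mu_x\right),\\
\Sigma_p &= \Sigma_x - \Sigma_x H^{\top}\left(H \Sigma_x H^{\top} + \Sigma_y\right)^{-1} H \Sigma_x .
\end{aligned}
```

In this form, the posterior mean moves away from the prior mean in proportion to the mismatch between the observations and the prior expectation, and the posterior covariance can only shrink relative to the prior, which is consistent with the narrowing of uncertainties reported in the results.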
In the following, we assess the forced response of Arctic climate (e.g., spatially averaged near-surface temperature or integrated sea-ice extent), as well as the response to specific subsets of radiative forcings (attribution). These forced responses are then constrained by the historical observed global warming (https://www.metoffice.gov.uk/hadobs/hadcrut5/) and/or by ERA5 reanalyses (https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5).
We thus consider the following CMIP matrix: x = (T_all, E_all, E_ghg, E_nat), where each element is an entire 1850-2100 time-series of the forced response, and T and E stand for global mean surface temperature (GMST) and an ERA5 variable, respectively. The subscripts "all", "ghg" and "nat" denote the subsets of external forcings considered. Similarly, we define an observed matrix as: y = (T_obs, E_obs), i.e., only observed time-series are used in y. The length of these time-series varies: 1850-2021 for HadCRUT5 GMST and 1950-2021 for ERA5 (we have noticed that ERA5 has recently been extended back to 1940, but we feel that the 1940s will not provide a significant additional constraint within KCC given the limited GHG influence in the mid-20th century).
All attribution or projection diagnostics presented below can be derived from the posterior distribution p(x|y). μ_x and Σ_x are estimated as the sample mean and covariance of the forced responses. Σ_y requires statistical modelling of internal variability and measurement errors. The intrinsic variance of the selected global and Arctic climate indices is derived from observations after subtracting the multi-model mean estimate of the forced response from HadCRUT5 and ERA5 data, respectively. We also assume a dependence between the global and regional variability, by accounting for the correlation between the two residuals in Σ_y. The assessment of measurement uncertainty is based on the HadCRUT5 ensemble for GMST (200 members), while for ERA5 we do not account for observational errors (but have tested that our results are robust if we introduce random errors in the time-series).
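To illustrate how an internal-variability covariance of this kind can be built from residuals, the sketch below fits a single AR(1) process and adds an optional measurement-error variance on the diagonal. This is a deliberate simplification: the text describes a mix of auto-regressive processes and a correlation between global and regional residuals, neither of which is reproduced here. All function names and values are hypothetical.

```python
def ar1_fit(residuals):
    """Estimate the lag-1 autocorrelation and the variance of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    var = sum(c * c for c in centered) / n
    lag1 = sum(a * b for a, b in zip(centered[:-1], centered[1:])) / n
    return lag1 / var, var


def ar1_covariance(phi, var, n):
    """Covariance matrix of a stationary AR(1) process: var * phi**|i - j|."""
    return [[var * phi ** abs(i - j) for j in range(n)] for i in range(n)]


def observation_covariance(phi, var, n, measurement_var=0.0):
    """Internal-variability covariance plus independent measurement error on the diagonal."""
    cov = ar1_covariance(phi, var, n)
    for i in range(n):
        cov[i][i] += measurement_var
    return cov


# Hypothetical residuals (observations minus multi-model mean forced response), in K.
residuals = [0.12, 0.05, -0.08, -0.15, -0.02, 0.10, 0.07, -0.09]
phi, var = ar1_fit(residuals)
sigma_y = observation_covariance(phi, var, n=len(residuals))
print(phi, var)
```

Setting `measurement_var` to zero corresponds to the ERA5 treatment described above, where no observational error is included and only internal variability enters the covariance.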