Introduction

Arctic amplification (AA), the simple fact that the Arctic surface warms at a faster rate than the rest of the globe1,2, is the most prominent feature of increasing atmospheric concentrations of greenhouse gases (GHG)3,4,5, as a consequence of anthropogenic emissions. Within the Arctic, AA has had wide impacts on ecosystems and socio-economic activities5. For example, amplified Arctic warming has melted sea ice in shelf regions, and opened a new route for polar commercial shipping6,7,8, fishery activities9,10, and natural resource extractions11. AA has also been suggested to exert a far-reaching influence on extreme weather events in North America, Europe, and Northern Siberia12,13,14,15, although this influence—and the possible dynamical pathways—are poorly understood and, in fact, remain vigorously debated16,17,18,19,20,21. Improving our understanding of AA and its causes is, therefore, not only of scientific merit, but also has regional and, potentially, global implications5.

Simulations with general circulation models (GCM) forced by increasing CO2 concentrations have revealed that AA is not a consequence of a larger radiative CO2 forcing at the poles than at lower latitudes, as one might have naively imagined: in fact, upon increasing CO2 concentrations, the direct radiative forcing is larger at the equator than at the poles22,23. AA is thus produced by local positive feedbacks22,24,25,26,27,28, poleward heat and moisture transports28,29,30,31,32,33, oceanic heat exchange mechanisms34,35,36, and, possibly, complex interactions among these factors37,38. While the seasonal evolution of AA is a complex phenomenon, involving multiple coupled mechanisms with different seasonal features35,38,39,40, it is widely accepted that the sea-ice conditions (and the accompanying atmosphere-ocean heat exchange) are an important player35,36,41,42,43,44,45,46,47,48,49,50,51. The amplified Arctic warming, ultimately caused by GHG increases, is thus closely tied to the seasonal evolution of sea ice. During Northern Hemisphere summer, sea-ice reduction allows absorbed solar radiation to warm the ocean mixed layer, a process enhanced by the sea-ice albedo feedback2,39,52,53,54. Then, in the following autumn and winter, when the atmosphere rapidly cools, the enhanced air-sea thermal contrast results in stronger surface heat and moisture fluxes entering the atmosphere, and these produce stronger lower-tropospheric warming, enhanced by the sea-ice insulation effect50,55 and longwave feedback processes22,38,56. Fully-coupled ocean-atmosphere-sea-ice-land GCM experiments in which Arctic sea-ice loss is artificially specified showed a clear causal link between sea-ice loss and amplified Arctic atmospheric warming44,45,46,47,48,57,58.
More importantly, since the amount of future Arctic sea-ice loss depends on the seasonal evolution of sea ice as the climate warms59,60, one would expect the seasonal cycle of AA to correspondingly evolve with increasing CO2 as a recent study revealed61. This is one of the key motivations for this work: to investigate the seasonal cycle of AA in response to CO2 forcing.

In addition, to date, most modeling studies of Arctic climate change are based on the experiments with doubling (hereafter 2×) or, at most, quadrupling (hereafter 4×) of CO2 concentrations from pre-industrial levels (~285 ppm, hereafter PI)22,38,62. Because the Arctic climate system is characterized by large internal variability63,64,65,66, exploring larger CO2 values would be useful to obtain a better signal-to-noise ratio. Moreover, in practical terms, looking at higher CO2 values is of interest because in the high emission scenario used for climate projections by Phase 6 of the Coupled Model Intercomparison Project (CMIP6), CO2 levels can reach higher values than 4 × CO2 after the year 210067. However, scenario integrations involve many forcings that change in complex ways, often non-monotonically (e.g., aerosols or ozone-depleting GHGs), thus complicating our understanding of the Arctic response to increasing CO2.

In order to clearly bring out the characteristics of sea-ice loss and AA at high CO2 concentrations, we here analyze a set of GCM simulations performed in a recent study68, with abrupt forcing spanning the range 1 × to 8 × CO2. This allows us to document the response of the Arctic climate system as a function of CO2 over a broad range of values, without the confounding interference of other forcings. Having documented that response, we explicitly demonstrate that the results obtained from the abrupt n × CO2 simulations are not of merely academic interest. Analyzing the Arctic climate evolution under high-CO2 emissions scenarios, in both single-model large ensembles and the CMIP6 multi-model ensemble, we show that the key features of the response seen in the abrupt CO2-only forcing simulations also emerge in the realistic scenario simulations over the course of the 21st century.

Results

Arctic warming in response to abrupt CO2 forcing

We start by examining the response of the annual-mean Arctic surface air temperature (SAT) to CO2 forcing in the fully coupled ocean-atmosphere-sea-ice-land model experiments. In this study, we define the response as the difference, averaged over the last 30 years of the simulations, between the n × CO2 and the 1 × CO2 (i.e., the pre-industrial) experiments, with n ranging from 2 to 8. Figure 1a shows that, in fully coupled model experiments, the Arctic SAT response becomes stronger as CO2 forcing increases (solid line with circle symbols), ranging from 6.3 K for 2 × CO2 to 17.1 K for 8 × CO2. As the Arctic SAT warming is coupled closely to sea-ice retreat35,36,38, the corresponding sea-ice extent (SIE) response decreases with increasing CO2 (Fig. 1b). The weaker SAT and SIE responses at 4 × CO2 than at 3 × CO2 are associated with a substantial weakening of the Atlantic meridional overturning circulation (AMOC) in the 4 × CO2 experiment68, which results in less oceanic heat transport into the Arctic, and reduced Arctic warming and sea-ice loss. We confirm this by examining the corresponding slab ocean model experiments, in which changes in ocean dynamics are absent. There are no kinks at 4 × CO2 in the slab ocean model SAT and SIE responses (dashed lines with triangles). It is important to note that the non-monotonic climate response at 4 × CO2 is not confined to the Arctic, as it is present in many aspects of the climate response in the Northern Hemisphere (e.g., the AMOC, tropical expansion, precipitation), and has been shown not to be a model-dependent feature (see a recent study68 using the same abrupt CO2 experiments for more details).
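
The response definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `last_n_mean` and `response` are hypothetical, and the two time series are synthetic placeholders rather than model output.

```python
def last_n_mean(series, n_years=30):
    """Mean of the final n_years entries of an annual-mean time series."""
    if len(series) < n_years:
        raise ValueError("series is shorter than the averaging window")
    tail = series[-n_years:]
    return sum(tail) / len(tail)


def response(forced, control, n_years=30):
    """Last-n-year mean of the n x CO2 run minus that of the 1 x CO2 run."""
    return last_n_mean(forced, n_years) - last_n_mean(control, n_years)


# Synthetic 150-year annual-mean Arctic SAT series in K (made-up values):
# the control run is flat, the forced run warms for 100 years and then levels off.
control_sat = [258.0] * 150
forced_sat = [258.0 + min(year, 100) * 0.06 for year in range(150)]

print(f"Arctic SAT response: {response(forced_sat, control_sat):.2f} K")
```

Averaging only the final 30 years keeps the rapid initial adjustment out of the estimate, which is why the window is taken from the end of each integration.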

Fig. 1: Annual-mean Arctic surface and tropospheric response as a function of CO2 forcing in fully coupled and slab ocean model experiments.

a Surface air temperature (SAT) response. The colored dots are the 2× and 4 × CO2 experiments from six CMIP6 models: CESM2 (blue), CNRM-CM6-1 (cyan), GISS-E2-1-G (green), MIROC6 (yellow), MRI-ESM2-0 (magenta), TaiESM1 (red). b Sea-ice extent (SIE) response. c Turbulent heat flux (latent plus sensible heat components) response (positive value means heat fluxes from ocean to atmosphere). d 1000–500 hPa thickness (geopotential height difference between 1000 hPa and 500 hPa) response. The solid line with circle symbols represents responses for the fully coupled model experiments, whereas the dashed line with triangle symbols for the slab ocean model experiments. Error bars denote 95% confidence intervals calculated using Student’s t-distribution.

Examining the n × CO2 simulations further, we find increasing surface turbulent (latent and sensible) heat fluxes entering the atmosphere as CO2 increases (Fig. 1c), as one would expect from the larger areas of open water accompanying the larger sea-ice losses. The consistent responses of SAT, SIE, and air-sea heat fluxes confirm that these components are closely coupled with each other, and they are likely working together to enhance the warming of the lower-troposphere, as recent studies have argued35,36,38. We also examine the response of the surface shortwave and longwave radiative fluxes (positive upward into the atmosphere), both of which show more negative values at greater CO2 forcing (Supplementary Fig. 1), indicating that they penetrate more into the ocean and leave the atmosphere cooler as CO2 increases. We interpret the shortwave and longwave radiative changes to be both a cause and a consequence of the warmer troposphere and larger sea-ice retreat. The reader is likely well aware that, while we have only presented the annual-mean responses, both turbulent heat and radiative fluxes have strong seasonal cycles35,36,39, but we postpone the discussion of the seasonal features of the responses to the next section in order to first examine the vertical structure.

As seen in Fig. 2a–g, our simulations reveal that the vertical extent of the zonal-mean air temperature response not only becomes stronger but also penetrates deeper into the troposphere as the forcing is increased. The Arctic warming response to 2 × CO2 is rather shallow, mostly below 850 hPa, but reaches much higher as the forcing is increased to 8 × CO2. Quantitatively, the polar cap-averaged (60°N–90°N) temperature responses are about 6 K at 1000 hPa and 3 K at 500 hPa for 2 × CO2, whereas they are about 20 K at 1000 hPa and 12 K at 500 hPa for 8 × CO2 (Fig. 2h). These vertical profiles manifest a stronger bottom-heavy warming structure and a larger vertical temperature gradient at higher CO2: this favors the lapse-rate feedback which enhances the near-surface Arctic warming, as many studies have highlighted22,24,28,38. Similar results are found in slab ocean model experiments (Supplementary Fig. 2), and the overall stronger warming responses appear consistent with previous studies69,70. As to whether the amplified Arctic warming might be able to impact mid-latitude weather and climate, it has been suggested that Arctic warming extending into the middle troposphere (e.g., 500 hPa) would affect the mid-latitude circulation71,72,73 and possibly cause cooling in mid-latitudes72. We have calculated the 1000–500 hPa thickness over the polar cap for all the n × CO2 simulations and find that the troposphere indeed becomes thicker as CO2 increases (Fig. 1d), and this may lead to mid-latitude cooling, although this signal could be masked by the prevailing global warming.
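
The link between layer warming and the 1000–500 hPa thickness follows from the hypsometric relation, in which thickness is proportional to the layer-mean temperature. The sketch below is a rough illustration, not the calculation used for Fig. 1d: it assumes dry air (no virtual-temperature correction), and the layer-mean warmings are crude midpoints of the 1000 hPa and 500 hPa values quoted above, chosen here only for demonstration.

```python
import math

R_DRY = 287.05  # gas constant for dry air, J kg^-1 K^-1
G0 = 9.80665    # standard gravity, m s^-2


def thickness_change(layer_mean_warming_k, p_bottom_hpa=1000.0, p_top_hpa=500.0):
    """Thickness change (m) from the hypsometric relation for a dry layer.

    dZ = (R_d / g) * dT_mean * ln(p_bottom / p_top)
    """
    return (R_DRY / G0) * layer_mean_warming_k * math.log(p_bottom_hpa / p_top_hpa)


# Assumed layer-mean warmings (K): midpoints of the quoted 1000 hPa and 500 hPa values.
assumed_warming = {"2xCO2": (6.0 + 3.0) / 2.0, "8xCO2": (20.0 + 12.0) / 2.0}

for label, warming in assumed_warming.items():
    print(f"{label}: ~{thickness_change(warming):.0f} m thicker")
```

Under these assumptions each kelvin of layer-mean warming thickens the 1000–500 hPa layer by roughly 20 m, so a deeper warming profile translates directly into a larger thickness response.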

Fig. 2: Vertical structure of the zonal-mean air temperature response to increasing CO2 forcing.

a–g The zonal-mean air temperature responses for 2× to 8 × CO2, respectively, in fully coupled model experiments. Only the responses that pass a Student’s t-test at the 95% confidence level are shown. h Polar cap-averaged (60°N–90°N) air temperature responses.

Arctic amplification and its seasonal cycle in response to CO2 forcing

We now turn from Arctic warming to Arctic amplification (AA), which we here quantify with a non-dimensional factor (hereafter AAF) defined as the Arctic-averaged (60°N–90°N) SAT response divided by the globally averaged one. Although both the annual Arctic and the global SAT increase rapidly in the first 20 years of the simulations, in the n × CO2 runs the annual-mean AAF (Fig. 3a) decreases for the first 70 years and then flattens out, with AAF values around 2 for the higher CO2 forcings (5× to 8 × CO2), but somewhat larger for 2× and 3 × CO2. The 4 × CO2 case shows the smallest AAF, and deviates from other cases because of the weaker Arctic warming response (Fig. 1a) related to the AMOC slowdown, as discussed in the previous section. Focusing on the last 30 years of each simulation (grey shading in Fig. 3a), we find that the AAF decreases from about 2.3 to 1.9 as the forcing is increased from 2× to 8 × CO2 (solid line with circles in Fig. 3b). The AAFs in slab ocean model experiments show similar decreasing values, but without the 4 × CO2 drop (dashed line with triangles in Fig. 3b), confirming again that the ocean dynamics associated with the AMOC slowdown are the main cause of the AAF drop at 4 × CO2 in the fully coupled model experiment. We also investigate the seasonal amplitude of AAF, defined as the maximum minus the minimum of 30-year means, and find that it decreases with increasing CO2 (Fig. 3c). This indicates that smaller seasonal variations would occur at high CO2 concentrations.
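
As an illustration of the AAF definition, the following Python sketch computes the ratio from a zonal-mean SAT response. It assumes cosine-of-latitude area weighting on a regular latitude grid, which the text does not specify; the function names and the synthetic response profile are hypothetical.

```python
import math


def area_weighted_mean(values, lats, lat_min=-90.0, lat_max=90.0):
    """cos(latitude)-weighted mean of zonal-mean values within [lat_min, lat_max]."""
    num = 0.0
    den = 0.0
    for value, lat in zip(values, lats):
        if lat_min <= lat <= lat_max:
            weight = math.cos(math.radians(lat))
            num += weight * value
            den += weight
    if den == 0.0:
        raise ValueError("no grid points inside the requested latitude band")
    return num / den


def arctic_amplification_factor(sat_response, lats):
    """AAF = Arctic-mean (60N-90N) SAT response / global-mean SAT response."""
    arctic = area_weighted_mean(sat_response, lats, 60.0, 90.0)
    global_mean = area_weighted_mean(sat_response, lats)
    return arctic / global_mean


# Synthetic zonal-mean SAT response (K) on a 5-degree grid (made-up values):
# 3 K everywhere except 9 K poleward of 60N.
lats = [-87.5 + 5.0 * i for i in range(36)]
sat_response = [9.0 if lat >= 60.0 else 3.0 for lat in lats]

aaf = arctic_amplification_factor(sat_response, lats)
print(f"AAF = {aaf:.2f}")
```

Because the Arctic cap covers only a small fraction of the globe, the global mean in the denominator stays close to the extra-Arctic value, so the AAF is governed mainly by how strongly the Arctic response exceeds it.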

Fig. 3: Time evolution and annual-mean amplitude of the AAF in the abrupt CO2 forcing experiments.

a The evolution of the annual AAF for 2× to 8 × CO2 in fully coupled model experiments. The last 30-year period is shaded with grey. b Annual-mean AAFs as functions of CO2 forcing. c The AAF annual amplitude or seasonal range, defined as the maximum AAF minus minimum AAF of the mean seasonal cycle response. In b and c, the solid line with circle symbols represents responses for the fully coupled model experiments, whereas the dashed line with triangle symbols for the slab ocean model experiments. Error bars denote 95% confidence intervals calculated using Student’s t-distribution.

However, we would be missing out on an important aspect of the Arctic response to CO2 forcing if we limited our discussion of the AAF to the annual mean. This is because the response of Arctic air temperatures, sea ice, and atmosphere-ocean heat exchange (and the associated feedback mechanisms) depends strongly on season35,39. For example, over the period 1958–2017, the warming of the Arctic in boreal winter (about 4 K) has been five times larger than in summer (about 0.8 K), based on reanalysis data (and GCM simulations), as shown in Fig. 1 of a recent study35. We thus next examine, in some detail, the seasonal evolution of the Arctic response to increasing CO2, starting with the AAF.

The seasonal cycle of AAF at increasing CO2 concentrations, averaged over the last 30 years of the abrupt forcing simulations, is shown in Fig. 4b. Two features immediately stand out: (1) the AAF decreases at larger CO2 forcing, and (2) the month of the largest AAF shifts later into the winter season. Investigation into the seasonal evolution of AAF from July to June indicates that the AAF maximum values, decreasing with CO2 forcing, determine the decrease in annual amplitude, because the minimum values do not respond much (Fig. 4b). We further find that the AAF peak occurs in November for the 2 × CO2 case, and the peak progresses to December for 3× to 5 × CO2, and then to January for yet higher CO2 concentrations. This shift is dominated by the seasonal evolution of the Arctic SAT response (Fig. 4a), not the extra-Arctic SAT response (Fig. 4d), both referenced to July (i.e., with the July value subtracted out). On the other hand, the global SAT seasonal responses are largely controlled by the Arctic ones (cf. Fig. 4c and 4a). We also find similar seasonal shifts of the peak AAF from November to December and the dominant roles of Arctic SAT responses in the AAF in slab ocean model experiments (Supplementary Figs. 3a, b). Our results suggest that the AA is delayed by 1 to 2 months in response to stronger CO2 forcing.
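
The seasonal diagnostics used here (July-to-June ordering, referencing to July, peak month, and seasonal amplitude) can be sketched as follows. The helper names are hypothetical, and the two monthly AAF cycles are invented solely to mimic a November peak at low forcing and a January peak at high forcing; they are not values from the simulations.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def july_to_june(monthly):
    """Reorder 12 calendar-month values (Jan..Dec) to run from July to June."""
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly values")
    return monthly[6:] + monthly[:6]


def referenced_to_july(monthly):
    """Subtract the July value from every month."""
    july = monthly[6]
    return [value - july for value in monthly]


def peak_month(monthly):
    """Name of the calendar month with the largest value."""
    return MONTHS[max(range(12), key=lambda i: monthly[i])]


def seasonal_amplitude(monthly):
    """Maximum minus minimum of the mean seasonal cycle."""
    return max(monthly) - min(monthly)


# Invented Jan..Dec AAF cycles for a low- and a high-forcing case.
aaf_low = [3.0, 2.8, 2.5, 2.0, 1.5, 1.1, 1.0, 1.1, 1.6, 2.6, 3.6, 3.3]
aaf_high = [2.9, 2.7, 2.4, 1.9, 1.4, 1.1, 1.0, 1.0, 1.2, 1.8, 2.5, 2.8]

for label, cycle in (("low forcing", aaf_low), ("high forcing", aaf_high)):
    print(label, peak_month(cycle), round(seasonal_amplitude(cycle), 2))
```

Ordering the months from July to June keeps the whole cold season contiguous, so a peak that moves from November to January shows up as a simple rightward shift rather than wrapping around the calendar year.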

Fig. 4: Seasonal evolutions of the Arctic SAT, SIE, turbulent heat flux response, and AAFs in the abrupt CO2 forcing experiments.

a The evolution of Arctic SAT responses for 2× to 8 × CO2. b Similar to (a) but for AAF. c, d Same as (a) but for global and extra-Arctic (90°S–59°N) SAT responses. e, f Same as (a) but for SIE and turbulent heat flux responses. Error bars denote 95% confidence intervals calculated using Student’s t-distribution. Values in a, c and d are referenced to July values. All quantities are computed from the fully-coupled model runs.

To investigate the underlying mechanism, which we expect is closely related to the sea-ice conditions and atmosphere-ocean heat fluxes in the cold season, we perform the same seasonal cycle analysis for Arctic SIE and turbulent (latent plus sensible) heat fluxes (Fig. 4e and f, respectively). It is evident that the largest SIE loss occurs in most runs (2 to 5 × CO2) about one month before the turbulent heat flux maximum (which shifts from November to December) in fully coupled model experiments, indicating that the sea-ice loss tends to produce the atmosphere-ocean heat fluxes and eventually gives rise to the AAF shifts. At higher CO2 forcing (7 and 8 × CO2), the largest SIE loss occurs almost together with the turbulent heat flux maximum, suggesting that the lag between the two contracts at higher forcing. On the other hand, the net surface shortwave fluxes are nearly zero in boreal winter, and thus are unlikely to play an important role (Supplementary Figs. 4a and b). The net surface longwave fluxes do show seasonal shifts during the cold season, but overall they are lagged by about one month in fully coupled model experiments (Supplementary Fig. 4b) compared to the shifts in SAT, SIE, and turbulent heat fluxes; this leads us to conclude that they are unlikely to be a driver of the seasonal shift, and are more likely a consequence, as revealed in the annual-mean values. In the slab ocean model experiments, the longwave fluxes do not show consistent seasonal shifts in the 4× to 6 × CO2 cases (Supplementary Fig. 4d), reflecting the lack of interactive ocean dynamics in regulating the climatological seasonal cycle. Further analysis would be needed to clarify the longwave flux differences between fully coupled and slab ocean model experiments.
Taken in combination, these results suggest that the shift of the AAF peak from November to December and January, in response to increasing CO2 forcing, is mainly governed by the interactive processes involving SAT, SIE, and atmosphere-ocean heat flux components.

The above analysis could be misleading because the proposed mechanism is expected to operate only over the Arctic Ocean, not over Arctic land. To address this issue, we perform the same analysis but with the ocean-only and land-only Arctic domains. Supplementary Figure 5 shows the ocean-only Arctic SAT response, AAF, and turbulent heat flux response: these are similar to, though slightly weaker than, the results in Fig. 4. In contrast, the land-only counterparts do not show consistent seasonal shifts (Supplementary Fig. 6). We further separate the ocean-only Arctic domain into ice-free and ice-covered regions, and find that the seasonal shift only occurs in the latter. These additional analyses not only confirm the dominant role of the coupling between SAT, SIE, and atmosphere-ocean heat flux in shifting the AAF, but also suggest that the land component does not contribute much to the shift.

Because the above analyses are based on very simple abrupt n × CO2 forcing experiments, they provide a very clean benchmark for quantifying the seasonal shift of the AAF with increasing CO2. However, one might wonder whether they are too idealized to be of practical value. To address this concern, we finally turn our attention to more realistic scenarios, as those would be more informative for stakeholders and policymakers. We thus analyze a set of single-model initial-condition large ensembles (SMILEs) from the Multi-Model Large Ensemble Archive (MMLEA)74. All the SMILE simulations considered here were forced with GHG emissions and other forcings following the Representative Concentration Pathway 8.5 (RCP8.5) after the year 2005. The goal of this exercise is to determine whether the AAF also shifts in the scenarios, and if so when.

We first examine the Community Earth System Model version 1 (CESM1)75 SMILE, which was produced with the exact same model configuration used in our fully coupled abrupt n × CO2 forcing experiments, to investigate the AA peak shift from the early to the late 21st century. Figure 5a and b show the Arctic SAT responses and AAFs (referenced to the 1921–1950 mean) averaged over the periods 2011–2040, 2041–2070, and 2071–2100. The peaks of the SAT response and of the AAF show consistent shifts from November to December, with the AAF maximum decreasing from about 3.5 to 3.2, as the 30-year periods approach the end of this century. This behavior is robust, as it can also be seen in the GFDL-CM3 SMILE, which also shows seasonal shifts of Arctic warming response and AAF (Fig. 5c, d). We also find that the CanESM2 shows a clear shift from November to December, and that the MPI-ESM also shows a shift (with larger January than November values for the 2071–2100 period, and smaller January than November values for the other periods; see Supplementary Fig. 7).

Fig. 5: Seasonal evolutions of the Arctic SAT responses and AAF in the CESM1 and GFDL-CM3 SMILEs, and the CMIP6 models.

a The evolution of Arctic SAT responses (referenced to 1921–1950 averages) in CESM1 SMILE. b Similar to (a) but for AAF. c, d Similar to (a), (b) but for GFDL-CM3 SMILE. e, f Similar to (a), (b) but for the multi-model mean of 40 CMIP6 models. In a–d, the solid line represents ensemble means and the color shading indicates one standard deviation across ensemble members. In e, f, the solid line represents multi-model means and the color shading indicates one standard deviation across models. In a, c, e, all values are referenced to July values. The ensemble or model size is given in parentheses in each panel title.

Finally, we have examined the seasonal cycle of the SAT and the AAF in 40 different CMIP6 models (one member for each model) under the Shared Socioeconomic Pathway 5-8.5 (SSP585) scenario; the multi-model mean and cross-model spread are shown in Fig. 5e, f. While the seasonal shift of the AAF peak is not as marked as in our abrupt n × CO2 forcing experiments, or in the SMILEs, due to a relatively large inter-model spread, already identified in CMIP5 studies via local energy loss above sea-ice retreat regions39, a shifting tendency of the peak from the earlier to the later decades is still present. The CMIP6 models under SSP585 forcing therefore exhibit a shifting tendency, as the January value is larger than the November one in the late 21st century but not in the other periods, and suggest that the seasonal shift in AA is very likely to emerge in the upcoming decades, with its precise timing depending on atmospheric CO2 concentrations (and being subject, of course, to model uncertainty). We have also examined the seasonal shift of the Arctic SAT response in each CMIP6 model (Supplementary Fig. 8): more than half of them show a 1- or 2-month shift, and the rest show at least some shifting tendency. Only one model shows reverse shifting (GISS-E2-1-G).

Discussion

We have documented the warming of the Arctic in response to increasing concentrations of CO2, from 2× to 8× its PI value, analyzing a suite of abrupt CO2 forcing experiments from a recent study68. The annual-mean Arctic warming response is found to be closely coupled to the response of sea ice and surface turbulent heat fluxes, and to intensify at high CO2 forcing due to more open water as a result of more sea-ice loss (and resultant sea-ice insulation effect50,55) and stronger heat fluxes entering the atmosphere in a nearly ice-free state. The vertical profile of the Arctic warming response is also found to extend higher into the troposphere as CO2 increases, and thus may be able to produce a clearer Arctic warm-mid-latitude cold linkage, as suggested by a recent study72; however, the question of whether this may be overwhelmed by the overall CO2-induced warming signal in mid-latitudes remains open. The annual AAF, on the other hand, is found to decrease as CO2 forcing increases, consistent with previous studies35,52,55. This weakening is mainly attributed to the relatively smaller sea-ice loss and the accompanying weaker turbulent heat fluxes at higher CO2 levels compared to those at low CO2 levels, which limits the degree of Arctic warming.

Our analysis of the seasonal evolution of Arctic climate in response to abrupt CO2 forcing has yielded several new insights. The seasonal evolution of AAF shows that the peak value occurs in the cold season and shifts from November to December and January as CO2 forcing increases. This seasonal AAF shift in our abrupt n × CO2 experiments stems from the shift of the Arctic SAT response, not the extra-Arctic SAT response, because no clear seasonal shift is seen outside the Arctic. These findings are also seen in the SMILEs and the CMIP6 models under high-CO2 emission scenarios: in those models, the shift of the AAF peak from November to December is projected to emerge in the second half of the 21st century, when CO2 concentrations reach sufficiently high values. If CO2 emissions continue to increase beyond the year 210067, the models project that the AAF peak will progress into January or February. Our findings have substantial implications for Arctic ecosystems and socio-economics. For instance, the timing of new commercial shipping routes6,7,8 and fishery activities in the Arctic9,10 in boreal winter may need to be re-assessed in upcoming decades if the projections from the SMILEs and CMIP6 models we have analyzed prove to be realistic.

The ensemble spread in SMILE simulations represents the effect of internal variability, whereas the spread in the CMIP6 simulations also includes the structural model uncertainties76,77. It is clear from Fig. 5 that the former changes little as time progresses, whereas the latter increases substantially. This means that the structural model uncertainties may become important toward the end of the 21st century, as noted in a recent study76 which also pointed out the increasing importance of scenario uncertainty (see also a recent study78). Therefore, caution is needed in interpreting the multi-model means in future projections, as the forced responses can be masked by structural and scenario uncertainties: this is likely the cause of the less apparent seasonal shift of the AAF peaks in the CMIP6 multi-model mean. Our results have the additional value of examining single-model simulations with large ensembles, which show the clear emergence of the delayed AAF peaks by the end of the 21st century.

We propose that the dominant mechanism behind the shift of the seasonal AAF peak in Northern Hemisphere winter with increasing CO2 involves the concomitant seasonal evolution of Arctic near-surface warming, sea-ice loss, and atmosphere-ocean heat exchange. Indeed, the close relationship between those three components, and their interactions, have been emphasized as the principal mechanism for cold-season amplified Arctic warming by many previous studies35,39,49,54. The lapse-rate and convective cloud feedbacks have also been highlighted by some previous studies as important in producing an amplified Arctic warming in winter22,27; however, more recent studies have demonstrated that they are either linked to ocean-to-atmosphere heat and moisture fluxes35,37,38,40 or controlled by the preceding sea-ice albedo feedback36. The water vapor feedback, on the other hand, is strong primarily during the sea-ice melting season22,27,39. In addition, the potential role of remote impacts from atmospheric and oceanic heat transports is yet to be investigated, although some studies have argued that their effects on near-surface Arctic warming are relatively small (due to the cancellation of dry and wet components)22,39,42,53. The above mechanisms are less likely, we believe, to play a dominant role in shifting the AAF peak in boreal winter as CO2 increases, as supported by our analysis of the surface net shortwave and longwave fluxes. Nevertheless, we plan to perform a complete feedback decomposition analysis35,38,79 in a future study, to further clarify their contributions and quantify their uncertainties. In addition, the sea-ice heat capacity mechanism has also been proposed as an important player in the seasonal shift of Arctic temperature change80. Finally, the CO2 forcing structure has also been suggested to be critical for the amplified Arctic warming28,81, and we also hope to investigate that in the future.

Methods

Fully coupled and slab ocean model abrupt CO2 forcing experiments

In this study, we analyze a set of abrupt CO2 fully coupled and slab ocean model experiments, carried out in a recent study68. In the fully coupled model experiments, the Community Earth System Model version 1 (CESM1), consisting of the Community Atmosphere Model version 5 (CAM5, 30 vertical levels) and Parallel Ocean Program version 2 (POP2, 60 vertical levels) with nominal 1° horizontal resolution in all model components75, was forced by 1× (i.e., PI CO2), 2×, 3×, 4×, 5×, 6×, 7×, and 8 × CO2 forcings, while all other trace gases, ozone concentrations, and aerosols were fixed at PI values. Following the 4 × CO2 protocol for CMIP6, all of our fully coupled model experiments are integrated for 150 years starting from PI initial conditions. The slab ocean model experiments use the same atmospheric component which is, however, coupled to a mixed-layer ocean with prescribed ocean heat transport82, and kept constant at PI annual and monthly values derived from CESM1 simulations, respectively. We also show the results of 60-year slab ocean model experiments with 1×, 2×, 3×, 4×, 5×, and 6 × CO2 forcings. In these runs, we define the response in any variable of interest as the difference between any of the n × CO2 runs and the 1 × CO2 run (i.e., the PI control run), averaged over the last 30 years of each integration. We have verified that our results are not sensitive to the choice of averaging period.

SMILE model output

We analyze monthly SAT data from six SMILEs in the Multi-Model Large Ensemble Archive (MMLEA)74. These SMILE runs were forced with historical and RCP8.5 emission scenarios, and were performed under the CESM1 Large Ensemble Community Project75 (40 members), the Canadian Earth System Model (CanESM2) Large Ensemble83 (50 members), the Commonwealth Scientific and Industrial Research Organisation (CSIRO-Mk3.6.0) Large Ensemble84 (30 members), the Geophysical Fluid Dynamics Laboratory (GFDL-CM3) Large Ensemble85 (20 members), the Geophysical Fluid Dynamics Laboratory Earth System Model (GFDL-ESM2M) Large Ensemble86 (30 members), and the Max Planck Institute Earth System Model (MPI-ESM) Grand Ensemble87 (100 members). We do not show results of GFDL-ESM2M and CSIRO-Mk3.6.0 because their sea-ice extents during the 21st century are unrealistically high76 and, as a consequence, changes in the AAF are rather small. For CESM1, GFDL-CM3, and MPI-ESM, the SAT response is calculated as the SAT difference between the 30-year mean in the historical or RCP8.5 periods and the 30-year mean in the 1921–1950 period (which we take as the reference period). The CanESM2 runs start only in 1950, so we use 1951–1980 as the reference period.

CMIP6 output

Monthly SAT data from 40 Coupled Model Intercomparison Project phase 6 (CMIP6)88 models are used in this study. All models are from the r1i1p1 ensemble with PI, historical and Shared Socioeconomic Pathway 5-8.5 (SSP585, an update of RCP8.5 forcing scenario) runs available. These models include: ACCESS-CM2, ACCESS-ESM1-5, AWI-CM-1-1-MR, BCC-CSM2-MR, CAMS-CSM1-0, CanESM5, CESM2, CESM2-WACCM, CIESM, CMCC-CM2-SR5, CNRM-CM6-1-HR, CNRM-CM6-1, CNRM-ESM2-1, E3SM-1-1, EC-Earth3, EC-Earth3-Veg, FGOALS-f3-L, FGOALS-g3, FIO-ESM-2-0, GFDL-CM4, GFDL-ESM4, GISS-E2-1-G, HadGEM3-GC31-LL, HadGEM3-GC31-MM, IITM-ESM, INM-CM4-8, INM-CM5-0, IPSL-CM6A-LR, KACE-1-0-G, MCM-UA-1-0, MIROC6, MIROC-ES2L, MPI-ESM1-2-HR, MPI-ESM1-2-LR, MRI-ESM2-0, NESM3, NorESM2-LM, NorESM2-MM, TaiESM1, UKESM1-0-LL. For each model, the SAT response is calculated as the SAT difference between the 30-year mean in historical or SSP585 runs and 30-year mean in PI runs. We then take an average of these SAT responses as multi-model mean SAT response shown in Fig. 5. Six models (i.e., CESM2, CNRM-CM6-1, GISS-E2-1-G, MIROC6, MRI-ESM2-0, and TaiESM1) having both 2 × CO2 and 4 × CO2 experiments are analyzed and presented in Fig. 1a.
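
The multi-model mean and the one-standard-deviation spread shown in Fig. 5e, f can be sketched as below. This is a minimal illustration under stated assumptions: the function name, the model labels, and the monthly values are hypothetical, and the sample standard deviation is used because the text does not say whether the sample or population form was applied.

```python
import statistics


def multi_model_stats(responses_by_model):
    """Per-month mean and sample standard deviation across models.

    responses_by_model maps a model name to 12 monthly SAT responses (Jan..Dec).
    """
    names = sorted(responses_by_model)
    if any(len(responses_by_model[name]) != 12 for name in names):
        raise ValueError("every model needs 12 monthly values")
    means, spreads = [], []
    for month in range(12):
        values = [responses_by_model[name][month] for name in names]
        means.append(statistics.mean(values))
        spreads.append(statistics.stdev(values))  # sample std (assumption)
    return means, spreads


# Placeholder monthly SAT responses (K) for three hypothetical models.
responses = {
    "model_A": [10.0, 9.5, 8.0, 6.0, 4.0, 2.5, 2.0, 2.5, 4.0, 7.0, 10.5, 11.0],
    "model_B": [12.0, 11.0, 9.0, 7.0, 5.0, 3.5, 3.0, 3.5, 5.0, 8.0, 11.0, 12.5],
    "model_C": [8.0, 7.5, 6.5, 5.0, 3.0, 1.5, 1.0, 1.5, 3.0, 6.0, 9.5, 9.0],
}

mean_cycle, spread_cycle = multi_model_stats(responses)
print([round(value, 2) for value in mean_cycle])
print([round(value, 2) for value in spread_cycle])
```

Each model enters with equal weight because only one member per model is used, so the spread mixes internal variability with structural differences between models.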

Statistical significance analysis

Statistical significance is indicated with error bars in Figs. 1, 3, and 4, which show 95% confidence intervals calculated from a Student’s t-distribution using a Python statistical package (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html) that considers the probability density function \(f(x,\nu)=\frac{\Gamma((\nu+1)/2)}{\sqrt{\pi\nu}\,\Gamma(\nu/2)}\left(1+x^{2}/\nu\right)^{-(\nu+1)/2}\), where ν is the number of degrees of freedom and Γ is the gamma function. In Fig. 2, the statistical significance of 30-year mean differences is determined using a two-sided Student’s t-test, the null hypothesis being that the difference of the 30-year means is zero. If the null hypothesis can be rejected at the 5% significance level (i.e., the p-value <0.05), we refer to the mean difference as statistically significant.
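
A minimal sketch of both calculations with `scipy.stats` is given below. It rests on assumptions not stated in the text: annual values are treated as independent samples (no adjustment for autocorrelation), the confidence interval is taken on the 30 yearly differences, and the test is the equal-variance two-sample form. The function names and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a Student's t confidence interval for the mean."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    sem = samples.std(ddof=1) / np.sqrt(n)               # standard error of the mean
    t_crit = stats.t.ppf(0.5 + confidence / 2.0, df=n - 1)  # two-sided critical value
    return float(samples.mean()), float(t_crit * sem)


def is_significant(forced, control, alpha=0.05):
    """Two-sided two-sample Student's t-test; H0: the two 30-year means are equal."""
    result = stats.ttest_ind(forced, control)  # equal variances assumed by default
    return bool(result.pvalue < alpha)


# Synthetic last-30-year annual means (K); values are made up for illustration.
rng = np.random.default_rng(0)
control = rng.normal(258.0, 1.0, size=30)
forced = rng.normal(264.0, 1.0, size=30)

mean_diff, half_width = mean_confidence_interval(forced - control)
print(f"response = {mean_diff:.2f} +/- {half_width:.2f} K")
print("significant at 5%:", is_significant(forced, control))
```

With 30 samples the t critical value is only slightly larger than the normal value of 1.96, so the error bars are dominated by the interannual standard deviation rather than by the choice of distribution.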