Observation-based selection of climate models projects Arctic ice-free summers around 2035

Arctic sea ice has been retreating at an accelerating pace over the past decades. Model projections show that the Arctic Ocean could be almost ice free in summer by the middle of this century. However, the uncertainties related to these projections are relatively large. Here we use 33 global climate models from the Coupled Model Intercomparison Project 6 (CMIP6) and select models that best capture the observed Arctic sea-ice area and volume and northward ocean heat transport to refine model projections of Arctic sea ice. This model selection leads to lower Arctic sea-ice area and volume relative to the multi-model mean without model selection and summer ice-free conditions could occur as early as around 2035. These results highlight a potential underestimation of future Arctic sea-ice loss when including all CMIP6 models. The Arctic Ocean could be ice free in summer as early as 2035, according to an analysis of CMIP6 models which selects only the models that best capture observed sea-ice area and volume and northward ocean heat transport

The retreat of Arctic sea ice is one of the most striking consequences of global warming and has strong implications for local and remote climate, biosphere and society 1. The total area of the Arctic Ocean covered by sea ice, the Arctic sea-ice area, has decreased by about 2 million km² (yearly average) in the past 40 years of satellite observations, with more pronounced loss in the summer [1][2][3]. As sea ice has also thinned by 1.5-2 m in the central Arctic since 1980 4,5, the total Arctic sea-ice volume has substantially decreased at a rate of about 3000 km³ per decade since 1979 6,7. The current Arctic sea-ice losses are strongly connected to rising global temperatures [8][9][10], and thus to cumulative greenhouse gas emissions into the atmosphere 3,11. The observed sensitivity of sea-ice changes to cumulative greenhouse gas emissions has been used to provide an estimate of the future Arctic sea-ice area 11. However, this simple linear extrapolation neglects non-linearities in the climate system and ocean-ice-atmosphere interactions and feedbacks [12][13][14], resulting in large short- and long-term deviations from the ongoing negative trend in sea-ice area and volume [15][16][17][18].
In order to include these non-linearities and interactions, climate models can be used to provide more reliable projections of the fate of Arctic sea ice 19,20 . In particular, global climate models coupling the atmosphere, ocean and sea ice are well suited to make such projections [21][22][23][24] . The inclusion of these models in the different Coupled Model Intercomparison Project (CMIP) phases [25][26][27] allows for estimates of Arctic sea-ice area and volume projections in the next decades to centuries. The latest CMIP6 modelling effort 27 will feed into the next Intergovernmental Panel on Climate Change Assessment Report 6 and includes climate model projections that follow different future greenhouse gas emission scenarios using the Shared Socioeconomic Pathways (SSPs) 28 .
Our study is the first, to the best of our knowledge, to make use of a large range of selection criteria to refine future projections of Arctic sea ice coming from the CMIP6 model simulations. We select models that best represent the present Arctic sea-ice state and northward ocean heat transport, as the latter is a major driver of recent sea-ice loss 29,30. We find that the sea-ice loss over this century is larger with the different model selection criteria than with the average over all models without model selection. In particular, we find that summer ice-free Arctic conditions could occur as early as around 2035 in the selection case, compared to 2061 in the no-selection case.

Results and discussion
Projections without model selection. In our study, we focus on both the high-emission SSP5-8.5 and low-emission SSP1-2.6 scenarios, which correspond to a global warming of around 4 and 1 °C, respectively, over this century (2081-2100 relative to 1995-2014) 31. Averaged over 33 CMIP6 models (totalling 166 model members, Supplementary Table 1), the multi-model mean March Arctic sea-ice area and volume are reduced by 45% and 78%, respectively, in 2096-2100, compared to 2015-2019, in the high-emission scenario (Fig. 1a and Supplementary Fig. 1a). In September, the multi-model mean Arctic sea-ice area and volume are decreased by 90% and 98%, respectively, at the end of the century (Fig. 1b and Supplementary Fig. 1b). In addition, the Arctic Ocean becomes almost ice free (sea-ice area lower than 1 million km² 14) in September in 2061 for the multi-model mean, with a large inter-model spread covering the whole twenty-first century (Fig. 1b). Note that first ice-free conditions are reached about 10 years earlier (for the multi-model mean) in a previous CMIP6 study 24 that did not include two models that have a high mean state bias (which we include here) and used only the first member of each model. These Arctic sea-ice area and volume changes are considerably slowed down in the low-emission scenario: the multi-model mean March sea-ice area and volume are reduced by only 8% and 28%, respectively, at the end of the century, while the September sea-ice area is decreased by 49% and thus never reaches almost ice-free conditions during this century, and the September sea-ice volume is lowered by 69% (Supplementary Figs. 2 and 3). However, model projections suffer from large uncertainties related to the chosen greenhouse gas emission scenarios, model physics and internal variability 32,33, such that the spread in future Arctic sea-ice projections is relatively large among climate models (Fig. 1 and Supplementary Fig. 2) 21,24.
In the high-emission scenario, the model spread increases over time for March sea-ice area (Fig. 1a), while it decreases for September sea-ice area as a large number of models lose almost all their sea ice around 2050 (Fig. 1b). In the low-emission scenario, the model spread in March and September sea-ice area does not substantially vary over time as the changes over the twenty-first century are not as large as in the high-emission scenario (Supplementary Fig. 2).
Model selection. Considering the simple average of all available models assumes that all models are equally plausible and that the range of their projections is representative of the uncertainty 34 . As some models better represent one specific aspect of the observed climate (Arctic sea ice in our case), we can argue that these models provide more accurate projections of this specific aspect. A good agreement with observations does not constitute final evidence that models are correct, but a consistently bad agreement with observations clearly indicates some problems of the models 34 . Different approaches have been taken to reduce the model spread in projections of Arctic sea-ice area for a given emission scenario. One such approach consists in giving a weight to each model based on its performance relative to observations during the historical period: models that strongly agree with observations receive more weight than models that poorly agree 31,34,35 . Another approach is to select models based on their historical performance and exclude models that do not satisfy the selection criteria 21,24,36,37 .
In our study, we adopt the latter approach, i.e. model selection, as it allows us to exclude model outliers that show large biases in variables relevant for Arctic sea-ice representation, based on clearly defined selection criteria. We define a series of selection criteria based on the mean, variability and trend in Arctic sea-ice area and volume (see 'Methods' section and Supplementary Table 1). The northward Atlantic and Pacific ocean heat transports at different latitudes are also used as selection criteria as they have been shown to have a large influence on recent sea-ice changes 29,30,[38][39][40]. While the atmosphere drives most Arctic sea-ice loss in the short term (within a decade), the ocean plays a greater role in the long term (beyond 10 years) 41. These criteria are used to retain CMIP6 models closest to observations over the historical period, and allow us to compute the multi-model mean Arctic sea-ice area and volume until the end of the twenty-first century based on the selected models, thus refining model projections of Arctic sea-ice area and volume. To reduce the effect of internal variability, we use the longest possible period for the model-observation comparison and use all existing ensemble members 16,33.
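As an illustration of the three kinds of sea-ice diagnostics named above (mean, variability and trend), the following minimal Python sketch computes them for one time series. It is not the authors' script: the function name `historical_diagnostics`, the use of a least-squares linear trend, and the definition of variability as the standard deviation of the detrended series are assumptions made here for illustration, and the input series is synthetic.

```python
import numpy as np

def historical_diagnostics(years, values):
    """Return the mean, linear trend (per decade) and detrended variability.

    Assumption: variability is taken as the standard deviation of the
    residuals around a least-squares linear fit; the article may define
    it differently.
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    slope, intercept = np.polyfit(years, values, 1)  # degree-1 fit
    residuals = values - (slope * years + intercept)
    return {
        "mean": values.mean(),
        "trend_per_decade": 10.0 * slope,
        "variability": residuals.std(ddof=1),
    }

# Synthetic September sea-ice area (million km^2), for illustration only.
years = np.arange(1979, 2015)
area = 7.0 - 0.08 * (years - 1979)
diag = historical_diagnostics(years, area)
```

Applying the same function to each model and to the reference product would give, for every diagnostic, a model-minus-observation difference that can then be ranked.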
Projections with model selection. When applying our selection criteria, we find that the Arctic sea-ice area and volume generally reach lower values at the end of this century compared to the case without selection, for both emission scenarios (Figs. 2 and 3 and Supplementary Figs. 1 and 3-5), in agreement with a previous study using CMIP5 models 35. This is mainly due to stronger reductions in sea-ice area and volume over the twenty-first century in the selected models, and to a lesser extent due to smaller initial present-day Arctic sea-ice area (Fig. 3 and Supplementary Fig. 5). The stronger reductions in sea-ice area and volume over the twenty-first century partly stem from the fact that some of the selected models have a larger sensitivity to anthropogenic global warming than the non-selected models. Also, the smaller present-day sea-ice area in the selected models is due to the fact that the multi-model mean without selection overestimates the observed sea-ice area (Fig. 3a, b); thus, the selection of models closer to observations reduces this overestimation, explaining the smaller present-day sea-ice area. We checked the robustness of our results by performing a bootstrap analysis in which we randomly selected 10 models and averaged the results over 1000 realizations. As the multi-model mean sea-ice area and volume averaged over the random selection are very close to the multi-model mean over the full sample of models (Fig. 3), our finding that model selection with our criteria leads to smaller sea-ice area and volume is robust.
The loss of sea-ice area and volume over this century is most pronounced when selecting models that best represent the historical Atlantic and Pacific ocean heat transports, in combination or not with the mean sea-ice area and volume (Figs. 2 and 3 and Supplementary Figs. 1 and 3-5). In the high-emission scenario and for all selection criteria including ocean heat transport, March sea-ice area and volume reach less than 7 million km² and less than 5000 km³, respectively, by the end of the twenty-first century, and September sea ice completely disappears (Fig. 3). Selecting models that best represent the observed mean sea-ice area and volume and trend in sea-ice area also provides a stronger reduction in future Arctic sea-ice area and volume compared to no selection, especially in September, but that reduction is less strong than with the ocean heat transport criteria (Figs. 2 and 3 and Supplementary Figs. 1 and 3-5).
The selections based on the variability in sea-ice area and volume and trend in sea-ice volume are not as clear-cut: depending on the month or the scenario, these selection criteria provide smaller or larger reductions in sea-ice area and volume (Figs. 2 and 3 and Supplementary Figs. 1 and 3-5). For sea-ice area and volume variability, this is partly linked to the fact that these quantities are directly related to atmospheric variability 10. In turn, the latter does not highly depend on the total amount of ice. Thus, even a model with too much (or not enough) sea ice can have a realistic atmospheric variability, leading to a realistic sea-ice variability.
An additional model selection criterion that we include in our analysis is the minimum number of ensemble members. We select all models that have at least five members, as this allows us both to keep models that partly account for the uncertainty linked to internal variability and to retain about a third of the total number of models (ten models). We find that the multi-model mean averaged over these models also leads to a stronger sea-ice loss relative to no selection, with no remaining sea ice in September by the end of the century and reductions of 60% and 87% in March sea-ice area and volume, respectively, in the high-emission scenario (Figs. 2 and 3). It is important to point out that seven models (out of ten) selected according to the number of members are also selected under at least one other selection criterion associated with ocean heat transport (Supplementary Table 1). The three remaining models are also selected under criteria related to mean sea-ice area, mean sea-ice volume and/or trend in sea-ice area. Thus, the models selected according to the number of members tend to be the ones that are also selected under criteria presenting a larger sea-ice loss compared to the multi-model mean without selection.
Ice-free Arctic. Our model selection based on historical performance allows us to exclude outliers that have either too much or not enough Arctic sea ice. For the winter months, outliers are mainly located on the high end as most models overestimate the observed sea-ice area (Fig. 3a), while for the summer months outliers are located on either end (Fig. 3b) 24. Thus, our model selection narrows down the spread in model projections of Arctic sea ice by excluding outliers. In particular, the threshold of an ice-free Arctic in summer is reached earlier with model selection than without selection. In the high-emission scenario, five out of six selection criteria that include ocean heat transport provide a first ice-free Arctic in September before 2040 (range of multi-model means: 2032-2039), more than 20 years before the date of ice-free Arctic for the multi-model mean without model selection (i.e. 2061) (Fig. 4). In the selection based on the number of members, the first September ice-free Arctic is reached in 2043. The selection criteria associated with only sea-ice area or volume provide a later date of ice-free Arctic, but still about a decade before the case without selection (between 2047 and 2052), except for the selection based on sea-ice area variability (2066) (Fig. 4). These results are in agreement with a previous study 35 in which a multiple diagnostic ensemble regression is applied to CMIP5 models using model weights. That study also finds an earlier near-disappearance of Arctic sea ice by more than a decade in the high-emission scenario.
It is important to point out that in the high-emission scenario four models do not reach ice-free conditions in September before the end of the twenty-first century and have a relatively large sea-ice area compared to other models (Fig. 1b). This explains the relatively late date of first ice-free Arctic for multi-model means that include these models, i.e. without selection and associated with four selection criteria (mean sea-ice volume, sea-ice area variability, sea-ice volume variability and trend in sea-ice volume), compared to the spread in first ice-free Arctic date of models included in these criteria (Fig. 4). In particular, the year of first ice-free Arctic is 2061 for the multi-model mean averaged over all models, while a majority of models (64%) reach this threshold before 2050. If we remove the four models that do not reach ice-free conditions before 2100, the multi-model mean first ice-free Arctic date is advanced to 2048, and is within the range of selection criteria based on sea-ice area and volume only. As our different model selections generally exclude the four models that do not reach ice-free conditions before 2100, ice-free conditions occur earlier than in the multi-model mean without selection.
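The effect described above, where a few models that retain sea ice delay the first ice-free year of the multi-model mean, can be reproduced with a small Python sketch. The 1 million km² threshold comes from the text; everything else (the function name `first_ice_free_year`, the four linear trajectories and their parameters) is synthetic and hypothetical, chosen only to illustrate the mechanism.

```python
import numpy as np

ICE_FREE_THRESHOLD = 1.0  # million km^2, threshold used in the article

def first_ice_free_year(years, september_area, threshold=ICE_FREE_THRESHOLD):
    """Return the first year with area below the threshold, or None."""
    below = np.asarray(september_area) < threshold
    if not below.any():
        return None
    return int(np.asarray(years)[np.argmax(below)])

years = np.arange(2015, 2101)
t = years - 2015

# Synthetic September sea-ice area (million km^2) for four made-up models:
# three lose their ice quickly, one keeps more than 1 million km^2 until 2100.
models = np.array([
    np.clip(4.0 - 0.11 * t, 0.0, None),
    np.clip(4.5 - 0.11 * t, 0.0, None),
    np.clip(5.0 - 0.11 * t, 0.0, None),
    5.0 - 0.045 * t,
])

per_model = [first_ice_free_year(years, m) for m in models]
all_models = first_ice_free_year(years, models.mean(axis=0))
without_outlier = first_ice_free_year(years, models[:3].mean(axis=0))
```

In this toy case the mean over all four trajectories crosses the threshold later than every one of the three fast-declining trajectories, and dropping the ice-retaining trajectory brings the date forward by several years.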
In the low-emission scenario, the six selection criteria that include ocean heat transport and the five-member selection criterion all provide a first ice-free Arctic in September at least some years before the end of this century, but with a sea-ice area staying close to the 1 million km² threshold until the end of the century (Supplementary Figs. 4b and 6). For five out of six criteria that include ocean heat transport, the date of first ice-free Arctic is delayed by 4-16 years compared to the high-emission scenario, while it is delayed by 33 years for the selection based on the number of members and 36 years for the remaining ocean heat transport criterion. Ice-free conditions are also reached in the last decade of the twenty-first century for the multi-model mean based on the trend in sea-ice area (46 years later than in the high-emission scenario). For the five remaining selection criteria based Fig. 6). In addition, although 45% of the models simulate an ice-free Arctic before 2050, no ice-free Arctic occurs in this century for the multi-model mean averaged over all models, as 12 models do not reach ice-free conditions before 2100 (Supplementary Fig. 6).
Conclusions. The rapid ongoing disintegration of Arctic sea ice can have dramatic consequences on other components of the climate system, such as the atmosphere 42,43 and the ocean 44,45, as well as the biosphere and our societies 1. This calls for improved future projections of Arctic sea ice. In our study, we have shown that these projections can potentially be improved by selecting climate models that best represent the present state in terms of sea-ice area, sea-ice volume and northward ocean heat transport. This model selection reveals that sea-ice area and volume reach lower values at the end of this century compared to the multi-model mean without selection. This arises both from a more rapid reduction in these quantities through this century and from a lower present-day sea-ice area. Using such a model selection, the timing of an almost ice-free Arctic in summer is advanced by up to 29 years in the high-emission scenario, i.e. it could occur as early as around 2035. Thus, these results highlight a potential underestimation of the future Arctic sea-ice loss when including all CMIP6 models.
Another way to reduce uncertainties in future model projections of Arctic sea ice is to extend our methodology to process-based selection criteria. In our study, we found that the selection criteria related to ocean heat transport, known to be a key driver of the recent sea-ice loss, provide the earliest dates of a first ice-free Arctic. Identifying the models that are able to correctly reproduce the mechanisms by which sea ice is affected by ocean heat transport (and other climate drivers) would allow us to provide better future projections of Arctic sea ice.

Methods
CMIP6 model simulations. In our study, we analysed outputs from climate models participating in the CMIP6 effort 27. We extracted the monthly mean sea-ice concentration and sea-ice volume per area (or sea-ice thickness if the sea-ice volume per area was not available) from CMIP6 models that were run over both the historical period (1850-2014) and the future (2015-2100), using SSP1-2.6 (weak greenhouse gas emission scenario) and SSP5-8.5 (strong greenhouse gas emission scenario). We computed the total Arctic sea-ice area as the product of sea-ice concentration and grid-cell area summed over the ocean region north of 40°N. The total Arctic sea-ice volume is the product of sea-ice volume per area (or sea-ice thickness times sea-ice concentration) and grid-cell area summed over the ocean region north of 40°N. Sea-ice area from 32 models is used for the SSP1-2.6 scenario and from 33 models for the SSP5-8.5 scenario (Supplementary Table 1). As some models have run several ensemble members with different initial conditions, we have a total of 166 model simulations for both SSP1-2.6 and SSP5-8.5. Sea-ice volume from 28 models is used for both SSP1-2.6 and SSP5-8.5, including a total of 155 member simulations for SSP1-2.6 and 154 member simulations for SSP5-8.5. In addition, we extracted the monthly mean historical northward ocean heat transport (computed directly by the different models) from 16 models (it was not available for the other models). We computed the ensemble mean sea-ice area, sea-ice volume and ocean heat transport over all members for each individual model. In our study, we always use the ensemble mean for each individual model, as we think it provides the best representation of the response to an increase in greenhouse gas emissions. Supplementary Table 1
Reference products. In order to evaluate CMIP6 models over the historical period, we used different observational and reanalysis datasets.
For sea-ice area, we retrieved sea-ice concentration from the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Ocean Sea Ice Satellite Application Facility (OSI SAF) 46 available since 1979, and we integrated this quantity over the northern hemisphere (north of 40°N). We used sea-ice volume from the Pan-Arctic Ice-Ocean Modelling and Assimilation System (PIOMAS) 6, which is a coupled ocean-sea ice model with capability of assimilating daily sea-ice concentration and sea-surface temperature. This dataset is available since 1979 and shows reasonable agreement with observations 47. Estimates of ocean meridional heat transport (Atlantic and Pacific) come from Trenberth et al. 48 and are deduced from top-of-atmosphere radiation coming from Clouds and the Earth's Radiant Energy System, vertically integrated atmospheric energy divergence from ERA-Interim, and ocean heat content from Ocean Reanalysis System 5. This dataset is available for the period 2000-2016. Finally, we also used Atlantic Ocean heat transport estimates derived from the Rapid Climate Change Meridional Overturning Circulation and Heatflux Array (RAPID-MOCHA) observing system deployed at 26°N (2004-2018) 49, as well as from the Overturning in the Subpolar North Atlantic Programme (OSNAP) observing system deployed around 57°N (2014 and 2016) 50.
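The area and volume integrals described under 'CMIP6 model simulations' (and the analogous integration of the observed concentration north of 40°N) can be sketched in Python as follows. This is a minimal illustration and not the published script: the function names, the convention that land cells are NaN, and the assumption that concentration has already been converted to a fraction between 0 and 1 are all choices made here.

```python
import numpy as np

def total_sea_ice_area(concentration, cell_area_km2, lat, min_lat=40.0):
    """Sum concentration x cell area over ocean cells north of min_lat.

    concentration: fraction (0-1), NaN over land (assumed convention).
    cell_area_km2: grid-cell area in km^2.
    Returns the total area in million km^2.
    """
    mask = (lat >= min_lat) & np.isfinite(concentration)
    return np.sum(concentration[mask] * cell_area_km2[mask]) / 1e6

def total_sea_ice_volume(volume_per_area_m, cell_area_km2, lat, min_lat=40.0):
    """Sum volume per area x cell area over ocean cells north of min_lat.

    volume_per_area_m: metres (equivalently thickness x concentration).
    Returns the total volume in km^3.
    """
    mask = (lat >= min_lat) & np.isfinite(volume_per_area_m)
    return np.sum(volume_per_area_m[mask] * 1e-3 * cell_area_km2[mask])

# Tiny synthetic 2 x 2 grid, for illustration only.
lat = np.array([[30.0, 45.0], [60.0, 80.0]])
conc = np.array([[0.5, 0.0], [np.nan, 1.0]])
vol_per_area = np.array([[1.0, 0.0], [np.nan, 2.0]])
cell_area = np.full((2, 2), 1.0e4)

area = total_sea_ice_area(conc, cell_area, lat)
volume = total_sea_ice_volume(vol_per_area, cell_area, lat)
```

On this toy grid only the 80°N cell contributes: the 30°N cell is excluded by the latitude mask and the NaN cell is treated as land.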
Selection criteria. In order to retain CMIP6 models closest to observations and reanalysis over the historical period, we defined a series of selection criteria based on sea-ice area, sea-ice volume, ocean heat transport and the number of ensemble members. Here is a description of these selection criteria: (1) Mean sea-ice area: we selected the 15 models (about half of the available models) closest to the observed mean sea-ice area averaged over 1979-2014 for both March and September combined. retained only the models that have at least five ensemble members in the projection scenarios (ten models in total). For criteria 1-8, we combined March and September sea-ice quantities (criteria 1-6), Atlantic ocean heat transport at two latitudes (criterion 7) and Atlantic and Pacific ocean heat transports (criterion 8). For these criteria, we looked at the ranking of the two diagnostics (e.g. March and September mean sea-ice area for criterion 1) and we picked the models that are best placed in the two rankings combined until we had selected the desired number of models (e.g. 15 models for criterion 1). The models that are included in each selection criterion are shown in Supplementary Table 1. For each selection criterion, we computed the multi-model mean sea-ice area and volume (Figs. 2 and 3 and Supplementary Figs. 1 and 3-5).
As each model selection reduces the number of models taken into account for computing the associated multi-model mean, we also performed a bootstrap analysis in which we randomly selected ten models and we averaged the results over 1000 realizations (Fig. 3 and Supplementary Fig. 5, in which the error bars show the standard deviations over the 1000 realizations). This allows us to check that the results obtained with our selection criteria are not obtained by chance.
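One possible reading of the combined-ranking step is sketched below in Python. The text does not specify how the two rankings are combined, so the sum of the two ranks used here is an assumption, as are the function name `select_by_combined_rank`, the tie-breaking by input order, and the example error values.

```python
import numpy as np

def select_by_combined_rank(names, error_a, error_b, n_select):
    """Pick the n_select models with the lowest sum of two error ranks.

    error_a, error_b: absolute model-minus-reference differences for the
    two diagnostics (e.g. March and September mean sea-ice area).
    Assumption: rankings are combined by summing ranks.
    """
    error_a = np.asarray(error_a, dtype=float)
    error_b = np.asarray(error_b, dtype=float)
    # argsort of argsort converts values into ranks (0 = closest).
    rank_a = np.argsort(np.argsort(error_a, kind="stable"), kind="stable")
    rank_b = np.argsort(np.argsort(error_b, kind="stable"), kind="stable")
    order = np.argsort(rank_a + rank_b, kind="stable")
    return [names[i] for i in order[:n_select]]

# Hypothetical models and errors (million km^2), for illustration only.
names = ["A", "B", "C", "D"]
march_error = [0.1, 0.5, 0.3, 0.9]
september_error = [0.4, 0.2, 0.1, 0.8]
selected = select_by_combined_rank(names, march_error, september_error, 2)
```

Here model C ranks second in March and first in September, so it is selected ahead of model A, which ranks first in March but third in September.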
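The bootstrap check can be sketched in Python as below. The subset size of ten and the 1000 realizations follow the text; drawing models without replacement, the fixed random seed, the function name `bootstrap_subset_means` and the synthetic per-model values are assumptions made for this illustration.

```python
import numpy as np

def bootstrap_subset_means(values, subset_size=10, n_realizations=1000, seed=0):
    """Mean and standard deviation of multi-model means over random subsets.

    values: one number per model (e.g. end-of-century sea-ice area).
    Assumption: models are drawn without replacement in each realization.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    means = np.empty(n_realizations)
    for i in range(n_realizations):
        picked = rng.choice(values.size, size=subset_size, replace=False)
        means[i] = values[picked].mean()
    return means.mean(), means.std(ddof=1)

# Synthetic values for 33 made-up models (million km^2), illustration only.
model_values = np.linspace(0.0, 8.0, 33)
boot_mean, boot_std = bootstrap_subset_means(model_values)
full_mean = model_values.mean()
```

If the average over random ten-model subsets stays close to the full multi-model mean, as it does in this toy case, a criterion-based selection that departs from the full mean by much more than the bootstrap standard deviation is unlikely to be a chance result.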

Data availability
All the CMIP6 model data 27 used in this study (historical and scenario runs) can be accessed through the ESGF nodes: https://esgf-node.llnl.gov/search/cmip6. A list of the model simulations used in this study and associated references is provided in Supplementary Tables 2-4. The observed sea-ice concentration from OSI SAF 46 can be accessed through the EUMETSAT repository: https://doi.org/10.15770/EUM_SAF_OSI_0008. The PIOMAS 6 sea-ice volume data can be accessed via the Polar Science Center of the University of Washington: http://psc.apl.uw.edu/research/projects/arctic-sea-ice-volume-anomaly. The ocean meridional heat transport estimates from Trenberth et al. 48 are located here: https://doi.org/10.5065/9v3y-fn61. The RAPID-MOCHA 49 ocean heat transport at 26.5°N can be retrieved from the Rosenstiel School Ocean Technology Lab: https://mocha.rsmas.miami.edu/mocha/results/index.html. The OSNAP 50 ocean heat transport data can be accessed here: https://www.o-snap.org/observations/data.

Code availability
The Python scripts to produce the figures of this article are available on Zenodo: https://zenodo.org/record/4912115.