Robust but weak winter atmospheric circulation response to future Arctic sea ice loss

The possibility that Arctic sea ice loss weakens mid-latitude westerlies, promoting more severe cold winters, has sparked more than a decade of scientific debate, with apparent support from observations but inconclusive modelling evidence. Here we show that sixteen models contributing to the Polar Amplification Model Intercomparison Project simulate a weakening of mid-latitude westerlies in response to projected Arctic sea ice loss. We develop an emergent constraint based on eddy feedback, which is 1.2 to 3 times too weak in the models, suggesting that the real-world weakening lies towards the higher end of the model simulations. Still, the modelled response to Arctic sea ice loss is weak: the North Atlantic Oscillation response is similar in magnitude and offsets the projected response to increased greenhouse gases, but would only account for around 10% of variations in individual years. We further find that relationships between Arctic sea ice and atmospheric circulation have weakened recently in observations and are no longer inconsistent with those in models.

1. The authors use a 3 stage mechanism to explain the weakening of the upward EP flux, and to explain the "positive eddy feedback" which underlies their emergent constraint. However, this mechanism seems to confuse many aspects of the wave-mean flow problem of jet shifts. For example, in stage 1 the authors state: "Reduced wind shear reduces baroclinic eddy formation, weakening the storm track and reducing F_p at the surface..." But wind shear and baroclinic eddy formation are *both* determined by meridional temperature gradients: near-surface temperature gradients are responsible for the reversal of the PV gradient that drives baroclinic eddy formation. The QG dynamics which the authors appeal to here represent the 2nd-order dynamics on top of the 1st-order geostrophically-balanced flow (i.e., thermal wind). Similarly, the second step of the mechanism describes a circulation that is consistent with the jet shift, but cannot be used to establish causality. I also note that the authors further undermine their conceptual picture with some notation issues, e.g. taking the divergence of scalar quantities (see minor comments below).
I suggest either cutting this section, or scaling the discussion back and simply noting that the near-surface temperature response drives an equatorward jet shift by both shifting the temperature gradient and shifting the baroclinic region equatorwards. Filling in the details can be left to future studies.
Having said all that, changes in eddy momentum fluxes do reinforce ("feedback" on) jet shifts, but this is true of any jet shift. So, if I've understood correctly, the authors' emergent constraint claims that this set of models will underestimate jet shifts in response to any forcing, not just sea ice loss.
2. Moving on to the emergent constraint itself, I am still not sure how exactly it is calculated. It seems as though the authors correlate the EMF divergence in the box shown in Figure 6a with the zonal-mean zonal wind in this box, so that the metric M is the r^2 value of the correlations for DJF. The correlations use the highest frequency data available, without any averaging. Is this correct? Are the reanalysis data de-trended? It seems like the PAMIP experiments are time-slices without repeating climatology (no trends). Could this affect the comparison with the reanalysis data (which may have some underlying trends)? I am also concerned that the relationships shown in Figure 7 mostly come from one "bad" model: E3SMv1. As discussed e.g., by Brient and Schneider (2016), "bad" outlier models can exert a strong leverage on emergent constraints, yet we should place less weight on them, since they are presumably less realistic. I suggest the authors either use a methodology which damps the impact of these models (like the Brient and Schneider approach) or investigate how their results change when this model is not included in the analysis.
3. The authors seem to undermine their own emergent constraint at L264-6: "For example, ZWRI is correlated with the background SPV (r=0.50, p=0.03, not shown) but in this case the observations are near the middle of the model range so that ER is close to the simple ensemble mean". So one constraint suggests the observations are outside of the model range, and the other puts them in the middle of the model range. Which should we trust? Or should we have low confidence in both constraints?
Minor comments: 1. In equation 4, and the discussion of the EP fluxes, there seems to be some confusion regarding vector and scalar quantities. In the middle of equation 5, the authors are taking gradients of scalar variables, not the divergence of vectors ($\nabla F_\phi$ not $\nabla\cdot F_\phi$). Also I suggest just writing out the $\frac{\partial F_p}{\partial p}$, otherwise it's confusing whether the gradient operator refers to a horizontal gradient, a vertical gradient or a 3D gradient.
2. At L580 it says "cite" where I'm guessing a citation is meant to go. Also at L644.

3. Not sure what is meant by the sentence at L642-3. In the ensemble mean, $\bar{y} = \beta \bar{x}$.
4. Figure 2: I suggest using a single colorbar, along the bottom or right side of the figure, to save space.
5. Figure 3c: I'm confused, the contours show F_p, and the arrows show (F_\phi, F_p). Is this correct? I also don't understand how the normalization was done from the caption. Could you either write it out or use equations?
6. Figure 4d: what do the solid/dashed gray contours represent?
7. Figure 9: should the caption say the BK curves are red, not green?
Reviewer #2: Remarks to the Author: Accept subject to minor revisions. The manuscript is thorough, showing the atmospheric dynamic complexity due to loss of sea ice. I was pleased with the discussion of real world physics in addition to just models.
Line 294: I do not understand the sentence "since models are able to predict the real world better than themselves".
Line 317: I disagree: "are thus unlikely to drive large impacts in individual winters." It is still possible to have short impact events of one to several weeks in any given year. A seasonal average may still be small.
Reviewer #3: Remarks to the Author: Using 16 different atmospheric models with more than 3000 ensemble members, this study investigates the transient response of northern hemisphere winter westerlies to future Arctic sea ice loss. Consistent with previous modeling studies, this study finds that the Arctic sea ice loss causes a robust equatorward shift of mid-latitude westerlies: a significant weakening around 50-70N and a slight strengthening around 30-40N. A key finding is that the inter-model differences in zonal-mean wind responses (ZWRI) can be explained by eddy feedback parameter and the eddy feedback parameter of reanalysis data is about 1.2~3 times larger than in climate models. I believe a thorough and comprehensive analysis of multi-model simulations can warrant publication at Nature Communications. In particular, this study 1) comprehensively quantifies the sensitivity to sea ice loss by utilizing 16 models with more than 3000 ensembles, 2) provides insight into the sensitivity of zonal-mean wind response to sea ice loss by introducing a zonal wind response index (ZWRI) that can explain the meridional circulation anomalies, and 3) is partly successful in providing an emergent constraint by calculating eddy feedback parameter both for climate models and for reanalysis data.
Specific comments: It took me considerable time, effort and patience to read through this paper. This is not only because TEM dynamics are difficult to understand but also because this paper tries to deliver too much information.
1) There are two key messages and these two are not closely related to each other. To me, quantifying the multi-model ZWRI by eddy feedback parameter is a key message of this study. However, the abstract emphasizes that the modelled response to Arctic sea ice loss is weak and the relationships between Arctic sea ice and atmospheric circulation have weakened recently.
2) I suggest deleting Figure 9, which is not closely related to previous figures. I understand that the authors want to deliver as much information as possible to educate readers, but please reconsider.
3) Abstract: "the North Atlantic Oscillation response is similar in magnitude and offsets the projected response to increased greenhouse gases, but would only account for around 10% of variations in individual years" Is it really necessary to include this sentence in the abstract? A previous modelling study pointed out that the equatorward shift of NH westerlies driven by future Arctic sea ice loss is opposed by the response to low-latitude surface warming (see Figure 5 of Blackport and Kushner 2017). They also noted in the abstract that "internal variability can easily contaminate the estimates..." Small/large, strong/weak are subjective words and the time mean response of westerlies to future Arctic sea ice loss is not necessarily small compared to the westerly response to the future tropical SST warming. Blackport, R., and P. J. Kushner, 2017: Isolating the Atmospheric Circulation Response to Arctic Sea Ice Loss in the Coupled Climate System. J. Climate, 30, 2163-2185.
4) I suggest deleting Figure 4 or move this figure to Supplementary information. I really cannot understand why the October TEM circulation and EP flux anomalies are special and can be interpreted as physical mechanisms. It is well known that summer sea ice loss and the associated increase in Arctic ocean heat content are accompanied by seasonally persistent surface warming. I guess the authors are careful about interpreting the winter surface warming because the winter Arctic warming in observation is not only driven by summer sea ice loss but also by winter circulation anomalies? I think the authors do not need to worry about this issue because this PAMIP experiment is designed to isolate the impact of Arctic sea ice loss from other factors.
5) Lines 629-631: Please explain the difference between eddy driving and eddy feedback.
6) Lines 642-643: I am not sure whether this statement is correct or not. Please consult with a statistician.
7) Line 644: "regressioncite" seems to be a typo.
8) Captions in Figures 5 and 6: Which season? Are they about DJF average?
9) Line 574: "assess the effect of coupling": Does coupling imply ocean coupling?
10) Please write down the definitions of U_bar and T_bar shown in Figures 3, 4, 5, 6 more in detail. It seems that T_bar is zonal-mean temperature anomalies and U_bar is zonal-mean zonal wind. How about changing T_bar and U_bar to [T] and [U]?
In this paper, Smith et al examine the winter atmospheric circulation response to Arctic sea ice loss using results from 16 models contributing to the Polar Amplification Model Intercomparison Project. The authors convincingly show that the ensemble-mean response to winter sea-ice loss under the global-mean warming of 2C is an equatorward shift of the mid-latitude jet in the Northern Hemisphere, associated with cooling over mid-latitudes. The authors also develop an emergent constraint for the atmospheric response to sea-ice loss based on an "eddy feedback". I am very skeptical of the logic behind this proposed emergent constraint, as well as of its robustness, so it is difficult for me to recommend publication of the manuscript at this stage. My concerns about the feedback are listed below, as well as some minor comments.
1. The authors use a 3 stage mechanism to explain the weakening of the upward EP flux, and to explain the "positive eddy feedback" which underlies their emergent constraint. However, this mechanism seems to confuse many aspects of the wave-mean flow problem of jet shifts. For example, in stage 1 the authors state: "Reduced wind shear reduces baroclinic eddy formation, weakening the storm track and reducing F_p at the surface..." But wind shear and baroclinic eddy formation are *both* determined by meridional temperature gradients: near-surface temperature gradients are responsible for the reversal of the PV gradient that drives baroclinic eddy formation.
We agree, which is why we were careful to include both a reduction in wind shear and a reduction in baroclinic eddy formation and F_p in stage 1. To make this clearer we have renamed stage 1 to be "Reduced zonal wind shear and eddy formation".
The QG dynamics which the authors appeal to here represent the 2nd-order dynamics on top of the 1st-order geostrophically-balanced flow (i.e., thermal wind).
Again, we agree, which is why stage 1 is predominantly governed by the thermal wind relation whereas stage 2 requires a consideration of the 2nd-order dynamics. Of course, both thermal wind and the QG balance must be satisfied together, and the meridional circulation is the adjustment needed to satisfy thermal wind and the changes in eddy fluxes, which both occur in stage 1 as mentioned above. We have amended the text to make it clearer that the resulting flow maintains the thermal wind balance and is consistent with changes in eddy activity.
Similarly, the second step of the mechanism describes a circulation that is consistent with the jet shift, but cannot be used to establish causality.
We agree that causality cannot be established unequivocally. We now state this and have amended the text to avoid claiming causality. However, the processes that we highlight are all consequences of an imposed reduction in surface meridional temperature gradient and highlight a key role for eddies. Furthermore, the monthly evolution provides some information about how the different processes develop. To make this clearer we have expanded Fig 4 to show November in addition to October.
I also note that the authors further undermine their conceptual picture with some notation issues, e.g. taking the divergence of scalar quantities (see minor comments below).
Apologies, this has now been corrected (see minor comments below). Thanks for spotting this.
I suggest either cutting this section, or scaling the discussion back and simply noting that the near-surface temperature response drives an equatorward jet shift by both shifting the temperature gradient and shifting the baroclinic region equatorwards. Filling in the details can be left to future studies.
We believe that a detailed understanding of the physical processes is essential for developing a credible emergent constraint. We are therefore loath to cut this analysis, which shows an important role for eddies and hence underpins our proposed constraint. We are conscious, however, that some of the processes are not particularly easy to understand, and we have amended the text to clarify our arguments. We believe this section is now much clearer, but suggestions for further improvements would of course be welcome.
Having said all that, changes in eddy momentum fluxes do reinforce ("feedback" on) jet shifts, but this is true of any jet shift. So, if I've understood correctly, the authors' emergent constraint claims that this set of models will underestimate jet shifts in response to any forcing, not just sea ice loss.
Absolutely. We already link to the "signal-to-noise paradox", which shows that models underestimate the magnitude of predicted atmospheric circulation changes, especially in the North Atlantic, and we hope that our new results will motivate further studies. We have also added a comment that our emergent constraint could apply to other forcings, to motivate further work to understand the model spread in future climate projections.
2. Moving on to the emergent constraint itself, I am still not sure how exactly it is calculated. It seems as though the authors correlate the EMF divergence in the box shown in Figure 6a with the zonal-mean zonal wind in this box, so that the metric M is the r^2 value of the correlations for DJF.

Yes, that is correct, though we have now further clarified that M is the local r^2 between zonal mean wind and EMF divergence averaged over the box in Fig 6a, to avoid any confusion that we may have averaged the zonal winds and EMF divergences over the box before computing the correlations.
The correlations use the highest frequency data available, without any averaging. Is this correct?
No, we have clarified that the regressions are based on seasonal mean data. This avoids the need to analyse large volumes of high frequency data and allows models for which high frequency data are not available to be included (daily velocities are not top priority in the PAMIP data request). We find strong correlations using seasonal mean data (Fig. 6a) along with a large spread across the models (Fig. 6b) and statistically significant differences in our eddy feedback parameter (Fig. 7). Furthermore, our measure of eddy feedback explains some of the model spread in ZWRI, consistent with the important role of eddies in the physical mechanism. Hence, we argue that eddy feedback may be assessed with seasonal mean data and hope that our simple measure will facilitate future studies of its role in other contexts.
Are the reanalysis data de-trended? It seems like the PAMIP experiments are time-slices without repeating climatology (no trends). Could this affect the comparison with the reanalysis data (which may have some underlying trends)?

This is a good question and in fact we had already checked that detrending makes virtually no difference to the eddy feedback estimation for the reanalyses. We have now included a statement to make this clear. Thanks for highlighting this.
I am also concerned that the relationships shown in Figure 7 mostly come from one "bad" model: E3SMv1. As discussed e.g., by Brient and Schneider (2016), "bad" outlier models can exert a strong leverage on emergent constraints, yet we should place less weight on them, since they are presumably less realistic. I suggest the authors either use a methodology which damps the impact of these models (like the Brient and Schneider approach) or investigate how their results change when this model is not included in the analysis. Reference: Brient, F. and T. Schneider, 2016: Constraints on climate sensitivity from space-based measurements of low-cloud reflection. Journal of Climate, 29, 5821-5835.
We tested this by removing each model in turn and repeating the regression, as suggested. The result is most sensitive to removing E3SMv1 and CanESM5, which increases the p values to 0.16 and 0.07 respectively for ZWRI, and to 0.06 and 0.10 respectively for SPV. Thus, at least one of the ZWRI and SPV relationships remains significant at p ≤ 0.07 whichever outlying model is removed. We have included these results (and the reference) in the manuscript.
3. The authors seem to undermine their own emergent constraint at L264-6: "For example, ZWRI is correlated with the background SPV (r=0.50, p=0.03, not shown) but in this case the observations are near the middle of the model range so that ER is close to the simple ensemble mean". So one constraint suggests the observations are outside of the model range, and the other puts them in the middle of the model range. Which should we trust? Or should we have low confidence in both constraints?

Apologies for not being clear here. If we had only looked at the relationship between ZWRI and the polar vortex strength we would have concluded that the real-world response was near the middle of the models. But this would be incorrect, given the existence of the constraint based on eddy feedback that is much more strongly related to the physical processes and places the real world towards the upper end of the models. We have now amended the text to make this clearer.
Minor comments: 1. In equation 4, and the discussion of the EP fluxes, there seems to be some confusion regarding vector and scalar quantities. In the middle of equation 5, the authors are taking gradients of scalar variables, not the divergence of vectors ($\nabla F_\phi$ not $\nabla\cdot F_\phi$). Also I suggest just writing out the $\frac{\partial F_p}{\partial p}$, otherwise it's confusing whether the gradient operator refers to a horizontal gradient, a vertical gradient or a 3D gradient.
Apologies, this has now been corrected. Thanks for spotting this. We decided to use $\nabla_\phi F_\phi$ etc. rather than write out $\frac{1}{a\cos\phi}\frac{\partial (F_\phi \cos\phi)}{\partial \phi}$.
2. At L580 it says "cite" where I'm guessing a citation is meant to go. Also at L644.
Corrected, thanks for spotting these.
3. Not sure what is meant by the sentence at L642-3. In the ensemble mean, $\bar{y} = \beta \bar{x}$.
We have clarified that the simple multi-model ensemble mean is inappropriate if the noise is not independent of x, as stated by Bracegirdle and Stephenson (2012).
4. Figure 2: I suggest using a single colorbar, along the bottom or right side of the figure, to save space.
Agreed, the figure has been replaced. Thanks for this suggestion.
5. Figure 3c: I'm confused, the contours show F_p, and the arrows show (F_\phi, F_p). Is this correct? I also don't understand how the normalization was done from the caption. Could you either write it out or use equations?
Yes, that is correct. We have adjusted the caption to make it clearer. We have also clarified the standardisation by adding a description in Methods.
6. Figure 4d: what do the solid/dashed gray contours represent?
The contours are unnecessary and have now been removed. Thanks for spotting this.
7. Figure 9: should the caption say the BK curves are red, not green?
Yes, thanks for spotting this; now corrected.

Many thanks for your comments and suggestions. Please see our replies in blue below.
Reviewer #2 (Remarks to the Author): Accept subject to minor revisions. The manuscript is thorough, showing the atmospheric dynamic complexity due to loss of sea ice. I was pleased with the discussion of real world physics in addition to just models.
Line 294: I do not understand the sentence "since models are able to predict the real world better than themselves".
We have clarified this: "This has been referred to as the "signal-to-noise paradox" (Scaife and Smith 2018) since models are unexpectedly able to predict the real world better than they can predict one of their own ensemble members".
Line 317: I disagree: "are thus unlikely to drive large impacts in individual winters." It is still possible to have short impact events of one to several weeks in any given year. A seasonal average may still be small.
Thanks, we have clarified that large seasonal mean impacts are unlikely.

Many thanks for your comments and suggestions. Please see our replies in blue below.
Reviewer #3 (Remarks to the Author): Using 16 different atmospheric models with more than 3000 ensemble members, this study investigates the transient response of northern hemisphere winter westerlies to future Arctic sea ice loss. Consistent with previous modeling studies, this study finds that the Arctic sea ice loss causes a robust equatorward shift of mid-latitude westerlies: a significant weakening around 50-70N and a slight strengthening around 30-40N. A key finding is that the inter-model differences in zonal-mean wind responses (ZWRI) can be explained by eddy feedback parameter and the eddy feedback parameter of reanalysis data is about 1.2~3 times larger than in climate models. I believe a thorough and comprehensive analysis of multi-model simulations can warrant publication at Nature Communications. In particular, this study 1) comprehensively quantifies the sensitivity to sea ice loss by utilizing 16 models with more than 3000 ensembles, 2) provides insight into the sensitivity of zonal-mean wind response to sea ice loss by introducing a zonal wind response index (ZWRI) that can explain the meridional circulation anomalies, and 3) is partly successful in providing an emergent constraint by calculating eddy feedback parameter both for climate models and for reanalysis data.
Specific comments: It took me considerable time, effort and patience to read through this paper. This is not only because TEM dynamics are difficult to understand but also because this paper tries to deliver too much information.
Many thanks for your patience and perseverance. We accept that the paper contains a lot of information, but we believe that a detailed description of the physical processes is needed to justify the constraint, and significantly adds to previous studies. Following your comments, we have clarified some of the text, broken down some of the paragraphs into more digestible pieces, and added text to clarify our arguments. We hope these improvements make the paper much easier to read.
1) There are two key messages and these two are not closely related to each other. To me, quantifying the multi-model ZWRI by eddy feedback parameter is a key message of this study. However, the abstract emphasizes that the modelled response to Arctic sea ice loss is weak and the relationships between Arctic sea ice and atmospheric circulation have weakened recently.
We believe that both messages are important for addressing the perceived disagreement between observations and models (highlighted recently in ref 5): we show both that there is a robust response in models, and that it is consistent with observations when model biases and the latest observational data are taken into account. We also highlight that the response is weak compared to interannual variability, which partly explains why it has been so difficult to diagnose in previous observational and modelling studies.
2) I suggest deleting Figure 9, which is not closely related to previous figures. I understand that the authors want to deliver as much information as possible to educate readers, but please reconsider.
Whether models and observations disagree is a key part of the debate, as highlighted in ref 5. We considered removing Fig 9 as
3) Abstract: "the North Atlantic Oscillation response is similar in magnitude and offsets the projected response to increased greenhouse gases, but would only account for around 10% of variations in individual years" Is it really necessary to include this sentence in the abstract? A previous modelling study pointed out that the equatorward shift of NH westerlies driven by future Arctic sea ice loss is opposed by the response to low-latitude surface warming (see Figure 5 of Blackport and Kushner 2017). They also noted in the abstract that "internal variability can easily contaminate the estimates..." Small/large, strong/weak are subjective words and the time mean response of westerlies to future Arctic sea ice loss is not necessarily small compared to the westerly response to the future tropical SST warming. Blackport, R., and P. J. Kushner, 2017: Isolating the Atmospheric Circulation Response to Arctic Sea Ice Loss in the Coupled Climate System. J. Climate, 30, 2163-2185.
4) I suggest deleting Figure 4 or move this figure to Supplementary information. I really cannot understand why the October TEM circulation and EP flux anomalies are special and can be interpreted as physical mechanisms. It is well known that summer sea ice loss and the associated increase in Arctic ocean heat content are accompanied by seasonally persistent surface warming. I guess the authors are careful about interpreting the winter surface warming because the winter Arctic warming in observation is not only driven by summer sea ice loss but also by winter circulation anomalies? I think the authors do not need to worry about this issue because this PAMIP experiment is designed to isolate the impact of Arctic sea ice loss from other factors.
Figure 4 is key for understanding the physical mechanism and hence for developing the emergent constraint.
Many studies have pointed to an increase in upward wave flux that reduces the polar vortex, but the reason for this increase has not been understood before, and it is counterintuitive given the expected weakening of the storm tracks which are the source of the waves. It is possible to increase upward wave activity directly by increasing the zonal asymmetries in the sea ice region, and there is some evidence for this (positive values near the surface at latitudes greater than 80N in Fig 3c). However, by far the strongest increase in upward wave flux occurs around 40-50N, and this is also the pathway into the stratosphere which has been highlighted to begin in October in other studies. Figure 4 shows that the response evolves from the expected reduction in upward wave activity in October (consistent with reduced storm tracks) to the DJF equatorward shift, and that this equatorward shift is consistent with an eddy-driven meridional circulation, highlighting the potential role of eddy feedback that is used in the emergent constraint.
Many thanks for your comment. In response we have strengthened our discussion, and hope this has improved the paper.
5) Lines 629-631: Please explain the difference between eddy driving and eddy feedback.
We have clarified the difference.
6) Lines 642-643: I am not sure whether this statement is correct or not. Please consult with a statistician.
This is stated by Bracegirdle and Stephenson (2012); reference now added.
7) Line 644: "regressioncite" seems to be a typo.
Corrected, thanks for spotting this.
8) Captions in Figures 5 and 6: Which season? Are they about DJF average?
Yes, now clarified in the captions. Thanks for pointing this out.
9) Line 574: "assess the effect of coupling": Does coupling imply ocean coupling?
We have clarified that this refers to ocean-atmosphere coupling. Thanks for pointing this out.
10) Please write down the definitions of U_bar and T_bar shown in Figures 3, 4, 5, 6 more in detail. It seems that T_bar is zonal-mean temperature anomalies and U_bar is zonal-mean zonal wind. How about changing T_bar and U_bar to [T] and [U]?
Yes, $\bar{U}$ and $\bar{T}$ are simply the zonal means. We have clarified this (line 95) and prefer to keep this commonly used notation.