Increasing occurrence of cold and warm extremes during the recent global warming slowdown

The recent levelling of global mean temperatures after the late 1990s, the so-called global warming hiatus or slowdown, ignited a surge of scientific interest into natural global mean surface temperature variability, observed temperature biases, and climate communication, but many questions remain about how these findings relate to variations in more societally relevant temperature extremes. Here we show that both summertime warm and wintertime cold extreme occurrences increased over land during the so-called hiatus period, and that these increases occurred for distinct reasons. The increase in cold extremes is associated with an atmospheric circulation pattern resembling the warm Arctic-cold continents pattern, whereas the increase in warm extremes is tied to a pattern of sea surface temperatures resembling the Atlantic Multidecadal Oscillation. These findings indicate that large-scale factors responsible for the most societally relevant temperature variations over continents are distinct from those of global mean surface temperature.

I cannot recommend this manuscript for publication because of the following main concerns.
1. The current study does not add any fundamentally new insight to the current understanding of the regional nature of temperature change over Northern Hemisphere land during the hiatus period.
1a. Specifically, the role of the NAO during the hiatus period has been described before (e.g. Trenberth et al. 2014, Nature Climate Change). The Trenberth et al. study highlights the role of the NAO during the cold winters of 2009-2010, 2010-2011 and 2012-2013 at the end of the hiatus period, which contribute to a cooling trend over Eurasia and the eastern United States. The Trenberth et al. study also suggests a mechanism of tropical Pacific origin that affects NAO/cold Eurasian temperatures and Arctic warming. Furthermore, the Trenberth et al. study also highlights the role of the PDO, specifically in affecting the cooling signal over the western United States. Taking into account that the outcome of the regression model applied in the submitted study is very similar for seasonal means and extreme occurrences, it is not clear what fundamentally new findings are provided.
1b. The role of the AMO in the hiatus has been discussed before (see Medhaug et al. 2017 and references therein). The authors of the submitted study do not reconcile their findings with the findings in the literature on this topic. Furthermore, the described results of the CESM1 experiments showing trends for the period 1979-2012 do not explain the increase in hot extremes during the period 2002-2014. How has the AMO changed during that period, superposed on the anthropogenic signal?
2. Definition of the hiatus period: The authors of the submitted study define the hiatus period as 2002-2014. This period differs strongly from the definitions in the literature, which use the period 1998-2012 or 1998-2013 (e.g., Medhaug et al. 2017; Trenberth et al. 2014, Nature Climate Change). No reasoning for using the period 2002-2014 is given.
3. Use of the term "Warm Arctic-Cold Continent" (WACC) circulation pattern: The authors do not provide any evidence for the occurrence and importance of such a prescriptive term for the circulation pattern instead of referring to the NAO, which resembles their identified 500 hPa extremes circulation pattern (as the authors point out themselves). The NAO, which is a well-defined mode of variability, varies strongly on decadal to multi-decadal time scales, with well-established effects on Northern Hemisphere continental climate. Furthermore, it becomes obvious that recent negative NAO events that occurred during the second half of the studied period contributed to their result. Finally, the physical mechanism of such a WACC pattern that is different from the NAO itself is not clear. Other recent studies point to the impact of a different circulation phenomenon, namely the strengthening of the Siberian high, as a main driver of the Eurasian cooling trend over the 1990 to 2014 period when the so-called WACC-like temperature trend pattern occurred (Sun et al. 2016, GRL).

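As a whole, investigating trends in temperature extreme occurrences averaged over Northern Hemisphere land instead of seasonal mean temperature trends does not provide any additional insight into regional land temperature changes during the hiatus period, especially over such a very short record. Specifically, these short time series (13 years) of extreme occurrence are very noisy and regression lines to illustrate trends are strongly affected by single years.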
The manuscript is very well and clearly written, and addresses a scientific gap, as changes in temperature extremes during the 'hiatus' period have previously been reported but not explained. I think this is a very nice and comprehensive study to explain these potential drivers of decadal-scale changes in temperature extremes. A few points, however, should be clarified:
- The annual global land average temperatures also continued to increase during the 'hiatus' period (see e.g. Seneviratne et al 2014, doi:10.1038/nclimate2145). So how do these increases in summer hot extreme temperatures differ from the increasing average temperatures over land?
- Cohen et al. 2012 (doi: 10.1029/2011GL050582) documented seasonal asymmetries in temperature change (incl. summer vs winter). The authors included a brief discussion around differences between seasonal means and seasonal extremes, but I think it could be clarified to what extent these 'extreme' changes relate to or possibly exceed the mean changes. Given that the authors use measures of relatively moderate extremes (that occur on average on 10% of the days), I expect the results for these extremes and the seasonal averages to be reasonably similar. However, there may be larger differences when looking at some more extreme measures of hot/cold extremes.
- As a conclusion, the authors may want to highlight that GMST probably is not a particularly useful measure for relevant climate states or of climate change in general, as it may average out different kinds of events and seasonal and regional characteristics.
A few specific comments:
Line 7: it should be specified if the increase is e.g. in frequency or intensity or associated temperature. In particular for cold extremes the term "increase" can be ambiguous: increasing frequency would be consistent with cooler conditions, increase in associated temperatures with warming.
seasons that improve the skill of the linear statistical model when included in the regression pool: in winter, a pattern of Z500 that resembles the negative NAO/NAM, and in summer, an SST pattern that resembles the AMO. The observational evidence is complemented by similar analyses of extreme temperature variability in a long coupled ocean-atmosphere control simulation. The increased trend in cold extremes is shown to be an extreme realization when compared to the internal variability of the model. A potential link with Arctic sea ice loss is evoked that could explain its emergence in recent years in observations. In contrast, the summer warm extremes trend is in the range of internal climate variability. Sensitivity experiments performed with the CESM climate model support the role of the SST pattern in driving the increase in summer warm extremes in recent decades.
The paper is clear and well-written, and presents some robust statistical analyses and well-designed numerical experiments. It provides a nice overview of the trends in temperature extremes over land in recent decades, and a convincing illustration of how temperature extremes can rise in a period of global warming slowdown, especially due to atmospheric internal variability. I think this paper will be a very valuable contribution to the field, and I will fully support its publication provided that the authors verify a couple of points as listed below. I request a major revision of the paper, although the changes seem rather minor to me; I would simply like to see the authors' answers before I fully support publication.

Main comments
-I wonder about the robustness of the Z500/SST patterns you identify as precursors of the temperature extremes. They are defined from regressing the temperature indices on the gridded Z500/SST anomalies over 1950-2014, but is this strongly dependent on the time period that is chosen? If you used a training period to identify the patterns with the partial regression analysis (for example the first half of the record, 1950-1980), then applied the linear model to predict the temperature indices over the latter period, would the results remain robust? I guess the full period is needed to identify the patterns, since a large fraction of the skill comes from their decadal/multidecadal variability rather than interannual variability (negative-NAO trend in the 2000s, and AMO cycle for the SST). The robustness of the patterns with regard to the period that is used should be discussed somewhere in the paper, or in the methods section.
-l. 85, what are the spatial and temporal correlations between the Z500 pattern and the NAO/NAM? If they are high, why not use an NAO/NAM index directly? Please justify the benefit of using the Z500 pattern. Same remark for the SST pattern: how is it correlated with the summer AMO, and why not use an AMO index directly?
-The cold extreme trend is at the tail of the cold days trend distribution from the FLOR simulation. You mention that this simulation exhibits substantial long-term variability despite constant radiative forcing, but is it comparable to observations? It would be nice to see a comparison of the PDF of cold and warm extremes in FLOR vs observations to support this claim.
-Related to my previous comment, how large is the variability of the Z500 pattern in FLOR, compared to observations? Since the pattern resembles the NAO, you could plot power spectra of the NAO index in your model and in observations to verify whether the model exhibits enough long-term NAO variability compared to observations. Recent studies have shown that the low-frequency variability of the NAO is too weak in current GCMs, which can lead to underestimated internal variability, especially in the North Atlantic region (e.g., Wang et al. 2017). It is possible that the low-frequency fluctuations of the NAO are underestimated in the FLOR simulation, which could partly explain why cold extremes trends are less tied to the Z500 pattern in FLOR than in the real world (section S6). Similarly, it would be nice to see a comparison of the power spectra of the AMV (or summer SST pattern) as simulated by FLOR vs observations.
Minor comments
-l. 45: you refer to the trend as "time", which I don't find very clear. You could clarify here that your time predictor is the linear time trend (adding "referred to as time in the rest of the study", for instance). Or replace "time" by "trend" for the name of the predictor?
-l. 233: extremese -> extremes
-l. 497-501 and in other sections of the paper: when you refer to lags, you don't specify if they are negative or positive. Please clarify, maybe adding the sign of the lag (-11 months for instance).
-section S4, l. 10: "at each grid point on predictor i" (add "point")

Response to Reviewers
We thank Reviewer #1 for the constructive comments that have allowed us to clarify many points and to strengthen the revised manuscript. We respond to each comment below.

R1:
The current study does not add any fundamentally new insight to the current understanding of the regional nature of temperature change over Northern Hemisphere land during the hiatus period. 1a. Specifically, the role of the NAO during the hiatus period has been described before (e.g. Trenberth et al. 2014, Nature Climate Change). The Trenberth et al. study highlights the role of the NAO during the cold winters of 2009-2010, 2010-2011 and 2012-2013 at the end of the hiatus period, which contribute to a cooling trend over Eurasia and the eastern United States.
Response: We acknowledge that in the original manuscript we did not highlight the new fundamental insights as clearly as we should have, but we believe that this study offers several novel findings, most notably:

1) Both wintertime TX10d and summertime TX90d increased in the Northern Hemisphere during the hiatus, a result that is counter-intuitive.

2) The three important indices (annual-mean GMST, NH land winter SAT/TX10d, and summer SAT/TX90d) are each governed by a different mechanism. This insight is enabled by examining these indices together for the first time in a single study.
Although previous studies have touched on some of these elements in one way or another, we believe that we identify and synthesize several key unique elements that explain these two results, as we argue in our responses below.
We appreciate the reference to the Trenberth et al. (2014) study, and we agree that we should clarify the distinctions between our findings and theirs. We argue that our results are quite distinct from those of Trenberth et al. (2014) in several key respects. Based on our investigation reported below, we find the following distinguishing results with respect to the role of the NAO during the hiatus period:

• Although the z500 extremes pattern that we identify is related to the NAO, it is not equivalent to the canonical NAO and is more closely connected to hemispheric land cold extreme occurrences than the NAO.

• There is no evidence that tropical Pacific SSTs during the hiatus period excited the z500 extremes pattern, which contrasts with the claim of Trenberth et al. (2014).
To address the first point above, which also addresses a comment raised by Reviewer 3, we first determine how closely the z500 extremes pattern is related to the NAO. We downloaded the monthly mean NAO index from the NOAA Climate Prediction Center website and found that the correlation coefficient between the DJFM z500 extremes index and NAO index is -0.65. Clearly the two indices are significantly related but there also is a substantial amount of z500 extremes index variability that cannot be explained by the NAO.
We next address how well the NAO index can be used as a substitute for the z500 extremes index to explain hemispheric land cold extreme and seasonal mean temperature variability. After linearly removing the influence of the other TX10d predictors (time trend, ENSO, and volcanic AOD), the z500 extremes index can explain 56% of the residual DJFM TX10d variance. The NAO index, in contrast, only explains 17% of the residual TX10d variance.

This finding indicates that the NAO index is not an adequate substitute for the z500 extremes index that we have identified, and that the z500 extremes index is much more closely related to NH land cold extreme occurrences than the NAO index. This suggests that within the continuum of NAO-like atmospheric circulation patterns, there is considerable variation in the strength of the relationship with hemispheric extreme temperature occurrences and that, for the purposes of this study, it is important to identify the NAO-like pattern that is most strongly related to extreme temperature occurrences. In addition, this result indicates that the canonical NAO cannot explain the increase in hemispheric cold extreme occurrences during the hiatus period as the z500 extremes pattern can.
Similarly, we compared the relationship between the two indices and seasonal mean hemispheric land temperature. As reported in the original version of the manuscript, the correlation between the z500 extremes index and linearly detrended DJFM NH land temperature is -0.54. The correlation with the NAO index is only 0.19. Again, the link with the z500 extremes index clearly is much stronger. This finding relates to another comment by the reviewer about why we relate the z500 extremes pattern to the "Warm Arctic Cold Continents" pattern, which we discuss in a later reply.

Overall, these findings indicate that although Trenberth et al. (2014) and others may have suggested a connection between the NAO and recent increases in cold air outbreaks in some regions such as Europe, the connection with the NAO is not as simple as it may appear or may have been implied in previous studies. These previous studies have not performed as rigorous or as quantitative an analysis of the predictors of hemispheric cold extreme occurrences as we have performed here. Our study examines many factors that have been known to impact both global and regional climate variability, but we specifically single out the patterns that dominate seasonal, hemispheric land temperature variability, which have notable contrasts to canonical patterns identified previously.
As mentioned by the reviewer, Trenberth et al. (2014) also suggest that the predominantly negative phase of the NAO and the outbreaks of cold over Europe were forced by the tropical Pacific through quasi-stationary atmospheric Rossby waves. Although we believe this is an interesting hypothesis, we also believe that this claim is debatable. The link between tropical Pacific SSTs and the NAO has been highly debated, but the consensus is that the negative phase of the NAO is associated with warm equatorial tropical Pacific conditions.
To investigate the connection between the z500 extremes pattern and tropical SSTs, we calculated the partial regression coefficients of the z500 extremes index on the global SST anomalies. This calculation, illustrated below and in Fig. S10 of the revised manuscript, reveals that the z500 extremes pattern is weakly though significantly related to positive SST anomalies in the tropical equatorial Pacific.
We believe that this investigation motivated by the reviewer's comment is an important addition to the revised manuscript. In lines 98-102, we now discuss the relationship between the z500 extremes pattern and the NAO and AO, while noting that the z500 extremes pattern is more closely tied to cold extreme occurrences. In Section S6 of the supplementary material, we discuss this analysis in detail. In lines 184-192, we discuss the relationship between the z500 extremes pattern and SSTs. We discuss the SST partial regression pattern in Section S7.
As in the analysis of the z500 extremes pattern and the NAO index, we find that the SST extremes index is more closely related to hemispheric land extreme temperature occurrence and seasonal mean temperature than the AMO. Following the same procedure described in our first response, we find that the SST extremes index can explain 75% of the residual variance of JJAS TX90d. The AMO only explains 35% of the residual variance.
Therefore, we have identified a previously undocumented pattern that explains the variation in summertime extreme temperature occurrence better than the AMO index. In addition, we document in the original version of the manuscript that the correlation between the SST extremes index and the linearly detrended JJAS NH land temperature is 0.68. The correlation for the AMO is only 0.20. In summary, although the SST extremes index is significantly related to the AMO, it is much more closely tied to hemispheric extreme temperature occurrence and hemispheric, seasonal mean temperature.
Overall, we believe that these responses to Reviewer 1's first two comments demonstrate that our study does provide unique insights into the sources of regional temperature variability. We provide a focus on drivers of hemispheric land temperature variability that contrasts with the global mean temperature focus of most hiatus studies, we have demonstrated that we have identified unique patterns that are more closely tied to hemispheric extreme temperature occurrence than canonical indices, and we challenge some existing hypotheses about regional temperature variability during the recent hiatus period.
Response: We recognize this comment as a valid criticism of the original version of the manuscript, and we hope that our response to the reviewer's first comment clarifies why we distinguish the z500 extremes pattern from the NAO and also make a connection to a "Warm-Arctic-Cold-Continents" pattern.
In summary, (1) although the z500 extremes index is significantly related to the NAO index, the NAO index linearly explains less than half of the z500 extremes index; (2) the z500 extremes index explains substantially more variance in the hemispheric cold extreme occurrences than the NAO index; and (3) most specifically in relation to the connection with the "WACC" pattern, the z500 extremes index is much more highly correlated with seasonal mean hemispheric continental temperature than the NAO index. These points are now explained more clearly in the main text and Supplementary Section S6.

The reviewer raises a good point that we should make a clearer connection between
Regarding the physical mechanisms that distinguish the pattern we identify and the NAO, we recognize that we have more questions than answers at this point. However, we believe that our study provides a starting point for future studies to address these questions. There is recent work that recognizes the Arctic Oscillation as a continuum of patterns with unique dynamical mechanisms rather than a single mode (Dai and Tan 2017, doi: 10.1175/JCLI-D-16-0467.1). It seems plausible that the WACC-related z500 extremes pattern is part of this NAO/AO continuum that happens to be more strongly related to hemispheric land temperature than other members of the continuum.

R1: As a whole, investigating trends in temperature extreme occurrences averaged over Northern Hemisphere land instead of seasonal mean temperature trends does not provide any additional insight into regional land temperature changes during the hiatus period, especially over such a very short record. Specifically, these short time series (13 years) of extreme occurrence are very noisy and regression lines to illustrate trends are strongly affected by single years.
Response: Regarding the reviewer's first point, we agree that there is a clear connection between extreme temperature occurrences and seasonal mean temperature trends over land. Following comments by Reviewer 2, we have strengthened this discussion in the manuscript, particularly in supplementary section S5. However, we argue that our manuscript provides unique insights into the climate patterns that dominate the variability of both extreme temperature occurrence and seasonal mean temperature over Northern Hemisphere land, which are distinct from those that dominate annual mean, global mean surface temperature. As we argue in our responses above, the patterns we identify and the arguments we make about the sources of regional temperature variability are distinct from the patterns and arguments presented in previous studies.
We also agree with the reviewer's second point that short time series are noisy, and trends defined over such short periods are not robust. We attempt to bring out this point in our analysis related to Fig. 4. However, we believe that there is both a scientific and societal need to be able to explain these approximately decadal climate variations, even if they may be considered noise from an anthropogenically forced climate change perspective. Indeed, this spirit appears to have been an important driving force in the many global warming hiatus studies over the past few years. This idea is echoed in the Trenberth et al. (2014) study referenced by the reviewer, as they write that "it is vital to understand related interannual and decadal variability… and its regionality." We believe our study contributes to this objective by providing insights on regional temperature variations that are much more societally relevant than global mean temperature.
We thank Reviewer #2 for the thorough and constructive comments that have resulted in a strengthened manuscript. We respond to each comment below.

R2: Also the annual global land average temperatures continued to increase during the 'hiatus' period (see e.g. Seneviratne et al 2014, doi:10.1038/nclimate2145). So how do these increases in summer hot extreme temperatures differ from the increasing average temperatures over land?
Response: The increases in warm and cold extreme occurrences do indeed closely track the area-averaged seasonal mean land temperature. The following figure shows the time series of the seasonal land temperature anomaly, now shown as Supplementary Fig. S7, which can be compared with Fig. 1a. The correlation coefficient between the DJFM TX10d and seasonal mean temperature time series is -0.86. The correspondence is even stronger in boreal summer, as the correlation for JJAS TX90d is 0.96. Following the reviewer's comment, we now mention this point in lines 240-243 of the main text.
Our work demonstrates the sources of these asymmetric trends in both temperature extreme occurrences and seasonal mean temperature, which contrast with the dominant drivers of global mean surface temperature.

Regarding the possibility that more 'extreme extremes' may behave differently, we examined the time series of different measures of extreme temperatures following the methodology of Seneviratne et al. (2014). The key differences with that study are: (1) we consider both hot and cold extremes, (2) we partition the year into boreal summer (June-September) and boreal winter (December-March), and (3) we focus only on the Northern Hemisphere. These results are shown in Fig. R2.2.
We now discuss these findings in Supplementary Section S9 in the manuscript. We address the reviewer's main point in the last paragraph: "These findings suggest that the strong correspondence between seasonal mean temperature and extreme temperature occurrences during the hiatus period found in this study may not hold as well for more extreme measures of summertime hot extremes but is likely to be robust for wintertime cold extremes. The reason for the amplified response of the hottest extremes requires more study, but soil moisture-temperature feedbacks are one plausible culprit (Vogel et al. 2017), although there is some indication that this mechanism for accelerated warming of hot extremes in climate models may not hold in observations, at least in some regions (Donat et al. 2017, doi:10.1002/2017GL073733). Overall, the degree to which extreme temperatures follow the changes in the mean of the temperature distribution may depend on the region, season, type of extreme (hot or cold), and the threshold used to define the extremes."
R2: As a conclusion, the authors may want to highlight that GMST probably is not a particularly useful measure for relevant climate states or of climate change in general, as it may average out different kinds of events, seasonal and regional characteristics.
Response: We agree with this perspective and thank the reviewer for the suggestion. We now bring out the suggested point in lines 246-249 of the revised manuscript.

R2:
Line 7: it should be specified if the increase is e.g. in frequency or intensity or associated temperature. In particular for cold extremes the term "increase" can be ambiguous: increasing frequency would be consistent with cooler conditions, increase in associated temperatures with warming.
Response: We agree that our original phrasing was ambiguous and have now changed the word "extremes" to "extreme occurrences" in line 8.
We thank Reviewer #3 for the insightful and helpful comments that have resulted in a strengthened manuscript. We respond to each comment below.

R3: I wonder about the robustness of the Z500/SST patterns you identify as precursors of the temperature extremes. They are defined from regressing the temperature indices on the gridded Z500/SST anomalies over 1950-2014, but is this strongly dependent on the time period that is chosen? If you used a training period to identify the patterns with the partial regression analysis (for example the first half of the record, 1950-1980), then applied the linear model to predict the temperature indices over the latter period, would the results remain robust? I guess the full period is needed to identify the patterns, since a large fraction of the skill comes from their decadal/multidecadal variability rather than interannual variability (negative-NAO trend in the 2000s, and AMO cycle for the SST). The robustness of the patterns with regard to the period that is used should be discussed somewhere in the paper, or in the methods section.
Response: We agree that the robustness of the z500 and SST extremes patterns deserves additional consideration. As the reviewer indicates, it may be too strict a test to divide the record in half, given that the sample sizes in the training period would be quite small, and there is substantial multidecadal variability that we would miss in the training period. (2) differences in the z500 and SST extremes patterns between the two different datasets, and (3) differences in the z500 and SST extremes pattern regression coefficients, holding the z500 and SST patterns identical for the two datasets.
The top two panels in the figure below show the z500 and SST extremes patterns determined from the 1951-2001 data. As we can see, the patterns are very similar to those reported in the manuscript. This analysis confirms that the patterns and their relationships with extreme temperature occurrence were not unique to the hiatus period. The bottom two panels show the out-of-sample predictions of the z500 and SST extremes pattern contributions to DJFM TX10d and JJAS TX90d, respectively, in comparison with the in-sample partial regressions reported in the manuscript (Fig. 2). We see some differences between the in-sample and out-of-sample time series, but the hiatus period trends and much of the interannual variability are in good agreement. Overall, we believe that this analysis supports the robustness of the z500 and SST extremes patterns and their contribution to Northern Hemisphere extreme temperature variability during the hiatus period. We now explain the results of these calculations in the Methods section (lines 640-662). We also present the figure shown above as Supplementary Fig. S4.
Response: The reviewer's comment echoes a similar sentiment raised by Reviewer 1. In response, we have performed thorough comparisons between the z500 extremes index and NAO/AO indices and between the SST extremes index and AMO index. These results are now detailed in Supplementary Section S6. The key findings are:

• The z500 extremes index has a strong relationship with the NAO and AO indices (r = -0.65 and -0.68, respectively). The SST extremes index also has a strong relationship with the AMO index (r = 0.66). The significant relationships revealed above are not surprising, but they also reveal that there is a substantial amount of variability of the z500 and SST extremes patterns that cannot be explained by the NAO/AO or AMO.

• The z500 extremes pattern is much more strongly related to hemispheric cold extreme occurrences than either the NAO or AO. After linearly removing the influence of the other predictors (time trend, ENSO, and volcanic AOD) from the NH DJFM TX10d time series, the z500 extremes index explains 56% of the residual TX10d variance. The NAO and AO indices, however, only explain 17% of the residual TX10d variance.

• Similarly, the SST extremes pattern is much more strongly related to hemispheric warm extreme occurrences than the AMO. The SST extremes pattern explains 75% of the residual JJAS TX90d variance after the removal of the linear influence of the other predictors. The AMO index explains only 35% of the residual TX90d variance.

These findings demonstrate that, despite the similarity between the z500 and SST extremes patterns and canonical patterns of climate variability, there are key differences that connect the z500 and SST extremes patterns to extreme temperature occurrence more strongly.
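
The residual-variance comparison summarized in this response can be illustrated with a minimal Python sketch. It assumes that "residual variance explained" means the squared partial correlation obtained after regressing the other predictors out of both the target series and the candidate index; the response does not spell out the exact procedure, so this is one plausible reading. All series below are synthetic, and the names `tailored_index` and `canonical_index` are hypothetical stand-ins for a pattern-based index and a canonical index such as the NAO.

```python
import numpy as np


def residualize(series, others):
    """Remove the least-squares fit of `others` (plus an intercept) from `series`."""
    series = np.asarray(series, dtype=float)
    columns = [np.ones(series.size)] + [np.asarray(o, dtype=float) for o in others]
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, series, rcond=None)
    return series - design @ coeffs


def residual_variance_explained(target, candidate, others):
    """Squared partial correlation of `candidate` with `target`, given `others`.

    Assumption: both series are residualized against the same predictors.
    """
    target_res = residualize(target, others)
    candidate_res = residualize(candidate, others)
    r = np.corrcoef(target_res, candidate_res)[0, 1]
    return r ** 2


# Synthetic 65-year example (no observational data are used here).
rng = np.random.default_rng(0)
n_years = 65
time = np.arange(n_years, dtype=float)
enso = rng.standard_normal(n_years)
tailored_index = rng.standard_normal(n_years)
# A second index that shares only part of its variance with the first one.
canonical_index = 0.65 * tailored_index + np.sqrt(1 - 0.65**2) * rng.standard_normal(n_years)
target = 0.02 * time + 0.3 * enso + tailored_index + 0.5 * rng.standard_normal(n_years)

r2_tailored = residual_variance_explained(target, tailored_index, [time, enso])
r2_canonical = residual_variance_explained(target, canonical_index, [time, enso])
print(f"tailored index:  {r2_tailored:.2f}")
print(f"canonical index: {r2_canonical:.2f}")
```

In this synthetic setup, the index that the target was built from explains more residual variance than the partially correlated one, which mirrors the kind of contrast reported above without reproducing the reported values.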

R3:
The cold extreme trend is at the tail of the cold days trend distribution from the FLOR simulation. You mention that this simulation exhibit substantial long-term variability despite constant radiative forcing, but is it comparable to observations ? It would be nice to see a comparison of the PDF of cold and warm extremes in FLOR vs observations to support this claim.
Response: This is a fair point. To address this comment and to keep the analysis consistent with that of the FLOR simulation, we calculated histograms of 13-yr TX10d and TX90d trends from the linearly detrended observational datasets. This analysis, illustrated below and now in Fig. S13
Response: In light of the results presented above, we agree that it is worthwhile to investigate low-frequency variability in FLOR more closely. We followed the reviewer's suggestion and calculated power spectra of the NAO in observations (reanalysis) and in FLOR. These calculations indicated that FLOR may underestimate low-frequency NAO variability, as suggested by the reviewer. However, given that there are some differences between the NAO and z500 extremes pattern, and between the observed and simulated NAO, we decided that it would be simpler to show maps of 500 hPa geopotential height standard deviations in the FLOR simulation and in linearly detrended reanalysis data. We show below and in Fig. S14 of the revised manuscript the standard deviations for both seasonal and 13-yr running mean data.
Several features stand out in the plots shown above. First, on interannual timescales, FLOR overestimates the geopotential height variance in the northeast Pacific and southern North America (panel c). This feature is not unexpected, given that these regions are preferentially impacted by strong ENSO episodes, and that FLOR is known to overestimate ENSO variance (Vecchi et al. 2014). The second and more relevant feature is that FLOR underestimates geopotential height variance over the North Atlantic and parts of Eurasia, which is consistent with the reviewer's suggestion. This feature particularly stands out in the low-frequency differences (panel f). Most interestingly, the two primary regions for which FLOR underestimates 500 hPa geopotential height variance correspond well with two action centers of the z500 extremes pattern (Fig. 3).
As the reviewer suggests, this analysis indicates that climate models, including high-resolution models like FLOR, may underestimate natural, multidecadal variability of cold extreme occurrences owing to the underestimation of NAO-like variability over the North Atlantic and Eurasia.
We now discuss these findings in lines 158-171 of the main text and in Supplementary Section S8. We believe that this is a valuable addition to the manuscript because it demonstrates that previous studies that rely on current state-of-the-art climate models to attribute WACC-related changes may have a key deficiency in simulating multidecadal variability of continental cold extremes and related atmospheric circulation variability.
We have chosen not to follow a similar line of analysis with AMO-like variability and summertime warm extremes for several reasons. First, as described above, the distributions of 13-yr TX90d trends are not significantly different between FLOR and observations. Second, even if there are differences between FLOR and observations, it would be difficult to attribute their cause because, as discussed in the main text, we expect anthropogenic forcing, particularly from anthropogenic aerosols, to have some projection on the AMO and SST extremes pattern. Finally, even if FLOR underestimates internal Atlantic multidecadal variability, the implication would only strengthen the existing conclusion that internally driven, apparent accelerations of warm extreme occurrences like the one that occurred from 2002 to 2014 are relatively common in the climate model. For these reasons, we believe that additional analysis on AMO or AMO-like variability would add to the growing length of the manuscript without adding much clarity.
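For readers who wish to reproduce the 13-yr trend distributions discussed in this response, the calculation can be sketched as follows. This is only an illustrative sketch, not the code used for the manuscript: the function names are hypothetical, the input is assumed to be one index value per year (for example a TX10d or TX90d series), and the use of overlapping windows is an assumption.

```python
def ols_slope(y):
    """Least-squares slope of y against its index (units of y per time step)."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def detrend(y):
    """Remove the least-squares linear fit (mean and slope) from y."""
    n = len(y)
    slope = ols_slope(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]


def running_trends(y, window=13):
    """Slopes of every overlapping segment of length `window` (assumed overlapping)."""
    return [ols_slope(y[i:i + window]) for i in range(len(y) - window + 1)]


if __name__ == "__main__":
    # Synthetic yearly index: a linear trend plus a simple oscillation.
    series = [0.5 * t + (3.0 if t % 4 < 2 else -3.0) for t in range(40)]
    trends = running_trends(detrend(series), window=13)
    print(len(trends), min(trends), max(trends))
```

The list returned by `running_trends` is what would be binned into a histogram; applying the same function to the model output and to the detrended observations keeps the two distributions comparable.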

R3:
Minor comments
- l. 45: you refer to the trend as "time", which I don't find very clear. You could clarify here that your time predictor is the linear time trend (adding "referred to as time in the rest of the study", for instance). Or replace "time" with "trend" for the name of the predictor?
Response: In the revised manuscript, we have replaced "time" with "time trend" following the reviewer's suggestion.
Response: We now specify all lags as negative when they indicate the predictor leading the predictand, and we clarify this convention in lines 591-592 of the methods section.
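The lag sign convention stated above can be illustrated with a short sketch. It is a hypothetical example rather than the manuscript's code: the function names are ours, and a plain Pearson correlation stands in for whatever regression statistic is actually used. A negative lag pairs the predictor at an earlier time with the predictand at a later time, so the predictor leads.

```python
import math


def lagged_pairs(predictor, predictand, lag):
    """Align two equal-length series so that predictor[t + lag] pairs with predictand[t].

    lag < 0: predictor leads the predictand by |lag| steps.
    lag > 0: predictor lags the predictand by lag steps.
    """
    n = len(predictor)
    if len(predictand) != n:
        raise ValueError("series must have the same length")
    if abs(lag) >= n:
        raise ValueError("lag must be smaller than the series length")
    if lag < 0:
        return predictor[:n + lag], predictand[-lag:]
    return predictor[lag:], predictand[:n - lag]


def lagged_correlation(predictor, predictand, lag):
    """Pearson correlation of the lag-aligned pairs."""
    x, y = lagged_pairs(predictor, predictand, lag)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    driver = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0, 0.0, -2.0, 1.0, 3.0, -1.0, 0.5]
    # The response copies the driver two steps later, so the driver leads by 2.
    response = [0.0, 0.0] + driver[:-2]
    print(lagged_correlation(driver, response, lag=-2))
```

With this convention, a peak in the correlation at a negative lag indicates that variations in the predictor precede those in the predictand.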