Numerical weather prediction fundamentally depends on both initial condition memory and boundary condition influences. While the former quickly decays within 2 weeks in the extratropical troposphere, the latter provides predictability at longer lead times. However, subseasonal to seasonal (S2S) prediction with lead times of about 2 weeks to 2 months is particularly difficult and has been described as a predictability desert1. This is because initial condition memory has mostly vanished, but the system’s response to boundary condition forcing is still weak, which makes it difficult to outperform persistence forecasts. Nevertheless, in certain situations (so-called windows of opportunity2) predictability at S2S time scales is facilitated through specific anomalies in regions outside the extratropical troposphere, either by prolonging initial condition memory or by enhancing boundary condition control. This can manifest through changes in occurrence of certain weather regimes in the extratropical troposphere3,4,5,6.

Among the most influential sources of subseasonal predictability for extratropical weather during winter and spring are strong anomalies of the stratospheric polar vortex. For example, abrupt breakdowns of the polar vortex, such as those associated with sudden stratospheric warmings7 (SSWs), tend to be followed by a persistent equatorward jet shift in the troposphere for up to two months8. This signal tends to be strongest over the Atlantic (associated with a negative phase of the North Atlantic Oscillation, NAO) and enhances the likelihood of cold air outbreaks over Eurasia9,10,11,12,13. Likewise, extremely strong polar vortex events tend to be followed by a poleward jet shift with a greater likelihood of storm series14,15.

From this perspective stratospheric conditions may provide a source of predictability on S2S time scales. Indeed, past literature has demonstrated improved forecast skill for ensemble forecasts initialized around SSWs16 or strong polar vortex episodes17. However, the employed skill measures such as the correlation skill score (CSS) primarily rely on the ensemble mean signal (e.g., the above mentioned negative NAO state following SSWs). Another, perhaps equally informative aspect of predictability is the ensemble spread, which may be thought of as measuring the degree of confidence in the ensemble mean signal. The ensemble spread therefore measures whether the flow becomes more predictable. Notably, the response of tropospheric ensemble spread to stratospheric extreme events has not been studied and this is the main focus of the present study.

We cluster a large set of ensemble forecasts provided by the S2S database18 into weak, moderate and strong initial polar vortex categories. Regions are diagnosed that experience anomalously low or high spread following polar vortex extremes. In particular, we identify regions of near-zero ensemble mean signal that nevertheless exhibit enhanced or reduced intrinsic predictability due to spread anomalies.


Forecast uncertainty following polar vortex events

We use 1000 hPa geopotential height (Z1000) as a metric to quantify the anomalous surface circulation in weak or strong vortex initializations. The results are based on daily diagnostics that are averaged over lead time days 14 to 34. Figure 1 illustrates that weak vortex initializations are on average followed by a negative North Atlantic Oscillation (NAO) pattern in the ensemble mean, consistent with many previous studies. Positive anomalies (=high pressure) are observed over the polar cap northward of about 60 N, negative anomalies are observed over the Euroatlantic sector. The positive anomalies over the Pacific seem in conflict with the expected negative Arctic Oscillation (AO) signature. They likely represent ENSO signatures that do not fully average out due to the limited hindcast period and associated sampling bias. This is supported by the average ENSO index at initial time, which yields −0.36 for weak and +0.23 for strong vortex initializations.

Fig. 1: Ensemble mean and spread following weak/strong polar vortex.

Ensemble statistics of 1000 hPa geopotential height (Z1000) in S2S ECMWF forecasts with a weak (panel a, average over 169 cases) and strong (panel b, average over 169 cases) initial polar vortex. Contours denote the Z1000 ensemble mean (15 gpm intervals). Shading shows lead-time- and season-aware Z1000 anomalies of ensemble variance.

How is the negative NAO signal linked to anomalies in forecast spread? Fig. 1 reveals a pronounced reduction of ensemble spread, and therefore enhanced forecast confidence, over the North Atlantic jet exit region, centered around Iceland to Scandinavia. The observed negative anomalies of ensemble variance of up to ~3000 gpm² correspond to about −0.5σ (standard deviations, relative to the climatological spread distribution) or to a reduction by roughly 20%. In the extratropics this is the largest signal in terms of standardized anomalies. Interestingly, the regions of anomalous Z1000 spread differ from those of anomalous Z1000 ensemble mean. For example, the region around Denmark shows essentially zero ensemble mean response, but is nevertheless characterized by enhanced forecast confidence due to reduced ensemble spread.
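The spread diagnostic described above can be sketched for a single grid point as follows. This is a minimal illustration rather than the authors' code: the function names, the choice of the unbiased (n − 1) variance estimator and the dictionary layout are all assumptions made for the example.

```python
from statistics import mean, variance


def lead_time_mean_variance(members_by_day, first_day=14, last_day=34):
    """Average the daily ensemble variance over a lead-time window.

    members_by_day maps lead day -> list of member values at one grid
    point (e.g. Z1000 in gpm). The default window follows the text
    (lead days 14 to 34); the data layout is hypothetical.
    """
    daily = [variance(members_by_day[d]) for d in range(first_day, last_day + 1)]
    return mean(daily)


def standardized_anomaly(value, clim_values):
    """Express `value` as a departure from the climatological mean,
    in units of the climatological standard deviation."""
    clim_mean = mean(clim_values)
    clim_std = variance(clim_values) ** 0.5
    return (value - clim_mean) / clim_std
```

Applied to the window-mean variance of one forecast and the corresponding values from all forecasts with the same lead time and season, `standardized_anomaly` would yield numbers comparable in kind to the −0.5σ quoted above.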

Furthermore, weak vortex initializations reveal positive spread anomalies over the Atlantic around 40 N and over the North-West Pacific. While enhanced spread translates into reduced forecast confidence, the magnitude of the signals is weaker in comparison to the negative spread anomalies over North Eurasia.

Forecasts initialized with a strong polar vortex qualitatively show similar patterns of mean and spread anomalies compared to weak vortex initializations, with opposite sign. Most pronounced is an increase in forecast spread over Northern Eurasia, where anomalies correspond to about +0.3 standard deviations or to an increase by roughly 15%.

By construction, most forecasts fall into neither the weak nor the strong vortex category. The associated composite-mean anomalies of mean and spread are close to zero (not shown). An equivalent analysis based on UKMO forecast data (1568 ensemble forecasts with lead times up to +60 days) reveals overall very similar results (see supplement Fig. S3).

Dynamical link between ensemble mean and spread

What causes the observed spatial patterns of ensemble mean and spread? To help answer this question we focus on the Atlantic sector, as signals over the Pacific are weaker and more strongly affected by ENSO variability.

Our guiding hypothesis is as follows. We generally expect largest ensemble spread to occur in regions of strongest synoptic-scale storm development, primarily due to the unpredictable nature of exact storm location and strength at subseasonal lead times. Following stratospheric extreme events, the Atlantic jet undergoes latitudinal shifts that are associated with similar shifts in storm tracks8 and these should manifest in associated anomalies in ensemble spread.

To illustrate the effects of polar vortex variability on the North Atlantic eddy-driven jet, Fig. 2a, c present mean and variance anomalies of horizontal winds at 850 hPa (zonal wind: U850, meridional wind: V850). In alignment with the negative (positive) NAO phase, the composite-mean westerly zonal winds at latitudes northward of about 50 N weaken (strengthen) over the Atlantic, corresponding to the latitudinal jet shift. Weak polar vortex forecasts reveal anomalously low ensemble variance (i.e., high confidence) of zonal wind over Iceland to Scandinavia. Positive ensemble variance anomalies of up to 0.4σ are observed over Canada’s East coast and the Canary basin. Overall, the ensemble spread anomalies of U850 spatially align well with the spread anomalies of Z1000.

Fig. 2: Circulation anomalies and eddy activity by polar vortex state.

Weeks 3–5 composite-mean of weak (a, b; 169 cases) and strong (c, d; 169 cases) polar vortex forecasts. 850 hPa horizontal wind diagnostics a, c for Mid-November to End-March climatology (purple streamlines; larger values marked darker), anomalies of ensemble mean (black arrows) and anomalies of ensemble variance (shading). Upper troposphere diagnostics b, d in terms of 200 hPa horizontal wind anomalies (black arrows), anomalies of eddy activity at 320 Kelvin isentrope as a measure for synoptic-scale Rossby waves (shading; see text for details) and Mid-November to End-March climatological latitude of the maximum meridional potential vorticity gradient at 320 Kelvin isentrope.

Over the North Atlantic, the strong meridional potential vorticity (PV) gradient in the upper troposphere acts as a waveguide for Rossby waves19. Climatologically, the largest meridional gradient of PV at 320K (PV320K) over the Atlantic is located near 50 N (see Fig. 2b, d). As Rossby waves propagate eastward and encounter regions of smaller PV gradient, e.g., from the Atlantic to Europe, they typically undergo a nonlinear decay stage. This Rossby wave breaking can be associated with the occurrence of extreme weather such as strong cyclones20,21,22,23.

During the negative NAO phase the region of largest Rossby wave activity and breaking is shifted southward along with the strong meridional gradient of PV320K. Northern Europe then experiences less synoptic variability, whereas the opposite holds for corresponding regions southward of ~45N.

To illustrate regions of anomalous Rossby wave activity, we define the time and space dependent eddy activity \(\mathcal{A}\). Eddies are separated from the mean flow using time-filtering, based on a 7-day running mean, which is applied gridpoint-wise to PV320K. Eddy activity then corresponds to the time variance of PV320K relative to that 7-day background state and thereby serves as a proxy for synoptic-scale wave activity (see methods section for details). Figure 2b presents deseasonalized anomalies of eddy activity averaged over weak polar vortex initializations. Negative anomalies are observed over Iceland to Scandinavia, corresponding to reduced synoptic variability. Positive anomalies appear over Canada to Greenland and southward of 50 N over the entire Atlantic sector from the US to Europe. This is in agreement with the known equatorward jet shift that is indicated by the 200 hPa zonal wind weakening (strengthening) at the poleward (equatorward) flanks of maximum climatological meridional PV320K gradient. Overall, the anomaly patterns of eddy activity align well with ensemble variance anomalies in Z1000 (Fig. 1a) and U,V850 (Fig. 2a). The agreement supports the hypothesis that ensemble spread at subseasonal lead times is associated with synoptic variability. In order to test model sensitivity, we have computed eddy activity anomalies following weak vortex events based on ERA5 data and found good qualitative agreement (not shown).

Strong polar vortex initializations show overall similar spatial patterns of opposite sign for U,V850 ensemble mean, U850 ensemble variance anomalies, U200 ensemble mean anomalies and eddy activity anomalies (Fig. 2c, d). The positive NAO phase relates to a northward jet shift. For example, over Northern Europe this results in increased storminess (see larger U850) and increased synoptic variability (see positive eddy activity anomalies) that translates into anomalously high forecast uncertainty (see positive U850 and also Z1000 spread anomalies).

In addition to a latitudinal dipole of synoptic variability (e.g., decreased spread over Northern Europe and increased spread over the Mid-Atlantic following a weak polar vortex), a variance anomaly signal is observed over Newfoundland. Weak vortex initializations show increased spread of U850 and, to some extent, of Z1000. This signal is associated with smaller magnitudes of standardized anomalies compared to that over Northern Europe (+0.2σ versus −0.4σ). It is spatially consistent with anomalies in eddy activity, though it cannot be explained by a shift in the jet exit region. It is correlated with the NAO (not shown) and seems to be associated with increased storm activity over the West Atlantic during the negative NAO phase, as revealed by a simple Z1000 tracking algorithm (see supplement Fig. S4, consistent with another study24). Associated with a negative NAO, blocking episodes around Greenland and Northern Europe potentially hinder the eastward progression of storms originating from around Newfoundland25,26. A previous study23 argued that the increased cyclone occurrence over the West Atlantic and the increased blocking occurrence to the East are dynamically linked to cyclonic Rossby wave breaking events.


Extreme stratospheric polar vortex events tend to be followed by persistent tropospheric circulation anomalies over the course of several weeks, with associated modulations of tropospheric predictability at subseasonal timescales. Previous work has focused on the ensemble mean response as a measure of such a window of forecast opportunity. Our results highlight an additional aspect of stratospheric impacts on subseasonal predictability: modifications to regional eddy activity cause robust changes in ensemble spread (forecast uncertainty) that directly translate into modified predictability. Notably, such spread-related changes in predictability are not simply co-aligned with ensemble mean-related changes in predictability and reveal modified predictability even if the ensemble mean signal is zero (such as over Denmark in Fig. 1).

It should be noted that in general both mean and spread modulations affect predictability, but conclusions about forecast skill depend on the evaluation metric and, ultimately, the underlying question. For example, shifts in the mean of the NAM distribution are observed following weak or strong polar vortex events. Such modulations of the ensemble mean are reflected in certain forecast skill metrics, such as the correlation skill score (CSS) or the anomaly correlation coefficient (ACC)16,17. In practice, such forecast skill metrics will be strongly sensitive to a given model’s ability to truthfully represent stratosphere-troposphere coupling27. However, even in a hypothetical model that includes a perfect representation of stratosphere-troposphere coupling, and therefore a perfect representation of the ensemble mean response to stratospheric extreme events, predictability will be intrinsically limited by chaos due to unpredictable weather fluctuations. It is the ensemble spread that measures the amount of unpredictable fluctuations. In this sense the flow may be described as being intrinsically more or less predictable in situations where the ensemble spread is smaller or larger, respectively. Our results have demonstrated that stratospheric extreme events lead to robust modulations of such intrinsic predictability.

While the CSS or ACC are strongly sensitive to the ensemble mean signal, the ensemble spread strongly affects another frequently used forecast skill metric: the mean squared error (MSE, i.e., the mean of the squared difference between the ensemble mean forecast and the observations). Notably, in a perfect model the MSE is independent of the ensemble mean and depends only on the spread. For imperfect models, MSE is additionally affected by model biases. Nevertheless, provided that the model bias contribution is small compared to the ensemble spread contribution (perhaps a reasonable assumption for the ECMWF system used here), ensemble spread anomalies are still expected to align with anomalies in MSE. Our results would then suggest weak polar vortex initializations to be associated with reduced MSE over Northern Europe and increased MSE over the Mid Atlantic. Indeed, composite-mean anomalies of Z1000 MSE for weak polar vortex initializations align closely with those of spread anomalies, although the fields are more patchy (see supplement Fig. S2). Given the relatively small number of weak/strong vortex events in the real atmosphere, more events would be needed to robustly determine changes in the forecast error. Under the assumption mentioned above, ensemble spread anomaly diagnostics facilitate statistically robust estimates of where forecast errors increase or decrease. Large ensembles can thereby, to some extent, make up for the scarcity of verification dates in the real atmosphere.
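The statement that the perfect-model MSE depends only on spread can be made explicit with a short derivation. It is a sketch under assumptions that the text does not spell out: the observation \(o\) and the \(M\) ensemble members \(x_i\) are taken to be independent draws from the same distribution with mean \(\mu\) and variance \(\sigma^2\).

```latex
\begin{aligned}
\bar{x} &= \frac{1}{M}\sum_{i=1}^{M} x_i, \\
\mathrm{E}\big[(\bar{x}-o)^2\big]
  &= \mathrm{Var}(\bar{x}) + \mathrm{Var}(o)
     + \big(\mathrm{E}[\bar{x}]-\mathrm{E}[o]\big)^2 \\
  &= \frac{\sigma^2}{M} + \sigma^2 + (\mu-\mu)^2
   = \Big(1+\frac{1}{M}\Big)\,\sigma^2 .
\end{aligned}
```

Under these assumptions \(\mu\) cancels, so the expected squared error of the ensemble mean is set by \(\sigma^2\) alone; for an 11-member ensemble the prefactor would be 12/11.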

Given the substantial stratospheric impact on intrinsic subseasonal predictability as measured by the ensemble spread, other known teleconnection patterns may show similar modulations of forecast uncertainty at subseasonal time scales. Indeed, different initial states of well-known teleconnections of tropical origin (ENSO, MJO, QBO) all show robust anomalies in Z1000 ensemble spread averaged over weeks 3−5 (Fig. 3). Of these, ENSO shows spread anomalies of similar strength compared to those following extreme stratospheric polar vortex states, whereas MJO and QBO show much smaller anomalies. Note that some of these signatures of tropical teleconnections may be facilitated through the stratosphere28,29,30, which limits interpretations of their relative importance for tropospheric predictability. In an attempt to circumvent this issue, we have tested clustering the results in Fig. 3 by conditioning on moderate states of the polar vortex (see supplementary Fig. S5). While QBO-related tropospheric signals seem to be mediated through stratospheric anomalies, the impact of ENSO appears to be relatively unaffected by the conditioning, indicating that its tropospheric pathway might be dominant.

Fig. 3: S2S forecast uncertainty of different teleconnections.

Difference of anomalies of Z1000 ensemble mean (black contours every 15 gpm, negative values dashed) and ensemble variance (shading) between forecasts initialized in: a strong versus weak vortex conditions, b El Niño (ENSO 3.4 index above 1; 334 cases) versus La Niña (ENSO 3.4 index below −1; 412 cases) conditions, c MJO phases 7,8 (497 cases) versus phases 2,3 (434 cases), where the MJO amplitude exceeds 0.75, and d westerly (822 cases) versus easterly (1572 cases) QBO phase (defined by the zonal wind at 50 hPa). MJO and QBO are displayed with an additional Z1000 mean contour at ±10 gpm (in gray) due to smaller signals. The reader is referred to supplementary Fig. S5 for a corresponding analysis, where forecasts are conditioned on moderate states of the polar vortex.

Figure 3 suggests that over the Pacific, Z1000 mean and spread anomalies are dominated by ENSO variability that projects strongly onto a Pacific-North American (PNA) pattern at subseasonal time scales. Forecasts initialized during El Niño are followed by a strengthened Aleutian low, which is associated with negative spread anomalies on its western and positive spread anomalies on its eastern flank. In general, the signals over the Pacific appear to be more sensitive to ENSO than to polar vortex variability. Therefore, the available S2S data do not allow fully comprehensive analyses of whether near-surface signatures of polar vortex variability are more NAO-like or more AO-like31. The limited time period from 1997 to 2021 is presumably not long enough to properly sample, e.g., sudden warmings under all different ENSO influences, and a longer reforecast period would be desirable. Over the Atlantic, signals associated with ENSO variability are generally weaker. During El Niño, Z1000 exhibits negative mean and positive spread anomalies over Great Britain and also upstream, at the West coast of North America, a potential hint for increased cyclogenesis there.

In comparison to polar vortex and ENSO variability, there are only small signals observed following different MJO phases. Consistent with previous literature32, we find that MJO phases 7 and 8 are followed by a negative NAO. Conditioning on a moderate polar vortex leads to slight weakening of the signals (supplement Fig. S5b).

Similarly, the influence of different QBO phases on Z1000 mean and spread is relatively weak. Forecasts initialized during an easterly QBO phase result in slight negative Z1000 spread anomalies over Northern Europe and positive spread anomalies over the Mid-Atlantic. The signals reduce further when conditioning on a moderate polar vortex (Fig. S5c), suggesting that the observed anomalies are mediated via more weak polar vortex states in the easterly QBO phase, in alignment with previous work33.


Based on large sets of extended-range ensemble forecasts our results demonstrate that forecast uncertainty as measured by the ensemble spread offers added value over the ensemble mean for characterizing subseasonal predictability. Importantly, reductions in ensemble spread may indicate enhanced predictability associated with reduced forecast uncertainty in regions where the ensemble mean response is weak, i.e., in regions where the ensemble mean itself does not represent a source of predictability. Strong surface signatures of ensemble spread anomalies are found following weak or strong stratospheric polar vortex events. Consistent with NAO-related modulations of jet latitude and strength and the associated variations in synoptic storm activity, spread anomalies are not in phase with the mean anomalies, but appear where the storm track is altered, which occurs especially in the jet exit region.

In general, spread anomalies facilitate a statistically robust estimate of forecast error patterns as they can be computed without a verification dataset. Applying the developed diagnostics to remote tropical forcings revealed that extended-range forecast errors over the Atlantic are to first order affected by long-lasting NAO signals, such as those introduced by polar vortex extreme events. In contrast, forecast errors over the Pacific are dominated by PNA variability that is strongly influenced by ENSO.

Data and methods

S2S forecast data

The study is based on extended-range ensemble forecasts by ECMWF, which were obtained from the S2S prediction project database18. So-called realtime ensemble forecasts with 51 members each are available with initializations each Monday and Thursday. We include 114 realtime 51-member ensemble forecasts that are initialized during winters 2017/18 to 2020/21, between 16 November and 22 February, respectively. All forecasts have a maximum lead time of 46 days. For each realtime forecast, ECMWF provides a set of 20 so-called hindcast ensemble forecasts, each of which has 11 members. Hindcasts are initialized with the analysis from the same day of the year as their associated realtime forecast, but for each of the 20 previous years. For example, a 51-member realtime forecast is available on 2 January 2020 with an associated set of 20 11-member hindcasts with initial conditions on 2 January 2019, 2 January 2018, ..., 2 January 2000. In total, the 114 realtime ensemble forecasts are accompanied by corresponding 114 × 20 = 2280 hindcast ensembles. Note that due to model updates, our set of forecasts stems from four different model cycles; the resulting uncertainties are expected to be minor compared to the benefit of a larger sample size.
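The bookkeeping of realtime forecasts and hindcasts can be reproduced with a few lines of Python. The helper name `hindcast_dates` is hypothetical and the snippet only mirrors the counting described above; it does not access the S2S database.

```python
from datetime import date


def hindcast_dates(realtime_init, n_years=20):
    """Initial dates of the hindcasts attached to one realtime forecast:
    the same calendar day in each of the n_years previous years.

    Note: date.replace would fail for 29 February; the initialization
    window used here (16 November to 22 February) never includes it.
    """
    return [realtime_init.replace(year=realtime_init.year - k)
            for k in range(1, n_years + 1)]


dates = hindcast_dates(date(2020, 1, 2))   # 2 January 2019 ... 2 January 2000

n_realtime = 114
n_hindcast = n_realtime * 20               # 2280 hindcast ensembles
n_total = n_realtime + n_hindcast          # 2394 ensemble forecasts in total
```

The total of 2394 matches the number of ensemble forecasts that enter the polar vortex clustering.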

The zonal-mean zonal wind at 10 hPa and 60N (U6010hPa) is used as a proxy for the state of the stratospheric polar vortex. The tropospheric evolution is analyzed in terms of potential vorticity at the 320 K isentrope (PV320K), horizontal winds at 850 hPa and 250 hPa (U,V 850, 250) and geopotential height at 1000 hPa (Z1000). All data are retrieved as 0 UTC instantaneous fields at a horizontal resolution of 2.5 degrees (except for PV320K, which was retrieved at a resolution of 1 degree).

Forecast anomalies are computed by subtracting a model climatology, which depends on

  1. forecast lead time (e.g., some variable might systematically drift to smaller or larger values over lead time due to model biases),
  2. model version (e.g., model biases might be reduced after an updated model cycle) and
  3. the day of year (e.g., the polar vortex tends to be strongest in January).

Technical details of the deseasonalization procedure are described in Appendix A of Spaeth and Birner (2022)34.
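The three dependencies of the model climatology can be illustrated with a plain grouped mean. This is a deliberately simplified sketch: the record layout and function names are hypothetical, and the actual procedure in the cited appendix may differ, for instance by smoothing across neighbouring days of the year.

```python
from collections import defaultdict
from statistics import mean


def climatology_key(record):
    """Key on the three dependencies listed above."""
    return (record["lead_day"], record["model_cycle"], record["day_of_year"])


def build_climatology(records):
    """Mean value per (lead day, model cycle, day of year).

    `records` is an iterable of dicts with the keys used in
    climatology_key plus "value" (an assumed, illustrative layout).
    """
    groups = defaultdict(list)
    for record in records:
        groups[climatology_key(record)].append(record["value"])
    return {key: mean(values) for key, values in groups.items()}


def anomaly(record, climatology):
    """Forecast value minus the matching model climatology."""
    return record["value"] - climatology[climatology_key(record)]
```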

Clustering of ensembles based on the polar vortex

Each ensemble forecast is labeled as weak, moderate or strong vortex, depending on U6010hPa in ERA5 reanalysis35 at initial time. Thereby, the effect of initial conditions on forecast evolution is isolated (though a few cases exist where the polar vortex becomes strong or weak only during the forecast period). For weak vortex forecasts, U6010hPa is required to be negative. Following a standard SSW definition36, the forecasts in this group are thus likely to have experienced an SSW shortly before initialization. Out of all 2394 ensemble forecasts, 169 qualify as weak vortex initializations. Similarly, the 169 forecasts with the strongest initial polar vortex are labeled as strong vortex initializations. The remaining forecasts form the moderate vortex category.
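The labeling rule can be sketched as below. That the strong category is sized to match the weak category is an assumption inferred from the equal counts (169 each); the function name and input layout are likewise hypothetical.

```python
def label_vortex_states(u60_by_forecast):
    """Label each forecast as 'weak', 'strong' or 'moderate'.

    u60_by_forecast maps a forecast identifier to the zonal-mean zonal
    wind at 10 hPa and 60N at initial time (m/s).
    """
    # Weak vortex: reversed (negative) zonal-mean zonal wind.
    weak = {fid for fid, u in u60_by_forecast.items() if u < 0}

    # Strong vortex: as many forecasts as in the weak group, taken from
    # the top of the wind distribution (assumption, see lead-in).
    ranked = sorted(u60_by_forecast, key=u60_by_forecast.get, reverse=True)
    strong = set(ranked[:len(weak)]) - weak

    labels = {}
    for fid in u60_by_forecast:
        if fid in weak:
            labels[fid] = "weak"
        elif fid in strong:
            labels[fid] = "strong"
        else:
            labels[fid] = "moderate"
    return labels
```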

Overall, forecasts sample winters from 1997 to 2021. The distribution of weak and strong vortex initializations over different years is provided in the supplementary Fig. S1a. Moreover, Fig. S1b confirms that the average polar vortex evolution over lead time agrees well between forecasts and reanalysis for the three categories weak, moderate and strong initial vortex.

Eddy activity

Daily atmospheric variability in the extratropics is characterized by synoptic-scale weather systems (decorrelation timescales on the order of a few days37). These eddies can be quantified via quasi-geostrophic wave activity, which is proportional to the potential vorticity variance around some basic state that separates eddies from the mean flow. Upon integration over the full domain, wave activity fulfills a conservation law for adiabatic, inviscid dynamics38.

As a proxy for wave activity, we define the time- and space-dependent eddy activity, \(\mathcal{A}\), a simplified measure for synoptic-scale waves, which corresponds to potential vorticity variance:

\[ \mathcal{A} = \overline{\left(\mathrm{PV}_{320\mathrm{K}} - \overline{\mathrm{PV}_{320\mathrm{K}}}\right)^{2}} \]


PV320K is the potential vorticity at the 320 K isentrope. Mean flow and synoptic-scale eddies are separated using simple time-filtering, based on a 7-day centered running mean at each grid point, denoted by the overbar. Planetary-scale perturbations typically have time scales longer than seven days and are therefore filtered out. Variability associated with periods shorter than one day does not contribute either, because data are used only at daily resolution. Therefore, the eddy activity essentially corresponds to a temporally bandpass-filtered potential vorticity variance. We have tested that the results are not very sensitive to choices of basic state timescales between 5 and 14 days (not shown) and that Gibbs phenomena have near-zero impact on the results, as tested by multiple iterative applications of the 7-day moving average (not shown). Large values of eddy activity can, for example, be associated with Rossby wave breaking.

Tropical teleconnection indices

Predictability arising from stratosphere-troposphere coupling is compared to the effect of tropical teleconnections, including variability from the El Niño-Southern Oscillation (ENSO), the Quasi-Biennial Oscillation (QBO) and the Madden-Julian Oscillation (MJO). The ENSO 3.4 index is used to characterize the ENSO state; the data were downloaded from the National Oceanic and Atmospheric Administration. The QBO phase is defined here based on observational data of the zonal wind over Singapore, provided by the Freie Universitaet Berlin. A positive (negative) zonal wind at 50 hPa defines the westerly (easterly) phase of the QBO. The MJO is characterized based on the real-time multivariate MJO index (RMM)39, and the data were downloaded from the Columbia University of New York.
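The composites in Fig. 3 rely on simple threshold rules, which can be written down as follows. The thresholds are those stated in the caption of Fig. 3; the function names and return labels are hypothetical and serve only as an illustration.

```python
def enso_state(nino34):
    """El Niño / La Niña from the ENSO 3.4 index (thresholds of +1 and -1)."""
    if nino34 > 1:
        return "El Niño"
    if nino34 < -1:
        return "La Niña"
    return "neutral"


def qbo_phase(u50):
    """QBO phase from the sign of the zonal wind at 50 hPa (m/s)."""
    if u50 > 0:
        return "westerly"
    if u50 < 0:
        return "easterly"
    return "undefined"


def mjo_group(phase, amplitude, min_amplitude=0.75):
    """Composite group from the RMM phase (1-8) and amplitude.

    Returns None for weak MJO events or phases outside the two groups.
    """
    if amplitude <= min_amplitude:
        return None
    if phase in (7, 8):
        return "phases 7,8"
    if phase in (2, 3):
        return "phases 2,3"
    return None
```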