Introduction

In the boreal summer of 2020 (July-August), the western North Pacific subtropical high (WNPSH) was exceptionally strong and persistent, resulting in one of the least active typhoon seasons in the Northwestern Pacific. This led to unprecedented drought conditions in parts of East Asia, notably Taiwan, while other areas such as East China and Japan faced recurring flash floods due to the persistent Meiyu front1,2. Given that East Asia is the world’s most densely populated region, climate extremes in this area can have profound socioeconomic repercussions and exert influence over global supply chains3.

While accurately forecasting the WNPSH on extended timescales is crucial for mitigating the impacts of associated extremes, subseasonal forecasting of the WNPSH is still in its infancy because of the complex dynamics involved and its intermediate position on the frequency spectrum (between seasonal and synoptic timescales). Several factors have been identified as influential in modulating WNPSH variability across timescales. Because the subtropical high forms the descending branch of the Hadley Cell, tropical variability in the meridional overturning circulation, such as that associated with the Intertropical Convergence Zone (ITCZ) and the Madden-Julian oscillation (MJO), leaves its imprint on the subtropical high. Over interannual and longer timescales, the displacement of the ITCZ directly modulates the strength of the zonally symmetric component of the subtropical high4,5,6. On subseasonal timescales, previous studies have found variability in the WNPSH as the boreal summer intraseasonal oscillation (BSISO) propagates poleward7,8. From a zonally asymmetric perspective, factors such as the thermal contrast between land and ocean9,10, the inter-basin sea surface temperature (SST) gradient1,11, the phase locking between El Niño Southern Oscillation (ENSO) and the seasonal cycle12,13, the monsoon convective heating14, and atmospheric teleconnections15,16,17 collectively influence the westward expansion and eastward retreat of the WNPSH’s western boundary, as well as its overall maintenance. In addition, the WNPSH is governed by its own internal dynamics (Rossby wave dynamics), which determine its transient evolution14,18. Therefore, the dynamical behavior of the WNPSH is a joint result of external forcing and internal dynamics.

Recent studies have found changes in the WNPSH in both reanalysis data and general circulation models (GCMs) that are attributed to global warming9,10,19. One likely feature is the westward extension of the WNPSH, which originates from the amplified thermal contrast between land and ocean. In a warmer climate, the land tends to warm more than the ocean, which can enhance the zonally asymmetric component of the subtropical high9,10,20,21. Other studies found that uncertainty exists due to the different metrics used for quantification22,23,24 and model biases in the inter-basin SST gradient, which corresponds to the observed early warming pattern19,20. The change in the zonally symmetric component, by contrast, is less certain10. As populations continue to grow, the broader impacts associated with extreme events influenced by the WNPSH are anticipated to increase in a warmer future. Hence, it is of utmost importance for the scientific community to acquire a more comprehensive understanding of the underlying dynamics in order to provide reliable predictions.

Despite the complex dynamics, the power spectrum of the WNPSH provides insights for modeling its internal dynamics and making reliable forecasts. Generally, the time series of the WNPSH follows a red-noise spectrum, except at subseasonal timescales of 10-40 days, which suggests the presence of oscillatory modes or non-modal transient growth at these periods18 (also see Fig. 1c; discussion provided in later sections). Non-modal dynamics result in a prolonged response to transient forcing, which has important implications for studying climate extremes25,26. While both red-noise and non-modal processes can be effectively modeled using linear stochastic dynamics or a linear inverse model (LIM)27, the applicability of this approach to the WNPSH remains unexplored. LIMs have proven successful in modeling ENSO dynamics26, MJO tropical-extratropical teleconnections28,29, and atmospheric blocking30. Henderson et al. (2020)28 and Tseng et al. (2021)29 demonstrated the importance of non-modal transient growth to the subseasonal-to-seasonal (S2S) prediction of MJO teleconnections. They further highlighted how the transient growth of the initial state helps sustain forecast signals. This effect results in a higher signal-to-noise ratio, which ultimately extends the predictable range.

Fig. 1: Spatial and temporal structures of WNPSH leading variability.

a 5880-gpm lines for 2020 JJA. Red contour: climatology plus EOF1 only. Blue contour: total field. Green contour: climatology. Shading: 850hPa geopotential height anomaly (m) in 2020. b The EOF1 of WNPSH over the domain of 20°N-30°N, 100°E-170°E, and from 1000hPa to 250hPa. EOF analysis is employed on meridionally averaged z. c The power spectrum of the JJA WNPSH index (bars) and the 95% confidence intervals of the red-noise spectrum.

To better understand the dynamics responsible for WNPSH predictability and to meet the need for reliable S2S forecasts, this study addresses the following questions:

  • Can the internal dynamics of WNPSH be represented using a linear stochastic process?

  • What constitutes the optimal predictable pattern of WNPSH westward expansion?

  • What determines the predictability limit of daily WNPSH?

We will demonstrate that the convection over the Philippine/South China Sea and Japan are key precursors that maximize the growth of WNPSH. With the presence of the optimal pattern in the initial states, an additional window of forecast opportunities of approximately 20 days can be attained.

Results

Climatology of WNPSH

To address these questions, we begin by establishing a WNPSH index (IWNPSH). We apply empirical orthogonal function (EOF) analysis to the meridionally averaged eddy geopotential height within the domain of 20°N-30°N, 100°E-170°E, and from 1000hPa to 250hPa to reveal the dominant variability of the WNPSH. Using eddy geopotential height mitigates the impact of global warming on the total geopotential height, preventing the potentially unrealistic increases implied by hydrostatic balance. Unless specified otherwise, we employ eddy geopotential height (referred to as z) in the subsequent analysis. The leading mode, shown in Fig. 1b, accounts for approximately 37% of the total variance and exhibits an equivalent barotropic structure. It displays maximum amplitudes over 125°E-140°E and 1000hPa–600hPa. Being positioned to the west of the climatological center of the WNPSH (depicted by the green contour in Fig. 1a), EOF1 captures the variability associated with its westward extension (eastward retreat). The presence of a near-surface maximum provides evidence that EOF1 is dynamically driven by strong mid- and upper-troposphere subsidence. We also compared IWNPSH with three other indices documented in previous studies (Supplementary Figs. 1 and 2), which have been used to describe the intensity, area, and westward extension of the WNPSH. They are highly correlated at monthly timescales and moderately correlated at daily timescales. One reason is that many earlier indices are designed to capture WNPSH variability on monthly and longer timescales, while some daily geopotential height patterns are not well defined in these indices (see SI for details).
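As a minimal sketch of how an EOF-based index of this kind could be computed, the following applies a singular value decomposition to a synthetic anomaly array. The array shape, the function name `leading_eof`, and the synthetic data are assumptions made for illustration; area or mass weighting is omitted, and this is not the study's own code.

```python
import numpy as np

def leading_eof(anom):
    """Return (eof1, standardized pc1, explained fraction) for (time, space) data."""
    anom = anom - anom.mean(axis=0)           # remove the time mean at each point
    u, s, vt = np.linalg.svd(anom, full_matrices=False)
    explained = s**2 / np.sum(s**2)           # variance fraction of each mode
    pc1 = u[:, 0] * s[0]                      # time series of the leading mode
    eof1 = vt[0]                              # spatial pattern (unit length)
    return eof1, pc1 / pc1.std(), explained[0]

# Synthetic stand-in for meridionally averaged eddy height on a
# longitude-pressure section flattened to one spatial axis.
rng = np.random.default_rng(0)
n_time, n_space = 500, 60
pattern = np.sin(np.linspace(0.0, np.pi, n_space))
signal = rng.standard_normal(n_time)[:, None] * pattern[None, :]
z = 3.0 * signal + rng.standard_normal((n_time, n_space))

eof1, index, frac = leading_eof(z)
```

Here `index` plays the role of a standardized PC1. The sign of an EOF is arbitrary, so in practice it would be fixed by convention (for example, positive for westward extension).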

To depict the horizontal structure, we regress the geopotential height against the first principal component (PC, the time series of EOF1). The red contour in Fig. 1a displays the superposition of the EOF1 pattern with WNPSH climatology (including the zonal symmetric part, indicated by the green contour), specifically for the case of 2020 summer (June-August, JJA). The blue contour represents the total field. Here, we adopt the commonly used 5880-gpm (geopotential meter) line to visualize the western boundary of WNPSH, as it exhibits a high sensitivity to the WNPSH activity. It is noteworthy that the spatial collocation of the blue and red contours suggests that EOF1 serves as a reliable indicator of WNPSH activity. Henceforth, we will adopt the first principal component of WNPSH (PC1) as our subtropical high index (IWNPSH) for subsequent analyses. It’s important to observe that the design of IWNPSH is focused on capturing the primary variability in the westward expansion of WNPSH. A similar approach can be employed to identify its northward expansion.

The 2020 event is an exceptional and unprecedented case. To demonstrate this, Supplementary Fig. 3a presents IWNPSH for the period from 1979 to 2021. A notable feature of Supplementary Fig. 3a is that the summer of 2020 stands out as one of a few seasons with IWNPSH close to one standard deviation for almost the entire period. This distinct characteristic can be attributed to the influence of decaying El Niño (developing La Niña, yellow shading in Supplementary Fig. 3) conditions on SST. In boreal summer, the presence of decaying El Niño (developing La Niña) conditions, along with the seasonal southward shift of the westerly wind anomaly in the equatorial Pacific, causes the development of an anticyclonic anomaly over the western North Pacific and speeds up the termination of El Niño (see Stuecker et al. 2013, 201512,13). This anticyclone also favors convection over Japan and the Philippine/South China Sea, an environment of WNPSH optimal forcing, which leads to the westward extension of the WNPSH on transient timescales. More in-depth discussion is provided in later sections.

Regarding the entire time series spanning from 1979 to 2021, IWNPSH generally follows the red-noise spectrum, except for slightly higher variance within the 10-40 day periods compared to the background noise (Fig. 1c). This indicates the potential presence of oscillatory signals or non-modal transient growth. Non-modal linear dynamics typically exhibit a pattern of short-term growth followed by long-term decay, a consequence of the asymmetrical interactions among components in the dynamical system. For example, in ENSO dynamics, there exist asymmetrical interdependencies among system variables (e.g., how surface winds drive surface ocean currents, but not conversely26). The existence of non-modal growth usually leads to higher observed variance in the variable of interest compared to that of damped linear systems (i.e., red-noise). Meanwhile, the long-term decay causes the power spectrum to gradually taper towards the red-noise spectrum as we transition from high frequency to low frequency (e.g., bars in Fig. 1c). Given both red-noise and non-modal processes can be well modeled by LIM, it is intriguing to explore whether the fundamental mechanisms of the WNPSH can be effectively captured by linear stochastic dynamics.
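The pattern of short-term growth followed by long-term decay can be illustrated with a toy two-variable system. This is only a sketch: the matrix entries are hypothetical and chosen so that both eigenvalues are damped while one variable drives the other but not the reverse.

```python
import numpy as np

# Both modes are damped (eigenvalues 0.9 and 0.8), but component 2 drives
# component 1 and not the reverse: an asymmetric (non-normal) coupling.
G1 = np.array([[0.9, 1.0],
               [0.0, 0.8]])

x = np.array([0.0, 1.0])        # initial state projects only on the driver
norms = []
for _ in range(60):
    norms.append(np.linalg.norm(x))
    x = G1 @ x                  # one-step linear propagation

norms = np.array(norms)
peak_step = int(np.argmax(norms))   # step at which the amplitude is largest
```

Although every eigenmode decays, the state amplitude first grows above its initial value before decaying, which is the signature of non-modal transient growth. A system with symmetric (normal) coupling and the same eigenvalues would decay monotonically.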

Subseasonal Predictability of WNPSH and Optimal Predictable Pattern

In order to answer the first question in the introduction and comprehend the dynamics that govern the time evolution of WNPSH, we utilize a LIM to develop the dynamical operator. The prognostic variables considered in this model include 850hPa and 250hPa stream function (ψ) and velocity potential (χ) obtained from the domains of 10°N-60°N, 100°E-170°E for ψ and 20°S-30°N, 0°E-360°E for χ. The domain selection is optimized to capture the essential processes related to the WNPSH, notably Rossby wave dynamics. The analyses remain qualitatively consistent as long as the domain is sufficiently sized (see SI for further information). The model can be formulated as follows:

$${{{\bf{x}}}}({{{\boldsymbol{\tau }}}})={{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}{{{\bf{x}}}}(0)+{{{\boldsymbol{\epsilon }}}}$$
(1)

Here, x(0) and x(τ) correspond to the state vectors at t = 0 and t = τ, respectively. The linear dynamical operator is denoted by Gτ, and ϵ represents the stochastic forcing. To extract the dominant information from the variables of interest and minimize the computational effort, we employ the EOF analysis (Methods). The state vector x(τ) comprises time series (PCs, Methods) of various numbers of the leading modes for both ψ and χ, which collectively account for approximately 90% of the total variance. It’s important to note that the spatial structure in Equation (1) is disregarded since it does not vary with time. Equation (1) can be considered as a linearized, dimensionally reduced representation of a general circulation model (GCM).
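For illustration, the operator in Equation (1) can be estimated from data by least squares, which gives the lag-covariance form commonly used in the LIM literature, G_τ = C(τ)C(0)^−1. The sketch below assumes that form (the study's Methods are not reproduced in this section) and uses a synthetic two-variable series in place of the PCs of ψ and χ; `estimate_G` and `G_true` are hypothetical names.

```python
import numpy as np

def estimate_G(x, tau):
    """Least-squares propagator for x(t + tau) ~ G x(t); x is shaped (time, n)."""
    x = x - x.mean(axis=0)
    x0, xt = x[:-tau], x[tau:]
    c0 = x0.T @ x0 / len(x0)            # lag-0 covariance
    ctau = xt.T @ x0 / len(x0)          # lag-tau covariance
    return ctau @ np.linalg.inv(c0)

# Synthetic check: generate data from a known one-step operator plus white noise.
rng = np.random.default_rng(1)
G_true = np.array([[0.9, 0.3],
                   [0.0, 0.8]])
x = np.zeros((20000, 2))
for t in range(1, len(x)):
    x[t] = G_true @ x[t - 1] + rng.standard_normal(2)

G_hat = estimate_G(x, tau=1)            # should be close to G_true
```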

Following Equation (1), the forecast IWNPSH (or any field of interest) can further be estimated through a multivariate linear regression:

$${{{{\rm{I}}}}}_{{{{\rm{WNPSH}}}}}={{{\boldsymbol{\rho }}}}{{{\bf{x}}}}(\tau )$$
(2)

where x(τ) represents the forecast state vector from Equation (1), and ρ is a square matrix with the regression coefficients along the diagonal that maps the forecast states of ψ and χ to the WNPSH index (IWNPSH). To identify the optimal initial pattern that maximizes the westward extension of the WNPSH, we combine Equations (1) and (2) to obtain the following expression:

$${{{\boldsymbol{\rho }}}}{{{\bf{x}}}}({{{\boldsymbol{\tau }}}}){{{\bf{x}}}}{({{{\boldsymbol{\tau }}}})}^{T}{{{{\boldsymbol{\rho }}}}}^{T}{\left[{{{\bf{x}}}}({{{\boldsymbol{0}}}}){{{\bf{x}}}}{({{{\boldsymbol{0}}}})}^{T}\right]}^{-1}-{{{\boldsymbol{\lambda }}}}{{{\bf{p}}}}=0$$
(3)
$$\to \rho {{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}{{{\bf{x}}}}({{{\boldsymbol{0}}}}){{{\bf{x}}}}{({{{\boldsymbol{0}}}})}^{T}{{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}^{T}{\rho }^{T}{\left[{{{\bf{x}}}}({{{\boldsymbol{0}}}}){{{\bf{x}}}}{({{{\boldsymbol{0}}}})}^{T}\right]}^{-1}-{{{\boldsymbol{\lambda }}}}{{{\bf{p}}}}=0$$
(4)

In Equation (3), the first term on the left-hand side (LHS) can be interpreted as the amplification factor of IWNPSH (i.e., the ratio of forecast variance between the final and the initial states). By solving the eigenvalue problem of Equation (3) (or (4)) and arranging the eigenvalues (λ) in descending order, we obtain the first eigenvector (i.e., p1), which corresponds to the initial pattern (x(0)) or forcing (ϵ) that maximizes the forecast signals of the WNPSH at t = τ. The results based on Gτ=15 and x(0) = p1 are shown in Fig. 2.
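A small numerical sketch of this eigenvalue problem follows. It takes Equation (4) at face value, building the matrix ρG_τC(0)G_τ^Tρ^T C(0)^−1 with C(0) = x(0)x(0)^T and sorting its eigenpairs. The 2 × 2 values of `G`, `rho`, and `c0` are hypothetical stand-ins, not quantities from the study.

```python
import numpy as np

def optimal_pattern(G, rho, c0):
    """Solve the eigenproblem of Equation (4); return (sorted lambdas, leading p)."""
    forecast_cov = rho @ G @ c0 @ G.T @ rho.T     # rho x(tau) x(tau)^T rho^T
    A = forecast_cov @ np.linalg.inv(c0)
    lam, vec = np.linalg.eig(A)
    lam, vec = np.real(lam), np.real(vec)         # eigenvalues are real here
    order = np.argsort(lam)[::-1]                 # descending eigenvalues
    p1 = vec[:, order[0]]
    return lam[order], p1 / np.linalg.norm(p1)

G = np.array([[0.9, 1.0],
              [0.0, 0.8]])      # hypothetical propagator
rho = np.diag([1.0, 0.5])       # hypothetical regression weights
c0 = np.eye(2)                  # hypothetical initial covariance
lam, p1 = optimal_pattern(G, rho, c0)
```

In this toy case the leading eigenvalue exceeds 1 even though `G` is damped, so the pattern `p1` is amplified over the forecast interval while the second pattern is not.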

Fig. 2: The initial optimal pattern of WNPSH.

a The time-longitude Hovmoller diagram of the atmospheric response to the initial optimal forcing p1. Shading shows the eddy geopotential height anomaly averaged over 20°N-30°N (unit=m2s−2); contours show the velocity potential averaged over 0°N-30°N (dashed contours for negative values and solid contours for positive values, interval = 2 × 10^5 m2 s−1). Scatters and black contours mark regions passing the 5% significance level based on a binomial test (sign test). b The amplification factors, λ, of 16 different LIMs from Gτ=10 to Gτ=25. Box plots show the data range across the 16 models and the orange line represents the median values. Boxes are the interquartile range. Error bars represent 1.5 interquartile range from the median of the 16 models. Circle dots represent outliers that surpass 1.5 interquartile range. Details of the λ calculation are provided in the main text. c The composite pattern in 500hPa ω (scatters, red for ascending motion and blue for descending motion), SST anomaly (shading), and 850hPa rotational wind (vectors) when the time series of the initial optimal is greater than 1 standard deviation. The time series of the initial optimal is derived by projecting observed z onto the initial optimal pattern in z. Pressure velocity with an amplitude smaller than 0.01hPa s−1 and wind with an amplitude smaller than 0.5m s−1 are omitted. In addition, only fields that pass a bootstrapping test at the 5% significance level are visualized in panel c (see SI for details).

Figure 2a shows the Hovmoller diagram of the model response in 850hPa eddy geopotential height (shading) and 250hPa velocity potential (contours) to the initial optimal forcing as a function of integration time and longitude. Specifically, x(0) = p1 is prescribed as the model initial state and integrated forward in time (details of the numerical integration are provided in Methods). The forecast eddy geopotential height is then averaged over 20°N-30°N to track the westward extension of the WNPSH. The contours represent χ, which is averaged over 0°N-30°N and serves as a proxy for convection. From Fig. 2a, it is evident that at t = 0, there is a dipole pattern of z spatially collocated with χ, suggesting a coupling between the initial signals of the subtropical high and convection. Negative χ (dashed contour) reflects the active convection over the Philippine/South China Sea (see the later description of Fig. 2c). As the forecast lead time increases, a westward-propagating signal of positive geopotential height, similar to the observed westward extension of the WNPSH, becomes apparent. It should be noted that the optimal pattern (initial forcing) is only provided at t = 0; no ψ or χ is prescribed at t > 0. As a result, the simulated evolution is entirely governed by the internal dynamics of Gτ or, alternatively, can be seen as the response of this dynamical system to the initial forcing. This characteristic is also apparent in the lag-lead relationship illustrated in Supplementary Fig. 6. The figure shows the lag correlation between the time series of the initial optimal and IWNPSH, where a positive lag and positive correlation coefficients indicate that the initial optimal leads the development of the WNPSH. Notably, Supplementary Fig. 6 reveals a heightened and statistically significant correlation within the τ = 0−20 range, providing further support for the behavior seen in the LIM. Based on Fig. 2a and Supplementary Fig. 6, it is suggested that the mechanisms determining the westward extension of the WNPSH can be effectively simulated using a simple linear GCM.

The westward propagation of the initial optimal pattern can be explained through planetary Rossby wave dynamics. In a steady-state high pressure characterized by mid-troposphere subsidence, it generally follows the Sverdrup vorticity balance, where the planetary vorticity advection is balanced by the vortex stretching:

$$\frac{d\zeta }{dt}\approx -\beta v+f\frac{\partial \omega }{\partial p}\,\approx \,0$$
(5)

Here, β represents the gradient of planetary vorticity f, v represents the meridional wind, and \(\frac{\partial \omega }{\partial p}\) is the vertical differentiation of pressure velocity ω. Over regions with subsidence aloft, low-level northerly flow is the dominant feature, such as the eastern flank of the semi-permanent anticyclone over coastal California, as discussed in Rodwell and Hoskins (2001)14. However, on transient timescales (i.e., westward-extending WNPSH), the propagation is mostly dominated by the planetary vorticity advection (i.e., \(\frac{d\zeta }{dt}\approx -\beta v\)), where the southerly flow over the western flank of WNPSH brings low planetary vorticity from lower latitudes, ultimately leading to the westward extension of anticyclone18. Both features are evident in Fig. 2c, which presents the composite pattern in 850hPa rotational wind (vector), 500hPa ω (dots), and SST (shading) when the time series of the optimal initial pattern p1 exceeds 1 standard deviation. Over regions with maximum descending (blue scatters) and ascending (red scatters) motion aloft, we can observe surface northerly and southerly respectively suggesting the main balance happens between vortex stretching and planetary vorticity advection. Around the region of 25°N and 140°E, southerly flow exists at the nodal region between ascending and descending centers (regions with small mid-troposphere ω and \(f\frac{\partial \omega }{\partial p} \sim 0\)) indicating the dominant role of planetary vorticity advection in the extension of the subtropical high18. The southerly is a balanced flow of the dipole pressure anomaly in the initial optimal (see Fig. 2a), which originates from the convection over southern Japan and the regions of the Philippine/South China Sea. The results in Fig. 2a and c suggest the convection over these two regions is an important indicator for the following development of WNPSH around 125°E. 
It is also worth noting that our study does not exclude the possibility of the feedback processes between WNPSH and the adjacent convection as suggested by previous research18. The focus on the initial optimal forcing helps us to isolate the roles between the force and the response and identify the most predictable feature.
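To give a sense of magnitudes in the Sverdrup balance of Equation (5), the sketch below solves βv = f ∂ω/∂p for v at 25°N. The subsidence profile is a hypothetical round number chosen for illustration, not a value taken from the study.

```python
import numpy as np

OMEGA = 7.292e-5      # Earth's rotation rate (s^-1)
RADIUS = 6.371e6      # Earth's radius (m)

def sverdrup_v(lat_deg, domega_dp):
    """Meridional wind (m/s) from the balance beta * v = f * d(omega)/dp."""
    lat = np.deg2rad(lat_deg)
    f = 2.0 * OMEGA * np.sin(lat)               # planetary vorticity
    beta = 2.0 * OMEGA * np.cos(lat) / RADIUS   # its meridional gradient
    return f * domega_dp / beta

# Hypothetical subsidence: omega = +0.05 Pa/s at 500 hPa and 0 at 1000 hPa,
# so d(omega)/dp = (0 - 0.05) / (100000 - 50000) = -1e-6 s^-1 below 500 hPa.
v = sverdrup_v(25.0, (0.0 - 0.05) / (1000e2 - 500e2))
```

With these numbers `v` is negative, roughly 3 m/s of northerly flow, which matches the sign argument in the text: subsidence aloft is balanced by low-level northerlies, and ascent by southerlies.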

Upon examining Fig. 2a, we can observe that the entire process of westward extension lasts approximately 20 days, indicating a potential forecasting opportunity for WNPSH. This observation gains further support when we illustrate the amplification factor of λ as a function of integration time. An amplification factor, λ, is defined as the ratio between the variance in forecast states to the initial states (Equation (4)). In Fig. 2b, the box plots represent the range of λ values obtained from multiple LIMs (i.e., Gτ), which were estimated by varying τ values in Equation (1) (and also in Equation (9)), and the solid line represents the median of all models. This setup allows us to test the sensitivity of λ calculated from different lag relations in the data (also see discussion in Methods). Specifically, we first calculate G1 for each individual model (Gτ) with various τ ranging from τ = 10 to τ = 25 (16 models in total, see Methods for details). G1 is the dynamical operator for a single-step integration. Next, we perform forward integration using the operator \({{{{\bf{G}}}}}_{{{{\bf{N}}}}}^{{\prime} }=\mathop{\prod }\nolimits_{i = 1}^{N}{{{{\bf{G}}}}}_{{{{\bf{1}}}}}\), where N represents the number of integration time steps. By substituting \({{{{\bf{G}}}}}_{{{{\bf{N}}}}}^{{\prime} }\) back into Gτ in Equation (4), we can estimate the amplification factor for each model and for a chosen N.
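The procedure described above can be sketched as follows. Because the Methods are not part of this section, the sketch assumes that the single-step operator is the principal τ-th root of G_τ, obtained by eigendecomposition; the helper names and the 2 × 2 matrices are hypothetical.

```python
import numpy as np

def one_step_operator(G_tau, tau):
    """Principal tau-th root of G_tau via eigendecomposition (assumed method)."""
    lam, vec = np.linalg.eig(G_tau)
    root = vec @ np.diag(lam.astype(complex) ** (1.0 / tau)) @ np.linalg.inv(vec)
    return np.real(root)

def amplification(G1, rho, c0, n_steps):
    """Leading eigenvalue of Equation (4) with G_tau replaced by G1**n_steps."""
    GN = np.linalg.matrix_power(G1, n_steps)
    A = rho @ GN @ c0 @ GN.T @ rho.T @ np.linalg.inv(c0)
    return np.max(np.real(np.linalg.eigvals(A)))

G1_true = np.array([[0.9, 1.0],
                    [0.0, 0.8]])
G_tau = np.linalg.matrix_power(G1_true, 10)    # stand-in for a tau = 10 model
G1 = one_step_operator(G_tau, tau=10)          # recovers the one-step operator
rho, c0 = np.eye(2), np.eye(2)
lam_curve = [amplification(G1, rho, c0, n) for n in range(1, 41)]
```

Repeating this for each τ from 10 to 25 would give one `lam_curve` per model, which is the kind of spread summarized by box plots. In the toy system the curve rises above 1 at short leads and then falls below 1, mirroring the qualitative shape described for Fig. 2b.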

By inspecting Fig. 2b, one can find the forecast variance of IWNPSH has maximum amplification around 5 to 15 days before the signals dip below the climatological variance level (represented by a gray dashed line). This characteristic remains consistent across 16 different models. Figure 2b highlights an additional potential for forecasting on subseasonal timescales, facilitated by the presence of initial optimal forcing. Conversely, if red noise (modal) is the only process taken into account, the forecast signals will merely attenuate and stay beneath the climatological variance.

While Fig. 2 illustrates a relatively simple setup, it’s important to note that optimal forcing can occur not only during the initial states but also at any point in time. Given the linearity of the model, the total response will be the accumulation of each individual response to the forcing at different lead times (i.e., Green’s function). Along with the transient growth, a plateau of cumulative response can be anticipated if a model is frequently forced by optimal patterns. In the next section, we will highlight the importance of the cumulative effect of optimal forcing of the record-breaking 2020 event.
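The superposition argument can be checked numerically with a toy linear system. In this sketch the operator and forcing pattern are hypothetical; the point is only that the response to repeated forcing equals the sum of time-shifted responses to a single pulse.

```python
import numpy as np

G1 = np.array([[0.9, 1.0],
               [0.0, 0.8]])
forcing = np.array([0.0, 1.0])      # hypothetical "optimal" forcing pattern

def response(n_steps, forced_steps):
    """Integrate x(t+1) = G1 x(t) + forcing(t), forcing only at forced_steps."""
    x = np.zeros(2)
    out = []
    for t in range(n_steps):
        x = G1 @ x + (forcing if t in forced_steps else 0.0)
        out.append(x[0])            # track the responding component
    return np.array(out)

single = response(80, {0})                  # one pulse at t = 0
repeated = response(80, set(range(40)))     # a pulse at every step for 40 steps

# Linearity: the repeated-forcing response is the sum of shifted single responses.
superposed = np.zeros(80)
for t0 in range(40):
    superposed[t0:] += single[:80 - t0]
```

While the forcing persists, `repeated` climbs toward a plateau far above the peak of `single`, and it decays only after the forcing stops.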

A record-breaking event–2020 subtropical high

The summer of 2020 holds a special significance for WNPSH due to its strong and enduring characteristics (Supplementary Fig. 3a). According to previous literature9,10,12,31, the enhancement of WNPSH on seasonal and longer timescales can be primarily attributed to two factors: first, the increase in thermal contrasts between land and ocean, and second, the atmospheric teleconnected response to ENSO transition states (from El Niño to La Niña). The enhanced thermal contrast between land and ocean in boreal summer has been widely documented as the possible driver of the long-term trend in WNPSH expansion9,20. While the uncertainty exists due to the model bias in tropical SST and the metric used for quantification22,23, the bias-corrected result based on emergent constraints20 suggests the strengthening in WNPSH across GCMs. A zonally asymmetric heating pattern, with land temperature undergoing a more pronounced increase compared to the ocean, can generate an anomalous anticyclone to the west of WNPSH. This feature becomes particularly evident in the lower troposphere, where the heating achieves its strongest amplitude9.

Apart from the long-term trend, WNPSH’s behavior is also influenced by tropical SST variability, i.e., ENSO. Specifically, during the spring and early summer of a decaying El Niño (developing La Niña) year, an anticyclonic circulation is observed over the Philippine Sea and the western North Pacific (around 15°N, 150°E, see Fig. 1b in Stuecker et al. (2015)13). Several mechanisms have been proposed to explain this occurrence, including the nonlinear interaction between the seasonal cycle and ENSO13, the atmospheric response to the delayed warming of the Indian Ocean32,33 and Western Pacific cooling34, and the interaction between the Asian monsoon and ENSO35. While the above explanations mainly focus on the quasi-stationary response, typically in terms of monthly or seasonal averages, we will introduce an alternative perspective by examining the transient dynamics of WNPSH, which can form an additional feedback loop and amplify the existing anticyclonic circulation.

According to the linear stochastic dynamics and the prior simulation, it becomes clear that the atmospheric response exhibits a much longer memory (i.e., \({{{\bf{x}}}}({{{\boldsymbol{\tau }}}}){{{\bf{x}}}}{({{{\boldsymbol{\tau }}}})}^{T}{({{{\bf{x}}}}({{{\boldsymbol{0}}}}){{{\bf{x}}}}{({{{\boldsymbol{0}}}})}^{T})}^{-1} > 1\) for a specific range of τ) in comparison to the stochastic forcing that drives the response. On the contrary, stochastic forcing decorrelates between any two different time steps due to its short memory, i.e.,

$${{{{\boldsymbol{\epsilon }}}}}_{{{{\bf{t}}}}}{{{{\boldsymbol{\epsilon }}}}}_{{{{\bf{t}}}}+{{{\boldsymbol{\tau }}}}}^{{{{\bf{T}}}}}={{{{\boldsymbol{\epsilon }}}}}_{{{{\bf{t}}}}}{{{{\boldsymbol{\epsilon }}}}}_{{{{\bf{t}}}}}^{{{{\bf{T}}}}}{{{{\boldsymbol{\delta }}}}}_{{{{\bf{t}}}},{{{\bf{t}}}}+{{{\boldsymbol{\tau }}}}}\,{{{\rm{where}}}}\,{{{{\boldsymbol{\delta }}}}}_{{{{\bf{t}}}},{{{\bf{t}}}}+{{{\boldsymbol{\tau }}}}}=\left\{\begin{array}{ll}{{{\bf{I}}}}\quad &{{{\rm{if}}}}\,\tau =0\\ 0\quad &{{{\rm{otherwise}}}}\end{array}\right.$$
(6)

Here, \({{{{\boldsymbol{\epsilon }}}}}_{{{{\bf{t}}}}}{{{{\boldsymbol{\epsilon }}}}}_{{{{\bf{t}}}}+{{{\boldsymbol{\tau }}}}}^{{{{\bf{T}}}}}\) represents the noise covariance (as per Equation (10)) between time steps t and t + τ, and δt,t+τ denotes the Kronecker delta. However, if a system is consistently modulated by low-frequency variability with a pattern akin to the optimal forcing, the non-modal dynamics can further magnify the existing WNPSH and sustain its amplitude. Consequently, an extended and cumulative response in the WNPSH can be anticipated.

The anticyclonic circulation during the decaying El Niño (developing La Niña) spring and early summer enhances convection over Japan and the Philippine/South China Sea as well as subsidence over Northwestern Pacific, aligning spatially with the horseshoe shape of the optimal pattern (see ω in Figs. 2c and 1b in Stuecker et al. (2015)13). Consequently, the likelihood of observing the initial optimal of WNPSH is also increased during developing La Niña years. To support the aforementioned claim, Fig. 3 depicts the JJA time series of IWNPSH (y-axis) and the initial optimal (x-axis) for 2019, 2020, and 2021 (arrows and lines). These years coincidentally represent various ENSO phases. Specifically, 2019 corresponds to a weak El Niño event, 2020 is a developing year towards La Niña condition, and 2021 represents the second La Niña year. In 2019, the initial optimal’s amplitude remains relatively small, staying generally below 1 standard deviation. With this modest amplitude, the transient growth of WNPSH is less pronounced. In contrast, throughout the 2020 JJA period, the optimal pattern consistently stays close to 1 standard deviation, indicating a sustained impact from low-frequency variability. Consequently, the cumulative influence of the transient growth prevents WNPSH decay, leading to an observed IWNPSH plateau. In 2021, the WNPSH undergoes intraseasonal variability, completing two cycles within JJA. Differing from 2020, the average IWNPSH for 2021 JJA is around -0.14 (0.8 for 2020), suggesting limited low-frequency modulation on seasonal and longer timescales. The difference among these 3 years suggests the role of background low-frequency variability in modulating the presence of initial optimal while the cumulative response to the initial optimal maintains the amplitude of WNPSH. The Hovmoller diagrams presented in Supplementary Fig. 7 offer an alternative perspective on the preceding discussion. 
In 2020, a persistent dipole of vertical velocity (scatters) consistently appeared in the Western Pacific (indicated by two vertical dashed lines). Conversely, 2019 and 2021 exhibit more transient signals. The accompanying time series on the right panels further underscores the intimate connection between the initial optimal state and the subsequent evolution of WNPSH.

Fig. 3: The phase relation of subtropical high and its initial optimal.

The time evolution (lines) of initial optimal and IWNPSH for years of a 2019, b 2020, and c 2021. The shading shows the probability density function of all historical events. The dashed circle represents the boundary of 1 standard deviation.

To further support the statement above, Supplementary Fig. 3b shows the entire time series of the initial optimal. It becomes clear that some of the ENSO transition years (yellow shading in Supplementary Fig. 3) with the strongest WNPSH are also characterized by a sustained amplitude of the initial optimal, such as 1980, 1995, and 2010. Supplementary Figs. 8 and 9 provide a closer look at these years with persistent optimal signals, revealing behavior similar to that of 2020. A common feature in Fig. 3 and Supplementary Fig. 9 is the counterclockwise rotation of the trajectory, indicating that the optimal forcing consistently leads WNPSH development. This is further supported by the round shape of the climatological probability density function (shading).

The question remains as to why the dipole pattern in 2020 is not as clear as that in 2021 but still results in one of the strongest WNPSH events on record (Supplementary Fig. 7). This can also be explained by the presence or absence of low-frequency forcing. In cases without strong external forcing, such as 2021, a complete cycle may be observed in which the initial optimal leads to the growth of the WNPSH, and an expanded WNPSH suppresses convection to the west. Conversely, under external forcing, such as the SST cooling in the Northwestern Pacific34 and the persistent Meiyu front over Japan during the decaying phase of El Niño, sustained subsidence in response to the forcing materializes at the eastern center. The low-frequency subsidence consistently forces one side of the dipole, creating conditions favoring the initial optimal. In such cases, a sequence of westward-propagating high-pressure anomalies is likely to be observed. Therefore, before one event reaches its maximum phase, a second optimal already starts to propagate westward due to the externally forced eastern subsidence. The 2010 subtropical high is another case similar to 2020 (Supplementary Fig. 8).

To better understand whether convection over the Philippine/South China Sea and Japan is a key indicator for the subsequent WNPSH development, we conduct a composite analysis of 850hPa vorticity over the parameter space spanned by the convection indices of Japan and the Philippine/South China Sea. Specifically, we first derive the mean vertical motion across two domains: 0°N-30°N, 110°E-125°E for the South China Sea to the Philippine Sea, and 30°N-40°N, 125°E-160°E for Japan, with each region examined independently (i.e., black boxes in Supplementary Fig. 10a and 10d). The standardized time series of domain-averaged vertical motion are then employed as the convection indices. The historical events are categorized into bins based on these convection indices, with intervals of 0.2 standard deviations. Subsequently, pattern correlations are calculated between the initial optimal vorticity (shading in Supplementary Fig. 10b) and the averaged 850hPa vorticity pattern within each bin. The result is presented in Supplementary Fig. 10c.
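The binning and pattern-correlation steps above can be sketched as follows. The function names, the ±2 standard deviation range, and the synthetic fields are assumptions for illustration; only the 0.2 standard deviation bin width comes from the text.

```python
import numpy as np

def pattern_corr(a, b):
    """Centered pattern correlation between two flattened fields."""
    a, b = a.ravel() - a.mean(), b.ravel() - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def binned_pattern_corr(idx_a, idx_b, fields, target, width=0.2, lim=2.0):
    """Composite `fields` in 2-D bins of two standardized indices and
    correlate each composite with `target`. Empty bins are NaN."""
    edges = np.arange(-lim, lim + width / 2, width)
    ia = np.digitize(idx_a, edges) - 1      # bin number along the first index
    ib = np.digitize(idx_b, edges) - 1      # bin number along the second index
    n = len(edges) - 1
    out = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(n):
            sel = (ia == i) & (ib == j)
            if sel.any():
                out[i, j] = pattern_corr(fields[sel].mean(axis=0), target)
    return out

# Synthetic events: the target pattern scales with the sum of the two indices.
rng = np.random.default_rng(2)
n_events, n_grid = 5000, 50
target = np.sin(np.linspace(0.0, 2.0 * np.pi, n_grid))   # stand-in optimal pattern
idx_japan = rng.standard_normal(n_events)
idx_scs = rng.standard_normal(n_events)
noise = rng.standard_normal((n_events, n_grid))
fields = (idx_japan + idx_scs)[:, None] * target + noise
corr_map = binned_pattern_corr(idx_japan, idx_scs, fields, target)
```

By construction, `corr_map` is positive where both indices are positive and negative where both are negative, so the synthetic case reproduces the lower-left to upper-right gradient that such a diagram is designed to reveal.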

In Supplementary Fig. 10c, a clear pattern emerges where the correlation coefficient progressively increases as we move from the lower-left quadrant to the upper-right quadrant. This progression centers around a value of 0 at the origin. This result implies that the pattern demonstrates a greater resemblance to the initial optimal pattern when active convection is concurrently present in both Japan and the Philippine/South China Sea domains. To lend further support to this observation, we aggregate cases involving the real-time initial optimal pattern and IWNPSH with amplitudes exceeding 1.5 or dipping below -1.5 standard deviations onto the parameter space. In Supplementary Fig. 10c, it is evident that the circled dots (representing the initial optimal pattern) are distinctly spaced apart from each other, while the squared dots (IWNPSH) cluster within the range of uncertainty. This separation among the circled dots emphasizes the connection between convection in these two domains and the initial optimal pattern. On the other hand, the cluster of squared dots arises from the delayed response of WNPSH, where the initial optimal emerges before WNPSH is fully developed.

Experimental S2S Forecasts of WNPSH

While the non-modal dynamics contribute to understanding the unprecedented event in 2020, their implications for the subseasonal forecast of WNPSH have not been explored. To address this gap, we conducted experimental forecasts, categorizing the data into two groups: (1) initial optimal states with amplitudes exceeding 1.5 standard deviations, and (2) initial optimal states with amplitudes smaller than 0.5 standard deviations. The forecast target is the eddy geopotential height averaged over 20°N-30°N, as indicated in the domain of Fig. 1b. In Fig. 4, the blue and pink shadings represent forecast skills (pattern correlations) for strong and weak initial optimal cases, respectively. The solid line denotes where the skills in strong cases significantly surpass those in weak cases at a 5% significance level, as determined by a t-test. From Fig. 4a, it is evident that when the model is initialized at strong initial optimal states, the forecast skills generally outperform those initialized in weak initial optimal states for the first 20 days of forecast lead.

Fig. 4: Experimental predictions of averaged eddy geopotential height between 20°N and 30°N, as specified in the Fig. 1b domain.

Forecast skill is assessed using pattern correlation. a Forecast skill for strong initial optimal cases (amplitude > 1.5 standard deviations, shown in blue) and weak cases (amplitude < 0.5 standard deviations, shown in pink). b Same as a but for positive initial optimal cases (initial optimal > 1.5). c Same as a but for negative initial optimal cases (initial optimal < −1.5). The black line denotes lead times at which the forecast skill is significantly higher for the strong initial optimal cases at the 5% significance level according to a one-tailed t-test.

If we further categorize strong cases into two groups: (1) initial optimal states greater than 1.5 standard deviations and (2) initial optimal states smaller than -1.5 standard deviations, it becomes apparent that the heightened prediction skills around 10-30 days of lead time stem from the positive initial optimal cases, indicating the development of a subtropical high anomaly rather than a low anomaly. This suggests the asymmetric nature of WNPSH development in the real world. While linear dynamics do not incorporate this asymmetry, further investigation is needed to diagnose the source of the asymmetric response in future studies. In summary, Fig. 4 highlights that the transient growth of WNPSH can amplify signals from the initial states, leading to improved forecast skills on subseasonal timescales.

Discussion

While the mechanisms governing the seasonal to long-term trends in WNPSH variability have been extensively examined in prior literature9,10,11,12,13,19,21,23, studies concerning its subseasonal predictability remain in their nascent stages due to intricate multi-scale interactions. In this study, we investigate the initial optimal state that leads to the maximum growth of WNPSH on subseasonal timescales, utilizing linear inverse modeling. The proposed mechanism is summarized in Fig. 5.

Fig. 5: A schematic diagram of the non-modal growth mechanisms.

a At the optimal initial state, the convection over the Philippine/South China Sea and Japan is indicated by the cloud. The associated overturning circulation is indicated by two curved arrows. The shading shows the SST pattern and vectors represent 850-hPa rotational wind. The solid contour shows the 5880-gpm contour, which serves as a proxy of the WNPSH western boundary. The two cross sections of vertical motion are indicated by the two arrows (red and blue). b Similar to a except for the fully developed WNPSH.

During the phase of the initial optimal state, convective activity over the Philippine/South China Sea and Japan, along with the subsidence east of 150°E, forms a dipole vorticity pattern. Within the convective region, the stretching (squeezing) of the vortex is counteracted by northerly-induced (southerly-induced) planetary vorticity advection, a balance recognized as the Sverdrup vorticity balance. A comparable condition, albeit with an opposite sign, occurs around 120°E. Meanwhile, in regions situated between the maxima of mid-tropospheric ascent and descent (approximately 130°E-140°E), the dominance of planetary vorticity advection by the geostrophic southerly (also referred to as the β effect) ultimately expands the western boundary of WNPSH. Interestingly, the development of WNPSH also terminates its initial optimal (Fig. 5b). During its westward expansion (accompanied by slight poleward expansion), the strong mid-tropospheric subsidence inhibits the development of convection. Without the convection shown in the initial optimal, the system is dynamically damped and the non-modal growth vanishes. This process ultimately terminates the growth of WNPSH.

It is worth noting that the forecast benefit from transient growth lasts for approximately 20 days. However, this does not imply that the predictability limit of daily WNPSH is confined to 20 days. Instead, when a system remains consistently influenced by low-frequency climate variability, which facilitates the presence of the initial optimal state, transient growth can reinforce the observed WNPSH signals. The record-breaking event in 2020 exemplifies this scenario: the atmospheric response to the ENSO transition phase modulated the occurrence of the initial optimal state, leading to a prolonged WNPSH period. In other words, there is no clear boundary of WNPSH predictability unless all low-frequency variabilities responsible for the initial optimal are identified. It is also important to acknowledge that the mechanism proposed in this study does not contradict previous research, which primarily focused on the post-El Niño strong WNPSH. Instead, our study aims to bridge the gap by offering additional explanations for the transient variability in the westward extension of WNPSH, an aspect that cannot be solely explained by lower-boundary forcing, such as SST.

The insights gained from this study also raise intriguing questions for future research. For instance, considering that various modes of climate variability such as the boreal summer intraseasonal oscillation8, the Pacific-Japan pattern15, and the Silk-road pattern impact WNPSH intensity36, can corresponding changes be observed in the initial optimal state? Moreover, how does the non-modal growth of WNPSH differ between these various processes? All of these processes and their interaction with the initial optimal deserve further exploration.

Another critical implication of this study pertains to the future projections and the persistence of WNPSH. The fluctuation-dissipation theorem (as stated in Equation (10)) suggests that if a system’s internal timescale is overestimated, then the system’s response to external forcing (e.g., radiative forcing) is overestimated by the same factor37,38. Consequently, a GCM with an overestimated transient growth of WNPSH will exhibit biases in its response to greenhouse gas radiative forcing, magnified by the same factor. It is therefore of interest to explore the mechanisms responsible for both the over-persistent and under-persistent behaviors of WNPSH across different climate models.
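As a minimal illustration of this proportionality, assume a scalar, noise-averaged version of Equation (7) with a damping timescale T (so that L = −1/T) and a constant forcing f. The symbols T, f, and the factor α below are introduced here for illustration only and are not part of the original analysis:

$$\frac{d\langle x\rangle }{dt}=-\frac{\langle x\rangle }{T}+f\quad \Rightarrow \quad {\langle x\rangle }_{{\rm{eq}}}=fT$$

Under this assumption, a model whose timescale is αT instead of T yields an equilibrium response αfT, i.e., a response biased by the same factor α. In the multivariate case the analogous steady response is \(-{{\bf{L}}}^{-1}{\bf{f}}\).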

Methods

Reanalysis and SPEAR Data

Reanalysis data (1979–2021) from the European Centre for Medium-Range Weather Forecasts (ERA5, Hersbach et al. 202039) are used as an observational reference and for the development of dynamical-statistical models. Variables used in the model include 850hPa and 250hPa stream function ψ and velocity potential χ. To extract the dominant information in the data and reduce the computational cost, we employ EOF analysis within specific domains: 10°N-60°N, 100°E-170°E for ψ and 20°S-30°N, 180°W-180°E for χ, before proceeding with model development. This is achieved by finding the eigenvectors of the covariance matrices \(\chi {\chi }^{T}\) and \(\psi {\psi }^{T}\), where both ψ and χ have dimensions of grid × time. Other variables, such as geopotential height from 1000hPa to 250hPa, 500hPa pressure velocity ω, SST, and 850hPa rotational wind, are not directly forecasted in the model but retrieved as diagnostic variables from ψ and χ using multivariate linear regressions or composite analysis. To focus on the subseasonal predictability of WNPSH, we remove the first three harmonics of the seasonal climatology and apply a 10-day low-pass filter to every variable used in this study. Given the strong memory of SST, we further remove its linear trend to ensure that the long-term climatology of the forecast state tapers off toward 0.
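The removal of the smoothed seasonal cycle can be sketched as below. This is a hedged illustration only: the function names are hypothetical, a 365-day calendar (leap days dropped) is assumed, the annual mean is assumed to be removed together with the first three harmonics, and the 10-day low-pass filter and SST detrending are not included.

```python
import numpy as np

def smooth_climatology(daily_clim, n_harmonics=3):
    """Keep the annual mean plus the first n harmonics of a (365, ...) climatology."""
    spec = np.fft.rfft(daily_clim, axis=0)
    spec[n_harmonics + 1:] = 0.0            # discard all higher harmonics
    return np.fft.irfft(spec, n=daily_clim.shape[0], axis=0)

def remove_seasonal_cycle(data, day_of_year, n_harmonics=3):
    """data: (time, ...) daily values; day_of_year: integers 0..364 per time step.
    Returns anomalies relative to the smoothed daily climatology."""
    clim = np.stack([data[day_of_year == d].mean(axis=0) for d in range(365)])
    return data - smooth_climatology(clim, n_harmonics)[day_of_year]
```

With this choice, periodic signals at harmonics higher than the third remain in the anomalies and would have to be handled by the subsequent low-pass filter.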

Linear Inverse Model

A LIM uses coarse-grained variables to develop a dynamical operator and approximate the underlying linear dynamics for the system of interest, i.e.,

$$\frac{d{{{\bf{x}}}}}{dt}={{{\bf{Lx}}}}+{{{\boldsymbol{\xi }}}}$$
(7)

where x is the state vector, which contains stream function ψ and velocity potential χ from 850hPa and 250hPa. L is a linear dynamical operator, which describes how the time rate of change of each variable in the state vector is governed by the interactions among state variables. To reduce the data dimension, we employ EOF analysis on ψ and χ (see the previous section for detailed descriptions). The PCs of the leading modes are used as the prognostic variables in Equation (7) rather than the raw data. The truncated fields retain a minimum of 90% variance for each variable, which corresponds to the variance accounted for by the first 12, 15, 10, and 11 leading modes of 850hPa ψ, 250hPa ψ, 850hPa χ, and 250hPa χ, respectively.
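The EOF truncation described above can be sketched as follows. The function name `eof_truncate` is hypothetical, the input is assumed to be anomalies with shape grid × time, and no area weighting is applied in this sketch. The left singular vectors of the data matrix are the eigenvectors of its covariance matrix, so an SVD is used in place of an explicit eigen-decomposition.

```python
import numpy as np

def eof_truncate(field, min_variance=0.90):
    """field: (grid, time) anomalies.
    Returns EOFs (grid, k), PCs (k, time) and the explained-variance fractions,
    with k the smallest number of modes retaining at least `min_variance`."""
    u, s, vt = np.linalg.svd(field, full_matrices=False)
    frac = s**2 / np.sum(s**2)                       # variance fraction per mode
    k = int(np.searchsorted(np.cumsum(frac), min_variance) + 1)
    eofs = u[:, :k]                                  # spatial patterns
    pcs = s[:k, None] * vt[:k]                       # principal-component time series
    return eofs, pcs, frac[:k]
```

The truncated field can be recovered as `eofs @ pcs`, and the rows of `pcs` for each variable would be stacked to form the state vector x.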

An alternative but more widely used form of Equation (7) can be written as:

$$\begin{array}{rcl}{{{\bf{x}}}}(\tau )&=&{{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}{{{\bf{x}}}}(0)+{{{\boldsymbol{\epsilon }}}}\\ &=&{e}^{{{{\bf{L}}}}\tau }{{{\bf{x}}}}(0)+{{{\boldsymbol{\epsilon }}}}\end{array}$$
(8)

Equation (8) can be considered as an integral form of Equation (7), where \({{\bf{G}}}_{\tau }\) and L are linked to each other through \({e}^{{\bf{L}}\tau }\) (second line of Equation (8)). ϵ is the random white noise. The dynamical operator \({{\bf{G}}}_{\tau }\) can be derived by

$${{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}={{{{\bf{C}}}}}_{{{{\boldsymbol{\tau }}}}}{{{{\bf{C}}}}}_{{{{\bf{0}}}}}^{-1}$$
(9)

where \({{\bf{C}}}_{\tau }\) is defined as the covariance matrix between x(τ) and x(0), i.e., \({{\bf{C}}}_{\tau }={\bf{x}}(\tau ){\bf{x}}{(0)}^{T}\), and \({{\bf{C}}}_{0}\) is the covariance matrix of x(0) itself, i.e., \({{\bf{C}}}_{0}={\bf{x}}(0){\bf{x}}{(0)}^{T}\). The random white noise ϵ can be found via the fluctuation-dissipation relation27, i.e.,

$${{{\boldsymbol{\epsilon }}}}{{{{\boldsymbol{\epsilon }}}}}^{T}\,\approx -\,[{{{\bf{L}}}}{{{\bf{x}}}}({{{\boldsymbol{0}}}}){{{\bf{x}}}}{({{{\boldsymbol{0}}}})}^{T}+{{{\bf{x}}}}({{{\boldsymbol{0}}}}){{{\bf{x}}}}{({{{\boldsymbol{0}}}})}^{T}{{{{\bf{L}}}}}^{T}]\tau +O({{{{\bf{L}}}}}^{2}\tau )$$
(10)

One should note that Equation (10) only holds when τ is much smaller than the characteristic timescales of each component in x. In our study, the time step for integration is 1 day, while the variables used in x have their high-frequency signals removed (10-day low-pass filtered). Therefore, the assumption in Equation (10) is reasonably well satisfied. To further test the appropriateness of modeling WNPSH with LIM, we apply a goodness-of-fit test to the forecast residual in Equation (8). The results shown in Supplementary Fig. 11 suggest that the residual generally follows Gaussian distributions over a range of τ (from 1 to 25) according to a two-sided K-S test at a 10% significance level.
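Equations (9) and (10) can be sketched numerically as below. This is an illustrative sketch under stated assumptions rather than the implementation used in the study: the function name `lim_operators` is hypothetical, x is assumed to hold zero-mean PCs with shape state × time, covariances are estimated as time averages, and the matrix logarithm is evaluated through an eigen-decomposition, which assumes the propagator is diagonalizable.

```python
import numpy as np

def lim_operators(x, tau):
    """Estimate G_tau, L and the noise covariance from x of shape (n_state, n_time)."""
    x0, xt = x[:, :-tau], x[:, tau:]            # paired samples x(0) and x(tau)
    n = x0.shape[1]
    c0 = x0 @ x0.T / n                          # C_0
    c_tau = xt @ x0.T / n                       # C_tau
    g_tau = np.linalg.solve(c0, c_tau.T).T      # C_tau C_0^{-1} (C_0 is symmetric)
    # L = ln(G_tau) / tau via the eigen-decomposition of G_tau
    evals, evecs = np.linalg.eig(g_tau)
    log_evals = np.log(evals.astype(complex)) / tau
    l_op = (evecs @ np.diag(log_evals) @ np.linalg.inv(evecs)).real
    # Noise covariance per unit time from the fluctuation-dissipation relation
    q = -(l_op @ c0 + c0 @ l_op.T)
    return g_tau, l_op, q
```

In this sketch, `q` multiplied by the integration time step corresponds to the leading term on the right-hand side of Equation (10).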

Forward Integration of LIM

To achieve forward integration, we first estimate \({{\bf{G}}}_{1}\) for the selected LIM (\({{\bf{G}}}_{15}\) in our study) based on the relation

$${{{{\bf{G}}}}}_{{{{\bf{1}}}}}={e}^{{{{\bf{L}}}}}=\exp [\ln {{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}\cdot {\tau }^{-1}]$$
(11)

Subsequently, for an N-step integration, we can use the N-times cumulative product of \({{{{\bf{G}}}}}_{{{{\bf{1}}}}}\), i.e., \({{{{\bf{G}}}}}_{{{{\bf{N}}}}}^{{\prime} }=\mathop{\prod }\nolimits_{i = 1}^{N}{{{{\bf{G}}}}}_{{{{\bf{1}}}}}\). The forecast state is then defined as \({{{\bf{x}}}}({{{\bf{N}}}})={{{{\bf{G}}}}}_{{{{\bf{N}}}}}^{{\prime} }{{{\bf{x}}}}({{{\boldsymbol{0}}}})\). With this setup, one can derive the forecast value at any lead time with predetermined \({{{{\bf{G}}}}}_{{{{\boldsymbol{\tau }}}}}\) and N.
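The forward integration in Equation (11) can be sketched as follows. The function names are hypothetical, and the one-step propagator is obtained from the principal branch of the matrix logarithm through an eigen-decomposition, which assumes the trained propagator is diagonalizable; this is an assumption of the sketch, not a statement about the original implementation.

```python
import numpy as np

def one_step_propagator(g_tau, tau):
    """G_1 = exp(ln(G_tau) / tau), evaluated via the eigen-decomposition of G_tau."""
    evals, evecs = np.linalg.eig(g_tau)
    # Principal (1/tau)-th power of each eigenvalue, i.e. exp(log(lambda) / tau)
    roots = evals.astype(complex) ** (1.0 / tau)
    return (evecs @ np.diag(roots) @ np.linalg.inv(evecs)).real

def forecast(g1, x0, n_steps):
    """Return the states x(1), ..., x(N) by repeatedly applying G_1 to x(0)."""
    states = np.empty((n_steps,) + x0.shape)
    x = x0
    for n in range(n_steps):
        x = g1 @ x                  # cumulative product of G_1 applied to x(0)
        states[n] = x
    return states
```

For example, `forecast(one_step_propagator(g15, 15), x0, 30)` would return daily forecast states out to a 30-day lead, where `g15` and `x0` stand for a 15-day propagator and an initial state supplied by the user.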