Exploring dominant processes for multi-month predictability of western Paci ﬁ c precipitation using deep learning

Over the past half-century, there has been an increasing trend in the magnitude and duration of the Madden-Julian Oscillation (MJO) attributable to the signi ﬁ cant warming trend in the Western Paci ﬁ c (WP). The MJO, bridging weather and climate, in ﬂ uences global and regional climate through atmospheric teleconnections, and climate models can predict it for up to 4 – 5 weeks. In this study, we use deep learning (DL) methods to investigate the predictability of the MJO-related western Paci ﬁ c precipitation on a multi-month time scale (5 – 9 weeks). We examine numerous potential predictors across the tropics, selected based on major MJO theories and mechanisms, to identify key factors for long-term MJO prediction. Our results show that DL-based useful potential predictability of the WP precipitation can be extended up to 6 – 7 weeks, with a correlation coef ﬁ cient skill ranging from 0.60 to 0.65. Observational and heat map analysis suggest that cooling anomalies in the central Paci ﬁ c play a crucial role in enhancing westerly anomalies over the Indian Ocean and warming in the WP, thereby strengthening the Walker circulation in the equatorial Paci ﬁ c. In addition, the predictability of WP precipitation is higher in La Nina years than in El Nino or normal years, suggesting that mean cooling in the central Paci ﬁ c may contribute to increased predictability of the MJO-related WP precipitation on the multi-month time scale. Additional model experiments using observed sea surface temperature (SST) anomalies over the central Paci ﬁ c con ﬁ rmed that these anomalies contribute to enhanced MJO-related convective anomalies over the WP. The study highlights that DL is a valuable tool not only for improving MJO-related WP prediction but also for ef ﬁ ciently exploring potential mechanisms linked to long-term predictability.


INTRODUCTION
The Madden-Julian Oscillation (MJO) 1 has a significant impact on various aspects of tropical weather [2][3][4][5][6][7][8][9][10][11][12] , tropical cyclones 7,8 , regional monsoons 3,5,11,13 , and the El Niño Southern Oscillation (ENSO) 14,15 .Thanks to improvements in climate models and international collaborative research, the latest climate models participating in coupled model intercomparison project phase 6 (CMIP6) have a useful skill for MJO forecast up to 30-39 days in advance 16,17 .However, the forecast skill at a longer lead time, exceeding four weeks, remains relatively low due to systematic model biases arising from a poor representation of MJO dynamics and physics 17,18 .As anthropogenic global warming leads to the expansion of the warm pool, there is potential for an increase in MJO-related precipitation anomalies over the western Pacific (WP).This suggests that extreme weather events induced by the MJO in densely populated mid-latitude regions could be intensified 6,[19][20][21][22] .Hence, there is a growing need for improved sub-seasonal forecasts at longer lead times, specifically for WP precipitation.
Deep Learning (DL) has emerged as a valuable tool in weather and climate prediction 23,24 , as well as for model parameterization 25 , development of global climate model 26 , and postprocessing tasks for weather and climate prediction 27,28 .In this study, we employ DL techniques to explore the predictability of WP precipitation at lead times extending over several months.By incorporating existing MJO theories and mechanisms, we aim to deepen our understanding and improve our forecasting capabilities in this crucial region.
In this study, we develop the DL model for the predictability of MJO-related precipitation over the WP, and we demonstrate that the DL model can extend predictability for MJO-related WP precipitation at a longer lead time.Among many possible predictors suggested by MJO theories, sea surface temperature (SST) anomalies over the central and eastern Pacific (CP) play a key role in extending long-term predictability through enhancing MJO-related circulation.These results underscore that DL is a valuable tool for efficiently uncovering the underlying mechanisms associated with weather and climate predictability and improving prediction.

RESULT Forecast skills of DL models
The state-of-the-art climate models have a limitation in predicting MJO beyond one month and dominant processes or mechanisms for leading the long-range MJO forecast with 5-9 weeks of lead time remain unclear.To explore its long-term predictability and driving mechanisms, we build a DL forecast system using various predictors selected based on major MJO theories and mechanisms proposed.Our DL model incorporates various atmospheric variables, including precipitation, lower tropospheric specific humidity, zonal wind, equivalent potential temperature, diabatic heating, zonal and meridional advection of moisture, boundary layer moisture convergence, upper-level zonal wind, the vertical velocity at 500 hPa, and convective instability (CIN) maps, as predictors.These predictors are represented by anomaly maps spanning from 0°to 360°E and 40°S to 40°N, covering the present time up to 12 weeks in advance.The target variable or predictand in this study is the MJO-related precipitation index over the WP region, specifically spanning from 120°to 150°E and 10°S to 10°N.This index is derived from 20-70-day band-pass filtered data and is used to assess the relationship between the MJO and precipitation patterns in the WP region.The target lead time for the forecast is up to 12 weeks ahead, as shown in Fig. 1 and described in the Methods section.To train the DL model, we utilized reanalysis data from 1997 to 2010, and the forecast results from 2012 to 2017 were used for validation.To reduce the impact of MJO events occurring in the latest training year, we introduced a one-year gap between the last training year and the validation period.
Figure 1 shows the prediction skill of the pentad-mean MJOrelated precipitation index over the WP obtained from the DL model using various predictors in terms of the anomaly correlation coefficient between the observed and predicted index.The forecast skills of all the models generally decline as the forecast lead time increases.However, an interesting pattern emerges where the forecast skills start to improve again around 50 days (approximately 7 weeks) and remain relatively consistent up to 80 days (11 weeks).Notably, during the 7-9 week period, the correlation ranges from 0.5 to 0.65, which surpasses the forecast performance of other leading dynamic models 17,18 .These results show that the DL model exhibits skillful predictions potentially up to 2 months in advance.Compared to previous statistical methods such as neural networks or multiple linear regression, the DL model has potentially better predictability for WP precipitation.This improvement can be attributed to the DL model's ability to effectively capture relevant regions from the input data, particularly the anomalous maps [29][30][31][32] .
For most forecast lead times, the individual ensemble members exhibit a narrow range of forecast skills.Additionally, DL models' forecast skill remains relatively unaffected by input data source variations.When alternative reanalysis data are utilized, the forecast skill of the DL model experiences minimal changes.These findings validate the notion that the DL forecast exhibits low uncertainty and that the DL model is suitable for real-time forecasting systems.The DL forecast model has better skills with lower and upper-level zonal wind, convective instability, SST, and lower tropospheric diabatic heating as a predictor, while relatively lower skills with zonal moisture advection, lower-level moisture, and upper diabatic heating.For example, the skill of the DL model with the zonal wind at 850 starts from 0.82 and decreases to 0.62 at 20 days of forecast lead time, maintaining the skill score until 50 days of the lead month.The skill score increases to 0.68 during 55-80 days.The model using SST produces a correlation coefficient of 0.66 at 25 days forecast lead time and shows almost the same correlation up to 80 days.On the other hand, the skill score from the model using zonal moisture advection is 0.63 at 25 days, and it monotonically decreases with increasing forecast lead time and under 0.6 during 55-85 day forecast lead time.Note that a rebound in prediction skill generally was not seen in the previous studies using linear methods and neural network models 31,32 .Also note that when we use all predictors in the DL model, the prediction skill is lower than those using good predictors (Supplementary Fig. S2).

Dependency of forecast skill on target period
We examine the seasonal dependence of the forecast skill of DL models with the SST (Fig. 2).For all seasons, forecast skill decreases as forecast lead time increases, with the lowest values occurring at 40-50 days lead time.The boreal summer (June-August) skill is lower than the boreal winter (December-February) skill for 30-85 days forecast lead time.The current DL model tends to capture the zonal asymmetry in SST, moisture, and circulation that favors eastward propagation during the boreal winter.However, during the boreal summer, northward propagation becomes more dominant over the WP.It is well known that vorticity anomalies due to the mean zonal shear, the meridional structure of the boundary layer moisture, and the air-sea interaction associated with the meridional SST gradient mainly contribute to the northward propagation.Unfortunately, the DL model does not capture these structures well because the input data are relatively poor representations of these processes or mechanisms.We investigate the dependence of forecast skills on the validation period.Based on previous studies, ENSO can influence MJO convective anomalies over the WP by altering Walker circulation and SST anomalies.We calculate forecast skills separately for La Niña, El Niño events, and normal years.During La Niña, the forecast skill gradually decreases to a correlation coefficient of about 0.55 at a lead time of 30-35 days and then stabilizes at about 0.6-0.75 at a lead time of 60-80 days (Supplementary Fig. S1a).In contrast, during El Niño events, the skill tends to decrease further, reaching a correlation coefficient of 0.25 at a lead time of 45-55 days.Some predictors show a slight increase in skill, reaching a correlation of about 0.40-0.55 in correlation, but others show stable correlation values at a forecast time of 60-80 days (Supplementary Fig. S1b).During normal years, the rebound of the skill around 55-85 days was not shown (Supplementary Fig. S1b).Previous studies have shown that MJO propagation is faster and more extensive during La Niña events, with zonal circulation patterns over the Indo-Pacific warm pool 33,34 .In addition, the northward propagation of the MJO is more energetic during La Niña years 35 .A recent study has also shown that the highest predictability of the MJO is observed during La Niña years 16 .These results support the robust and consistent influence of La Niña events on MJO propagation, which is consistent with the results of our studies.

Cause of improved forecast skill
We examine how the DL model improves MJO-related precipitation over the WP at a longer lead time.First, we examine the lagged regression between the input variables used in the DL and WP precipitation from observations.Present WP precipitation is significantly related to strong cooling over the CP and eastern Pacific (EP) and a mediated cooling over the Indian Ocean (IO) and warming in the WP 65 days in advance (e.g., Fig. 3a).Corresponding easterly wind anomalies occur from the WP to CP and westerlies in the IO and the far EP and tropical Atlantic.The typical Kevin-Rossby wave pattern is found in the IO (Rossby wave) and CP (Kelvin wave) (e.g., Fig. 3b).There is an anticyclonic flow in the southern Pacific that contributes to anomalous easterlies in the CP and WP.The surface warming and moisture convergence in the WP can generate the CIN, and its horizontal pattern is similar to that of the SST anomalies (e.g., Fig. 3c).The positive CIN generates upward vertical motion over the WP and downward motion over the IO and CP (e.g., Fig. 3d).The positive convective instability occurs over WP, while it is negative in IO and CP.Corresponding vertical motions at 500 hPa show similar horizontal patterns to those of SST and convective instability.
We performed a "heat map" analysis for a 65-day lead forecast of precipitation in the Western Pacific (WP).The heat map illustrates the contribution of input data to the output (or forecast) of the DL model.Darker (lighter) shading in the heat map indicates a stronger (weaker) contribution of the input data to the WP precipitation forecast.The heatmap generated by the DL model using SST as an input shows negative anomalies over the CP, suggesting that cooling in these regions primarily affects the prediction of WP precipitation 65 days in advance (Fig. 3e).The DL model using the low-level zonal wind shows pronounced anomalies in the CP, WP, and Indian Ocean (Fig. 3f).In addition, the DL model incorporating convective instability and vertical motion at 500 hPa shows strong negative anomalies in the WP (Fig. 3g, h).These results suggest that pre-existing cooling over the CP may increase the likelihood of MJOrelated precipitation in the WP.The CP cooling and WP warming with anomalous easterlies generate sinking motion over the CP and EP and rising motion over the WP (Fig. 4a).The rising motion over the WP is associated with sinking motion over the western IO, inducing westerly surface anomalies and weak cooling in the IO (Fig. 4b).The EP cooling with westerly anomalies extends into the tropical Atlantic Ocean, where it enhances the Atlantic Nino warming.This warming generates ascending motion and divergent flow in the upper troposphere, which is associated with sinking motion over the western IO.The sinking motions enhance further westerly anomalies, inducing convergence and warming over the WP (Fig. 4c).These changes in SST and circulation may be favorable for the generation of the next MJO-related precipitation.To confirm the role of CP cooling on MJO-related precipitation over the WP, we examined the relationship between CP cooling and other SST anomalies and circulation.The horizontal pattern regressed on CP SST is similar to that regressed on WP precipitation anomalies.CP cooling and WP warming are associated with easterlies and IO cooling with westerlies (Fig. 4a).The upper divergence over the WP is associated with convergence in the IO and CP.The weak divergent flow over the tropical Atlantic is associated with convergence over the WIO and CP, enhancing the sinking motion there (Fig. 4b).
To confirm the contribution of the CP region to the WP precipitation forecast 65 days ahead, we performed additional forecasts using DL models with only SST anomalies over the 1) CP, 2) WP, and 3) WIO regions.When we use local SST over the CP or WP regions, the forecast skill does not change much compared to those using global SST as input data (Fig. 5), because WP warming and CP cooling are closely correlated.However, when we use local SST anomalies over the western Indian Ocean, where the MJO is initiated, the forecast skill is significantly reduced up to 0.2 for a 30-day lead time and shows a relatively low correlation (0.52) for a 65-day lead time.These results suggest that the SST and circulation anomalies over the CP contribute significantly to improving the forecast skills for the MJO-related WP precipitation.

Role of central Pacific SST on MJO simulation
The role of the CP SST on the MJO simulation can be demonstrated from numerical experiments, we have performed additional numerical simulations with and without CP cooling nudging (see "Methods").Supplementary Fig. S3 shows wavenumber power spectra along the equator (10°S-10°N.During winter, the observations show a strong eastward propagation in the 30-70 day period and zonal wavenumbers 1-3.In the model without CP cooling, the NESM3.0shows the peak of the 80-100 day period and zonal wavenumber 3 for eastward propagation, suggesting that the eastward propagation speed is much slower and its horizontal scale is smaller than that of the observation.However, in the model with CP cooling, eastward propagation is much stronger than westward propagation at a 50-80-day period and zonal wavenumber 2-4, implying that CP cooling helps to capture the eastward propagating MJO with propagation speed similar to the observation.During summer, the observation shows a relatively weak eastward propagation at 40-60 days.The model without CP cooling produces an eastward propagating at 80-100 days with zonal wavenumber 2-4 and an intermediate westward propagation.However, the model with CP cooling shows reduced westward power and increased eastward power at 50-70 days with zonal wavenumber 2-3.The model results show that CP cooling can induce enhanced MJO eastward propagation and speed, suggesting that CP cooling may contribute to the improvement of MJO-related WP precipitation forecasts.

DISCUSSION
The forecast using the DL model with SST and zonal wind over the CP as a predictor shows high potential predictability for longer lead times (55-75 days).The forecast skill is sensitive to predictors, training periods, and validation, which is less sensitive to ensemble numbers.The lower forecast skill during El Niño may be attributed to suppressed convection due to increased sinking motion in the upper level over the WP.Prediction skills may depend not only on the ENSO phases but also on MJO diversity.For a standing type of MJO event, convective anomalies do not propagate to the WP, while slowly propagating MJO events propagate to the dateline at 3.5 m s -1 , which may be related to a longer forecast lead time.This study shows that CNN successfully extracts the features in MJO-related input variables by using a convolutional process.It is also shown that the convolutional process of the CNN model could train the MJO model with a relatively small number of weather samples 23 .

Observed data and definition of ENSO
We also utilized the daily wind and flux from NCEP-DOE Reanalysis 36 , daily precipitation from GPCP version 2.3 37 , daily specific humidity from NCEP/NCAR Reanalysis 38 , and the NOAA daily Optimum Interpolation Sea Surface Temperature (SST) 39 to prepare input and validation data for DL models.All data were interpolated to 2.5°h orizontal resolution.For the definition of El Nino and La Nina years, we used the Oceanic Nino Index (ONI) 3-month running mean of ERSST.v5SST anomalies in the Niño 3.4 region (5°N-5°S, 120°W-170°W)] from NOAA (https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php).

Deep Learning Models and Forecast for MJO-related Precipitation in the WP
The DL Models Our DL model consists of two max-pooling and three convolutional layers.The convolutional layers contain the neurons and are connected to the final output.The max-pooling layers are used to find the maximum values from the grids.We used 50 convolutional filters and neurons in the layers.We used 200 of the size of the mini-batch for each epoch and 300 of the epoch for training.The DL is not very sensitive to the number of epochs between 200 and 600 for MJO-related precipitation over WP.We created the heat map by calculating the contribution of each grid point to the output neuron.The heat map contains information about the weights used in the last convolutional layer.
In the DL models, the final output (or predictor) is pentad precipitation averaged over the western Pacific (one dimension).The predictants have used a ten-degree resolution global mapzonal wind at 200 and 850 hPa, specific humidity at 850, SST, equivalent potential temperature (EPT) at 850 hPa, diabatic heating (Yanai) at 300, 700, and 850 hPa, zonal moisture advection at 850 hPa, meridional moisture advection at 850 hPa, boundary layer moisture convergence at 925 hPa, convective instability (EPT at 850 hPa minus EPT at 400 hPa), vertical velocity at 500 hPa, and precipitation.The parameters of the DL are set separately for the target month and the lead month of the forecast.The main reason for using different parameters for each lead time is to improve predictability.Due to the characteristics of the cost function used in deep learning, using the same parameters for all lead times could potentially decrease predictability [30][31][32] .
To investigate the timing and contributions of predictors to the long-term predictability of WP pentad precipitation, we used different parameters for each lead time.The start time of the forecast covers January 1 to December 31 at 5-day intervals.For each forecast case, we used a forecast lead time of 5 days to 90 days.To avoid the influence of multidecadal (or interdecadal) variability (e.g., Atlantic multidecadal variability) on the MJO strength, we used 1997-2015 data, when the phase of AMO, IPO, and IO warning does not change much.The training period of the DL model is between January 1997 and December 2010, and the validation period to evaluate the forecast skill covers January 2012 and December 2015.The forecast skill of MJO-related precipitation in the western Pacific was evaluated using the temporal anomaly correlation coefficient.We used 10 ensemble members and the ensemble means are used for the forecast skill.The ensemble initialization is based on the bootstrap method.The DL used the values in the convolution filter when the cost function (the distance between the predicted value and the true value) is minimum.Using the values from the input layer, the DL model calculates the dot product and produces a horizontal feature map as an output.

Earth system model
We used the third version of the Nanjing University of Information Science and Technology Earth System Model (NESM3.0) 40,41.NESM3.0 consists of atmosphere, ocean, sea ice, and land models that are fully coupled by an explicit coupler.The resolution of the atmosphere model is T63L47.The ocean model has a grid resolution of 1•, with the meridional resolution refined to 1/3• over the equatorial region.It uses 46 vertical layers, with the first 15 layers in the upper 100 m.NESM3.0 simulates not only a reasonable climatology [40][41][42] but also important features of intraseasonal variability [43][44][45] .We performed idealized simulations with external forcing from the year 2000.The initial condition is from a historical simulation based on the Coupled Model Intercomparison Project Phase 6 (CMIP6).The NESM3.0 was integrated for 50 years with the initial condition and 2000s fixed external forcings based on CMIP6, and the last 20 years of data are used for analysis.

Fig. 1
Fig. 1 Correlation skill of the MJO-related precipitation index from the DL model.The correlation skill of the MJO-related precipitation index is shown as a function of the forecast lead month from the DL model for all seasons.The precipitation index is calculated as the average pentad precipitation over the western Pacific region (120°-150°E, 10°S-10°N).The DL model utilizes two predictors: one is the fixed input of the precipitation anomaly map, and the other is selected from various weather variables based on existing MJO theories.The DL model was trained using reanalysis data from 1997 to 2010, and the forecast results from 2012 to 2017 were used for validation.The selected weather variables include the equivalent potential temperature (represented as "theta"), boundary layer moisture convergence ("BLMC"), convective instability ("CIN"), and diabatic heat ("diah") calculated following Yanai et al.(1972).The zonal (meridional) moisture advection is represented by "udqdqx" ("vdqdy").Each line in the figure represents the mean of 10 ensemble members.

Fig. 2
Fig. 2 Seasonal dependency of the MJO-related precipitation index.The correlation skill of the MJO-related precipitation index targeted to each calendar month in the DL model.The precipitation index is defined as pentad precipitation averaged over the western Pacific (120°-150°E, 10°S-10°N).The precipitation and SST anomaly map are used as predictors.The DL model was trained using reanalysis data from 1997 to 2010, and forecast results from 2012 to 2017 are used for validation.The ensemble mean data are used.

Fig. 3
Fig. 3 Physical interpretation of the DL model.Observed 20-100 day filtered sea surface temperature (SST) (a, K), zonal wind at 850 hPa (b, m s -1 ), convective instability (c, K), and vertical velocity (d, cm s -1 ) anomalies regressed on the MJO-related precipitation index over western Pacific (120°-150°N, 10°S-10 N°) with a 65-days lag.The 20 yr (1997-2016) data was used.The dotted area represents significance at the 95% confidence level.The heat maps of each DL model.The DL models use SST (e), zonal wind at 850 hPa (f), convective instability (g), and vertical velocity (h), respectively.The validation period data (2012-2016) are used for heat map analysis.Only the values with over 95% confidence level are shown.

Fig. 4
Fig. 4 Role of Central and Eastern Pacific sea surface temperature (SST).a Observed 20-100 day filtered sea surface temperature (shading) and wind vector at 850 hPa (m s -1 ) anomalies regressed onto the central and eastern Pacific (160 N°-120°W, 5°S-5 N°) with a 65-day lag.The 20 yr (1997-2016) data was used.The dotted area represents significance at the 95% confidence level.b Same as a but for observed velocity potential (shading, 10 5 m 2 s -1 ) and divergent winds (m s -1 ) anomalies.c Schematics for a role of central and eastern Pacific SST on precipitation over western Pacific.Only the values with over 95% confidence level are shown.

Fig. 5
Fig. 5 Impact of central and eastern Pacific on correlation skill of MJO-related precipitation index.The all-season correlation skill of the MJO-related precipitation index as a function of the forecast lead month from the DL models with different forecast domains.The precipitation index is defined as pentad precipitation averaged over the western Pacific (120°-150°E, 10°S-10°N).The DL model used sea surface temperature (SST) as a predictor for four different regions: (1) global region (SST_GLO), (2) western Indian Ocean only (SST_WIO), central Pacific (SST_CP), and western Pacific (SST_WP).The DL model was trained by using reanalysis data from 1997 to 2010 and the forecast results from 2012-2017 are used for validation.Each line represents the mean of 10 ensemble numbers.