Introduction

With the country’s food production and economy depending on it1,2, the Indian summer monsoon rainfall (ISMR) is the lifeline of one-fifth of the world’s population living in the region. It is not surprising, therefore, that efforts to develop a model for forecasting ISMR one season in advance have a history of more than a century in India, led by the pioneering work of Blanford3. Despite great advances in ocean-atmosphere observation, computing resources, and the development of improved empirical4,5 and coupled climate models for ISMR prediction, the prediction of ISMR remains a grand challenge problem in climate science6. Although many different predictors of ISMR have been identified since then7, the El Nino–Southern Oscillation (ENSO) has remained the leading predictable driver of ISMR, even though the relationship between ISMR and an ENSO index shows a weakening trend in recent decades (Supplementary Fig. 1b)8, with a tendency for recovery to higher negative correlations in more recent years. With ISMR defined from another dataset9, however, the ENSO–ISMR correlation continues to decrease in recent years. Further, the maximum negative correlation between ISMR and Nino3.4 SST occurs 3–4 months after the peak of ISMR (Supplementary Fig. 1c), suggesting a potential role of ISMR in driving the ENSO. In fact, the lead–lag relationship (Supplementary Fig. 1c) indicates that the ISMR could feed back on and influence the ENSO10, making the ENSO–ISMR relationship tangled11,12 and the causality potentially bi-directional. It is also notable that the ENSO–ISMR relationship undergoes a multi-decadal variation (Supplementary Fig. 1b)13, implying that there are periods when ENSO could explain up to 35% of the inter-annual variability of ISMR and periods when it could explain less than 10%. Further, a nearly equal number of floods and droughts occur without a La Nina or an El Nino (Supplementary Fig. 1a), indicating the role of non-ENSO drivers in the observed inter-annual variability of ISMR.

A recent study14 argues that the traditional “signal-to-noise” estimates of “potential skill”, or the limit of potential predictability, are underestimates, and that a model-based estimate indicates that the potential skill could be much higher than previously thought, explaining up to 70% of the inter-annual variability of ISMR, far beyond what could be explained by association with the ENSO. Additional sources of ISMR predictability from other slowly varying potential drivers, such as the Eurasian snow cover15,16,17,18, the Pacific Decadal Oscillation (PDO)19,20, the Indian Ocean Dipole Mode (IOD)21,22,23, the Atlantic Nino24,25,26,27 and the Atlantic tripole28, have been explored recently. However, the physical mechanisms through which they influence the ISMR, the robustness of the relationships, and the fraction of ISMR variability explained by them are still being debated. As a result, neither the causality and its direction between these potential drivers and ISMR, nor their independence from the ENSO and the North-Atlantic sea-surface temperature (SST) variability, is well established.

In addition to the ENSO, there is considerable evidence that cold North-Atlantic (NA) SSTs are associated with mega-droughts of the Indian monsoon in the past29,30. In recent years, the NA SST has also been linked with ISMR on multi-decadal time-scales28,31,32,33,34,35. While the North-Atlantic multi-decadal variability (also known as the Atlantic multi-decadal oscillation, AMO) could be modified by either natural or anthropogenic aerosols36,37,38,39, ocean-atmosphere feedback is critical for the basic multi-decadal variability in the Atlantic33,40,41. With considerable evidence of decadal predictability of the North-Atlantic climate and the AMO42,43,44,45, the AMO represents an extra-tropical predictable driver for ISMR. In addition, the Southern Annular Mode (SAM) also has a notable association with the ISMR variability46,47,48,49. While the SAM is a manifestation of “internal” atmospheric dynamics with a typical auto-correlation time of ~2 weeks, it also has a forced or coupled component with modest seasonal predictability50 due to its significant association and linkage with the ENSO51,52,53. Therefore, the SAM is unlikely to contribute to ISMR predictability over and above that arising from the association between ENSO and ISMR.

The primary objective of this article is to establish beyond reasonable doubt that the NA SST or the AMO is an independent driver of the ISMR variability. To establish this, we first need a physical mechanism connecting the NA SST and ISMR. A framework for such a physical teleconnection mechanism has emerged from several recent studies28,31,32,54,55,56. According to it, positive (negative) NA SST drives a large-scale anticyclonic (cyclonic) barotropic vorticity over it on intraseasonal time-scales and sets up an upper level zonal wave-number-four Rossby wave-train that produces significant upper level anticyclonic (cyclonic) vorticity over the Indian region, which in turn strengthens (weakens) the low-level cyclonic vorticity associated with the Indian monsoon. Long active (break) spells, resulting from clustering of the spells, lead to strengthening (weakening) of the seasonal mean ISMR. The Rossby wave-train represents a North-Atlantic–Indian (NAI) teleconnection pattern linking extra-tropical NA SST to tropical ISMR.

In the present study, we start by describing the NAI pattern of teleconnection between NA SST and ISMR. However, a missing element of the puzzle of teleconnection between the AMO and the ISMR is how the NA extra-tropical SST drives the local barotropic vorticity on intraseasonal time-scales. The conventional wisdom is that, unlike in the tropics, the atmosphere drives SST anomalies in the extra-tropics. While that may be true on synoptic time-scales, here we provide evidence that the NA SST could indeed drive barotropic vorticity above it on intraseasonal time-scales, supplying the missing link in the teleconnection between the AMO and the ISMR. The last important remaining issue is that the causality has not been quantified rigorously. Although the functional relationship between a driver time series such as the AMO and the ISMR is intrinsically nonlinear, the causal inference algorithms used so far are all based on linear correlations and regressions and do not adequately address the independence of the AMO–ISMR relationship from that with other potential drivers. To test the robustness of our conclusions, we quantify the causality between the AMO and the ISMR in the presence of a number of other potential drivers using two different advanced nonlinear causal inference algorithms and find that the NA SST is indeed a driver of the ISMR through a simultaneous atmospheric bridge independent of the ENSO, and that its impact on the ISMR is comparable to that of the ENSO. We further show that it achieves this by driving an intraseasonal barotropic vorticity above it, which in turn drives extended intraseasonal spells of rainfall over the Indian monsoon region that lead to seasonal mean ISMR anomalies.

Results

North-Atlantic and Indian Monsoon (NAI) teleconnection

Fig. 1: Contrast between the North-Atlantic–Indian monsoon (NAI) and the Pacific–North-American (PNA) teleconnection patterns.

a Spatial pattern of geopotential height associated with the AMO. Regression of the JJAS AMO multi-decadal mode on both the JJAS SST (shading, K K⁻¹) and the deviation of JJAS 200 hPa geopotential height from the zonal mean (contour lines, m² s⁻² K⁻¹). b Spatial pattern of geopotential height associated with the ENSO. Regression of DJF Nino3.4 SST on DJF 200 hPa geopotential height (m² s⁻² K⁻¹) as viewed from the northern hemisphere (NH). c The Rossby wave-train associated with the AMO multi-decadal mode. Regression of the JJAS AMO multi-decadal mode on JJAS SST (shading, K K⁻¹) and on deviations from the zonal mean of JJAS 200 hPa winds (vectors, m s⁻¹ K⁻¹) and 200 hPa geopotential height (m² s⁻² K⁻¹), where blue curves represent –ve contours, red curves represent +ve contours, and the black curve represents the zero contour. The data span 1850–2015, for which the AMO multi-decadal mode (MDM) is obtained as the IMF-5 of the AMO index. Hatching represents regions where winds are significant at the 95% confidence level using a two-tailed t-test.

The teleconnection between the extra-tropical AMO and the tropical ISMR through the Rossby wave-train has been shown in some detail in Rajesh and Goswami54. The tropical SST to extratropical climate teleconnection via the wave-train associated with the Pacific North American (PNA) pattern has been shown to be driven by tropical heat source linked with ENSO SST57,58. However, a similar theoretical support for the generation of the Rossby wave-train connecting NA SST and ISMR is lacking. Here, we contrast the Rossby wave-train connecting the extra-tropical NA-SST to the tropics (Fig. 1a, c) with the tropical ENSO SST to extra-tropical teleconnection through the PNA pattern (Fig. 1b).

The regressions of the multi-decadal mode of JJAS AMO on deviations from the zonal mean of anomalies of JJAS 200 hPa geopotential height (Fig. 1a) and 200 hPa winds (Fig. 1c) illustrate the Rossby wave-train associated with the North-Atlantic SST on the multi-decadal time-scale. A similar Rossby wave-train linking the NA and the Indian monsoon region was also identified by Joshi and Ha59. A nearly identical wave-train is found to connect the NA SST to ISMR on seasonal and inter-annual time-scales56 too. We call this the North-Atlantic–Indian (NAI) teleconnection pattern. The NAI pattern is similar to the circumglobal teleconnection pattern of the northern hemisphere summertime climate described by Ding and Wang60. The difference between the two is that their pattern is related to the ENSO while the NAI is unrelated to the ENSO. A similar regression between JJAS Nino3.4 SST and geopotential height on the inter-annual time-scale shows the PNA type of pattern in the NH (Fig. 1b). Unlike the tropics to extra-tropics teleconnection, which is achieved through a meridionally propagating group of Rossby waves along a great circle arc, the extra-tropical to tropical Rossby wave-train is essentially an anomaly of the climatological zonal winds over the extra-tropics. While Hoskins and Karoly57 clarified some initial apprehension regarding how the waves could travel from mean easterlies through the subtropical westerly jet to the deep extra-tropics in PNA type teleconnections, there is no conceptual difficulty in the zonal propagation of Rossby waves on a largely westerly background mean flow.

Similar to how the lower-stratospheric quasi-biennial oscillation (QBO) modulates tropical deep convection by influencing the vertical wind shear and vorticity in the upper troposphere and lower stratosphere (UTLS) region61,62, persistent upper level divergence (convergence) over the Indian monsoon region induced by the NAI vortex facilitates (inhibits) frequent deep convection, leading to strengthening (weakening) of the ISMR. A recent study56 shows that during non-ENSO droughts, it leads to a clustering of “break” monsoon conditions and results in a negative seasonal mean ISMR anomaly. A similar mechanism operates during different phases of the AMO as well54. The sub-seasonal manifestation of this teleconnection between NA and ISMR can be seen in the composite of daily intraseasonal rainfall anomaly averaged over central India (72°E–86°E, 14°N–28°N) over 20 years (1935–1955) around a peak positive phase of the AMO multi-decadal mode (Supplementary Fig. 2). It shows that the higher than average seasonal mean ISMR arises from two phase-locked “active” spells during the season. Similar phase locking (or climatological intraseasonal oscillation) is seen for upper level (200 hPa) anticyclonic vorticity and lower level (850 hPa) cyclonic vorticity over the region, which facilitates the sustenance of the active spells (Supplementary Fig. 2). Linear lead–lag relationships between barotropic vorticity above NA SST, upper level vorticity over the Indian region and Indian monsoon rainfall on the intraseasonal time-scale indicate that the upper level anticyclonic vorticity over India is not a response to the stronger monsoon but is likely to be driven by the NA barotropic vorticity. The indications from this linear analysis are confirmed in the following sections using a nonlinear causal inference technique, where we show that the NA SST indeed drives the barotropic vorticity above it, which in turn drives the upper level vorticity over the Indian monsoon region, which drives the long active or break spells over the Indian monsoon region.

SST and barotropic vorticity on sub-seasonal time-scales

The vertical structures of extra-tropical low-frequency (compared to synoptic) intraseasonal variations are known to be barotropic in nature while those associated with high-frequency (synoptic) disturbances are baroclinic63. The existence of the stationary Rossby wave-train associated with the NAI pattern driven by an episodic barotropic vorticity source, therefore, is understandable. The association between the barotropic vorticity source and the underlying SST is also unambiguous54,56. However, it is not clear whether the underlying SST drives the low-frequency barotropic vorticity or the atmospheric circulation drives SST and related fluxes. In order to bring out the large-scale barotropic vorticity and to remove the influence of high-frequency spatial variability, both the SST data and the atmospheric vorticity are averaged to a common resolution of 1° × 1° boxes, and we create weekly averages of the daily SST and vorticity. Using JJAS SST and circulation from NOAA 20CRv3 and ERA5, the correlations calculated for SST leading the barotropic vorticity by up to 3 weeks to lagging it by a week (Fig. 2) show large regions of negative correlations over the North-Atlantic and north Pacific, consistent with earlier findings54,56 that warm (cold) waters are overlaid by anticyclonic (cyclonic) barotropic vorticity, in contrast to the warm tropical oceanic regions. The fact that the correlations peak with SST leading the vorticity by 1–2 weeks is a strong indication that the SST is the driver of the barotropic vorticity and not a response to it. This conclusion is supported by nonlinear causal inference calculations in the following sections.
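The lead–lag diagnostic described above can be illustrated at a single grid point. The following is a minimal sketch, not the processing used for Fig. 2: the function names are hypothetical, the series are synthetic, and the sign convention (positive `lead` means SST leads the vorticity) is an assumption made for this illustration.

```python
import math

def weekly_means(daily):
    """Average a daily series into non-overlapping 7-day means.
    A trailing partial week is dropped."""
    n_weeks = len(daily) // 7
    return [sum(daily[7 * w: 7 * w + 7]) / 7.0 for w in range(n_weeks)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lead_lag_correlation(sst, vort, lead):
    """Correlation with SST leading vorticity by `lead` weeks.
    A negative `lead` means SST lags the vorticity."""
    if lead >= 0:
        a, b = sst[:len(sst) - lead], vort[lead:]
    else:
        a, b = sst[-lead:], vort[:len(vort) + lead]
    return pearson(a, b)

# Synthetic weekly series: vorticity is the negative of SST two weeks earlier.
sst = [math.sin(0.3 * t) for t in range(60)]
vort = [0.0, 0.0] + [-s for s in sst[:-2]]

# Scan from SST lagging by 1 week to SST leading by 3 weeks.
correlations = {lead: lead_lag_correlation(sst, vort, lead) for lead in range(-1, 4)}
strongest_lead = min(correlations, key=correlations.get)  # most negative correlation
```

In this synthetic case the most negative correlation falls at the lead that was built into the data, which is the kind of asymmetry the text interprets as SST driving the vorticity. A lead–lag peak alone does not establish causality, which is why the study goes on to formal causal inference.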

Fig. 2: Driving of northern hemisphere summer barotropic vorticity by SST on super-synoptic time-scales.

a–h Lead–lag correlation maps between weekly averaged values of SST and barotropic vorticity (vorticity averaged between 700 and 200 hPa). “SST lead1” refers to SST leading the barotropic vorticity by 1 week, and so on. Dotted areas represent regions significant above the 95% confidence level based on the critical values for two-tailed tests. The weekly data span 1982–2017.

While our analysis provides evidence that the extra-tropical SST could be a driver of atmospheric circulation variations on sub-seasonal time-scales, how the SST achieves this has eluded consensus. As a result, what produces the deep barotropic vorticity response above the SST has remained an open question. Here we propose a mechanistic explanation of the underlying process, based on the hypothesis that the SST does this by modulating the North-Atlantic Oscillation (NAO)31. We note, however, that the teleconnection between NA SST, the barotropic vorticity (BV) above it, the Indian upper level vorticity (IUV) and ISMR takes place on an intraseasonal time-scale (Supplementary Fig. 2).

The warmer (colder) SST relative to colder (warmer) SST to the north leads to a north–south surface pressure gradient in the region and modulates the strength of the summer NAO. Therefore, we propose that the NA SST produces a similar phase locking of the NAO with the seasonal cycle and thereby produces a phase locking of the BV. A southward (northward) shift of the storm track (on background zonal westerlies) represents an anomalous vorticity forcing at low levels. Episodes of persistent vorticity forcing can lead to the generation of equivalent cyclonic or anticyclonic vorticity through vertical propagation of the Rossby wave response. Thus, the NA SST forcing could, in principle, result in the genesis and maintenance of the observed equivalent barotropic vorticity response above it via modulation of the mean sea level pressure (MSLP) and hence the NAO. The hypothesis is tested by creating a summer NAO index defined by the difference of JJAS MSLP between (40°W–30°W, 40°N–30°N) and (30°W–20°W, 50°N–60°N). Recognizing that the relationships between NA SST, NAO, BV and ISMR are intrinsically nonlinear, we use a nonlinear causal inference algorithm to test this hypothesis. Some modeling studies64,65 support the idea that a barotropic vorticity response can emerge from SST forcing in the extra-tropics. Ferreira and Frankignoul65 examined the transient atmospheric response to SST anomalies associated with the NAO in a coupled model of intermediate complexity and found that the air-sea heat fluxes lead to a non-adiabatic heating of the atmosphere. The final equilibrium response is barotropic and evolves from an initial baroclinic response. More modeling and diagnostic studies are required for a deeper understanding of the issue.
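The summer NAO index defined above is a difference of box-averaged MSLP, which can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the cosine-of-latitude area weighting is an assumption not stated in the text, and the toy field is synthetic. The box bounds are those quoted in the text, with west longitudes written as negative degrees east.

```python
import math

# Boxes from the text: (40W-30W, 30N-40N) and (30W-20W, 50N-60N).
SOUTH_BOX = {"lat": (30.0, 40.0), "lon": (-40.0, -30.0)}
NORTH_BOX = {"lat": (50.0, 60.0), "lon": (-30.0, -20.0)}

def box_mean(field, lats, lons, lat_range, lon_range):
    """cos(latitude)-weighted mean of field[i][j] over a lat/lon box.
    The weighting is an assumption made for this sketch."""
    lat_lo, lat_hi = sorted(lat_range)
    lon_lo, lon_hi = sorted(lon_range)
    num = den = 0.0
    for i, lat in enumerate(lats):
        if not lat_lo <= lat <= lat_hi:
            continue
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_lo <= lon <= lon_hi:
                num += weight * field[i][j]
                den += weight
    return num / den

def summer_nao_index(mslp, lats, lons):
    """Southern-box mean MSLP minus northern-box mean MSLP."""
    south = box_mean(mslp, lats, lons, SOUTH_BOX["lat"], SOUTH_BOX["lon"])
    north = box_mean(mslp, lats, lons, NORTH_BOX["lat"], NORTH_BOX["lon"])
    return south - north

# Toy JJAS-mean MSLP field (hPa) on a coarse 5-degree grid.
lats = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0]
lons = [-40.0, -35.0, -30.0, -25.0, -20.0]
mslp = [[1020.0 if lat <= 40.0 else 1010.0 if lat < 50.0 else 1005.0 for _ in lons]
        for lat in lats]
index = summer_nao_index(mslp, lats, lons)
```

Applied to a time series of fields, the same function would give one index value per time step, which could then be filtered and passed to the causal analysis.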

The absence of significant correlations between the weekly SST and weekly barotropic vorticity at any lead or lag in the southern hemisphere (SH) extra-tropics (Supplementary Fig. 3) is consistent with the above mechanism of generation of large-scale barotropic vorticity through regional displacements of the storm tracks. The large landmasses surrounding the north Pacific and the North-Atlantic facilitate the regional north–south (meridional) displacements of the jetstream/storm tracks and hence the generation of barotropic vorticity on a large scale in the NH extra-tropics. The absence of such landmasses in the SH extra-tropics makes it difficult to generate similar SST anomalies that would produce similar north–south regional pressure differences and corresponding regional displacements of the storm tracks.

Quantifying causal inference in the presence of multiple interacting drivers

While pairwise correlations, linear regressions and lagged regressions establish the associations between the ISMR and its potential drivers like the ENSO, the AMO, the PDO, the NAO, the IOD and the Atlantic Nino (At-Nino), they fail to quantify the causal relationships because of the interdependence between the drivers and the nonlinearity of the relationships. Some of these drivers are “confounders” that impact both the cause and the effect. Causal discovery algorithms like PCMCI+ and Granger causality help us to quantify the causality by taking into account the conditional independence between drivers and estimating the probability of a particular directional causality at a statistically significant level in the presence of nonlinearity. We note that there are two aspects of the teleconnection of the seasonal mean ISMR with the remote drivers. A contemporaneous connection takes place through an atmospheric bridge and is almost “simultaneous”, with the largest lead or lag being a few months. The slowly varying drivers like the AMO could also influence the seasonal mean ISMR through an “oceanic bridge” at a lead or lag of several years. As the sample size for such a causality (seasonal mean values for the available ~160 years) is rather small, the causal inference is likely to be unreliable, and hence we shall not attempt to identify causality of teleconnection through the “oceanic bridge”. In this study, we examine the contemporaneous teleconnection between ISMR and potential drivers through atmospheric bridges via atmospheric circulation. As the steady response of the atmosphere to forcing from the drivers takes place relatively quickly, a driving influence from the potential drivers may be expected with short lags of 1–3 months. Therefore, for our causal inference, we use monthly mean anomalies of the indices during the 6 summer months (MJJASO) for all years, with a maximum lag of 5 months. By restricting the maximum lag to 5 months, we restrict potential lag relationships to within the same summer season, ensuring teleconnection through an atmospheric bridge. All monthly anomaly indices are detrended before examining the causality using PCMCI+ or Granger causality66,67,68,69,70,71,72. It is to be noted that “causal” discovery relies on various assumptions, so the identified causal links are valid only for the selected set of variables, as the structure of the causal network may change when additional variables are added73,74. For a more reliable estimate of the causality between the AMO and ISMR, we employ the causal discovery algorithms in the presence of a number of other potential drivers of ISMR, namely the PDO, the IOD, the At-Nino and the NAO.
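The preprocessing described above (restricting to MJJASO, forming monthly anomalies and detrending) can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the function names and the `(year, month, value)` record layout are hypothetical, anomalies are taken relative to each calendar month's long-term mean, and a single least-squares line is removed from the concatenated summer series, all of which are assumptions.

```python
MJJASO = (5, 6, 7, 8, 9, 10)  # May to October
MAX_LAG_MONTHS = 5            # keeps lagged links within one summer season

def monthly_anomalies(records):
    """records: list of (year, month, value) tuples.
    Subtract each calendar month's long-term mean from its values."""
    sums, counts = {}, {}
    for _, month, value in records:
        sums[month] = sums.get(month, 0.0) + value
        counts[month] = counts.get(month, 0) + 1
    return [(y, m, v - sums[m] / counts[m]) for y, m, v in records]

def detrend(values):
    """Remove the least-squares straight line from a series."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    slope = (sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [v - (v_mean + slope * (t - t_mean)) for t, v in enumerate(values)]

def prepare_index(records, months=MJJASO):
    """Summer-month anomalies of one index, detrended, in time order."""
    summer = [r for r in records if r[1] in months]
    anomalies = monthly_anomalies(summer)
    return detrend([v for _, _, v in anomalies])

# Toy index with a pure seasonal cycle: every anomaly should vanish.
records = [(year, month, float(month))
           for year in (2001, 2002, 2003) for month in range(1, 13)]
prepared = prepare_index(records)
```

With six months retained per year, a maximum lag of 5 months is the longest lag that can still pair two months of the same summer, which matches the rationale given in the text.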

The results of PCMCI analysis at α = 0.05 (95% CI) (Fig. 3a) indicate that the ENSO drives ISMR negatively while the AMO drives it positively with similar strengths as we proposed. While the teleconnection between ENSO and ISMR through the Walker circulation is simultaneous, the one between the extra-tropical AMO and ISMR is at a short lag. It is notable that the ENSO–ISMR connection is both ways as expected while the one between the AMO and ISMR is one-way from AMO to ISMR. Our analysis indicates that the PDO has no direct link with the ISMR while the NAO and ISMR association comes through the AMO. The At-Nino also does not have a direct link with the ISMR and the directionless two-way connection between At-Nino and AMO indicates that At-Nino is an integral part of the AMO variability. Therefore, the association between the At-Nino and ISMR reported in some literature is likely to be through its connection with the AMO. As the IOD and the ISMR are linked with a directionless connection, the IOD is not a credible driver of the ISMR. While the IOD has a strong positive directionless association with the ENSO, it has a negative driving influence on the ENSO. The analysis quantifies our claim that the NA SST (and the AMO) is an equally important driver of ISMR variability together with the ENSO.
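The idea that an apparent association between two indices can arise entirely through a third one (as suggested above for the At-Nino–ISMR link via the AMO) can be illustrated with a first-order partial correlation. The sketch below is not the PCMCI+ algorithm, which selects conditioning sets iteratively and can use nonlinear independence tests. It shows only the standard partial-correlation formula on synthetic vectors, and all names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after linearly removing z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Synthetic common driver z and two mutually orthogonal noise vectors.
z = [1, -1, 1, -1, 1, -1, 1, -1]
noise_x = [1, 1, -1, -1, 1, 1, -1, -1]
noise_y = [1, 1, 1, 1, -1, -1, -1, -1]
x = [a + b for a, b in zip(z, noise_x)]  # x depends on z only
y = [a + b for a, b in zip(z, noise_y)]  # y depends on z only

raw = pearson(x, y)                       # association induced by z
conditioned = partial_correlation(x, y, z)  # association once z is accounted for
```

Here `x` and `y` are correlated only because both contain `z`, and the association disappears once `z` is conditioned on. A link that survives such conditioning on the other candidate drivers is what the causal graphs report as direct.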

Fig. 3: Causal links between the ISMR and AMO in the presence of other potential drivers using the PCMCI+ causal inference algorithm.

a Causal links between the ISMR and the AMO (NA SST), the ENSO (Nino), the NAO, the PDO, the At-Nino and the IOD on seasonal time-scales, based on monthly anomalies of the indices for the MJJASO season during the period 1871–2017, obtained using the multivariate causal framework at the 95% confidence level (α = 0.05). Monthly anomalies of SST are computed from COBE SST2, mean sea level pressure from NCEP 20Cv3 and ISMR from Parthasarathy data. b The seasonal causal link between NA SST and ISMR is a result of causal links between the NA SST, the NAO, the BV, the Indian upper level vorticity (IUV) and the Indian lower level vorticity (ILV) (see text for definitions) on intraseasonal time-scales, with daily intraseasonal filtered anomalies of the indices obtained using a 7-day moving average. The indices for the JJAS season from 1982 to 2010 are detrended. The nonlinear causal links from PCMCI+ are shown as arrows, with the strength of contemporaneous association (link strength) represented by arrow color (+ve red and –ve blue), while node color represents the node auto-correlation strength. The curved lines represent time-lagged causal relations, with the numbers giving the lag in months, while the straight lines show contemporaneous relationships, with or without orientation. The colors used in the schematics of various processes in the map (other than nodes and connector arrows) are for illustration purposes only.

Next, we explore how the AMO achieves the directional causality to the ISMR on intraseasonal time-scales. From our earlier analysis, the hypothesis is that the NA-SST drives NAO that leads to NA-barotropic vorticity (BV), which in turn drives central India monsoon intraseasonal oscillations (CI-ISMR) via the upper level vorticity over the region (IUV) on intraseasonal time-scale. To test this hypothesis, causal discovery using PCMCI+ is applied to normalized daily intraseasonal filtered (using a 7-day moving average filter) indices of NA-SST, NAO, CI-ISMR, BV, ILV, and IUV for June to September during 1980–2017 (Fig. 3b). The results are significant at α = 0.01 (99% CI). It is notable that the NA-SST has a positive driving influence on intraseasonal variations of ISMR (pcorr = 0.316). The NA SST achieves this by driving BV (strong negative correlation, pcorr = –0.489) through a positive driving of NAO, which then drives BV negatively. The BV drives IUV positively, which in turn drives intraseasonal variations of CI-ISMR negatively (pcorr = –0.2). The positive driving of ISMR by NA SST is consistent with its negative driving of BV ➔ positive driving of IUV ➔ negative driving of ISMR. Thus, the hypothesis is strongly supported by the nonlinear causal inference calculations.
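The filtering and normalization applied to the daily indices before the intraseasonal causal analysis can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the running mean is taken over complete 7-day windows only (so the output is 6 values shorter than the input), and normalization uses the population standard deviation. How the study treats the ends of each JJAS season is not specified in the text.

```python
import math

def moving_average(series, window=7):
    """Running mean over complete windows only.
    Returns len(series) - window + 1 values."""
    total = sum(series[:window])
    smoothed = [total / window]
    for i in range(window, len(series)):
        total += series[i] - series[i - window]  # slide the window by one day
        smoothed.append(total / window)
    return smoothed

def standardize(series):
    """Subtract the mean and divide by the population standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / sd for v in series]

def intraseasonal_index(daily, window=7):
    """Smoothed and normalized version of one daily index."""
    return standardize(moving_average(daily, window))

# Toy daily series for one 122-day JJAS season.
daily = [math.sin(2 * math.pi * d / 40.0) + 0.3 * math.sin(2 * math.pi * d / 3.0)
         for d in range(122)]
index = intraseasonal_index(daily)
```

The 7-day mean damps the synoptic (few-day) component of the toy series while largely retaining the slower oscillation, which is the purpose of the filter described in the text.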

Using the time series of the same set of potential drivers of ISMR, we have also carried out causal inference using the Granger causality framework, which is based on a slightly different set of assumptions (for example, on the stationarity of the time series). In order to test the robustness of the directions and strengths of the causal graphs on the seasonal time-scale between ISMR and its potential drivers (Fig. 3a), we employ Granger causality on exactly the same time series of ISMR, NA-SST, NAO, PDO, IOD, and At-Nino for the same period. The results of linear and nonlinear Granger causality are shown in Fig. 4a, b, respectively, with all results significant at the 95% confidence level (α = 0.05). While the linear Granger method identifies only one causal link, from AMO to ISMR at 1-month lead, the nonlinear Granger method picks up a causal link from Nino to ISMR at 1-month lead together with the link from AMO to ISMR at leads of 1 and 2 months. The signs of the causal links are the same as those obtained from the PCMCI+ method. It may be noted that, in contrast to the PCMCI+ method, the Granger algorithm used in this study does not indicate a “contemporaneous” causality direction. We may compare the nonlinear Granger causality at minimum lead (1 month) with the “simultaneous” causality of PCMCI+. Thus, both methods agree that the AMO and the ENSO are the only two drivers of ISMR on a seasonal time-scale and that both are equally important. However, the Granger method fails to identify the reverse causality from ISMR to Nino, which appears physically meaningful and is identified by PCMCI+.
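The logic of a linear Granger test can be sketched in its simplest bivariate, single-lag form: the candidate driver is said to Granger-cause the target if adding its past value significantly reduces the error of an autoregression of the target. The code below is an illustration only, not the multivariate or nonlinear framework used for Fig. 4. The function names are hypothetical, the data are synthetic, and only one lagged term is used.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit, solved through the
    normal equations with Gauss-Jordan elimination (small systems only)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(cause, effect, lag=1):
    """F statistic for adding cause[t - lag] to an autoregression of effect[t]."""
    times = range(lag, len(effect))
    y = [effect[t] for t in times]
    restricted = [[1.0, effect[t - lag]] for t in times]
    full = [[1.0, effect[t - lag], cause[t - lag]] for t in times]
    rss_restricted = ols_rss(restricted, y)
    rss_full = ols_rss(full, y)
    dof = len(y) - 3  # observations minus parameters of the full model
    return (rss_restricted - rss_full) / (rss_full / dof)

# Synthetic example: x drives y with a one-step lag, and not the reverse.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0.0, 1.0) for t in range(1, 300)]

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
```

In this synthetic case the statistic for x driving y is far larger than for the reverse direction. In practice the statistic is compared against an F distribution at the chosen significance level, more lags and additional conditioning variables are included, and, as noted in the text, contemporaneous links cannot be oriented this way.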

Fig. 4: Causal links between the ISMR and AMO in the presence of other potential drivers using the Granger causal inference algorithm.

Similar to Fig. 3a, but the causal links are obtained from monthly anomalies of the same indices using a a linear Granger causality framework and b a nonlinear Granger causality framework at the 95% confidence level (α = 0.05). Monthly anomalies of SST are computed from COBE SST2, mean sea level pressure from NCEP 20Cv3 and ISMR from Parthasarathy data, spanning 1871 to 2017 for the MJJASO season. As for PCMCI+, both the linear and nonlinear causal links are shown as arrows, with the strength of association represented by arrow color (+ve red and –ve blue), while node color represents the node auto-correlation. The curved lines represent time-lagged causal relations, and the numbers denote the lag (in months).

We note that the Granger method fails to identify links between the AMO and At-Nino and between the AMO and PDO that are also likely to be physically meaningful, as these systems are known to be intimately linked75,76,77. It is also noted that the Granger method does not identify any of the other links that are not clearly directed. This seems to be consistent with a criticism of Granger causality, namely that it has a “low detection power” in high-dimensional datasets, that its applicability is largely limited to bivariate analysis, and that it cannot account for indirect links or common drivers78.

AMO–ISMR relationship: additional source for ISMR predictability

The causal inference calculations confirm that the AMO is a driver of ISMR independent of the ENSO and thus adds to the predictability of ISMR. This is consistent with the findings of Borah et al.56, who find that all non-ENSO ISMR droughts are associated with significantly cold NA SST but with tropical SST close to the climatological mean (as in ENSO transition years). Simulation experiments of ISMR driven by NA SST79 also indicate that NA SST has a significant influence on the simulation of ISMR only when the ENSO is going through transitions. During a positive (negative) phase of the AMO, strong NA SST anomalies persist for ~20 years. During El Nino (La Nina) years within these periods, the ENSO–ISMR teleconnection dominates, while during non-ENSO years the AMO–ISMR teleconnection dominates the ISMR variability. Indeed, composites of the meridional wind (V) anomalies during JJAS of non-ENSO years during positive and negative phases of ISMR multi-decadal variability show (Supplementary Fig. 8) a circumglobal stationary wave-number-four Rossby wave-train similar to that in Fig. 1. Thus, during all non-ENSO years, by modulating ISMR through the NAI pattern, the NA SST complements ENSO-based predictability, thereby significantly enhancing the potential predictability of ISMR.

Discussion

Even though the relationship between the ISMR and the ENSO has declined in strength in recent decades, the ENSO has remained the only dominant predictable driver of ISMR for over a century. Further, the ENSO–ISMR relationship falls far short of explaining the high “potential skill” of ISMR prediction80. To realize the “potential skill”, therefore, there is a need to look beyond tropical SST for ISMR predictability. The mounting evidence that extra-tropical SST associated with the AMO modulates the ISMR provides such a potential source. However, it has been unclear whether the underlying SST drives the Rossby wave-train that links the extra-tropics to the tropics or whether the SST is a response to the atmosphere. Here, using observational data, we show that the SST indeed drives a barotropic vorticity above it, which in turn is responsible for setting up a Rossby wave-train connecting the NA SST and the ISMR. Our analysis using two nonlinear causal discovery algorithms confirms that the AMO is a driver of the ISMR with a lead of 1 month, while the ENSO is also a driver of ISMR with a lead of 1 month, and that their influences on ISMR are of comparable magnitude. The NA-SST and ISMR connection is always from NA-SST to ISMR through the atmospheric bridge, while that between the ENSO and ISMR is both ways. Based on some of our earlier work54,56 and the lag correlations between NA SST and barotropic vorticity (BV) presented here, we propose that the teleconnection on the seasonal time-scale results from a teleconnection between NA SST, NAO, BV, upper level vorticity over India (IUV) and ISMR on intraseasonal time-scales. Application of the causal inference algorithm (PCMCI+) to the intraseasonal filtered interacting time series shows that the NA SST drives the climatological intraseasonal oscillation (ISO) of ISMR via the NAO driving a similar oscillation in the BV, which in turn drives a similar oscillation in the IUV, and finally the IUV driving a similar oscillation in ISMR.

The fact that two different causal discovery methods with different assumptions both confirm that the AMO is as important a driver as the ENSO indicates that the conclusion is fairly robust. The long-term SST data used in the study come from analyses that fill gaps or interpolate data in the early, data-sparse periods. In order to test the robustness of our conclusions to uncertainty in the SST data, we carried out PCMCI+ and Granger causality calculations including the monthly NA-SST and Nino3.4 indices calculated from two other SST datasets, keeping all other indices the same as in Figs. 3 and 4. These additional calculations (Supplementary Figs. 4, 5, and 6), using all the SST datasets and both methods, support the conclusion that the AMO and the Nino3.4 are the only two major drivers of ISMR and are of almost equal strength. Therefore, this causality conclusion appears stable and robust. However, some of the other causal links obtained by the PCMCI+ method in Fig. 3a and by the Granger method in Fig. 4 appear to be sensitive to changes in the SST datasets. Notable amongst them is the causality between the IOD and ISMR. With COBE SST, the PCMCI+ method indicates a confounded relationship (Fig. 3a), while both linear and nonlinear Granger methods show no link between the two (Fig. 4). However, with Kaplan SST, PCMCI+ shows a positive link with IOD driving ISMR, which is also confirmed by the Granger method with ERSST and Kaplan SST, consistent with indications from some previous studies23. Therefore, while the IOD may have a driving influence on the ISMR, the IOD–ISMR relationship is not as robust as those between the ENSO and ISMR and between the AMO and ISMR. In conclusion, our findings make a powerful case for a revision of the assumed causality of ENSO as the primary driver of the ISMR and its inter-annual variability. It is also notable that the PCMCI+ method indicates that ISMR has a driving influence on the ENSO (Fig. 3a), while the Granger method does not indicate such a driving link from ISMR to ENSO (Fig. 4b). This indicates that ISMR driving ENSO is not as robust as ENSO driving ISMR.

The relationship between NA-SST and ISMR is intrinsically nonlinear. Therefore, a linear correlation between JJAS NA SST and ISMR tends to be small and often insignificant. Chattopadhyaya79 have shown that the NA-SST can influence Indian monsoon rainfall only when deep tropical SST during JJAS is close to the climatology or only during the years when ENSO is in transition from one phase to the other. Also, Borah et al.56 show that all non-El Nino droughts of the Indian monsoon are associated with strong negative NA-SST in negative phases of the AMO when the deep tropical SST anomalies are close to climatology. To bring out the NA SST and ISMR relationship when the ENSO is near neutral, we examine a scatter plot of NA-SST against ISMR for the years when Nino3.4 JJAS SST is within ±0.25 s.d.; it indicates a statistically significant correlation of 0.47 (Supplementary Fig. 10). Thus, the NA SST could potentially explain up to 20% of ISMR variability. The ENSO–ISMR relationship, on the other hand, has a significant linear component (Supplementary Fig. 1c). Based on the lag-0 correlation in Supplementary Fig. 1c, the ENSO also could explain up to 20% of ISMR variability, roughly consistent with the strengths of the causal relationships (Figs. 3 and 4).
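The step from a correlation of 0.47 to roughly 20% of variance follows from the fact that, for a linear fit, the fraction of variance explained is r², and 0.47² ≈ 0.22. The Python sketch below illustrates the ENSO-neutral subsetting and this arithmetic on synthetic data. The function name, the standardization of the Nino3.4 series, and the toy inputs are assumptions for illustration only; this is not the code behind Supplementary Fig. 10.

```python
import numpy as np

def neutral_year_correlation(nino34, na_sst, ismr, threshold=0.25):
    """Correlate NA SST with ISMR over years when |standardized Nino3.4| <= threshold.

    Inputs are 1-D arrays of JJAS seasonal values, one per year (assumed layout).
    Returns (r, r**2, number of years used).
    """
    nino34 = np.asarray(nino34, dtype=float)
    z = (nino34 - nino34.mean()) / nino34.std()   # standardized ENSO index
    neutral = np.abs(z) <= threshold              # ENSO-neutral years
    r = np.corrcoef(np.asarray(na_sst)[neutral], np.asarray(ismr)[neutral])[0, 1]
    return r, r ** 2, int(neutral.sum())

# Synthetic illustration only (not observed data).
n_years = 41
nino34 = np.linspace(-2.0, 2.0, n_years)
na_sst = np.sin(np.arange(n_years))
ismr = 2.0 * na_sst + 1.0                         # perfectly linear toy relation
r, r2, n_used = neutral_year_correlation(nino34, na_sst, ismr)

# Variance fraction implied by the reported correlation of 0.47.
explained = 0.47 ** 2
```

With observed series, `r2` would be the quantity compared against the "up to 20%" statement; the toy relation here is exactly linear only to keep the example checkable.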

An extension of the composite of JJAS SST anomalies during non-EN drought years over the period between 1871 and 2015 indicates a global pattern of SST anomaly similar to that of Borah et al.56 but with a weak +ve SST anomaly in the equatorial eastern Pacific. The global SST anomalies seem to complement the dominant SST signal in the NA in weakening the ISMR: a weak El Nino in the eastern Pacific and –ve SST anomalies over the IO tend to weaken ISMR through reduced moisture convergence (Supplementary Fig. 11a). The global pattern associated with non-La Nina floods (Supplementary Fig. 11b) is almost a mirror opposite of that associated with non-EN droughts, with a weak La Nina in the eastern Pacific and warm SST anomalies over the IO complementing the +ve NA SST anomalies in strengthening ISMR. The non-EN drought and non-La Nina flood years marked on the time series of NA SST anomaly indicate that while most non-EN droughts occur during a negative phase of multi-decadal variability of NA SST, the non-La Nina floods tend to occur during positive phases of the NA SST multi-decadal variability. We also construct JJAS SST composites for El Nino (EN) Indian monsoon droughts and La Nina Indian monsoon floods during the period between 1871 and 2020 (Supplementary Fig. 12). While the Pacific Ocean is dominated by the ENSO signal, it is notable that negative (positive) SST anomalies over the NA tend to complement the weakening (strengthening) tendency due to El Nino (La Nina). However, we note that over the IO the SST anomalies do not cooperate and are just a response to the large-scale winds associated with a weak (strong) monsoon due to El Nino (La Nina).

The latest climate models from the Coupled Model Intercomparison Project Phase 6 (CMIP6) still have large biases in simulating the present Indian summer monsoon climate as well as the AMO81. The observed periodicity of the multi-decadal modes of both ISMR and NA SST is ~65 years and the variance explained by the observed ISMR multi-decadal mode is ~7%. We have examined how the coupled climate models simulate the teleconnection between ISMR and NA SST. As long as the models simulate a reasonable amplitude of the ISMR multi-decadal mode, the period of the NA SST multi-decadal mode and that of the ISMR multi-decadal mode in CMIP6 models are strongly correlated (as in observations) (Supplementary Fig. 7b). Amongst these models, the MPI_ESM1-2_HR simulates the multi-decadal variability of ISMR, as well as that of the NA SST, reasonably well, as shown by the 11-year moving average (Supplementary Fig. 7c, d and Supplementary Table 1). Composites of simulated meridional velocity (V) at 200 hPa for the non-ENSO years during positive and negative phases of the simulated ISMR multi-decadal mode (Supplementary Fig. 9a, b) indicate a Rossby wave-train similar to the one associated with the observed ISMR multi-decadal mode (Supplementary Fig. 8). However, the meridional wind anomalies simulated by the model during the positive phase (Supplementary Fig. 9a) are shifted to the east by about 10–15 degrees longitude over the Indian monsoon region. An examination of the causality (similar to Fig. 3a) between simulated NA SST and ISMR in the presence of ENSO and other potential drivers, and of the associated Rossby wave-train in the model (Supplementary Fig. 9c), indicates that the AMO is an independent driver of ISMR in the model simulations too. While the simulated ISMR has a driving influence on the simulated ENSO, the model does not simulate the reverse as seen in observation. An examination of causality in other CMIP6 models indicates that at least three other models simulate the AMO as a driver of the ISMR (not shown). The linear correlation between simulated ISMR and Nino3.4 in the model is r = 0.43, compared to r = 0.53 in observations, indicating that the model is biased toward a weaker ENSO–ISMR relationship than observed. We also find that most other driving links are highly model dependent. In order for the seasonal forecast models to realize the potential skill indicated by associations with the ENSO and the NA-SST, the models must simulate the variability of the potential drivers and their teleconnection with ISMR with fidelity. Efforts are needed to reduce the biases of coupled models in simulating ENSO and NA SST variability and teleconnection in order to improve the current poor skill of prediction of ISMR by most models.

Our findings settle a long-standing debate on whether the AMO is a credible driver of the ISMR variability, provide a basis for higher potential predictability of ISMR and highlight the need to embrace the role of extra-tropical SST in the framework of predictability and in seasonal prediction of ISMR. On a broader question, they also settle that the extra-tropical SST could clearly influence the tropical climate on seasonal and subseasonal time-scales. In light of these findings, here we propose that it is time to go beyond the legacy of TOGA and embrace the role of the extra-tropical SST in the conceptual framework for seasonal predictability of tropical climate.

Methods

Observed data and definition of the climate indices

The fixed station-based monthly rainfall data from the long historical dataset of Parthasarathy82 (1871–2016) are used as a measure of the ISMR. The AMO is defined as the box-averaged SST anomalies over the North Atlantic box (0°–60°N, 75°W–5°W). The NAO index is defined as the difference in the box-averaged values of standardized sea level pressure over the North Atlantic between the boxes 37.5°N–42.5°N, 32.5°W–27.5°W and 62.5°N–67.5°N, 22.5°W–17.5°W. The Nino3.4 index is defined as the box-averaged SST over the central Pacific (5°S–5°N, 170°W–120°W). The PDO index is defined as the leading principal component of the North Pacific monthly sea-surface temperature variability poleward of 20°N. The Atlantic Nino (At-Nino) is defined as the SST anomaly over the tropical Atlantic region 20°W–0°E, 3°S–3°N. The IOD is defined as the difference in SST between the boxes 10°S–10°N, 50°E–70°E and 10°S–0°N, 90°E–110°E. The SST fields, obtained from COBE SST2 data83 (1850–2016), are used as the primary SST field for deriving monthly AMO and ENSO indices. The monthly atmospheric fields such as SLP, winds, precipitation and geopotential heights are obtained from NCEP 20CR V384. The NCEP 20CR V3 is also used for deriving the NAO index and vorticity fields.
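As an illustration of how box-averaged indices of this kind can be computed from a gridded field, the following Python sketch averages a (time, lat, lon) array over a latitude–longitude box. The cosine-of-latitude area weighting, the −180° to 180° longitude convention, the absence of missing values, and the names `box_mean`, `lats`, `lons` are assumptions made for the example; the text does not specify these details.

```python
import numpy as np

def box_mean(field, lats, lons, lat_bounds, lon_bounds):
    """Area-weighted mean of field[time, lat, lon] over a lat/lon box.

    Longitudes are assumed to be in degrees east on [-180, 180);
    weights are cos(latitude); the field is assumed to have no missing values.
    """
    lat_mask = (lats >= lat_bounds[0]) & (lats <= lat_bounds[1])
    lon_mask = (lons >= lon_bounds[0]) & (lons <= lon_bounds[1])
    sub = field[:, lat_mask][:, :, lon_mask]           # (time, n_lat, n_lon)
    w = np.cos(np.deg2rad(lats[lat_mask]))[:, None]    # (n_lat, 1)
    weights = np.broadcast_to(w, sub.shape[1:])        # (n_lat, n_lon)
    return (sub * weights).sum(axis=(1, 2)) / weights.sum()

# Synthetic 1-degree grid and a toy field whose value equals the time index.
lats = np.arange(-89.5, 90.0, 1.0)
lons = np.arange(-179.5, 180.0, 1.0)
field = np.ones((3, lats.size, lons.size)) * np.arange(3.0)[:, None, None]

# North Atlantic box from the text: 0-60N, 75W-5W.
amo_raw = box_mean(field, lats, lons, (0.0, 60.0), (-75.0, -5.0))
amo = amo_raw - amo_raw.mean()   # anomaly relative to the time mean (toy choice)

# Nino3.4 box from the text: 5S-5N, 170W-120W.
nino34_raw = box_mean(field, lats, lons, (-5.0, 5.0), (-170.0, -120.0))
```

For real monthly data, the anomaly would normally be taken relative to a monthly climatology rather than the overall time mean used in this toy example.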

While the SST anomalies in the extra-tropics at synoptic time-scale are driven by the atmosphere, we provide evidence that at intraseasonal time-scale the SST in the extra-tropics can drive a barotropic vorticity in the atmosphere aloft. In order to establish the robustness of the estimated relationship between the intraseasonal SST and barotropic vorticity, weekly averaged data from two sets of daily reanalyses are used, namely (1) ECMWF daily SST fields and vorticity fields obtained from ERA-585 and (2) NOAA OI SST weekly SST86 along with NCEP 20CRv3 daily fields. All the datasets were converted to weekly averaged fields and interpolated to 1° × 1° resolution. To demonstrate the day-to-day variability of the barotropic vorticity over the NA and its association with core-monsoon region rainfall evolution, daily 1° × 1° rainfall from the India Meteorological Department9, together with ERA-5 daily vorticity fields, is used.

Mode decomposition method

For the extraction of the multi-decadal modes (MDMs) of various fields, an improved variant of the noise-assisted Complete Ensemble Empirical Mode Decomposition method (iCEEMDAN, hereinafter referred to as CEEMD)87,88,89 has been employed and the regression period has been limited to between 1850 and 2015 (see Supplementary Information for more details).

Designing the causal inference network framework

Our primary goal in this study is to examine the causal relationship between the AMO and the ISMR. As the ENSO is another known driver of the ISMR, the AMO–ISMR causality must be examined in a multivariate framework in the presence of the ENSO. However, other potentially interacting drivers of ISMR like the Pacific Decadal Oscillation (PDO), the Atlantic Nino (At-Nino), the North-Atlantic Oscillation (NAO) and the Indian Ocean Dipole Mode (IOD) could also influence the directionality and/or strength of the causality between AMO and ISMR. Therefore, in this study, we attempt to quantify the causal relation between AMO and ISMR in the presence of other potential drivers, namely the PDO, NAO, ENSO, At-Nino and IOD. We estimate the causal network between these variables using historical and simulated data.

The causal connection between the AMO and the ISMR is addressed in two steps. In order to bring out simultaneous connections on a seasonal time-scale, we examine the causality between ISMR and the other potential drivers using monthly mean anomalies during boreal summer. For this purpose, time series of ISMR and the drivers, namely the AMO, the ENSO (Nino3.4 SST), the NAO, the PDO, the IOD and At-Nino, are constructed using monthly anomalies for 6 months (May–October) for the period between 1871 and 2015. The causal discovery algorithms are applied to the six interacting nodes. However, the connection on monthly and seasonal time-scale is actually a result of a teleconnection on sub-seasonal time-scales54,56. We find that the barotropic vorticity (BV) over the NA is phase-locked with the annual cycle, with a similar phase locking of upper level vorticity (IUV), lower level vorticity (ILV), and ISMR over the core-monsoon region on intraseasonal time-scale (see Fig. 3b). The finding suggests a plausible causal hypothesis in which the BV drives IUV, which in turn drives ILV, resulting in the ISMR intraseasonal phase locking. To verify the above hypothesis, we apply the causal discovery framework to boreal summer (JJAS) daily data, smoothed with a 7-point moving average to remove the high-frequency day-to-day weather fluctuations and to bring out the background intraseasonal variability clearly. In this framework, we develop a causal network that consists of six nodes, namely (a) SST over NA (NA-SST), (b) the NAO, (c) the barotropic vorticity over NA (BV), which is computed as the 700–200 hPa layer average vorticity over the domain 60°W–0°E, 30°N–65°N, (d) rainfall over the core-monsoon region in Central India (72°E–86°E, 14°N–28°N) (CI-ISMR), (e) upper level (200 hPa) mean vorticity over the Indian monsoon region bounded by the domain 60°E–90°E, 10°N–35°N (IUV) and (f) lower level (850 hPa) mean vorticity over the same domain (ILV).
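The 7-point moving average applied to the JJAS daily data can be sketched as follows. The sketch assumes one row of 122 daily values (1 June to 30 September) per year and smooths each season separately so that values from different years are never averaged together; whether the smoothing in the study was applied before or after extracting JJAS is not stated, so this ordering, like the name `smooth_jjas`, is an assumption.

```python
import numpy as np

def smooth_jjas(daily, window=7):
    """Apply a centred moving average to each JJAS season separately.

    daily: array of shape (n_years, 122), one row of JJAS daily values per year.
    Returns an array of shape (n_years, 122 - window + 1); the first and last
    (window - 1) / 2 days of each season are dropped rather than padded.
    """
    kernel = np.ones(window) / window
    return np.array([np.convolve(season, kernel, mode="valid") for season in daily])

# Toy input: two identical seasons containing a linear ramp 0, 1, ..., 121.
daily = np.tile(np.arange(122.0), (2, 1))
smoothed = smooth_jjas(daily)
```

Because a centred average leaves a linear ramp unchanged, the smoothed toy series equals the original values at the window centres, which gives a simple check that the alignment is correct.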

PCMCI+ causal discovery

We estimate the presence/absence and directionality of edges between each pair of these six nodes to indicate their causal relationships, on the basis of their temporal evolution and mutual interactions. Such estimation is done using the PCMCI+67 algorithm under the conditional independence-based causal discovery framework, and validated using the Granger causality framework. The total data length spans 36 years, where the SST is obtained from the NOAA/NESDIS/NCEI Daily Optimum Interpolation Sea-Surface Temperature (SST), version 2.0, dataset (OISST v2.090, 0.25 degree) and the 2.5-degree vorticity fields from the NCEP/DOE AMIP-II Reanalysis (Reanalysis-2). In order to isolate the teleconnection pathways and direction of causality between several interacting drivers and ISMR, we employ the PCMCI+ algorithm67, which is an extended version of another algorithm called PCMCI66,78. A data-driven causal inference method, PCMCI+ flexibly combines linear or nonlinear conditional independence tests with a directed graphical causal model (DGCM) to estimate causal networks from large-scale time series datasets. The causal discovery method, based on the Peter and Clark (PC) algorithm74 combined with the Momentary Conditional Independence (MCI) approach66, has been demonstrated to extract causal networks from multiple interacting time series and has been found suitable for datasets with correlated variables66,73. PCMCI involves a two-step process: a condition-selection step (PC), a modification of the Peter and Clark algorithm that attempts to narrow down the number of connections between variables, followed by a momentary conditional independence (MCI) step, which tests the links for causal relationships that could be represented through a fully connected causal network graph. Compared to other methods, PCMCI+ has higher detection power and, especially, better orientation of contemporaneous links, with better control of false-positive links67, and can properly depict the temporal dependency structure of underlying complex dynamical systems. PCMCI+ can identify the full, lagged and contemporaneous causal graph under the standard assumptions of causal sufficiency and the Markov condition67. The central idea of the PCMCI+ method is to increase the effect size in individual conditional independence (CI) tests to achieve higher detection power while maintaining control of false positives65,78, thereby improving the reliability of the CI tests. For a detailed description of the PCMCI algorithm one can refer to Runge78, and some of its recent applications can be found in refs. 68,69 and refs. 70,71.
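The building block of conditional independence-based causal discovery is a test of whether two variables remain dependent once a third is accounted for. The Python sketch below shows the simplest linear version of such a test, a partial correlation, on synthetic data in which a common driver `z` forces both `x` and `y`. It is a conceptual illustration only, not the PCMCI+ algorithm and not the implementation used in this study; the function name and the synthetic coefficients are assumptions.

```python
import numpy as np

def partial_correlation(x, y, z):
    """Correlation of x and y after linearly regressing out z (plus an intercept)."""
    design = np.column_stack([np.ones(len(x)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

# Synthetic example: z is a common driver of x and y; x and y do not interact.
rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal(n)
x = z + 0.5 * rng.standard_normal(n)
y = z + 0.5 * rng.standard_normal(n)

raw = np.corrcoef(x, y)[0, 1]                 # large: x and y share the driver z
conditioned = partial_correlation(x, y, z)    # near zero once z is conditioned on
```

A plain correlation would suggest a link between `x` and `y`, whereas conditioning on `z` removes it. PCMCI-type methods apply tests of this kind to lagged variables, with the conditioning sets chosen by the condition-selection step.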

Granger causality

As the efficiency of causal inference algorithms in identifying the true causality and its direction depends on the underlying assumptions, we test the robustness of our conclusions from the PCMCI+67 algorithm using another causal inference framework (Granger causality) that uses a slightly different set of assumptions. While lagged linear regression analysis can often provide valuable information about causal relationships, lagged regression is susceptible to over-reporting significant relationships when one or more of the variables has substantial memory (auto-correlation). Granger causality analysis, pioneered by Granger72, on the other hand, estimates time-lagged causal associations using an autoregressive model framework implemented with standard regression techniques while taking into account the memory of the data, and is therefore not susceptible to this issue. Some argue78 that, when applied to high-dimensional datasets using standard regression techniques, Granger causality leads to low detection rates owing to the limited sample size of typical climate time series (e.g., a monthly time resolution with 30 years of satellite data). In the present application, the use of monthly anomalies for 6 months over 144 years gives a reasonably large sample size, which is expected to overcome this issue. Granger causality has been successfully used in some suitably selected climate networks91,92,93,94.

Given two time series X and Y, when we try to predict Y(t) using the past values of Y and X, X may be called a Granger-cause of Y if the past values of X and Y (e.g., a linear combination of X(t – 1), X(t – 2), …, X(1), Y(t – 1), Y(t – 2), …, Y(1)) can be used to predict Y(t) better than the past values of Y alone. This definition can be easily extended to more than two time series. Consider two prediction models for Y(t):

$$Y(t)_{b} = F\left( Y(t-1),\; Y(t-2),\; \ldots,\; Y(1) \right)$$
(1)

and

$$Y(t)_{f} = F\left( Y(t-1),\; Y(t-2),\; \ldots,\; Y(1),\; X(t-1),\; \ldots,\; X(1) \right)$$
(2)

The first model is called the base model and the second the full model. If the performance of the full model in predicting Y(t) is significantly better than that of the base model, then we can say that X Granger-causes Y. The predictive models can use any function F. If the predictive function F is linear (e.g., linear regression), the Granger causality is called linear Granger causality, and if F is nonlinear (e.g., a decision tree or a neural network), it is called nonlinear Granger causality. For linear Granger causality, ridge regression is used as the predictive model to avoid over-fitting, and for nonlinear Granger causality a Random Forest95 with 100 trees is used as the predictive model92. In order to test the robustness of the causality directions found by the PCMCI+ method, we also present results using Granger causality on the same set of time series. More details of Granger causality, including optimization of the models as used in this study, may be found in the Supporting Online Material.
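The comparison of a base model and a full model can be sketched in Python with a closed-form ridge regression. The lag order (`max_lag=2`), ridge penalty (`alpha=1.0`), 70/30 train/test split, function names, and synthetic series below are all illustrative assumptions rather than the settings of this study, and the sketch compares out-of-sample mean squared errors without the significance test that a full Granger analysis requires.

```python
import numpy as np

def lagged_matrix(series_list, max_lag):
    """Columns are lags 1..max_lag of each series, aligned with targets at t = max_lag..n-1."""
    n = len(series_list[0])
    cols = [s[max_lag - lag : n - lag] for s in series_list for lag in range(1, max_lag + 1)]
    return np.column_stack(cols)

def ridge_fit_predict(X_train, y_train, X_test, alpha):
    """Closed-form ridge regression on centred data; the intercept is not penalized."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(Xc.shape[1]), Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta

def granger_mse(x, y, max_lag=2, alpha=1.0, train_frac=0.7):
    """Return (base MSE, full MSE) for predicting y from its own past vs. its past plus x's past."""
    target = y[max_lag:]
    base = lagged_matrix([y], max_lag)       # past of Y only
    full = lagged_matrix([y, x], max_lag)    # past of Y and X
    split = int(train_frac * len(target))
    errors = []
    for X in (base, full):
        pred = ridge_fit_predict(X[:split], target[:split], X[split:], alpha)
        errors.append(float(np.mean((target[split:] - pred) ** 2)))
    return tuple(errors)

# Synthetic example in which x drives y with a one-step lag, but not the reverse.
rng = np.random.default_rng(0)
n = 1000
x = rng.standard_normal(n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.3 * rng.standard_normal(n - 1)

mse_base, mse_full = granger_mse(x, y)            # does x help predict y?
mse_base_rev, mse_full_rev = granger_mse(y, x)    # does y help predict x?
```

In this toy case the full model has a much smaller error than the base model for x → y, while adding the past of y gives essentially no improvement for y → x. Replacing `ridge_fit_predict` with a nonlinear regressor would give the nonlinear variant of the same comparison.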