Introduction

Traditionally, June–September (JJAS) is the accepted season for the Indian Summer Monsoon1,2,3,4,5. However, some studies6 indicate that the Indian summer monsoon rainy season over Northeast India (NEI) could be as long as 170 days, while that over Northwest India (NWI) could be as short as 70 days. Not only is the amplitude of the annual cycle of rainfall over NEI much larger than that over Central India (CI), but May rainfall over the NEI is also larger than June rainfall over CI (Fig. 1a). Therefore, the ‘rainy season’ over the NEI is clearly much longer than JJAS. The question is whether the May rainfall is part of the ‘Indian Summer Monsoon Rainfall (ISMR)’. If we define the rainy season to be the season when the climatological daily rainfall exceeds 50% of its annual range7, the length of the rainy season (LRS) over the NEI is ~155 days while that over the CI is ~122 days. The conventional wisdom has been that the May rainfall is ‘pre-monsoon,’ and the India Meteorological Department (IMD) places the ‘onset’ of monsoon over the NEI at about 5th June8. One other reason why we reject the view that May rainfall over NEI is ‘pre-monsoon’ is that it comes in spells of persistent ‘synoptic scale’ rainfall lasting 3–7 days (Fig. 1b–f). In a broader context, it is important to recognize the east-west asymmetry of the LRS of the Indian monsoon, as climate change may impact it differently over the NEI compared to that over the CI and NWI. A recent study9 indeed shows that by the end of the century, while the LRS is going to decrease over NEI, it is expected to significantly increase over the NWI. The other important implication is that the exclusion of May rainfall over NEI from the quantum of ‘seasonal’ ISMR is a source of significant uncertainty in all variability and predictability studies of ISMR so far and warrants a reexamination of these studies and their association with predictable drivers like the ENSO3,10,11,12,13,14,15.
A recent study16, using monthly data, argues that the May rainfall is summer monsoon rainfall and demonstrates that the inclusion of May rainfall in the seasonal monsoon rainfall significantly influences the variability and predictability of the NEI seasonal (May–September, MJJAS) rainfall. The present study comprehensively addresses the question of ‘objectively’ defining the rainy season over NEI based on daily data and examines the variability and predictability of the ‘Length of the Rainy Season’ (LRS).

Fig. 1: Annual cycle and composites of May Precipitation and Wind anomaly at 850 hPa.

a The box-averaged climatological annual cycle (solid line) of daily precipitation (TRMM 3B42), computed from 1998 to 2019, along with its smoothed cycle (dashed line, mean + 3 harmonics) over CI (blue) and NEI (green). ‘A’ signifies the amplitude of the smoothed cycle. The intersections of the horizontal dashed line (i.e., half of the smoothed curve’s amplitude, ‘A/2’) with the smoothed curve give the ‘Onset’ and ‘Withdrawal’ dates of the summer monsoon according to Wang et al.7. b–f Composites of spells (precipitation above 1 s.d. persisting for 3 days or more) for the month of May, along with wind anomalies for the same spell days, based on an index computed as area-averaged precipitation over NEI from 1998 to 2019. The precipitation data are from TRMM 3B42, and the wind is from daily ERA5 at 850 hPa.
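
The half-amplitude definition used in panel a can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the threshold is taken here as the minimum of the smoothed cycle plus half its range (one reading of ‘A/2’), the rainy season is assumed not to wrap around the calendar year, and the input climatology is synthetic.

```python
import numpy as np

def smooth_harmonics(clim, n_harmonics=3):
    """Keep the annual mean plus the first n_harmonics Fourier harmonics."""
    coeffs = np.fft.rfft(clim)
    coeffs[n_harmonics + 1:] = 0.0          # drop all higher harmonics
    return np.fft.irfft(coeffs, n=len(clim))

def half_amplitude_season(clim, n_harmonics=3):
    """Return (onset_index, withdrawal_index, length_in_days).

    Assumption: the season is the span of days on which the smoothed
    climatology exceeds min + 0.5 * (max - min), and it lies within one
    calendar year (no wrap-around).
    """
    smooth = smooth_harmonics(np.asarray(clim, dtype=float), n_harmonics)
    threshold = smooth.min() + 0.5 * (smooth.max() - smooth.min())
    wet = np.flatnonzero(smooth > threshold)
    onset, withdrawal = int(wet[0]), int(wet[-1])
    return onset, withdrawal, withdrawal - onset + 1

# Synthetic single-peaked climatology in mm/day (illustrative only, not TRMM data).
days = np.arange(365)
synthetic = 6.0 + 5.0 * np.cos(2 * np.pi * (days - 200) / 365)
onset, withdrawal, length = half_amplitude_season(synthetic)
```

Indices are zero-based days of the year; applying the same function to the CI and NEI box-averaged climatologies would give region-specific season lengths under the stated assumptions.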

While attempting to define local ‘onset,’ ‘withdrawal,’ and local LRS, Mishra et al.6 estimate that the LRS over NEI could be as long as ~170 days, that over the CI ~110 days, and that over the extreme NWI as short as ~70 days. The longer LRS over the NEI is due partly to an ‘onset’ earlier than the Monsoon Onset over Kerala (MoK), as well as a delayed withdrawal compared to that over the southern tip of India. However, this fact has not been recognized, and according to the latest official document8, the official climatological ‘onset’ over the NEI as a whole is around 5th June. As a result, studies17,18,19,20 that investigate the dynamics of inter-annual variability and trend of the NEI rainfall all use JJAS as the monsoon season, and the robustness of their conclusions remains uncertain. Hence, there is an urgent need to objectively define the ‘onset,’ ‘withdrawal,’ and LRS over the NEI and explore their potential drivers. Mishra et al.6 define the local ‘onset’ and ‘demise’ based entirely on local accumulated rainfall anomalies. Unlike the MoK, the onset over the NEI is not driven by the ITCZ, as even in early June, the ITCZ does not reach the NEI. Hence, the organization of convection during the onset of the Indian monsoon over the NEI is driven by processes other than the ITCZ. Therefore, the local onset over NEI defined by Mishra et al.6 based on rainfall alone may be reasonable. In this study, we use an objective method based on the tropospheric temperature gradient (no rainfall information used)21 to define the onset, withdrawal, and LRS for the Indian summer monsoon heat source and show that the onset of the Indian monsoon over NEI indeed takes place in mid-May (Fig. 2a, see Methods), reinforcing the findings of Mishra et al.6.
Our findings firmly establish that the monsoon season in the NEI lasts from the 2nd week of May to the 2nd week of October, a fact not yet widely recognized by the Indian monsoon community. Further, we explore the potential drivers that control these key climate phenomena in the NEI.

Fig. 2: Annual cycle of TT gradient (ΔTT).

The climatological daily mean ΔTT is shown from observation (NCEP-v3, green, 1976–2015) and from the ensemble mean of CMIP6 models, for both Historical (blue, 1979–2014) and Projection with the SSP5–8.5 scenario (red, 1970–2100). The start and end ‘zero crossings’ of ΔTT are defined as the ‘Onset’ and ‘Withdrawal’ dates of the summer monsoon, and the days between them are the ‘Length of Rainy Season (LRS).’ The North and South boxes for computing ΔTT are shown in the inset figure. The ΔTT is shown for three regions over South Asia: a NEI, b CI, and c All India (AI).
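
As a rough illustration of the zero-crossing definition in this caption, the sketch below picks out the season from a daily ΔTT series. It is an assumption-laden sketch rather than the published procedure: `season_from_delta_tt` and `to_date` are hypothetical names, the season is taken as the longest unbroken run of positive ΔTT (to ignore short spurious sign changes), LRS is counted inclusively, and the test series is synthetic, with its sign changes placed on the climatological dates quoted in the text.

```python
from datetime import date, timedelta
import numpy as np

def season_from_delta_tt(delta_tt):
    """Return (onset_index, withdrawal_index, lrs_days) or None.

    Assumption: the monsoon season is the longest contiguous run of
    days with delta_tt > 0; both end days are counted in the length.
    """
    positive = np.asarray(delta_tt) > 0
    # Pad with False so every positive run has a start and an end edge.
    padded = np.concatenate(([False], positive, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)      # first positive day of each run
    ends = np.flatnonzero(edges == -1) - 1   # last positive day of each run
    if starts.size == 0:
        return None
    longest = int(np.argmax(ends - starts))
    onset, withdrawal = int(starts[longest]), int(ends[longest])
    return onset, withdrawal, withdrawal - onset + 1

def to_date(day_index, year=2001):
    """Convert a zero-based day-of-year index to a calendar date (non-leap year)."""
    return date(year, 1, 1) + timedelta(days=int(day_index))

# Synthetic series: negative all year except a positive block and a 2-day blip.
delta_tt = -np.ones(365)
delta_tt[131:288] = 1.0    # positive from 12 May through 15 October
delta_tt[50:52] = 0.5      # short spurious positive spell, ignored
onset, withdrawal, lrs = season_from_delta_tt(delta_tt)
```

With this counting convention, an onset on 12 May and a withdrawal on 15 October give a season of 157 days.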

The unique temperature and rainfall regime created by high rainfall over an extended rainy season sustains a unique biodiversity hotspot over the region22,23. Because the LRS and LRS-rainfall (LRS-RF) are strongly correlated, reliable estimates of how the LRS is expected to change in the coming decades under increasing greenhouse gas forcing are needed for adaptation and for building resilience to climate change impacts over the NEI, as such changes may have a significant impact on the biodiversity, hydrological disasters, and food production of the region. Therefore, here we also examine the projected changes of ‘Onset,’ ‘Withdrawal,’ and ‘LRS’ over the NEI in a set of high-resolution CMIP6 models using the same objective definition of the season. The implications of the findings are discussed.

Results

‘Onset date (OD),’ ‘withdrawal date (WD),’ and ‘length of the rainy season (LRS)’ over the NEI

Based on an ‘objective’ definition of OD, WD, and LRS (see Methods), the change of sign of the tropospheric temperature gradient is calculated from ‘observations,’ ‘historical simulations,’ and ‘projections’ and shown in Fig. 2. Climatologically over the period 1901–2015, the ‘onset’ takes place on 12th May and the ‘withdrawal’ takes place on 15th October, giving an LRS of 157 days (Supplementary Table 1) based on NCEP-v3 data. The standard deviations (s.d.) of inter-annual variability of OD, WD, and LRS are 5, 7, and 9 days, respectively. It is notable that the s.d. of OD, WD, and LRS over the NEI are 2–3 days smaller than their counterparts over All India21. It is also notable that the shortest LRS of 131 days occurred in 1987, a year of large-scale drought over the country (standardized ISMR anomaly = −1.82) associated with a strong El Niño in the Pacific (standardized MJJAS Nino3.4 SST = +2.15). The climatological ΔTT calculated over a longer period (1901–2015) (Supplementary Fig. 1) is not significantly different from that calculated for the recent period (1979–2015, Supplementary Fig. 2a), with LRS over NEI being 155 days.

The OD, WD, and LRS over the NEI calculated based on ΔTT (see Methods) from NCEP-v3 between 1901 and 2015 show a statistically significant (at 95% confidence level) increasing trend for the OD (Fig. 3a), indicating that the ‘onset’ is being systematically delayed over the NEI. Similarly, the WD has a weak decreasing trend, indicating that the ‘withdrawal’ has a tendency to occur earlier than normal over the NEI (Fig. 3b). As a result, the LRS has a significant (at 95% confidence level) decreasing trend (Fig. 3c), with a decrease of about 7 days during the period. The decreasing trend of LRS over the NEI is faster during the past six decades (after 1950) and appears to be related to a faster decreasing trend of the thermodynamic index of Indian monsoon (TISM, see Methods) (Fig. 3d), leading to a faster decrease of rainfall over the season (LRS-RF) (Fig. 3e). As the mean rainfall over the Indian monsoon region correlates strongly with the vertically integrated moisture flux convergence (VIMFC) over the region, the decreasing trend in LRS-RF is consistent with a decreasing trend of VIMFC (Fig. 3f). While the precipitable water (PW) content in the region has increased during the period (Fig. 3h) in accordance with increasing air temperature, the decreasing trend of VIMFC is a result of a faster decreasing trend of wind convergence in the lower atmosphere (Fig. 3g). The faster tropical Indian Ocean (IO) warming during the same period, which warms the atmosphere over the tropical IO and thereby decreases the north-south ΔTT over the whole Indian monsoon region, appears to be the primary driver of the decrease of LRS and LRS rainfall over the NEI during this period24.

Fig. 3: Inter-annual variability and its trends over NEI.

The box-averaged time series and their linear trends of anomalous a onset day, b withdrawal day, c LRS, d TISM, e accumulated rainfall, f VIMFC, g wind convergence at 850 hPa, and h precipitable water are shown. The significance of the trends at the 95% level is tested with the Mann–Kendall nonparametric method; a trend with p < 0.05 is significant. Here, a +ve (−ve) anomaly of onset day means a ‘delayed (early)’ monsoon onset, and vice versa for withdrawal day. The data used in this figure are from NCEP-v3 (1901–2015), except for ‘Rainfall’, which is from the NEI24 dataset (1920–2009). The computations for d–h are done for the summer monsoon season (i.e., days between ‘Onset’ and ‘Withdrawal’ day for a year, as defined from ΔTT).
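
For readers unfamiliar with the Mann–Kendall test named in this caption, a minimal sketch of the standard two-sided version follows. It uses the normal approximation without a tie correction, which is a simplification; the function name is hypothetical and the implementation is not taken from the paper.

```python
import math
import numpy as np

def mann_kendall(x):
    """Two-sided Mann-Kendall trend test without tie correction.

    Returns (S, z, p): the S statistic, the continuity-corrected
    standard normal score, and the two-sided p-value.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # diffs[i, j] = sign(x[j] - x[i]); the strict upper triangle holds pairs i < j.
    diffs = np.sign(x[None, :] - x[:, None])
    s = float(np.triu(diffs, k=1).sum())
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p

# A trend is called significant at the 95% level when p < 0.05.
s, z, p = mann_kendall(np.arange(20))
```

The sign of S gives the direction of the trend, so a positive S for the onset-day anomaly would correspond to a progressively delayed onset.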

ENSO’s impact on LRS over the NEI is studied using daily ΔTT composites during El Niño and La Niña years. The composites show a contraction (expansion) of LRS during El Niño (La Niña) years (Fig. 4b), as was noted for the large continental-scale monsoon25. The ENSO achieves this by weakening the north-south ΔTT relative to the climatological ΔTT over the region during an El Niño while strengthening the ΔTT during a La Niña25. Composite TT anomalies for El Niño and La Niña years (Fig. 4c, d) indicate that on a continental scale, the TT anomalies nearly reverse from El Niño to La Niña, thereby weakening (strengthening) the north-south ΔTT over the region. While the reversal of the ΔTT between El Niño and La Niña over the CI is large, that over the NEI region is weaker. This indicates that the ENSO influence is somewhat weaker over the NEI than over the CI. Thus, the anomaly associated with the extra-tropical stationary wave pattern over northern India and southern Eurasia during El Niño (La Niña) is such that it weakens (strengthens) the ΔTT over the region and shortens (lengthens) the LRS.

Fig. 4: ΔTT of extreme and ENSO years.

a The temporal evolution of ΔTT for two extreme years with the longest (1964, blue) and shortest (1987, red) season shows the expansion and contraction of LRS, respectively, over NEI. b The composites of the annual cycle of ΔTT for El Niño (red) and La Niña (blue) years during 1901–2015 are shown over NEI, where the ‘gray’ curve is the long-term daily mean ΔTT. c, d The seasonal (i.e., days between ‘Onset’ and ‘Withdrawal’ day for a year, as defined from ΔTT) spatial structure of the TT anomaly over the South Asian monsoon domain for c El Niño and d La Niña years.
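
The compositing in panel b can be sketched as below. The selection rule is an assumption made only for illustration: years are classed as El Niño or La Niña when a standardized Niño3.4 index exceeds plus or minus one standard deviation, a threshold the text does not specify. `enso_composites` is a hypothetical name and the inputs are synthetic.

```python
import numpy as np

def enso_composites(delta_tt, nino34, threshold=1.0):
    """Average the daily ΔTT annual cycle over El Niño and La Niña years.

    delta_tt : array (n_years, n_days) of daily ΔTT, one row per year
    nino34   : array (n_years,) with one seasonal Niño3.4 value per year
    threshold: assumed cut-off in standard deviations (not from the paper)
    """
    delta_tt = np.asarray(delta_tt, dtype=float)
    nino34 = np.asarray(nino34, dtype=float)
    z = (nino34 - nino34.mean()) / nino34.std()
    el_nino = delta_tt[z > threshold].mean(axis=0)
    la_nina = delta_tt[z < -threshold].mean(axis=0)
    return el_nino, la_nina

# Tiny synthetic example: 5 "years" of a 4-day ΔTT cycle.
demo_tt = np.arange(20, dtype=float).reshape(5, 4)
demo_nino = np.array([2.0, -2.0, 0.0, 0.0, 0.0])
el_nino_cycle, la_nina_cycle = enso_composites(demo_tt, demo_nino)
```

Applying a zero-crossing season finder to each composite cycle would then give the composite season lengths that the panel compares.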

A true ‘monsoon season rainfall’ over NEI is the accumulated rainfall over the region between OD and WD (i.e., over the LRS), and we call it LRS-RF. An LRS-RF time series was constructed for 1920–2009 using reliable daily rainfall over the region from NEI24. Since the climatological OD and WD are close to 15th May and 15th October, we also create a fixed-season rainfall time series between 15th May and 15th October (M15O15). Further, we create fixed-season rainfall time series for May–October (MJJASO), MJJAS, and JJAS. We note that the LRS-RF correlates strongly (R > 0.8) with all seasonal rainfall time series but most strongly (R = 0.92) with M15O15, as expected (Supplementary Table 2).
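
The difference between the variable-window LRS-RF and a fixed-window total such as M15O15 can be sketched as follows. All data here are synthetic random numbers standing in for the NEI24 record, the onset and withdrawal days are randomly perturbed about the climatological dates, and `accumulate` is a hypothetical helper; the resulting correlation is therefore only illustrative and says nothing about the published values.

```python
import numpy as np

def accumulate(daily_rain, start, end):
    """Sum daily rainfall from day index start to end, both inclusive."""
    return float(np.sum(daily_rain[start:end + 1]))

rng = np.random.default_rng(0)
n_years = 30
# Synthetic daily rainfall (mm/day), one row per year; not observations.
rain = rng.gamma(shape=2.0, scale=5.0, size=(n_years, 365))
# Hypothetical yearly onset/withdrawal indices scattered about 12 May / 15 October.
onset = 131 + rng.integers(-5, 6, size=n_years)
withdrawal = 287 + rng.integers(-7, 8, size=n_years)

# LRS-RF: the accumulation window changes from year to year.
lrs_rf = np.array([accumulate(rain[y], onset[y], withdrawal[y])
                   for y in range(n_years)])
# M15O15: fixed window, 15 May (index 134) to 15 October (index 287).
m15o15 = np.array([accumulate(rain[y], 134, 287) for y in range(n_years)])

r = np.corrcoef(lrs_rf, m15o15)[0, 1]
```

Because the two windows overlap on most days, a high correlation between the two totals is expected by construction.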

Potential drivers and predictability of ‘onset,’ ‘withdrawal,’ and LRS over NEI

The dominance of quasi-biennial and quasi-quadrennial modes, which are closely associated with the ENSO26, in the time series of OD, WD, and LRS (Supplementary Figs. 3 and 4) indicates a potential association with the ENSO. To further explore this connection and to identify any other potential drivers of the OD, WD, and LRS over the NEI, simultaneous correlations between OD, WD, and LRS and JJA SST of the same year are calculated (Fig. 5a–c). It is noted that the correlation with equatorial Pacific SST is weak for OD but strong for WD and LRS. The large-scale pattern of correlations indicates that the OD over the NEI is associated with the ‘interdecadal ENSO’27, while the WD and LRS are more strongly linked with the ‘interannual ENSO’. This finding is consistent with the observation that the difference in OD between an El Niño and a La Niña is smaller than that for WD (Fig. 4a, b), as was also noted over All India (AI)21. It is also notable that the tropical Atlantic Ocean SST, or the Atlantic Zonal Mode (AZM), has a significant positive correlation with the LRS (Fig. 5c). The negative correlation of LRS with ENSO SST and the positive correlation with AZM SST are similar to those between All India Summer Monsoon Rainfall and ENSO and AZM. While the teleconnection between ENSO and ISMR has been studied extensively, that between AZM and ISMR is an emerging area28,29,30. One study29 shows that a Matsuno-Gill response to the heat source associated with the AZM SST introduces a moisture convergence anomaly over the Indian monsoon region, thereby influencing the ISMR. The TT anomalies associated with the same teleconnection influence the LRS.

Fig. 5: Potential driver and predictive signals of onset, withdrawal, and LRS of summer monsoon over NEI.

a–c Potential drivers, shown as simultaneous correlations of the detrended anomalous time series of a onset days, b withdrawal days, and c LRS days with maps of detrended anomalous JJA (0) sea surface temperature (SST). The stippled regions indicate correlations significant at the 95% level. d–f Potential predictive signals of LRS, examined through lead-lag correlations of its detrended time series with monthly box-averaged (detrended) SST indices: d interdecadal ENSO index (170°E, 280°E, 10°S, 10°N) minus (130°E, 200°E, 20°N, 50°N), e Atlantic Niño (320°E, 20°E, 10°S, 10°N), and f Niño3.4 (190°E, 240°E, 5°S, 5°N).

To examine any potential predictive signal and the potential influence of the NEI monsoon on the SST itself, correlations of OD, WD, and LRS with SST during D(−1)JF(0) and D(0)JF(+1) are calculated (Supplementary Fig. 5). We note that winter season SST just prior to the northern summer has a weak predictive signal for OD through the ‘interdecadal ENSO.’ However, there is no clear predictive signal for WD or LRS. On the other hand, the correlations between the following winter SST over the equatorial Pacific and WD and LRS are highly significant. Thus, the OD, WD, and LRS of NEI seem to have positive feedback with the ENSO system. How the LRS of NEI influences the initiation or intensification of the ENSO remains unclear at present and would require further investigation. We further explore the lead-lag correlations between NEI LRS and indices of Niño3.4, Atlantic Niño, and decadal Niño for leads and lags of up to 12 months (Fig. 5d–f). Consistent with the spatial maps of correlations (Fig. 5a–c), the correlation with Atlantic Niño is on the borderline of being statistically significant. On the other hand, the simultaneous correlations with decadal Niño and Niño3.4 are highly significant, with potential predictability about a season in advance.
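
A lead-lag correlation of the kind shown in Fig. 5d–f can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: both series are linearly detrended, the reference month is taken as July (zero-based index 6), negative shifts mean the SST index leads, and the function names and synthetic inputs are hypothetical.

```python
import numpy as np

def detrend(x):
    """Remove a least-squares linear trend from a 1-D series."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

def lead_lag_correlation(annual_series, monthly_index, ref_month=6, max_shift=12):
    """Correlate a yearly series with a monthly index at shifts of -12..+12 months.

    annual_series : (n_years,) e.g. LRS for each year
    monthly_index : (n_years, 12) e.g. a box-averaged SST index
    Returns {shift_in_months: r}; shift < 0 means the index leads.
    """
    flat = np.asarray(monthly_index, dtype=float).reshape(-1)
    n_years = len(annual_series)
    y = detrend(annual_series)
    result = {}
    for shift in range(-max_shift, max_shift + 1):
        pos = np.arange(n_years) * 12 + ref_month + shift
        valid = (pos >= 0) & (pos < flat.size)   # drop years falling off either end
        x = detrend(flat[pos[valid]])
        result[shift] = float(np.corrcoef(x, y[valid])[0, 1])
    return result

# Synthetic check: the index in October (ref month + 3) copies the annual series.
rng = np.random.default_rng(1)
lrs_demo = rng.normal(size=40)
index_demo = rng.normal(size=(40, 12))
index_demo[:, 9] = lrs_demo
correlations = lead_lag_correlation(lrs_demo, index_demo)
```

In this synthetic case the correlation peaks at a shift of +3 months; with real data, significant values at negative shifts would be the ones indicating a potential predictive signal.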

Simulated OD, WD, and LRS by CMIP6 models

The LRS-RF is a lifeline for the agricultural economy; an important scientific question is therefore whether, in the coming decades, the LRS-RF will continue to decrease or whether the trend could reverse under global warming. For this purpose, we examined the simulations by the Coupled Model Intercomparison Project Phase 6 (CMIP6) during the historical period (1850–2014) and the projections to the end of the century (2015–2100) corresponding to a high greenhouse gas scenario, namely, Shared Socio-economic Pathways-5 (SSP5–8.5). As the orography of the region (Supplementary Fig. 10) is an important driver of the NEI rainfall, adequate representation of the orography in the models is essential for a reasonable simulation of the mean rainfall over the region and, in turn, the simulation of the observed TT. Generally, low-resolution CMIP6 models tend to underestimate the mean rainfall in this region and the ΔTT. Keeping this bias in mind, we selected 9 models (see Methods) from a large number of CMIP6 models (see Supplementary Table 3) for this purpose.

During the historical period, the ensemble mean simulation of climatological ΔTT over CI and AI closely aligns with observations, resulting in LRS simulations of 128 days and 131 days, respectively (Fig. 2b, c). However, over the NEI, the models underestimate ΔTT, leading to an underestimated LRS of 144 days compared to the observed 154 days (Fig. 2a). Despite this bias, the models successfully reproduce the observed east-west asymmetry in LRS, with a longer rainy season over NEI than CI. Utilizing these models’ projections, we examine future changes in LRS and LRS-RF.

CMIP6 projected changes in LRS under a high greenhouse gas scenario

The projected ensemble mean climatological ΔTT by the models indicates that over the AI, it does not change appreciably from the historical period with a LRS of 132 days as against 131 days simulated during the historical period (Fig. 2c). It appears that the greenhouse gas forcing does not change the climatological rainy season of the Indian monsoon. Over CI, the LRS is expected to increase by three days from 128 to 131 days (Fig. 2b), while over NEI, it is expected to decrease by four days from 144 to 140 days (Fig. 2a). This is consistent with recent findings9 that over the historical period, the LRS has decreased over the NEI and increased over NWI.

The time series of ensemble mean simulated and projected OD, WD, LRS, TISM, LRS-RF, VIMFC, PW, and low-level (850 hPa) wind convergence over NEI during the historical period and under SSP5–8.5 (Fig. 6) indicate that the OD continues to show an increasing trend while the WD has a weak decreasing trend, resulting in a decreasing trend of LRS (Fig. 6c). Notably, the projected TISM decreases at a significantly faster rate (Fig. 6d) than during the historical period. The most notable finding, however, is an increasing trend of LRS-RF under large greenhouse gas forcing over the NEI (Fig. 6e), consistent with the increasing trend of VIMFC (Fig. 6f) and in contrast to that during the historical period. With increased greenhouse gas forcing, an increasing trend of PW (Fig. 6h) and an increasing trend of low-level wind convergence (Fig. 6g) lead to a stronger increasing trend of VIMFC and an increasing trend of LRS-RF. However, what drives an increasing trend of low-level wind convergence against the backdrop of increasing vertical stability of the atmosphere remains a puzzle.

Fig. 6: Simulations of interannual variability and its trends by CMIP6 over NEI.

The box-averaged time series and their linear trends of anomalous a onset day, b withdrawal day, c LRS, d TISM, e accumulated rainfall, f VIMFC, g wind convergence at 850 hPa, and h precipitable water are shown from the ensemble mean of CMIP6 models. The significance of the trends at the 95% level is tested with the Mann–Kendall nonparametric method; a trend with p < 0.05 is significant. Here, a +ve (−ve) anomaly of onset day means a ‘delayed (early)’ monsoon onset, and vice versa for withdrawal day. Historical (1850–2014, blue) and Projection with SSP5–8.5 (2015–2100, red). The computations for d–h are done for the summer monsoon season (i.e., days between ‘Onset’ and ‘Withdrawal’ day for a year, as defined from ΔTT). The offset between the ensemble mean simulated over the historical period and the ensemble mean projected variables in 2010 is adjusted for display.

To compare the trends of OD, WD, and LRS between the NEI and CI or AI, we analyzed historical and projected data from the same 9 CMIP6 models used for NEI (Supplementary Table 3). Selecting models with minimal bias in simulating the TISM based on its strong correlation with LRS-RF (Supplementary Fig. 6b), we examined trends in OD, WD, LRS, TISM, and LRS-RF. Results showed nearly identical trends for CI and AI, with decreasing LRS during the historical period, similar to NEI. However, the projected LRS had a weak, increasing trend over CI (Fig. 7c) while continuing to decrease over NEI (Fig. 6c). The decreasing TISM trends were more pronounced over NEI than CI in both historical and projected periods (Figs. 6d and 7d). Interestingly, CI saw a shift from a weak, increasing trend of LRS-RF during the historical period to a strong increasing trend in projections. Conversely, NEI experienced a significant shift from a decreasing trend of LRS-RF during the historical period to a notable increasing trend in projections.

Fig. 7: Simulations of interannual variability and its trends by CMIP6 over CI and AI.

The box averaged time series and its linear trends of anomalous a, f onset day, b, g withdrawal day, c, h LRS, d, i TISM, and e, j accumulated rainfall over CI (left panel) and AI (right panel) are shown from the ensemble mean of CMIP6 models. The 95% significance level of trends is tested with the Mann–Kendall nonparametric method. The trend with p < 0.05 is significant. Here, the +ve (−ve) anomaly of onset day means ‘delay (early)’ monsoon onset and vice versa for withdrawal day. Historical (1850–2014, blue) and Projection with SSP5–8.5 (2015–2100, red). The ‘TISM’ is computed for the summer monsoon season (i.e., days between ‘Onset’ and ‘Withdrawal’ day for a year, as defined from ΔTT).

The climate models’ substantial bias in simulating historical LRS raises questions about the significance of the decrease (increase) of 4 (5) days over NEI (CI) during projections. Although relatively small in absolute terms, these changes are comparable to the observed interannual variation (9 days, Supplementary Tables 1 and 4), making them significant.

During the historical period, the global correlation patterns between de-trended simulated OD, WD, LRS, and LRS-RF over NEI and JJA (0) SST (Fig. 8, middle panel) show similarities with observations (Fig. 8, left panel) but with significantly weaker amplitude. While OD appears to be influenced by interdecadal ENSO, the WD exhibits an insignificant relationship with Pacific ENSO, unlike in observations (Fig. 5). Consequently, the simulated LRS also displays a weak correlation with interdecadal ENSO. Notably, the correlations between LRS-RF and equatorial Pacific SST are significantly weaker than those between LRS and Equatorial Pacific SST, both in observations and simulations. In projections, correlations with ENSO increase significantly (Fig. 8, right panel) for OD, WD, and LRS but remain weaker compared to observed correlations (Fig. 8, left).

Fig. 8: Potential driver of onset, withdrawal, and LRS of summer monsoon in observation and CMIP6 simulations over NEI.

The simulated (Historical, middle panel; Projection with SSP5–8.5, right panel) potential drivers are compared with observation (left panel) through simultaneous correlations of the detrended anomalous time series of a–c onset days, d–f withdrawal days, g–i LRS days, and j–l LRS-RF with maps of detrended anomalous JJA (0) sea surface temperature (SST). The stippled regions indicate correlations significant at the 95% level.

Discussion

In the absence of an acceptable objective definition of OD, WD, and LRS over the NEI, JJAS has been assumed to be the summer monsoon season over the NEI too, as over CI. Here, based on the annual cycle of the rainfall over the NEI, we argue that the LRS over NEI is at least one month longer than that over the CI. We also argue that OD, WD, or LRS could be reliably defined only over three large regions, namely, All India21, Central and Western India, and NEI. We show that climatologically, the LRS over the NEI is ~155 days while that over the CI is ~122 days (Supplementary Fig. 2a). We believe that our estimate of LRS over NEI, constrained by annual cycles of both rainfall and low-level winds, is reliable and qualitatively consistent with Mishra et al.6, and that it firmly establishes that the Indian summer monsoon season over the NEI is from 15th May to 15th October (M15O15). The objectively defined ‘onsets’ based on ΔTT are ‘large-scale’ onsets and cannot be used to define a local onset at the scale of a small grid box like that defined by IMD6,8 based primarily on daily rainfall. Once the large-scale onsets take place over AI, CI, and NEI, the ‘local onset’ could be defined based on daily persistent rainfall or the anomaly of cumulative rainfall6.

The onset over NEI is intimately linked with the monsoon over the BoB. It may be noted that no organized precipitation takes place in the north BoB until the middle of June, and until then, the heat source over the BoB box is dominated by the NEI rainfall along the Myanmar coast. While the onset of monsoon in the NEI would not happen on 12th May without moisture influx from the BoB, there is not enough low-level convergence over the north BoB at that time, and cyclogenesis over the region is still awaited. Thus, the large-scale onset over the BoB is simultaneous with that of the NEI, but the local onset over the north BoB is later and could be estimated from the cumulative daily rainfall anomaly6. The ‘onset’ over the NEI in early May is driven by an increase in the cyclonic vorticity at 850 hPa over the region by a factor of three and maintained by an increase of northward moisture flux transport by a factor of four at the time of ‘onset’31. We have shown32 that a westward propagating quasi-biweekly oscillation passing through the region causes low-level cyclonic vorticity over the NEI, thereby enhancing the ambient weak cyclonic vorticity to ~3 × 10⁻⁶ s⁻¹, and could initiate some of the ‘onsets’ of monsoon over the NEI. The enhancement of low-level vorticity in other ‘onsets’ could be driven by potential vorticity intrusion at the upper level associated with extra-tropical weather disturbances33. Thus, the NEI monsoon and the BoB monsoon are intimately linked to each other, including their ‘onsets.’

Our findings that OD over the NEI is strongly correlated with inter-decadal ENSO while WD is strongly correlated with inter-annual ENSO lead to a strong correlation between LRS and ENSO in general. However, the correlation between ‘LRS-RF’ (rainfall accumulated for LRS days) and simultaneous SST is weaker than that with ‘LRS.’ This indicates that while the ENSO does control the LRS, the potential predictability of LRS-RF is weaker, possibly due to ‘internal climate noise’ arising from the high-frequency convective contribution to LRS-RF. It is also noted that while the simultaneous correlation is statistically significant, even at a lead of 3 months the correlation between potential drivers and LRS-RF tends to be insignificant (Supplementary Fig. 8). An investigation of temporal variations of correlations between OD, WD, LRS, and JJA Niño3.4 SST (Supplementary Fig. 9) indicates that ENSO-LRS correlations remain significant from about 1930 to the present (Supplementary Fig. 9c) due to a decreasing trend of ENSO-OD positive correlations (Supplementary Fig. 9a) and an increasing trend of ENSO-WD negative correlations. Our findings point to the challenges in predicting the ‘seasonal rainfall’ over the NEI. One silver lining is that the inter-annual variability of LRS-RF is strongly correlated with rainfall over a fixed season between 15 May and 15 October (M15O15, R = 0.92). While for scientific studies of variability and predictability we could use the LRS-RF, for other practical purposes we can safely use M15O15 as the rainy season over the NEI.

While an ensemble of nine CMIP6 models simulate the ΔTT and thereby simulate LRS over CI and AI close to observed, they underestimate the ΔTT leading to an underestimation of LRS over NEI by about 8 days. As the rainfall over the NEI depends largely on the orography of the region, intrinsically low-resolution CMIP6 climate models underestimate rainfall over the region (Supplementary Fig. 10 and Supplementary Fig. 6). As rainfall largely contributes to the TT, the models underestimate the continental TT to the north leading to a weaker than observed ΔTT.

Ensemble-mean predictions under high greenhouse gas forcing indicate that the climatological OD, WD, and LRS are not expected to change appreciably over CI and AI, but the LRS over the NEI is likely to shorten by about 5 days by the end of the century (Fig. 2). During the historical period over NEI, a significant decreasing trend of LRS is driven primarily by an increasing trend of OD over the NEI and is associated with a decreasing trend of LRS-RF (Fig. 3). In the coming decades, however, a persisting decreasing trend of LRS over the NEI is expected to be associated with a significantly increasing trend of LRS-RF (Fig. 6). In contrast, during the historical period over CI, an insignificant decreasing trend of LRS is associated with an insignificant increasing trend of LRS-RF (Fig. 7). In the coming decades, however, an insignificant increasing trend of LRS over the CI (Fig. 7c) is expected to be associated with a significant increasing trend of LRS-RF (Fig. 7e).

It is notable that the TISM (thermodynamic index) has a decreasing trend over NEI during the historical period turning to a stronger decreasing trend in projections (Figs. 3 and 6), while it has a milder decreasing trend during both periods over CI. We analyze LRS, LRS-RF, and TISM trends historically and in projections. LRS shows a strong positive correlation with TISM across all regions on an inter-annual timescale (Supplementary Table 5).

After onset, the precipitation is the primary contributor to the TT, and hence it is expected that the TISM is strongly correlated with the LRS-RF21 on an inter-annual time scale. Does this hold for the trends of TISM and LRS-RF too? Over the NEI, the trends of TISM and LRS-RF are similar in the historical period but opposite in the projections. To understand this apparent contradiction, and what controls the trend of TISM, we examined the time series of TT over the North box (green, 89°E–100°E & 5°N–35°N, Fig. 9a), TT over the South box (blue, 40°E–100°E & 15°S–35°N, Fig. 9a), and the difference between the two (Fig. 9a, red), together with equatorial SST over the Indian Ocean (IO) averaged over 10°S–10°N and 40°E–100°E (Fig. 9b, red), for both the historical and projected simulation periods. The much faster increasing trend of the TT over the South box (Fig. 9a, blue) is a result of the much faster increasing trend of equatorial SST and associated deep convective activity over the IO, and it results in the faster decreasing trend of TISM (Fig. 9b, blue). In projections, equatorial Indian Ocean SST increases at 0.4 °C/decade, more than ten times faster than in the historical period (0.03 °C/decade), which explains why the TISM has a much larger decreasing trend in projections. Despite the high equatorial SST weakening the TISM, the abundant moisture content over NEI and in the north Indian Ocean atmosphere results in a significant increase in VIMFC and leads to increased rainfall.

Fig. 9: Strengthening of Indian Ocean (IO) SST over ΔTT from CMIP6.

Interannual variability and trends of box-averaged TT, ΔTT, TISM, and IO SST during the historical period (1850–2014) and in projections under SSP5–8.5 (2015–2100) are shown from the ensemble mean of CMIP6 models. a TTNorth (89°E–100°E & 5°N–35°N, green, left y-axis), TTSouth (40°E–100°E & 15°S–5°N, blue, left y-axis) and ΔTT (i.e., TTNorth–TTSouth, red, right y-axis) and b TISM (blue, left y-axis) and IO SST (red, right y-axis).

With increasing GHG forcing (e.g., SSP5–8.5), extreme daily rainfall events over NEI are expected to become more intense and more frequent, and the shortening trend of LRS combined with the increasing trend of LRS-RF implies a significantly higher risk of hydrological disasters such as floods and landslides. While the season length remains predictable owing to its strong connection with ENSO, seasonal rainfall (LRS-RF) becomes poorly predictable, especially under SSP5–8.5, as its association with ENSO weakens. This could be because rainfall contains ‘climate noise’ from high-frequency fluctuations, whereas LRS, being based on ΔTT, filters out such noise, as suggested by a previous study21.

Methods

Observed data

Daily pressure-level air temperature at 1° × 1° spatial resolution is taken from the most recent version (20CR V3) of the NOAA-CIRES-DOE Twentieth Century Reanalysis34 for the years 1901–2015. The Tropospheric Temperature (TT, derived from air temperature vertically averaged between 600 hPa and 200 hPa) is used to create indices such as ‘onset,’ ‘withdrawal,’ and ‘length of rainy season’ (LRS). The ERA5 dataset (1978–2020) from the European Centre for Medium-Range Weather Forecasts (ECMWF)35, at 0.25° × 0.25° spatial resolution, is also used to compare the indices. Associations with several indicators are examined using the Centennial in situ Observation-Based Estimate of SSTs (COBE SST2)36 (1901–2015). The CI daily rainfall time series (1901–2015) is created from the IMD37 data set as a box average over 72°E–85°E and 18°N–25°N. Similarly, an area-averaged time series for NEI is created by averaging daily precipitation over 24 stations in the region for 1920–2009 (hereinafter NEI24)32,38. Daily precipitation data from the Tropical Rainfall Measuring Mission (TRMM-3B42, 1998–2019)39 are also used to compare the annual cycle over CI and NEI. Shuttle Radar Topography Mission data are used for topography40.

CMIP6 models selection criteria

In addition to observed and reanalysis data, simulations from nine high-resolution models of the Coupled Model Intercomparison Project phase 6 (CMIP6) are used, both for the Historical period (1850–2014) and for projections under the Shared Socioeconomic Pathway 5 (SSP5–8.5) scenario (2015–2100), using the first realization (r1i1p1f1). As the annual cycle of the north-south ΔTT is the key parameter that determines the OD, WD, LRS, and strength of the monsoon (TISM), it is critical that the annual cycle of the simulated daily climatological ΔTT be close to the observed one. In Supplementary Figs. 6b, 7, we note that the annual cycle of simulated ΔTT over NEI falls into two clusters: one that is close to observations, and another in which the positive ΔTT during summer is too weak (which implies a weak TISM and hence a weaker summer monsoon over NEI). Therefore, we use the normalized TISM (i.e., simulated TISM divided by observed TISM) as the criterion for selecting the models. The scatter plot of the normalized TISM as a function of the models (Supplementary Fig. 6b) makes the point clear, and this same set of 9 models simulates the annual cycle of precipitation closer to observation (Supplementary Fig. 6a). The details of the models are listed in Supplementary Table 3.
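The selection criterion described above can be sketched as follows. This is only an illustration of the idea of screening models by normalized TISM: the cut-off `min_ratio`, the model labels, and all numerical values are hypothetical, since the text does not state a numerical threshold and identifies the retained cluster from the scatter plot.

```python
def select_models(simulated_tism, observed_tism, min_ratio=0.5):
    """Normalize each model's TISM by the observed TISM and keep models whose
    ratio is at least min_ratio (a hypothetical cut-off, not from the paper)."""
    if observed_tism <= 0:
        raise ValueError("observed TISM must be positive")
    normalized = {name: value / observed_tism
                  for name, value in simulated_tism.items()}
    selected = sorted(name for name, r in normalized.items() if r >= min_ratio)
    return normalized, selected


# Hypothetical TISM values in arbitrary units, for illustration only.
simulated = {"model_A": 95.0, "model_B": 40.0, "model_C": 110.0, "model_D": 30.0}
observed = 100.0

normalized, selected = select_models(simulated, observed)
print(selected)
```

In this toy case the two models with a weak summer ΔTT (ratios of 0.4 and 0.3) fall into the rejected cluster, mirroring the two-cluster behaviour described in the text.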

An objective definition of ‘onset,’ ‘withdrawal,’ and LRS over NEI

The objective definitions of ‘onset’ and ‘withdrawal,’ and hence of the LRS, for the NEI are similar to those defined for the All India (AI) summer monsoon by Eq. (2)21. Here, we briefly summarize their physical basis. The Indian summer monsoon is a convectively coupled system in which the seasonal reversal of the winds is intimately linked with the seasonal reversal of the heat source associated with rainfall41. Building on this convectively coupled nature, an objective definition of ‘Monsoon Onset over Kerala’ (MoK) is based on the North–South (N–S) Tropospheric Temperature gradient (ΔTT) over the region changing sign from negative to positive21,25. It therefore also makes sense to define the ‘Withdrawal’ of the monsoon from the continent as the time when the N–S gradient of the TT changes sign from positive to negative, indicating that the rain band (or heat source) has moved back to 5°S and ensuring that winds over the north Arabian Sea turn north-easterly.

We argue that the same objective criterion can be used to define ‘Onset’ and ‘Withdrawal’ over the NEI. The Indian summer monsoon heat source, or TT vertically averaged between 600 hPa and 200 hPa (Eq. 1), over the Northern box (40°E–100°E, 5°N–35°N) can be divided into two subparts, viz., a Northeastern region over the NEI (89°E–100°E, 5°N–35°N) and a Northwestern region over the CI (40°E–89°E, 5°N–35°N) (Fig. 2 inset). These two regions represent two modes of the dominant pattern of inter-annual variability of the Indian monsoon rainfall42,43. With about three times larger climatological seasonal rainfall (Fig. 1b) than the core monsoon region of CI, the NEI represents a stronger heat source, but over a relatively smaller region. The Matsuno–Gill response to this heat source41 would produce south-westerlies over the north BoB. Thus, the change of sign of the N–S gradient of TT (Eqs. 2–4) over the region (e.g., Northeastern box minus the Southern box for the NEI) from negative to positive may be used to define the ‘Onset Date (OD)’ of the summer monsoon, while its change from positive to negative may be used to define the ‘Withdrawal Date (WD)’ of the summer monsoon over AI, CI, and NEI. The climatological ΔTT for the NEI based on NCEP-V3 reanalysis35 data from 1976 to 2015 (Fig. 2a, green curve) indicates a climatological ‘Onset’ over the region by 14th May and ‘withdrawal’ by 14th October, making the LRS over the region 154 days. More details are available in the Supplementary Online Text.
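The sign-change criterion can be sketched as a short routine operating on one year of daily ΔTT values. The function name `onset_withdrawal`, the day-index convention (index 0 = 1 January), and the way LRS is counted (withdrawal index minus onset index) are assumptions made for this illustration; the synthetic series is not reanalysis data, and real daily ΔTT may need smoothing before the first sign change is taken.

```python
def onset_withdrawal(delta_tt):
    """Find onset and withdrawal in one year of daily delta-TT values.

    Onset is the first day on which delta-TT turns from non-positive to positive;
    withdrawal is the first later day on which it turns back to non-positive.
    Returns (onset_index, withdrawal_index, lrs_days), or None if either
    transition is absent.
    """
    onset = None
    for day in range(1, len(delta_tt)):
        if delta_tt[day - 1] <= 0 < delta_tt[day]:
            onset = day
            break
    if onset is None:
        return None
    for day in range(onset + 1, len(delta_tt)):
        if delta_tt[day - 1] > 0 >= delta_tt[day]:
            return onset, day, day - onset
    return None


# Synthetic annual cycle (hypothetical): negative, then positive for 150 days,
# then negative again. Index 0 corresponds to 1 January.
series = [-1.0] * 130 + [1.0] * 150 + [-1.0] * 85

result = onset_withdrawal(series)
print(result)
```

Counting LRS as the number of days with positive ΔTT is one possible convention; an inclusive count of both the onset and withdrawal dates would give one more day.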

To gauge the uncertainty in estimating LRS from observations, we also calculate the climatological daily ΔTT from another state-of-the-art reanalysis, namely ERA5. The climatological daily ΔTT from ERA5 over all three regions (NEI, AI, and CI) is compared with that from NCEPv.3 in Supplementary Fig. 2, and the statistics of OD, WD, and LRS are summarized in Supplementary Table 1. Although the LRS is shorter in ERA5 (147 days) than in NCEPv.3 (157 days), the standard deviation of its interannual variability is 9 days in both reanalyses. Thus, there is some uncertainty in estimating the climatological mean LRS, but the estimate of the interannual variability of LRS is robust across the two reanalyses.

In brief, the TT and ΔTT are computed as follows:

$${\rm{{TT}_{air}(lat,lon,time)}}=\mathop{\sum }\limits_{\rm{lev}=600}^{\rm{lev}=200}{\rm{{T}_{air}(lat,lon,time,lev)}}$$
(1)

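where TTair is the tropospheric air temperature and Tair is the air temperature at pressure level lev (in hPa). Consistent with the definition of TT as a vertical average and with the box averages used throughout, the sums in Eqs. (1)–(4) are to be read as averages over the indicated levels and grid points.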

$$\Delta{\rm{{TT}}}_{\rm{AI}}({\rm{time}})=\mathop{\sum}\limits_{{\rm{lon}}=40^\circ {\rm{E}}}^{{\rm{lon}}=100^\circ {\rm{E}}}\mathop{\sum}\limits_{{\rm{lat}}=5^\circ {\rm{N}}}^{{\rm{lat}}=35^\circ {\rm{N}}}{\rm{TT}}({\rm{lat}},{\rm{lon}},{\rm{time}})-\mathop{\sum}\limits_{{\rm{lon}}=40^\circ {\rm{E}}}^{{\rm{lon}}=100^\circ {\rm{E}}}\mathop{\sum}\limits_{{\rm{lat}}=15^\circ {\rm{S}}}^{{\rm{lat}}=5^\circ {\rm{N}}}{\rm{TT}}({\rm{lat}},{\rm{lon}},{\rm{time}})$$
(2)
$$\Delta {\rm{TT}}_{\rm{CI}}({\rm{time}})=\mathop{\sum}\limits_{{\rm{lon}}=40^\circ {\rm{E}}}^{{\rm{lon}}=89^\circ {\rm{E}}}\mathop{\sum}\limits_{{\rm{lat}}=5^\circ {\rm{N}}}^{{\rm{lat}}=35^\circ {\rm{N}}}{\rm{TT}}({\rm{lat}},{\rm{lon}},{\rm{time}})-\mathop{\sum}\limits_{{\rm{lon}}=40^\circ {\rm{E}}}^{{\rm{lon}}=100^\circ {\rm{E}}}\mathop{\sum}\limits_{{\rm{lat}}=15^\circ {\rm{S}}}^{{\rm{lat}}=5^\circ {\rm{N}}}{\rm{TT}}({\rm{lat}},{\rm{lon}},{\rm{time}})$$
(3)
$$\Delta {\rm{TT}}_{\rm{NEI}}({\rm{time}})=\mathop{\sum}\limits_{{\rm{lon}}=89^\circ {\rm{E}}}^{{\rm{lon}}=100^\circ {\rm{E}}}\mathop{\sum }\limits_{{\rm{lat}}=5^\circ {\rm{N}}}^{{\rm{lat}}=35^\circ {\rm{N}}}{\rm{TT}}({\rm{lat}},{\rm{lon}},{\rm{time}})-\mathop{\sum}\limits_{{\rm{lon}}=40^\circ {\rm{E}}}^{{\rm{lon}}=100^\circ {\rm{E}}}\mathop{\sum}\limits_{{\rm{lat}}=15^\circ {\rm{S}}}^{{\rm{lat}}=5^\circ {\rm{N}}}{\rm{TT}}({\rm{lat}},{\rm{lon}},{\rm{time}})$$
(4)
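A minimal sketch of how Eqs. (1)–(4) might be evaluated on gridded data is given below. It treats the sums as simple unweighted averages, in line with the description of TT as a vertical average; the helper names (`vertical_mean`, `box_mean`, `delta_tt`), the inclusive box boundaries, and the synthetic grid are assumptions for illustration, and an operational calculation may instead use pressure weighting in the vertical and latitude (area) weighting in the horizontal.

```python
def vertical_mean(profile, levels, top=200, bottom=600):
    """Eq. (1) as an unweighted mean of temperatures on levels from bottom to top (hPa)."""
    picked = [t for t, p in zip(profile, levels) if top <= p <= bottom]
    if not picked:
        raise ValueError("no levels between top and bottom")
    return sum(picked) / len(picked)


def box_mean(tt, lats, lons, lat_range, lon_range):
    """Unweighted mean of tt[i][j] over grid points inside the box (bounds inclusive)."""
    vals = [tt[i][j]
            for i, lat in enumerate(lats)
            for j, lon in enumerate(lons)
            if lat_range[0] <= lat <= lat_range[1]
            and lon_range[0] <= lon <= lon_range[1]]
    if not vals:
        raise ValueError("box contains no grid points")
    return sum(vals) / len(vals)


# Longitude extent of the northern box for each region, following Eqs. (2)-(4).
NORTH_LON = {"AI": (40, 100), "CI": (40, 89), "NEI": (89, 100)}


def delta_tt(tt, lats, lons, region):
    """North-box mean TT (5N-35N) minus south-box mean TT (15S-5N, 40E-100E)."""
    north = box_mean(tt, lats, lons, (5, 35), NORTH_LON[region])
    south = box_mean(tt, lats, lons, (-15, 5), (40, 100))
    return north - south


# One hypothetical temperature profile (K) to illustrate Eq. (1).
levels = [1000, 850, 600, 500, 400, 300, 200, 100]
profile = [300.0, 290.0, 270.0, 260.0, 250.0, 240.0, 230.0, 210.0]
tt_point = vertical_mean(profile, levels)

# Synthetic TT field (K) for a single day: warm north-east, cooler north-west.
lats = [-10, 0, 10, 20, 30]
lons = [45, 60, 75, 92, 98]
tt = [[(245.0 if lon >= 89 else 243.0) if lat >= 5 else 244.0 for lon in lons]
      for lat in lats]

d_nei = delta_tt(tt, lats, lons, "NEI")
d_ci = delta_tt(tt, lats, lons, "CI")
d_ai = delta_tt(tt, lats, lons, "AI")
print(tt_point, d_nei, d_ci, round(d_ai, 2))
```

In this synthetic field ΔTT is positive for the NEI but negative for the CI and AI on the same day, which illustrates how the regional boxes can yield different onset dates from the same TT field.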