## Introduction

El Niño, the warm phase of the most dominant interannual climate variability, has changed its behavior since the 1980s1,2,3,4. Extreme El Niño events have occurred frequently in the past 40 years, at a rate of one event per 13 years, usually causing reorganizations of atmospheric convection and inducing severe climatic disruptions around the globe, such as disastrous floods and droughts across the Pacific region5,6,7. At the same time, the Central Pacific (CP) El Niño8,9,10 (also termed the Dateline El Niño11, El Niño Modoki12, or warm pool El Niño13), which is characterized by the peak ocean warming in the central equatorial Pacific, has become more common13,14,15,16 and the canonical Eastern Pacific (EP) El Niño, which is characterized by the strongest sea surface temperature anomalies (SSTA) located in the far eastern equatorial Pacific, has become less frequent8,9,10,12. The movement of the sea surface temperature (SST) anomaly center shifts atmospheric circulation and convection, causing different climate impacts around the globe9,11,14,17.

Several studies using climate models have projected that El Niño amplitude increases7 and CP El Niño variability increases14 under greenhouse warming and have suspected that the observed El Niño behavior changes could be a consequence of anthropogenic warming. With the buildup of greenhouse gases, the eastern equatorial Pacific warmed faster than the surrounding regions, and the thermocline in the equatorial Pacific flattened, facilitating more occurrences of extreme El Niño and CP El Niño events7,14. The enhanced response of atmospheric vapor to air temperature under greenhouse warming also increases the extreme El Niño18. Multiple palaeo-El Niño proxies also provide evidence for the impact of the climate change on the observed change in El Niño properties. These include enhanced El Niño variability during the late twentieth century relative to the preindustrial period15,19,20,21 and more frequent CP events in recent decades relative to the preindustrial era15,22. In addition to anthropogenic forcing, natural variability also plays a role in the change in El Niño properties23,24. Preindustrial model simulations show that El Niño characteristics exhibit strong interdecadal and intercentennial modulation in the absence of external forcing23. Palaeo-El Niño reconstructions also show a wide range of El Niño variance over the past 7000 year21 and find the El Niño variabilities in both the early 1900s and recent decades are relatively higher than preindustrial levels25, highlighting the role of internal variability. Observations and preindustrial-control model simulations show that Atlantic Multidecadal Oscillation (AMO), a long-lived basin-wide warming or cooling in the Northern Atlantic that generally persists for 60–80 years, could modulate El Niño amplitude26,27,28 and El Niño type29,30.

It is still unclear whether the frequent occurrence of extreme El Niño events and CP El Niño events in the last 40 years is part of natural variability23,24 or a consequence of global warming1,7,14. In order to investigate this, we can analyze past El Niño records to see whether there are other periods with frequent extreme El Niño and CP El Niño occurrences like the past 40 years, but without strong anthropogenic forcing. If this is the case, we can deduce that factors other than anthropogenic activity must play a role in the decadal transition of El Niño types. In this study, we examine multiple long-term instrumental SST datasets and apply cluster analysis to classify El Niño events. We show that there exists another extreme El Niño and CP El Niño epoch, around year 1900, with similar spatial and temporal evolution, dynamic processes, and climate impacts as those that occurred in the last 40 years. The results suggest a role of internal variability. Moreover, we found that both the two periods with increased extreme El Niño and CP El Niño events coincide with the positive phase of the AMO. Then, we investigate the influence of internal variability associated with the AMO on El Niño multidecadal modulation using outputs from the fifth and sixth phases of the Coupled Model Intercomparison Project (CMIP5 and CMIP6) and palaeoclimate proxies. Furthermore, we quantify the contribution of anthropogenic forcing and internal variability to the recently observed El Niño diversity based on a statistical model. Our results highlight that both internal variability associated with the AMO and anthropogenic forcing contribute to the changes in El Niño properties in recent decades.

## Results

### A period beyond the last four decades with frequent extreme and CP events

We applied a cluster analysis to the evolution from the onset to the development of 38 El Niño events for the period 1871 to 2017 using a “merged” SST dataset from HadISST131, Extended Reconstructed SST version 5 (ERSSTv5)32, and Kaplan Extended SST33 (see Methods). Based on the cluster analysis, El Niño events were classified into 4 physically meaningful clusters: (1) extreme, (2) EP, (3) CP, and (4) successive El Niño events (Fig. 1 and Supplementary Fig. 1). The four types of El Niño in this study are mutually exclusive. This categorization distinguishes extreme from moderate events, thus the “EP” in this study refers to “moderate EP”. The four types of identified El Niño events are consistent with ref. 34 during the common period (1901–2017). We focus on the extreme, EP, and CP types of El Niño events in this study. It is intriguing that almost all extreme and CP events occurred in the two periods. One is the period from the 1980s to the present, in which ~8 CP and 3 extreme events exist, and the other is the period 1870s to the 1900s, among which 3 extreme and 3 CP events consecutively occur (Fig. 2a). In contrast, EP events mainly occurred in the 1900s–1930s and 1950s–1980s. The two increased extreme and CP event periods are also detectable from the HadISST1, ERSSTv5, and Kaplan SST datasets (Supplementary Fig. 2a–c). In the three datasets, the numbers of extreme El Niño events are 3, 2, and 2 during the period from the 1870s to the 1900s and 3, 3, and 3 during the period from the 1980s to the present, respectively. For CP El Niño, the number are 3, 3, and 3 during the period from 1870s to the 1900s and 7, 8, and 7 during the period from the 1980s to the present. The slight difference should arise from data uncertainty.

The extreme and CP El Niño events that occurred in the 1870s to 1900s were similar to those in the 1980s to present in both their spatiotemporal evolution of SSTA and dynamic processes. We compare the spatiotemporal evolution of extreme El Niño and CP El Niño events for the two periods using the “merged” SST dataset. The evolution of the extreme El Niño events in the two periods is similar: the warm SST anomaly starts to develop in the western Pacific (WP) during the preceding boreal winter and then propagates eastward with a rapid basin-wide development in the boreal spring. Later, a warm anomaly occurs in the far eastern Pacific during boreal spring and then propagates westward. The maximum intensity occurs around 120°W in December (Fig. 1a, b; Supplementary Fig. 3a). Strong westerly anomalies develop in the western Pacific in the preceding boreal winter and spring (Supplementary Fig. 4a, b). Coupled with pronounced westerly, convective, and warm SST anomalies near the dateline (180°E) (Supplementary Fig. 4e, f), these anomalies are conducive to initiating an eastward propagation of SST anomalies35,36,37. Moreover, we conducted an ocean mixed-layer heat budget analysis of the mixed-layer ocean temperature averaged over the central-eastern Pacific (5°S–5°N, 180°−80°W) region during the El Niño development phase. It shows that during the onset phase, the zonal advective feedbacks are dominant for the extreme events in these two periods (Fig. 1c), and this result is robust, with a total of 5 of 6 extreme events in these two periods (83.3%) supporting it (Supplementary Table 1).

The CP El Niño events in the two periods also show a similar evolution and feature initial warming in the western Pacific and eastward propagation from the western Pacific from early boreal summer to boreal winter (Fig. 1d, e, Supplementary Fig. 3c). The onset occurs around July with warm SST in the western Pacific and anomalous westerlies and convection in the west of the warm SST (Supplementary Fig. 4g, h). The zonal advective feedback is dominant in the CP El Niño events for both periods during the onset phase (Fig. 1f; 10 of 11 CP El Niño events in these two periods, or 90.9%, are dominated by zonal advective feedback13; Supplementary Table 1). The spatiotemporal evolution of EP El Niño events is different from those of extreme and CP El Niño events (Fig. 1g). The initial warm anomalies originate from the EP and then propagate westward. The thermocline feedback is dominant for EP events during the onset phase (Fig. 1h).

To further confirm the similarity of El Niño events that occurred in the 1870s to 1900s and in the 1980s to the present, we compare the climate impacts of El Niño events in the two periods. The composite of boreal winter air temperature anomaly patterns during extreme El Niño events in the period of 1875–1905 closely resembles those in the post-1980s (Supplementary Fig. 5a, b), with a high spatial correlation coefficient of 0.68 (p < 0.001). The boreal autumn air temperature anomaly patterns during CP El Niño events that occurred from 1875 to 1905 are also similar to those that occurred in the post-1980s (Supplementary Fig. 5d, e). Meanwhile, the extreme and CP El Niño events exert different impacts on global temperature to EP El Niño events (Supplementary Fig. 5c, f). Note that we compare boreal autumn rather than boreal winter because the temperature anomaly patterns show more prominent differences between CP and EP El Niño events in boreal autumn than in boreal winter38.

The above results imply that the period of 1875 to 1905 is another extreme and CP El Niño event epoch. Here we use a 2-way table to test the statistical significance (see Methods). The result indicates that the regime shifts in the 1900s and 1980s are both significant at the 99% confidence level (Supplementary Table 2). It suggests that factors other than anthropogenic forcing must play a role in the decadal transition of El Niño types. This result is consistent with a model study showing that variations in El Niño behavior can occur on multidecadal and intercentennial timescales even with fixed climate forcing23.

To examine the sensitivity of the extreme and CP El Niño event epoch of 1875- 1905 to the traditional classification method, we categorized 38 El Niño into two broad types (EP and CP) based on the Niño method13,14,39 (see Supplementary Discussion), and then we further divide the EP type El Niño events into moderate EP and extreme El Niño events by the criterion of DJF-averaged value of the normalized Niño3 index below or above a threshold value of 1.75 standard deviations (s.d.) (see Supplementary Fig. 2e and Supplementary Discussion). Based on the traditional classification method, we identified a total of 16 CP events and 8 extreme events, with 7 CP and 3 extreme events existing in the period from the 1980s to the present and 3 extreme and 3 CP events consecutively occurring in the period 1870s to the 1900s. Thus, the period of 1875–1905, like the recent decades, characterized by increased extreme El Niño and CP El Niño events are still detectable when employing the traditional classification method.

### Decadal variations in El Niño types modulated by the AMO

Figure 3a shows the correlation of the SST field with the zonal SST anomaly gradient in the merged SST dataset. Here, the signal induced by external forcing, including the change in greenhouse gases (GHGs), natural forcing (NAT), and anthropogenic aerosols (AA), has been removed in the SST field and the zonal SST anomaly gradient (Fig. 3a, see Estimates of the forced and internal components in Method). The spatial correlation field shows significant positive values over the western Pacific, suggesting that the equatorial zonal SST gradient is dominated by the western Pacific. Meanwhile, significant positive values are also found over the North Atlantic Ocean and the pattern resembles the AMO (Fig. 3a), suggesting that the zonal SST anomaly gradient is related to the AMO. This result is consistent with previous studies showing that the AMO could remotely affect the mean state of the Pacific43,44,45,46. Thus, the AMO seems to be a key candidate for leading to the multidecadal change in El Niño types.

We further analyzed the relationship between the AMO and El Niño diversity based on outputs from the fifth and sixth phases of the Coupled Model Intercomparison Project (CMIP5 and CMIP6). We use outputs from 20 CMIP5 and 23 CMIP6 pre-industrial control simulations over the last 300 years. The CMIP5 and CMIP6 models could reproduce the observed AMO characteristics, with similar spatial patterns in the North Atlantic Ocean (Supplementary Fig. 6). Due to the model’s bias in simulating the evolution of El Niño, we use the El Niño onset phase (April to August) averaged western Pacific SSTA > 0 °C to roughly distinguish the extreme/CP from EP events. A total of 30 of 43 CMIP5 and CMIP6 models (70%) simulated an increased western Pacific SST for the AMO-positive state minus the AMO-negative state (Fig. 3b). The increased western Pacific SST enhances the zonal SST gradients, which is conducive to the development of El Niño in the Niño4 region13,40. Models that generate a larger increase in the western Pacific SST/zonal SST gradient for the AMO-positive state minus the AMO-negative state tend to simulate a larger increase in the frequency of extreme/CP El Niño events for the AMO-positive state minus the AMO-negative state (r = 0.66, P < 0.001). A total of 29 of 43 CMIP5 and CMIP6 models (67.4%) produce an increased occurrence of CP and extreme El Niño for the AMO-positive state minus the AMO-negative state.

We also use multicentury palaeoclimate reconstructions to examine the relationship between the AMO and the decadal modulation of the different types of El Niño events prior to the instrumental record. The palaeoclimate reconstructions include a 700-year (1300–2006) El Niño Niño3.4 index reconstruction20, a 1200-year (800–2008) AMO index reconstruction47, and a 400-year (1617–2008) record of CP El Niño events reconstructed from ENSO-sensitive proxy records15. To focus on the decadal modulation of different types of El Niño events, we count the occurrences of extreme El Niño events (defined by a Niño-3.4 index >1.2 s.d.) and CP Niño events (identified following the pioneering work of ref. 15) over a 21-year sliding period. The 21-year sliding frequencies of extreme El Niño events and CP Niño events increased in the positive phase of the AMO (Fig. 3c) and they are statistically significant above the 95% confidence level according to a bootstrap test (see Methods).

How does AMO impact El Niño diversity? We examined the process using the 5 CMIP6 and 3 CMIP5 models, in which the western Pacific SST responses to the AMO are comparable to the observations (Supplementary Fig. 7a). The simulated spatial pattern of the Pacific response to the AMO among these models exhibits a common feature: annual mean anticyclonic flows over the Northwest Pacific (NWP), significant warm SSTA over the western Pacific and the subtropical North Pacific (SNP), and strong northward flows from the tropics towards the SNP (Supplementary Fig. 7). We find that these features of the Pacific response to AMO derived from CMIP5 and CMIP6 are consistent with those derived from the observations (Supplementary Fig. 8) and a suite of Atlantic Pacemaker experiments45. Such a spatial pattern of the Pacific response implies that the AMO could induce anomalous high pressure over SNP (Supplementary Fig. 8b), thereby causing SNP warming via wind–evaporation–SST feedback, and SNP warming could further develop warm SSTs in the western Pacific through SST–sea level pressure–cloud–longwave radiation positive feedback45. The zonal SSTA gradient over the central Pacific increases as the western Pacific SSTA increases (Supplementary Fig. 9b), which in turn leads to enhanced zonal advection feedback in the development of El Niño (Supplementary Fig. 9c). Thus, a positive AMO corresponds to a warm SSTA in the western Pacific, a large zonal SSTA gradient over the central Pacific, and an enhanced CP and extreme El Niño, vice versa. These results imply that the AMO could modulate the frequency of the three types of El Niño events, with more frequent extreme and CP events in the positive phase of the AMO.

### Contribution of external forcing and internal variability

The zonal SST gradient is divided into the externally forced component and internal variability-related component in Fig. 4a (see Estimates of the forced and internal components in Method). The externally forced (EX) zonal SST gradient shows an upward trend (Fig. 4a). In response to external forcing, the eastern Pacific warms less than the western Pacific due to upwelling and the shallow thermocline, strengthening the zonal SST gradient via Bjerknes feedback48,49. The contribution of EX to the multidecadal equatorial zonal SST gradient variations increased over time and exceeded that of internal variability from 1981 to 2017 (Fig. 4b).

Although the contribution of EX to the multidecadal equatorial zonal SST gradient variations varies in 5 periods, there is a significant linear correlation between the mean zonal SST gradient and the percentages of occurrence ratio of different types of El Niño events to total El Niño events (Supplementary Fig. 10). We use a linear regression model to estimate the EX-induced and IV-induced occurrence of different types of El Niño events from 1981 to 2017 (see Methods). Note that the results of the linear regression for the sum of extreme and CP El Niño events (Supplementary Fig. 10a) are consistent with the model result (Fig. 3b), providing high confidence for the estimation. If both EX and IV are considered in the model, we use the observed zonal gradient in the last 40 years as a predictor in the linear regression model, which yields ~9.1 extreme and CP El Niño events (including ~2.7 extreme and ~6.3 CP El Niño events) and ~0 EP El Niño events in the last 40 years, which are close to the actual number of occurrences of El Niño events (Fig. 4c). If only IV was considered in the model, ~1.8 extreme and ~4.1 CP El Niño events are predicted, and the AMO accounts for 77.8% and 75.6% of IV-induced extreme and CP El Niño events, respectively. This result suggests that the extreme and CP El Niño event epochs still exist even without considering anthropogenic forcing. Moreover, anthropogenic forcing also plays a role in changing the El Niño diversity in the last 40 years. The linear regression model result shows that anthropogenic forcing-induced warming accounts for up to ~1 more extreme and ~2 more CP El Niño events from the 1980s to the present. Our result, that the recent changes in El Niño events were attributed to synchronized effects of greenhouse warming and internal variability, still holds when the traditional El Niño classification method is used (Supplementary Information gives a detailed analysis).

## Discussion

In this study, we found an extreme El Niño and CP El Niño epoch from 1875 to 1905, with the El Niño events that occurred in this period showing similar spatiotemporal evolution, dynamic processes, and global climatic impacts to those that occurred after 1980, suggesting a role of internal variability. Then, we revealed that anthropogenic forcing and the internal variability associated with the AMO synchronously contributed to the increased frequency of extreme and CP events since 1980 by modulating the background warming in the western Pacific and the associated zonal advective processes. Furthermore, the internal variability is estimated to contribute to the observed extreme and CP El Niño event epochs since 1980 by up to 65%, with AMO accounting for more than 3/4 of that. However, anthropogenic forcing can explain ~1 more extreme and ~2 more CP events in the last 40 years. Our findings highlight that both the external forcing and the internal variability of the AMO are important factors to be considered in projecting the diversity of El Niño events in a changing climate.

Future changes in El Niño are a crucial issue. Climate models project a tropical Pacific mean state change under global warming, which could not only exert a significant impact on tropical precipitation patterns50,51 but also influence the ENSO characteristics52,53. As extreme El Niño and CP El Niño events tend to occur in periods of enhanced equatorial west-minus-east SST gradient, the projection of El Niño types should partly depend on the projection of this gradient in the Pacific.

## Methods

### Observation datasets

We analyze (a) the Hadley Center Sea Ice and SST dataset version 1 (HadISST131) (b) the Extended Reconstructed Sea Surface Temperature (ERSST) version 5 global SST monthly dataset32 and (c) the Kaplan extended SST version 2 dataset33 for 1871–2017. Data uncertainties exist prior to 1950 due to the sparse coverage of instrumental observations across the equatorial Pacific54. To reduce the uncertainty in the three SST datasets, we used a “merged monthly mean SST dataset (merged SST hereafter) following the pioneering work of ref. 34”. The merged SST is defined as the arithmetic mean of the three monthly mean SST datasets. In addition to the “merged” SST dataset, we also used the HadISST1, ERSSTV5, and Kaplan SST datasets to examine the sensitivity of the results to different instrumental datasets (Supplementary Fig. 2). Although slight differences still exist, our key conclusion is not sensitive to the SST products, and it also provides some confidence to our result.

The surface winds and atmospheric circulation fields are derived by merging the NOAA-CIRES Twentieth Century Reanalysis (20CRv2c)55 from 1871 to 2012 and NCEP/DOE Reanalysis 2 data56 from 1979 to 2018. To ensure temporal consistency, we use the differences in monthly climatology between 20CRv2c and NCEP/DOE2 data during the overlap period 1979–2012 to calibrate the mean state of NCEP/DOE. The ocean reanalysis dataset used is from SODA version 2.2.4 for 1871–200857 with a resolution of 0.5° × 0.5° and extended from 2009 to 2018 with the Global Ocean Data Assimilation System (GODAS) with a 1° × 1° grid58. We use the difference between GODAS and SODA2.2.4 during the overlay period 1980–2008 to calibrate the mean state of GODAS. The land air temperature used is the HadCRUT559 gridded monthly surface temperature datasets. The long-term ship-observed sea level pressure (SLP) and marine cloud cover were used from the International Comprehensive Ocean–Atmosphere Data Set (ICOADS) release 360. The SLP field derived from the Hadley SLP (HadSLP2) dataset61 was also used.

### Reconstruction data

We also use multi-century palaeoclimate reconstructions to extend the observational record. The Niño-3.4 index and AMO index reconstructions used are from the NOAA/World Data Center for Paleoclimatology archive, including a 700-year (1300–2006) El Niño Niño-3.4 index reconstruction based on 2,222 tree-ring chronologies from both the tropics and mid-latitudes in both hemispheres20 and a 1200-year (800–2008) Atlantic Multidecadal Variability/Oscillation (AMV/AMO) Reconstructions based on 46 annual resolved terrestrial proxy records47. A 21-year running mean is applied to the AMV/AMO reconstruction index. The first and last 100 years are cut-offs to minimize edge effects introduced by the spectral filter. The overlay period of 1401–1900 is used. We define extreme El Niño as when the Niño 3.4 index is greater than 1.2 standard deviations. In addition to the multi-century record of extreme El Niño events reconstructed from the Niño-3.4 reconstruction, we also use 400 years (1601–2008) of CP El Niño events reconstructed from 27 seasonally resolved coral records15 to study the relationship between CP events and AMO. The period of 1601–1900 is used, and a total of 34 CP events are identified based on the pioneering work of ref. 12: 1618, 1620, 1641, 1652, 1657, 1667, 1672, 1677, 1682, 1688, 1693, 1718, 1730, 1733, 1759, 1769, 1775, 1778, 1779, 1781, 1790, 1799, 1801, 1808, 1816, 1832, 1840, 1850, 1853, 1854, 1873, 1884, 1885, and 1895 years.

### Model simulation data

We used all forcing (ALL) and single-forcing experiments forced separately by greenhouse gases (GHGs), natural forcing (NAT), and anthropogenic aerosols (AAs) from 27 CMIP6 models (Supplementary Table 3). We used the pre-industrial control runs from 23 CGCMs of CMIP6 and 20 CGCMs of CMIP5. The 23 CMIP6 models were ACCESS-CM2, BCC-ESM1, CAMS-CSM1-0, CanESM5, CESM2, CESM2-WACCM, CNRM-CM6-1, CNRM-ESM2-1, EC-Earth3-CC, EC-Earth3-Veg, GFDL-CM4, GFDL-ESM4, GISS-E2-1-G, GISS-E2-1-H, HadGEM3-GC31-LL, IPSL-CM6A-LR, MIROC-ES2L, MIROC6, MPI-ESM-1-2-HAM, MPI-ESM1-2-LR, MRI-ESM2-0, NESM3, and SAM0-UNICON. The 20 CMIP5 models were ACCESS1-0, BCC-CSM1-1, CCSM4, CESM1-BGC, CESM1-CAM5, CNRM-CM5, CSIRO-Mk3-6-0, CanESM2, FGOALS-g2, GFDL-CM3, GFDL-ESM2G, GFDL-ESM2M, GISS-E2-R, IPSL-CM5A-LR, MIROC-ESM, MIROC5, MPI-ESM-LR, MPI-ESM-P, MRI-CGCM3, and NorESM1-M. These models were usually used to understand the changing behavior of El Niño62,63.

### Definition of El Niño years

We define El Niño events based on SST anomalies averaged in the Niño-3.4 region (5°N–5°S, 120°W–170°W), here called the Niño-3.4 index. We denote the El Niño first developing year, the following year, and the year prior to El Niño as year 0, year 1, and year −1, respectively. We smooth the Niño-3.4 index with a 3-month running-mean filter. El Niño events are defined as occurring when the detrended October–February (ONDJF)-averaged Niño-3.4 index is greater than or equal to 0.6 °C. A total of 38 El Niño events are identified using the merged SST dataset from 1871 to 2017.

### Cluster analysis

We used a nonlinear K-means cluster analysis focused on the spatiotemporal evolution of El Niño events from the onset to the mature phase (from Nov-1 to Oct 0), which is depicted by SST anomalies averaged between 5°S and 5°N. The data used in the cluster analysis were 3-month running mean SST anomalies based on the original SST data, and to focus on the onset phase to mature phase, only SSTAs greater than 0.3 °C were used for the K-means cluster analysis, which is different from the pioneering work of ref. 34.

In the K-means cluster analysis, we use the squared Euclidean distance to measure the similarity between each cluster member and the corresponding cluster pattern. The silhouette clustering evaluation criterion was used to evaluate the performance of the cluster analysis. A high silhouette value indicates that the member is well matched to its cluster and poorly matched to neighboring clusters. Through nonlinear K-means cluster analysis of the evolution of El Niño events, we obtained four types of El Niño events. Supplementary Fig. 1 shows the silhouette values for each El Niño event within each of the 4 clusters for the 1871–2017 period for the merged SST, HadISST1, ERSSTV5, and Kaplan SST datasets. Furthermore, we tested different SST datasets to identify the extreme and CP El Niño events, and they yielded similar results, indicating that the results are robust and not sensitive to the exact choice of dataset (see Supplementary Discussion). Our results are not sensitive to the classification method—for example, we find similar results if we used the traditional Niño method13,14,39 to categorize El Niño events into EP and CP events and then extracted the extreme El Niño events (see Supplementary Fig. 2, Supplementary Fig. 10-11 and Supplementary Discussion).

### Ocean mixed layer heat budget equation

To compare the dominant dynamic processes of extreme and CP types of El Niño between the two periods, we conduct an ocean mixed layer heat budget analysis. The heat budget is computed according to the following equation:

$$\frac{\partial {T}^{{\prime} }}{\partial t} =\left\langle {-u}^{{\prime} }\frac{\partial \bar{T}}{\partial x}\right\rangle+\left\langle -\bar{u}\frac{\partial {T}^{{\prime} }}{\partial x}\right\rangle+\left\langle {-u}^{{\prime} }\frac{\partial {T}^{{\prime} }}{\partial x}\right\rangle+\left\langle {-v}^{{\prime} }\frac{\partial \bar{T}}{\partial y}\right\rangle+\left\langle -\bar{v}\frac{\partial {T}^{{\prime} }}{\partial y}\right\rangle+\left\langle {-v}^{{\prime} }\frac{\partial {T}^{{\prime} }}{\partial y}\right\rangle \\ +\left\langle {-w}^{{\prime} }\frac{\partial \bar{T}}{\partial z}\right\rangle+\left\langle -\bar{w}\frac{\partial {T}^{{\prime} }}{\partial z}\right\rangle+\left\langle {-w}^{{\prime} }\frac{\partial {T}^{{\prime} }}{\partial z}\right\rangle+\frac{{Q}_{{net}}^{{\prime} }}{\rho {C}_{p}H}+R$$
(1)

where $$T$$ is the mixed layer temperature; $$u$$, $$v$$, and $$w$$ represent the zonal, meridional, and vertical ocean currents, respectively; $${Q}_{{net}}$$ is the net surface heat flux comprised of shortwave, longwave, latent, and sensible heat fluxes; $$\rho$$ (=103 kg m−3) is the ocean water density; $${C}_{p}\,$$(=4000 J kg−1 K−1) is the specific heat of water; $$H$$ is the climatological mixed layer depth as a constant 50 m; $$R$$ represents the residual term; and the overbar and prime denote the climatological mean and anomalies, respectively. The analysis result is not sensitive to the constant H we choose, such as H = 70 m. $${-u}^{{\prime} }\partial \bar{T}/\partial x$$, $$-\bar{w}\partial {T}^{{\prime} }/\partial z$$, and $${-w}^{{\prime} }\partial \bar{T}/\partial z$$ denote the zonal advective feedback, thermocline feedback and upwelling feedback, respectively.

### Contingent (2-way) table and Pearson’s chi-square test

The contingency table, also known as a cross-tabulation or a two-way table, is usually used to present categorical data in terms of frequency counts. We use Pearson’s chi‐square test to examine independence between the row and column variables in the contingency table64. Pearson’s chi-square test statistic follows an asymptotic chi-square distribution with (r–1)Х(c–1) degrees of freedom when the row and column variables are independent. It is calculated as

$${\chi }^{2}=\mathop{\sum }\limits_{i=1}^{r}\mathop{\sum }\limits_{j=1}^{c}\frac{{\left({{{{\rm O}}}}_{{ij}}-{{{{\rm E}}}}_{{ij}}\right)}^{2}}{{{{{\rm E}}}}_{{ij}}}$$
(2)

where $${{{{\rm O}}}}_{{ij}}$$ is the observed value shown in the contingency table and $${{{{\rm E}}}}_{{ij}}$$ is the expected value. $${{{{\rm E}}}}_{{ij}}$$ is calculated as $${{{{\rm E}}}}_{{ij}}=\frac{{N}_{i}{N}_{j}}{N}$$, $${N}_{i}={\sum }_{j=1}^{c}{{{{\rm O}}}}_{{ij}}$$, and $${N}_{j}={\sum }_{i=1}^{r}{{{{\rm O}}}}_{{ij}}$$. where r and c are the numbers of rows and columns in the contingency table, $${N}_{i}$$ is the row total and $${N}_{j}$$ is the column total.

### Estimates of the forced and internal components

Many previous studies used the global-mean SST (GMSST) from the multi-model ensemble mean (MMM) simulation to represent the externally forced signal. It is a preferred way to define the forced signal, which has the same temporal evolution over all grid boxes because it is determined by the external forcing series65. This method is based on the principle that the internal variations among the ensemble runs are usually uncorrelated. Thus, after averaging over a large number of ensemble simulations, the uncorrelated internal variations among the ensemble runs were largely smoothed out65. To calculate the MMM, we averaged all the ensemble runs for each single-forcing simulation from all the models. The MMM was similar if we first average over the available ensemble runs for each model to derive the ensemble mean for each model and then average the ensemble mean over all the models with the same weight for each model. Then we calculate the GMSST time series, $${T}_{{GHG}},\,{T}_{{NAT}},\,{T}_{{AA}}$$ from the MMM of GHG, NAT and AA forcing, respectively. A combined index $${{aT}}_{{GHG}}+b{T}_{{NAT}}+c{T}_{{AA}}$$ is constructed based on a multiple linear regression of the observed GMSST time series onto $${T}_{{GHG}},\,{T}_{{NAT}},\,{T}_{{AA}}$$, where $$a,{b},{c}$$ are the regressed coefficients. We use the combined index to estimate the externally forced component of the observed GMSST. We removed changes associated with the externally forced component through linear regression from observed SSTs at each grid point. The residual SST fields primarily contain unforced internal variations65.

### Linear regression method

Since there is a good relationship between the background zonal SSTA gradient and the frequency of different types of El Niño occurrence, for example, frequent occurrence of extreme and CP El Niño events is observed in the period of 1875–1905, 1930–1949 and 1981–2017 coincide with the enhanced zonal SST gradient, while in the other period (from 1905 to 1930 and from 1950 to 1980), EP El Niño events are dominant with a weakened zonal SST gradient. Here, we attempt to estimate the relative contribution of EX- and IV-induced zonal gradients to the El Niño type changes in the last 40 years. We use the background zonal SSTA gradient as predictor $$X$$ and the occurrence ratio of the sum of extreme and CP El Niño events to total El Niño events in one period as response variable $$Y$$

$$Y=\,{\beta }_{0}+{\beta }_{1}X$$
(3)

We used the merged SST dataset shown in Fig. 2a and the HadISST1, ERSSTv5, and Kaplan SST datasets shown in Supplementary Fig. 2, and for each dataset, we calculated the mean zonal SSTA gradient in five periods (1875–1905, 1906–1929, 1930–1949, 1950–1980, 1981–2017) and the occurrence ratio of the sum of extreme and CP El Niño events to total El Niño events in the five periods, so we obtained N = 20 and used them to calculate the intercept $${\beta }_{0}$$ and slope $${\beta }_{1}$$. Then, we calculate the IV-induced and IV + EX-induced mean zonal SSTA gradients from 1981 to 2017 and use them to estimate the IV-induced and IV + EX-induced occurrence ratio of the sum of extreme and CP El Niño events to total El Niño events. A total of 11 El Niño events occurred during the period 1981 to 2017, so we could further estimate the IV-induced and IV + EX-induced occurrence of the sum of extreme and CP El Niño events in this period. The difference between IV + EX-induced and IV-induced occurrence of total extreme and CP El Niño events in this period is used to estimate the EX-induced occurrence of total extreme and CP El Niño events. We also applied the above analysis to extreme, CP, and EP El Niño events.

### Statistical analyses

Statistical significance tests were based on Student’s t test with reduced degrees of freedom66. We calculate the effective degrees of freedom (EDF) using the following equation:

$$\begin{array}{c}\frac{1}{N*}\approx \frac{1}{N}+\frac{2}{N}\mathop{\sum}\limits_{j\,=1}^{N}\frac{N-j}{N}\rho {xx}\left(j\right)\rho {yy}\left(j\right)\end{array}$$
(4)

where $$N$$ is the sample size, and $$\rho {xx}\left(j\right)$$ and $$\rho {yy}\left(j\right)$$ are the autocorrelations of the two sampled time series x and y at time lag j. Here, we follow ref. 67 and use the above equation but without the weighting function (N - j)/N.

We use a bootstrap test to examine whether the number of extreme El Niño/CP El Niño events is statistically different in the positive phase of AMO and negative phase of AMO. We use Monte Carlo bootstrapping to estimate the probability density function (PDF) of the number of the 21-year sliding frequencies of extreme El Niño events and the number of the 21-year sliding frequencies of CP El Niño during positive and negative phases of the AMO. The resampling procedure was repeated 10 000 times. The 2.5 and 97.5% rankings from the probability distribution function indicate the 95% confidence level. The number of extreme El Niño increased in positive phase of AMO compared to negative phase of AMO is statistically significant above the 95% confidence level if distributions of number of extreme El Niño during positive phase of AMO are well-distinguished from that during the negative phase of AMO.