Abstract
Atlantic multidecadal variability is a coherent mode of natural climate variability occurring in the North Atlantic Ocean, with strong impacts on human societies and ecosystems worldwide. However, its periodicity and drivers are widely debated due to the short temporal extent of instrumental observations and competing effects of both internal and external climate factors acting on North Atlantic surface temperature variability. Here, we use a paleoclimate database and an advanced statistical framework to generate, evaluate, and compare 312 reconstructions of the Atlantic multidecadal variability over the past millennium, based on different indices and regression methods. From this process, the best reconstruction is obtained with the random forest method, and its robustness is checked using climate model outputs and independent oceanic paleoclimate data. This reconstruction shows that memory in variations of Atlantic multidecadal variability have strongly increased recently—a potential early warning signal for the approach of a North Atlantic tipping point.
Introduction
Since the early 20th century, the North Atlantic region has exhibited successive decades of anomalously warm and cold sea surface temperatures^{1} (SSTs) relative to the global average, effectively amplifying or damping the effects of global warming in the North Atlantic^{2,3}. The underlying mode of variability, the Atlantic Multidecadal Variability (AMV), formerly named the Atlantic Multidecadal Oscillation^{1}, has been linked to a variety of climate effects^{4,5}, including drought and precipitation in the Sahel^{5,6}, northeastern Brazil^{5}, and central Asia^{5,6}; hurricane frequency and intensity in the Atlantic^{6,7}; sea ice thickness and extent in the Arctic^{8}; and climate variability in the Pacific^{9}. Therefore, understanding AMV drivers is crucial to accurately predict its future changes and related global and regional impacts.
However, the mechanisms driving the AMV and its dominant timescales of variability remain highly debated. Disagreements stem primarily from the relatively short period of approximately 150 years over which the AMV is directly observed—a time span that includes large anthropogenic climate variations. Early studies considered AMV to be mainly an internal mode of variability, in part because SST variations in the North Atlantic were larger than at a global scale, with a strong variability on multidecadal timescales^{4,5,7,10}. However, it has been shown that the last observed period of cold AMV (∼1965–2000 C.E.) coincides with both strong anthropogenic aerosol emissions from Europe and North America^{11,12} and strong volcanic activity^{12}, which are both associated with cooling via their radiative impacts. Nevertheless, the assumption that North Atlantic SST (NASST) variations are purely externally driven is challenged by the observed characteristics of the subsurface ocean in the tropical Atlantic, which were not correctly represented in simulations, including regional aerosol forcing^{13}.
Internal ocean variability has been identified in coupled Atmosphere–Ocean General Circulation Models (AOGCMs) as a potential key driver of the AMV, mainly through variations in the Atlantic Meridional Overturning Circulation^{13,14,15,16,17} (AMOC), a largescale ocean circulation component transporting warm surface waters northward in the North Atlantic^{13}. The complexity and the large threedimensional spatial extent of the AMOC make it more difficult to measure than surface ocean properties. Direct observations of its strength have only been available since 2004 and at one specific latitude (i.e., 26.5°N)^{18,19}. Estimating the potential influence of the AMOC on surface ocean conditions would, therefore, help to understand the AMOC’s behavior over larger timescales. In this respect, the link between AMOC and AMV has been highlighted by a wide range of AOGCMs and experiments^{13,14,15,16,17}. This connection is often explained by the interhemispheric oceanic heat transport of the AMOC^{10}, which affects the temperature of the entire North Atlantic. Nevertheless, the dominant timescales of the variability of both AMV and AMOC vary greatly across these simulations^{20,21}. This raises legitimate questions about the precise process underlying this mode of variability, as well as how to reliably compare model simulations and observations of the AMV.
In a context where variations in greenhouse gas and aerosol concentrations were smaller than over the historical period, several long paleoclimate records^{22,23} and last millennium climate simulations^{17} also support the existence of multidecadal variability related to the AMOC during the preindustrial era. Nevertheless, even for these long periods, the role of internal and external factors in producing AMV variations is also debated^{21,24}.
Although modeling studies and paleoclimate data suggest that AMOC has an effect on the multidecadal NASST variations^{13,14,15,16,17,22,23}, it is unclear how much the observed NASST is also influenced by external factors. To address this issue, studies have proposed removing external radiative signals when defining the AMV to isolate the intrinsic component of NASST variations driven by AMOC heat transport^{2,7,25}. As a result, the current study will concentrate on the latter AMV definition, excluding the direct radiative effects of external forcings by design. To explore the sensitivity of commonly used approaches to remove the radiative effects of external forcings, we will consider three different AMV indices, which are designated as AMV_{T}^{2}, AMV_{TS}^{7}, and AMV_{F}^{25} (“Methods” and Fig. 1a), where the subscripts T, TS, and F refer to associated references^{2,7,25}. In contrast to simple detrending methods^{25}, which can leave spurious signals, the forced variability removed in these AMV indices is expected to account for the evolution of greenhouse gas and aerosol concentrations over the historical period, using different approaches (see “Methods”).
The debate on the driving factors of the AMV also leads to uncertainty concerning its spectral properties. The 150year record of instrumental SST observations does not allow for the identification of consistent frequency bands at multidecadal timescales. Those two limitations (blurred drivers and short timescale) can explain the discrepancies between the AMV periodicity evaluated from instrumental observations^{1} (50 to 70year timescale) and the predominant 10 to 30year periodicity found in most of the preindustrial AOGCM simulations from the Climate Model Intercomparison Project Phase 5 (CMIP5)^{20}. A recent study based on 16 simulations from eight CMIP5 AOGCMs found that transient last millennium simulations including volcanic forcings consistently produce a spectral peak of 50–70 years in global mean surface temperature (GMST), while such a modelwise spectral consistency was not observed for preindustrial control simulations^{21}. As a result, they proposed that volcanic eruptions alone could explain this frequency peak in the AMV in historical and paleoclimate data^{21}. However, the results from control simulations of the aforementioned study^{21} contradict findings from another AOGCM ensemble study, which used a much larger set of preindustrial control simulations from up to 39 CMIP6 models and found that most of them show a robust relationship between the North Atlantic and GMST variability at interdecadal to multidecadal timescales^{26}. Furthermore, an analysis of ice cores and marine archives also calls the hypothesis of a unique role of volcanic eruptions in shaping 50–70year AMV peaks^{19} into question, since it revealed that the 55–70year AMV timescale was essentially driven by the AMOC and atmospheric variability for the majority of the Holocene^{27}.
Aside from the fact that the AMV may be strongly linked to AMOC variations^{13,14,15,16,17,27}, the AMOC is also wellknown for its nonlinear dynamics and is considered as potentially unstable^{28}. However, the AMOC fate in future climate projections is highly uncertain^{29}. Regarding observations, while a recent study based on subsurface densities indicated that there was no discernible trend in AMOC fluctuations over the last 30 years^{30}, other studies have suggested that its current intensity is abnormally weak when looking over a longer timeframe.^{31,32,33,34}. Indeed, an estimate of the AMOC strength based on SST observations suggests that it may have reached an anomalously weak state these last decades^{31}, and some paleoreconstructions even show it to be a record low for at least a thousand years^{31,32,33,34}. Other recent studies^{35,36}, however, have revealed that the strength of ocean currents varies widely between regions, ocean depths, and time, casting doubt on the hypothesis of a recent major basinscale change in the AMOC derived from a relatively small number of oceanic cores^{31,33}. In terms of the nearterm future, evidence of significant Early Warning Signals (EWSs) for an upcoming tipping point has been recently shown for several indices of the AMOC based on observed surface salinity or temperature of the last ~150 years^{37}. However, it is difficult to determine the robustness of these EWSs, especially because of the short timeframe over which they are calculated^{38}. In this study, we investigate an AMOC EWS but on a much longer timescale using a reconstruction of one of its surface fingerprints, namely the AMV. In the formalism of dynamical systems theory, EWSs were developed to anticipate critical transitions of a system’s internal dynamics as a result of an external perturbation, typically in the form of strong changes in persistence, variance, or skewness^{39}. This definition supports our attempt to track a potential approaching tipping point of the AMOC using the AMV definitions depicted above, as it is expected to retrace variability in ocean dynamics rather than external radiative forcings acting on NASST.
Given the lack of SST measurements on sufficiently long timescales, one way to reconstruct the AMV is to rely on its observed climate impacts and fingerprints in specific paleoclimate proxy records. To document these fingerprints, we performed a regression analysis between (i) the mean of the three aforementioned AMV indices (without direct radiative effects) computed from HadISST^{40} (back to 1870 C.E.), and (ii) the gridded fields of surface air temperature and precipitation from CRU TS4^{41} (Fig. 1b–k). Overall, the AMV fingerprints obtained from this regression analysis are consistent with the findings from previous studies^{4,5,6,7}. It confirms that the temperature anomalies associated with the AMV are not limited to the North Atlantic region. Indeed, AMV is also associated with temperature changes in southeast Asia, Africa, the Middle East, and western North America (Fig. 1b–k). The AMV fingerprints on precipitation show that its positive phases are associated with more intense summer precipitation over West Africa and drought over western North America and northeastern South America. We also find significant precipitation anomalies over Eurasia (Fig. 1b–k), though with a more scattered and seasonally dependent pattern. These fingerprints are in agreement with modeling studies^{9,42} that highlight the widespread and seasonally varying teleconnection patterns of the AMV, including the remote impact over Asia (Fig. 1b–k).
We now focus on AMV variations during the last millennium, a period with a relatively good spatial distribution of highresolution precipitation and/or temperaturesensitive proxy records, making it a very suitable timeframe for producing paleoclimate reconstructions at annual resolution. Previous attempts to reconstruct the AMV over the last millennium have relied on models with paleoclimate data assimilation^{43}, paleoclimate data from a single or a few sites^{23,27}, or statistical regression methods with multiple proxy records as predictors^{22,44}. Our study will focus on the latter type of approach (i.e., regression) to extend back the AMV. In this respect, two previous studies^{22,44} reconstructed the AMV over the last millennium using a principal component regression method and continental proxy data located in North Atlantic bordering regions. However, these statistical models were calibrated using raw NASST anomalies^{22,44}. As a result, the obtained reconstructions inevitably included the direct radiative effect of external forcings^{44}, and thus does not correspond to the AMV as defined in the present study^{2,7,25}. In addition, former reconstruction studies^{22,44} do not include any statistical tests for the selection of the proxy records, which means that some of them may not be proper predictors of the AMV. The present study thus relies on statistically objective methods in the selection of proxy data^{45} from the PAGES 2k database^{46} (“Methods”).
In contrast with previous studies that focused on a single reconstruction, we here use an ensemble of 312 reconstructions to investigate the sensitivity to (i) the target observed AMV index, (ii) the regression method, and (iii) the calibration period over which the statistical models are constructed^{45}. Each of these reconstructions is evaluated with appropriate validation scores^{47} (“Methods”), which are then used to identify the bestperforming one that is selected as the final reconstruction. This bestperforming reconstruction is subsequently validated using climate model outputs^{48} and independent ocean records^{46}. We later present a spectral analysis of its variability and explore its potential modulation by radiative effects of natural external forcings (solar and volcanic) over the last millennium. Finally, we show a significant EWS in our AMV reconstruction, indicative of the approach to a climate tipping point, and we use millennialscale climate simulations to illustrate that the EWS metric applied to the AMV can be a relevant indicator for detecting an EWS in the AMOC.
Results
Optimal selection of the regression model and validation
The present study compares 312 reconstructions produced by combining 4 regression methods (principal component regression, random forest, partial least squares, and elasticnet, Supplementary Note 1), 3 definitions of the AMV index (AMV_{T}^{2}, AMV_{TS}^{7}, AMV_{F}^{25}, “Methods”), and 26 different time windows over which the regression model is constructed (Methods). The reconstructions are performed with the ClimIndRec^{45} toolbox, which evaluates their performance with Coefficient of Efficiency scores (S_{CE}) and optimizes the specific control parameters of each regression method by kfold crossvalidation^{45,49} (see “Methods”).
Each reconstruction covers the preindustrial period up to 850 C.E. and is made from a selection of proxy records significantly correlated (at the 95% confidence level) with the observed AMV index. This selection is realized separately for the three AMV definitions and timeframes investigated (“Methods”). The year 850 C.E. was chosen since it is the starting point for the last millennium climate simulations. Such simulations will be used later to further evaluate and validate our reconstruction in a pseudoproxy framework^{48}. The observed time series for the three AMV indices all reach back to 1870 C.E., the first year for which gridded SST observations are available^{40}. The selection of the proxy records is made from a large set of annually resolved proxies from the Northern Hemisphere, which includes the PAGES 2k database^{46} and 41 others^{45} (Supplementary Table 1). This database (hereafter P2k + ) consists of 457 records and was constructed using different quality criteria (Methods). An important preprocessing step is to remove from each P2k+ proxy record an estimate of the forced variability of NASST (“Methods”). This approach is one of the main differences with respect to previous NASST/AMV reconstructions^{22,44} which mixed forced and internal variability signals. Here, to account only for the internal AMV signature in the proxy records, the component of the radiative effects from external forcings is first estimated using a signaltonoise maximizing empirical orthogonal function technique^{2} (“Methods” and Supplementary Fig. 1) based on NASST time series from the average of historical simulations of 32 CMIP5 models (Supplementary Table 2). It should be noted that the forcing component estimated from CMIP5 historical simulations is preferred here over that estimated from CMIP6 models because a recent study found that CMIP6 historical simulations may overestimate the North Atlantic’s response to changes in aerosol concentrations^{50}. Nevertheless, the inclusion of small to moderate eruptions in CMIP6 forcings largely influences the GMST response compared to CMIP5 simulations and might have implications for simulated climate variability^{51}. As a result, we tested the robustness of our results to the uncertainty in the CMIP generation used by reproducing the various reconstructions and analyses of this study, but estimating the NASST forced component from the CMIP6 historical simulations instead (Supplementary Table 3 and Supplementary Figs. 2–5).
To evaluate and compare each of the 312 reconstructions produced, we compute S_{CE} scores^{48} over 30 pairs of randomly drawn training/testing periods, each of which is a partition of the learning period under consideration^{45} (i.e., the entire period over which the statistical model is evaluated and constructed, “Methods”). This S_{CE} metric quantifies the predictive ability of the reconstruction methods over the testing periods^{45,47}. When the S_{CE} is significantly positive for a given testing sample, it indicates that the statistical model provides better estimates than the empirical mean of the AMV observations from the testing sample^{45,47} (“Methods”). Among the 312 reconstructions, the random forest^{52} (RF) method provides the reconstruction with the highest mean S_{CE} score, although it is not necessarily the best method when considering the average of the scores by method for all the reconstructions produced (Supplementary Fig. 6). The best mean S_{CE} score calculated over testing periods is obtained when RF is applied to the AMV_{F} index, for the 850–1987 reconstruction period (i.e., with the 1870–1987 learning period). For this specific reconstruction, the average S_{CE} score over 30 training/testing splits of the learning period is positive at the 99% confidence level, validating its use for reconstruction purposes^{47} (med(CE) = 0.25, mean(CE) = 0.23, “Methods”).
A shortcoming of the four regression methods considered in this study is that they do not permit the use of paleoclimate records with a temporal gap over either the learning or reconstruction periods. To overcome this limitation, the reconstruction setup associated with the highest S_{CE} scores for the period up to 850 C.E. was subsequently applied in a nested reconstruction framework^{44}. This entails gradually reconstructing the AMV_{F} index over time (1 year at a time, starting with the oldest) by constructing a new RF model that includes newly available proxies at each timestep^{44} (“Methods”). This nested reconstruction procedure stops in 1869 C.E., the year before the observed AMV indices start.
To gain more confidence in the trustworthiness of this final reconstruction, we have used, in addition to the standard evaluation metrics within the reconstruction framework^{45} (“Methods”), three independent validation approaches:

1.
A benchmark nested reconstruction was produced using randomly generated red noise processes as predictors, with similar spectral characteristics than the actual proxy records^{53}. This reconstruction was made with the same number of predictors, target AMV index (AMV_{F}), regression method (RF), and reconstruction period (850–1987) as the final reconstruction (see Supplementary Fig. 7 and “Methods” for further details).

2.
A comparison with independent ocean proxy records from the Ocean2k database^{46} with multiannual to multidecadal resolution, in order to assess whether our reconstruction agrees with those slowly varying oceanic changes (see Supplementary Fig. 8, “Methods” and Supplementary Note 2 for further details).

3.
Two pseudoproxy experiments based on climate model outputs^{48} to assess (i) the robustness and reliability of the reconstruction method, and (ii) the potential effect of nonstationarities in the connection between the proxies and the AMV (see Supplementary Figs. 9 and 10, “Methods”, and Supplementary Note 3 for further details).
In addition, we independently performed a NASST reconstruction where no radiative forcing effect is removed from the predictand (NASST) and predictors (proxy records) with the same optimal model selection (Methods, Supplementary Fig. 11). The purpose of this additional reconstruction is to check whether our reconstruction procedure for the AMV consistently produces no direct response to external radiative forcings.
Selected proxy records for the best reconstruction
The nested RF reconstruction of the AMV uses a total of 55 proxy records from the Northern Hemisphere (Fig. 2a, b) selected based on their correlation with the AMV_{F} index. Their final weights, given by RF variables’ importance (“Methods”), and their temporal availability, are shown in Fig. 2c, d. We identify three main groups of records with similarly distributed weights: Central Asia, Europe, and Western North America. The relatively low number of European proxies contrasts with previous regressionbased reconstructions that preselected proxies solely on the basis of their geographic proximity to the North Atlantic^{22,44}. The fact that only five proxy records from Europe are used could be due to their reduced presence in the proxy record database we consider (<10% of the total of P2k + ), or could simply indicate that the climate signal and seasonality of most European proxies are not representative of the AMV signatures (Fig. 1). Among the five selected European proxies, one of them has a large weight (>3.5%) and interestingly covers the whole reconstruction period (Fig. 2c, d). It corresponds to a time series of tree ring growth measurements in the European Alps^{54}, which is strongly correlated with summer instrumental temperature over the historical period (r = 0.7, P < 0.01, Supplementary Table 4). The other four European proxy records are related to summer or annual temperature and precipitation, and are thus in agreement with the welldocumented fingerprints of the AMV on European summer temperatures^{4,5}, also highlighted here in Fig. 1. The proxy data from Asia and western North America are strongly represented in the PAGES 2k^{46} database, which may partly explain their relatively large presence in the reconstruction. The selected proxy records from western North America are most sensitive to summer and annual variations in temperature and precipitation (Fig. 2c, d and Supplementary Table 4), although the correlation remains moderate over the instrumental period in this region (Fig. 1). Interestingly, the reconstruction also selects several strongly weighted proxy records of annual and boreal summer (June–July–August) temperature over eastern Pakistan and the Tibetan Plateau. This link is supported in some recent studies that have highlighted the role of AMV variations for spring and summer temperatures in this region, through the impact of the AMV on largescale atmospheric pressure gradients in the Eurasian sector^{55,56}. The reconstruction further includes a large number of proxy records from East Asia/North China, for which the climate conditions have also been shown to be significantly affected by AMV variations through atmospheric Rossby waves propagation and heat advection changes in the western Pacific^{57}. Several studies showed that the tropical Pacific and the AMV are closely linked through the propagation of largescale equatorial waves and air–sea interactions, also impacting climate in eastern Asia^{9,57}. Thus, although previous AMV reconstructions did not consider proxy records from Asia^{22,44}, we argue that there are good physical and objective reasons for their selection here^{9,55,56,57}.
A detailed description of each proxy record used for our reconstruction and their correlations with the instrumental data and AMV is given in Supplementary Table 4. Their locations are shown on top of the maps from Fig. 1 in Supplementary Fig. 12.
Response to external forcing of the reconstructions
We now investigate the response of the AMV to relevant natural external forcings before the industrial era, namely aerosol emissions from volcanic eruptions and changes in solar irradiance. Because by construction, the AMV was separated from the direct radiative effect of the external forcings^{25}, the analysis focuses on responses that are either nonlinear or delayed. The analysis is complemented with a characterization of the fully forced responses of NASST, for which reconstruction has been performed in the same way (i.e., nested approach and methodological choices) as the AMV one but without removing any forced effect from the predictand (NASST) and predictors (proxy records). The best NASST reconstruction produced this way correlates significantly with the AMV reconstruction (r = 0.64; P < 0.01, Fig. 3a, “Methods” and Supplementary Fig. 11).
Using a recent reconstruction of volcanic activity^{58}, we perform a superposed epoch analysis^{45} (Methods) on the NASST and AMV reconstructions to characterize their radiative response to the 10 largest eruptions of the last millennium (Supplementary Table 5). While the NASST reconstruction shows a decadallong response after the eruption, similar to the result from a previous study^{44}, no significant response is found for the AMV reconstruction (Fig. 3b). The volcanic forcing might affect the AMV with a time lag, at multidecadal timescales, through internal variability excitation and dynamical oceanic response^{59}. However, such a lagged response is not detected in Fig. 3b. Individual responses of the AMV to each volcanic eruption are presented in Supplementary Fig. 13.
Concerning solar forcing, neither the 10year filtered time series nor the 30year filtered time series of the total solar irradiance (TSI) reconstruction used in the PMIP3 protocol^{60} is significantly correlated with our 10year filtered AMV reconstructions, even when solar forcing leads by a few years (r = 0.23, P > 0.2, lag = 12; r = 0.32, P > 0.2, lag = 13; respectively, where r stands for the maximal crosscorrelation across possible lags, expressed in years, cf. Supplementary Fig. 14). On the other hand, the 10year and 30year filtered time series of the TSI reconstruction do show a modest correlation with the NASST reconstruction when the TSI leads by 13–14 years (r = 0.5–0.52, 0.1 < P < 0.2).
Our results thus suggest that natural external forcing does not have a direct influence on the phasing of the AMV, while, as expected, it could play a direct role on variation of the NASST reconstruction. Therefore, the volcanic forcing is not found to act as a pacemaker for the AMV, at least for the ten largest volcanic eruptions of the last millennium.
Spectral analysis
The wavelet analysis in Fig. 4 shows that the AMV reconstruction exhibits significant multidecadal variations. It varies primarily in the 20 to 90year band, except for the 1400–1800 period, which is dominated by shorter cycles (20 to 40year band). Thus, the 50 to 70year preferential periodicity suggested by observations since 1870^{1} may not be systematic. The wide range of AMV preferential timescales can also be found in control simulations of climate models^{26,61} and was also obtained in an AMV reconstruction based on data assimilation^{43}.
To investigate the robustness of these results, the spectral analysis was also extended to the best 30 reconstructions in terms of S_{CE} scores (Supplementary Table 6). We notice that none of these 30 reconstructions’ spectral characteristics suggest a frequency range as narrow as the 50 to 70year timescales, as suggested by the short instrumental period^{1} (Supplementary Fig. 15). Instead, all reconstructions also show a dominant variability in the lower part of the multidecade range (e.g., 20–50 years).
An early warning signal for an approaching tipping point
It has been shown that detecting the approach of an AMOC regime shift in climate models may require knowledge of hundreds of years of AMOC variations^{38}, which is far from what direct observations at 26.5°N, currently covering less than 20 years^{18,19}, provide. In this regard, the evaluation of ocean surface indices as reliable estimates of AMOC variability is debated in the climate community. Indeed, while the heat transport associated with the AMOC can certainly contribute to changes in surface ocean properties^{10,13,14,15,16,17}, NASST also responds strongly to the direct radiative effect of external forcings^{11,12,21,24,44} (Fig. 3b). However, our reconstruction is, by construction, expected to be largely controlled by the dynamics of North Atlantic ocean circulation, and notably of the AMOC^{10,13,14,15,16,17,27}. Indeed, due to its deep ocean aspect, the AMOC may not respond to direct variations in radiative forcings, which is consistent with our AMV reconstruction (Fig. 3b and Supplementary Fig. 14) that we expect to be suitable to retrace AMOC variations. However, delayed and nonlinear dynamical response of the AMOC to other kinds of externally forced perturbations, such as freshwater input in the northern North Atlantic, may still be detected since it is not being removed in AMV indices.
To verify the linkages between the AMOC and AMV, their simulated indices are first investigated in 12 members of CESMLME^{62}. To date, the CESM climate model is the only one that provides such a large number of simulations for the last millennium, making it an invaluable framework for studying this period^{62}. Its use here is particularly relevant because historical runs are too short to study multidecadal relationships between time series. In addition to the simulated AMV_{F} indices for the last millennium extracted from CESMLME for the pseudoproxy experiment (“Methods” and Supplementary Note 3), we also calculated the corresponding AMOC indices. To quantify the AMOC strength in simulations, we used the maximum stream function value below 500 meters and between 20°N and 65°N (AMOC_{max}). Figure 5 shows, for each CESMLME member, the maximum lagged correlations when the AMOC_{max} index leads the AMV_{F} (using decadally smoothed time series). The associated crosscorrelation functions are presented in Supplementary Fig. 16. We find that 10 out of 12 CESMLME members show maximum and significant correlations at the 90% confidence level when the corresponding AMOC_{max} leads by 4–8 years. Among these members, the confidence level reaches 95% for 9 of the members and 99% for 4 of them. Similar percentages and lead times are obtained when other alternative AMOC indices are considered (Supplementary Figs. 16 and 17). The discrepancies observed among members in terms of significance level demonstrate the relevance of using an ensemble of simulations to properly study the relationships between climate phenomena, especially for such a long timescale and when using a single AOGCM^{62}. These discrepancies may reflect the different internal variabilities sampled among the ensemble members^{62}.
We test whether an AMV reconstruction can detect a potential AMOC tipping point, using the relationships between the AMV and the AMOC found in CESMLME (Fig. 5) and various other model simulations in previous studies^{10,13,14,15,16,17}. We employ a tipping point detection metric, the critical slowing down EWS—a type of EWS which has previously been used in studies based on simulated AMOC^{38,39} or observed AMOC indices based on surface observations^{37}. Dynamically, this approach assumes that a given system might be approaching a bifurcation if its memory grows with time, i.e., the state of the system at time t + 1 becomes increasingly dependent on the state of the system at time t. In other words, as the system’s memory grows, so does the time it takes to recover from variations. In time series analysis, an increasing memory of the system is reflected by an increasing firstorder autoregressive (i.e., AR(1)) coefficient^{39}. Kendall τ statistics are calculated for the AR(1) coefficients to estimate their evolution in time for different sliding window lengths^{38,39} (WL, WL = 200–400 years with a 50year increment) in order to evaluate such an EWS in the AMV reconstruction. Here, the Kendall τ statistic quantifies the AR(1) coefficients’ temporal evolution as the ranked correlation between the AMV’s sliding AR(1) coefficients and time (“Methods”). Although this EWS approach has been shown to be successful in an earth system model of intermediate complexity^{38,39}, there are still some sources of uncertainties about its application, such as the lack of information on the exact time when the tipping point is crossed, as well as the Kendall τ statistic’s dependence on the time span it is calculated on.
This EWS test applied to our “real” paleoclimate AMV reconstruction indicates a highly significant increase in the AMV’s memory over the recent period for all window lengths: \({\tau }{\in }\)[0.47,0.55] (P < 0.01 for all, Fig. 6a, “Methods”). According to the theory of tipping point detection in climate time series^{39}, this is a robust estimate, as it is based on sufficiently long observations^{38}, that the AMV may now be approaching a tipping point after which the Atlantic current system might undergo a critical transition. Figure 6a also shows that the most recent values of the AR(1) have seen a particularly sharp increase. Indeed, when only the last 200 years of the timevarying AR(1) coefficients are considered, they reach much higher levels of increase: τ\({\in }\)[0.86,0.94] (P < 0.01 for all WL, Fig. 6a, “Methods”). The top 30 reconstructions (Supplementary Table 6) also suggest a significant EWS, regardless of the length of the WL considered (τ\({\in }\)[0.39,0.77] for all 90 tests, P < 0.01 for 89 of the 90 tests, Supplementary Table 7 and Supplementary Fig. 18), showing that this could not be found by chance on the single reconstruction with the highest S_{CE} scores.
To determine whether the AMVbased EWS can be interpreted as an EWS for the AMOC, we also provide a similar analysis performed on those CESMLME simulations for which a significant lagged AMOC/AMV relationship was identified in Fig. 5. We computed EWS AR(1) statistics from the CESMLME simulations for the AMOC and the AMV_{F} indices, for three different window lengths^{39} (WL = 200, 300, and 400). The Kendall τ statistics obtained for the AMV_{F} index are, for the most part, consistent with those obtained for the AMOC_{max} index (Fig. 6b–d). According to the linear regressions between the AMOC_{max} and AMV_{F} statistics (Fig. 6b–d), AMV may slightly overestimate the AMOC autoregressive properties, but it accurately captures these changes on a millennialscale (at least P < 0.1 for WL = 200 to 400, Fig. 6b–d). Using these linear regression models computed in CESMLME between the AMV_{F} and AMOC_{max} indices (Fig. 6b–d) and the EWS statistics calculated from the AMV reconstruction (Fig. 6a), we estimate the Kendall τ of AR(1) coefficients from the last millennium AMOC to be 0.37, 0.34, and 0.39 for WL = 200,300,400, respectively (P < 0.01 for the three; see “Methods”).
Discussion
The AMV reconstruction proposed here is based on robust optimization and model selection techniques for statistical regression^{45,47,49}. This AMV reconstruction of the last millennium is based on a total of 55 proxy records^{45,46} (Fig. 2) and has been validated against a reconstruction based on surrogate time series with the same persistence as the proxy records^{53} (Supplementary Fig. 7), AOGCM simulation outputs of the last millennium^{62} (Supplementary Figs. 9 and 10), and independent ocean proxy records^{46} (Supplementary Fig. 8).
This AMV reconstruction provides an indirect estimate for past variability in the ocean circulation and notably the AMOC, as verified in last millennium model simulations^{17} (Figs. 5 and 6b–d). As previously stated, the AMV in this study is reconstructed in such a way that the direct radiative influence of external forcings is not included. However, delayed effects of the forcings on the AMV, in terms of intensity, persistence, or changes in spectral characteristics, may remain as a result of its interactions with other climate components responding to forcings. On top of the absence of the direct radiative effect of natural forcings, the AMV reconstruction varies in a wide frequency band (20–90 years, Fig. 4), which is larger than the variability estimated at 50–70 years from observations^{1}, but agrees with data assimilation^{43} and modeling^{26,61} studies.
We also found a significant EWS for an approaching tipping point based on this AMV reconstruction (Fig. 6a). Tipping occurs as a critical threshold, or tipping point, is crossed, leading the system into a different dynamical regime^{39}. In this context, while our AMV reconstruction does not show a consistent radiative response to natural forcings (Fig. 3 and Supplementary Figs. 13 and 14), it may primarily reflect changes in the AMOC internal variability caused by the forcings (Figs. 5 and 6b–d). It must be emphasized that the critical slowing down EWS we employ may generate false alarms for a variety of dynamical systems^{63,64}. Nevertheless, in contrast to the case of the AMV/AMOC with, e.g., underlying and diffuse anthropogenic greenhouse gas forcing causing strong and persistent freshwater anomalies in the northern North Atlantic, false alarms may especially occur for systems that are not prone to tipping^{63}. Therefore, based on the relationship found in models between the AMV and AMOC at decadal timescales^{13,14,15,16,17} (Figs. 5 and 6b–d), relating the EWS found in the AMV with potential large upcoming changes in the AMOC, a wellknown tipping element of the climate system^{28}, appears to be a reasonable and consistent assumption.
Future observations of the actual AMOC strength are still required to draw a reliable picture of its exact impact on the AMV, as this assumption is still primarily based on AOGCM simulations^{13,14,15,16,17} (Figs. 5 and 6b–d). As a result, more observationbased evidence of the precise AMOC impacts on ocean surface properties remains crucial in understanding the extent to which EWSs observed at the ocean surface^{37} (Fig. 6a) can provide robust insights into future AMOC variations. Consequently, the discovered AMV EWS could be a sign of an approaching bifurcation of a more localized feature of the AMOC, such as an abrupt convection weakening in the North Atlantic subpolar gyre region, as observed in some future projections from both the CMIP5^{65} and CMIP6^{66} AOGCM ensembles.
The debate over the fate of the AMOC has intensified recently^{30,36}, as evidenced by the fact that many paleoclimate records are inconsistent with the current strength of the AMOC being below normal^{35,36}, and subsurface density observations extending back 30 years show no consistent trend in the AMOC strength over this period^{30}. A study^{67} recently critiqued these various statements, arguing that some paleoclimate proxies that do not show an AMOC decline^{35} may not necessarily reflect its variations. Moreover, this study^{67} illustrated that the absence of a trend in the AMOC over the last 30 years^{30} does not contradict its longer estimates based on a paleoclimate reconstruction^{32} and observations^{31} of SST fields. The growing evidence from various sources that a strong AMOC state change might be ongoing^{31,32,33,34,37} must be taken seriously in the panel of plausible scenarios in order to better communicate the potential consequences this implies and thus better support decision making^{68}. Indeed, such a welldocumented AMOC state bifurcation could have significant climate and ecological consequences^{28,69,70}. Therefore, this finding emphasizes the importance of taking into account the potential consequences of such a large change in the Atlantic in climate adaptation plans^{68}.
Methods
Estimation of NASSTs forced component using signaltonoise maximizing EOF
The forced component of NASST (all North Atlantic grid points between latitudes 0° and 60°N) is estimated from historical simulations of 32 coupled climate models (Supplementary Table 2). NASST time series from the 32 simulations are extracted and merged as columns of the same matrix. Using a principal component analysis of the latter matrix, the first principal component is retained as the estimated forced component of NASST (Supplementary Fig. 1).
Instrumental AMV indices
Historical AMV indices have been calculated using annually resolved observations of SST data from the Hadley Center dataset (HadISST) for the period 1870–2019. Three instrumental AMV indices are defined as the spatially averaged SST over the North Atlantic region, differing in the way the externally forced signal is removed, either beforehand or afterward.
The first index (AMV_{TS}, Fig. 1a) uses the global averaged SST anomalies between 60°S and 60°N as a proxy for the externally forced signal that is subtracted from NASST.
A second index (AMV_{T}; Fig. 1a), is built using climate model historical simulations to isolate the forced component in NASST. It is calculated with a signaltonoise maximizing empirical orthogonal function (“Methods”) that is then removed by linear regression from an estimate of its 10year smoothed effect at each grid point of the North Atlantic. The AMV_{T} index is then obtained as the average of regression residuals over the North Atlantic region.
Finally, the AMV_{F} is obtained as the spatial average of the North Atlantic time series residuals obtained after regressing out the tenyear moving average of the global mean SST between 60°S and 60°N.
Only after being reconstructed, each AMV index has been smoothed with a 10year kernel filter, following the classical AMV definition.
Proxy records database
To select the proxy records of this study, a first large dataset is made by merging the PAGES 2k database (686 records) with other proxy records in neighboring continents of the North Atlantic published by different studies (Supplementary Table 1). Duplicates from the three sources are removed leading to a final database of 727 proxy records (P2kALL). The proxy records from P2kALL used in the reconstruction have been selected to fulfill the following conditions: (i) they are annually resolved, (ii) they are located in the Northern hemisphere (latitude >0°), and (iii) they are significantly correlated at the 95% confidence level with at least one historical time series of either annual or seasonal precipitation or surface temperatures from the nearest grid point within the CRU TS4 historical dataset.
First analyses singled out a proxy record in Asia named “Asia.MOR1JU” which had abnormally large RF weights (more than 5 times higher than the second) such that S_{CE} scores were almost doubled due to its inclusion. The reason for this behavior was not clear but highlighted a limitation of the nonlinear method. Whatsoever, since hundreds of proxy records are tested against the AMV indices, such a statistical anomaly does not appear unlikely. In order to prevent biasing the reconstruction toward this single proxy record, we decided to remove it from the database used in this study.
In the next sections, we describe how different reconstructions are compared and how a final nested reconstruction of the AMV is obtained. These reconstructions also use correlation tests to select proxies from P2k+ that are significantly correlated with a given AMV index, for a given learning and reconstruction period. This means that only a subset of the most relevant proxy records from the P2k+ database is finally used in each reconstruction.
Generation of the AMV and NASST reconstruction ensembles
The 312 reconstructions compared in this study are performed for 26time windows Γ: from 850–1975 to 850–2000, subsequently incrementing the superior boundary by 1 year. We use this approach to sample the sensitivity of the reconstructions to the calibration period, which goes from 1870 to the upper bound of Γ. For lower upper bounds, more proxies are available, but the regression models are built over shorter calibration periods and therefore have fewer degrees of freedom. These 26 temporal windows are used in combination with three AMV indices and four regression methods. A detailed description of these regression methods is given in Supplementary Note 1. We thus obtain 26 × 3 × 4 = 312 final reconstructions which are compared using the S_{CE} metric which allows us both to determine the best model and to say how reliable it is (next Methods section, Supplementary Fig. 6). All these setups are tested by only using proxies available and significantly correlated at the 95% confidence level with the respective AMV index. Noteworthy, it is, of course, possible that some proxies correlate significantly with AMV just by chance, which implies that some spurious predictors might be used in the reconstruction. However, the weights of these proxies are expected to be low, as they should be penalized in the crossvalidation process.
For the NASST reconstruction, 104 setups are compared by shuffling the same 26 temporal windows and the same 4 regression methods as for each of the AMV indices. As for the AMV, the bestperforming NASST reconstruction is retained as the one with the highest S_{CE} scores (“Computation of reconstruction and evaluation“ and Supplementary Fig. 11).
Computation of reconstruction and evaluation
We define the reconstruction period as Γ, defined by N annual time steps, and the common period of the proxy records and the AMV index as T, in this case defined by n < N annual time steps such that \({{{{{\mathbf{T}}}}}}\subset {{{{{\mathbf{\Gamma }}}}}}\). We then define the AMV index as \({{{{{\mathbf{Y}}}}}}\in {{\mathbb{R}}}^{n}\) and the matrix of the p available proxy records as \({{{{{\mathbf{X}}}}}}\in {{\mathbb{R}}}^{N\times p}\). We finally denote as \({{{{{\mathbf{x}}}}}}\in {{\mathbb{R}}}^{n\times p}\) the submatrix of X that contains the proxy records values over the timeframe T\(.\) The X matrix can then be denoted as \({{{{{\mathbf{X}}}}}}=[({{{{{{\mathbf{X}}}}}}}_{t}^{j})_{t \in {{{{{\boldsymbol{\Gamma }}}}}}}]_{1 \le j \le p}\) and \({{{{{\mathbf{x}}}}}}=[{{{{{{\mathbf{x}}}}}}}^{j}]_{1 \le j \le p}=[({{{{{{\mathbf{X}}}}}}}_{t}^{j})_{t \in {{{{{\mathbf{T}}}}}}}]_{1 \le j \le p}\).
We then randomly split T in R = 30 pairs of training/testing samples respectively denoted, \(\forall 1\le r\le R,\) as \(\{{{{{{{\mathbf{x}}}}}}}_{({{{{{\rm{train}}}}}})}^{(r)};{{{{{{\mathbf{Y}}}}}}}_{({{{{{\rm{train}}}}}})}^{(r)}\}\) and \(\{{{{{{{\mathbf{x}}}}}}}_{({{{{{\rm{test}}}}}})}^{(r)};{{{{{{\mathbf{Y}}}}}}}_{({{{{{\rm{test}}}}}})}^{(r)}\}\). Here, the training sample size is set to be 80% of the length of T so, by extension, the testing sample size is 20% of the length of T.
For statistical modeling, we use \({{{{{{\mathbf{Y}}}}}}}_{({{{{{\rm{train}}}}}})}^{(r)}\) as the predictand and \({{{{{{\mathbf{x}}}}}}}_{({{{{{\rm{train}}}}}})}^{(r)}\) as the predictors. For a given regression method denoted M, we apply KFCV (see subsection “kfold crossvalidation (KFCV)”) to each training set \(\left\{{{{{{{\mathbf{x}}}}}}}_{\left({{{{{\rm{train}}}}}}\right)}^{\left(r\right)};{{{{{{\mathbf{Y}}}}}}}_{\left({{{{{\rm{train}}}}}}\right)}^{\left(r\right)}\right\}\) as a metric to find the optimal set of parameters associated to the training sample and M.
M and the associated optimal set of control parameters are then applied to X^{(r)} in order to reconstruct Y^{(r)} on the testing period, which gives \({\hat{{{{{{\mathbf{Y}}}}}}}}_{({{{{{\rm{test}}}}}})}^{(r)}\), and the reconstruction period, which gives \({\hat{{{{{{\mathbf{Y}}}}}}}}_{({{{{{\rm{rec}}}}}})}^{(r)}\). This requires that \({\hat{{{{{{\mathbf{Y}}}}}}}}_{({{{{{\rm{test}}}}}})}^{(r)}=({\hat{{{{{{\mathbf{Y}}}}}}}}_{({{{{{\rm{rec}}}}}})}^{(r)})_{t \in {{{{{\mathbf{T}}}}}}}.\) The validation score associated to the r^{th} training sample is then calculated using the S_{CE}^{47} over the r^{th} testing sample:
Where m is the length of the testing sample.
This validation score gives an estimation of the accuracy of the statistical model when reconstructing the observed variability not included in the reconstruction period. When S_{CE} < 0, it means that a simple sample average over the testing period is better than the output given by the statistical model. Contrarily, S_{CE} > 0 means that the statistical model gives a more reliable reconstruction than the average over the testing sample^{45}, the associated reconstruction is thereby considered as reliable in this study.
The reconstruction for a given AMV index Y performed on a given timeframe Γ using a given statistical regression method M is obtained by applying it with KFCV over the whole learning sample. This reconstruction is associated to a global validation score, calculated as the mean of the individual validation scores obtained for the randomly drawn training and testing splits: \(s={{{{{\rm{avg}}}}}}(\{{s}^{(r)}\}_{1 \le r \le R})\).
kfold crossvalidation (KFCV)
Each method requires an optimization of its own set of control parameters θ. To estimate the optimal set of control parameters θ_{opt} on a given training set \(\{{{{{{{\mathbf{x}}}}}}}_{({{{{{\rm{train}}}}}})},{{{{{{\mathbf{Y}}}}}}}_{({{{{{\rm{train}}}}}})}\}\), we use a kfold crossvalidation (KFCV) approach.
The KFCV splits the observations into a partition of K groups of the same size (or with approximately the same size if the length of the training set is not divisible by K). We denote \(\left\{{{{{{{\mathbf{x}}}}}}}_{\left(k\right)},{{{{{{\mathbf{Y}}}}}}}_{\left(k\right)}\right\}\forall 1\le k\le K\), which contain only observations for the k^{th} drawn sample. We denote as \(\left\{{{{{{{\mathbf{x}}}}}}}_{\left(k\right)},{{{{{{\mathbf{Y}}}}}}}_{\left(k\right)}\right\}\) the K1 other sets. For all possible values of \({{{{{\boldsymbol{\theta }}}}}}\in \Theta\), we scan the K models based on the sets \(\{{{{{{{\mathbf{x}}}}}}}_{(k)},{{{{{{\mathbf{Y}}}}}}}_{(k)}\}\). The empirical optimal set of control parameters is obtained by minimizing the averaged Root Mean Squared Errors (RMSE) on the K splits by considering all possible values of θ (or as much as possible if θ is defined in a continuous set; see Supplementary Note 1). The optimal KFCV set of control parameters θ_{KF} is determined by:
Nested reconstruction
In this study, the best reconstruction found (defined as the one yielding the best S_{CE} scores) is the reconstruction of the AMV_{F} index with the random forest method over the period 850–1987 (Supplementary Fig. 6).
Using the same methodological choices (calibration period, AMV definition and reconstruction method), we have then performed a set of 1020 nested reconstructions for the periods 850–1987 to 1869–1987, subsequently using increments of one year for the inferior boundary, which allow us to use an increasing number of proxy records to reconstruct the most recent years. The nested reconstruction (i.e., the reconstruction presented in this study), is obtained by concatenating the first year in each of these 1020 reconstructions.
Best 30 reconstructions
As additional validation of the results found for the best reconstruction from this study, the 30 best reconstructions out of the 312 have been also produced and studied. Of note, nested reconstructions ranked from the 30th to the 2nd positions have all been produced with a 20year timestep instead of the oneyear one for the best reconstruction in order to largely limit the amount of computing time and energy used. A detailed description of each of these reconstructions and their average S_{CE} score are given in Supplementary Table 4.
Random forest weights
The weights of the proxy records used for the nested reconstruction are presented in Fig. 2c. Those weights have been calculated using the random forest variable importance. Different importance metrics exist, and for this study, we have selected the commonly used Mean Decrease in Impurity (MDI), also known as Gini importance. The MDI of a given proxy record is calculated as the sum of the number of splits where it is used across the K trees (Supplementary Note 1 for details on the regression methods), proportionally to the numbers of split samples in all trees (cf. Supplementary Note 1). For Fig. 2c, the MDI for each proxy is aggregated over the 1020 reconstructions using a weight of n/N, where n is the number of used proxies for a given timestep, and N = 55 the total number of proxies used at the end for the reconstruction. Finally, Fig. 2c is computed by calculating the weight of each proxy as a fraction of (i) its calculated MDIs over the timeframes of the nested reconstruction where it is used, and (ii) its total MDI over these timeframes. The same is done in Supplementary Figs. 9 and 10, but each MDI is also averaged over the pseudoproxy experiments performed on the 12 CESM1LME members used.
Validation using random red noise predictors
To verify the relevance of the S_{CE} metric and the quality of our statistical model, a further validation was performed based on randomly generated red noise predictors. The red noise processes are randomly drawn as firstorder autoregressive processes, based on the empirical firstorder autoregressive coefficient derived from the proxy records. The average reconstruction scores obtained from random red noise predictors are significantly negative (−0.08, P < 0.01) and significantly lower than the average scores obtained from real proxies over the different periods of the nested reconstruction (0.25, P < 0.01, Supplementary Fig. 7).
Composites of ocean proxy records time series
Since corals are often very short records and ocean sediment cores have too low temporal resolution (preventing them to meet the requirement of being annually resolved), there is a shortage of ocean records contributing to the reconstructions, which are almost exclusively based on terrestrial records. Interestingly, the lowfrequency part in the annually resolved reconstructions can be verified against the ocean records from the Ocean2K database that have not been used in the reconstruction. To avoid overfitting, correlation significance shown in Supplementary Fig. 8a is only calculated for the preindustrial period (before 1870) with an AR(1) correction for significance tests to avoid false detections due to the low resolution of some proxies (“Methods”). The composite average time series are performed by multiplying by −1 (i.e., calibrated) ocean proxies which have negative correlations with the AMV over the historical period only, such that we mimic a compositeplusscale reconstruction based on these records only. There is an exception for some ocean records (<10 from Supplementary Fig. 8a) that do not overlap with our reconstruction over the historical period. These time series are therefore multiplied by −1 if their correlations are negative over their overlapping period with the AMV reconstruction.
Detailed descriptions of the results are given in Supplementary Note 2.
Twoway multimember pseudoproxy experiments
For the pseudoproxy experiment, we use 12 members (number 2 to 13) of the National Center of Atmospheric Research (NCAR) Community Earth System Model Lat Millennium Experiment (CESMLME). Since the calculation of a trend for a given NASST time series is timedependent, we distinguish the calculation of the model AMV over the preindustrial (PI) period (pre1870, AMV_{PI}) and the historical one (AMV_{H}), notably because anthropogenic forcings were small during the PI period as compared to the recent one (historical). For AMV_{H}, we calculate the AMV by subtracting the ensemble mean of the CMIP5 models, similarly to the realworld reconstruction. The AMV_{F} is computed by estimating the NASST regression coefficients with the global SST over the PI period.
Each pseudoproxy mimics a specific realworld proxy by taking, in the model, the variable and season (or annual values) with the largest absolute correlations between the real proxy record and the closest grid points from the CRU TS4 dataset (Supplementary Table 4). Gaps and missing values in real proxy records are also reproduced in the pseudoproxy time series.
For both pseudoproxy experiments presented below, and for the sake of reducing computational costs, nested reconstructions have been made with a 20year timestep for the inferior boundary (from 850–1987 to 1850–1987) instead of the 1year one that is used for the real reconstructions.
The first pseudoproxy experiment consists of reconstructing the AMV within model simulations using the same proxy records and RF method as for the realworld reconstruction, and for each timestep of the nested reconstruction. Therefore, the reconstruction scores presented in Supplementary Fig. 9a for each CESMLME member are averaged over the different timeframes of the nested reconstruction. The correlations are those calculated between the RFbased reconstruction of the model AMV and AMV_{PI}. For each member, the proxy record weights are calculated in the same way as for Fig. 2c (“Methods”), and an ensemble average is presented in Supplementary Fig. 9b.
The second pseudoproxy experiment consists of training RF models directly within the CESMLME members, in which the pseudoproxies are selected if the confidence level of the correlation test with the model AMV is above 95%. These trained RF models tailored to the model simulations are then applied to the AMV_{F} index over the same portion of the historical period than in the real experiment. The obtained reconstructions are then compared to the reconstruction using realworld derived weights in Supplementary Fig. 10c. Since realworld proxy records have been measured with specific units (tree ring MXD, ice core δ^{18}O,…), the pseudoproxies are rescaled to the mean and the variance of the corresponding realworld proxy. For the same reason, the pseudoproxy is multiplied by −1 if its correlation with the model AMV has an opposite sign to that of the realworld proxy with the realworld AMV.
For the two pseudoproxy validations, detailed descriptions of the results are given in Supplementary Note 3.
Response to volcanic eruptions
To study the potential response of the reconstructed time series (AMV and NASST) we use a superposed epoch analysis. It consists in subtracting for each eruption the years from N–10 and N + 20, where N stands for the year where a given eruption effectively occurred. A composite is then computed as the average of the time series corresponding to the ten strongest eruptions. The 95% confidence level envelope is then calculated for each timestep from the distribution of the surrogate time series.
For the best 30 reconstructions, it is considered that the negative AMV response is significant if at least two years between N and N + 20 is lower than the 95% MonteCarlo envelope.
Early warning signal test
We base our approach on methods for the detection of incoming climate tipping points using the AR(1) critical slowing down metric. The AMV reconstruction is firstly smoothed using a Kernel Gaussian filtering with a bandwidth of 100 years. The annually resolved AMV is then regressed onto its 100year filtered version to remove potential longterm trends. The AR(1) coefficients of the residuals from this regression are calculated for different sliding window lengths WL = 200, 250, 300, 350, 400 years (Fig. 6a). The Kendall rank correlation, called Kendall τ, is calculated for each of the AR(1) coefficient series. Contrary to a former study focusing on early warning signal applied to an AMOC collapse in an earth system model of intermediate complexity^{38}, we cannot use a modelbased estimate of the significance of Kendall τ, which is rather calculated using a gaussian approximation as detailed in the “Statistical information” section.
Statistical information
Figure 1b, c: For each grid point, a twotailed Student test is applied to the regression coefficients between the corresponding climate variable and the AMV indices. The degrees of freedom are corrected using time series autocorrelations^{45}.
Figure 6a: The Kendall rank correlation coefficient, or Kendall τ coefficient, measures the ordinal association between two quantities, here AR(1) coefficients denoted \({({{{{{{\mathbf{x}}}}}}}_{i})}_{1\le i\le n}\) here, and time denoted \({({{{{{{\mathbf{y}}}}}}}_{i})}_{1\le i\le n}\). The statistic is given by:
Where, considering (x_{1}, y_{1}), (x_{2}, y_{2}), …, (x_{n}, y_{n}) the ensemble of joint pairs:
For large sample (n > 50), as in this study, the distribution is approximated with a Gaussian distribution of mean 0 and variance \(\frac{2(2n+5)}{9n(n1)}\), under the null hypothesis H_{0}: “τ = 0” which is tested against the alternative hypothesis H_{1}: “τ ≠ 0”. The P value (shown in Fig. 6a) of the test is deduced from the quantiles of this distribution.
Unless stated otherwise correlation are tested using a Student t test for correlation, with corrected degrees of freedom using time series autocorrelation, as shown for instance in Supplementary Fig. 8a.
For all boxplots of the study, the median is shown as a heavy darkline. Boxplots edges give first and third quartiles. Boxplot “whiskers” gives the full range without including outliers, which are not shown here for better graphical representations. A point from a boxplot is here considered as an outlier when it is outside 1.5 times the interquartile range above the upper quartile and below the lower quartile.
Data availability
The whole data generated and treated in this study have been deposited on a Zenodo repository: https://zenodo.org/record/4896670#.YLjdOS2w3dc. The source and treated data to compute figures in this study are provided in the Source Data file. The generated data in this study are provided in the Data file. Simulation outputs for the LME experiments can be accessed from the NCAR Climate Data gateway website: https://www.earthsystemgrid.org/. The CRU TS4 datasets can be accessed from the CRU data download webpage: https://crudata.uea.ac.uk/cru/data/hrg/#current. The HadISST data can be accessed from the Met Office Hadley Center website: https://www.metoffice.gov.uk/hadobs/hadisst/. For the historical experiments of CMIP5 and CMIP6 model outputs, data can be accessed from the ESGF website: https://esgfnode.llnl.gov/projects/esgfllnl/. Source data are provided with this paper.
Code availability
All codes needed to reproduce this study are publicly available on the following Zenodo link: https://zenodo.org/record/4896670#.YLjdOS2w3dc. Codes have been made using bash UNIX and R languages. R packages are listed in Supplementary Table 8.
References
Kushnir, Y. Interdecadal variations in North Atlantic sea surface temperatures and associated atmospheric conditions. J. Clim. 7, 1–20 (1994).
Ting, M., Kushnir, Y., Saeger, R. & Cuihua, L. Forced and internal twentiethcentury SST trends in the North Atlantic. J. Clim. 22, 1469–1481 (2009).
Mann, M. E., Steinman, B. A. & Miller, S. K. On forced temperature changes, internal variability, and the AMO. Geophys. Res. Lett. 41, 3211–3219 (2014).
Sutton, R. T. & Hodson, D. L. R. Atlantic Ocean forcing of North American and European summer climate. Science 309, 115–118 (2005).
Knight, J., Folland, C. & Scaife, A. Climate impacts of the Atlantic multidecadal oscillation. Geophys. Res. Lett. 33, L17706 (2006).
Zhang, R. & Delworth, T. L. Impact of Atlantic multidecadal oscillation on India/Sahel rainfall and Atlantic hurricanes. Geophys. Res. Lett. 33, L17712 (2006).
Trenberth, K. & Shea, D. Atlantic Hurricanes and natural variability in 2005. Geophys. Res. Lett. 33, L12704 (2006).
Miles, M. et al. A signal of persistent Atlantic multidecadal variability in Arctic sea ice. Geophys. Res. Lett. 41, 463–469 (2014).
RuprichRobert, Y. et al. Assessing the climate impacts of the observed Atlantic multidecadal variability using the GFDL CM2.1 and NCAR CESM1 global coupled models. J. Clim. 30, 2785–2810 (2017).
Knight, J. R., Allan, R. J., Folland, C. K., Vellinga, M. & Mann, M. E. A signature of persistent natural thermohaline circulation cycles in observed climate. Geophys. Res. Lett. 32, L20708 (2005).
Smith, D. M. et al. Role of volcanic and anthropogenic aerosols in the recent global surface warming slowdown. Nat. Clim. Change 6, 936–940 (2016).
Booth, B. B. B., Dunstone, N. J., Halloran, P. R., Andrews, T. & Bellouin, N. Aerosols implicated as a prime driver of twentiethcentury North Atlantic climate variability. Nature 484, 228–232 (2012).
Zhang, R. et al. A review of the role of Atlantic meridional overturning circulation in Atlantic multidecadal variability and associated climate impacts. Rev. Geophys. 57, 316–375 (2019).
Muir, L. C. & Fedorov, A. V. How the AMOC affects ocean temperatures on decadal to centennial timescales: the North Atlantic versus an interhemispheric seesaw. Clim. Dynam. 45, 151–160 (2015).
Yan, X., Zhang, R. & Knutson, T. Underestimated AMOC variability and implications for AMV and predictability in CMIP models. Geophys. Res. Lett. 45, 4319–4328 (2018).
Kim, W. M., Yaeger, S. & Danabasoglu, G. Atlantic multidecadal variability and associated impacts initiated by ocean thermohaline circulation. J. Clim. 33, 1317–1334 (2020).
Fang, S.W., Khodri, M., Timmreck, C., Zanchettin, D. & Jungclaus, J. Disentangling internal and external contributions to Atlantic multidecadal variability over the past millennium. Geophys. Res. Lett. 48, e2021GL095990 (2021).
Cunningham, S. A. et al. Temporal variability of the Atlantic meridional overturning circulation at 26.5°N. Science 317, 935–938 (2007).
McCarthy, G. D. et al. Measuring the Atlantic meridional overturning circulation at 26.5°. Prog. Oceanogr. 130, 91–111 (2015).
Lin, P. et al. Two regimes of Atlantic multidecadal oscillation: crossbasin dependent or Atlanticintrinsic. Sci. Bull. 64, 198–204 (2019).
Mann, M. M., Steinman, B. A., Brouillette, D. J. & Miller, S. K. Multidecadal climate oscillations during the past millennium driven by volcanic forcing. Science 371, 1014–1019 (2021).
Gray, S., Graumlich, L., Betancourt, J. & Pederson, G. D. A treering based reconstruction of the Atlantic multidecadal oscillation since 1567 A.D. Geophys. Res. Lett. 31, L12205 (2004).
Mjell, T. R., Ninnemann, U. S., Eldevik, T. & Kleiven, H. K. F. Holocene multidecadalto millennialscale variations in IcelandScotland overflow and their relationship to climate. Paleoc. Paleoclimatol. 30, 558–569 (2015).
Otterå, O. H., Bentsen, M., Drange, H. & Suo, L. External forcing as a metronome for Atlantic multidecadal variability. Nat. Geosci. 3, 688–694 (2010).
Frankignoul, C., Gastineau, G. & Kwon, Y.O. Estimation of the SST Response to anthropogenic and external forcing and its impact on the Atlantic multidecadal oscillation and the Pacific decadal oscillation. J. Clim. 30, 9871–9895 (2017).
Parsons, L. A., Brennan, M. K., Will, C. J. & Proistosescu, C. Magnitudes and spatial patterns of interdecadal temperature variability in CMIP6. Geophys. Res. Lett. 47, e2019GL086588 (2020).
Knudsen, M. F., Seidenkrantz, M.S., Jacobsen, B. H. & Kuijpers, A. Tracking the Atlantic multidecadal oscillation through the last 8,000 years. Nat. Commun. 2, 178 (2011).
Collins, M. et al. Extremes, abrupt changes and managing risk . In IPCC SROCC (eds Pörtner, H.O. et al.) Chapter 6 (2019).
Bellomo, K., Angeloni, M., Corti, S. & von Hardenberg, J. Future climate change shaped by intermodel differences in Atlantic meridional overturning circulation response. Nat. Commun. 12, 3659 (2021).
Worthington, E. L. et al. A 30year reconstruction of the Atlantic meridional overturning circulation shows no decline. Ocean Sci. 17, 285–299 (2021).
Caesar, L., Rahmstorf, S., Robinson, A., Feulner, G. & Saba, V. Observed fingerprint of a weakening Atlantic ocean overturning circulation. Nature 556, 191–195 (2018).
Rahmstorf, S. et al. Exceptional twentiethcentury slowdown in Atlantic ocean overturning circulation. Nat. Clim. Change 5, 475–480 (2015).
Thornalley, D. J. R. et al. Anomalously weak Labrador sea convection and Atlantic overturning during the past 150 years. Nature 558, 227–230 (2018).
Caesar, L., McCarthy, G. D., Thornalley, D. J. R., Cahill, N. & Rahmstorf, S. Current Atlantic meridional overturning circulation weakest in the last millennium. Nat. Geosci. 14, 118–120 (2021).
Kilbourne, K. A. et al. Atlantic circulation change still uncertain. Nat. Geosci. 15, 165–167 (2022).
MoffaSánchez, P. et al. Variability in the Northern North Atlantic and Artic Oceans across the last two millennia: a review. Paleoc. Paleoclimatol. 34, 1399–1436 (2019).
Boers, N. Observationbased earlywarning signals for a collapse of the Atlantic meridional overturning circulation. Nat. Clim. Change 11, 680–688 (2021).
Boulton, C. A., Allison, L. C. & Lenton, T. M. Early warning signals of Atlantic meridional overturning circulation collapse in a fully coupled climate model. Nat. Commun. 5, 5752 (2014).
Lenton, T. M. Early warning of climate tipping point. Nat. Clim. Change 1, 201–208 (2011).
Rayner, N. A. et al. Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth. J. Geophys. Res. 108, D14 (2003).
Harris, I., Osborn, T. J., Jones, P. & Lister, D. Version 4 of the CRU TS monthly highresolution gridded multivariate climate dataset. Sci. Data. 7, 109 (2020).
RuprichRobert, Y. et al. Impacts of Atlantic multidecadal variability on the tropical Pacific: a multimodel study. npj Clim. Atmos. Sci. 4, 33 (2021).
Singh, H. K. A., Hakim, G. J., Tardiff, R., EmileGeay, J. & Noone, D. C. Insights into Atlantic multidecadal variability using last millennium reanalysis framework. Journ. Geophys. Res. Atm. 14, 157–174 (2018).
Wang, J. et al. Internal and external forcing of multidecadal Atlantic variability over the past 1,200 years. Nat. Geosci. 10, 512–518 (2017).
Michel, S. et al. Reconstructing climatic modes of variability from proxy records using ClimIndRec version 1.0. Geosci. Mod. Dev. 13, 841–858 (2020).
PAGES 2k Consortium. A global multiproxy database for temperature reconstructions of the Common Era. Sci. Data. 4, 170088 (2017).
Nash, J. E. & Sutcliffe, J. V. River flow forecasting through conceptual models part I: a discussion of principles. J. Clim. 10, 282–290 (1970).
Neukom, R., Shurer, A. P., Steiger, N. J. & Hegerl, G. C. Possible causes of data model discrepancy in the temperature history of the last Millennium. Sci. Rep. 8, 7572 (2018).
Stone, M. Crossvalidatory choice and assessment of statistical predictions. Journ. Roy. Stat. Soc. 36, 111–147 (1974).
Menary, M. B. et al. Aerosolforced AMOC changes in CMIP6 historical simulations. Geophys. Res. Lett. 47, e2020GL088166 (2020).
Fyfe, J. C., Kharin, V. V., Santer, B. D., Cole, J. N. S. & Gillet, N. P. Significant impact of forcing uncertainty in a large ensemble of climate model simulations. PNAS 118, e2016549118 (2020).
Breiman, L. Random forests. Mach. Learn 45, 5–32 (2001).
Wahl, E. R. & Smerdson, J. E. Comparative performance of paleoclimate field and index reconstructions derived from climate proxies and noiseonly predictors. Geophys. Res. Lett. 39, L06703 (2012).
Büntgen, U., Frank, D. C., Nievergelt, D. & Esper, J. Summer temperature variations in the European Alps, A.D. 7552004. J. Clim. 19, 5606–5623 (2006).
Shi, C. et al. Summer temperature over the Tibetan Plateau modulated by Atlantic multidecadal variability. J. Clim. 32, 4055–4067 (2019).
Li, J., Li, F., He, S., Wang, H. & Orsolini, Y. J. The Atlantic multidecadal variability phase dependence of teleconnection between the North Atlantic oscillation in February and the Tibetan Plateau in March. J. Clim. 34, 4227–4242 (2021).
Monerie, P.A., Robson, J., Dong, B. & Hodson, D. Role of the Atlantic multidecadal variability in modulating East Asian Climate. Clim. Dynam. 56, 381–398 (2021).
Sigl, M. et al. Timing and climate forcing of volcanic eruptions for the past 2,500 years. Nature 523, 543–549 (2015).
Swingedouw, D. et al. Bidecadal North Atlantic ocean circulation variability controlled by timing of volcanic eruptions. Nat. Commun. 6, 6545 (2015).
Vieira, L. E. A., Solanki, S. K., Krikova, N. A. & Usoskin, I. Evolution of the solar irradiance during the Holocene. Astron. Astrophys. 531, A6 (2011).
Frankcombe, L. M., von der Heydt, A. & Dijkstra, H. North Atlantic multidecadal climate variability: an investigation of dominant time scales and processes. J. Clim. 23, 3626–3638 (2010).
OttoBliesner, B. L. et al. Climate variability and change since 850 CE: an ensemble approach with the community earth system model. Bull. Am. Meteor. Soc. 97, 5 (2016).
Kéfi, S., Dakos, V., Scheffer, M., van Nes, E. H. & Rietkerk, M. Early warning signals also precede noncatastrophic transitions. Oikos 122, 641–648 (2013).
Dakos, V., Carpenter, S. R., van Nes, E. H. & Scheffer, M. Resilience indicators: prospects and limitations for early warnings of regime shifts. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 370, 20130263 (2015).
Sgubin, G., Swingedouw, D., Drijfhout, S., Mary, Y. & Bennabi, A. Abrupt cooling over the North Atlantic in modern climate models. Nat. Commun. 8, 14375 (2017).
Swingedouw, D. et al. On the risk of abrupt changes in the North Atlantic subpolar gyre in CMIP6 models. Ann. N. Y. Acad. Sci. 1504, 187–201 (2021).
Caesar, L., McCarthy, G. D., Thornalley, D. J. R., Cahill, N. & Rahmstorf, S. Reply to: Atlantic circulation change still uncertain. Nat. Geosci. 15, 168–170 (2022).
Sutton, R. T. Climate science needs to take risk assessment much more seriously. Bull. Am. Meteor. Soc. 100, 1637–1642 (2019).
Swingedouw, D. et al. Early warning from space for a few tipping points in physical, biological and socialecological systems. Surv. Geophys. 41, 1237–1284 (2020).
Valesco, J. A. et al. Synergistic impacts of global warming and thermohaline circulation collapse on amphibians. Commun. Biol. 4, 141 (2021).
Acknowledgements
This study benefited from the IPSL ProdiguerCiclad and Camelot supercomputing facilities, supported by CNRS, UPMC Labex LIPSL. This work was also sponsored by NWO Exact and Natural Sciences for the use of SurfSARA supercomputing facilities (Amsterdam) under project 2020.022. Authors were also funded by EU‐H2020 TiPES (Grant no. 820970, S.M., contribution no. 164), Blue Action (Grant Agreement no. 727852, D.S., G.G, G.M., and J.M.), EUCP (Grant Agreement no 776613, D.S. and J.M.), ROADMAP (Grant Agreement no. 116020, J.M., G.M., and G.G.) ARCHANGE (ANR‐18‐MPGA‐0001, J.M. and G.G.) research programmes.
Author information
Authors and Affiliations
Contributions
S.M. has performed the different reconstructions, statistical analysis, figures, and the pseudoproxy experiment of this study. S.M. has mainly written the manuscript with the important participation of D.S., J.M., P.O., and G.G. D.S., J.M., P.O., G.G., M.K., and G.M. have contributed to the results assessments, the manuscript writing, and have made suggestions to set up the manuscript’s guiding thread.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks MaritSolveig Seidenkrantz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Michel, S.L.L., Swingedouw, D., Ortega, P. et al. Early warning signal for a tipping point suggested by a millennial Atlantic Multidecadal Variability reconstruction. Nat Commun 13, 5176 (2022). https://doi.org/10.1038/s41467022327043
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467022327043
This article is cited by

Warning of a forthcoming collapse of the Atlantic meridional overturning circulation
Nature Communications (2023)

Reply to: Evidence lacking for a pending collapse of the Atlantic Meridional Overturning Circulation
Nature Climate Change (2023)

Reliability of vegetation resilience estimates depends on biomass density
Nature Ecology & Evolution (2023)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.