Introduction

Changes in the Earth’s orbit around the Sun, which alter the amount of insolation received on Earth, are responsible for the glacial–interglacial cycles observed throughout the Pleistocene1,2,3,4. Sometime between 1.2 and 0.8 million years ago (Ma), the glacial–interglacial cyclicity changed from low-amplitude ~40 thousand year (ka) cycles to the current high-amplitude ~100 ka cycles5,6,7,8,9. What triggered this change of periodicity during this interval, referred to as the Mid-Pleistocene Transition (MPT), is one of the most intriguing unsolved questions regarding Pleistocene climate.

While orbital forcing is unanimously recognized as the driver of the Pleistocene climatic cycles, no obvious variation of the main orbital parameters is observed over the MPT7,10,11. Hence, internal causes in the Earth’s climatic system have been proposed to explain this major transition from a ~40 ka to a ~100 ka cycle world. Based on the description of these physical mechanisms8,12,13, we propose to divide them into two main groups, one group including what we will refer to hereafter as abrupt mechanisms, and one group gathering more gradual mechanisms.

Hypotheses associated with an abrupt change refer to mechanisms involving a threshold beyond which a change occurs and the system cannot return to its previous state. Such mechanisms are assumed not to operate throughout the whole Pleistocene but to be specific to the time interval in which the transition is triggered. Two hypotheses have been investigated so far:

  1. (i)

    The MPT would result from the merging of the previously separated Laurentide and Cordilleran ice sheets. This merging would provoke a threshold response of the ice-sheet volume14,15. The merging of these two ice sheets during glaciations would have led to a rapid increase in global ice volume without any change in the climatic forcing14, while their separation during deglaciations would have accelerated their retreat15. Such a mechanism, dependent on the North American topography, was likely not triggered during the Early Pleistocene as the ice sheet never reached the required size14.

  2. (ii)

    The removal of the thick sediment layer called regolith beneath ice sheets is also mentioned as a possible cause of the MPT. The change of substratum from regolith to a high-friction crystalline Precambrian shield bedrock would have reduced the basal velocity of the ice sheet, making it more stable. This would have allowed the build-up of larger ice sheets associated with longer glacial periods after the MPT13,16,17. Most studies testing this hypothesis are based on a prescribed regolith without morpho-geological and erosional constraints and categorize this regolith change as abrupt. Estimates of subglacial erosion rates make the hypothesis of a gradual removal of regolith over the entire Pleistocene unlikely18.

Hypotheses involving a gradual mechanism correspond to a progressive increase or decrease of a physical climatic parameter throughout the whole Pleistocene. One of the most frequently proposed parameters is the global atmospheric CO2 concentration, which would have decreased progressively over the Pleistocene16,17,19,20. This hypothesis remains challenging to test, because continuous atmospheric CO2 records inferred from Antarctic ice cores only go as far back as 0.8 Ma21. Prior to 0.8 Ma, atmospheric CO2 records have been derived from blue ice22,23, continental sediments24, or marine sediments25,26,27. However, they are discontinuous, associated with large uncertainties and/or poorly resolved. Still, in the existing CO2 records, no decrease over the Pleistocene is found except when considering only the minimum values of atmospheric CO2 reached at the end of glacial periods23,25,26.

Other studies invoke changes in Arctic sea ice extent as a mechanism explaining the ~100 ka cycles of the Late Pleistocene28,29. A large sea ice extent would have triggered a positive feedback that maintained the Earth’s climate under full glacial conditions. This mechanism is directly linked to the ocean surface temperature: below a given temperature threshold, Arctic sea ice would expand28,29. A gradual and continuous drop in Arctic temperature would hence be required to explain the appearance of ~100 ka cycles only during the Late Pleistocene21.

A different approach based on more explicit modelling has been used to investigate the potential causes of the MPT30,31,32. This more physical approach makes it possible to test scenarios that involve a change in the physics of the ice sheet itself before and after the MPT. This bifurcation of the terrestrial parameters of the ice sheets is suspected to play a role in the MPT31.

Several models successfully reproduce the MPT using these different hypotheses16,19,30,31,32,33,34. However, it remains difficult to evaluate and compare their relevance as they do not always test the same physical mechanism. They also cover a wide range of complexity, from simple conceptual models to multi-dimensional ice models, with or without carbon cycle feedbacks20. Hence, the controlling factors of the MPT, and in particular the role played by atmospheric CO2 changes, remain one of the biggest unknowns in paleoclimate sciences. In this context and under the umbrella of the International Partnership on Ice Core Sciences (IPICS), the international ice-core community is currently conducting several ice core drilling projects aiming at recovering a continuous Antarctic ice core to provide the first direct atmospheric CO2 reconstructions across the MPT35. It is crucial to conduct, in parallel, new modelling studies investigating the physical processes at the origin of the MPT.

In our study, we use a conceptual model to reproduce the global ice volume variations over the past 2 Ma. Instead of focusing on the nature of the triggering mechanism, we propose to consider its temporal structure as the determining criterion, regardless of the underlying physics. Hence, we investigate whether the MPT was triggered by physical processes affecting the climate system at a specific time interval or, alternatively, more gradually over the course of the Pleistocene. Conceptual models have already proven their efficiency in modelling the Pleistocene climate before and after the MPT19,36,37,38. The advantage of such simple models is that they enable clear identification of the influence of each new input when added to the initial version of the model. For instance, ref. 39 evaluates the influence of different insolation forcings on the reproduction of global ice volume changes over the Pleistocene. A family of conceptual models considers only the orbital forcing to reproduce global ice volume changes over the Pleistocene36,37,38. Two of these models apply a prescribed insolation as input36,37 while others are forced using a linear combination of the three orbital variables38. They successfully reproduce the triggering of deglaciation over the past 1 Ma. In particular, the model of ref. 38 pointed out the different roles of obliquity and precession during specific Terminations. To simulate global ice volume variations, these conceptual models are all fitted to a reference curve derived from paleo-reconstruction using a number of model parameters.

These model parameters modulate the relative influence of the three orbital parameters and of the ice volume itself on the resulting global ice volume computation. They also influence the glaciation and deglaciation thresholds, which control the switch in the model from a “glaciation” state to a “deglaciation” state, and vice versa. Conceptual models forced only with orbital forcing reproduce the glacial–interglacial global ice volume variations over the past 1 Ma in terms of amplitude and frequency19,37,38. However, the change in the amplitude and frequency of glacial–interglacial cycles during the MPT cannot be simulated unless the model parameters are changed19,39. Note that ref. 36 reproduces the change in frequency without changing the model parameters. However, that study uses a detrended benthic foraminifera δ18O stack40 as a reference curve for global ice volume changes, which prevents discussing the change in amplitude of the glacial–interglacial cycles during the MPT36.

Ref. 38 successfully reproduces the global ice volume for the last 1 Ma based only on orbital forcing. The model switches between two states, glaciation and deglaciation, following a threshold mechanism related to the orbital forcing and the modelled global ice volume itself. However, when the model is run over the past 2 Ma, the MPT cannot be reproduced without changing model parameters. Such a limitation could reveal the impossibility of reproducing the MPT with these model equations when considering only the orbital forcing. Alternatively, it could highlight the inefficiency of the method used in that study to find a combination of parameters that reproduces the MPT.

Here we use a zero-dimensional conceptual model that calculates changes in global ice volume over time as a function of orbital forcing. Terminations are achieved by introducing a strongly negative term into the mass balance once a certain ice volume threshold is crossed. This new conceptual model is derived from the conceptual model of ref. 38. Our model differs in the exploration of the parameter space as we developed a more efficient inverse method to find the best-fit parameters (see the “Methods” section). Also, our new model reproduces global ice volume variation over the past 2 Ma. We enlarge the simulated time interval compared to previous studies in order to integrate the entire period of the MPT (~1.2–0.8 Ma) as well as multiple pre-MPT 40 ka glacial–interglacial cycles. We thus test the hypothesis of two modes of climate before and after the MPT and our results should be considered under this hypothesis. Alternative simple models have developed another approach implying a change of climate physics through the MPT30,31.

Different forcing hypotheses for the MPT are tested using three simulations performed with our conceptual model in order to compare the relevance of each mechanism:

  1. (i)

    The ORB simulation uses only orbital forcing parameters as inputs.

  2. (ii)

    The ABR simulation is based on an internal abrupt forcing, in addition to the external forcing. The abrupt forcing is designed as follows: the deglaciation threshold, which is the ice volume value that the model has to reach to initiate the deglaciation, differs before and after the timing of the MPT, which is optimally determined by the inverse method. It conceptually reproduces an abrupt MPT as the deglaciation is facilitated before the MPT.

  3. (iii)

    The GRAD simulation is based on external forcing plus an internal gradual forcing. In this simulation, the gradual forcing is designed so that the deglaciation threshold varies linearly over the last 2 Ma. This deglaciation threshold is thus different at each time step of the model. Contrary to the ABR simulation, no specific timing can be considered to represent the MPT.
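
The three treatments of the deglaciation threshold described above can be sketched as simple functions of age. This is a minimal illustration only: the function names and the numerical threshold values are hypothetical placeholders, not the fitted parameters of the model. The 2 Ma span and the 1220 ka switch time are the values reported in the text; everything else is an assumption made for the example.

```python
import numpy as np

def v0_orb(age_ka, v0=100.0):
    """ORB: constant deglaciation threshold (placeholder value)."""
    age = np.asarray(age_ka, dtype=float)
    return np.full_like(age, v0)

def v0_abr(age_ka, v0_before=60.0, v0_after=120.0, t_switch_ka=1220.0):
    """ABR: step change of the threshold at t_switch_ka.

    Ages older than the switch use v0_before, younger ages use v0_after.
    Both threshold values are placeholders.
    """
    age = np.asarray(age_ka, dtype=float)
    return np.where(age > t_switch_ka, v0_before, v0_after)

def v0_grad(age_ka, v0_start=60.0, v0_end=120.0, t_start_ka=2000.0):
    """GRAD: threshold varying linearly from t_start_ka to the present."""
    age = np.asarray(age_ka, dtype=float)
    elapsed_fraction = (t_start_ka - age) / t_start_ka
    return v0_start + elapsed_fraction * (v0_end - v0_start)

# Ages from 2000 ka to the present in 1 ka steps (hypothetical time step).
ages = np.arange(2000.0, -1.0, -1.0)
thresholds = {
    "ORB": v0_orb(ages),
    "ABR": v0_abr(ages),
    "GRAD": v0_grad(ages),
}
```

In this sketch a lower threshold before the transition stands for a deglaciation that is easier to trigger, which is how the text describes the pre-MPT regime.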

In the following, we show that all three simulations reproduce the MPT as seen in the global ice volume reconstruction from ref. 41 with different degrees of success (see the “Methods” section). Based on our results, we investigate the relative role of internal versus external forcing in the triggering of the MPT. Finally, we discuss our model results in the context of the different climate mechanisms proposed to explain the triggering of the MPT.

Results

The MPT in the ORB simulation

No modelling of the internal climate system is accounted for in the ORB simulation, as reflected by the 13 model parameters being kept constant throughout the last 2 Ma (see the “Methods” section). Hence, the global ice volume variations modelled in the ORB simulation offer a basis to investigate the full role of the external forcing on the MPT (Fig. 1). To evaluate the degree of correlation of the modelled global ice volume variations to the reconstructions, we calculate the average of the absolute values of the residuals over the past 2 Ma (Fig. 2). Residuals correspond to the difference between the modelled value and the observed value at the same time (see the “Methods” section). A positive or negative residual implies an underestimation or an overestimation, respectively, of the modelled ice volume compared to the reconstructed ice volume. Average residuals for 400 ka windows were computed for each of the three simulations (Supplementary Table 1). These values are used to investigate the presence of any systematic bias in a specific time window. The modelled global ice volume from the ORB simulation is relatively well correlated with the global ice volume reconstruction for the last 2 Ma, with an average absolute value of residuals of 18.1 m and a coefficient of determination R2 equal to 0.44 (Fig. 1, Supplementary Fig. 1). The 11 terminations of the last 1 Ma are reproduced (Supplementary Fig. 2). All nine periods with residuals exceeding 30 m correspond to a modelled global ice volume that is overestimated during interglacial periods (Fig. 2). Before 1 Ma, our model does not reproduce some existing Terminations, i.e. Termination (T) X and TXXII, whereas TIII is artificially doubled (Fig. 1; Supplementary Fig. 2).
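
The fit statistics used in this paragraph can be sketched as follows. This is an illustrative sketch rather than the code used in the study. Two points are assumptions: the residual is taken as observed minus modelled, so that a positive value means the model underestimates the reconstruction, and R2 is computed as one minus the ratio of residual to total sum of squares. The function names are hypothetical.

```python
import numpy as np

def fit_metrics(modelled, observed):
    """Return residuals, mean absolute residual and R2.

    Assumption: residual = observed - modelled, so a positive residual
    corresponds to an underestimation by the model.
    """
    modelled = np.asarray(modelled, dtype=float)
    observed = np.asarray(observed, dtype=float)
    residuals = observed - modelled
    mean_abs = np.mean(np.abs(residuals))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return residuals, mean_abs, r2

def window_means(age_ka, residuals, width_ka=400.0, span_ka=2000.0):
    """Mean residual in consecutive windows of width_ka covering span_ka."""
    age = np.asarray(age_ka, dtype=float)
    res = np.asarray(residuals, dtype=float)
    out = []
    for lo in np.arange(0.0, span_ka, width_ka):
        hi = lo + width_ka
        # The oldest window also includes its upper edge.
        upper = age <= hi if hi >= span_ka else age < hi
        out.append((lo, hi, res[(age >= lo) & upper].mean()))
    return out

# Toy values in metres, not data from the study.
toy_observed = [10.0, 20.0, 30.0, 40.0]
toy_modelled = [12.0, 18.0, 33.0, 35.0]
toy_residuals, toy_mean_abs, toy_r2 = fit_metrics(toy_modelled, toy_observed)
```

A window mean close to zero indicates no systematic bias in that interval, whereas a consistently positive or negative mean flags under- or overestimation.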

Fig. 1: Simulated and reconstructed global ice volume variations (in metre sea level equivalent, msl) over the last 2 Ma.

The ORB (a, black curve), ABR (b, blue), and GRAD (c, green curve) simulations are superimposed onto the reconstructed global ice volume (red dashed curves) from ref. 41. For each simulation, the grey curve represents the state of the model, deglaciation (1) or glaciation (0) and the evolution of a key model parameter is indicated as a dashed black line. In the ORB simulation, the v0 parameter is constant throughout the past 2 Ma (dashed black line). In the ABR simulation, the evolution of v0 can be represented as a step change (dashed black line) between two different values. The optimal timing of the abrupt change in v0 is at 1220 ka. In the GRAD simulation, the v0 parameter linearly increases through time over the past 2 Ma.

Fig. 2: Residuals between the modelled global ice volume and the reconstructed global ice volume changes.

The reconstructed global ice volume is from ref. 41. ORB (a), ABR (b) and GRAD (c) simulations. Yellow and red areas are time intervals where the model-data deviation exceeds 30 m.

The largest modelled deglaciations occur after 1 Ma, with amplitudes exceeding 70 m, whereas no modelled deglaciation exceeds 70 m before 1 Ma (Supplementary Fig. 2). The model reduces the contrast of the amplitude changes before and after the MPT. The periodicity of the modelled deglaciation onsets increases gradually over the past 2 Ma. Residuals are negative before 1 Ma and positive after 1 Ma (Supplementary Table 1). Still, this is the first time that a simulation from a conceptual model using only orbital forcing can partially reproduce the MPT in terms of both amplitude and frequency without any changes in the model parameters. Two sensitivity tests were performed to confirm the robustness of the MPT modelled in the ORB simulation. First, a “model parameter sensitivity” test indicates that the ORB simulation is mainly sensitive to the parameters controlling the switch from the glaciation to the deglaciation mode (Eq. (6), Supplementary Note 1, Supplementary Fig. 3). Second, a simulation with a synthetic orbital forcing (SYN simulation) was performed. This synthetic forcing is an orbital forcing from which all natural orbital frequencies except the dominant ones (41 ka for obliquity and 21 ka for precession and phase-shifted precession) are removed. The simulation shows that ~100 ka cycles are not reproduced when using only the dominant forcing frequencies (Supplementary Note 1, Supplementary Fig. 4). This result highlights the important role played by the long-term orbital forcing in the MPT of the ORB simulation.
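
One way to picture such a synthetic forcing is as a set of pure sinusoids at the dominant periods. The sketch below is only an assumption about how a forcing of this kind could be built; the text does not state how the other frequencies were removed. The function name, unit amplitudes and phases are hypothetical.

```python
import numpy as np

def synthetic_forcing(age_ka, obliquity_period_ka=41.0, precession_period_ka=21.0):
    """Return (esi, eco, ob) as pure sinusoids of unit amplitude.

    esi and eco mimic precession and phase-shifted precession (a sine and
    a cosine at the same period); ob mimics obliquity. Amplitudes and
    phases are placeholders.
    """
    t = np.asarray(age_ka, dtype=float)
    esi = np.sin(2.0 * np.pi * t / precession_period_ka)
    eco = np.cos(2.0 * np.pi * t / precession_period_ka)
    ob = np.cos(2.0 * np.pi * t / obliquity_period_ka)
    return esi, eco, ob

ages_ka = np.arange(0.0, 2001.0, 1.0)
esi, eco, ob = synthetic_forcing(ages_ka)
```

Because each component has a constant amplitude, the slow modulation of the precession envelope is absent by construction, which is the property the SYN test relies on.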

The MPT in the ABR simulation

The ABR simulation accounts for both external and internal forcing. In the ABR simulation, the v0 parameter, which is the ice volume threshold value responsible for triggering deglaciation, varies abruptly at a specific time. This specific time is optimally inferred by our inverse method and corresponds to 1220 ka (see the “Methods” section for additional information). The modelled global ice volume is highly correlated with the reconstructed one from ref. 41 for the last 2 Ma. The average absolute value of residuals is 14.4 m and the coefficient of determination R2 is equal to 0.64 (Fig. 1; Supplementary Figs. 1 and 5). Terminations over the past 2 Ma are all reproduced by the model. Over 1.2–1.6 Ma, the model mainly underestimates the global ice volume (Supplementary Table 1). In particular, the slightly higher global ice volume maxima from the paleo-reconstruction are not simulated (Fig. 2).

Our ABR simulation reproduces the MPT in terms of amplitude and frequency. However, it systematically overestimates glacial maxima before the MPT and underestimates them after the MPT (Supplementary Table 1). It is likely that a longer transition in the v0 parameter is required to avoid a systematic temporal bias between the simulated and the reconstructed ice volume changes.

Two additional ABR-type simulations were performed based on abrupt changes of the kO and αg parameters in order to test the sensitivity of our results to the choice of the variable parameter (Supplementary Note 1, Supplementary Figs. 6, 7, Supplementary Table 2). These simulations have higher residual values (15.7 and 16.5 m) than the ABR simulation and only partially reproduce the MPT (Supplementary Tables 5, 6). These results support treating the v0 parameter as the main driver of the MPT.

The MPT with the GRAD simulation

The GRAD simulation also aims at simulating the MPT using both external and internal forcing. However, the internal forcing is modelled by a gradual rather than abrupt change that modulates how difficult it is to trigger a deglaciation. Here, the v0 parameter influencing the switch from glaciation to deglaciation mode varies linearly through time (see the “Methods” section). The average absolute value of residuals is 13.9 m and the coefficient of determination R2 is equal to 0.67 (Fig. 1, Supplementary Fig. 1), which are, respectively, the lowest and highest values of the three simulations. The GRAD simulation reproduces all terminations but TXXII over the past 2 Ma (Fig. 3). No deglaciation has an amplitude greater than 60 m before 900 ka (Fig. 3). In contrast to the ORB and ABR simulations, there is no systematic bias of overestimation or underestimation of global ice volume (Fig. 2). The residuals are centred around zero over the five windows of 400 ka (Supplementary Table 1). To sum up, the GRAD simulation reproduces the MPT both in terms of amplitude and frequency, it has the smallest average residual of the three simulations (13.9 m), and no systematic bias is identified over the past 2 Ma.

Fig. 3: Temporal structure of glacial–interglacial cycles simulated by the GRAD model.

Each green stick corresponds to a deglaciation state. The height of the green stick represents the amplitude of the deglaciation in metre sea level equivalent. The width of each green stick and the associated green number indicate the duration of each deglaciation state. Grey roman numerals are termination names. The interval between two consecutive orange dots and the associated orange number correspond to the time interval between two deglaciation onsets. The grey shaded area corresponds to a period of lower amplitude and higher frequency than the white area.

As for the ABR-type simulations, the sensitivity of the model results to other parameters was tested. Two additional GRAD-type simulations were performed based on the gradual change of the kO and αg parameters (Supplementary Note 1, Supplementary Figs. 6, 7, Supplementary Table 2). These simulations have higher residual values (15.4 and 16.7 m) than the GRAD simulation and only partially reproduce the MPT (Supplementary Tables 5, 6). Hence, for this set of simulations too, the results support considering the v0 parameter, i.e. the deglaciation threshold, as the main driver of the MPT.

Spectral analyses of the three simulations

Next, we perform spectral analyses on the modelled global ice volume curves from the three simulations to study the relative amplitude of the 100 and 40 ka signals before and after the MPT, using the Discrete Fourier Transform (DFT) method with a Fast Fourier Transform (FFT) algorithm42. The results of the spectral analysis of the three modelled global ice volume curves are compared to the results of the spectral analysis obtained on the global ice volume reconstruction (Fig. 4).
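
As an illustration of this kind of analysis, the sketch below computes a power spectrum with NumPy's FFT routines on a toy signal that combines a 100 ka and a 40 ka cosine. The signal, its amplitudes, the 1 ka sampling step and the function name are assumptions chosen for the example; they are not the model output or the exact processing used in the study.

```python
import numpy as np

def power_spectrum(signal, dt_ka=1.0):
    """Return (frequency in 1/ka, power) of an evenly sampled series.

    The mean is removed first and the zero-frequency bin is dropped.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freq = np.fft.rfftfreq(x.size, d=dt_ka)
    return freq[1:], power[1:]

# Toy series: 1000 samples at 1 ka spacing (placeholder for a 1 Ma window).
t_ka = np.arange(0.0, 1000.0, 1.0)
toy_signal = 2.0 * np.cos(2.0 * np.pi * t_ka / 100.0) \
    + 1.0 * np.cos(2.0 * np.pi * t_ka / 40.0)

freq, power = power_spectrum(toy_signal)
dominant_period_ka = 1.0 / freq[np.argmax(power)]
# Periods of the two strongest spectral peaks, sorted in increasing order.
top_two_periods_ka = np.sort(1.0 / freq[np.argsort(power)[-2:]])
```

Comparing the power at the two peaks, window by window, is the kind of comparison made between the 1–0 Ma and 2–1 Ma intervals.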

Fig. 4: Spectral analyses on the simulated and reconstructed global ice volume changes.

Spectral analyses for the 1–0 Ma (a) and 2–1 Ma (b) time intervals for the ORB (black), ABR (blue), and GRAD (green) simulations and the global ice volume reconstruction of ref. 41 (red).

Performed over the last 1 Ma, the spectral analyses of the three global ice volume simulations highlight the dominant periodicities at 100 and 40 ka, as well as a third dominant periodicity at 23 ka, also registered in the global ice volume reconstruction. A higher power for the 100 ka peak compared to the 40 ka peak is observed for the GRAD simulation, which prevails also in the power spectrum for the ice volume reconstructions. This is different from the spectral analyses of the ABR and ORB global ice volume simulations that exhibit 100 and 40 ka peaks of similar power over the last 1 Ma. When performed over the 2–1 Ma time interval, the spectral analysis of the global ice volume reconstruction highlights a 40 ka peak and no peak at 100 ka. Such a pattern is well reproduced in the spectral analyses of the GRAD and ABR simulated global ice volume variations. In contrast, the power spectrum for the ORB simulated global ice volume underlines the existence of a peak at 200 ka in addition to the larger 40 ka peak (Fig. 4).

In summary, the spectral analyses of our modelled global ice volume reconstructions confirm that the three simulations reproduce the MPT with different degrees of success, with the GRAD simulation performing the best.

Quantifying the relevance of the simulations

When calculated over the past 2 Ma, the average residuals of the ORB, ABR and GRAD simulations are 18.1, 14.4 and 13.9 m, using 13, 15 and 14 parameters, respectively (Supplementary Table 3). The GRAD simulation fits the global ice volume reconstruction better than the ABR simulation, while also having fewer degrees of freedom as fewer model parameters are used. Hence, the additional parameter of the GRAD simulation compared to the ORB simulation appears more relevant than the two parameters added in the ABR simulation. To investigate the relative relevance of the three simulations in a quantitative way, we use an objective criterion named the Bayesian Information Criterion (BIC; see Methods). The BIC quantifies the relevance of one model against another43,44,45. This criterion is based on the number of independent data, the number of parameters of the model, and the mean of residuals (see the “Methods” section). The ΔBICij is the difference between the BIC of model j and the BIC of model i. If this value is positive, it is an argument in favour of model j. Several degrees of confidence are defined as a function of the value of this positive number (see Methods). If the ΔBIC is greater than 10, the confidence is considered as very strong46.
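
The comparison logic can be sketched as below. The exact BIC expression is given in the Methods and is not reproduced here, so this example assumes a common Gaussian-error form built from the mean squared residual, with the sign chosen so that a higher score is better and a positive ΔBICij favours model j, as stated in the text. The function names and all numerical inputs are hypothetical.

```python
import math

def bic_score(mean_sq_residual, n_data, n_params):
    """BIC-like score, higher is better (assumed sign convention).

    Assumed form: -(n * ln(mean squared residual) + k * ln(n)),
    with n independent data points and k model parameters.
    """
    return -(n_data * math.log(mean_sq_residual)
             + n_params * math.log(n_data))

def delta_bic(score_i, score_j):
    """Difference BIC_j - BIC_i; a positive value favours model j."""
    return score_j - score_i

def is_very_strong(delta):
    """Threshold quoted in the text: a difference above 10 is very strong."""
    return delta > 10.0

# Hypothetical inputs: 100 independent data points, model j fits better
# (smaller mean squared residual) at the cost of one extra parameter.
score_i = bic_score(mean_sq_residual=400.0, n_data=100, n_params=13)
score_j = bic_score(mean_sq_residual=300.0, n_data=100, n_params=14)
toy_delta = delta_bic(score_i, score_j)
```

The second term penalizes each added parameter by ln(n), so an extra parameter is only rewarded when it reduces the residuals enough to offset that penalty.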

The ΔBIC values calculated for the GRAD and ABR simulations against the ORB simulation, 61.7 and 49.6 respectively, indicate very strong support (Table 1). This result highlights the relevance of modelling an internal physical parameter of the Earth system in addition to the orbital forcing to fully reproduce the MPT. In addition, the ΔBIC of the GRAD simulation relative to the ABR simulation is also strong, i.e. 12.1 (Table 1), confirming that the GRAD simulation is more relevant than the ABR simulation.

Table 1 BIC difference ∆BICij = BICj–BICi for the ORB, GRAD and ABR simulations.

Discussion

Previous studies exploring the cause of the MPT dismiss the hypothesis that this transition was solely driven by orbital forcing7,11,20,26,47. They consider that the pre-MPT climate responds linearly to orbital forcing whereas the climate after the MPT is decoupled from orbital forcing through an internal mechanism of the Earth’s climatic system48,49. This hypothesis is also supported by an abrupt transition identified in several records from natural climatic archives8,12,13. In addition, the fact that no simple conceptual model had so far reproduced this transition using solely the orbital forcing as input argued for excluding orbital forcing as a potential driver of the MPT. Ref. 36 modelled the MPT only with external forcing inputs and without changing parameters. However, the model is not compared to the original climatic signal. Instead, it is compared to a detrended version of the LR04 benthic foraminifera δ18O composite curve50. In that study, the signal between 2 and 1 Ma is normalized and centred on the mean and maximum amplitude of the signal of the last 1 Ma. This approach makes it possible to study the model performance regarding the change in the frequency of glacial–interglacial cycles across the MPT and shows that it is well simulated in their model. However, such a strategy prevents investigating the change in the amplitude of the glacial–interglacial cycles.

Our conceptual model is the first to reproduce, at least partially, the change in both the frequency and the amplitude of glacial–interglacial cycles observed in paleodata across the MPT without including forcing from the internal climate system or changing the model parameters (ORB simulation). Our sensitivity analysis shows that the most sensitive mechanism behind this simulation is the shift from glaciation to deglaciation mode, i.e. the trigger of the deglaciation (Supplementary Note 1, Supplementary Fig. 3, Supplementary Table 4). This result confirms the crucial role of the trigger of deglaciations in driving the Pleistocene climate.

Ref. 51 has already shown that an internal component of the Earth system could be driven by million-year-scale variations of orbital parameters. A non-linear response to these million-year-scale variations in orbital forcing may have played a role in this transition, as suggested by our SYN simulation, and this deserves future dedicated investigation (Supplementary Fig. 4). While the dominant frequencies (21 and 41 ka) are able to reproduce the pre-MPT climate, ~100 ka cycles appear to require a long-term modulation of external forcing. This long-term (hundreds of thousands to a million years) cyclicity in the orbital records might explain the minor role attributed to orbital forcing in previous studies focusing on the specific time period of the MPT. Still, we show that the ORB simulation globally overestimates the amplitude of the glacial–interglacial cycles before the MPT and underestimates it after the MPT (Fig. 1, Supplementary Fig. 2). Thus, an internal component of the climate system must be coupled with this external forcing to fully explain the MPT.

In the GRAD and ABR simulations, the modulation of the deglaciation threshold conceptually includes a forcing from the internal climate system. The variation in the value of the deglaciation threshold across the MPT, either abrupt or gradual, implies that it becomes more difficult to initiate a deglaciation during the Late Pleistocene than during the Early Pleistocene. Although the nature of the change in the internal climate that is modelled by the change in the deglaciation threshold is not directly constrained by our conceptual model, we can rank the relevance of the different proposed mechanisms of the MPT based on their temporal structure.

Erosion of the regolith beneath the Laurentide and Fennoscandian ice sheets is regularly invoked as a likely cause of the MPT7,13,16. Under this hypothesis, the MPT would have begun at the time when all the regolith was removed. Ice sheets would then have lain on hard crystalline bedrock, reducing basal velocity and allowing ice sheets to be thicker and more stable7. This hypothesis focuses on a specific time period for the transition to happen, which could be linked to the clear transition identified in some paleoclimatic records (ref. 8: 900 ka; ref. 13: between 950 and 860 ka). This mechanism also implies a direct response to the climate, where a binary change provokes an abrupt transition. However, our results show that, despite a major improvement in the model performance when adding a threshold occurring at a specific time, this solution is less relevant than adding a gradual trend to the initial orbital forcing.

The hypothesis of a gradual forcing in the internal climatic system to explain the MPT has been investigated before. For instance, changes in the atmospheric CO2 concentrations have been proposed as a possible driver of the MPT16,19,20,25. Indeed, a progressive decrease of the atmospheric CO2 concentrations would gradually cool the Earth20. Ref. 16 proposes that atmospheric CO2 is a possible driver of the MPT using a model coupling climate, ice sheets and carbon cycle. However, that study shows that an additional mechanism, i.e. the regolith removal, is necessary to fully describe the MPT. A characteristic of that study is that the CO2 trend over the Pleistocene is not constrained by CO2 data but is only suggested by the linear decrease of an unspecified parameter. Ref. 26 investigates the role of CO2 in the MPT, but as a consequence rather than as a possible trigger of the MPT. It proposes that a change in the ice sheet dynamics would cause an increase in dustiness. This would lead to a fertilization of the Southern Ocean and a consequent decrease in global CO2 concentrations.

The scarcity of atmospheric CO2 reconstructions from natural archives before 800 ka makes the assumption of an atmospheric CO2 concentration decrease throughout the Pleistocene challenging to test. Still, existing CO2 reconstructions from boron isotopes of marine sediment cores reach values above 400 ppm during the late Pliocene (2.3–3.3 Ma). This corresponds to CO2 concentrations higher than those measured in ice cores for the last 800 ka52. During the Pleistocene, available data suggest relatively constant CO2 levels characterizing interglacials prior to, during and after the MPT, while during glacial maxima a gradually decreasing trend in CO2 levels is observed through time23,25,26. As our model is in favour of a gradual change, we hypothesize that an atmospheric CO2 decrease during glacial times would be enough to provoke a gradually increasing stability of ice sheets during glacial periods, making them less sensitive to insolation variation after the MPT. The ease of the deglaciation trigger would be progressively reduced. Under these conditions, each obliquity cycle would not lead to deglaciation as in the pre-MPT world. Some obliquity cycles would be missed and deglaciations would be triggered through a specific obliquity and precession configuration, artificially generating ~100 ka cycles. The decreasing parameter in our gradual conceptual model could be linked to atmospheric CO2. This would create a stochastic and non-linear post-MPT climate, contrary to the obliquity-driven pre-MPT climate.

To conclude, our study tests conceptual hypotheses (e.g. external or internal forcing, abrupt or gradual mechanism) regarding the causes of the MPT and evaluates their ability to reproduce global ice volume reconstructions over the past 2 Ma. While our phenomenological model clearly favours a gradual trigger of the MPT, the involvement of a change in atmospheric CO2 concentrations remains a hypothesis that requires direct testing. In particular, it would be valuable for complementary physically based approaches relying on more complex53 or more explicit31 modelling to investigate our findings further, so that the physical mechanism behind the equations can be identified directly. With the deep drilling of a new ice core in Antarctica, the ongoing European Beyond EPICA-Oldest Ice project aims to provide new climate and environmental records back to 1.5 Ma, and in particular the first continuous millennial-scale atmospheric CO2 record throughout the MPT54,55. The climate and environmental records from this new ice core should provide key information to advance our understanding of the mechanisms responsible for the MPT. Understanding carbon cycle-climate interactions during this major past climatic transition will tighten the constraints on the long-term response of the Earth system to future greenhouse gas emissions.

Methods

Conceptual model formulation

The model used in our study is similar in formulation to the previously published conceptual model of Quaternary climate of ref. 38. These conceptual models aim to reproduce the global ice volume \(v\), i.e. sea level variations, with only orbital forcing parameters as input19,37,38,39. A linear combination of three orbital parameters, Esi = e sin ω (precession, with ω the precession angle taken from the vernal equinox), Eco = e cos ω (phase-shifted precession) and Ob = ε (obliquity), can represent insolation at most latitudes and seasons3.

The model is composed of two states representing the climatic system, the glaciation state “g” and the deglaciation state “d”. Each state is described by a linear equation:

$$\mathrm{g}:\ \frac{\mathrm{d}v}{\mathrm{d}t}=-\alpha_{\mathrm{Esi}}\,\mathrm{Esi}_{\mathrm{tr}}(t)-\alpha_{\mathrm{Eco}}\,\mathrm{Eco}_{\mathrm{tr}}(t)-\alpha_{\mathrm{O}}\,\mathrm{Ob}(t)+\alpha_{\mathrm{g}}$$
(1)
$$\mathrm{d}:\ \frac{\mathrm{d}v}{\mathrm{d}t}=-\alpha_{\mathrm{Esi}}\,\mathrm{Esi}_{\mathrm{tr}}(t)-\alpha_{\mathrm{Eco}}\,\mathrm{Eco}_{\mathrm{tr}}(t)-\alpha_{\mathrm{O}}\,\mathrm{Ob}(t)+\alpha_{\mathrm{d}}-\frac{v}{\tau_{\mathrm{d}}}$$
(2)

where \(\tau_{\mathrm{d}}\) is the relaxation time in ka, and \(\alpha_{\mathrm{Esi}}\), \(\alpha_{\mathrm{Eco}}\) and \(\alpha_{\mathrm{O}}\) are constant parameters in m ka−1 that set the relative weights of the three orbital forcings. \(\alpha_{\mathrm{d}}\) and \(\alpha_{\mathrm{g}}\) are the speeds of deglaciation and glaciation, respectively, in m ka−1. Ob is the obliquity normalized to zero mean and unit variance, and \(\mathrm{Esi}_{\mathrm{tr}}\) and \(\mathrm{Eco}_{\mathrm{tr}}\) are calculated, respectively, from Esi and Eco, the precession parameters normalized to zero mean and unit variance, using a truncation function:

$$\mathrm{if}\ x\le 0:\quad f(x)=x+\sqrt{4a^{2}+x^{2}}-2a$$
(3)
$$\mathrm{if}\ x>0:\quad f(x)=x$$
(4)

where a is a constant with a = 1.06587 from ref. 37. This truncation is similar to the one used by ref. 19.
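As an illustration, the truncation function of Eqs. (3) and (4) can be sketched in a few lines of Python. The function name `truncate` is an assumption made for this example and does not come from the original model code; only the constant a = 1.06587 is taken from the text.

```python
import math

A = 1.06587  # constant a quoted in the text (from ref. 37)


def truncate(x: float, a: float = A) -> float:
    """Truncation function f(x) of Eqs. (3)-(4).

    Positive values pass through unchanged; negative values are
    compressed and tend towards -2a as x becomes very negative.
    """
    if x <= 0:
        return x + math.sqrt(4 * a**2 + x**2) - 2 * a
    return x


if __name__ == "__main__":
    for x in (-100.0, -3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"f({x:7.1f}) = {truncate(x):.4f}")
```

The two branches join continuously at x = 0, and the negative branch is bounded below by −2a, so the normalized precession parameters are left untouched when positive and damped when negative.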

The model can switch from one state to the other if a threshold ice volume is exceeded by a linear combination of the three orbital parameters (plus the ice volume for the g to d transition):

$$\mathrm{g}\,\mathrm{to}\,\mathrm{d}:\ k_{\mathrm{Esi}}\,\mathrm{Esi}(t)+k_{\mathrm{Eco}}\,\mathrm{Eco}(t)+k_{\mathrm{Ob}}\,\mathrm{Ob}(t)+v>v_{0}\quad(\mathrm{and}\ k_{\mathrm{Esi}}\,\mathrm{Esi}(t)+k_{\mathrm{Eco}}\,\mathrm{Eco}(t)+k_{\mathrm{Ob}}\,\mathrm{Ob}(t)\ge v_{1})$$
(5)
$$\mathrm{d}\,\mathrm{to}\,\mathrm{g}:\ k_{\mathrm{Esi}}\,\mathrm{Esi}(t)+k_{\mathrm{Eco}}\,\mathrm{Eco}(t)+k_{\mathrm{Ob}}\,\mathrm{Ob}(t)<v_{1}\quad(\mathrm{and}\ k_{\mathrm{Esi}}\,\mathrm{Esi}(t)+k_{\mathrm{Eco}}\,\mathrm{Eco}(t)+k_{\mathrm{Ob}}\,\mathrm{Ob}(t)+v\le v_{0})$$
(6)

Here, \(v_{0}\) is the ice volume above which the model switches to the deglaciation mode and \(v_{1}\) is the ice volume below which the model switches to the glaciation mode. These two parameters are limiting values that define the range in which the ice volume is allowed to vary during the simulation. Physically, they can be interpreted as the maximum and minimum sizes of the ice sheet under full glacial and deglacial conditions, respectively.
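The switching rules of Eqs. (5) and (6) can be sketched as a small state-update function. This is a minimal reading of the equations in which the parenthesised conditions are treated as additional requirements; the function name `next_state`, its argument order and the numerical values in the usage lines are hypothetical and chosen only for illustration.

```python
def next_state(state, esi, eco, ob, v, k_esi, k_eco, k_ob, v0, v1):
    """Return the model state ("g" or "d") after applying Eqs. (5)-(6).

    esi, eco, ob : normalized orbital parameters at the current time
    v            : current ice volume
    v0, v1       : thresholds for the g-to-d and d-to-g transitions
    """
    forcing = k_esi * esi + k_eco * eco + k_ob * ob
    if state == "g" and forcing + v > v0 and forcing >= v1:
        return "d"  # Eq. (5): glaciation to deglaciation
    if state == "d" and forcing < v1 and forcing + v <= v0:
        return "g"  # Eq. (6): deglaciation to glaciation
    return state    # no threshold crossed: stay in the same state


if __name__ == "__main__":
    # Hypothetical coefficients and thresholds, not the fitted values.
    params = dict(k_esi=1.0, k_eco=1.0, k_ob=10.0, v0=105.0, v1=5.0)
    print(next_state("g", 0.0, 0.0, 1.0, 100.0, **params))   # large ice sheet, high obliquity
    print(next_state("d", 0.0, 0.0, -1.0, 20.0, **params))   # small ice sheet, low obliquity
```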

This initial version of the conceptual model, whose equations are similar in formulation to those of ref. 37, is referred to as the ORB simulation in the following.

Variants of the orbital conceptual model

To test the hypothesis of an internal climatic forcing, we develop two variants of the ORB simulation: (i) the ABR simulation and (ii) the GRAD simulation. These models are similar to the ORB simulation, except for the formulation of the g to d transition.

(i) For the ABR simulation, we use a distinct value of \(v_{0}\) depending on whether t is after or before a threshold age T:

If t > T:

$$v_{0}=v_{0\mathrm{b}}$$
(7)

If t < T:

$$v_{0}=v_{0\mathrm{a}}$$
(8)

where t is the discretized time, and \(v_{0\mathrm{a}}\) and \(v_{0\mathrm{b}}\) are two values of \(v_{0}\) determined by our inverse method.

(ii) For the GRAD simulation, we add a term Ct to the \(v_{0}\) parameter, where C is an additional parameter, which continuously lowers the ice volume threshold over the time period of the model:

$$v_{0}(t)=v_{0}(t=0)+Ct$$
(9)

where t is the discretized time.
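The two threshold formulations of Eqs. (7) to (9) can be written as two short functions. This is only a sketch: the function names and the numerical values are hypothetical, and the behaviour at exactly t = T, which Eqs. (7) and (8) leave unspecified, is assigned to the \(v_{0\mathrm{a}}\) branch here as an assumption.

```python
def v0_abr(t, T, v0_a, v0_b):
    """ABR threshold, Eqs. (7)-(8): v0b if t > T, v0a if t < T.

    The case t == T is not specified in the equations; it is assigned
    to v0a here as an assumption.
    """
    return v0_b if t > T else v0_a


def v0_grad(t, v0_initial, C):
    """GRAD threshold, Eq. (9): linear drift of v0 with time."""
    return v0_initial + C * t


if __name__ == "__main__":
    # Hypothetical values for illustration only (t in ka, volumes in m).
    for t in (0.0, 500.0, 1000.0, 1500.0, 2000.0):
        print(t, v0_abr(t, T=1000.0, v0_a=120.0, v0_b=80.0),
              v0_grad(t, v0_initial=100.0, C=-0.01))
```

The ABR variant therefore adds two parameters to the ORB model (T and a second value of \(v_{0}\)), while the GRAD variant adds only one (C).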

Finally, the orbital, gradual and abrupt models comprise 13, 14 and 15 parameters, respectively (Supplementary Table 3). We solve the evolution of v over the last 2 Ma using a Runge–Kutta 4th order method with a time step of 1000 yr for the three models.
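For readers unfamiliar with the integration scheme, a classical 4th-order Runge–Kutta step can be sketched as follows. The example integrates the deglaciation equation (Eq. 2) with the orbital terms set to zero, so that the result can be compared with the analytical relaxation solution. The values of `ALPHA_D` and `TAU_D` are hypothetical, time runs forward in ka in this sketch, and the code is not the authors' implementation.

```python
import math


def rk4_step(f, t, v, dt):
    """Advance v by one classical 4th-order Runge-Kutta step of size dt."""
    k1 = f(t, v)
    k2 = f(t + dt / 2, v + dt * k1 / 2)
    k3 = f(t + dt / 2, v + dt * k2 / 2)
    k4 = f(t + dt, v + dt * k3)
    return v + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6


# Hypothetical parameter values, for illustration only.
ALPHA_D = -2.0  # deglaciation speed, m/ka
TAU_D = 10.0    # relaxation time, ka
DT = 1.0        # time step of 1000 yr, expressed in ka


def dvdt_deglaciation(t, v):
    """Right-hand side of Eq. (2) with all orbital terms set to zero."""
    return ALPHA_D - v / TAU_D


def integrate(v_start, n_steps):
    """Integrate the ice volume forward over n_steps steps of length DT."""
    v, t = v_start, 0.0
    for _ in range(n_steps):
        v = rk4_step(dvdt_deglaciation, t, v, DT)
        t += DT
    return v


def exact(v_start, t):
    """Analytical solution of dv/dt = alpha_d - v / tau_d."""
    v_eq = ALPHA_D * TAU_D
    return v_eq + (v_start - v_eq) * math.exp(-t / TAU_D)


if __name__ == "__main__":
    print(integrate(100.0, 20), exact(100.0, 20.0))
```

In the full model, the right-hand side would additionally include the time-dependent orbital terms and the state switching of Eqs. (5) and (6) would be checked at each step.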

Use of the global ice volume reconstruction from Berends et al. (2021)41

We fit the model to the global ice volume reconstruction of ref. 41, which is based on the LR04 marine benthic foraminifera δ18O stack50. Using this global ice volume reconstruction as a reference could induce two main biases in the model-data comparison. Firstly, ref. 41 deconvolves the initial isotopic signal of the LR04 stack, which is influenced by both global ice volume and deep-ocean temperature. The global ice volume is quantified through a coupled model of the Northern Hemisphere ice sheets and ocean temperatures41. Secondly, the chronology of the LR04 stack is partially built using an orbital tuning-based dating method50. However, using a global ice volume reconstruction instead of an isotopic record provides a dimensional, quantified model-data deviation in terms of sea level equivalent. Additionally, the isotopic signals from distinct oceanic basins have shown asynchronous variations at the onset of deglaciations56,57. Such asynchronicity is also observed during the MPT, due to the diversity of statistical tools used to analyse the data and of the criteria used to determine the onset of the MPT7,20.

Note that we also fit the model to the global ice volume reconstructions from refs. 16,58. These ice volume reconstructions exhibit some differences from the more recent one of ref. 41; in particular, the amplitude of glacial–interglacial cycles appears larger. Using these two alternative records does not affect the conclusions of our study, i.e. the GRAD simulation is still the most appropriate to reproduce the MPT. We are aware of the limitations associated with a modelled reconstruction based on stacked benthic δ18O records, but it has the advantage of providing information at a global scale, i.e. the global ice volume changes. Indeed, other paleorecords covering the MPT are mainly interpreted as representative of local climate and/or environmental changes8.

Model fit to the global ice volume reconstructions using a Monte Carlo method

As in ref. 37, we use a random walk based on the Metropolis algorithm59,60 to select the most probable experiments. In this study, we improve the selection of the best-fit parameters by using a random walk with N walkers, implemented with the emcee Python module61. The previously developed methodology37 is a simplified case of this method with a single walker. Our method improves the selection of the best parameters because it explores the parameter space more efficiently61. For each model, we perform 100,000 experiments using a random walk with 30 walkers and extract the best-fit vector of parameters (Supplementary Table 3).
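To make the principle of the random walk concrete, the sketch below implements the single-walker Metropolis case on a toy problem (fitting a straight line to synthetic data). It is not the ensemble sampler used in this study: the toy misfit function, the proposal step size, the number of steps and all names are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def chi2(theta, xs, ys, sigma=1.0):
    """Toy misfit: chi-square of a straight line y = a*x + b."""
    a, b = theta
    return sum(((y - (a * x + b)) / sigma) ** 2 for x, y in zip(xs, ys))


def metropolis(xs, ys, start, n_steps=20000, step=0.1, seed=0):
    """Single-walker Metropolis random walk; returns the best sample found."""
    rng = random.Random(seed)
    current = list(start)
    current_chi2 = chi2(current, xs, ys)
    best, best_chi2 = list(current), current_chi2
    for _ in range(n_steps):
        # Gaussian proposal around the current position.
        proposal = [p + rng.gauss(0.0, step) for p in current]
        proposal_chi2 = chi2(proposal, xs, ys)
        # Accept with probability min(1, exp(-(chi2_new - chi2_old) / 2)).
        delta = proposal_chi2 - current_chi2
        if delta <= 0 or rng.random() < math.exp(-delta / 2):
            current, current_chi2 = proposal, proposal_chi2
            if current_chi2 < best_chi2:
                best, best_chi2 = list(current), current_chi2
    return best, best_chi2


# Synthetic, noise-free data generated with a = 2 and b = 1.
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
best, best_chi2 = metropolis(xs, ys, start=(0.0, 0.0))

if __name__ == "__main__":
    print("best-fit parameters:", best, "chi2:", best_chi2)
```

An ensemble approach runs many such walkers in parallel and uses their relative positions to build proposals, which is what makes the exploration of a 13 to 15 parameter space more efficient than a single chain.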

Model comparison method

In order to evaluate the relative relevance of each of the three simulations, we compare them using the Bayesian Information Criterion (BIC). This criterion quantifies the evidence in favour of one model against another43,44. It is expressed as follows:

$$\mathrm{BIC}=-2\ln L(\theta)+K\ln N$$
(10)

where N is the number of independent data points, K the number of model parameters and ln L(θ) is the maximum log-likelihood, defined here as the χ2. We estimate N as 200, which corresponds to one independent data point every 10 ka, i.e. approximately at each ice volume extremum.

To compare two models directly, we compute ΔBICij, the BIC difference between the two models:

$$\Delta\mathrm{BIC}_{ij}=\mathrm{BIC}_{j}-\mathrm{BIC}_{i}$$
(11)

The ΔBIC can be interpreted directly as follows: the evidence in favour of model j over model i is weak if 0 < ∆BICij < 2, positive if 2 < ∆BICij < 6, strong if 6 < ∆BICij < 10, and very strong if ∆BICij > 10 (42).
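The arithmetic of Eqs. (10) and (11) can be illustrated with a short sketch. It assumes Gaussian errors, so that −2 ln L equals χ2 up to an additive constant that cancels in ΔBIC; the function names, the treatment of values falling exactly on a category boundary, and the "none" label for non-positive differences are assumptions made for this example.

```python
import math

N_INDEPENDENT = 200  # one independent data point every 10 ka over 2 Ma


def bic(chi2, n_params, n_points=N_INDEPENDENT):
    """Eq. (10), taking -2 ln L as the chi-square misfit (assumption)."""
    return chi2 + n_params * math.log(n_points)


def delta_bic(bic_i, bic_j):
    """Eq. (11): BIC difference between models j and i."""
    return bic_j - bic_i


def evidence_strength(delta):
    """Map a positive delta-BIC onto the categories quoted in the text.

    Boundary values (exactly 2, 6 or 10) are assigned to the upper
    category, and "none" is returned for delta <= 0; both choices are
    assumptions of this sketch.
    """
    if delta <= 0:
        return "none"
    if delta < 2:
        return "weak"
    if delta < 6:
        return "positive"
    if delta < 10:
        return "strong"
    return "very strong"


if __name__ == "__main__":
    # Penalty paid for each additional parameter when N = 200.
    print("penalty per parameter:", math.log(N_INDEPENDENT))
```

With N = 200, each additional parameter adds ln 200 ≈ 5.3 to the BIC, so a 14- or 15-parameter model must reduce the misfit term by more than this amount per extra parameter to obtain a lower BIC than the 13-parameter model.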

Calculating the average residuals between simulated and reconstructed ice volume change

Here we investigate the influence of the high number of degrees of freedom of our model on the ability of the three simulations to fit the reconstructed global ice volume. To do so, we quantify the residuals for our three simulations and also perform an additional “test” simulation. The average residuals correspond to the average difference between simulated and reconstructed global ice volume. The average standard deviation of the ice volume reconstruction is 24.8 m. If our model were the simplest possible, with no degree of freedom, i.e. a constant value of ice volume, the model-data residual would thus be 24.8 m. In comparison, the residuals for the ORB, ABR and GRAD simulations are 18.1, 14.4 and 13.9 m, respectively. One could question whether the ability of our model to represent the MPT is only due to its high number of degrees of freedom (from 13 to 15 parameters) and the sinusoidal nature of the input forcing. To test this, we evaluate whether the ORB residual value of 18.1 m represents a significant improvement compared to the 24.8 m estimate. We run a “test” simulation over the past 2 Ma using the orbital forcing parameters corresponding to the 4–2 Ma interval. With such “inappropriate” forcing we obtain a residual value of 20.4 m. This illustrates that the inverse method used in our model can reduce the model-data mismatch by about 4 m. However, the orbital forcing from the correct period (2–0 Ma) is required to further reduce the residual value to 18.1 m. This result illustrates that the significant reduction of the model-data mismatch in the ORB simulation is due to the use of the orbital forcing corresponding to the appropriate time period (2–0 Ma) rather than to the high number of degrees of freedom of our model.
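The link between the constant-model residual and the standard deviation of the reconstruction can be checked numerically. The sketch below assumes that the average residual is computed as a root-mean-square difference, which is one reading consistent with the statement that a constant model yields the standard deviation; the function name and the short data series are hypothetical.

```python
import math
import statistics


def rms_residual(simulated, reconstructed):
    """Root-mean-square difference between two equally long series."""
    if len(simulated) != len(reconstructed):
        raise ValueError("series must have the same length")
    n = len(simulated)
    return math.sqrt(
        sum((s - r) ** 2 for s, r in zip(simulated, reconstructed)) / n
    )


# Hypothetical ice volume series in metres of sea level equivalent.
reconstructed = [0.0, 20.0, 60.0, 110.0, 40.0, 10.0]

# "Simplest possible" model: a constant equal to the mean of the data.
mean_value = sum(reconstructed) / len(reconstructed)
constant_model = [mean_value] * len(reconstructed)

if __name__ == "__main__":
    print("constant-model residual:", rms_residual(constant_model, reconstructed))
    print("population std. dev.:   ", statistics.pstdev(reconstructed))
```

Under this reading, any simulation with a residual below the standard deviation of the reconstruction performs better than a constant ice volume.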

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.