## Introduction

When the Earth warms, high latitudes often warm more than low latitudes, a phenomenon commonly known as “high latitude amplification”. The past, present, and future of high latitude amplification have been studied extensively using climate models and paleoclimate data, but the pattern and magnitude of amplification remain uncertain and are areas of great concern due to the impacts of future warming on sea ice decline, atmospheric circulation changes and extreme weather, and carbon release associated with permafrost melting1,2,3,4. In particular, debate continues with regard to the relative roles of albedo, water vapor, and lapse rate feedbacks in determining high latitude amplification, with the pattern of warming itself potentially affecting the sensitivity of the climate system to radiative forcing due to changes in greenhouse gas concentrations5,6. Exploring time intervals with and without significant polar ice may provide key insights into elucidating these different feedbacks4,7,8. Investigating high latitude amplification over the past 10 million years (Myr) of Earth’s history when the Northern Hemisphere evolved from a largely ice-free condition to one with major continental glaciers and sea ice, therefore, provides invaluable insights into the causes and expressions of high latitude amplification7,9. Of particular interest is how the pattern of warming behaves on different timescales and how that informs our understanding of future warming patterns with implications for global temperatures10.

Quantitative sea surface temperature (SST) reconstructions provide a basis for establishing temperature gradients and for determining high latitude amplification in the geological past8. To reconstruct SSTs, the algal biomarker-based alkenone unsaturation index $\mathrm{U}_{37}^{\mathrm{K}'}$ is a well-validated, accurate proxy11. It has been widely used to establish middle to high-latitude SST records, with existing records covering most of the late Miocene (since ~12 million years ago, Ma) to the present12,13. However, the $\mathrm{U}_{37}^{\mathrm{K}'}$ index reaches its maximum value of 1 at about 29 °C14, limiting its application in warm regions, especially the Western Pacific Warm Pool (WPWP), defined by the 28.5 °C isotherm in the modern ocean. Being the warmest and largest surface water body, the WPWP dominates global oceanic heat transport15 and also drives major atmospheric circulations such as the latitudinal Hadley Cell and longitudinal Walker Cell16. Historical data and climate simulations show that the ascent area of the WPWP exerts the dominant control on the global radiative feedback17.

The temperatures of the WPWP are critical since they define the equator-to-pole gradients and high latitude amplification, as this region represents the endmember with the warmest SSTs for determining energy budgets1. The deficiencies of the $\mathrm{U}_{37}^{\mathrm{K}'}$ proxy have required reliance on other proxies, such as the biomarker-based TEX86 proxy and the foraminiferal Mg/Ca-based proxy, to determine the evolution of the WPWP, with conflicting results. For instance, when applied to Ocean Drilling Program (ODP) Sites 806 (0°19.1′N, 159°21.7′E) and 1143 (9°21.72′N, 113°17.11′E, Fig. 1), TEX86 revealed a long-term cooling18,19, in contrast to marine carbonate-based approaches (e.g., Mg/Ca and Δ47) that suggested the WPWP was thermally stable since the Pliocene (~5 Ma)20,21. This has become a contentious issue, with questions being raised about the veracity of the proxies used, the corrections applied and the sites chosen22,23. Thus, ambiguity with regard to the thermal evolution of the WPWP remains and currently limits our understanding of the evolution of Pacific temperature gradients and high latitude amplification in particular.

Here we present multi-proxy ($\mathrm{U}_{37}^{\mathrm{K}'}$, TEX86, Mg/Ca), multi-site (U1488, 806) SST records of the WPWP for the past 10 Myr, improving the spatial coverage within the WPWP and providing a comprehensive view of the evolution of ocean temperature in this important region. These data, together with published SSTs from the extratropics of the Pacific Ocean, are then used to evaluate the pattern of ocean temperature changes relative to the WPWP, the result of which is also compared with climate model outputs.

## Results and discussions

### SST records of the WPWP

The $\mathrm{U}_{37}^{\mathrm{K}'}$- and TEX86-derived SSTs are based on sediments of Site U1488 recovered by the recent International Ocean Discovery Program (IODP) Expedition 363, which provides unique material to study the evolution of the central part of the WPWP since the Miocene. Samples from Site U1488 (02°02.59′N, 141°45.29′E, Fig. 1), located on the Eauripik Rise north of Papua New Guinea, were subjected to biomarker analyses and SST estimates, supplemented by planktonic foraminiferal Mg/Ca measurements from the late Miocene samples of ODP Site 806 (12.5–5.8 Ma) (Supplementary Table 1). The total organic carbon content of sediments from Site U1488 is low overall, averaging ~0.2 wt%24. We, therefore, extracted lipids from large samples (~20–60 g), which helped to obtain sufficient alkenones and glycerol dialkyl glycerol tetraethers (GDGTs) for reliable $\mathrm{U}_{37}^{\mathrm{K}'}$ and TEX86 determinations (Supplementary Fig. 1). Site 806 is situated on the Ontong Java Plateau (Fig. 1), with abundant planktic foraminifera Trilobatus sacculifer present in the late Miocene section, which was used to extend the existing Mg/Ca record of the Plio-Pleistocene20.

The application of paleothermometers, particularly on older sediments, has to be accompanied by thorough evaluations of the limitations and caveats of each proxy. For example, the distribution of GDGTs in sediment samples could be influenced by non-thermal effects, which would invalidate the application of TEX86 for SST reconstructions. We, therefore, screened the GDGT data using a series of criteria including ring index, methane index, branched and isoprenoid tetraether index, GDGT-0/crenarchaeol, %GDGT-0, and %GDGT-2 (see the section “Methods”, Supplementary Fig. 2). For Site U1488, the majority of the data (89%) passed these rigorous tests and were therefore included in our SST estimates. Several recent studies have suggested that on a global scale, the TEX86 signal derives from the surface or within the top 200 m of the water column, validating its usage as a surface or shallow subsurface ocean temperature proxy25,26. This is especially likely to be true in the WPWP with upper thermocline depths of 100–200 m27. At Site U1488, the GDGT-2/GDGT-3 ratio (used to detect GDGTs with a deep-water origin28,29) is low overall (<6.1), supporting the use of TEX86 as an SST proxy. Finally, we used a Bayesian spatially varying regression (BAYSPAR) calibration30 to convert TEX86 to SSTs, taking into account the small paleogeographic changes of the sites.

On multi-million-year timescales, the Mg/Ca paleothermometer is complicated by the potential for carbonate diagenesis and long-term seawater Mg/Ca (Mg/Casw) variations. The diagenetic impact on trace metals is not fully understood, although nanoscale secondary ion mass spectrometry results suggest that foraminiferal Mg/Ca is relatively robust31. However, numerous lines of evidence, such as halite-hosted fluid inclusions32, calcium carbonate veins formed on mid-ocean ridge flanks33, sediment core pore-fluid profiles34, and biogenic carbonates35 have shown that for the past 10 Myr, Mg/Casw has increased substantially and therefore has to be considered when applying the Mg/Ca thermometry.

Here we applied a geochemical model-derived Mg/Casw scenario by Stanley and Hardie36 to adjust for the influence of varying Mg/Casw on the Mg/Ca paleothermometry (Fig. 2b), given that this simulated Mg/Casw agrees well with the proxy data and is available for the entire studied interval (Supplementary Fig. 3). This numerically modeled result was obtained by accounting for the steady-state mixing of riverine input and mid-ocean ridge hydrothermal brines. Other modeled or proxy-based Mg/Casw scenarios (Supplementary Fig. 3), such as back-calculating Mg/Casw using biomarker-derived SSTs and planktonic foraminiferal Mg/Ca37, and fitting a regression to all proxy data38 and to all but fossil coral data39, were also used to compute Mg/Ca-derived SSTs (Fig. 2b). These different Mg/Casw scenarios yield different SST histories for Site 806 (Fig. 2b). Nonetheless, it is critical to point out that any Mg/Casw correction of Mg/Ca-derived SSTs results in a decrease of Site 806 temperatures from the late Miocene to the Pliocene and Pleistocene, in stark contrast with the uncorrected SSTs, which do not show a clear trend over the last 10 Myr (Fig. 2b).
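The direction of the Mg/Casw correction can be illustrated with a minimal sketch. It assumes one commonly used form of the correction, in which the exponential Mg/Ca-temperature calibration is scaled by a power of the ratio of past to modern Mg/Casw. The function name, the calibration constants `a` and `b`, the exponent `h`, and the modern Mg/Casw value are illustrative assumptions and are not taken from this study.

```python
import math

MODERN_MGCA_SW = 5.2  # mol/mol; assumed modern seawater Mg/Ca (illustrative)


def mgca_to_sst(mgca_test, mgca_sw, a=0.09, b=0.38, h=1.0):
    """Invert Mg/Ca_test = b * (mgca_sw / modern)**h * exp(a * T) for T (deg C).

    a, b : assumed exponential calibration constants (hypothetical defaults)
    h    : assumed power-law sensitivity of test Mg/Ca to seawater Mg/Ca
    """
    sw_factor = (mgca_sw / MODERN_MGCA_SW) ** h
    return math.log(mgca_test / (b * sw_factor)) / a


# Same measured Mg/Ca, interpreted with modern vs. lower past seawater Mg/Ca
measured = 4.0  # mmol/mol, hypothetical foraminiferal value
sst_uncorrected = mgca_to_sst(measured, MODERN_MGCA_SW)
sst_corrected = mgca_to_sst(measured, 4.0)  # hypothetical lower past Mg/Casw
```

Under these assumptions, a lower past Mg/Casw always raises the reconstructed temperature for a given measured Mg/Ca, which is why any correction for rising Mg/Casw makes the older part of a record warmer relative to the uncorrected estimate.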

Data generated during this study and existing SST records from Sites U1488, 806, and 114318,19 in the central, eastern, and western parts of the WPWP provide a unique opportunity to broadly evaluate the temperature changes of the warm pool over the past 10 Myr. The excellent agreement between the independent proxies from the same site (i.e., Site 806, Fig. 2c and Supplementary Fig. 4) and between sites that are thousands of kilometers apart (Fig. 2c) demonstrates that these approaches are individually robust. Together, they unambiguously show that the WPWP was warmer during the late Miocene–Pliocene than at present, and document a secular cooling since 10 Ma (Fig. 2c), despite the scatter in the SST data from ~2 Ma towards the present associated with large glacial-interglacial temperature variations. $\mathrm{U}_{37}^{\mathrm{K}'}$-SSTs from Site U1488 show a range of values for the most recent 3 Myr, but they are mostly pinned to the maximum value between 10 and 3 Ma (Fig. 2a), similar to the previous results from Sites 806 and 114318,40. This qualitatively supports the notion of a warmer than present WPWP during the Mio-Pliocene. These results contradict the “permanent El Niño” theory20, which argues that the WPWP was “stable” since the early Pliocene. In that view, a “stable” WPWP along with a much warmer cold tongue in the Eastern Equatorial Pacific (EEP) (Supplementary Fig. 5) led to nearly absent zonal temperature gradients across the equatorial Pacific during the Pliocene, resembling modern “El Niño” events. However, the zonal SST gradients calculated using our WPWP SST record were never below 1 °C over the past 10 Myr (Supplementary Fig. 6), which does not support a permanent El Niño-like mean climate state but allows for time intervals with a smaller gradient than today’s. Perhaps more importantly5, our results confirm that the WPWP does respond to global warming during the Pliocene and Miocene12,13,18, in agreement with the predictions of theory and climate models, with important implications for future climates41.

### Pacific SSTs, SST gradients, and high latitude amplification

Here we evaluate the meridional temperature gradients and high latitude amplification of the Pacific Ocean with our WPWP temperature estimates and published SST records from the EEP and extra-tropics (Fig. 1, Supplementary notes 1, 2, Supplementary Table 2 and Supplementary Figs. 5, 7 and 8). This was achieved through a novel approach that normalizes the magnitude from paleoclimate data to a metric that enables direct comparison with modern climate and climate modeling without having to consider the changes in the temporal domain. The procedure is as follows:

1. Stack SSTs from the WPWP, middle latitudes (30°–50°) and high latitudes (>50°) separately using a time-binning approach, to yield regionally averaged SST changes (Supplementary Fig. 9, see the section “Methods”). This treatment shows that, similar to the WPWP, middle latitude and high latitude SSTs also exhibit an overall decline since 10 Ma (Supplementary Figs. 9, 10), but with transient deviations from this secular trend (Supplementary Figs. 8, 9).

2. Use these stacked data to calculate high latitude amplification over the past 10 Myr. To do this, we ordinated the SST data in all regions by high latitude SST (Fig. 3), rather than by time as is customary. This allows us to focus on how temperatures elsewhere vary as a function of high latitude SSTs, with the understanding that high latitudes respond to a variety of possible forcings, such as greenhouse gas, ice sheet, and volcanic forcing, and that these forcings are therefore implicitly represented when SST data from other regions are ordinated by high latitude SST. Specifically, the WPWP and mid-latitude temperatures and the temperature differences from the WPWP to high latitudes are compared directly against high latitude temperatures (Fig. 3), since the SST data were binned every 200 thousand years (kyr) and then stacked for the WPWP, mid-latitude, and high-latitude regions (Supplementary Fig. 9).
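The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the figures: the function names are hypothetical, each bin is a simple unweighted mean, and a regional stack is taken as the plain average of whatever sites have data in a given 200 kyr bin.

```python
import numpy as np


def bin_record(ages_ma, sst, bin_width=0.2, t_max=10.0):
    """Average one site's SSTs in fixed-width age bins; empty bins are NaN."""
    ages_ma = np.asarray(ages_ma, dtype=float)
    sst = np.asarray(sst, dtype=float)
    n_bins = int(round(t_max / bin_width))
    edges = np.linspace(0.0, t_max, n_bins + 1)
    idx = np.digitize(ages_ma, edges) - 1  # bin index of each sample
    out = np.full(n_bins, np.nan)
    for k in range(n_bins):
        in_bin = sst[idx == k]
        if in_bin.size:
            out[k] = in_bin.mean()
    return out


def stack_region(binned_sites):
    """Average binned records across sites, skipping sites without data."""
    arr = np.vstack(binned_sites)
    counts = np.sum(~np.isnan(arr), axis=0)
    sums = np.nansum(arr, axis=0)
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)


def ordinate_by(high_lat, other):
    """Pair two regional stacks bin by bin and sort by high latitude SST."""
    high_lat = np.asarray(high_lat, dtype=float)
    other = np.asarray(other, dtype=float)
    keep = ~np.isnan(high_lat) & ~np.isnan(other)
    order = np.argsort(high_lat[keep])
    return high_lat[keep][order], other[keep][order]
```

Sorting by high latitude SST discards the time axis on purpose: two bins that are millions of years apart but share a similar high latitude temperature end up adjacent, which is what allows the paleoclimate data to be compared with model output from unrelated time intervals.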

High latitude amplification is clearly expressed by the decrease in meridional SST gradients (WPWP minus high latitude) with increasing high latitude temperatures (Fig. 3). This is consistent with observations of the more distant Mesozoic and early Cenozoic greenhouse climates, which are characterized by reduced meridional gradients42. The high latitude amplification factor can be further quantified by a weighted linear regression (York regression43) of WPWP temperatures against high latitude temperatures over the last 10 Myr (Fig. 3); the amplification factor is the reciprocal of the regression slope (1/slope). It should be noted that our definition of Pacific “high latitude amplification” is different from the more commonly used “polar” or “Arctic amplification”, which cannot be constrained here since polar SSTs with a timespan and resolution equivalent to our Pacific records are currently unavailable.
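As an illustration of how such an amplification factor could be computed, the sketch below implements an iterative York-type regression for the simplified case of uncorrelated errors in x and y, then takes the reciprocal of the slope. The function names are hypothetical, the zero error correlation is an assumption, and the first-order propagation of the slope uncertainty to the amplification factor is only one possible choice; the fitting details used in this study may differ.

```python
import numpy as np


def york_fit(x, y, sx, sy, tol=1e-12, max_iter=100):
    """Straight-line fit with errors in both variables (uncorrelated errors).

    Returns intercept, slope, and the standard error of the slope.
    """
    x, y, sx, sy = (np.asarray(v, dtype=float) for v in (x, y, sx, sy))
    wx, wy = 1.0 / sx**2, 1.0 / sy**2
    b = np.polyfit(x, y, 1)[0]  # ordinary least squares as a starting slope
    for _ in range(max_iter):
        w = wx * wy / (wx + b**2 * wy)
        xbar = np.sum(w * x) / np.sum(w)
        ybar = np.sum(w * y) / np.sum(w)
        u, v = x - xbar, y - ybar
        beta = w * (u / wy + b * v / wx)
        b_new = np.sum(w * beta * v) / np.sum(w * beta * u)
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            break
    # Recompute weights and means with the final slope
    w = wx * wy / (wx + b**2 * wy)
    xbar = np.sum(w * x) / np.sum(w)
    ybar = np.sum(w * y) / np.sum(w)
    beta = w * ((x - xbar) / wy + b * (y - ybar) / wx)
    x_adj = xbar + beta  # least-squares adjusted x values
    u_adj = x_adj - np.sum(w * x_adj) / np.sum(w)
    sigma_b = np.sqrt(1.0 / np.sum(w * u_adj**2))
    return ybar - b * xbar, b, sigma_b


def amplification_factor(high_lat, wpwp, s_high_lat, s_wpwp):
    """Amplification = 1/slope of WPWP SST regressed on high latitude SST."""
    _, slope, sigma_slope = york_fit(high_lat, wpwp, s_high_lat, s_wpwp)
    # First-order error propagation for f = 1/slope (an assumption)
    return 1.0 / slope, sigma_slope / slope**2
```

For example, if WPWP SST changes by 0.5 °C for every 1 °C change in high latitude SST, the slope is 0.5 and the amplification factor is 2.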

The Pacific high latitude amplification is calculated to be 2.42 ± 0.64 (1σ), determined by the WPWP and high latitude data from North Pacific sites residing between 50°N to 55°N (Fig. 4b). If the SST data from the EEP are used instead of the WPWP, this high latitude amplification factor is 1.81 ± 0.52 (Supplementary Fig. 11). Given that the upwelled waters in the EEP carried the thermal signature of extratropical waters44, our following discussions of high latitude amplification are all relative to the WPWP.

Temperature gradient and amplification changes between the middle latitude and WPWP provide a test to examine whether our high latitude amplification determinations are biased by the availability of sites and data. Middle latitude sites with SSTs covering the last 10 Myr are available from both the North and South Pacific. Our results show that the warming in the middle latitudes was also amplified relative to the WPWP, with the middle latitude amplification factor of 1.57 ± 0.19 for the North Pacific, and 1.38 ± 0.09 for the entire Pacific (Supplementary Fig. 12). Over the past 10 Myr, the middle latitude amplification factors are smaller than high latitude amplification, consistent with our understanding that the amplification of warming is most pronounced in the high latitude regions (Fig. 3, Supplementary Fig. 12). The determination of middle latitude amplification lends independent support for the robustness of our analyses of the Pacific temperature patterns.

Over the last 10 Myr, the Earth has experienced pronounced climate changes. Atmospheric CO2 levels have varied between ~450 and 180 ppm45, and the equator-to-pole temperature gradient changed by more than 7 °C12 (Supplementary Fig. 9). In addition, ocean gateways changed and ocean circulation shifted36, and the cryosphere evolved substantially, with sea ice first appearing in the North Pacific at ~3 Ma46, followed by the initiation of major continental glaciation around 2.7 Ma47. Regardless of these changes, the high latitude amplification in the Pacific has remained constant, which argues for a robust physical mechanism constraining this relationship in the face of all these boundary condition changes over 10 Myr.

The ice-albedo feedback is often regarded as a principal mechanism for high latitude amplification48,49. A substantial increase in ice-rafted debris was reported at our high-latitude Sites 882 and 887 around 2.75–2.7 Ma46,50, indicating the onset of significant Northern Hemisphere glaciation at this time. These sites used in our high-latitude SST compilation have been impacted by sea ice and ice sheets since ~2.75 Ma, and therefore can be used to evaluate the influence of the appearance of the Northern Hemisphere cryosphere on the magnitude of the high latitude amplification. Direct comparison between high latitude amplification before and after Northern Hemisphere glaciation (~2.7 Ma) is hampered by the small number of available high-latitude SSTs between 2.8 Ma and the present. We, therefore, opted for comparing the high latitude amplification factor for the Northern Hemisphere ice-free period (10–2.8 Ma) with that for the entire studied interval (10–0 Ma, Fig. 4).

The amplification factor was estimated to be 3.39 ± 0.84 (1σ) in the NH ice-free world (10–2.8 Ma) and larger than that of the entire 10 Myr (2.42 ± 0.64, Fig. 4b). This precludes the ice-albedo feedback as the primary driver of the high latitude amplification, consistent with the model results showing amplified polar warming without changes in snow and sea ice cover7,9. Processes other than surface albedo feedback, such as lapse rate and Planck feedbacks51,52, therefore are more likely to govern this amplification. However, it should be noted that this conclusion cannot be directly extrapolated to “Arctic amplification” since again, our analyses are restricted to the 50°N to 55°N North Pacific due to the lack of a comparable dataset from the Arctic Ocean. Nonetheless, the robustness of this Pacific high latitude amplification in paleoclimate data suggests it is caused by an intrinsic property of the climate system, which leads us to hypothesize that this high latitude amplification response pattern should show up in climate models. This hypothesis is tested, and implications are drawn in the following sections.

### High latitude amplification: reconstructions vs. models

With the well-defined patterns of SST changes described above, we can now compare them against climate simulations. We focus here on the National Center for Atmospheric Research (NCAR) Community Earth System Model (CESM) since it is an Intergovernmental Panel on Climate Change (IPCC)-class model participating in the Coupled Model Intercomparison Project Phases 5 and 6, and it has been widely applied in paleoclimate studies. The recent versions of CESM have shown promise in reproducing the Pliocene53 and early Eocene54 climates with a lower meridional temperature gradient than today. CESM run for preindustrial and modern conditions reproduce modern SST distributions (Fig. 5). It similarly produces SSTs and SST gradients for the equilibrium mid-Pliocene paleoclimate simulations, and for long equilibrated future higher CO2 simulations, that lie well within the range defined by our paleoclimate data (Fig. 5, Supplementary Figs. 13, 14 and Supplementary note 3).

In terms of Pacific high latitude amplification, the Neogene data and the 5300-year transient coupled CESM simulations55 lie along the same regression lines, and these slopes are equivalent to that obtained from the mid-Pliocene CESM1.2 simulation (Fig. 5, Table 1). The broad agreement between the Pacific amplification patterns on timescales from millennia to tens of millions of years suggests that the millennial-scale climatic processes operating towards past or future equilibrium climates, such as coupled atmospheric and oceanic heat transport56, generate similar amplification in the North Pacific.

Besides CESM, we also compared our Neogene data with other millennial-length climate model simulations55 and found that the majority of these models reproduce the middle and high latitude amplification more or less equally well, with the exception of CCSM3, ECHAM5, and FAMOUS (Fig. 6, Supplementary Note 4 and Supplementary Fig. 15). This suggests that most models represent the physics of amplification correctly. However, the more sensitive models exhibit a closer match to the range of high latitude SSTs over the past 10 Myr (Fig. 6 and Supplementary Table 4), given that the reconstructed late Miocene CO2 levels are ~400 ppm45 and lower than those (4 × CO2, 1120 ppm) in future climate simulations. Indeed, much of the recent progress in paleoclimate simulations in reproducing past warm climates appears to be due to increased climate sensitivity54. The warming pattern that emerged from this analysis can be used as a constraint on climate models, and in this analysis, however preliminary, it is the more sensitive models that reproduce this pattern more accurately.

Importantly, regardless of the actual mechanism, the 10-million-year records (Fig. 5) presented here pave a possible path for our future high-latitude ocean regions, suggesting they will warm ~2.4 times as much as the WPWP SSTs. Arctic warming relative to WPWP is likely to be slightly higher than this value (Supplementary note 5, Supplementary Table 3, and Supplementary Fig. 16), consistent with an Arctic amplification factor of 2.2–2.4 relative to the global mean predicted in the Fifth Assessment Report of the IPCC1. These results clearly identify a specific amplification fingerprint for past warming which is not a strong function of time, boundary conditions, or greenhouse gas forcing trajectory45 thus enabling a ready comparison with climate model results. The technique of ordinating temperatures and temperature gradients by high-latitude temperatures should be broadly applicable across many regions and time intervals (Fig. 5).

Evaluating models based on how well they reproduce the amplification fingerprint can be a useful test of the model’s fidelity. It also might provide some idea of the magnitude of the “pattern effect” in our present climate; that is, the impact of the spatial pattern of surface warming on radiative feedbacks57,58, which is intimately linked with estimating climate sensitivity59. We can, however, derive a second constraint from the data shown in Fig. 5 by examining the range of SSTs themselves, which span high latitude SSTs of 5–12 °C, and comparing against climate models results across the range of forcings. In the case of various implementations of CESM, this comparison would be revealing. A CESM Miocene paleoclimate simulation with a CAM5 atmosphere (with an Equilibrium Climate Sensitivity or ECS of 4.1 °C/doubling) and at 400 ppm CO2 in the center of the range of CO2 proxy estimates, only warms half as much as data suggest at high latitudes (Fig. 5). A near-modern CESM/CAM5 simulation, with 800 ppm CO2 reproduces the full temperature range, which indicates that some substantial radiative forcing is still missing from the Miocene simulations or that the Miocene CESM sensitivity was too low. The just released CESM2/CAM6 model60, which has an ECS of 5.4 °C/doubling, is nearly as warm at high latitudes at only 569.4 ppm CO2 utilizing otherwise modern boundary conditions. These comparisons suggest that either the ECS value in CESM/CAM5 is too low for the Miocene, or other factors are important in driving the Miocene warmth. These other factors could include slow feedback from continental ice sheets or non-CO2 greenhouse gasses or aerosols or reorganizations of paleo-topography and continental configuration61. Regardless, discriminating between those options requires significant progress in understanding the CO2 levels during the late Miocene.
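The interplay between CO2 level and ECS in the comparison above can be made concrete with a back-of-the-envelope calculation. The sketch below assumes the standard logarithmic scaling of warming with CO2 concentration, a constant ECS, and a 280 ppm reference concentration; all three are simplifying assumptions introduced here. It returns an equilibrium global-mean warming, not a high latitude SST, so it only indicates relative magnitudes.

```python
import math


def equilibrium_warming(co2_ppm, ecs_per_doubling, co2_ref_ppm=280.0):
    """Global-mean equilibrium warming (deg C) relative to co2_ref_ppm.

    Assumes warming scales with log2 of the CO2 ratio and a constant ECS.
    The 280 ppm reference is an assumed preindustrial value.
    """
    return ecs_per_doubling * math.log2(co2_ppm / co2_ref_ppm)


# CO2 levels and ECS values quoted in the text, compared under this scaling
miocene_cam5 = equilibrium_warming(400.0, 4.1)
modern_cam5_800 = equilibrium_warming(800.0, 4.1)
cam6_569 = equilibrium_warming(569.4, 5.4)
```

Under these assumptions, the 400 ppm case with an ECS of 4.1 °C warms much less than either the 800 ppm case with the same ECS or the 569.4 ppm case with an ECS of 5.4 °C, which is qualitatively in line with the model comparison described above.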

## Methods

### Age models

The age model of Site U1488 is based on integrated biostratigraphy and magnetostratigraphy24 established on Hole U1488A and calibrated to the Geological Time Scale 2012 (GTS2012)62. The Miocene age model of Site 806 was established from an existing biostratigraphy63 converted to the GTS2012 timescale in this study (Supplementary Table 1). The chronology of Site 1143 was also updated from the original publication18, using a recently published astronomically tuned age model (8.98–0 Ma)64 and the first occurrences of Neogloboquadrina acostaensis and Discoaster neohamatus observed at 465.8 and 488.79 m65, corresponding to 9.83 and 10.52 Ma, respectively. The chronology of Site U1338 was updated based on a revised biomagnetostratigraphic age model derived from Backman et al.66. The age models of the other sites are based on their original references, which were already tuned to the GTS2012 timeframe12.

### Lipid biomarker analysis

304 of total lipids were extracted from 20 to 60 g of freeze-dried and crushed sediments of IODP Expedition 363 Sites U1488, U1489, and U1490 with a mixture of dichloromethane and methanol (9:1, v/v) using an accelerated solvent extractor (ASE 350, DIONEX). Total lipid extracts were separated into aliphatic, aromatic, and polar fractions by silica-gel chromatography using hexane, dichloromethane, and methanol as respective eluents. The polar fraction containing tetraethers was dissolved in hexane:isopropanol (99:1, v/v), passed through a 2.7 μm glass microfiber filter, and then analyzed on an Agilent 1260 series high-performance liquid chromatograph (HPLC) coupled to an Agilent 6120 series atmospheric pressure chemical ionization-mass spectrometer (APCI-MS) following the methodology of Becker et al.67 to measure the relative abundance of glycerol dialkyl glycerol tetraethers (GDGTs). An in-house laboratory standard was measured every five samples to check the reproducibility of the TEX86 values. TEX86 values were calculated according to Schouten et al.68, and the analytical precision is ±0.015 TEX86 units based on long-term TEX86 measurements of the internal standard. The aromatic fractions containing ketones were dissolved in toluene and analyzed on an Agilent 7890B gas chromatograph (GC) equipped with a 60 m DB-1 capillary column (0.25 mm ID, 0.25 μm film) and a flame ionization detector (FID) to measure the relative abundance of C37:3 and C37:2 alkenones. An in-house alkenone standard was measured every five samples to check the reproducibility of the $\mathrm{U}_{37}^{\mathrm{K}'}$ values. $\mathrm{U}_{37}^{\mathrm{K}'}$ values were calculated according to Prahl and Wakeham69, and then converted to SSTs using the Bayesian B-spline regression (BAYSPLINE) calibration70. Multiple measurements of an internal standard indicate that the analytical precision is ±0.01 $\mathrm{U}_{37}^{\mathrm{K}'}$ units. All organic geochemistry was performed at Texas A&M University.
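For readers unfamiliar with the alkenone index, the sketch below shows the standard definition of $\mathrm{U}_{37}^{\mathrm{K}'}$ from the C37:2 and C37:3 abundances, together with a simple linear calibration that illustrates why the proxy saturates near 29 °C. The linear coefficients (slope 0.033 per °C, intercept 0.044) are a widely used global core-top calibration and are an assumption here; this study uses the BAYSPLINE calibration instead, which is not reproduced. Function names are hypothetical.

```python
def uk37_prime(c37_2, c37_3):
    """Alkenone unsaturation index from di- and tri-unsaturated C37 abundances."""
    return c37_2 / (c37_2 + c37_3)


def sst_linear(uk37, slope=0.033, intercept=0.044):
    """Invert an assumed linear calibration uk37 = slope * SST + intercept.

    Illustrative only: not the BAYSPLINE calibration used in this study.
    """
    return (uk37 - intercept) / slope


# Once the C37:3 alkenone is no longer detectable, the index is pinned at 1
index_saturated = uk37_prime(c37_2=1.0, c37_3=0.0)
sst_at_saturation = sst_linear(index_saturated)  # close to 29 deg C
```

Because the index cannot exceed 1, any SST above the saturation temperature maps to the same index value, which is the limitation that motivates the use of TEX86 and Mg/Ca in the warm pool.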

Lipid biomarkers must be abundant to ensure accurate GDGT and alkenone measurements and therefore reliable TEX86 and $\mathrm{U}_{37}^{\mathrm{K}'}$ determinations. Since Sites U1488, U1489, and U1490 have low organic carbon content24, large amounts (~20–60 g) of sediment were requested and analyzed. This practice resulted in a high abundance of GDGTs and alkenones in most samples. However, a small number of samples still yielded low abundances. For example, GDGT-3 is the most analytically challenging compound because it typically has the lowest relative abundance among all isoprenoid GDGTs. Therefore, only samples with an integrated GDGT-3 peak area above 1000 on our HPLC–MS were used to calculate TEX86 values. For samples below this detection limit, we concentrated the samples to improve the intensity of GDGT-3. For instance, during the first round of measurements, one “challenging” sample (U1488 A-4H-3, 62–65 cm; 30.8 g sediments used) showed a GDGT-3 peak area of 1320 (Supplementary Fig. 1a) when 20 out of 500 μL of solution was injected, slightly above our HPLC–MS detection limit of 1000. When this sample was concentrated and analyzed again with a 20 out of 200 μL injection, the peak area of GDGT-3 increased to 3503 (Supplementary Fig. 1b). Although the TEX86 values from these repeated measurements are almost indistinguishable (0.69 vs. 0.70) (Supplementary Fig. 1), this practice generated better chromatograms that potentially led to more reliable determinations of TEX86 values.

### Planktonic foraminiferal Mg/Ca measurements

About 20 specimens of planktonic foraminifer T. sacculifer (without final sac-like chamber) were picked from the 250 to 300 μm size fraction of samples from Site 806 at the University of Massachusetts-Amherst. Tests of T. sacculifer were crushed and cleaned based on the cleaning methodology of Barker et al.71 (with the reductive step omitted) before being leached and diluted with nitric acid. Trace elements were then analyzed on the dissolved samples by a Thermo Element inductively coupled plasma mass spectrometry (ICP-MS) at the University of Southampton following Henehan et al.72. Mg/Ca precision was ±4% at two standard deviations based on repeat measurements of consistency standards measured at the same time as the unknowns. Al/Ca, a measure of the effectiveness of clay removal, was <100 μmol/mol in all but one sample (mean ~16 μmol/mol), confirming clay removal was adequate. The sample with an elevated Al/Ca of 150 μmol/mol was not obviously elevated in Mg/Ca and so was retained for completeness.

### Compilation of Pacific SST data

We compiled TEX86, $\mathrm{U}_{37}^{\mathrm{K}'}$, and Mg/Ca data of the Pacific sites for the past 10 Myr18,19,20,40. The $\mathrm{U}_{37}^{\mathrm{K}'}$ records from the eastern equatorial Pacific, middle latitudes (30°–50°), and high latitudes (>50°) over the past 10 Myr were compiled by Herbert et al.12 (Supplementary Table 2). For consistency, these SST estimates were recalculated using the TEX86 and $\mathrm{U}_{37}^{\mathrm{K}'}$ calibrations described below. The lack of continuous SST records from the southern high latitudes over the last 10 Myr limits our ability to evaluate high-latitude amplification in the Southern Hemisphere.

### Paleogeography of study sites

Paleo-latitude and -longitude of the sites shown in Fig. 1 were calculated using the GPlates software (http://www.gplates.org). Rotations and coastlines were obtained from Matthews et al.73, and the global continent–ocean boundary and spreading ridge dataset were from Müller et al.74. All sites remained within their broad geographic regions (e.g., WPWP, EEP, middle latitudes, and high latitudes) for the entire studied interval.

### GDGT distributions

Most of the samples from Sites U1489 and U1490 are either below the detection limit of GDGTs or did not pass the screening tests described below, thereby limiting our interpretations of GDGTs as reflecting SSTs to Site U1488 only. Among the 109 samples taken from Site U1488, 11 were collected as the “squeezed cake” on board (samples after pore-water extractions) and 98 were sampled onshore at the Gulf Coast Repository of the IODP with a resolution of one sample per 100 kyr. Two samples did not have a sufficient abundance of GDGTs for the TEX86 determination. For the remaining 107 samples, we applied a series of screening methods. These include the ∆Ring Index (∆RI), the difference between the measured RI and the RI predicted from the global core-top TEX86–RI relationship75, which detects samples affected by non-thermal factors. Twelve samples failed the |∆RI| < 0.6 criterion and were excluded from the TEX86–SST estimates. The remaining 95 samples passed all other tests described below. The GDGT-2/crenarchaeol ratios (≤0.4) and methane index (≤0.5) were below their threshold values, indicating no substantial contribution of methanotrophic archaea to the GDGT pool76,77. GDGT-0/crenarchaeol ratios are <1.2 and %GDGT-0 values are <51, which reveals no major influence of methanogenic archaea78. GDGT-2/GDGT-3 ratios are <6.1, ruling out a major influence of deep-water-produced GDGTs28. %GDGT-2 values are low (<38), also suggesting that GDGT distributions are suitable for TEX86–SST79. The only exception is the branched and isoprenoid tetraether (BIT) index, presumably representing the relative abundance of soil-derived GDGTs over marine GDGTs80. Among the 95 samples, 81% have BIT values >0.3, 38% have BIT values >0.4, and 13% have BIT values >0.5. However, a number of studies have shown that high BIT might not faithfully reflect soil inputs81 because of the sedimentary in-situ production of branched GDGTs82, which does not necessarily affect TEX86. 
For example, the reported Eocene and Oligocene BIT values from open oceans average 0.27 ± 0.19, much higher than those of modern core-tops (0.03 ± 0.03)83. When the TEX86 and BIT index values of Site U1488 were cross-plotted, no significant correlation was identifiable (Supplementary Fig. 2). This observation, combined with the low C/N ratio throughout most of the site (mean value around 10), strongly suggests that the organic matter is predominantly of marine origin24. Consequently, samples with relatively high BIT values that passed the Ring Index and all other tests were still included in our analyses.
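The ∆RI screening step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the TEX86 and Ring Index definitions follow the commonly used formulations, and the quadratic coefficients of the core-top TEX86–RI fit are assumptions that should be checked against ref. 75. The cutoff is passed in explicitly rather than hard-coded.

```python
def gdgt_indices(g0, g1, g2, g3, cren, cren_iso):
    """Return TEX86, Ring Index (RI) and delta-RI for one sample.

    Arguments are abundances (any common unit) of GDGT-0 to GDGT-3,
    crenarchaeol and the crenarchaeol isomer.
    """
    total = g0 + g1 + g2 + g3 + cren + cren_iso
    tex86 = (g2 + g3 + cren_iso) / (g1 + g2 + g3 + cren_iso)
    # RI: abundance-weighted mean ring number; crenarchaeol and its isomer
    # are both counted with 4 rings (assumed convention).
    ri = (g1 + 2 * g2 + 3 * g3 + 4 * (cren + cren_iso)) / total
    # RI predicted from TEX86 by a quadratic core-top fit (assumed coefficients).
    ri_predicted = -0.77 * tex86 + 3.32 * tex86 ** 2 + 1.59
    return {"tex86": tex86, "ri": ri, "delta_ri": ri_predicted - ri}


def passes_delta_ri(sample, cutoff):
    """True when |delta-RI| is below the chosen cutoff."""
    return abs(sample["delta_ri"]) < cutoff


# Hypothetical fractional abundances, for illustration only.
example = gdgt_indices(g0=0.30, g1=0.05, g2=0.08, g3=0.02, cren=0.50, cren_iso=0.05)
print(example, passes_delta_ri(example, cutoff=0.6))
```

The other screening ratios (methane index, GDGT-0/crenarchaeol, GDGT-2/GDGT-3) could be added to the returned dictionary in the same way.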

### TEX86 calibration

The TEX86–SSTs from Sites U1488, 1143, 806, and 850 were estimated using the BAYSPAR calibration30. BAYSPAR accounts for spatial variations in the TEX86–SST relationship; the paleo-latitude and paleo-longitude of each site, reconstructed at 0.5-Myr intervals, were supplied to BAYSPAR to produce SST estimates and their associated uncertainties.

### $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SST estimates

The $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SSTs from Sites U1488, 882, 883/884, 887, 1208, 1021, 594, 1125, U1338, 846, and 850 were calculated using a recently published Bayesian B-spline approach (BAYSPLINE)70, which better captures the nonlinear behavior as the $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ index approaches 1 and allows a slightly higher maximum temperature, although the calibration is not anchored by data at those hotter temperatures. To reflect the fact that $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ reaches 1, within analytical uncertainty, when SST reaches ~29 °C (refs. 14,84), we utilized a prior standard deviation of 5 °C when $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ > 0.9, 3 °C when $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ > 0.95, and 2 °C when $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ = 1. For Site U1488, 109 samples have measurable alkenones for $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ determinations. Almost all of the $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ values prior to 3 Ma reach the maximum value of 1, and thus $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SST is mostly pinned to its maximum (~29 °C) between 10 and 3 Ma (Fig. 2). In the high-latitude North Pacific, $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ values between 2.8 and 0 Ma are available from only one site, Site 882, reported in two independent studies: Martínez-Garcia et al.85 and Yamamoto and Kobayashi86. The $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ values reported by Martínez-Garcia et al.85, in which the relative abundance of alkenones was measured by gas chromatography chemical ionization mass spectrometry (GC–CIMS), a method designed to detect compounds at much lower concentrations87, are inconsistent with those reported by Yamamoto and Kobayashi86, in which alkenones were measured by the more traditional gas chromatography with flame ionization detection (GC-FID). 
The dataset of Yamamoto and Kobayashi86 has a lower resolution over the Pliocene and Pleistocene, with only 15 data points after 2.8 Ma. Nevertheless, these data are sufficient for binning within a 0.4-Myr time window and are not systematically biased toward either glacial or interglacial periods (Supplementary Fig. 7). In our compilation, we opted for the $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ record reported by Yamamoto and Kobayashi86 because the higher alkenone abundance of their samples (averaging ~0.08 μg/g) minimizes potential biases86,88. For example, stronger adsorption of C37:3 alkenones onto the surface of the capillary column has been observed when alkenone abundance is low89, which could lead to a warm bias in $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SST. In addition, Site 882 samples with low alkenone concentrations could be more susceptible to alkenones transported laterally from subtropical regions90, which would again cause a warm bias in $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SST. Besides reducing these warm biases, the $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SST record from Yamamoto and Kobayashi is more consistent with the $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$-SST values of nearby Sites 883/884 (Supplementary Fig. 7), which were also determined by a GC-FID method12.
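The tiered prior standard deviation described above for high index values can be written as a small lookup. This is only a sketch of the stated rule: the function name is hypothetical, and because the text does not state the prior used for index values of 0.9 or below, that value is left as an explicit argument rather than guessed.

```python
def prior_sd_for_uk37(uk37, default_sd):
    """Prior standard deviation (deg C) chosen from the alkenone index value.

    default_sd is the prior used for index values <= 0.9; it is not given
    in the text, so the caller must supply it (assumption).
    """
    if uk37 >= 1.0:
        return 2.0
    if uk37 > 0.95:
        return 3.0
    if uk37 > 0.9:
        return 5.0
    return default_sd


# Example with a hypothetical default of 10 deg C for unsaturated samples.
for value in (0.50, 0.92, 0.97, 1.00):
    print(value, prior_sd_for_uk37(value, default_sd=10.0))
```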

Besides BAYSPLINE, we also evaluated the influence of alternative linear calibrations, such as that of Conte et al.14, on our results and found that linear calibrations do not change our estimates of the high-latitude amplification factors (see Supplementary Note 1).

### Mg/Ca-SST estimates

Mg/Ca values of Site 806 were converted to SSTs using the core-top calibration for tropical T. sacculifer derived from Dekens et al.91: $$\mathrm{Mg/Ca}=0.31\,\exp \left\{0.084\,[\mathrm{SST}+0.048\,(\Delta {\mathrm{CO}}_{3}^{2-})]\right\}$$, where $$\Delta {\mathrm{CO}}_{3}^{2-}={[{\mathrm{CO}}_{3}^{2-}]}_{\mathrm{in\,situ}}-{[{\mathrm{CO}}_{3}^{2-}]}_{\mathrm{saturation}}$$ corrects for carbonate dissolution. A constant, modern ∆CO32− value of 10.5 μmol/kg was applied to the 10-Myr data of Site 80691, since the water depth of Site 806 during the late Miocene and Pliocene (<2672 m)92 was always shallower than the horizon where significant dissolution occurred in the Pacific (>2800 m)93. This was recently confirmed by a B/Ca study of benthic foraminifera from Site 806, which suggests that carbonates at Site 806 were not subjected to severe dissolution during the Plio-Pleistocene23.

The residence times of Mg and Ca in the ocean are about 13 and 1 Myr, respectively, which requires consideration of possible changes in seawater Mg/Ca (Mg/Casw) when applying Mg/Ca thermometry on timescales of >1 Myr. To adjust for the influence of varying Mg/Casw on Mg/Ca thermometry, the ratio between the past and modern Mg/Casw values is used following previous studies38,94, and the calibration equation is expressed as: $$\mathrm{Mg/Ca}=\frac{{\mathrm{Mg/Ca}}_{\mathrm{sw}}}{{\mathrm{Mg/Ca}}_{\mathrm{ps}}}\times 0.31\,\exp \left\{0.084\,[\mathrm{SST}+0.048\,(\Delta {\mathrm{CO}}_{3}^{2-})]\right\}$$, where Mg/Casw is the past seawater Mg/Ca and Mg/Caps is the present value (5.2 mol/mol). Since the simulated SH98 Mg/Casw agrees well with the proxy data and covers the entire 10 Myr, we applied SH98 Mg/Casw to calculate the Site 806 Mg/Ca-SST presented in Fig. 2. Besides Dekens et al.91, alternative calibrations were also used to convert Mg/Ca to SST (see Supplementary Note 1), and the SSTs estimated from the different calibrations are shown in Supplementary Fig. 4.
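Solving the seawater-corrected calibration for SST gives SST = ln[Mg/Ca / (0.31 × Mg/Casw/Mg/Caps)] / 0.084 − 0.048 × ∆CO32−. The sketch below implements that inversion together with the forward equation. The function names are hypothetical, foraminiferal Mg/Ca is assumed to be in mmol/mol, and the past seawater value of 4.5 mol/mol in the example is purely illustrative, not a value from the SH98 curve.

```python
import math


def sst_to_mgca(sst, mgca_sw_past, mgca_sw_modern=5.2, delta_co3=10.5):
    """Forward calibration: shell Mg/Ca expected at a given SST (deg C)."""
    ratio = mgca_sw_past / mgca_sw_modern
    return ratio * 0.31 * math.exp(0.084 * (sst + 0.048 * delta_co3))


def mgca_to_sst(mgca, mgca_sw_past, mgca_sw_modern=5.2, delta_co3=10.5):
    """Inverse calibration: SST (deg C) from shell Mg/Ca.

    mgca_sw_past and mgca_sw_modern are seawater Mg/Ca in mol/mol;
    delta_co3 is the carbonate-ion saturation offset in umol/kg.
    """
    ratio = mgca_sw_past / mgca_sw_modern
    return math.log(mgca / (0.31 * ratio)) / 0.084 - 0.048 * delta_co3


# Same shell Mg/Ca interpreted with modern and with lower past seawater Mg/Ca.
print(mgca_to_sst(3.5, mgca_sw_past=5.2))
print(mgca_to_sst(3.5, mgca_sw_past=4.5))  # hypothetical past value
```

For a fixed shell Mg/Ca, a lower past seawater Mg/Ca yields a higher reconstructed SST, which is why the choice of Mg/Casw history matters on multi-million-year timescales.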

Besides temperature, pH and salinity also affect the incorporation of Mg into the calcite shell of planktonic foraminifera95. However, the pH effect is minimal for the Mg/Ca of T. sacculifer95 and is therefore not included in our Mg/Ca–SST calculations. A salinity change of 1 practical salinity unit (PSU) would change foraminiferal Mg/Ca by ~4% and consequently SST by ~0.5 °C95. The limited constraints on the Miocene–Pleistocene surface salinity of the WPWP restrict our ability to assess the bias that salinity variations introduce into the Mg/Ca–SST of Site 806, but this bias is likely to be within the SST calibration uncertainty, given that the variation in EEP surface salinity is <1.1 PSU for the past 10 Myr96 and that the surface salinity change in the WPWP was smaller than that in the EEP during the last glacial maximum95.
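As a rough consistency check on the salinity sensitivity quoted above, and assuming the exponential constant of 0.084 per °C from the Dekens et al.91 calibration applies, a 4% change in Mg/Ca at fixed ∆CO32− translates to:

$$\Delta \mathrm{SST}=\frac{\ln (1.04)}{0.084}\approx \frac{0.0392}{0.084}\approx 0.47\,^{\circ}\mathrm{C}$$

which agrees with the ~0.5 °C per PSU figure.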

### Stacking regional SST for the past 10 Myr and model outputs

Pacific SST records (Supplementary Figs. 5 and 8) were stacked to produce composite records for four regions: WPWP, EEP, middle latitudes, and high latitudes (Supplementary Fig. 9). In each region, SST data were binned over 200 kyr at each site, with 50% overlap, from 10 to 0 Ma. A bin size of 200 kyr was selected because it is consistent with our typical data resolution and fits our goal of determining the long-term amplification of warming over the last 10 million years. This bin size limits our analyses to long timescales and precludes investigation of high-latitude amplification on glacial–interglacial timescales. When higher-resolution records become available, different bin sizes should be used to test whether the amplification factors determined here (Fig. 3, Table 1) change.
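The overlapping binning at a single site can be sketched as below. This is an illustrative implementation under stated assumptions: the function name is hypothetical, bins are treated as half-open intervals [lower, upper), and a bin containing a single sample is given a standard deviation of zero.

```python
from statistics import mean, stdev


def bin_overlapping(ages, ssts, width=0.2, overlap=0.5, start=0.0, end=10.0):
    """Bin SST by age (Ma) using windows of `width` Myr that overlap.

    Returns a list of (bin_center, mean_sst, sd_sst, n) for non-empty bins.
    """
    step = width * (1.0 - overlap)  # 0.1 Myr for 200-kyr bins with 50% overlap
    n_bins = int(round((end - start - width) / step)) + 1
    binned = []
    for i in range(n_bins):
        lower = start + i * step
        upper = lower + width
        values = [t for a, t in zip(ages, ssts) if lower <= a < upper]
        if not values:
            continue
        sd = stdev(values) if len(values) > 1 else 0.0
        binned.append((lower + width / 2.0, mean(values), sd, len(values)))
    return binned


# Three hypothetical samples; with 50% overlap most samples fall in two bins.
for row in bin_overlapping([0.05, 0.15, 0.25], [10.0, 12.0, 14.0], end=0.5):
    print(row)
```

With 50% overlap, adjacent bins share data, so consecutive bin means are not statistically independent.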

For each time bin, we calculated the mean and standard deviation of SST at each site and then averaged the site means within each region. We thereby obtained the WPWP, EEP, and middle-latitude SST stacks and the high-latitude SST on the same time steps (Supplementary Figs. 5 and 9). The standard error of the averaged SST between sites was calculated and regarded as the standard error of the regional SST stack. The binning window is 200 kyr except for the high-latitude SST between 2.8 and 0 Ma, when only 15 data points were available from Site 882. We therefore used a larger window (400 kyr) for binning the SST data of the last 2.8 Myr at this site. High-latitude SST was obtained from a single site during some time periods (2.6–0.6, 4.8–4.5, 9.8–7.5 Ma) due to the limited number of sites with available $${{{{{{\rm{U}}}}}}}_{37}^{{{{{{\rm{K}}}}}}^{\prime} }$$ data. It was impossible to calculate the between-site standard error for these periods, and thus the median of the standard errors of the high-latitude SST stack calculated for the other time periods was used.
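A minimal sketch of the stacking step, including the median fallback for single-site bins, is given below. The data layout (a dictionary of per-site bin means) and the function name are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, median, stdev


def stack_region(site_means):
    """Average site means per time bin and attach a between-site standard error.

    site_means: {site_name: {bin_center: mean_sst}}
    Returns {bin_center: (stack_sst, standard_error)}.
    """
    by_bin = {}
    for bins in site_means.values():
        for center, sst in bins.items():
            by_bin.setdefault(center, []).append(sst)

    stack, multi_site_se = {}, []
    for center, values in by_bin.items():
        se = stdev(values) / sqrt(len(values)) if len(values) > 1 else None
        if se is not None:
            multi_site_se.append(se)
        stack[center] = (mean(values), se)

    # Single-site bins borrow the median standard error of multi-site bins.
    fallback = median(multi_site_se) if multi_site_se else None
    return {c: (m, se if se is not None else fallback) for c, (m, se) in stack.items()}


# Hypothetical region with two sites, one of which lacks data in the second bin.
print(stack_region({"A": {0.1: 28.0, 0.2: 29.0}, "B": {0.1: 30.0}}))
```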

For the model outputs, several areal boxes bracketing our study sites were chosen to represent the WPWP (140°–160°E, 0°–4°N), EEP (118°W–90°W, 4°S–4°N), middle latitudes (156°E–160°E, 34°N–38°N; 126°W–130°W, 38°N–40°N; 174°E–178°W, 42°S–46°S) and high latitudes (168°E–148°W, 50°–56°N). We used the area-weighted yearly SST to represent the regional SST. For the 1200-yr mid-Pliocene CESM1.2 and CESM2 simulations, we binned the SST outputs over a 20-yr window to remove the influence of the El Niño–Southern Oscillation and Pacific Decadal Oscillation, yielding one SST time series for each region; these are presented in Fig. 5. For the 5300-yr transient coupled CESM104 simulation, we obtained a time series of SST for each region over the first 160 years using the 20-year binning method. Since the model outputs are temperature anomalies, we offset the first 160 years of model data to match the observed SSTs of the most recent 160 years in the EEP, WPWP, and middle latitudes at the lowest high-latitude SST. We then used these regional SST references to generate SST time series over the whole simulation using a 100-year binning approach. The same stacking method was applied to other millennial-length model simulations with abrupt 4 × CO2 forcing (Fig. 6).
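The two averaging steps applied to the model output, area weighting within a box and multi-year binning, can be sketched as follows. Both functions are hypothetical helpers: the cos(latitude) weights assume a regular latitude–longitude grid (a model's own cell areas would normally be preferred), and the 20-yr binning is interpreted here as consecutive, non-overlapping block means.

```python
import math


def area_weighted_mean(lats_deg, values):
    """Mean of per-latitude values, weighted by cos(latitude)."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def block_means(series, block=20):
    """Average consecutive, non-overlapping blocks (e.g. 20 model years).

    Trailing values that do not fill a complete block are dropped.
    """
    n_blocks = len(series) // block
    return [sum(series[i * block:(i + 1) * block]) / block for i in range(n_blocks)]


# Hypothetical zonal-mean SSTs at the latitudes of a high-latitude box.
print(area_weighted_mean([50.0, 52.0, 54.0, 56.0], [9.0, 8.0, 7.0, 6.0]))
print(block_means(list(range(1, 41)), block=20))
```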

### Regression analyses

We use two methods in our linear regression analyses: ordinary least-squares (OLS) regression and York Regression. For model outputs, we applied OLS regression since these temperatures and ages are associated with minimal uncertainties. For the 10-Myr proxy data, we employed York Regression to estimate the relative temperature change, given that the temperatures on both the horizontal and vertical axes carry analytical errors and uncertainties arising from the calibrations used to convert indices to SSTs. One exception for the proxy data is the SST trend from 0.5 to 9.8 Ma (Supplementary Note 2 and Supplementary Fig. 10), which is calculated using OLS regression rather than York Regression due to difficulties in precisely determining the age uncertainties.

### High-latitude amplification factor determinations

High latitude amplification is defined in this study as the change in high latitude SST relative to the change in tropical SST. It can be estimated by regressing the low latitude SST on the high latitude SST or vice versa. Here we chose to regress the low latitude SST on the high latitude SST, rather than the reverse, since the high latitude SST covering the past 10 Myr has a larger range than the WPWP SST. To calculate the high latitude amplification factor over the past 10 Myr, we performed York Regressions between the high-latitude SST and the WPWP and EEP SSTs, taking their standard errors into account. Using a Monte Carlo approach, we conducted 10,000 simulations of binned SST at each site by randomly sampling the binned SST within 2 standard deviations. Following the stacking method, 10,000 iterations of WPWP, EEP, and high-latitude SST were obtained with their standard errors. Utilizing these data, we generated 10,000 realizations of the York Regression between the high-latitude SST and the WPWP and EEP SSTs, respectively. We thereby derived 10,000 slopes each for the high-latitude versus WPWP SST and the high-latitude versus EEP SST relationships. Because each slope expresses the low latitude SST change per degree of high latitude SST change, its inverse represents the high latitude amplification factor. The distribution of the 10,000 amplification factors (high latitude relative to the WPWP) was plotted as the probability density function of the high latitude amplification factor (Fig. 4b). The amplification factor is 2.42 ± 0.64 (1σ), indicating that high-latitude SST changes are amplified relative to the WPWP by a factor of 2.42 ± 0.64. Similarly, the York Regression between high-latitude SST and EEP SST shows that the high-latitude SST change is amplified relative to the EEP by a smaller factor of 1.81 ± 0.52 (Supplementary Fig. 11).
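The procedure can be illustrated with a compact, simplified sketch. It is not the authors' implementation: `york_fit` follows the standard iterative York solution for uncorrelated x and y errors, and `amplification_mc` perturbs the already-stacked regional SSTs with Gaussian noise scaled by their standard errors, whereas the text resamples the binned SST at each site and then restacks. All function names are hypothetical.

```python
import random
import statistics


def york_fit(x, y, sx, sy, tol=1e-10, max_iter=100):
    """Slope and intercept of a York fit with uncorrelated x and y errors."""
    wx = [1.0 / s ** 2 for s in sx]
    wy = [1.0 / s ** 2 for s in sy]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    # Start the iteration from the ordinary least-squares slope.
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    for _ in range(max_iter):
        w = [a * c / (a + b * b * c) for a, c in zip(wx, wy)]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        u = [xi - xbar for xi in x]
        v = [yi - ybar for yi in y]
        beta = [wi * (ui / c + b * vi / a)
                for wi, ui, vi, a, c in zip(w, u, v, wx, wy)]
        b_new = (sum(wi * bi * vi for wi, bi, vi in zip(w, beta, v))
                 / sum(wi * bi * ui for wi, bi, ui in zip(w, beta, u)))
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            break
    return b, ybar - b * xbar


def amplification_mc(high, low, se_high, se_low, n_iter=10000, seed=0):
    """Mean and 1-sigma of 1/slope, with low-latitude SST regressed on high-latitude SST."""
    rng = random.Random(seed)
    factors = []
    for _ in range(n_iter):
        xs = [rng.gauss(m, s) for m, s in zip(high, se_high)]
        ys = [rng.gauss(m, s) for m, s in zip(low, se_low)]
        slope, _ = york_fit(xs, ys, se_high, se_low)
        factors.append(1.0 / slope)  # inverse slope = amplification factor
    return statistics.mean(factors), statistics.stdev(factors)
```

Because the amplification factor is the inverse of a noisy slope, its Monte Carlo distribution is skewed, so its mean need not equal the inverse of the best-fit slope.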

To calculate the amplification factor derived from model outputs of the mid-Pliocene and abrupt 4 × CO2 simulations, we performed the OLS regression of high-latitude SSTs against WPWP (Figs. 5 and 6).

### Meridional SST gradient vs. high latitude SST

The relationship between the meridional SST gradient and high-latitude SST was derived from the linear relationship between the WPWP/EEP SST and high-latitude SST, regardless of whether the SSTs were reconstructed, observed, or simulated. Taking the paleo-SSTs as an example, we first obtained a linear relationship between the WPWP SST (y) and high-latitude SST (x) (Fig. 3), which gives:

$$y=0.43x+26.86 \tag{1}$$

Then, we set meridional temperature gradient (WPWP minus high-latitude SST) as a variable Y. According to Eq. (1), we can derive the relationship between meridional SST gradient and high-latitude SST, which is expressed by

$$Y=y-x=\left(0.43x+26.86\right)-x=\left(0.43-1\right)x+26.86=-0.57x+26.86 \tag{2}$$

Applying this indirect approach to other time scales or scenarios (Figs. 5, 6, Supplementary Figs. 11 and 12), we obtained the relationship (Eq. (2)) between meridional SST gradient and high-latitude SST. These relationships were only used in a qualitative sense to show that the SST differences from the equator to high latitudes decrease in a warmer world.
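As an illustrative check of Eq. (2), consider two hypothetical high-latitude SSTs, 5 °C and 10 °C, chosen only to show the direction of the change:

$$Y(5)=-0.57\times 5+26.86=24.01\,^{\circ}\mathrm{C},\qquad Y(10)=-0.57\times 10+26.86=21.16\,^{\circ}\mathrm{C}$$

Under this linear relationship, a 5 °C high-latitude warming corresponds to a 2.85 °C reduction in the WPWP-to-high-latitude gradient.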

### Middle latitude SST changes relative to the WPWP

Mid-latitude sites are available from both the North and South Pacific, with the corresponding modern study areas in the North Pacific (156°E–160°E, 34°N–38°N; 126°W–130°W, 38°N–40°N) and South Pacific (174°E–178°W, 42°S–46°S). However, the SST data from the South Pacific are quite limited; for example, Site 1125 is the only available record between 10 and 2.8 Ma. Our results show that the warming in the middle latitudes was also amplified relative to the WPWP, with a middle-latitude amplification factor of 1.57 ± 0.19 for the North Pacific (Supplementary Fig. 12). The inclusion of the South Pacific data does not make a statistically significant change, with the value computed to be 1.38 ± 0.09 for the entire Pacific (Supplementary Fig. 12).

### Model outputs

The CESM outputs compared with observations and reconstructions are from paleoclimate, modern, and future simulations. The Miocene global paleoclimate simulation (with an ECS of 4.1 °C/doubling) was carried out for >2000 years utilizing CESM/CAM5 incorporating the Community Land Model (CLM4)97 and updated Miocene boundary conditions98, including lower albedo and different ocean circulation, and 400 ppm CO2, which is in the center of the range of CO2 proxy estimates. The “equilibrium” temperatures of the modern and future simulations are derived from the publicly available CESM/CAM5 and the recently released CESM2/CAM6 models. The equilibrium mid-Pliocene (3.205 Ma) simulations were carried out for 1200 years using CESM1.2 (with an ECS of 4 °C/doubling) and CESM2 (with an ECS of 5.3 °C/doubling)99, and their boundary conditions follow the Pliocene Model Intercomparison Project Phase 2 (PlioMIP2), which utilizes 400 ppm CO2. CESM/CAM5 model outputs from the preindustrial (1850) and present-day (800 ppm CO2 forcing) runs are utilized in our comparison since they reproduce the full temperature range of the paleoclimate data; the corresponding NetCDF files we used are E.1850_C5-TS_avg.nc and E.Mod_800_C5-TS_avg.nc. CESM2/CAM6 provides temperatures in response to present-day (1 × CO2) and instantaneously doubled CO2 (2 × CO2). The millennial-length future simulations included in our study were obtained from Rugenstein et al.55. These model outputs, generated with abrupt 4 × CO2 forcing (Table 1 and Supplementary Table 4), were employed for our comparison since a CO2 quadrupling is requested by Coupled Model Intercomparison Projects 5 and 6 and represents future CO2 levels on a multi-century time scale.