Introduction

The deep-time geological record contains responses of the climate system to a wide range of internal and external forcings, as well as documentation of intricate short- and long-term feedbacks. Modeling studies of more recent past climate intervals, like the Last Millennium (LM) and the Last Glacial Maximum (LGM), provide useful information about the future of our climate system and are being incorporated into the next Intergovernmental Panel on Climate Change (IPCC) assessment. Palaeoclimate modeling of more ancient time periods, however, has been unable to reproduce some of the key large-scale features associated with periods of extreme global warmth, and it is important that we gain a better understanding of such discrepancies. Deep-time climates afford us an opportunity to explore climate sensitivity to a range of atmospheric CO2 levels, to understand how heat is transported during warm climate intervals, to gain insight into the controls on pole-to-equator thermal gradients and to evaluate the stability of ice sheets and the response of sea level during warmer climates. Anthropogenic changes could affect Earth's climate for hundreds to thousands of years. Analysis and understanding of the most recent deep-time interval of warmth, similar in magnitude to that projected for the end of the 21st century, is therefore relevant to our understanding of future climate and to the discussion of adaptation and mitigation policies.

The Pliocene Epoch (5.332 Ma to 2.588 Ma) spans an interval of global warmth with high-frequency, low-amplitude variability transitioning to high-frequency, high-amplitude variability associated with the initiation of Northern Hemisphere glaciation. The U.S. Geological Survey's Pliocene Research, Interpretation and Synoptic Mapping (PRISM) group selected for study the most recent interval within the Pliocene (3.264 to 3.025 Ma) known to have temperatures above preindustrial (PI) levels. This interval within the Pliocene, the middle part of the Piacenzian Age, is particularly attractive for palaeoclimate studies because many of the first-order boundary conditions were little different from today: continents and ocean basins were near their present geographic positions and the flora and fauna of the Late Pliocene were in large part identical to those of the present day, allowing for modern analog-type reconstructions of the palaeoenvironment1. Chronology for Piacenzian sequences within deep-sea cores can usually be well-resolved from many magnetobiochronologic events, coupled with an ever-increasing number of orbitally-tuned chronologies2.

The “PRISM interval” has been the focus of two decades of intensive investigations into all aspects of the climate system, resulting in a series of palaeoenvironmental reconstructions: PRISM0, PRISM1, PRISM2 and the current PRISM3 (refs. 3–30). PRISM data have been used in a number of climate modeling studies to explore conditions during this last great interval of global warmth (refs. 31–39). A natural evolution of these individual modeling studies was the formation of an organized model intercomparison project for the Pliocene, which includes many of the leading climate models that contribute IPCC simulations.

The Pliocene Model Intercomparison Project (PlioMIP) is a subcomponent of the Palaeoclimate Modelling Intercomparison Project (PMIP; refs. 40, 41). The initial phase of PlioMIP includes two experiments, both of which focus on the mid-Piacenzian time period equivalent to the PRISM data sets: the first PlioMIP experiment employs atmosphere-only general circulation models and the second utilizes coupled ocean-atmosphere models (see refs. 42–54). Both experiments make use of the PRISM3 versions of the boundary condition data sets30.

In this paper we compare initial results of the second PlioMIP experiment, using mean annual sea surface temperature (SST) fields derived from eight simulations (Figure 1), with PRISM mean annual SST estimates at each of 100 locations in the PRISM3 marine reconstruction (Figure 2). This comparison is used to document large-scale features of the mid-Piacenzian surface ocean, to assess where models and data are in good general agreement and where they are not, and to compare the variability of the multi-model ensemble with the variability in the palaeodata-based estimates.

Figure 1

Model sea surface temperature anomaly (ΔSST), calculated by subtracting preindustrial from Pliocene sea surface temperature, as simulated by each of the eight PlioMIP models.

(a) CCSM4, (b) COSMOS, (c) GISS-E2-R, (d) HadCM3, (e) IPSL CM5A, (f) MIROC4m, (g) MRI-CGCM2.3, (h) NorESM. Maps created using Panoply v.3.1.3 written by Robert B. Schmunk.

Figure 2

Map showing the distribution of PRISM localities, their sea surface temperature anomalies (ΔSST, calculated by subtracting modern from Pliocene sea surface temperature) and the λ-confidence placed upon each locality estimate (relative size of circle, where larger circles represent greater confidence).

Map created in iMap v.3.5 using World Vector Shoreline (NOAA National Geophysical Data Center, Date Retrieved 4/17/2011, http://www.ngdc.noaa.gov/mgg/shorelines/shorelines.html).

Results

Comparison of individual model and data anomalies (ΔMODEL and ΔPRISM, respectively, see “Methods”) on a site-by-site basis identifies a greater range of warming in the reconstructed SST (~10°C spread in ΔPRISM). The models are the most inconsistent with each other and with the PRISM data in the North Atlantic region (Figure 3). This area is, of course, complex given that a myriad of strong feedbacks can affect the region, including changes in sea ice, the Gulf Stream, Atlantic meridional overturning and even the storm tracks. Similar inconsistency in this region was shown among the general circulation models (GCMs) used in the IPCC 4th assessment55.

Figure 3

Scatter plot of multi-model-mean anomalies (squares) and PRISM3 data anomalies (large blue circles) by latitude.

Vertical bars on data anomalies represent the variability of warm climate phase within the time-slab at each locality. Small colored circles represent individual model anomalies and show the spread of model estimates about the multi-model-mean. While not directly comparable in terms of the development of the means nor the meaning of variability, this plot provides a first order comparison of the anomalies. Encircled areas are (a) PRISM low latitude sites outside of upwelling areas; (b) North Atlantic coastal sequences and Mediterranean sites; (c) large anomaly PRISM sites from the northern hemisphere. Numbers identify Ocean Drilling Program sites discussed in the text.

Individual models rarely show negative (i.e. Pliocene SST cooler than present day) anomalies. Relative to the data, the multi-model-mean anomaly (ΔMMM) always shows small positive values and exceeds +3°C at only two locations corresponding to core Site 722 on the Arabian Margin and Site 907 in the subarctic North Atlantic ( Figure 3 ).

Negative ΔPRISM values are confined to the tropics and the subtropical North Atlantic ( Figures 2 , 3 ). In the latter (area b in Figure 3 ), these anomalies are isolated to land sections (outcrops) associated with the western boundary current and others in the Mediterranean Sea region. The subtropical estimates are from a combination of planktonic foraminifer, ostracod and Mg/Ca palaeothermometers and represent medium- to high-confidence sites ( Figure 2 ). The tropical sites showing negative ΔPRISM values (area a in Figure 3 ) are located near the equator in all ocean basins and are presently associated with equatorial divergence. Small changes in position of equatorial currents and upwelling cells along coastal regions could explain apparent cooling at these locations.

Other upwelling sites from the Pacific (677, 847, 852, 1236, 1237, 1239), Atlantic (659, 661) and Indian Oceans (722) have some of the largest ΔPRISM values in the low-latitude data set, based upon a combination of planktic foraminiferal and alkenone palaeothermometry, and show basic agreement between proxies and between ΔPRISM and multi-model-mean anomalies (ΔMMM) (Figure 3; ref. 30; Supplementary Information). Tropical sites in the PRISM3 data set collectively indicate a mean anomaly of +1.08°C. The aforementioned upwelling sites have a mean anomaly of +3.21°C. Removal of these sites greatly reduces the magnitude of the mean tropical ΔPRISM (to +0.4°C) and supports the Dowsett et al.28 finding that low latitude SSTs, away from upwelling centers, were indistinguishable from modern conditions.
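The effect of removing the upwelling sites on the tropical mean can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the total number of tropical sites is not stated in this section, so the value of 37 is an assumption chosen purely to show that the three reported means (+1.08°C, +3.21°C and +0.4°C) can be mutually consistent for the nine listed upwelling sites.

```python
def mean_after_removal(n_all, mean_all, n_removed, mean_removed):
    """Mean of the sites that remain once a subset is taken out.

    Works from counts and means only, so no per-site data are needed.
    """
    if not 0 <= n_removed < n_all:
        raise ValueError("n_removed must be non-negative and smaller than n_all")
    total_all = n_all * mean_all              # sum of all anomalies
    total_removed = n_removed * mean_removed  # sum of the removed anomalies
    return (total_all - total_removed) / (n_all - n_removed)


# ASSUMPTION: 37 tropical sites is a hypothetical count, not a value from the text.
# The nine removed sites are the upwelling sites listed in the paragraph above.
residual = mean_after_removal(n_all=37, mean_all=1.08, n_removed=9, mean_removed=3.21)
print(round(residual, 2))
```

Under that assumed count the remaining sites average roughly +0.4°C; a different site count would shift the result, so this is a consistency check rather than a reproduction of the published calculation.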

Documenting the tropical sea surface temperature stability in the Pliocene, particularly given evidence of substantial warming at high latitudes, is of considerable importance for the appraisal of the primary climate drivers for this period. The multi-model ensemble from the IPCC 4th assessment report shows that greenhouse gas forcing of less than a double-CO2 equivalent still yields a tropical warming of more than 1°C by the middle of this century55, while recent Pliocene simulations using the Goddard Institute for Space Studies (GISS) ModelE2-R, a Coupled Model Intercomparison Project Phase 5 (CMIP5) model, found that Pliocene tropical warming exceeded 1°C with as little as 405 ppm CO2 (ref. 53). Although the existence of a tropical thermostat mechanism cannot be entirely ruled out given uncertainties in modeled cloud responses, most of the IPCC GCMs would support the lower-end estimates for Pliocene atmospheric CO2. The only locations in the tropics where the PRISM3 data do indicate some warming tend to be in areas of present-day upwelling, so most simulations, which show consistent warming at all longitudes, are still too warm compared with proxies.

A second region of offset between the ΔMMM and ΔPRISM exists in the mid- to high northern latitudes. The majority of sites in Figure 3 (area c) are from the North Atlantic. The extensive warming at high latitudes in the North Atlantic has been repeatedly documented9,27,28,45 and is also seen in terrestrial archives of surface air temperature (e.g.56,57). Mid-latitude sites (410, 548, 552, 606, 607, 608 and 610) are the very highest confidence sites in the PRISM3 data set and the large inter-model spread in this region may be due to the highly variable position of the Gulf Stream—North Atlantic Drift and the resolution of the different models45. However, the magnitude of the warming in the mid- to high-latitude North Atlantic, in both ΔMMM and any ΔMODEL, is small compared to ΔPRISM. While the variability of ΔPRISM (a function of the within-time-slab variability at those localities) and the spread of the individual ΔMODEL results about the ΔMMM overlap in many cases ( Figure 3 ), there is no way to directly compare these uncertainty measures and there exists a consistent offset between data and models.

Four sites in Figure 3 (area c) are outside the North Atlantic. Sites 579 and 580 are located in the North Pacific Kuroshio Extension. Analysis of the diatom assemblages at both sites suggests moderate warming over present day conditions within the PRISM interval8,13. Some but not all PlioMIP simulations pick up this North Pacific warming (relative to PI conditions) ( Figure 1 ). New faunal data from ODP Site 1208, located beneath the Kuroshio Extension, show little change in mean annual temperature relative to present day but suggest greatly reduced seasonality (5°C), with winter temperatures warmer during the Pliocene ( Supplementary material ). This is essentially the same pattern exhibited by the Gulf Stream in the North Atlantic. This reduced seasonality is not demonstrated by PlioMIP simulations.

Sites 1014 and 1021, from the upwelling cell off the west coast of North America, show Pliocene warming (relative to today) of 6°C–8°C, documented by both faunal and alkenone palaeothermometry, which is not simulated by any of the models (area c of Figure 3). Reed-Sterrett et al.58 suggest the cause may be a shoaling thermocline, but an analysis of seasonal vertical temperature profiles from the MIROC4m and GISS-E2-R simulations shows no change to the thermocline between Pliocene and PI.

Sites 1236 and 1237 are both situated on the Nazca Ridge. Site 1236 is located just seaward of the main path of the cool, northward flowing Peru-Chile Current while Site 1237 is located near the eastern edge of the current and associated upwelling. Proxy SST data show warmer upwelling conditions during the Pliocene than exist today in the region, analogous to the upwelling region off the Pacific coast of North America ( Figure 2 ). The ΔMMM values do not capture this, but some PlioMIP simulations (GISS-E2-R and HadCM3) do show warmer regional anomalies ( Figure 1 ) and the MIROC4m simulation shows a deepening of the mixed layer during the Pliocene relative to the PI.

Large scale features

Several large-scale first-order features of the Pliocene climate can be found in both data reconstructions and model simulations. Models are generally in good agreement with estimates of Pliocene SST in most regions except the North Atlantic and upwelling regions in the tropics and subtropics ( Figures 1 , 2 ). Despite the complications of variability within the PRISM time slab and the spread shown by the different PlioMIP simulations, there is a fundamental divergence between the two data sets that increases with increasing latitude. This decoupling may stem from a number of sources, including the resolution of the models, differing model parameterizations of subgrid-scale features such as clouds, the highly variable nature of the mid-latitude North Atlantic, or the nature of the Gulf Stream – North Atlantic Drift Current45.

Polar amplification of SSTs is one of the principal signatures of the PRISM data set, with a maximum temperature increase observed in the North Atlantic. However, no such extreme is evident in the MMM data ( Figure 3 ). While individual models show various levels of polar amplification, none are equivalent in magnitude to the PRISM data. Conversely, low-latitude warming is common to all of the PlioMIP simulations, but is not observed in the PRISM data away from upwelling cells. Suggestions that increased oceanic and/or atmospheric heat transport during the Pliocene helped flatten the meridional temperature gradient are not unequivocally supported by the MMM data, because individual model simulations show a wide range of heat transport response59.

Although general ocean surface circulation, based upon global distribution of SSTs, appears to have been broadly similar between the Pliocene and today, patterns of warming seen in the PRISM data in both the North Atlantic and North Pacific could indicate that western boundary currents were more vigorous or that meridional overturning was enhanced. Unfortunately, climate models do not have a consistent Pliocene response with regard to either meridional overturning or ocean heat transport59.

Reduced seasonality, a phenomenon more characteristic of the tropics in the modern ocean, can be documented farther north during the Pliocene based upon analysis of palaeontological assemblages found in coastal regions (e.g. refs. 60, 61, 62).

In the Southern Ocean, poleward displacement of palaeo-fronts is documented in the PRISM data by changes in sedimentology, palaeontology and productivity14,16 that demonstrate the Pliocene Antarctic Polar Front was as much as 6° latitude further south than today. The corresponding adjustments of isotherms are in close agreement with the PlioMIP simulations, especially when the variability of both is taken into account (Figure 3).

Discussion

Features determined by the analysis of palaeontological and geochemical data as well as output of the PlioMIP simulations document the overall warming of surface waters of the Pliocene ocean.

Large-scale circulation (i.e., the existence of but not necessarily position of subtropical gyres) appears to have been similar in both data reconstructions and model simulations, but the differences in resolution between models and the variability introduced by the PRISM time slab averaging procedures tend to obscure important details. The PlioMIP simulations of mean annual temperature (MAT) do not pick up the magnitude of warming in upwelling regions documented by geochemical and palaeontological proxies in the PRISM data. An initial analysis of seasonal vertical temperature profiles from seven of the eight PlioMIP simulations in the Peru Upwelling region shows a simple temperature offset between PI and Pliocene, but no change to thermocline depth. Only one model, MIROC4m, shows a clearly deeper thermocline and concomitant mixed layer during the Pliocene. Tropical cyclones may play an important role in vertical mixing of the upper ocean63,64, but demonstrating that to be the case for the PlioMIP simulations will require analysis beyond the scope of the current study.

The overall ΔMMM field is in agreement with ΔPRISM values in most regions; however, models do not achieve the level of high-latitude warming seen in the North Atlantic and Arctic data. Reduced sea ice cover no doubt played a significant role in the high latitude warming of the Pliocene. Continued refinement of the model parameterizations and proxy data for Arctic sea ice may improve the agreement between PlioMIP simulations and warmth suggested by marine and terrestrial data.

The mid-latitude North Atlantic is highly variable today and this is reflected in the high variance associated with the ΔMMM in this region. Future experiments aimed at understanding additional forcings and incorporating seasonal rather than mean annual SST may help elucidate the spread in ΔMMM in the mid latitude North Atlantic region.

These PlioMIP simulations, performed with carefully controlled boundary conditions and protocols, point to the need for a more temporally refined data reconstruction with as broad geospatial coverage as possible. The next phase of PlioMIP simulations will be accomplished using a realistic near-modern orbital configuration relevant to future climate discussions65. If the oceanographic record responds strongly to certain orbital periods in certain regions, the time-slab methodology could produce a systematic bias in those regions. Similarly, the time-slab may integrate different seasonality in each measurement within the time-slab. In an attempt to reduce uncertainty, the PRISM4 palaeoenvironmental reconstruction will be correlated to marine isotope stage KM5c representing an order-of-magnitude increase in chronologic resolution66.

The PlioMIP model ensemble comparison to the PRISM3 palaeoenvironmental reconstruction of SST represents the first systematic data-model comparison for the mid-Piacenzian warm period. The differences we identify between proxy data and models, if real, present a significant challenge in assessing climate sensitivity beyond the current period. Since the Pliocene is an optimal test-bed for models, and in the face of current and future warming, it is imperative that the disagreement in amplitude of warming be explored in more detail, with a focus on reducing uncertainty in climate proxies as well as uncertainties in the models and their forcings.

Methods

PRISM Reconstruction

The PRISM mean annual temperature verification data set has 100 localities ranging from 77.9° South latitude to 80.4° North latitude and situated in every major ocean basin (Figure 2). Approximately 1/3 of the localities are confined to the Tropics and 1/3 are in the North Atlantic Basin. Twenty percent of the PRISM sites are located in the Southern Ocean. These data are derived primarily from quantitative faunal or floral analyses, augmented where possible with Mg/Ca and alkenone estimates45. The PRISM data have been examined in numerous articles, including refs. 1, 18, 30, 39 and 45.

PRISM mean annual SST estimates represent a warm-peak-average, which is defined as the warm phase of climate from the interval between 3.264 Ma and 3.025 Ma at each locality. The use of a time-slab of ~240 Ky duration introduces variability about the estimated warm phase of SST, expressed as the standard deviation of the warm peak estimates within the time slab at each locality (see Supplementary Information Table S1).
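As a rough illustration of the warm-peak-average idea, the sketch below takes an SST time series at one locality, keeps the samples inside the 3.264 to 3.025 Ma time slab, and reports the mean and standard deviation of the warm peaks. Several choices here are assumptions rather than the PRISM procedure: warm peaks are identified as simple local maxima, the sample standard deviation is used, and the function name and input values are hypothetical.

```python
from statistics import mean, stdev

SLAB_OLD_MA = 3.264    # older bound of the PRISM time slab
SLAB_YOUNG_MA = 3.025  # younger bound of the PRISM time slab


def warm_peak_average(ages_ma, ssts):
    """Return (mean, standard deviation) of warm peaks inside the time slab.

    ASSUMPTION: a warm peak is a local maximum of the SST series; the
    actual PRISM peak-picking criteria may differ.
    """
    in_slab = [
        (age, sst)
        for age, sst in sorted(zip(ages_ma, ssts))
        if SLAB_YOUNG_MA <= age <= SLAB_OLD_MA
    ]
    temps = [sst for _, sst in in_slab]
    # Interior samples warmer than the previous and at least as warm as the next.
    peaks = [
        temps[i]
        for i in range(1, len(temps) - 1)
        if temps[i] > temps[i - 1] and temps[i] >= temps[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("need at least two warm peaks to estimate variability")
    return mean(peaks), stdev(peaks)


# Hypothetical series for one locality (ages in Ma, SST in degrees C).
ages = [3.00, 3.03, 3.06, 3.09, 3.12, 3.15, 3.18, 3.21, 3.24, 3.27]
ssts = [18.0, 19.0, 21.0, 19.5, 22.0, 20.0, 20.5, 23.0, 21.0, 19.0]
peak_mean, peak_sd = warm_peak_average(ages, ssts)
print(peak_mean, peak_sd)
```

The standard deviation returned here plays the role of the vertical bars on the data anomalies in Figure 3: it describes variability among warm peaks within the slab, not analytical error on any single estimate.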

Data have been assessed using the λ-confidence scheme45. This provides a measure of confidence based upon chronology, sample density, sample quality and type, as well as performance of the quantitative method used. λ for included localities ranges from Very High to High to Medium, and the percentages of sites corresponding to each level are 29%, 34% and 36%, respectively. Figure 2 shows the spatial distribution and confidence of the 100 PRISM localities (see also Supplementary Table S1).

Models

The model simulations included in this paper are from CCSM4 (National Center for Atmospheric Research54), GISS-E2-R (NASA Goddard Institute for Space Studies53), HadCM3 (Hadley Centre for Climate Prediction and Research52), MIROC4m (Center for Climate System Research, National Institute for Environmental Studies, Frontier Research Center for Global Change44), COSMOS (Alfred Wegener Institute48), IPSL CM5A (Laboratoire des Sciences du Climat et de l'Environnement51), MRI-CGCM2.3 (Meteorological Research Institute and University of Tsukuba50) and NorESM (Bjerknes Centre for Climate Research49). Details of the models are found in Table 1 and in the corresponding publications. The GISS-E2-R, CCSM4 and IPSL CM5A models are the same versions used in IPCC 5th assessment future climate simulations. Coupled ocean-atmosphere model simulations for the PlioMIP experiment will be available from the PMIP3 project https://pmip3.lsce.ipsl.fr/.

Table 1 Details of coupled atmosphere-ocean climate models, boundary conditions, published climate sensitivity and ocean heat transport values59 from PlioMIP Experiment 2

In an attempt to standardize the PlioMIP simulations, all models were initialized and run using an identical (to the extent possible) experimental design and protocol ( Table 2 ; see also43). Of the eight models, three (MIROC4m, COSMOS and GISS-E2-R) used the preferred Pliocene boundary conditions, meaning each model's land/sea mask was altered to reflect a 25-meter increase in Pliocene sea level, the existence of ocean in place of the West Antarctic Ice Sheet and the removal of Hudson Bay (which is a result of Pleistocene ice sheet geography).

Table 2 Forcings and boundary conditions for PlioMIP Experiment 2 Pliocene and preindustrial simulations

For each model in the PlioMIP ensemble, the Pliocene simulations were run until the individual modeling groups determined that their models had achieved an equilibrium state; integration times varied from 500 to 1500 years ( Table 1 ). Twelve monthly SST fields were then averaged over the last 30 years of the run to develop a mean annual SST data set. Figure 1 shows the change in SST, Pliocene minus PI, simulated by each of the eight PlioMIP models. In this study, data points representing the PRISM localities are selected directly from the contributed datasets to avoid biases due to interpolation during re-gridding.
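The averaging and site-selection steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing code used for the PlioMIP fields: all function names are hypothetical, months are weighted equally, "selected directly" is interpreted as taking the nearest native grid cell, and land or missing-value cells are not handled.

```python
def mean_annual_field(monthly_fields, years=30):
    """Average the last `years` * 12 monthly fields, cell by cell.

    `monthly_fields` is a time-ordered list of 2-D grids (lists of rows).
    ASSUMPTION: every month carries equal weight.
    """
    n_months = years * 12
    if len(monthly_fields) < n_months:
        raise ValueError("run is shorter than the averaging window")
    tail = monthly_fields[-n_months:]
    n_lat, n_lon = len(tail[0]), len(tail[0][0])
    return [
        [sum(field[j][i] for field in tail) / n_months for i in range(n_lon)]
        for j in range(n_lat)
    ]


def nearest_lat_index(lats, lat):
    """Index of the grid latitude closest to `lat`."""
    return min(range(len(lats)), key=lambda k: abs(lats[k] - lat))


def nearest_lon_index(lons, lon):
    """Index of the closest grid longitude, allowing for wrap-around at 360."""
    def separation(k):
        d = abs(lons[k] - lon) % 360.0
        return min(d, 360.0 - d)
    return min(range(len(lons)), key=separation)


def sample_site(field, lats, lons, site_lat, site_lon):
    """Read the native grid cell nearest a locality, with no interpolation."""
    return field[nearest_lat_index(lats, site_lat)][nearest_lon_index(lons, site_lon)]


# Hypothetical 2 x 2 grid: 12 spin-up months followed by the final 30 years.
lats = [-45.0, 45.0]
lons = [0.0, 180.0]
base = [[10.0, 20.0], [0.0, 5.0]]
spin_up = [[[99.0, 99.0], [99.0, 99.0]] for _ in range(12)]
final_30yr = [
    [[v + (1.0 if m % 2 == 0 else -1.0) for v in row] for row in base]
    for m in range(360)
]
annual = mean_annual_field(spin_up + final_30yr)
site_sst = sample_site(annual, lats, lons, 40.0, -175.0)
print(annual, site_sst)
```

Reading the nearest native cell keeps each model on its own grid, which is the point of avoiding re-gridding, at the cost of sampling slightly different locations in models of different resolution.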

Comparison of anomalies

Analysis of the global and regional performance of the eight PlioMIP models is based upon comparison to the multiple proxy SST anomalies documented in45, which are referenced to the mid-20th century calibration of Reynolds and Smith67. We focus on comparing data and model SST anomalies rather than absolutes to avoid biases introduced by the effect of latitude on absolute SST (warmer at lower latitudes and cooler at higher latitudes). Testing the commonality between the anomalies produced by the model and data affords a more accurate test of the model response to Pliocene boundary conditions34,39.

Part of the model–data disagreement is related to differences in the control (PI) simulation for each model. We corrected each model anomaly by subtracting a term equivalent to the difference between the mid-20th century calibration used by the palaeontological data and PI conditions39:

$$\Delta\mathrm{PRISM} = \mathrm{PRISM}_{\mathrm{PLIO}} - \mathrm{RS}_{\mathrm{MODERN}}$$

$$\Delta\mathrm{MODEL} = \left(\mathrm{MODEL}_{\mathrm{PLIO}} - \mathrm{MODEL}_{\mathrm{PREINDUSTRIAL}}\right) - \left(\mathrm{RS}_{\mathrm{MODERN}} - \mathrm{HadISST}_{\mathrm{PREINDUSTRIAL}}\right)$$

where $\mathrm{PRISM}_{\mathrm{PLIO}}$ is the PRISM3 mean annual SST estimate at each locality (45,68, this paper); $\mathrm{RS}_{\mathrm{MODERN}}$ is the modern observed temperature at each PRISM locality determined from61; $\mathrm{MODEL}_{\mathrm{PLIO}}$ is the mean annual SST value sampled at each PRISM locality from the model SST field as described above; $\mathrm{MODEL}_{\mathrm{PREINDUSTRIAL}}$ is the mean annual SST value sampled at each PRISM locality from the model control simulation; and $\mathrm{HadISST}_{\mathrm{PREINDUSTRIAL}}$ is the mean annual SST sampled at each PRISM locality from a PI data set which is a hybrid of observational and modeled data69.
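The correction described in this section can be sketched as follows. The arithmetic is a reading of the text (the Pliocene minus control anomaly, less the difference between the modern calibration and PI conditions), and the function names and numerical values are hypothetical, chosen only to make the example easy to follow.

```python
from statistics import mean


def delta_prism(prism_plio, rs_modern):
    """Data anomaly: PRISM3 Pliocene estimate minus modern observed SST."""
    return prism_plio - rs_modern


def delta_model(model_plio, model_preindustrial, rs_modern, hadisst_preindustrial):
    """Model anomaly corrected for the differing reference periods.

    ASSUMPTION: the correction term is (modern calibration - PI data set),
    subtracted from the raw Pliocene-minus-control anomaly.
    """
    raw_anomaly = model_plio - model_preindustrial
    baseline_offset = rs_modern - hadisst_preindustrial
    return raw_anomaly - baseline_offset


def multi_model_mean(model_deltas):
    """Multi-model-mean anomaly at one locality."""
    return mean(model_deltas)


# Hypothetical values at a single locality (degrees C).
d_prism = delta_prism(prism_plio=21.25, rs_modern=18.75)
d_model = delta_model(
    model_plio=20.5,
    model_preindustrial=18.0,
    rs_modern=18.75,
    hadisst_preindustrial=18.25,
)
d_mmm = multi_model_mean([d_model, 1.5, 2.5])
print(d_prism, d_model, d_mmm)
```

With these illustrative numbers the raw model anomaly of 2.5°C is reduced to 2.0°C once the 0.5°C warming between PI and the mid-20th century calibration is removed, which puts the model and data anomalies on the same baseline before they are compared.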