Background & Summary

Since the pioneering work of D’Arrigo and Jacoby1,2,3, as well as Mann et al.4,5, temperature reconstructions of the Common Era have become a key component of climate assessments6,7,8,9. Such reconstructions depend strongly on the composition of the underlying network of climate proxies10, and it is therefore critical for the climate community to have access to a community-vetted, quality-controlled database of temperature-sensitive records stored in a self-describing format. The Past Global Changes (PAGES) 2k consortium, a self-organized, international group of experts, recently assembled such a database, and used it to reconstruct surface temperature over continental-scale regions11(hereafter, ‘PAGES2k-2013’).

This data descriptor presents version 2.0.0 of the PAGES2k proxy temperature database (Data Citation 1). It augments the PAGES2k-2013 collection of terrestrial records with marine records assembled by the Ocean2k working group at centennial12 and annual13 time scales. In addition to these previously published data compilations, this version includes substantially more records, extensive new metadata, and validation. Furthermore, the selection criteria for records included in this version are applied more uniformly and transparently across regions, resulting in a more cohesive data product.

This data descriptor describes the contents of the database, the criteria for inclusion, and quantifies the relation of each record with instrumental temperature. In addition, the paleotemperature time series are summarized as composites to highlight the most salient decadal- to centennial-scale behaviour of the dataset and check mutual consistency between paleoclimate archives. We provide extensive Matlab code to probe the database-processing, filtering and aggregating it in various ways to investigate temperature variability over the Common Era. The unique approach to data stewardship and code-sharing employed here is designed to enable an unprecedented scale of investigation of the temperature history of the Common Era, by the scientific community and citizen-scientists alike.

Methods

Collaborative model

The database is the product of a community-wide effort, coordinated by PAGES through a network of nine working groups (http://www.pastglobalchanges.org). Calls for participation were disseminated broadly and regional leaders solicited input from scientists with relevant datasets and/or expertise. A provisional database was compiled into a uniform framework, then redistributed to regional groups for quality control and further additions. For this purpose, quality control plots, including basic metadata for each record, were prepared to enable coauthors of this data descriptor to efficiently recognize and correct errors. Examples of these plots are given in the Quality Control section, and the full collection is archived as pdf files in Data Citation 1.

Data aggregation

The PAGES2k community aimed to identify records that are most relevant to understanding temperature evolution over the last 2000 years, while also assembling a uniform global database that can be culled to address a wide range of research questions. Specific criteria were developed to gather all published proxy records that meet five objective and reproducible criteria:

Thermal sensitivity

Proxy records were gathered from archive types for which previous understanding of the proxy system indicated that the records are temperature-sensitive. Records were only included when the original study described the relation between the proxy value and one or more climate variables, including temperature, or when the correlation with nearby instrumental temperature data was high enough to reject the null hypothesis of zero correlation at the 5% level, taking into account both temporal autocorrelation and test multiplicity. Indeed, temporal autocorrelation is well known to reduce the number of degrees of freedom available to reject a null hypothesis14. The test multiplicity problem (aka the ‘multiple comparisons problem’ or ‘look-elsewhere effect’) is the propensity for false positives to arise when multiple hypothesis tests are conducted simultaneously; in this case, testing the null hypothesis at 1,000 grid points with a 5% level would be expected to yield fifty spurious ‘discoveries’15 even in the absence of any link to temperature. Our analysis controls for both effects (see ‘Relationship to temperature’).

In addition, regional and proxy experts who are authors on this data descriptor certified that the records reflect temperature variability and that they meet all other stated criteria (Supplementary Table 1). Note that temperature sensitivity does not preclude the potential for many proxy systems to be secondarily or additionally sensitive to other environmental variables, such as moisture availability.

Record duration

A primary goal of the PAGES2k project is to understand climate dynamics over the entire Common Era. Records of this duration are most commonly accessible from sedimentary sequences that lack annual resolution; a minimum length of 500 years for these records serves as a coarse initial screen. For annually-banded terrestrial records (e.g., varves, glacier ice, tree rings), shorter-duration records that overlap with the instrumental period are important for calibration-validation exercises and for bridging between annually-resolved and lower-resolution records; as a result, annually resolved records from terrestrial archives over 300 years long were also included. Annually resolved records from marine archives (corals, molluscs) are rarely this long, but provide critical information where instrumental data are often sparse or absent, and were included in the database if they exceeded 50 years in duration.

Chronological accuracy

Most records in this database are layer-counted, with a dating uncertainty of a few percent or less, but generally extend back less than 500 years. Other proxy records may span many millennia but some have chronologies that are too uncertain for centennial and finer-scale paleotemperature reconstructions. Recognizing, however, that lake and marine sediments accumulate at approximately constant rates, and considering the goal of building a comprehensive database from which records can be culled as necessary, depending on the scientific question, the initial screen for chronological control was relatively coarse. Once suitable records have been identified, their age-model uncertainty can be quantified using existing statistical procedures16,17,18,19,20, providing a useful basis for including or weighting individual records in paleotemperature reconstructions. Namely, when annual layers cannot be counted, the timelines for records selected for this database were constrained by at least one chronological control point near the most recent end of the record and another near the oldest part of the record, or 1 CE, whichever is younger. Records that are longer than 1,000 years must include at least one additional age approximately midway between the other two. What constitutes ‘approximately’ was open to reasonable interpretation but was typically within two centuries of the mid-point.

Record resolution

PAGES2k scientific questions focus on centennial and finer time scales. Terrestrial and lacustrine records were included with average sample resolution of 50 years or finer. However, such records are rare from marine sediments, and thus a minimum average sample resolution of 200 years was accepted for this database. We also included 4 borehole records, although quantifying median resolution is less straightforward in boreholes than in other archives. The borehole records in the database are appropriate for examining decadal to multi-centennial scale variability, depending on the timeframe of interest21.

Public availability

Proxy records used in the PAGES2k synthesis products are publicly available through previous publications or online data archives, or because their owners made them available for inclusion in this open-access data product. The original data for 49 records are made available for the first time in this data product (specified in Supplementary Table 1). Open access is a critical component of this endeavor, and led us to reject some records that would have been suitable under the other criteria. The focus on annual- to centennial-scale temperature of the past 2000 years led to the exclusion of those paleoclimate records that did not meet the resolution or geochronological control criteria required for meaningful inferences of the temperature history of Common Era.

Relation to previous release

The selection criteria for this dataset are specific to the type of proxy archive; for some proxy types, the standards in this version were broadened compared to the criteria used previously by PAGES2k regional groups. In most regions, records have been added that have become available since the publication of PAGES2k-2013, or that were not used in the continental-scale reconstructions because they are not annually resolved and therefore did not conform to the reconstruction method used by a particular regional group. In Antarctica, for example, PAGES2k-2013 included only the longest annually resolved ice cores, whereas the present version includes shorter and decadal-scale-resolution records.

For other proxy types, more stringent criteria resulted in the exclusion of some records. The excluded records are tracked in Supplementary Table 2. In most regions, some records were excluded because they did not meet the stricter standards for the minimum length or temporal resolution (criteria detailed above), or because of ambiguities related to the temperature sensitivity of the proxy, or because they have been superseded by higher-quality records from the same site. Of the 641 records that together comprise the previously published PAGES2k datasets, 177 are now excluded, of which 124 are tree-ring-width series that are inversely related to temperature. To be included in the current database, tree-ring data were required to correlate positively (P<0.05) with local or regional temperature (averaged over the entire year or over the growing season). Trees whose growth increases with temperature (e.g., direct effect of temperature on physiological processes and photosynthetic rates) are more likely to produce a reliable expression of past temperature variability compared to trees that respond inversely to temperature, for which the proximal control on growth is moisture stress (e.g., evapotranspiration demand)22. Because many trees are more strongly influenced by moisture availability than by growing season temperatures23, including only the positive responders reduces the overall number of tree-ring records to a more selective subset (see Supplementary Information, section 1).

Metadata

The current database includes a large number of metadata fields to facilitate the intelligent reuse of the data. Table 1 (available online only) lists a subset of information in a single-page format. Supplementary Table 1 includes additional metadata fields with critical information to convey the appropriate use of each dataset, namely: the PAGES2k identifier assigned for this data product, the identifier used in previous PAGES2k products by the Ocean2k working group12,13, or by PAGES2k-2013, whether the record is superseded by another in this version, the archive type, the primary publication citation, its associated digital object identifier (DOI; if one exists), the secondary publication citation and DOI, the URL link to where the data were archived by the original author, the associated data citation, the geographic coordinates (latitude, longitude, elevation), the name of the site, the ISO 3166-1 standard name of the country/ocean basin where it is located, the earliest and latest years covered by the record, the resolution of the time series (median spacing between consecutive observations), the type of proxy observation, the name of the variable used as the temperature-sensitive time series and its units, the physical feature whose temperature is sensed by the proxy (e.g., surface air temperature, sea-surface temperature), the part of the seasonal cycle recorded by the proxy, the direction of the relationship between the proxy and temperature (positive or negative), quality control (QC) comments, initials of PAGES2k Consortium author who performed QC certification, and a permalink to the dataset’s page at the NCEI-Paleo/World Data Service for Paleoclimatology.

Table 1 Proxy series included in this synthesis

Annualization

Annualization is necessary to compare proxies of varying sampling resolution with instrumental observations or with each other. Records with a superannual resolution were interpolated to annual resolution via nearest-neighbor interpolation. Interpolating records may alter their spectral content24, but permits comparison of information on a common time grid and for shared spectral resolutions.

Seasonally-resolved proxies (e.g., most corals) were averaged to produce annual (ANN: Jan-Dec), DJF (December January February) and JJA (June July August) anomalies. Some records from glacier ice have a sampling resolution finer than 1 year. However, firn diffusion smooths subannual signals, such that the shortest recoverable periodicity is generally no shorter than 1 year25,26. Such records were therefore annualized to a Jan-Dec window.

The vast majority of records in the database are annually-resolved, and are not affected by this processing. We note, however, that many such records subsample part of the annual cycle (e.g., for tree rings, the growing season). For this purpose it is instructive to compare such records to annual (Jan-Dec), DJF and JJA averages of the HadCRUT4.2 temperature field (Technical Validation). The annualized data are archived alongside the original data, so either may be used in subsequent analyses.

Code availability

The Matlab code (https://www.mathworks.com/products/matlab.html) necessary to reproduce the figures of this descriptor is available at https://github.com/CommonClimate/ PAGES2k_phase2 under a free BSD license.

Data Records

Proxy dataset

The PAGES2k temperature database (Fig. 1, Supplementary Fig. 1) includes 692 records (Data Citation 2 to Data Citation 477) from 49 countries and 11 distinct types of archives: 415 from trees (ring width and density), 96 from corals (e.g., isotopes, elemental composition, calcification rate), 58 from marine sediments (e.g., geochemistry, floral and faunal assemblages), 49 from glacier ice, 42 from lake sediments (e.g., floral and faunal assemblages, sediment accumulation, geochemistry), 15 from documentary sources, 8 from sclerosponges, 4 from speleothems, 3 from boreholes, 1 from bivalves, and 1 hybrid (tree/borehole) record. Each of these archives bears the imprint of a proxy system responding to temperature changes, with the signal recorded in one or more of the archive’s chemical, physical, or biological properties27. The details behind the collection, analysis and interpretation of each of the records in the database are beyond the scope of this data descriptor, and we refer readers to the original publications for that information.

Figure 1: Spatiotemporal data availability in the PAGES2k database.
figure 1

(a) Geographical distribution, by archive type, coded by color and shape. (b) Temporal resolution in the PAGES2k database, defined here as the median of the spacing between consecutive observations. Shapes as in (a), colors encode the resolution in years (see colorbar). (c) Temporal availability, coded by color as in (a).

The records cover a wide range of time spans, from a minimum of 52 years to a maximum of 2000 years. The average length is 760 years, the median 547 years, not counting the duration of any record beyond 2000 years; temporal resolution ranges from biweekly to centennial, with a majority of annual records. As seen in Fig. 1, many proxy records spanning the last 2000 years are not annually resolved, and in some regions, most of the available records of any length lack annual resolution. The mean resolution of non-tree archives is 11 years, the median 1 year. For sedimentary archives the mean and median resolutions are 25 and 18 years, respectively. A list of sites comprising the database, along with basic metadata, is presented in Table 1 (available online only), an expanded version of which is in Supplementary Table 1. Note that some sites include more than one proxy temperature record.

The majority (59%) of the records are based on tree rings because they are annually resolved, precisely dated, and geographically widespread, especially in the mid-latitudes of the Northern Hemisphere (Supplementary Information, section 1). The PAGES2k collection is unique among previous efforts in the amount of paleoclimate evidence from sources other than tree rings, such as lake and marine sediments, corals, glacier ice and speleothems, thus expanding the geographic and temporal coverage of the database, as well as mitigating potential issues regarding the use of tree rings for temperature reconstructions28.

While the vast majority of the records gathered herein were layer-counted, there are 87 sediment (marine or lake) datasets whose chronologies are derived from radiometric methods. For 41 of those datasets (47%), Data Citation 1 includes the primary geochronological information needed for a formal treatment of time uncertainty using various age-modelling techniques19,29,30. Additionally, 30 records (overlapping, but not exclusively, with the 41 above) include chronology ensembles from the Arc2k 1.1.1 dataset31. These include both sedimentary records with age ensembles derived via BACON19, and ice and varved records with age ensembles derived via BAM32.

For comparison, Fig. 2 displays the spatiotemporal distribution of proxy archives in the databases of Mann et al.33,34 (hereafter M08) and PAGES2k-2013 (ref. 11). While the M08 database contains 75% more records than this collection, these records are overwhelmingly land-based, from the Northern Hemisphere, and relatively short. Indeed, the M08 database is disproportionately composed of tree rings from North America, many of which start after 1000 CE, so that fewer than 100 records reach beyond this date. In contrast, the present collection contains 176 records out to 1000 CE, most of which are not tree-based. While the PAGES2k-2013 effort had succeeded in diversifying the network prior to 1,000 CE, it focused on terrestrial sites, and was dominated by tree-based records after 1200 CE. The proportion of records from the Southern Hemisphere is comparable between all three databases (15% in M08, 12% in PAGES2k-2013, 16% in this study), but the number of records from Antarctica has steadily improved between databases (8 in M08, 9 in PAGES2k-2013, 26 here). The present dataset therefore constitutes a major leap in terms of the diversity and duration of records, as well as oceanic and polar coverage. The present compilation also marks an unprecedented effort at rigorously assessing their quality as temperature indicators (Technical Validation). While the overall quantity of records has declined with respect to M08, this is largely the result of more selective inclusion criteria (Methods).

Figure 2: Data availability in the M08 (refs 33,34) and PAGES2k-2013 (ref. 11) databases.
figure 2

Graphical conventions as in Fig. 1. Note that the y-axis scale varies between plots on account of the large differences in number of records, but the first millennium inset uses the same scale between all comparable panels in Fig. 1 and this one, to highlight progress made in bolstering coverage of this time period.

Indeed, a unique aspect of the PAGES2k effort is the richness of the metadata annotating each record. In Supplementary Table 1, all proxy records are accompanied by information about their paleotemperature interpretation, including where the proxy senses temperature (e.g., surface-air temperature, sea-surface temperature), the sign of this relationship (positive or negative), and the part of the annual cycle that is preferentially recorded (e.g., May June July). Some of the records from marine sediments were processed for additional quality control as described by ref. 12. The ‘QC Notes’ column of Supplementary Table 1 specifies data processing that was done, and explains modifications relative to the original data citation. In addition to the metadata in Supplementary Table 1, which are complete for every record, Data Citation 1 includes additional metadata for some records. The type of additional information depends on the proxy record and some of the information is missing for some records. For example, when available, the basis for the temperature interpretation is stated (e.g., calibration or first principles). Some records that were calibrated to temperature (e.g., ref. 35) include the native data from which the temperature series was derived, as well as a description of the calibration (equation, reference, uncertainty, units). This metadata structure follows the Linked Paleo Data (LiPD) structure, and the interested reader is referred to the associated publication36 for a full exposition of the format.

Accordingly, the database is primarily encoded as LiPD36 files: a structured, machine-readable format for paleoclimate data based on Javascript Object Notation (JSON) that accommodates the wide diversity of information comprising this database (PAGES2k_v2.0.0_LiPD.zip, Data Citation 1). Serializations are also available in the Python (PAGES2k_v2.0.0-ts.pklz, Data Citation 1), Matlab (PAGES2k_v2.0.0.mat, Data Citation 1) and R (PAGES2k_v2.0.0.Rdata, Data Citation 1) languages. Utilities for interacting with LiPD files in Matlab and Python are available at http://github.com/nickmckay/LiPD-utilities. Utilities in R are forthcoming.

Instrumental temperature dataset

The ability of the proxy network to capture temperature information is assessed with respect to the instrumental HadCRUT4.2 dataset37, covering CE 1850–2014. The dataset merges surface air temperature over land (CRUTEM4) and sea-surface temperature over ocean regions (HadSST3). We use the Cowtan & Way version38 of the dataset, which corrects for missing values and incomplete post-1979 Arctic coverage via the use of satellite observations. Even with the correction, the HadCRUT4.2 dataset is incomplete, with about 60% of the monthly values missing, so the remaining missing values were infilled via the GraphEM39 algorithm. The graph was chosen via the graphical lasso40 using a sparsity parameter of 0.7%, which was chosen by cross-validation as the minimizer of the expected prediction error (HADCRUT4_median_GraphEM.mat, Data Citation 1).

The global (area-weighted) mean from this dataset is charted in Fig. 3. We note that this dataset may result in temperature variations whose amplitude is biased downwards in regions of poor observational coverage, hence potentially distorting proxy-temperature correlations. Regionally-specific temperature datasets (e.g., ref. 41 for Antarctica) would therefore be more appropriate in regional applications.

Figure 3: Global mean temperature from the HadCRUT4.2 dataset before (gray) and after (red) imputing missing monthly values via GraphEM.
figure 3

Black circles mark the yearly averages (mean annual temperature, or MAT) of GraphEM-imputed temperature values (red line).

Technical Validation

A unique challenge for technical validation of paleoclimatological datasets is that the target, here, site-local temperature over the Common Era, is unknown. Addressing this issue is an important objective of the current study. Our approach to validation includes comparison with the instrumental data for annually-resolved records, subsampling the dataset to assess reproducibility among proxy types and other subsets based on different screening criteria, and coarse-graining the time series to different extents to address issues related to combining records of different resolution and age certainty. Evidence that the records in the database reflect past temperature variability can be found in the original publications associated with each record. In addition, each series incorporated in the dataset was examined by one or more regional experts, who certified that each proxy record included in the database was accurate and related to temperature (Supplementary Table 1). This level of expert elicitation is unique among existing paleoclimate syntheses covering the last two millennia, and is a key value proposition of the PAGES2k crowd-curation process.

Quality control

To facilitate quality control of individual records within the database, dashboards displaying raw data, their annualized version and the extent to which they may be informative of annual, JJA or DJF temperature were created. These figures are grouped by region or globally and included on the FigShare repository associated with this publication (Global_QCfig_bundle.pdf, Data Citation 1).

Fundamentally there are two ways to infer past temperatures from paleoclimate records. They can be calibrated using either:

  1. 1

    direct (in-time) calibration; or:

  2. 2

    indirect (space for time) calibration.

In the first approach, the record must overlap with the instrumental period (here: 1850–2014), and this period of overlap must contain enough points for a statistical calibration to an instrumental temperature product such as HadCRUT4 to be meaningful. In the second approach, one often uses transfer functions or laboratory-based culture experiments. Accordingly, summary plots for all records are divided into two categories, described below. The instrumental overlap threshold requirement is set at n=20 based on sensitivity tests (not shown). This parameter may be changed in the code associated with this dataset (see Code Availability).

Records that can be calibrated in time

Record Ocn_114 (ref. 42) (Fig. 4) is one such example. The top panel shows the (monthly) raw data as gray circles and an annualized curve whose color code matches that of Fig. 1a. The three bottom plots depict correlations with temperature grid points taken within a 2,000 km radius. The bottom left plot shows correlations with mean annual temperature (MAT), with insignificant correlations (as per an isospectral test43, 1,000 surrogates, 5% level) denoted by hatching. The local correlation is −0.59 and its bold font weight indicates statistical significance, also at the 5% level. Similar plots are shown for boreal summer (JJA, center) and boreal winter (DJF, right). Essential textual metadata are displayed on the right hand side. Similar plots follow identical conventions.

Figure 4: Quality-control plot for record Ocn_114 (Ocean_QCfig_bundle.pdf, Data Citation 1).
figure 4

See text for details as an example of record that can be calibrated in time.

Records that cannot be calibrated in time

If a record has too coarse a resolution, or ends too early, to contain 20 points over the 1850–2000 interval, it belongs to this category. One such record is Ocn_015, a foraminifera Mg/Ca record from the Caribbean35 (Fig. 5). This record was independently calibrated to temperature, as reflected in the ordinate of the time series plot (top left). As before, the right side of the page displays metadata, including the calibration method and its reported uncertainties. Since a comparison to an instrumental temperature series is neither possible nor meaningful for such a record, the bottom left panel displays its correlation to the 10 nearest high-resolution records (bottom left), coded by lines whose color represents the absolute correlation. Thick, solid lines represent significant correlations (again, as per ref. 43 at the 5% level), and thinner dashed lines represent correlations that did not pass the test. The bottom central panel stratifies these correlations by distance; the color corresponds to the proxy code (i.e., corals in orange, sclerosponges in red, c.f. Fig. 1a), with significant correlations circled in black. The size of the symbol scales with the number of years of overlap, as shown at the bottom right.

Figure 5: Quality-control plot for record Ocn_015 (Ocean_QCfig_bundle.pdf, Data Citation 1).
figure 5

See text for details as an example of record that cannot be calibrated in time.

Relationship to temperature

Here we examine the extent to which the database as a whole captures the observed temperature variability at local and regional scales. We do so via correlation analysis, which makes the common assumption that the relation between the proxy value and temperature over the twentieth century is representative of the entire record (stationarity). Unstable or multivariate associations between proxies and local temperature would represent a significant challenge to this assumption; however, this problem is not unique to paleoclimatology within the Earth sciences. The approach also assumes that the observational temperature time series itself is accurate and unbiased for each proxy site, which may not be true in areas of sparse coverage or complex topography.

The relationship between the current proxy database and the global temperature field is quantified via Pearson’s linear correlation coefficient (R) between proxy values and temperature averages (ANN, JJA, DJF). Statistical significance is established via a non-parametric, isospectral test43, which accounts for the loss of degrees of freedom due to large serial correlations common to proxy time series. Again, we restricted correlation analyses to records comprising a minimum of 20 samples over the instrumental era (CE 1850–2014), which limits the pool of proxies that may be evaluated in this way (n=597).

Regional screening

First, we search for significant correlations (P<0.05) within a search radius rs, ensuring that correlations are local to regional. Compared to a global search, this limits the extent to which spurious correlations may arise, for instance, due to strong trends. Since spatial correlations are non-uniform and highly anisotropic, using a distance-based criterion that is uniform over the globe represents an oversimplification. No single distance is likely to be globally optimal, so its choice reflects a compromise between various factors: autocorrelation in land versus ocean temperatures, annual versus longer resolution, or seasonal biases. With rs=2,000 km44, 411 records show significant correlations with annual temperature—their absolute values are shown in Fig. 6a and their locations are shown in Supplementary Fig. S3. Results change modestly depending on the value used for rs.

Figure 6: Relationships with temperature and other records.
figure 6

Median absolute correlations (a) of each record with mean annual temperature within a 2,000 km radius; (b) of each record with mean annual temperature at the nearest grid point; (c) of each low-resolution record with its 10 closest high-resolution proxy neighbors; (d) of temperature at each grid point and its proxy neighbors within a 2,000 km radius. Proxy-centric correlations (ac) are reported in color if significant; as small black symbols if insignificant or not applicable. Grid-centric correlations (d) are reported in color if significant; in grey if insignificant or not computable (i.e., no proxy neighbor within 2,000 km).

Regional screening adjusted for the false discovery rate

Searching for potentially hundreds of suitable correlations within such a search radius runs the risk of false discoveries45. The problem of multiple hypothesis tests has long been known to statisticians and several solutions exist46. We use a method based on the false-discovery rate (FDR)47, adapted to the climate context48,49. In all, 277 records passed this test with annual data (Supplementary Fig. 4).

Local screening

The search may be further restricted to the nearest HadCRUT4.2 grid point. The results of this evaluation are mapped in Fig. 6b. In some cases this may be problematic because sites may sit at the boundary between grid cells. For sites located in the vicinity of frontal zones with large spatial temperature gradients, choosing the most appropriate neighbor can be particularly difficult. Gridded temperature data may represent observations from a range of elevations or environments, and therefore may not be representative of the archive’s actual location. Furthermore, the nearest grid point can in some instances be located thousands of kilometers from a site, because of the incomplete coverage of HadCRUT4. This limitation is particularly acute for Antarctic records, because of poor instrumental coverage over the Southern Ocean and Antarctic continent. A total of 181 records passed this test with annual data, including 5 from Antarctica, versus 9 in the regional+FDR case (Supplementary Fig. 5).

Seasonal effects

The extent to which proxies are informative of annual temperature depends, sometimes very strongly, on the portion of the annual cycle which they preferentially sample7. Thus, before using seasonally-dependent proxies to reconstruct mean-annual temperature, one must ascertain the relationship between seasonal averages and the annual mean.

Supplementary Fig. S6 explores how much of the mean annual temperature (MAT) signal can be explained by boreal summer (JJA, top) versus winter (DJF, bottom) averages in the HadCRUT4.2 dataset. Correlations are generally very high (>0.8) in the tropics, where the MAT range is small, and low in the extra-tropics, particularly over northern hemisphere continental interiors for JJA, where the MAT range is large and dominated by winter synoptic variability. This means that proxies that preferentially record summer conditions may be adequate predictors of the annual mean if they are located in the tropics, but (all other things being equal) less so if they are located on Northern Hemisphere continents. Extratropical winter variability is known to dominate the annual average44, so correlations to the MAT primarily reflect winter conditions in those regions.

Table 2 summarizes the result of the aforementioned correlation-based screening for the three approaches (regional, regional with FDR control, and local), as well as the part of the year that goes into the annual average: ANN (calendar year), DJF, JJA, or April-March (AMA). The results make it clear that some proxy records are sensitive to JJA temperature, but not to DJF or annual temperature. The vast majority are tree-ring records from the Northern Hemisphere.

Table 2 Number of records retained in the correlation-based screening depending on the method used and the part of the annual cycle selected.

Relationship to other proxies

A total of 95 proxies could not be directly correlated to MAT, either because they ended prior to 1850 or because they featured too few samples after this date. As an alternate validation method, we searched for significant correlations among the 10 closest neighbors from within the proxy dataset that can be correlated with HadCRUT4 (colored dots on Fig. 6a), and reported these significant correlations, along with their magnitude, in Fig. 6c. To minimize issues related to correlating time series with very different resolutions, the time series of proxy neighbors were smoothed to match the resolution of the target proxy. In regions where proxy-record density is high, this is a reasonable approach to assess the mutual consistency between various series; in sparsely sampled regions, this approach implies that proxy series that cannot be directly correlated to instrumental temperature must look like those that can, even if they belong to different climate settings. Moreover, despite the precautions taken with the isospectral test, correlations between a low-resolution record and its high-resolution neighbors are often driven by trends, even if no geophysical connection is present. Correlations between high- and low-resolution records must therefore be interpreted with caution. With these caveats in mind, 54 of the 95 proxies showed a significant correlation to neighboring high-resolution proxies.

Grid-based spatial correlations

An alternate way to evaluate the extent to which variability in the global temperature field is captured by a heterogeneous network of paleoclimate proxies is to quantify the correlation between each grid cell’s instrumental temperature time series and that of all the proxy series within the radius rs50. Viewed in this manner, the statistical relationship between the regular grid and the irregular proxy network provides information about the extent to which different regions of the global temperature target field are represented by the paleoclimate data, and the strength of that relationship. The results of this evaluation are shown in Fig. 6d. It shows that surface temperature over 73% of the planet is significantly correlated to a proxy time series within a 2,000 km radius-about twice as great an area as covered by the previous PAGES2k compilation11.

Global trends

Having quantified the degree to which proxy records from this dataset respond to temperature, we now synthesize the largest-scale thermal signal embedded therein. We do so by use of composite time series, which efficiently summarize the global trends captured in this large and diverse collection. Composites allow us to readily compare signals contained in various subsets of the database; these comparisons, in turn, are an essential check on the mutual consistency of the temperature proxies across regions, geographic settings, and proxy archives.

Our focus here is purposefully general, centered on multidecadal to centennial time scales and ignoring the spatial features. This simple approach is intended as a preliminary estimate of global mean temperature fluctuations over the past 2000 years, and sets the stage for future community endeavors. Indeed, several PAGES2k working groups are currently working to generate spatially resolved reconstructions of annual or seasonal temperature fluctuations at regional to global scales, as well as cross-validated estimates of global mean surface temperature using a variety of statistical approaches. The composites allow this database to be placed in the context of past reconstruction efforts, and to serve as a benchmark for future ones.

Following recent compilations1113, we average all records (scaled to unit variance) into a composite. We do so at a coarse resolution by applying a simple binning procedure. Compositing makes two implicit assumptions:

  1. i

    all proxy records are linearly related to global, mean-annual temperature.

  2. ii

    all proxy records are equally representative of global mean-annual temperature at any given time, and are thus given equal weight.

Given suitable transformations, (i) may be satisfied for a broad class of proxy records, even very nonlinear ones51. Assumption (ii) is more problematic, for three reasons. First, as Fig. 1a shows, the network is dominated by tree rings from the northern midlatitudes, whose temperature sensitivity is primarily linked to the growing season (boreal summer), representing only a fraction of the annual variance. Second, the mix of proxies is also non-stationary (Fig. 1c). Coral records are relatively abundant over the instrumental era but practically absent prior to 1,600 CE. Tree rings dominate the network after around 1,400 CE but less so prior to that. Finally, the majority of records have annual (or better) resolution, but some records have median resolution on the order of 100 years (Fig. 1b). The information density per unit time of such records is thus quite different. Furthermore, dating uncertainties in low-resolution (non layer-counted) proxies are not quantified, but are mitigated to some extent by multidecadal binning.

Despite assumptions (i) and (ii) above, and their potential violation, we suggest that a simple treatment of the data constitutes an informative appraisal of the largest-scale thermal signal embedded in the dataset. We emphasize, however, that the above concerns are all legitimate, and that more rigorous treatment of these assumptions should and will be applied in formal temperature reconstructions. Compositing involved the following processing steps:

Sign adjustment

Records were multiplied by −1 if their values decrease with increasing temperature (i.e., if their interpDirection parameter is negative); by +1 otherwise. This step ensures that all proxy values point upward (downward) in response to warming (cooling).

Normalization

Records were mapped to a standard normal distribution via inverse transform sampling51, resulting in zero mean and unit variance.

Binning

Since the main focus of this composite is on low-frequency (decadal and longer) variability, all records were averaged in bins of 25, 50, and 100 years. Binning also mitigates the effect of age uncertainties, as it is known that even small age offsets between annual records could otherwise cause large spurious trends in composites made from them32.

Scaling

Standardized composites were scaled to temperature over identical bins.

Screening

For high-resolution records (HR: median resolution finer than 5 years), we applied either no screening (none), regional temperature screening (regional), or regional screening adjusted for the false discovery rate (regionalFDR). For low-resolution records (LR: median resolution coarser than or equal to 5 years), basicFilter denotes records that comprise at least 20 values over the Common Era (Supplementary Fig. 2), while hrNeighbors denotes records with at least one significantly correlated HR neighbor (see above for the caveats of this approach).

Bootstrap

Uncertainties in the composite are quantified via a bootstrap approach52. This assumes exchangeability, and primarily measures sampling uncertainty. We plot 95% confidence intervals derived from an ensemble of 1,000 bootstrap samples; in general, such intervals widen with proxy attrition, as expected.

Sensitivity analysis

Figure 7 presents the composites (HR in gray, LR in blue) and the HadCRUT4.2 target (red) scaled to temperature. Cases presented in the left column applied no screening, while the right column explores combinations of screening and binning interval. A striking feature is that in all cases, both HR and LR composites display a long-term cooling trend until the 19th century, after which an abrupt warming takes place, consistent with a very large body of literature5,8,11,12. We also note that temperature variability decreases with increasing bin size, as would be expected for data with random and independent errors.

Figure 7: Global composites for various binning intervals and screening criteria, as indicated in subplot titles.
figure 7

The composites are scaled to temperature for comparison, and the shading denotes 95% bootstrap confidence intervals with 500 replicates, to constrain uncertainties. The cutoff between high-resolution (HR) and low-resolution (LR) records is defined as a median resolution of 5 years. Screening options comprise: no screening (none), regional temperature screening (regional), or regional screening adjusted for the false discovery rate (regionalFDR). For low-resolution records, basicFilter denotes records that comprise at least 20 values over the Common Era (Supplementary Fig. 2), while hrNeighbors denotes records with at least one significantly correlated HR neighbor.

We find the main results robust to screening choices, with the exception of the case in Fig. 7f (regionalFDR, hrNeighbors). The latter shows the most discrepancy between HR and LR, mainly because the number of LR proxies is very low (n=22) and they have little overlap with the instrumental era, making their temperature calibration unstable. In all cases the HR composites display slightly shallower variations than LR composites. There are two non-exclusive explanations for this. Firstly, it is known that some HR records, particularly the tree-ring chronologies that form the majority of this subset, can be limited in their ability to capture low-frequency variability beyond the mean segment length53. Second, LR records are known to redden climate signals, often exaggerating low-frequency variability at the expense of high frequencies54. Our analysis cannot distinguish between these two possibilities.

It is important to consider whether any of the primary features of the composite series are strongly controlled by a particular subset of proxies, or if they are shared among archive types. There are many potential ways to analyze this dataset. We give but one example in Fig. 8, gathering composites from individual archive types that include 5 or more records among the proxy collection: corals, documentary archives, glacier ice, lake and marine sediments, as well as trees. For this case we apply regional HR screening and basic LR filtering, then average records from coral, documentary, glacier ice, lake sedimentary, marine sedimentary, and tree-ring archives.

Figure 8: 50-year binned composites stratified by archive type, for all types comprising 5 or more series.
figure 8

Composites with fewer than 10 available series are shown by a dotted curve, while solid lines indicate more than 10 series. Shading indicates 95% bootstrap confidence intervals with 500 replicates. Gray bars indicate the number of records per bin. The composites are expressed in standard deviation units, not scaled to temperature.

Most composites show a strong twentieth century warming trend that emerges above the variability of comparable centennial trends over the last two millennia. This is clearest in the tree- and coral-based composites, despite very large uncertainties in the latter during the seventeenth century, due to the paucity of records (Figs 1 and 8). An exception to this pattern is in the marine sediment composite (Fig. 8), which shows a cooling trend through most the Common Era. This may be explained by the low resolution of marine sediment records noted earlier, and the process of bioturbation of the sediment archive. These factors diffuse and damp changes occurring over years and decades (e.g., ref. 12), including the most recent warming. Local oceanographic factors may also play a role12,55.

Uncertainties in these composites include changes in sample size and available data network over time, the potential for non-climatic or non-temperature influences to bias these smaller subsets of the dataset, and the high spatial heterogeneity of subsample networks (Fig. 8). In general, uncertainty bands widen back in time (cf tree composite), with the notable exception of the marine sediment and documentary composites, which show widening bootstrap intervals in the last 2 bins, coincident with a drop in observational coverage in these archives. Note that multidecadal trends present in coral δ18O records from the eastern tropical Pacific may not be driven by temperature13,56,57 possibly biasing the trend of this coral composite. The network of lake records is regionally constrained (Fig. 1), and that composite may contain multiple environmental influences beyond temperature. As a result, from the lake subset only, we cannot exclude the possibility of above-modern levels of warmth in the third century CE, though uncertainty bands for early centuries are wide, and the recent rate of warming is clearly unprecedented over the Common Era.

The global composites derived from this dataset, despite their simplicity, supersede the composite-of-opportunity published in the last synthesis11, which was an average of regional indices obtained by very different means (hence not statistically homogeneous) and did not include the majority of the marine records gathered here. Nevertheless, the present composites share many similarities and some of the same caveats; namely, that a composite tends to give more weight to numerically abundant records (e.g., tree rings), and regions with more abundant observations (e.g., the Northern Hemisphere continents). An in-depth analysis of these composites, along with their climatic interpretation, will be the subject of a companion paper.

Usage Notes

Data Citation 1 gathers data records in multiple digital formats, as well as quality control dashboards for all PAGES 2k regions (Table 3). This collection is the cornerstone of current and future efforts by the PAGES 2k Consortium to better reconstruct surface temperature, attribute its variability to climate forcings, understand its relationship to other components of the climate system, and constrain model simulations. It is appropriate for many purposes, ranging from developing reconstructions of climate indices (e.g., global mean surface temperature, NINO3.4) and fields, to proxy-proxy and proxy-model comparisons, and it was designed to be functional and relatively inclusive so that appropriate records could be selected, depending on the intended purpose, of which some are presently unforeseen.

Table 3 Contents of the FigShare repository associated with this descriptor.

The 692 temperature-sensitive records described and validated in this manuscript were selected based on the criteria listed above. In addition to these records, Data Citation 1 contains 2,240 ancillary time-series data from the same sites. Most (87%) are associated with tree-ring records from North America, including raw measurements, sample density and expressed population statistics; some are the native observations used to derive the temperature reconstructions included in the restricted group of 692 (e.g., Mg/Ca of foraminifera for sea-surface temperature); others are not directly related to climate but represent environmental changes at the site that might be useful in interpreting the climatic significance of the record (e.g., sedimentary magnetic susceptibility); some are proxy climate records that are sensitive to climate variables other than temperature. These 2,240 records are all timeseries, in that the datasets are year/data pairs. These ancillary time series are provided ‘as is’; the authors make no claims or guarantee as to their scientific usage.

Within Data Citation 1, the 692 temperature-sensitive records that comprise v2.0.0 are each assigned a unique PAGES2k identifier, as listed in Table 1 (available online only) and Supplementary Table 1. The 2,240 ancillary records are not assigned PAGES2k identifiers. In addition, the 692 records are easily discoverable in Data Citation 1 (PAGES2k_v2.0.0_LiPD.zip, PAGES2k_v2.0.0.mat, PAGES2k_v2.0.0.RData, PAGES2k_v2.0.0-ts.pklz) by querying the metadata property ‘paleoData_useInGlobalTemperatureAnalysis’, which is set to ‘TRUE’ only for the 692 temperature-sensitive records described here.

Several factors stand in the way of the PAGES2k compilation being fully comprehensive: records are continuously being generated and published, while some existing records are not publicly archived. This synthesis represents a major community effort to compile data records and captures a substantial majority of relevant records; it is to be continuously expanded and curated by the PAGES2k community. In addition to Data Citation 1, the entire database will be made available as part of a more comprehensive, web-based data management platform (http://linked.earth). This cyberinfrastructure facilitates crowd curation, transparent discussions of proxy interpretations, tracking and versioning of paleo data, and is supported by the first paleoclimate ontology (http://linked.earth/projects/ontology). In the near future, the current PAGES2k temperature dataset will be integrated with other paleoclimate datasets in this platform—for example, one dedicated to water isotopes (Iso2k58)—to enable the data-intensive studies of the last 2,000 years envisaged by the PAGES2k community59. The LinkedEarth cyberinfrastructure will enable crowdsourced additions and edits to the database, allowing it to be a living entity, with careful versioning to ensure workflow reproducibility.

Our versioning scheme is as follows: the version number for a data compilation is of the form C1.C2.C3, where C1 is a counter associated with a publication of the dataset (e.g., ref. 1), C2 is a counter updated every time a record is added or removed, and C3 is a counter updated every time a modification is made to the data or metadata in an individual record. The dataset published here is thus v2.0.0 of the PAGES2k proxy temperature dataset. Future versions of the dataset, along with a change log that specifies the modifications associated with each new version, will be posted at http://wiki.linked.earth/PAGES2k. This versioning applies only to the temperature-sensitive records in Data Citation 1; changes to ancillary time series are not tracked.

In addition, an archival version of the dataset is available on the website of NCEI-Paleo/World Data Service for Paleoclimatology (https://www.ncdc.noaa.gov/paleo/study/21171), in both the LiPD format and the WDS ASCII template format developed in conjunction with the PAGES2k consortium, that will be updated and versioned as the dataset continues to evolve.

Additional Information

How to cite this article: PAGES2k Consortium. A global multiproxy database for temperature reconstructions of the Common Era. Sci. Data 4:170088 doi: 10.1038/sdata.2017.88 (2017).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.