High-resolution grids of daily air temperature for Peru - the new PISCOt v1.2 dataset

Huerta, Adrian; Aybar, Cesar; Imfeld, Noemi; Correa, Kris; Felipe-Obando, Oscar; Rau, Pedro; Drenkhan, Fabian; Lavado-Casimiro, Waldo

doi:10.1038/s41597-023-02777-w

Data Descriptor
Open access
Published: 01 December 2023

Scientific Data volume 10, Article number: 847 (2023)

Abstract

Gridded high-resolution climate datasets are increasingly important for a wide range of modelling applications. Here we present PISCOt (v1.2), a novel high spatial resolution (0.01°) dataset of daily air temperature for the whole of Peru (1981–2020). The dataset development involves four main steps: (i) quality control; (ii) gap-filling; (iii) homogenisation of weather stations; and (iv) spatial interpolation using additional data, a revised calculation sequence and an enhanced version control. This improved methodological framework enables capturing complex spatial variability of maximum and minimum air temperature at a more accurate scale compared to other existing datasets (e.g. PISCOt v1.1, ERA5-Land, TerraClimate, CHIRTS). PISCOt performs well with mean absolute errors of 1.4 °C and 1.2 °C for maximum and minimum air temperature, respectively. For the first time, PISCOt v1.2 adequately captures complex climatology at high spatiotemporal resolution and therefore provides a substantial improvement for numerous applications at the local to regional level. This is particularly useful in view of data scarcity and urgently needed model-based decision making for climate change, water balance and ecosystem assessment studies in Peru.

Background & Summary

Air temperature is a fundamental parameter of the climate system, which is required for various applications such as ecology¹, hydrology², public health³, agriculture⁴, climate variability, and climate change^5,6. Typically, temperature values are obtained from meteorological stations and show high accuracy and temporal resolution but do not capture information for an entire unit or region of analysis. Therefore, gridded global- or continental-scale databases, derived from interpolated⁷, reanalyzed⁸ and/or combined⁹ in-situ and surface remote sensing data, are widely used. While each dataset offers several advantages for specific applications, limitations related to complex topography, spatial resolution, and the amount of assimilated data reduce their reliability^10,11. In recent years, gridded high-resolution climate datasets at national and sub-national scales have been produced to close this gap^{12,13,14,15,16,17,18}.

A broad range of methods exists for creating gridded air temperature data based on weather stations. Traditionally, they have been divided into geostatistical, non-geostatistical, and combined methods^19,20. Although these methods are widely used and provide high efficiency, more recent procedures based on artificial intelligence including deep learning^21,22 and machine learning^23,24 are gaining relevance due to their ability to work with large amounts of data and capture non-linear and multivariate relationships²⁵. However, the reduced capacity to estimate the value outside the range of the training data limits its use in large regions with low station density^26,27. Besides, since the relationship between air temperature and auxiliary spatial predictors varies on spatiotemporal scales, recent research has also highlighted the importance of non-stationarity in the spatiotemporal domain by building local models in contrast to global estimation models^{13,28,29,30,31,32}. The diversity of methods has advantages and disadvantages regarding data availability, computational efficiency, computational cost, and estimation accuracy. Therefore, the method selected must be suitable or at least adapted to the purpose and study area.

In South America, only a few efforts have been undertaken to create gridded temperature datasets, mainly because of the low density of weather stations or the lack of long-term data series. However, there are significant advances in the construction of gridded datasets in countries such as Brazil^33,34, Chile³⁵, and Bolivia^36,37. For Peru, only two databases currently exist. The first is a gridded monthly-scale product for 1964–2014 at 5 km spatial resolution (henceforth “VS2018”) developed by Vicente-Serrano³⁸. The second is a gridded daily-scale product for 1981–2016 at 10 km developed by the National Service of Meteorology and Hydrology (SENAMHI). SENAMHI introduced this product as part of the Peruvian interpolated data of the Climatological and Hydrological Observations of SENAMHI (PISCO), denominated PISCOt v1.1³⁹. Since its release, PISCOt has been applied in numerous areas of research and operation^{3,40,41,42,43,44,45}. Due to the increasing availability of observed data and the need for higher spatial resolution, it is crucial to provide gridded air temperature datasets that allow modelling and process understanding at local scales, e.g., at the catchment level. Previously applied techniques^46,47,48,49 show that such a product can be optimised by enhancing the temporal homogeneity of the observed data and also by using topographic and climatic co-variables. Among the applied remote sensing data, Land Surface Temperature (LST) is the most frequently used parameter because it improves both the numerical accuracy and the spatiotemporal details of the interpolated air temperature.

Here, we present an updated version (v1.2) of PISCOt that consists of a gridded daily dataset for maximum (Tmax) and minimum (Tmin) air temperature at a spatial resolution of 0.01° (≈1 km) for the period 1981–2020. The updated version of PISCOt is essential for two main reasons: (i) it provides high-resolution estimates of daily Tmax and Tmin in a data scarce region taking into account steep climatic gradients that occur over complex terrain; and (ii) it provides the basis for further applications such as studies related to climate change analysis, hydrological modelling, and ecology, among others.

Methods

Workflow for generation of the data

Missing, inhomogeneous, and non-quality-controlled data are a typical concern in hydro-climatological studies. Particularly in regions with low financial resources and limited technical and institutional capacities, weather station networks are often sparse with poor coverage in rural and remote areas, many stations do not work appropriately, and quality control systems are inefficient^50,51. In Peru, quality issues with station data are especially challenging due to the complex topography leading to steep climatic gradients^52,53. The development of PISCOt therefore requires careful selection and pre-processing of the station observations before spatial interpolation can be applied.

The selection of the horizontal resolution is crucial in the spatial interpolation process. From a climatological perspective, deriving coarser products rather than topoclimatic-scale products¹³ (kilometer or sub-kilometer) based on sparse interpolated observations does not yield additional information⁵⁴. The underlying station distribution mostly defines the effective resolution, and it can be different from the target grid spacing^55,56,57. However, from the user’s perspective, higher-resolution data can be more desirable since they are urgently needed for practical applications⁵⁸. This is because these applications require a clear characterization of local gradients which in complex terrain might occur over shorter distances. An interpolation approach of air temperature based on high-resolution spatial predictors (0.01°≈1 km) is advantageous, especially in extremely complex mountain terrain such as the Andes, to properly account for the orographic gradients in a wide range of applications. Additionally, using high-resolution data makes it easier to interpret satellite observations or use them in hydrological models without further downscaling.

Therefore, the workflow for the development of the new PISCOt dataset includes four steps: (i) quality control, (ii) gap-filling, (iii) homogenisation of weather stations, and (iv) spatial interpolation (Fig. 1). In step (i), statistical and visual techniques were applied to remove erroneous data in the time series of Tmax and Tmin. For (ii), all time series were gap-filled using data from neighbouring stations. The previously gap-filled data were then homogenised in step (iii) to reduce temporal inhomogeneities. Once a complete and homogenised database of Tmax and Tmin observations was established, we proceeded to step (iv). A climatologically based interpolation approach^59,60,61,62 was used, where the spatial interpolation was divided into the mean monthly normal and anomalies, and then aggregated to obtain the final product. Topographic and remote sensing data served as a basis to estimate air temperature at the country scale. The following sections provide the data sources and four development steps in more detail.

Weather station data

Data source

The database used in this study belongs to the Peruvian weather service (SENAMHI, https://www.senamhi.gob.pe) and includes 430 daily series of Tmax and Tmin (Fig. 2). To obtain a better spatial representation of the country boundaries (Fig. 2a), data from adjacent countries were used, such as the Ecuadorian Institute of Meteorology and Hydrology (INAMHI, https://www.inamhi.gob.ec), the Colombian Institute of Hydrology, Meteorology and Environmental Studies (IDEAM, http://www.ideam.gov.co), the Brazilian Institute of Meteorology (INMET, https://portal.inmet.gov.br) and the Climate Explorer portal of the Chilean Center for Climate and Resilience Research (CR2, https://explorador.cr2.cl). Consequently, we obtained a large set of climate data from Ecuador (18 time series), Colombia (3), Brazil (5), Bolivia (3), and Chile (3), representing a total of 462 potential time series (Fig. 2b). Due to restrictions imposed by South American meteorological services, the raw data from the weather stations cannot be distributed with this publication. Readers who wish to obtain the primary data should contact each of the agencies or institutions mentioned above. It is important to note that while a substantial portion of the raw data is openly accessible, several data series remain restricted and can only be accessed upon request. Researchers are advised to review the data provided by each institution via its official webpage and to contact each agency or institution individually for further data requests.

The spatial distribution of the stations is highly uneven in the study area. While in the Amazon region only a limited number of stations exists, station density in the Andes is higher and largest at the Pacific Coast (Fig. 2). Depending on the altitude, there was a lower (higher) density of stations between 1000 and 2000 masl (0–1000 masl and >3000 masl)⁶³. Thus, the spatial distance between stations varied considerably. The earliest observations started in the 1930s, and the number of observations has increased significantly since then. Due to political instability and social conflicts (Supplementary Fig. 1), two episodes of under-reporting occurred before 1960 and during the 1980s. Due to the low reliability of data before the 1980s, the gridded product only covers the period 1981 to 2020. In addition, only stations with at least five years of data (365 days of the year repeated at least five times) were used. The 5-year threshold was chosen based on the finding that at least 5–7 years of observations are required before pairwise relationships between stations stabilise^13,64,65.

Quality control

The quality control (QC) of the air temperature series comprised the following steps:

1. Obvious errors: conversion of numerical missing-value codes (−999, −99.9, −88.8) to empty values, and removal of duplicate or incorrectly formatted dates.
2. Extreme values: flagging of daily extreme (low and high) air temperature values based on physical and statistical limits. The physical maximum and minimum limits for Tmax (Tmin) were 60 °C and −10 °C (40 °C and −30 °C), respectively⁶⁶. The statistical algorithm identified records that are above the 3rd quartile plus m times the interquartile range (IQR) and those that are below the 1st quartile minus m times the IQR. For Tmax and Tmin, m was set to 3.5. The statistical algorithm was applied separately for each month in order to take into account the effect of the seasonal cycle on the thresholds.
3. Internal consistency: inspection of daily records where Tmax is below Tmin. Furthermore, the values were flagged when Tmax and Tmin had the same magnitude (Tmax = Tmin).
4. Temporal coherence: inspection of daily values repeated over a long period and very extreme (day-to-day) jumps. A value was allowed to repeat for a maximum of 8 consecutive days. Additionally, day-to-day jumps greater than 20 °C were flagged⁶⁷.
5. Spatial coherence: comparison of the rank of each data value with the average rank of the data recorded at adjacent stations⁶⁸. The original daily air temperature series were converted to percentiles, i.e. each air temperature value was replaced by its corresponding percentile. For each time series, we selected the neighbouring stations that were within 70 km and had an elevation difference of less than 500 m^69,70. To perform the test, at least four neighbouring stations had to be available. If this was not the case, the daily value of the target station was not compared. The records of the target station with percentile differences greater than 0.85 relative to the average of the neighbouring stations were identified. The percentile difference approach allows for identifying only the most extreme spatial variations^71,72,73.
6. Visual inspection: a visual inspection of the daily time series was carried out to identify periods with inhomogeneities that cannot be corrected (rounding errors, asymmetric rounding patterns, measurement precision, time irregularities, and obvious inhomogeneities)^51,74. For this purpose, we used daily series and annual decimal frequency charts.
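The automatic parts of the QC (steps 1 to 4) can be illustrated with a minimal Python sketch. It is not the code used to build PISCOt: the function names are hypothetical, the quartiles are computed with the "inclusive" method of the standard library (the quartile definition is not stated in the text), and `flag_extremes` assumes that the input already contains the values of a single calendar month, since the thresholds were derived month by month.

```python
from statistics import quantiles

SENTINELS = {-999.0, -99.9, -88.8}          # missing-value codes from step 1
PHYSICAL_LIMITS = {"tmax": (-10.0, 60.0),   # (min, max) in deg C, step 2
                   "tmin": (-30.0, 40.0)}


def clean_sentinels(values):
    """Step 1: replace missing-value codes with None."""
    return [None if v is None or v in SENTINELS else v for v in values]


def iqr_bounds(values, m=3.5):
    """Return (Q1 - m*IQR, Q3 + m*IQR) of the non-missing values."""
    data = [v for v in values if v is not None]
    q1, _, q3 = quantiles(data, n=4, method="inclusive")  # assumed quartile rule
    iqr = q3 - q1
    return q1 - m * iqr, q3 + m * iqr


def flag_extremes(values, variable, m=3.5):
    """Step 2: flag values outside the physical or the IQR-based limits."""
    phys_lo, phys_hi = PHYSICAL_LIMITS[variable]
    stat_lo, stat_hi = iqr_bounds(values, m)
    lo, hi = max(phys_lo, stat_lo), min(phys_hi, stat_hi)
    return [v is not None and (v < lo or v > hi) for v in values]


def flag_internal_consistency(tmax, tmin):
    """Step 3: flag days with Tmax < Tmin or Tmax == Tmin."""
    return [a is not None and b is not None and a <= b
            for a, b in zip(tmax, tmin)]


def flag_temporal_coherence(values, max_run=8, max_jump=20.0):
    """Step 4: flag runs of identical values longer than max_run days
    and day-to-day jumps larger than max_jump deg C."""
    n = len(values)
    flags = [False] * n
    run_start = 0
    for i in range(1, n + 1):
        same = i < n and values[i] is not None and values[i] == values[run_start]
        if not same:
            if values[run_start] is not None and i - run_start > max_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = i
    for i in range(1, n):
        a, b = values[i - 1], values[i]
        if a is not None and b is not None and abs(b - a) > max_jump:
            flags[i] = True
    return flags
```

In this sketch a flag only marks a value; setting flagged values to missing would be a separate step applied after all checks have run.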

All QC-flagged values were set as a missing observation after the QC steps (Supplementary Figs. 1, 2). For the following procedures, only stations that retained the 5-year threshold after the QC were used. In addition, we manually verified the elevation information of weather stations using a digital elevation model (detailed in the Spatial predictors for air temperature sub-section) and modified it where necessary.

Gap-filling

Simple interpolation of incomplete data may produce artificial inhomogeneities in the gridded product due to the irregular spatiotemporal distribution of weather stations during the 1981–2020 period⁷⁵. This can affect the variance and lead to erroneous conclusions on changes and variability⁷⁶. To reduce such artificial inhomogeneities, data reconstruction of time series that do not cover the entire period and of gaps within time series was necessary.

A gap-filling procedure based on neighbouring stations⁷⁷ was implemented to create a complete database. Before applying the algorithm, the available information was standardised using a daily climatology of the available data to avoid differences in the mean and the variance⁷⁸. Subsequently, the model estimates were corrected to approximate the observed values as closely as possible. The correction was made by applying empirical quantile mapping^79,80. The Tmax and Tmin series were reconstructed independently.
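The empirical quantile mapping correction can be sketched as follows. This is only an illustration of the general technique, not the implementation used for PISCOt: the function names are hypothetical, quantiles are obtained by linear interpolation between sorted values, and estimates outside the calibration range are clamped to the most extreme observed value (both are assumptions of this sketch).

```python
from bisect import bisect_right


def relative_position(sorted_sample, x):
    """Non-exceedance probability of x in a sorted sample, in [0, 1]."""
    n = len(sorted_sample)
    if n == 1 or x <= sorted_sample[0]:
        return 0.0
    if x >= sorted_sample[-1]:
        return 1.0
    i = bisect_right(sorted_sample, x) - 1   # sorted_sample[i] <= x < sorted_sample[i + 1]
    frac = (x - sorted_sample[i]) / (sorted_sample[i + 1] - sorted_sample[i])
    return (i + frac) / (n - 1)


def sample_quantile(sorted_sample, p):
    """Linearly interpolated quantile of a sorted sample for p in [0, 1]."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def quantile_map(model_calib, obs_calib, model_values):
    """Replace each model estimate by the observed value at the same quantile."""
    m_sorted = sorted(model_calib)
    o_sorted = sorted(obs_calib)
    return [sample_quantile(o_sorted, relative_position(m_sorted, v))
            for v in model_values]
```

Here `model_calib` and `obs_calib` would be the estimated and observed values on days when both are available, and `model_values` the estimates on the days to be filled.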

A neighbouring station was considered for gap-filling if it met two conditions: (i) at least five years of data in common, and (ii) a correlation greater than or equal to 0.6 with the target station. An iterative process of the gap-filling algorithm was performed to take advantage of those stations that did not have a common period at the beginning⁸¹. This was carried out in up to three iterations, where the availability of neighbouring stations was limited according to the following characteristics: horizontal-vertical distances of (i) 70 km–500 m, (ii) 100 km–500 m, and (iii) 150 km (no vertical limit), respectively. A maximum of 8 neighbouring stations was considered during this procedure. The rationale for this configuration was based on a previous correlation-distance-elevation analysis (Supplementary Fig. 3).
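The neighbour selection rules can be written down compactly. The sketch below is an illustration under stated assumptions: the station records are plain dictionaries with hypothetical keys, the correlation and the number of common years are assumed to be precomputed, distances use the haversine formula, and candidates are ranked by correlation before the limit of 8 is applied (the text does not state how the 8 neighbours were ranked).

```python
from math import asin, cos, radians, sin, sqrt

# (max horizontal distance in km, max elevation difference in m or None)
ITERATION_LIMITS = [(70.0, 500.0), (100.0, 500.0), (150.0, None)]


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def select_neighbours(target, candidates, max_km, max_dz,
                      min_corr=0.6, min_common_years=5, max_neighbours=8):
    """Return the ids of the candidate stations usable for gap-filling."""
    kept = []
    for st in candidates:
        if st["common_years"] < min_common_years or st["corr"] < min_corr:
            continue
        dist = haversine_km(target["lon"], target["lat"], st["lon"], st["lat"])
        if dist > max_km:
            continue
        if max_dz is not None and abs(st["elev"] - target["elev"]) > max_dz:
            continue
        kept.append(st)
    kept.sort(key=lambda st: st["corr"], reverse=True)  # ranking is an assumption
    return [st["id"] for st in kept[:max_neighbours]]
```

Calling `select_neighbours` once per entry of `ITERATION_LIMITS` reproduces the three progressively relaxed search configurations.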

Due to the low density of weather stations in some regions, virtual stations (time series at the closest grid point) from the ERA5-Land reanalysis⁸² were additionally included to fill temporal gaps. These time series were not directly used, but an anomaly-based bias correction (detrended empirical quantile mapping⁸³) was applied to series with at least ten years of data. Only those virtual stations with a correlation greater than or equal to 0.6 with the target station (within Peru) were preserved and used for gap-filling.

Homogenisation

Many non-climatic influences can affect measurements (changes in station location, instrumentation, and observing practices, among others). To eliminate these inhomogeneities and to obtain more reliable observations, time series must be homogenised^84,85. A variety of statistical methods has been developed, each with different results^84,86. In sparse networks, homogenisation performance is drastically reduced, and there is a risk of erroneous corrections due to the low signal-to-noise ratio⁸⁷. Consequently, the chosen method must be applied carefully.

We tested the temporal homogeneity using the Standard Normal Homogeneity Test^88,89 in both its relative form, known as the Pairwise Homogeneity Algorithm (PHA)^90,91, and its absolute implementation. The process was fully automatic and straightforward. Therefore, the approach was consistent, unlike semi-automatic approaches that require several subjective decisions that can influence the whole process⁷⁴. In addition, PHA has been applied to global-scale datasets^92,93, and is one of the approaches with the best performance^84,86.

The algorithm searched a maximum (minimum) of eight (four) neighbouring reference stations with a correlation greater than or equal to 0.6 with the target station within a horizontal (vertical) distance of 1000 km (1000 m) in order to perform a relative test. In the absence of these conditions, the absolute test was applied. Absolute tests have a lower detection efficiency than relative tests⁸⁴. Therefore, the condition was designed as a backup test when a relative test was almost impossible to apply⁹⁴. In both cases, a p-value < 0.05 (95% confidence level) was used to define significant breakpoints, which were then used to adjust past values relative to the present.
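For orientation, the test statistic of the Standard Normal Homogeneity Test for a single shift can be computed as in the sketch below. This is a textbook form of the statistic, not the PHA implementation used here: the function name is hypothetical, the sample standard deviation (n − 1 denominator) is an assumption, and the comparison against critical values that yields the p-value is omitted. In a relative test the input would be a difference series between the target and a reference station, in an absolute test the target series itself.

```python
from statistics import mean, stdev


def snht_statistic(series):
    """Return (T0, k): the maximum SNHT statistic and the number of values
    before the most likely breakpoint. T(k) = k*z1^2 + (n-k)*z2^2, where z1
    and z2 are the means of the standardised series before and after k."""
    n = len(series)
    mu, sd = mean(series), stdev(series)
    if sd == 0:                      # constant series: no break detectable
        return 0.0, None
    z = [(x - mu) / sd for x in series]
    total = sum(z)
    best_t, best_k = 0.0, None
    cum = 0.0
    for k in range(1, n):
        cum += z[k - 1]
        z1 = cum / k
        z2 = (total - cum) / (n - k)
        t = k * z1 ** 2 + (n - k) * z2 ** 2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k
```

A series with a clean step in the middle yields its largest statistic at the position of the step, which is the candidate breakpoint that would then be tested for significance.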

As the algorithm was applied on a monthly scale, a linear time interpolation of the monthly correction factors to a daily scale was performed⁹⁵. The homogeneity tests were applied after the gap-filling for two reasons: (i) to detect inhomogeneities introduced by the gap-filling process, and (ii) because the process was more reliable if the time series had no gaps^50,71. Finally, as for the gap-filling procedure, homogenisation was performed in up to three repetitive cycles according to the boundary conditions previously defined.

Spatial predictors for air temperature

In the gridding process, Tmax and Tmin were adjusted to a series of auxiliary spatial predictors such as land surface temperature (LST), elevation (DEM), latitude (Y), longitude (X), and the topographic dissection index (TDI).

The LST observations were selected from MODIS⁹⁶. This satellite product provides average 8-day values starting in the year 2000 and at a 1 km spatial resolution. The Terra version (MOD11A2 V6)⁹⁷ was used for day (LST_day) and night (LST_night) observations. Because of missing data before 2000, the average monthly values for 2000–2020 for both day and night times were used as spatial predictors for Tmax and Tmin, respectively. Only LST values free of cloud contamination, with an emissivity error ≤0.02 and an LST error ≤2 °C, were used. If any grid cell in the final average was empty, it was reconstructed through nearest neighbour interpolation. The LST was downloaded from https://developers.google.com/earth-engine/datasets/catalog/MODIS_006_MOD11A2 (accessed 31 October 2022).

The DEM data were obtained from the Global Multi-resolution Terrain Elevation Data (GMTED) 2010⁹⁸ at a spatial resolution of 1 km. This dataset was selected because it has also been used in other temperature-gridded products at a national level³⁸. X, Y, and TDI were derived at the same spatial resolution as the DEM. The digital elevation model was downloaded from https://developers.google.com/earth-engine/datasets/catalog/USGS_GMTED2010 (accessed 31 October 2022).

The TDI was calculated through a multi-scale DEM calculation:

$$TD{I}_{({s}_{0})}=\mathop{\sum }\limits_{i=1}^{n}\frac{Z({s}_{0})-{Z}_{min}(i)}{{Z}_{max}(i)-{Z}_{min}(i)}$$

(1)

Where $TD{I}_{({s}_{0})}$ is the final multi-scale TDI value for the grid cell location s₀, $Z({s}_{0})$ is the elevation at the grid cell location s₀, ${Z}_{min}(i)$ is the minimum elevation at the grid cell location in the spatial window i, ${Z}_{max}(i)$ is the maximum elevation at the grid cell location in the spatial window i, and n is the number of spatial windows⁹⁹. The TDI value for a specific window size represented the height of a grid cell relative to the surrounding terrain. The multi-scale TDI was calculated for five spatial window sizes (at 3, 6, 9, 12, and 15 km). Valley bottoms and low areas relative to surrounding grids have values close to zero, while ridges and areas above surrounding areas have high values approaching 5. The selection of this topographic variable was based on the high correlation with daily Tmin anomalies which are influenced by cold air drainage^13,99.
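Eq. 1 can be illustrated with a small pure-Python sketch on a gridded DEM. The function name is hypothetical, the square windows are given as half-widths in grid cells (how the 3 to 15 km windows map to cells is an assumption), windows are truncated at the grid edge, and perfectly flat windows, where the ratio is undefined, contribute zero by assumption.

```python
def multiscale_tdi(dem, half_widths=(1, 2, 3)):
    """Sum over windows of (Z - Zmin) / (Zmax - Zmin) for every grid cell.

    dem is a list of rows (list of lists) with elevations; half_widths are
    the window half-sizes in cells, one per spatial scale.
    """
    nrows, ncols = len(dem), len(dem[0])
    out = [[0.0] * ncols for _ in range(nrows)]
    for r in range(nrows):
        for c in range(ncols):
            z0 = dem[r][c]
            total = 0.0
            for h in half_widths:
                window = [dem[i][j]
                          for i in range(max(0, r - h), min(nrows, r + h + 1))
                          for j in range(max(0, c - h), min(ncols, c + h + 1))]
                zmin, zmax = min(window), max(window)
                if zmax > zmin:          # flat window contributes 0 (assumption)
                    total += (z0 - zmin) / (zmax - zmin)
            out[r][c] = total
    return out
```

With n windows the result lies between 0 (lowest cell in every window) and n (highest cell in every window), which matches the range of 0 to 5 described above for five window sizes.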

The spatial predictors were downloaded from the Earth Engine Data Catalog¹⁰⁰ repository via rgee¹⁰¹. For efficient processing, the data were adapted to the extent of −81.405°, −67.185°, −18.595°, and 1.225° (min longitude, max longitude, min latitude, and max latitude); and re-gridded at 0.01° spatial resolution.

Air temperature interpolation

For the interpolation of Tmax and Tmin, a climatologically aided interpolation (CAI) approach^59,60,61,62 was used. With CAI, deviations from the average (anomalies) on a given day were interpolated and combined with an average field (climatology) to produce the final daily product. The CAI approach has been employed in several studies^13,18,62,73 and has proven to be effective to improve the accuracy of air temperature estimation in regions of complex terrain with limited observations^{102,103,104,105}. This approach drastically reduced computational costs compared to independent runs for each time step, and the co-variables did not necessarily need to be in the same temporal range as the observational data. The procedure was applied independently for Tmax and Tmin and comprised three steps:

1. Interpolation at monthly (normal) average scale for the 1981–2010 period.
2. Interpolation at the daily anomaly scale (based on the monthly normal) for the 1981–2020 period.
3. Combination of steps 1 and 2 to obtain the daily temperature value.
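The bookkeeping behind these three steps is simple and can be sketched for a single location. The function names are hypothetical and the spatial interpolation itself is left out; the sketch only shows how a daily value is split into a monthly normal and an anomaly, and how the two are recombined.

```python
from datetime import date


def daily_anomaly(value, day, monthly_normals):
    """Anomaly of a daily value relative to the normal of its calendar month.

    monthly_normals is a sequence of 12 values, January first.
    """
    return value - monthly_normals[day.month - 1]


def combine(anomaly, day, monthly_normals):
    """Step 3: add the (interpolated) anomaly back to the monthly normal."""
    return monthly_normals[day.month - 1] + anomaly


# Illustrative normals (deg C) for one location, January to December.
normals = [24.0, 24.5, 24.2, 22.8, 21.0, 19.5, 18.9, 18.7, 19.0, 20.1, 21.3, 22.9]
anom = daily_anomaly(20.4, date(2001, 7, 15), normals)   # 20.4 - 18.9
value = combine(anom, date(2001, 7, 15), normals)        # recovers about 20.4
```

Because the normals are computed once, only the anomaly field has to be interpolated for each day.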

Monthly normal interpolation

For the interpolation of the monthly normal, the Regression-Kriging (RK) method^13,29,106 was used, which represents a spatial process expressed as the sum of a deterministic and a stochastic part:

$$\overline{T}({s}_{0},{m}_{0})={\overline{T}}_{u}({s}_{0},{m}_{0})+{\overline{T}}_{e}({s}_{0},{m}_{0})$$

(2)

Where $\overline{T}({s}_{0},{m}_{0})$ is the final interpolated normal temperature at the grid cell location s₀ and for the month m₀, ${\overline{T}}_{u}({s}_{0},{m}_{0})$ is the deterministic spatial trend in normal temperature modelled by the weather station locations and auxiliary predictors, and ${\overline{T}}_{e}({s}_{0},{m}_{0})$ is the spatially autocorrelated stochastic residual with zero mean¹⁰⁷. We use a linear model to fit ${\overline{T}}_{u}({s}_{0},{m}_{0})$, and ordinary kriging (OK) to interpolate the residual part ${\overline{T}}_{e}({s}_{0},{m}_{0})$:

$$\overline{T}({s}_{0},{m}_{0})={\beta }_{0}+{\beta }_{1}lst({m}_{0})+{\beta }_{2}z+{\beta }_{3}x+{\beta }_{4}\,y+\mathop{\sum }\limits_{i=1}^{n}{w}_{i}({s}_{0},{m}_{0}){\overline{T}}_{e}({s}_{i},{m}_{0})$$

(3)

β₀ is the intercept; β₁, β₂, β₃ and β₄ are the model coefficient estimates for monthly average LST, elevation, longitude, and latitude, respectively; $lst({m}_{0})$, z, x and y are the average LST at m₀, elevation, longitude, and latitude at grid level at the location s₀; ${w}_{i}({s}_{0},{m}_{0})$ are the weights defined by the residual spatial covariance; and ${\overline{T}}_{e}({s}_{i},{m}_{0})$ are the residuals of the regression for n stations.

Due to the large variability and extent of the study area, it was not appropriate to use a global model for the spatial prediction of normal temperature. A version of RK with a moving spatial window based on Geographically Weighted Regression-Kriging (GWRK)¹⁰⁸ was used to account for the spatial heterogeneity in the interpolation process. The GWR^109,110 calculated local trends for a subset of the study area with a weighting of weather stations using a distance-based function. To improve prediction accuracy, it added the OK from the residuals to the regression estimate. The weighting of the observations in GWR was calculated using the bi-square kernel nearest neighbourhood function:

$${w}_{i}({s}_{0})={\left[1-{\left(\frac{h{({s}_{0})}_{i}}{r}\right)}^{2}\right]}^{2}$$

(4)

Where w_i(s₀) is the distance-based weighting function of the station i at the interpolation location s₀, h(s₀)_i is the distance between the station i and the interpolation location s₀, and r is the bandwidth of the spatially adaptive kernel function. Bandwidth optimisation was necessary because the estimated regression parameters would deviate significantly if the bandwidth were too large or too small¹⁰⁹. The optimal bandwidth was determined automatically using the corrected Akaike Information Criterion¹¹⁰.
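The weighting in Eq. 4 can be sketched in a few lines. Two details are assumptions of this sketch rather than statements from the text: the adaptive bandwidth r is taken as the distance to the k-th nearest station, and stations at or beyond r receive zero weight, as is usual for a bi-square kernel. The function name is hypothetical.

```python
def bisquare_weights(distances, k):
    """Bi-square weights for the stations around one interpolation location.

    distances: distances from the location to every station.
    k: number of nearest stations that defines the adaptive bandwidth r.
    """
    r = sorted(distances)[k - 1]          # adaptive bandwidth (assumption)
    if r == 0:                            # degenerate case: co-located stations only
        return [1.0 if h == 0 else 0.0 for h in distances]
    return [(1 - (h / r) ** 2) ** 2 if h < r else 0.0 for h in distances]
```

For example, with distances of 0, 5, 10 and 20 km and k = 3, the bandwidth is 10 km, so the station at 5 km receives a weight of (1 − 0.25)² = 0.5625 and the stations at 10 km and 20 km receive zero weight.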

The regression coefficients of the GWR model were estimated at a spatial resolution of 0.1°, assuming that the relationship between the normal temperature and the auxiliary predictors is independent of the spatial resolution scale^111,112. They were then interpolated bilinearly to a resolution of 0.01° so that they could be applied to the auxiliary predictors. The OK of the residuals was performed at 0.05° and then disaggregated to 0.01° to reduce the measurement precision inconsistencies^51,113,114 of the observed time series (Supplementary Fig. 4). Both sub-products at the final resolution were aggregated according to Eq. 3 to obtain the grids of the monthly normals of Tmax and Tmin.

We used the GWmodel¹¹⁰ and gstat^115,116 packages for the implementation of GWRK. For the estimation of the theoretical variogram (in OK), an automatic fit by iteratively reweighted least squares was used, and the nugget value was forced to zero, following the automap package¹¹⁷.

Daily interpolation

A method similar to that used for the monthly normal temperature was applied in the daily temperature interpolation. The daily anomalies of Tmax and Tmin were likewise expressed as the sum of two components (deterministic and stochastic). Because of the large number of days (14244) per variable and the intention to produce PISCOt operationally, RK was used instead of GWRK to limit computational cost. The model here was similar to Eq. 3 but added the spatial predictor TDI.

Therefore, the daily temperature product was obtained according to:

$$T({s}_{0},{d}_{0})=\overline{T}({s}_{0},{m}_{0})+\delta T({s}_{0},{d}_{0})$$

(5)

Where $T({s}_{0},{d}_{0})$ is the temperature at the interpolation point s₀ for the day d₀ within the month m₀, $\overline{T}({s}_{0},{m}_{0})$ is the normal temperature in the month m₀ according to Eq. 3, and $\delta T({s}_{0},{d}_{0})$ is the daily temperature anomaly at the interpolation point s₀ for the day d₀.

Unlike traditional CAI applications, we employed spatial predictors in $\overline{T}({s}_{0},{m}_{0})$ and $\delta T({s}_{0},{d}_{0})$^13,39. Some studies have found that topographic factors in a mountainous region are directly related to the spatial patterns of $\delta T({s}_{0},{d}_{0})$, particularly during stable atmospheric conditions that favour cold air inversion^13,99.

Data Records

The generated dataset consists of gridded, geo-localised files and a table presenting information on the weather stations used. For quick access, the data are divided into different repositories (Table 1) and are stored in a figshare collection¹¹⁸ (https://doi.org/10.6084/m9.figshare.c.5959863).

Table 1 Accession and data files for each repository of the database.

The files of normal (average) and daily Tmax and Tmin values are stored in Repositories 1 and 2, respectively. These data represent the primary output of the research (a gridded 0.01° spatial resolution product, PISCOt v1.2) and are available in the Network Common Data Form (NetCDF) format. Normal values are stored in a single file whereas daily values are stored in separate files, one per year from 1981 to 2020.

The files of the spatial covariables are stored in Repository 3. These represent the predictors (X, Y, DEM, LST_day, and LST_night) used to build the spatial models of Tmax and Tmin and are available in NetCDF format.

The list of all weather stations used as input for PISCOt v1.2 is stored in Repository 4. The file contains the following information (headers): code (ID), name (NAM), longitude (LON), latitude (LAT), elevation (ALT), and source (SRC) of each weather station. In addition, it indicates whether a weather station has been selected as a virtual station (bias-correction of ERA5-Land) in the gap-filling procedure (filter_qc), and whether a weather station has been used for cross-validation in the gap-filling procedure and daily spatial model (filter_qc70). The file is available in Comma Separated Values (CSV) format.

The gridded product of PISCOt v1.2 was also produced at coarser spatial resolutions (0.05° and 0.10°) using the same methodology and input data. This dataset is available in Repository 5. The purpose of providing these different versions is to facilitate quick access to the Tmax and Tmin data, as the original version (0.01°) involves large file sizes. The normal and daily values of Tmax and Tmin at 0.05° and 0.10° spatial resolution are stored in single NetCDF files.

The data in each NetCDF file consists of three dimensions (time, latitude, and longitude). For monthly normal files, the time dimension corresponds to the month of the year beginning with January. Each repository in Table 1 provides in addition a README file with a brief explanation of the dataset. Finally, Repositories 1 and 2 will also be available as a secondary repository in the Google Earth Engine Data Catalog.

Technical Validation

The development process of PISCOt has been evaluated in three steps: (i) gap-filling validation; (ii) spatial model validation; and (iii) usefulness of the PISCOt product. In the spatial model validation, we focused on the assessment at monthly normal and daily scales. In the usefulness of the PISCOt product, we provided two applications, one associated with spatio-temporal variability of air temperature, and the other related to the coastal fog effect on air temperature.

The statistics used to evaluate the skill of each step were simple error (mean bias), mean absolute error (MAE), and the refined index of agreement (d_r)¹¹⁹. The d_r metric ranges from −1.0 to 1.0, with a value of >0.5 indicating a higher predictive capacity than the observed average. Because the primary mode of variability in air temperature is usually the seasonal cycle, the metrics were calculated independently for each month and then averaged. This baseline adjustment in d_r prevented overestimation of the skill of each reconstruction (i.e. gap-filling, etc.) by correcting for the seasonal cycle¹²⁰. Furthermore, the non-parametric Mann-Kendall test associated with Sen’s slope estimator was used for trend analysis in the evaluation.
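
The three metrics and the month-wise averaging can be sketched as follows. This is a minimal illustration, assuming that d_r follows the widely used formulation of the refined index of agreement with a scaling factor c = 2; the function names and the toy series are hypothetical and are not taken from the PISCOt code.

```python
from statistics import mean


def bias(obs, est):
    """Mean error (estimate minus observation)."""
    return mean(e - o for o, e in zip(obs, est))


def mae(obs, est):
    """Mean absolute error."""
    return mean(abs(e - o) for o, e in zip(obs, est))


def refined_index_of_agreement(obs, est, c=2.0):
    """Refined index of agreement d_r, bounded by -1 and 1 (assumed c = 2)."""
    o_mean = mean(obs)
    err = sum(abs(e - o) for o, e in zip(obs, est))
    dev = c * sum(abs(o - o_mean) for o in obs)
    if err == 0:
        return 1.0
    if err <= dev:
        return 1.0 - err / dev
    return dev / err - 1.0


def monthly_averaged(metric, months, obs, est):
    """Evaluate a metric separately for each calendar month, then average."""
    groups = {}
    for m, o, e in zip(months, obs, est):
        o_list, e_list = groups.setdefault(m, ([], []))
        o_list.append(o)
        e_list.append(e)
    return mean(metric(o, e) for o, e in groups.values())


# Toy series with a strong "seasonal" jump between month 1 and month 2.
months = [1, 1, 1, 2, 2, 2]
obs = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]
est = [11.0, 12.0, 13.0, 21.0, 22.0, 25.0]

pooled_dr = refined_index_of_agreement(obs, est)
monthly_dr = monthly_averaged(refined_index_of_agreement, months, obs, est)
print(round(pooled_dr, 3), round(monthly_dr, 3))
```

In this toy case the pooled d_r is higher than the month-wise average because the seasonal jump inflates the observed deviations, which is the overestimation that the month-wise calculation is meant to avoid.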

Gap-filling validation

A gap-filling procedure was applied to extend shorter time series of weather stations (back to 1981) before constructing PISCOt. Two analyses were conducted to evaluate the efficiency of the gap-filling procedure. (i) Validation: comparing infilled and observed data for available dates with observed values, i.e., comparing available data that has been used to build the model. (ii) Cross-validation: comparing infilled and observed data for dates that were artificially set as missing data, i.e., comparing data that has not been used to build the model. In cross-validation, a worst-case missing-data scenario was assumed: only ten years of data were retained in stations with more observed data (time series with ≥75% of non-missing data in the period 1981–2020).
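
The worst-case cross-validation setup can be sketched as below. The example assumes that the retained ten years form one contiguous block whose start year is chosen by the user; the text does not state which decade was kept, so `start_year` and the helper names are hypothetical.

```python
def non_missing_fraction(records):
    """Share of records with an observed (non-None) value."""
    return sum(value is not None for _, value in records) / len(records)


def worst_case_split(records, start_year, n_years=10):
    """Keep observations inside the retained window; withhold the rest.

    records: list of (year, value) pairs; value is None when missing.
    Returns (training, withheld) lists of observed values only.
    """
    end_year = start_year + n_years  # exclusive upper bound
    training, withheld = [], []
    for year, value in records:
        if value is None:
            continue  # nothing to compare against for genuinely missing data
        if start_year <= year < end_year:
            training.append((year, value))
        else:
            withheld.append((year, value))
    return training, withheld


# Toy annual series for 1981-2020 with one genuinely missing value.
records = [(y, 15.0 + 0.01 * (y - 1981)) for y in range(1981, 2021)]
records[5] = (1986, None)

eligible = non_missing_fraction(records) >= 0.75  # station selection criterion
training, withheld = worst_case_split(records, start_year=2001)
print(eligible, len(training), len(withheld))
```

The infill model would then be built from `training` only, and its output compared with `withheld`.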

Table 2 summarises the statistical metrics, and Fig. 3 shows the distribution of d_r for both experiments. The experiments showed that the efficiency was slightly better for Tmax than Tmin. Both experiments had a bias <0.2 °C and MAE <1.5 °C. The most significant difference was in d_r; although moderate-to-high efficiency values were obtained in both experiments (d_r > 0.5), the best results were obtained in experiment (i). This can be explained by the small amount of information available in experiment (ii), as it was a worst-case scenario. The spatial distribution of d_r showed higher (lower) values in regions with a higher (lower) density of weather stations for both experiments. In experiment (i), d_r reached values from 0.8 to 0.9 in these areas, whereas in experiment (ii) it reached values from 0.6 to 0.7.

Table 2 Gap filling error statistics for daily maximum (Tmax) and minimum (Tmin) temperature for bias, mean absolute error (MAE), and refined index of agreement (d_r) for 1981–2020 in two experiments: using all available data and when only a complete period of 10-years (with ≥75% data) is available.


In general, the validation errors showed that the infill models used here worked reasonably well, considering the complicated topographic variability of the study area and the limited observational data. It must be pointed out that the errors of experiment (i) represent the residuals between the filled and observed values, as these observations were used to construct the infill models that were finally used in PISCOt.

Spatial model validation

Monthly normal air temperature

K-fold cross-validation was performed to characterise the efficiency of the spatial model for the monthly normal temperature. In this study, K = 10 was defined. Therefore, 10 clusters were set up for each model and data series. We applied the statistical metrics (only bias and MAE were used, as they are less affected by sample size) at the scale of two seasonal periods: “warm” (October to March) and “cold” (April to September).
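
A generic 10-fold cross-validation loop is sketched below. Two assumptions should be noted: stations are assigned to folds at random, which may differ from how the 10 clusters were actually built, and a simple linear regression of temperature on elevation stands in for the spatial model purely for illustration.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Randomly partition item indices into k folds of near-equal size."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


def predict_line(model, x):
    intercept, slope = model
    return intercept + slope * x


def cross_validate(xs, ys, fit, predict, k=10, seed=0):
    """Return one out-of-fold prediction per station."""
    preds = [None] * len(xs)
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = predict(model, xs[i])
    return preds


# Toy stations: elevation (m) and a temperature that decreases with height.
elev = [100.0 * i for i in range(30)]
temp = [28.0 - 0.006 * z for z in elev]
cv_pred = cross_validate(elev, temp, fit_line, predict_line)
cv_mae = sum(abs(p - t) for p, t in zip(cv_pred, temp)) / len(temp)
print(round(cv_mae, 6))
```

Bias and MAE would then be computed from the out-of-fold predictions, separately for the warm and cold periods.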

Figure 4 showed a smaller positive bias in Tmax than in Tmin, with an average (warm and cold) value of 0.15 °C and 0.25 °C, respectively. However, this average may be misleading, because negative errors offset positive ones. Considering the biases at the station scale, more points fall within the range of −1 °C to 1 °C in Tmin, implying that the estimation was better for Tmin. This pattern confirmed the findings for MAE, where Tmin (Tmax) averages 1.22 °C (1.42 °C) for both seasons. Spatially, the monthly normal interpolation performed worst in the mountainous regions between the boundaries of the climatic regions (Pacific Coast - Andes and Andes - Amazon), mainly in Tmax. Similarly, the largest errors in Tmax can be found in the southern Pacific Coast. At the seasonal level, there was no considerable difference in Tmax. However, for Tmin, estimates were slightly better in the warm period than in the cold period.

These results showed that the monthly normal interpolation for Tmin tends to be more efficient than for Tmax. In order to understand the impact of the spatial predictors (LST and DEM) on the air temperature estimation, the Lindemann, Merenda, and Gold method was applied^13,121,122. This method quantifies the relative influence of a spatial covariate by partitioning the total variance explained by the R² of the model (Fig. 5).
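
For two predictors, the Lindemann, Merenda, and Gold (LMG) decomposition reduces to averaging each predictor's contribution to R² over the two possible orders of entry. The sketch below illustrates this special case with invented values for DEM, LST, and Tmax; it is not the implementation used for Fig. 5.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def lmg_two_predictors(y, x1, x2):
    """LMG shares of R² for a linear model with exactly two predictors."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r2_x1 = r1 ** 2  # R² of the model with x1 only
    r2_x2 = r2 ** 2  # R² of the model with x2 only
    # R² of the model with both predictors (closed form for two regressors).
    r2_full = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    # Average of "entered first" and "entered second" contributions.
    share_x1 = 0.5 * (r2_x1 + (r2_full - r2_x2))
    share_x2 = 0.5 * (r2_x2 + (r2_full - r2_x1))
    return share_x1, share_x2, r2_full


# Invented values: elevation (m), daytime LST (°C) and Tmax normal (°C).
dem = [0.0, 500.0, 1000.0, 2000.0, 3000.0, 4000.0]
lst = [30.0, 25.0, 27.0, 18.0, 16.0, 9.0]
tmax = [29.0, 26.0, 24.0, 18.0, 13.0, 9.0]

share_dem, share_lst, r2_full = lmg_two_predictors(tmax, dem, lst)
print(round(share_dem, 3), round(share_lst, 3), round(r2_full, 3))
```

By construction the two shares add up to the R² of the full model, which is what allows them to be read as a partition of the explained variance.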

In Tmax (Fig. 5a), the DEM had the highest relative importance. The DEM contributed slightly more in summer than in winter months. About 50% (40%) of the observed variance can be explained by DEM in summer (winter). LST, on the other hand, adopts a major role from summer to autumn rather than during the period from winter to spring. One probable reason why DEM was such a good predictor for Tmax is that Tmax generally has a decreasing simple linear relationship with DEM, and DEM already has a solid predictive capacity without the addition of LST^13,28,47,123. In addition, LST_day is highly influenced by incoming solar radiation and biophysical properties (e.g. land cover, albedo, moisture, roughness) and, thus, has a high degree of microscale variability¹²⁴. As a result, LST_day is more spatially variable than Tmax, especially during higher solar radiation dates⁴⁷. The relationship between Tmax and LST is often more complex than that between Tmin and LST^13,47. From a seasonal perspective, we found that LST_day is more efficient in explaining the variance of Tmax from summer to autumn rather than winter to spring. We hypothesize that this behaviour can be related to solar radiation seasonality which is coupled with the cloud cover amount due to the rainfall season^42,75. From winter to spring (summer to autumn) there is more (less) incoming solar radiation due to the presence of less (more) cloud cover. Consequently, the spatial relation between LST_day and Tmax is weaker in the winter season compared to the summer period.

For Tmin, LST was a slightly more critical predictor than DEM in most months except for February (Fig. 5b). However, no covariate reached a relative importance of 50%. Notably, LST reached its highest relative importance from June to November, while DEM showed the inverse pattern. Due to the strong gradients and complex topography, micro-climatic influences on Tmin play an essential role. Cold air inversions are a common phenomenon, especially during periods of atmospheric stability and significant radiative cooling, which is typical for mountainous regions^28,99. Therefore, Tmin does not have a simple linear relationship with DEM, which can limit its capacity as an individual predictor for the spatial patterns of Tmin¹²⁵. The addition of LST, however, contributed to the spatial estimation of Tmin. Unlike LST_day, without direct solar radiation LST_night spatial variability is more influenced by local and mesoscale atmospheric processes important for air temperature¹²⁴. Therefore, LST_night and Tmin maintain similar spatial variability throughout the annual seasonal cycle, in contrast to LST_day and Tmax^13,47,126. This is also shown by the fact that higher values of R² were reached with Tmin (Fig. 5c) than with Tmax.

In summary, it was shown that the spatial model used had a greater predictive capacity and a lower average error in the estimation of Tmin than Tmax, mainly during the summer months. LST had a higher value-added in Tmin than in Tmax in the study region. Furthermore, DEM was more important for Tmax prediction.

Daily air temperature

The evaluation of the efficiency of daily air temperature data was similar to the one presented for the monthly normals, but only focused on the stations with long time series (with ≥75% of non-missing data) to reduce the influence of synthetic data. In addition, trends (Sen’s slope) were computed over the available period for each station and were compared with trends calculated based on the 10-fold cross-validation. This analysis allows estimating how reliably temperature trends can be predicted at un-sampled locations by interpolation, giving insight into the accuracy of temperature trends from the gridded dataset^16,127.
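
Sen’s slope is the median of the slopes between all pairs of points in a series, which makes it robust to outliers. The sketch below computes it for an observed and a cross-validated series and expresses both as decadal trends; the two annual series are invented for illustration.

```python
from statistics import median


def sens_slope(times, values):
    """Median of all pairwise slopes; missing values (None) are skipped."""
    pairs = [(t, v) for t, v in zip(times, values) if v is not None]
    slopes = [
        (v2 - v1) / (t2 - t1)
        for i, (t1, v1) in enumerate(pairs)
        for (t2, v2) in pairs[i + 1:]
    ]
    return median(slopes)


years = list(range(1981, 2021))
# Invented annual means: observed series (with one outlier) and the series
# predicted at the same station when it is left out of the interpolation.
observed = [15.0 + 0.02 * (y - 1981) for y in years]
observed[10] += 3.0
predicted = [14.5 + 0.015 * (y - 1981) for y in years]

obs_trend = 10 * sens_slope(years, observed)    # °C per decade
pred_trend = 10 * sens_slope(years, predicted)  # °C per decade
print(round(obs_trend, 3), round(pred_trend, 3))
```

The single outlier does not change the observed trend estimate, and the difference between the two decadal values is the kind of quantity compared in the trend validation.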

Figure 6 shows the results for bias and MAE, while Fig. 7 shows the results for d_r. On average, a lower bias was observed compared to the normal scale. This was probably due to the greater amount of averaged data. Despite this, it can be observed that there was a similar pattern to the normal scale. For the bias (MAE), values of −0.01 °C and 0.05 °C (1.36 °C and 1.11 °C) were found on average for Tmax and Tmin, respectively. Furthermore, estimates were slightly better for Tmax (Tmin) in the cold (warm) period. The d_r metric reached moderate-to-high efficiency values (d_r > 0.5) at most of the weather stations. Efficiency values were lowest for the warm period of Tmax (d_r = 0.48). The area with the lowest d_r values was in the south, mainly along the Pacific Coast and the border regions of the Andes and the Amazon.

Figure 8 compares the cross-validated predictions of the 1981–2020 trends in the annual mean and the warm and cold seasons of air temperature with the trends observed in the homogenized weather stations. This shows that most signs of observed trends are well detected by the estimated time series, particularly for Tmax rather than Tmin. The disparity is also evidenced by the d_r metric, where estimated trends for Tmax are above 0.6 while for Tmin they are around 0.5–0.6. There is not much difference between the annual and seasonal means. The results indicate that there is moderate efficiency in reproducing the observed spatial variations of the temporal trends in Tmax, but for Tmin, there is a poorer capability. This is probably due to the limited station density in Peru and artificial temporal variability mixed with real local climate features, despite the homogeneity check and QC procedures¹⁶. For the case of Tmin, the low temporal variability estimation can also be attributed to the lower temporal correlation power at shorter distances compared to Tmax (Supplementary Fig. 3), leading to a less efficient temporal reconstruction (as shown in the Gap-filling validation sub-section), and hence a lower temporal variability estimation. Furthermore, as Tmin is more influenced by local conditions, land cover change, which has not been taken into account as a predictor, may also play a role^128,129. Finally, it is worth mentioning that LST did not cover the entire period, which could also explain the poor performance of temporal trends.

In general, the results demonstrated a reasonably good capacity of the spatial model to estimate daily Tmax and Tmin. Similarly to the results from the normal monthly scale, Tmin outperformed Tmax in both the warm and cold periods. However, Tmax is slightly more efficient in estimating the observed spatial variations of the temporal trend.

Usefulness of the PISCOt v1.2 product

Spatio-temporal variability of air temperature

To present an application of PISCOt v1.2, a description of the spatio-temporal variability of air temperature indices characterising the trend (Mann-Kendall test and Sen’s slope) was conducted. This was applied in the southern Andes of Peru, a region characterised by agricultural and livestock subsistence and production⁴², and therefore highly dependent on climatic conditions. The indices selected were annual mean Tmax (MTmax), annual mean Tmin (MTmin), and the annual number of frost days (FD, number of days with Tmin <0 °C).
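
The three indices can be derived from a daily series as in the following minimal sketch, which uses the definitions given above (annual means of Tmax and Tmin, and FD as the number of days with Tmin below 0 °C). The short daily series is invented for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def annual_indices(dates, tmax, tmin):
    """Return {year: {"MTmax", "MTmin", "FD"}} from daily Tmax and Tmin."""
    by_year = defaultdict(lambda: {"tmax": [], "tmin": []})
    for day, tx, tn in zip(dates, tmax, tmin):
        by_year[day.year]["tmax"].append(tx)
        by_year[day.year]["tmin"].append(tn)
    indices = {}
    for year, vals in sorted(by_year.items()):
        indices[year] = {
            "MTmax": sum(vals["tmax"]) / len(vals["tmax"]),
            "MTmin": sum(vals["tmin"]) / len(vals["tmin"]),
            "FD": sum(1 for t in vals["tmin"] if t < 0.0),  # frost days
        }
    return indices


# Invented daily values for a high-elevation grid cell.
dates = [date(2019, 7, 1) + timedelta(days=i) for i in range(4)]
dates += [date(2020, 7, 1), date(2020, 7, 2)]
tmax = [15.0, 16.0, 14.0, 15.0, 17.0, 15.0]
tmin = [-2.0, 0.0, -1.0, 1.0, 0.5, -3.0]

indices = annual_indices(dates, tmax, tmin)
print(indices)
```

A day with Tmin exactly equal to 0 °C is not counted as a frost day under the strict inequality used here.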

Additionally, to provide a full comparison with existing temperature products, both national datasets (PISCOt v1.1 and VS2018, described above) and global products (TerraClimate¹³⁰, CHIRTS⁹, and ERA5-Land⁸²) were included. TerraClimate provides Tmax and Tmin at monthly temporal resolution and a ≈4 km spatial resolution for 1958–2020. CHIRTS produces daily values of Tmax and Tmin at 5 km (0.05°) and is available from 1983 to 2016. ERA5-Land is a reanalysis product that contains a great diversity of surface variables at a spatial resolution of 9 km (≈0.1°) since 1981. For ERA5-Land, daily Tmax and Tmin were obtained from the maximum and minimum hourly values.
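
Deriving daily Tmax and Tmin from hourly values amounts to taking the maximum and minimum within each day. The sketch below illustrates this; how the day boundary was defined (UTC or local time) is not stated here, so the `utc_offset_hours` parameter is a hypothetical option added to make that choice explicit.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_extremes(timestamps, temps, utc_offset_hours=0):
    """Return {date: (tmax, tmin)} from hourly temperatures.

    utc_offset_hours shifts the timestamps before grouping by calendar day
    (assumption: 0 keeps the original time axis unchanged).
    """
    shift = timedelta(hours=utc_offset_hours)
    by_day = defaultdict(list)
    for ts, temp in zip(timestamps, temps):
        by_day[(ts + shift).date()].append(temp)
    return {day: (max(v), min(v)) for day, v in sorted(by_day.items())}


# Two days of invented hourly values with a simple daily ramp.
start = datetime(2000, 1, 1, 0)
timestamps = [start + timedelta(hours=h) for h in range(48)]
temps = [10.0 + 0.5 * (h % 24) for h in range(48)]

daily_utc = daily_extremes(timestamps, temps)
daily_local = daily_extremes(timestamps, temps, utc_offset_hours=-5)
print(daily_utc)
```

Shifting the day boundary changes which hours fall into each day, so the resulting extremes can differ between the two conventions.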

First, the spatial differences for the annual average air temperature indices were examined for the period 1981–2010. Figure 9a shows the annual climatologies of MTmax, MTmin, and FD in PISCOt v1.2, while Fig. 9b indicates the difference of PISCOt v1.2 with each gridded product. For MTmax, differences were small (below 1 °C), mainly in PISCOt v1.1 and VS2018. ERA5-Land presented the lowest MTmax values compared to PISCOt v1.2, reaching differences of up to more than 6 °C in large parts of the Andean and Amazonian regions. The largest areas of differences between the multiple gridded products occurred at the boundaries of the climatic regions, i.e., at the Andes-Amazon and Pacific-Andean transitions and where no data were available. For MTmin, the spatial pattern of the differences was similar to MTmax for PISCOt v1.1 and VS2018. The largest differences were found in TerraClimate and CHIRTS, where the latter had the highest MTmin values, reaching differences of up to more than −6 °C in the Andean highlands. For FD, PISCOt v1.1 and ERA5-Land showed the best agreement with PISCOt v1.2 (differences within 10%). Only for CHIRTS were differences of up to 60% found. This was not surprising, as CHIRTS is the most divergent product regarding Tmin.

The spatio-temporal variability of air temperature indices was assessed through trend analysis at different temporal and spatial windows. Figure 10 shows the decadal rate of change for 10-year time windows from 1981 to 2020 for areas above 2000 masl in the Southern Andes of Peru. For MTmax, there was a good agreement between the trends of the different products. Periods with significant positive trends coincided well in all products in 1990–1995, 2000–2005, and 2010–2015. Periods with slightly negative or zero trends also coincided well in all products in 1995–2000 and 2005–2010. This was evident in PISCOt v1.2 compared to ERA5-Land, VS2018, and PISCOt v1.1. For MTmin, there was more variability in the trends, with no clear overall direction as in MTmax, except for the latest years (since 2010). From 1980 to 2000, PISCOt v1.2 showed similar variability (a slightly positive trend) to ERA5-Land, then moved closer (a slightly negative trend) to PISCOt v1.1 and VS2018 in the 2000–2007 period, and finally, since 2010, agreed with PISCOt v1.1, VS2018, and ERA5-Land on a positive trend. It is worth noting that PISCOt v1.1 and VS2018 showed good agreement in Tmin throughout the analysis period, diverging to a greater extent from PISCOt v1.2 before 1990. Significant positive trends in common in MTmin were only found during 1990–1995 and 2010–2015. A similar pattern as for MTmin was also found for FD. ERA5-Land (PISCOt v1.1) tended to behave analogously to PISCOt v1.2 for much of the analysis period, with disagreement (agreement) only from 1995 to 2007. There were only significant overlapping trends in FD during 1990–1995 (negative) and 2010–2015 (positive).
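
A moving-window trend test of this kind can be sketched with the Mann-Kendall statistic as follows. The example assumes the normal approximation without a correction for ties, which is a simplification for windows of only ten values; the annual series is invented.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p) using the normal approximation, no ties."""
    n = len(values)
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p


def moving_window_trends(years, values, window=10):
    """Apply the test to every run of `window` consecutive years."""
    results = []
    for start in range(len(values) - window + 1):
        s, z, p = mann_kendall(values[start:start + window])
        results.append((years[start], years[start + window - 1], s, p))
    return results


years = list(range(1981, 2001))
series = [float(i) for i in range(20)]  # invented, strictly increasing
windows = moving_window_trends(years, series)
print(windows[0])
```

Each tuple holds the first and last year of a window, the S statistic, and the p-value; Sen’s slope for the same window would give the corresponding decadal rate of change.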

Regarding spatial variability, Fig. 11 shows the trend by different elevation intervals for the period 1983–2013 (common reporting period). In MTmax, the magnitude of trends increased for higher elevation intervals mainly in PISCOt v1.2, PISCOt v1.1, VS2018, and ERA5-Land. In contrast, in CHIRTS and TerraClimate no direct relationship between the elevation and trend magnitude was evident. There was a more substantial spatial disparity in the direction of the trends at lower than at higher elevations in the different products (Supplementary Fig. 5). For MTmin, the various products (except for CHIRTS) showed a better agreement of the relationship between the trend magnitude and elevation. However, this was less pronounced than for MTmax. Significant positive or negative trends in FD were only found between 3000 and 3500 masl, with a similar (inverse) agreement of PISCOt v1.2 with PISCOt v1.1 and ERA5-Land (CHIRTS). PISCOt v1.2 and ERA5-Land reached zero trends above 5000 masl, because at this elevation level 100% FD was reached in every year. Consequently, no temporal change can be found.
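
Summarising grid-cell trends by elevation interval can be sketched as below. The band edges, the cell elevations, and the trend values are hypothetical; the intervals actually used in Fig. 11 are not restated here.

```python
def mean_by_elevation_band(elevations, trends, edges):
    """Average trend per half-open band [low, high); None if a band is empty."""
    bands = {(low, high): [] for low, high in zip(edges[:-1], edges[1:])}
    for z, trend in zip(elevations, trends):
        for (low, high), members in bands.items():
            if low <= z < high:
                members.append(trend)
                break  # each cell belongs to at most one band
    return {
        band: (sum(members) / len(members) if members else None)
        for band, members in bands.items()
    }


# Invented grid cells: elevation (masl) and trend (°C per decade).
cell_elev = [2100.0, 2400.0, 3200.0, 3300.0, 5200.0]
cell_trend = [0.1, 0.2, 0.3, 0.5, 0.0]
edges = [2000.0, 2500.0, 3000.0, 3500.0]  # hypothetical 500 m intervals

band_means = mean_by_elevation_band(cell_elev, cell_trend, edges)
print(band_means)
```

Cells outside the outermost edges, such as the one at 5200 masl here, are left out of the summary.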

The results showed that PISCOt v1.2 performed well over the southern Andes of Peru. PISCOt v1.2 presented spatiotemporal trends and overall distribution similar to the other products. Some differences in the results can be pointed out. Firstly, there was a high degree of correspondence in the magnitude of the air temperature between PISCOt v1.2 and PISCOt v1.1 and VS2018. This was expected, since the three datasets used information from the same station network, albeit with a different number of stations and distinct pre-processing applied. Larger differences were obtained in ERA5-Land (MTmax) and CHIRTS (in MTmin and FD). ERA5-Land is a reanalysis-based dataset and is thus expected to represent the physics. However, it was subject to systematic differences caused by the misrepresentation of the topography, requiring a bias correction prior to its use at high elevations¹³¹. CHIRTS is a merged product of station-based and reanalysis data. In its construction, it prioritised the estimation of Tmax rather than Tmin⁹, possibly explaining the significant differences with the latter variable. Considering the trends, there was a clear warming signal^5,38, with larger magnitudes and a more spatially homogeneous pattern for Tmax than for Tmin⁴². CHIRTS and TerraClimate showed the largest differences in temporal and spatial trends, leading to large unphysical trends due to unhomogenized or missing station data. This is an issue that should be fixed by using homogenisation algorithms.

Coastal fog effect on air temperature

In order to assess PISCOt v1.2 at a daily time step, we provide an analysis of the effect of coastal fog on modulating the daily mean air temperature. Coastal fog frequently occurs along the Peruvian coast and low Andean foothills. This phenomenon is especially persistent during austral winter (June-September), although it can occasionally appear throughout the year^132,133,134. The occurrence of fog is often produced by the particular thermal inversion layer situation with cool lower air masses due to the south-north flowing Humboldt Current. The frequency of coastal fog increases gradually to the south, causing a marked diurnal cooling in the influenced coastal-Andean areas¹³⁴.

We exemplified two situations with two variables: surface reflectance (Sref) from the MODIS Terra satellite (MOD09GA version 6.1, band 1)¹³⁵, which relates to the amount of cloud cover, and the mean air temperature (Tmean: mean of Tmax and Tmin). This was performed for a coastal fog-covered day (2007/08/25) and a cloud-free day (2006/08/24) in northern Peru, including the Pacific Coast and Andean slopes¹³⁴.

Figure 12 shows the spatial variability of Sref and Tmean during the two situations and their spatial difference; in addition, the vertical distribution of Tmean with elevation was included. For the Pacific Coast area, Sref values were higher (more reflectance) on the fog-covered day than on the cloud-free day, reaching a contrast of up to −1 (Fig. 12a1,b1,c1). The negative differences in Sref revealed the spatial configuration of the fog very well. When inspecting Tmean, lower values were found on the fog-covered day compared to the cloud-free day, leading to differences of up to 2–6 °C and clearly outlining the Pacific Coast area (Fig. 12a2,b2,c3). In this sense, there was a clear spatial contrast in both variables in the two situations: the higher the Sref values (more cloud cover), the lower the Tmean.

From a vertical perspective (Fig. 12a3,b3,c3), it was also confirmed that there was a contrast between both days in low-elevation areas where the fog was located. We found that the higher Tmean differences were mostly present for grid cells below 500 masl. There are, however, also positive differences above 500 masl, but for far fewer grid cells. This high contrast in the number of grid cells indicated the presence of fog at low levels. Interestingly, the value of 500 masl is close to the height of the thermal inversion layer of 400 masl identified in a previous study¹³⁴. Furthermore, we found negative Tmean differences between both situations above 2500 masl (Fig. 12c3), which can also be attributed to the presence of clouds at higher elevations on the cloud-free day rather than on the fog-covered day (Sref difference is positive in those areas).
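
The comparison of the two days can be sketched as follows. The sign convention (cloud-free day minus fog-covered day) is inferred from the signs reported above and is an assumption, as are the 2 °C counting threshold and all grid-cell values.

```python
def tmean(tmax, tmin):
    """Daily mean air temperature as the mean of Tmax and Tmin."""
    return [(tx + tn) / 2.0 for tx, tn in zip(tmax, tmin)]


def count_by_threshold(elev, diff, threshold=500.0, min_diff=2.0):
    """Count cells with diff >= min_diff below and above an elevation level."""
    below = [d for z, d in zip(elev, diff) if z < threshold]
    above = [d for z, d in zip(elev, diff) if z >= threshold]
    return {
        "below": (sum(1 for d in below if d >= min_diff), len(below)),
        "above": (sum(1 for d in above if d >= min_diff), len(above)),
    }


# Invented grid cells along a coast-to-mountain transect.
elev = [50.0, 200.0, 450.0, 800.0, 1500.0, 3000.0]
clear_tmax = [24.0, 25.0, 24.0, 22.0, 18.0, 10.0]
clear_tmin = [16.0, 15.0, 14.0, 12.0, 8.0, 0.0]
fog_tmax = [19.0, 20.0, 21.0, 22.0, 18.0, 11.0]
fog_tmin = [15.0, 14.0, 13.0, 11.0, 8.0, 1.0]

# Assumed sign convention: cloud-free day minus fog-covered day.
diff = [c - f for c, f in zip(tmean(clear_tmax, clear_tmin),
                              tmean(fog_tmax, fog_tmin))]
summary = count_by_threshold(elev, diff)
print(diff, summary)
```

Each entry of the summary gives the number of cells with a cooling of at least 2 °C and the total number of cells in that elevation class.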

These results suggest that PISCOt v1.2 is able to identify the effect of coastal fog on air temperature. Nevertheless, more in-depth analysis is required for a better understanding of this phenomenon.

Usage Notes

The PISCOt v1.2 database is a valuable dataset for different applications in Peru, as it allows researchers and practitioners to carry out high-resolution analyses linked to e.g. climate change, health, hydrology, ecosystem assessments, and other fields. PISCOt v1.2 supports the generation of new findings urgently required for more robust local decision-making in the scientific and political communities, especially in a context of data scarcity and high uncertainties in the region.

The new PISCOt v1.2 product has improved compared to the earlier version 1.1 in several key aspects: more assimilated time series, better consistency of station data pre-processing (quality control, gap-filling, and homogenisation), use of updated freely available auxiliary predictors, higher spatial resolution, a tidier and revised calculation sequence, and improved version control. Therefore, the development of PISCOt v1.2 is more consistent, traceable, and reproducible compared to other previously established gridded products in Peru.

PISCOt v1.2 adequately characterises the spatiotemporal variability of air temperature in average and extreme values using indicators. However, within the scope of this study only three indices were used. Future assessments therefore need to focus on more indicators of climate extremes not assessed in this study.

As the region is topographically complex, including steep climatic gradients, and is characterized by a low density and uneven distribution of weather stations, inherent limitations in spatial interpolation are expected, mainly at high elevations (between 1000 and 2000 masl, and >3500 masl). It is therefore recommended to use PISCOt v1.2 along with other gridded multi-source products, which would allow for a better characterisation of the associated uncertainties in air temperature. This is especially important when evaluating temporal trends in Tmin: a poor trend validation was found for Tmin, which could lead to erroneous local climatic evaluations, in some cases with the opposite sign.

Furthermore, it is essential to clarify that matching weather stations with PISCOt v1.2 (and other products) is not recommended for assessing air temperature accuracy¹³⁶. This is because such an analysis would favour products with interpolation algorithms that constrain the gridded data to precisely match weather station data. Likewise, if processes such as gap-filling, and homogeneity correction, among others, are applied to the observed data before spatial interpolation, the updated information would therefore no longer match the original data.

Finally, the gridded data of PISCOt v1.2 should only be used for continental areas. Due to the differences in LST values over water bodies compared to their surrounding terrestrial landscapes and the lack of observations over lakes, further validation is required to confirm the accuracy of spatial air temperature patterns over water¹³. Estimates over e.g. water bodies should therefore be masked out (i.e. be considered as empty grids).
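
Masking out water bodies can be sketched as follows. The water mask is a hypothetical input, since no specific mask is prescribed here, and `None` is used as the fill value for empty grid cells.

```python
def mask_water(grid, water_mask, fill=None):
    """Replace values over water (mask True) with an empty-cell marker."""
    return [
        [fill if is_water else value for value, is_water in zip(row, mask_row)]
        for row, mask_row in zip(grid, water_mask)
    ]


# Invented 2 x 3 temperature grid (°C) and a matching water mask.
tmin_grid = [
    [4.5, 3.9, 5.1],
    [2.2, 1.8, 2.5],
]
water = [
    [False, True, False],
    [False, True, True],
]

masked = mask_water(tmin_grid, water)
print(masked)
```

With array libraries the same step is usually a single masked assignment, but the logic is identical: cells flagged as water are treated as missing.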

Code availability

The construction of the gridded dataset PISCOt v1.2 was performed using the R (v3.6.3) and Python (v3.8.5) programming languages. The entire code used is freely available at figshare¹³⁷ and GitHub (https://github.com/adrHuerta/PISCOt_v1-2) under the GNU General Public License v3.0.

References

Kessler, M., Toivonen, J. M., Sylvester, S. P., Kluge, J. & Hertel, D. Elevational patterns of Polylepis tree height (Rosaceae) in the high Andes of Peru: role of human impact and climatic conditions. Frontiers in plant science 5, 194, https://doi.org/10.3389/fpls.2014.00194 (2014).
Rau, P. et al. Assessing multidecadal runoff (1970–2010) using regional hydrological modelling under data and water scarcity conditions in Peruvian Pacific catchments. Hydrological Processes 33, 20–35, https://doi.org/10.1002/hyp.13318 (2019).
Delahoy, M. J. et al. Meteorological factors and childhood diarrhea in Peru, 2005–2015: a time series analysis of historic associations, with implications for climate change. Environmental Health 20, 1–10, https://doi.org/10.1186/s12940-021-00703-4 (2021).
Sanabria, J., Calanca, P., Alarcón, C. & Canchari, G. Potential impacts of early twenty-first century changes in temperature and precipitation on rainfed annual crops in the Central Andes of Peru. Regional Environmental Change 14, 1533–1548, https://doi.org/10.1007/s10113-014-0595-y (2014).
López-Moreno, J. I. et al. Recent temperature variability and change in the Altiplano of Bolivia and Peru. International Journal of Climatology 36, 1773–1796, https://doi.org/10.1002/joc.4459 (2016).
Sulca, J. et al. Climatology of extreme cold events in the central Peruvian Andes during austral summer: origin, types and teleconnections. Quarterly Journal of the Royal Meteorological Society https://doi.org/10.1002/qj.3398 (2018).
Harris, I., Osborn, T. J., Jones, P. & Lister, D. Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Scientific data 7, 1–18, https://doi.org/10.1038/s41597-020-0453-3 (2020).
Hersbach, H. et al. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146, 1999–2049, https://doi.org/10.1002/qj.3803 (2020).
Verdin, A. et al. Development and validation of the CHIRTS-daily quasi-global high-resolution daily temperature data set. Scientific Data 7, 1–14, https://doi.org/10.1038/s41597-020-00643-7 (2020).
Dee, D. P. et al. The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quarterly Journal of the Royal Meteorological Society 137, 553–597, https://doi.org/10.1002/qj.828 (2011).
Rao, Y., Liang, S. & Yu, Y. Land Surface Air Temperature Data Are Considerably Different Among BEST-LAND, CRU-TEM4v, NASA-GISS, and NOAA-NCEI. Journal of Geophysical Research: Atmospheres 123, 5881–5900, https://doi.org/10.1029/2018JD028355 (2018).
Krähenmann, S. & Ahrens, B. Spatial gridding of daily maximum and minimum 2 m temperatures supported by satellite observations. Meteorology and Atmospheric Physics 120, 87–105, https://doi.org/10.1007/s00703-013-0237-9 (2013).
Oyler, J. W., Ballantyne, A., Jencso, K., Sweet, M. & Running, S. W. Creating a topoclimatic daily air temperature dataset for the conterminous United States using homogenized station data and remotely sensed land skin temperature. International Journal of Climatology 35, 2258–2279, https://doi.org/10.1002/joc.4127 (2015).
Hiebl, J. & Frei, C. Daily temperature grids for Austria since 1961–concept, creation and applicability. Theoretical and Applied Climatology 124, 161–178, https://doi.org/10.1007/s00704-015-1411-4 (2016).
Berezowski, T. et al. CPLFD-GDPT5: High-resolution gridded daily precipitation and temperature data set for two largest Polish river basins. Earth System Science Data 8, 127–139, https://doi.org/10.5194/essd-8-127-2016 (2016).
Antolini, G. et al. A daily high-resolution gridded climatic data set for Emilia-Romagna, Italy, during 1961-2010. International Journal of Climatology 36, 1970–1986, https://doi.org/10.1002/joc.4473 (2016).
Way, R. G., Lewkowicz, A. G. & Bonnaventure, P. P. Development of moderate-resolution gridded monthly air temperature and degree-day maps for the Labrador-Ungava region of northern Canada. International Journal of Climatology 37, 493–508, https://doi.org/10.1002/joc.4721 (2017).
Fonseca, A. R. & Santos, J. A. High-resolution temperature datasets in Portugal from a geostatistical approach: Variability and extremes. Journal of Applied Meteorology and Climatology 57, 627–644, https://doi.org/10.1175/JAMC-D-17-0215.1 (2018).
Li, J. & Heap, A. D. A review of comparative studies of spatial interpolation methods in environmental sciences: Performance and impact factors. Ecological Informatics 6, 228–241, https://doi.org/10.1016/j.ecoinf.2010.12.003 (2011).
Li, J. & Heap, A. D. Spatial interpolation methods applied in the environmental sciences: A review. Environmental Modelling and Software 53, 173–189, https://doi.org/10.1016/j.envsoft.2013.12.008 (2014).
Shen, H. et al. Deep learning-based air temperature mapping by fusing remote sensing, station, simulation and socioeconomic data. Remote Sensing of Environment 240, 111692, https://doi.org/10.1016/j.rse.2020.111692 (2020).
Zhang, X. et al. Deep learning-based 500 m spatio-temporally continuous air temperature generation by fusing multi-source data. Remote Sensing 14, 3536, https://doi.org/10.3390/rs14153536 (2022).
Sekulić, A., Kilibarda, M., Protić, D. & Bajat, B. A high-resolution daily gridded meteorological dataset for Serbia made by Random Forest Spatial Interpolation. Scientific Data 8, 1–12, https://doi.org/10.1038/s41597-021-00901-2 (2021).
He, Q., Wang, M., Liu, K., Li, K. & Jiang, Z. GPRChinaTemp1km: a high-resolution monthly air temperature data set for China (1951–2020) based on machine learning. Earth System Science Data 14, 3273–3292, https://doi.org/10.5194/essd-14-3273-2022 (2022).
Lary, D. J., Alavi, A. H., Gandomi, A. H. & Walker, A. L. Machine learning in geosciences and remote sensing. Geoscience Frontiers 7, 3–10, https://doi.org/10.1016/j.gsf.2015.07.003 (2016).
Hengl, T., Nussbaum, M., Wright, M. N., Heuvelink, G. B. & Gräler, B. Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ 6, e5518, https://doi.org/10.7717/peerj.5518 (2018).
Hernanz, A., García-Valero, J. A., Domínguez, M. & Rodríguez-Camino, E. A critical view on the suitability of machine learning techniques to downscale climate change projections: Illustration for temperature with a toy experiment. Atmospheric Science Letters e1087, https://doi.org/10.1002/asl.1087 (2022).
Daly, C. et al. Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. International Journal of Climatology 28, 2031–2064, https://doi.org/10.1002/joc.1688 (2008).
Hengl, T., Heuvelink, G. B., Tadić, M. P. & Pebesma, E. J. Spatio-temporal prediction of daily temperatures using time-series of MODIS LST images. Theoretical and Applied Climatology 107, 265–277, https://doi.org/10.1007/s00704-011-0464-2 (2012).
Lin, G. et al. Spatio-temporal variation of PM2.5 concentrations and their relationship with geographic and socioeconomic factors in China. International Journal of Environmental Research and Public Health 11, 173–186, https://doi.org/10.3390/ijerph110100173 (2013).
Kilibarda, M. et al. Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution. Journal of Geophysical Research 119, 2294–2313, https://doi.org/10.1002/2013JD020803 (2014).
Wang, M. et al. Comparison of spatial interpolation and regression analysis models for an estimation of monthly near surface air temperature in China. Remote Sensing 9, https://doi.org/10.3390/rs9121278 (2017).
Xavier, A. C., King, C. W. & Scanlon, B. R. Daily gridded meteorological variables in Brazil (1980–2013). International Journal of Climatology 36, 2644–2659, https://doi.org/10.1002/joc.4518 (2016).
Xavier, A. C., Scanlon, B. R., King, C. W. & Alves, A. I. New Improved Brazilian Daily Weather Gridded Data (1961-2020). International Journal of Climatology https://doi.org/10.1002/joc.7731 (2022).
Bianchi, E., Villalba, R., Viale, M., Couvreux, F. & Marticorena, R. New precipitation and temperature grids for northern Patagonia: Advances in relation to global climate grids. Journal of Meteorological Research 30, 38–52, https://doi.org/10.1007/s13351-015-5058-y (2016).
Vicente-Serrano, S. M. et al. Average monthly and annual climate maps for Bolivia. Journal of Maps 12, 295–310, https://doi.org/10.1080/17445647.2015.1014940 (2016).
Andrade, M. F. et al. Atlas-clima y eventos extremos del altiplano central perú-boliviano. Geographica Bernensia https://doi.org/10.4480/GB2018.N01 (2018).
Vicente-Serrano, S. M. et al. Recent changes in monthly surface air temperature over Peru, 1964–2014. International Journal of Climatology 38, 283–306, https://doi.org/10.1002/joc.5176 (2018).
Huerta, A., Aybar, C. & Lavado-Casimiro, W. PISCO temperatura versión 1.1 (PISCOt v1.1). Lima, Peru: National Meteorology and Hydrology Service of Peru (SENAMHI) https://iridl.ldeo.columbia.edu/SOURCES/.SENAMHI/.HSR/.PISCO/.Temp/ (2018).
Drenkhan, F., Huggel, C., Guardamino, L. & Haeberli, W. Managing risks and future options from new lakes in the deglaciating Andes of Peru: The example of the Vilcanota-Urubamba basin. Science of the Total Environment 665, 465–483, https://doi.org/10.1016/j.scitotenv.2019.02.070 (2019).
Muñoz, R., Huggel, C., Drenkhan, F., Vis, M. & Viviroli, D. Comparing model complexity for glacio-hydrological simulation in the data-scarce Peruvian Andes. Journal of Hydrology: Regional Studies 37, 100932, https://doi.org/10.1016/j.ejrh.2021.100932 (2021).
Imfeld, N. et al. A combined view on precipitation and temperature climatology and trends in the southern Andes of Peru. International Journal of Climatology 41, 679–698, https://doi.org/10.1002/joc.6645 (2021).
Llauca, H., Lavado-Casimiro, W., Montesinos, C., Santini, W. & Rau, P. PISCO_HyM_GR2M: A Model of Monthly Water Balance in Peru (1981–2020). Water 13, 1048, https://doi.org/10.3390/w13081048 (2021).
Monge-Salazar, M. J. et al. Ecohydrology and ecosystem services of a natural and an artificial bofedal wetland in the central Andes. Science of The Total Environment 155968, https://doi.org/10.1016/j.scitotenv.2022.155968 (2022).
Motschmann, A. et al. Current and future water balance for coupled human-natural systems–Insights from a glacierized catchment in Peru. Journal of Hydrology: Regional Studies 41, 101063, https://doi.org/10.1016/j.ejrh.2022.101063 (2022).
Chen, F., Liu, Y., Liu, Q. & Qin, F. A statistical method based on remote sensing for the estimation of air temperature in China. International Journal of Climatology 35, 2131–2143, https://doi.org/10.1002/joc.4113 (2015).
Oyler, J. W., Dobrowski, S. Z., Holden, Z. A. & Running, S. W. Remotely sensed land skin temperature as a spatial predictor of air temperature across the conterminous United States. Journal of Applied Meteorology and Climatology 55, 1441–1457, https://doi.org/10.1175/JAMC-D-15-0276.1 (2016).
Kloog, I. et al. Modelling spatio-temporally resolved air temperature across the complex geo-climate area of France using satellite-derived land surface temperature data. International Journal of Climatology 37, 296–304, https://doi.org/10.1002/joc.4705 (2017).
Li, X., Zhou, Y., Asrar, G. R. & Zhu, Z. Developing a 1 km resolution daily air temperature dataset for urban and surrounding areas in the conterminous United States. Remote Sensing of Environment 215, 74–84, https://doi.org/10.1016/j.rse.2018.05.034 (2018).
Woldesenbet, T. A., Elagib, N. A., Ribbe, L. & Heinrich, J. Gap filling and homogenization of climatological datasets in the headwater region of the Upper Blue Nile Basin, Ethiopia. International Journal of Climatology 37, 2122–2140, https://doi.org/10.1002/joc.4839 (2017).
Hunziker, S. et al. Identifying, attributing, and overcoming common data quality issues of manned station observations. International Journal of Climatology 37, 4131–4145, https://doi.org/10.1002/joc.5037 (2017).
Huerta, A. & Lavado-Casimiro, W. Atlas de Zonas Áridas del Perú: una evaluación presente y futura. Servicio Nacional de Meteorología e Hidrología del Perú https://hdl.handle.net/20.500.12542/1206 (2021).
Zevallos, J. & Lavado-Casimiro, W. Climate Change Impact on Peruvian Biomes. Forests 13, 238, https://doi.org/10.3390/f13020238 (2022).
Haylock, M. et al. A European daily high-resolution gridded data set of surface temperature and precipitation for 1950–2006. Journal of Geophysical Research: Atmospheres 113, https://doi.org/10.1029/2008JD010201 (2008).
Grasso, L. D. The differentiation between grid spacing and resolution and their application to numerical modeling. Bulletin of the American Meteorological Society 81, 579–580, 10.1175/1520-0477(2001)082<0699:FCOTDB>2.3.CO;2 (2000).
Lussana, C., Tveito, O. E., Dobler, A. & Tunheim, K. seNorge_2018, daily precipitation, and temperature datasets over Norway. Earth System Science Data 11, 1531–1551, https://doi.org/10.5194/essd-11-1531-2019 (2019).
Crespi, A., Matiu, M., Bertoldi, G., Petitta, M. & Zebisch, M. A high-resolution gridded dataset of daily temperature and precipitation records (1980–2018) for Trentino-South Tyrol (north-eastern Italian Alps). Earth System Science Data 13, 2801–2818, https://doi.org/10.5194/essd-13-2801-2021 (2021).
Beven, K., Cloke, H., Pappenberger, F., Lamb, R. & Hunter, N. Hyperresolution information and hyperresolution ignorance in modelling the hydrology of the land surface. Science China Earth Sciences 58, 25–35, https://doi.org/10.1007/s11430-014-5003-4 (2015).
Dawdy, D. & Langbein, W. Mapping mean areal precipitation. Hydrological Sciences Journal 5, 16–23, https://doi.org/10.1080/02626666009493176 (1960).
Willmott, C. J. & Robeson, S. M. Climatologically aided interpolation (CAI) of terrestrial air temperature. International Journal of Climatology 15, 221–229, https://doi.org/10.1002/joc.3370150207 (1995).
New, M., Hulme, M. & Jones, P. Representing twentieth-century space–time climate variability. Part II: Development of 1901–96 monthly grids of terrestrial surface climate. Journal of climate 13, 2217–2238, 10.1175/1520-0442(2000)013<2217:RTCSTC>2.0.CO;2 (2000).
Hunter, R. D. & Meentemeyer, R. K. Climatologically aided mapping of daily precipitation and temperature. Journal of Applied Meteorology 44, 1501–1510, https://doi.org/10.1175/JAM2295.1 (2005).
Condom, T. et al. Climatological and hydrological observations for the South American Andes: in situ stations, satellite, and reanalysis data sets. Frontiers in Earth Science 8, 92, https://doi.org/10.3389/feart.2020.00092 (2020).
Hubbard, K. Spatial variability of daily weather variables in the high plains of the USA. Agricultural and Forest Meteorology 68, 29–41, https://doi.org/10.1016/0168-1923(94)90067-1 (1994).
Camargo, M. B. & Hubbard, K. G. Spatial and temporal variability of daily weather variables in sub-humid and semi-arid areas of the United States high plains. Agricultural and forest meteorology 93, 141–148, https://doi.org/10.1016/S0168-1923(98)00122-1 (1999).
Vera, L., Villegas, E., Oria, C. & Arboleda, F. Control de calidad de datos de estaciones meteorológicas e hidrológicas automáticas en el centro de procesamiento de datos del SENAMHI. Tech. Rep., Servicio Nacional de Meteorología e Hidrología del Perú (SENAMHI), https://www.senamhi.gob.pe/load/file/00711SENA-54.pdf (2021).
Espinoza, J. C. et al. Revisiting wintertime cold air intrusions at the east of the Andes: propagating features from subtropical Argentina to Peruvian Amazon and relationship with large-scale circulation patterns. Climate dynamics 41, 1983–2002, https://doi.org/10.1007/s00382-012-1639-y (2013).
Vicente-Serrano, S. M., Beguería, S., López-Moreno, J. I., García-Vera, M. A. & Stepanek, P. A complete daily precipitation database for northeast Spain: reconstruction, quality control, and homogeneity. International Journal of Climatology 30, 1146–1163, https://doi.org/10.1002/joc.1850 (2010).
Lanzante, J. R. Resistant, robust and non-parametric techniques for the analysis of climate data: Theory and examples, including applications to historical radiosonde station data. International Journal of Climatology: A Journal of the Royal Meteorological Society 16, 1197–1226, 10.1002/(SICI)1097-0088(199611)16:11<1197::AID-JOC89>3.0.CO;2-L (1996).
Wood, W. H., Marshall, S. J., Whitehead, T. L. & Fargey, S. E. Daily temperature records from a mesonet in the foothills of the Canadian Rocky Mountains, 2005–2010. Earth System Science Data 10, 595–607, https://doi.org/10.5194/essd-10-595-2018 (2018).
Tomas-Burguera, M., Vicente-Serrano, S. M., Beguería, S., Reig, F. & Latorre, B. Reference crop evapotranspiration database in Spain (1961–2014). Earth System Science Data 11, 1917–1930, https://doi.org/10.5194/essd-11-1917-2019 (2019).
Huerta, A. & Lavado-Casimiro, W. Trends and variability of precipitation extremes in the Peruvian Altiplano (1971–2013). International Journal of Climatology 41, 513–528, https://doi.org/10.1002/joc.6635 (2021).
Huerta, A. et al. PISCOeo_pm, a reference evapotranspiration gridded database based on FAO Penman-Monteith in Peru. Scientific data 9, 1–18, https://doi.org/10.1038/s41597-022-01373-8 (2022).
Hunziker, S. et al. Effects of undetected data quality issues on climatological analyses. Climate of the Past 14, 1–20, https://doi.org/10.5194/cp-14-1-2018 (2018).
Aybar, C. et al. Construction of a high-resolution gridded rainfall dataset for Peru from 1981 to the present day. Hydrological Sciences Journal 65, 770–785, https://doi.org/10.1080/02626667.2019.1649411 (2020).
Beguería, S., Vicente-Serrano, S. M., Tomás-Burguera, M. & Maneta, M. Bias in the variance of gridded data sets leads to misleading conclusions about changes in climate variability. International Journal of Climatology 36, 3413–3422, https://doi.org/10.1002/joc.4561 (2016).
Thevakaran, A. & Sonnadara, D. U. Estimating missing daily temperature extremes in Jaffna, Sri Lanka. Theoretical and Applied Climatology 132, 145–152, https://doi.org/10.1007/s00704-017-2082-0 (2018).
Beguería, S. et al. Gap filling of monthly temperature data and its effect on climatic variability and trends. Journal of Climate 32, 7797–7821, https://doi.org/10.1175/JCLI-D-19-0244.1 (2019).
Gudmundsson, L., Bremnes, J. B., Haugen, J. E. & Engen-Skaugen, T. Downscaling RCM precipitation to the station scale using statistical transformations–a comparison of methods. Hydrology and Earth System Sciences 16, 3383–3390, https://doi.org/10.5194/hess-16-3383-2012 (2012).
Stanley, T., Kirschbaum, D. B., Huffman, G. J. & Adler, R. F. Approximating long-term statistics early in the global precipitation measurement era. Earth Interactions 21, 1–10, https://doi.org/10.1175/EI-D-16-0025.1 (2017).
Gonzalez-Hidalgo, J. C., Peña-Angulo, D., Brunetti, M. & Cortesi, N. MOTEDAS: a new monthly temperature database for mainland Spain and the trend in temperature (1951–2010). International Journal of Climatology 35, 4444–4463, https://doi.org/10.1002/joc.4298 (2015).
Muñoz-Sabater, J. et al. ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth System Science Data Discussions 1–50, https://doi.org/10.5194/essd-13-4349-2021 (2021).
Cannon, A. J., Sobie, S. R. & Murdock, T. Q. Bias correction of GCM precipitation by quantile mapping: How well do methods preserve changes in quantiles and extremes? Journal of Climate 28, 6938–6959, https://doi.org/10.1175/JCLI-D-14-00754.1 (2015).
Venema, V. K. C. et al. Benchmarking homogenization algorithms for monthly data. Climate of the Past 8, 89–115, https://doi.org/10.5194/cp-8-89-2012 (2012).
Brönnimann, S. Climatic changes since 1700. In Climatic Changes Since 1700, 167–321, https://doi.org/10.1007/978-3-319-19042-6 (Springer, 2015).
Domonkos, P., Guijarro, J. A., Venema, V., Brunet, M. & Sigró, J. Efficiency of Time Series Homogenization: Method Comparison with 12 Monthly Temperature Test Datasets. Journal of Climate 34, 2877–2891, https://doi.org/10.1175/JCLI-D-20-0611.1 (2021).
Gubler, S. et al. The influence of station density on climate data homogenization. International Journal of Climatology 37, 4670–4683, https://doi.org/10.1002/joc.5114 (2017).
Alexandersson, H. A homogeneity test applied to precipitation data. Journal of Climatology 6, 661–675, https://doi.org/10.1002/joc.3370060607 (1986).
Haimberger, L. Homogenization of Radiosonde Temperature Time Series Using Innovation Statistics. Journal of Climate 20, 1377–1403, https://doi.org/10.1175/JCLI4050.1 (2007).
Menne, M. J. & Williams, C. N. Homogenization of temperature series via pairwise comparisons. Journal of Climate 22, 1700–1717, https://doi.org/10.1175/2008JCLI2263.1 (2009).
Browning, J. & Schneider, C. snht: Standard Normal Homogeneity Test, https://CRAN.R-project.org/package=snht. R package version 1.0.5 (2017).
Dunn, R. J. H., Willett, K. M., Morice, C. P. & Parker, D. E. Pairwise homogeneity assessment of HadISD. Climate of the Past 10, 1501–1522, https://doi.org/10.5194/cp-10-1501-2014 (2014).
Thorne, P. W. et al. Toward an integrated set of surface meteorological observations for climate science and applications. Bulletin of the American Meteorological Society 98, 2689–2702, https://doi.org/10.1175/BAMS-D-16-0165.1 (2017).
Brugnara, Y., Good, E., Squintu, A. A., van der Schrier, G. & Brönnimann, S. The EUSTACE global land station daily air temperature dataset. Geoscience Data Journal 6, 189–204, https://doi.org/10.1002/gdj3.81 (2019).
Vincent, L. A., Zhang, X., Bonsal, B. R. & Hogg, W. D. Homogenization of Daily Temperatures over Canada. Journal of Climate 15, 1322–1334, 10.1175/1520-0442(2002)015<1322:HODTOC>2.0.CO;2 (2002).
Jin, M. & Dickinson, R. E. Land surface skin temperature climatology: Benefitting from the strengths of satellite observations. Environmental Research Letters 5, 044004, https://doi.org/10.1088/1748-9326/5/4/044004 (2010).
Wan, Z., Hook, S. & Hulley, G. MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1 km SIN Grid V006 https://doi.org/10.5067/MODIS/MOD11A2.006 (2015).
Danielson, J. J. & Gesch, D. B. Global multi-resolution terrain elevation data 2010 (GMTED2010) https://doi.org/10.5066/F7J38R2N (2011).
Holden, Z. A., Abatzoglou, J. T., Luce, C. H. & Baggett, L. S. Empirical downscaling of daily minimum air temperature at very fine resolutions in complex terrain. Agricultural and Forest Meteorology 151, 1066–1073, https://doi.org/10.1016/j.agrformet.2011.03.011 (2011).
Gorelick, N. et al. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote sensing of Environment 202, 18–27, https://doi.org/10.1016/j.rse.2017.06.031 (2017).
Aybar, C., Wu, Q., Bautista, L., Yali, R. & Barja, A. rgee: An R package for interacting with Google Earth Engine. Journal of Open Source Software 5, 2272, https://doi.org/10.21105/joss.02272 (2020).
Parmentier, B. et al. Using multi-timescale methods and satellite-derived land surface temperature for the interpolation of daily maximum air temperature in Oregon. International Journal of Climatology 35, 3862–3878, https://doi.org/10.1002/joc.4251 (2015).
Longman, R. J. et al. High-resolution gridded daily rainfall and temperature for the Hawaiian Islands (1990–2014). Journal of Hydrometeorology 20, 489–508, https://doi.org/10.1175/JHM-D-18-0112.1 (2019).
Newman, A. J. et al. Use of daily station observations to produce high-resolution gridded probabilistic precipitation and temperature time series for the Hawaiian Islands. Journal of Hydrometeorology 20, 509–529, https://doi.org/10.1175/JHM-D-18-0113.1 (2019).
Newman, A. J., Clark, M. P., Wood, A. W. & Arnold, J. R. Probabilistic spatial meteorological estimates for Alaska and the Yukon. Journal of Geophysical Research: Atmospheres 125, e2020JD032696, https://doi.org/10.1029/2020JD032696 (2020).
Hengl, T., Heuvelink, G. & Rossiter, D. About regression-kriging: from theory to interpretation of results. Computers & Geosciences 33, 1301–1315, https://doi.org/10.1016/j.cageo.2007.05.001 (2007).
Webster, R. & Oliver, M. A. Geostatistics for environmental scientists. John Wiley & Sons https://doi.org/10.1002/9780470517277 (2007).
Harris, P., Fotheringham, A., Crespo, R. & Charlton, M. The use of geographically weighted regression for spatial prediction: an evaluation of models using simulated data sets. Mathematical Geosciences 42, 657–680, https://doi.org/10.1007/s11004-010-9284-7 (2010).
Fotheringham, A. S., Brunsdon, C. & Charlton, M. Geographically weighted regression: the analysis of spatially varying relationships. John Wiley & Sons (2003).
Gollini, I., Lu, B., Charlton, M., Brunsdon, C. & Harris, P. GWmodel: An R Package for Exploring Spatial Heterogeneity Using Geographically Weighted Models. Journal of Statistical Software, Articles 63, 1–50, https://doi.org/10.18637/jss.v063.i17 (2015).
Zhan, W. et al. Disaggregation of remotely sensed land surface temperature: Literature survey, taxonomy, issues, and caveats. Remote Sensing of Environment 131, 119–139, https://doi.org/10.1016/j.rse.2012.12.014 (2013).
Wang, S., Luo, X. & Peng, Y. Spatial Downscaling of MODIS Land Surface Temperature Based on Geographically Weighted Autoregressive Model. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 13, 2532–2546, https://doi.org/10.1109/JSTARS.2020.2968809 (2020).
Zhang, X., Zwiers, F. W. & Hegerl, G. The influences of data precision on the calculation of temperature percentile indices. International Journal of Climatology 29, 321–327, https://doi.org/10.1002/joc.1738 (2009).
Rhines, A., Tingley, M. P., McKinnon, K. A. & Huybers, P. Decoding the precision of historical temperature observations. Quarterly Journal of the Royal Meteorological Society 141, 2923–2933, https://doi.org/10.1002/qj.2612 (2015).
Pebesma, E. J. Multivariable geostatistics in S: the gstat package. Computers & geosciences 30, 683–691, https://doi.org/10.1016/j.cageo.2004.03.012 (2004).
Gräler, B., Pebesma, E. & Heuvelink, G. Spatio-Temporal Interpolation using gstat. The R Journal 8, 204–218, https://doi.org/10.32614/RJ-2016-014 (2016).
Hiemstra, P., Pebesma, E., Twenhöfel, C. & Heuvelink, G. Real-time automatic interpolation of ambient gamma dose rates from the Dutch Radioactivity Monitoring Network. Computers and Geosciences https://doi.org/10.1016/j.cageo.2008.10.011 (2008).
Huerta, A. et al. High-resolution grids of daily air temperature for Peru - the PISCOt v1.2 dataset. figshare. https://doi.org/10.6084/m9.figshare.c.5959863.v3 (2023).
Willmott, C. J., Robeson, S. M. & Matsuura, K. A refined index of model performance. International Journal of Climatology 32, 2088–2094, https://doi.org/10.1002/joc.2419 (2012).
Legates, D. R. & McCabe, G. J. A refined index of model performance: a rejoinder. International Journal of Climatology 33, 1053–1056, https://doi.org/10.1002/joc.3487 (2013).
Lindeman, R. H. Introduction to bivariate and multivariate analysis. Scott Foresman & Co (1980).
Grömping, U. Relative importance for linear regression in R: the package relaimpo. Journal of statistical software 17, 1–27, https://doi.org/10.18637/jss.v017.i01 (2007).
Dobrowski, S. Z., Abatzoglou, J. T., Greenberg, J. A. & Schladow, S. How much influence does landscape-scale physiography have on air temperature in a mountain environment. Agricultural and Forest Meteorology 149, 1751–1758, https://doi.org/10.1016/j.agrformet.2009.06.006 (2009).
Nichol, J. Remote sensing of urban heat islands by day and night. Photogrammetric engineering and remote sensing 71, 613–621, https://doi.org/10.14358/PERS.71.5.613 (2005).
Moraes, A. G. D. L. et al. Terrain sensitive climate mapping for the Arequipa Department in Peru. International Journal of Climatology https://doi.org/10.1002/joc.7730 (2022).
Zhang, M. et al. Creating new near-surface air temperature datasets to understand elevation-dependent warming in the Tibetan Plateau. Remote Sensing 12, 1722, https://doi.org/10.3390/rs12111722 (2020).
Frei, C. Interpolation of temperature in a mountainous region using nonlinear profiles and non-euclidean distances. International Journal of Climatology 34, 1585–1605, https://doi.org/10.1002/joc.3786 (2014).
Luyssaert, S. et al. Land management and land-cover change have impacts of similar magnitude on surface temperature. Nature Climate Change 4, 389–393, https://doi.org/10.1038/nclimate2196 (2014).
Pongratz, J. et al. Land use effects on climate: current state, recent progress, and emerging topics. Current Climate Change Reports 1–22, https://doi.org/10.1007/s40641-021-00178-y (2021).
Abatzoglou, J. T., Dobrowski, S. Z., Parks, S. A. & Hegewisch, K. C. TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958-2015. Scientific Data 5, 1–12, https://doi.org/10.1038/sdata.2017.191 (2018).
Bonshoms, M. et al. Validation of ERA5-Land temperature and relative humidity on four Peruvian glaciers using on-glacier observations. Journal of Mountain Science 19, 1849–1873, https://doi.org/10.1007/s11629-022-7388-4 (2022).
Pinche Laurre, C. Estudio de las condiciones climáticas y de la niebla en la costa norte de Lima. Tech. Rep., Universidad Nacional Agraria La Molina, Lima (Peru). Facultad de Ciencias (1986).
Schemenauer, R. S. & Cereceda, P. Meteorological conditions at a coastal fog collection site in Peru. Atmosfera 6, 175–188, https://www.redalyc.org/articulo.oa?id=56506304 (1993).
Navarro-Serrano, F. et al. Maximum and minimum air temperature lapse rates in the Andean region of Ecuador and Peru. International Journal of Climatology 40, 6150–6168, https://doi.org/10.1002/joc.6574 (2020).
Vermote, E. & Wolfe, R. MODIS/Terra surface reflectance daily L2G Global 1 km and 500 m SIN Grid V061. NASA EOSDIS Land Processes DAAC https://doi.org/10.5067/MODIS/MOD09GA.061 (2021).
Walton, D. & Hall, A. An assessment of high-resolution gridded temperature datasets over California. Journal of Climate 31, 3789–3810, https://doi.org/10.1175/JCLI-D-17-0410.1 (2018).
Huerta, A. Code of PISCOt v1.2. figshare. https://doi.org/10.6084/m9.figshare.24602373.v1 (2023).

Acknowledgements

The new version of PISCOt was developed with support from the Newton-Paulet fund within the project ‘Water security and climate change adaptation in Peruvian glacier-fed river basins’ (RAHU) under contract N°005-2019-FONDECYT. A.H. acknowledges additional financial support from the project “Natural Infrastructure for Water Security” (NIWS), an initiative promoted and financed by the United States Agency for International Development (USAID) and the Canadian Government. A.H. also acknowledges financial support from the project ‘Enhancing Adaptive Capacity of Andean Communities through Climate Services’ (ENANDES), executed by the National Meteorological and Hydrological Services of Colombia (IDEAM), Chile (DMC) and Peru (SENAMHI), and the WMO Regional Climate Centre for Western South America (CIIFEN). P.R. acknowledges support from the British Academy fund KF400238, ‘El Niño and flash floods in Peru: Bringing knowledge on “Furia de los rios” and “Western science” to understand lag time’. We are grateful for the freely available global products: ERA5-Land climate reanalysis data from the Copernicus Climate Change Service (C3S) Climate Data Store at https://cds.climate.copernicus.eu/; TerraClimate data from the Climatology Lab portal at https://www.climatologylab.org/terraclimate.html; and CHIRTS data from the Climate Hazards Center at https://www.chc.ucsb.edu/data. In addition, PISCOt v1.1 was obtained from the IRI/LDEO Climate Data Library at http://iridl.ldeo.columbia.edu/SOURCES/.SENAMHI/.HSR/.PISCO/, and VS2018 from http://hdl.handle.net/10261/139347.

Author information

Adrian Huerta
Present address: Institute of Geography and Oeschger Centre for Climate Change Research, University of Bern, Bern, Switzerland

Authors and Affiliations

Servicio Nacional de Meteorología e Hidrología (SENAMHI), Lima, Perú
Adrian Huerta, Kris Correa, Oscar Felipe-Obando & Waldo Lavado-Casimiro
Departamento de Física y Meteorología, Universidad Nacional Agraria La Molina (UNALM), Lima, Perú
Adrian Huerta
Image Processing Laboratory, University of Valencia, 46980, Valencia, Spain
Cesar Aybar
High Mountain Ecosystem Research Group, National University of San Marcos, 15081, Lima, Peru
Cesar Aybar
Institute of Geography, University of Bern, Bern, Switzerland
Noemi Imfeld
Oeschger Centre for Climate Change Research, University of Bern, Bern, Switzerland
Noemi Imfeld
Centro de Investigación y Tecnología del Agua (CITA), Departamento de Ingeniería Ambiental, Universidad de Ingeniería y Tecnología (UTEC), Lima, Perú
Pedro Rau
Geography and the Environment, Department of Humanities, Pontificia Universidad Católica del Perú, Lima, Peru
Fabian Drenkhan

Authors

Adrian Huerta
Cesar Aybar
Noemi Imfeld
Kris Correa
Oscar Felipe-Obando
Pedro Rau
Fabian Drenkhan
Waldo Lavado-Casimiro

Contributions

A.H. led the publication, wrote the first draft of the manuscript, and developed the methodology in consultation with W.L.C. A.H., C.A. and K.C. collected the station, reanalysis, and satellite data. A.H. pre-processed the station data. A.H. and C.A. produced the gridding of station data. A.H. and N.I. validated the data. O.F.B., P.R., F.D. and W.L.C. supervised the dataset construction and provided professional advice. All authors were involved in discussions with regard to data development, and all reviewed the manuscript.

Corresponding author

Correspondence to Adrian Huerta.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

SUPPLEMENTARY INFORMATION

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Huerta, A., Aybar, C., Imfeld, N. et al. High-resolution grids of daily air temperature for Peru - the new PISCOt v1.2 dataset. Sci Data 10, 847 (2023). https://doi.org/10.1038/s41597-023-02777-w

Received: 30 December 2022
Accepted: 23 November 2023
Published: 01 December 2023
DOI: https://doi.org/10.1038/s41597-023-02777-w