Global daily 1 km land surface precipitation based on cloud cover-informed downscaling

Karger, Dirk Nikolaus; Wilson, Adam M.; Mahony, Colin; Zimmermann, Niklaus E.; Jetz, Walter

doi:10.1038/s41597-021-01084-6


Data Descriptor
Open access
Published: 26 November 2021


Scientific Data volume 8, Article number: 307 (2021)


Abstract

High-resolution climatic data are essential to many questions and applications in environmental research and ecology. Here we develop and implement a new semi-mechanistic downscaling approach for daily precipitation estimates that incorporates high-resolution (30 arcsec, ≈1 km) satellite-derived cloud frequency. The downscaling algorithm incorporates orographic predictors such as wind fields, valley exposition, and boundary layer height, with a subsequent bias correction. We apply the method to the ERA5 precipitation archive and MODIS monthly cloud cover frequency to develop a daily gridded precipitation time series at 1 km resolution for the years 2003 onward. Comparison of the predictions with existing gridded products and station data from the Global Historical Climate Network indicates an improvement in the spatio-temporal performance of the downscaled data in predicting precipitation. Regional scrutiny of the cloud cover correction from the continental United States further indicates that CHELSA-EarthEnv performs well in comparison to other precipitation products. The CHELSA-EarthEnv daily precipitation product improves the temporal accuracy compared with ERA5, with a large improvement in the spatial accuracy, especially in complex terrain.

Measurement(s)	hydrological precipitation process
Technology Type(s)	cloud-cover informed downscaling
Factor Type(s)	temporal interval • geographic location
Sample Characteristic - Environment	climate system • cloud
Sample Characteristic - Location	global

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.16910344


Background & Summary

High-resolution information on precipitation is essential in many scientific fields, ranging from ecology, agriculture, and forestry to global change impact studies^1,2,3. Spatiotemporal precipitation data are usually derived from a range of different sources, including satellites, reanalysis, global circulation models, or precipitation gauges^4,5. However, each of these sources on its own has limitations in coverage, accuracy, or detail, impeding many downstream uses, especially those addressing large spatial and temporal extents^6,7.

Reanalysis data products such as ERA5^8,9, MERRA 2^10,11 or MSWEP¹² overcome these constraints by combining data from a variety of sources. To date, however, they remain limited to rather coarse spatial resolutions such as 0.5°–0.25°, i.e. ca. 55–27 km near the equator. This is much coarser than the scale of many environmental and ecological processes and the associated data requirements for ecosystem management and conservation. This resolution is furthermore too coarse to capture orographic precipitation in complex terrain^13,14,15. Global circulation and weather models such as WRF-ARF¹⁶ and ICON^17,18 are able to run at high spatial resolutions of 1 km, but are still heavily constrained by computational limits⁷. Currently, global kilometer-scale models are only able to achieve a simulation throughput of 0.043 SYPD (simulated years per day)¹⁹, which amounts to a 100x shortfall compared to computationally efficient simulations defined as 1 SYPD^7,20. Even with the largest supercomputers and with state-of-the-art climate models, as well as large financial investments, this shortfall can only be reduced to approximately 20x²¹.

Although achieving 1 km resolution in numerical climate models is important to quantify effects such as deep convection or surface drag²¹, studies focusing on the impact of climate on different systems often rely on a limited set of climatic variables. In ecological studies for example, precipitation together with minimum-, mean-, and maximum temperatures, are often used to delineate occurrences of species²². It is common to characterize the range of a species by its climatic envelope in e.g. species distribution models (SDMs) using a rather simple set of climatic predictors and derivations thereof^23,24. This means that for applications like these, only a subset of available variables needs to be downscaled to finer spatial resolution.

Environmental scientists or climate impact modelers in need of high-resolution precipitation data therefore often resort to data from computationally less intensive methods. One such method is the spatial interpolation of data from climatic stations. Here, precipitation gauges form the input for interpolation²⁵ or regression models to achieve a high spatial resolution either with or without additional, often terrain-derived, predictors^26,27,28. Such interpolations, however, usually suffer from a spatially uneven station density^29,30,31,32 and severely underestimate snowfall^6,33,34,35. While gauge undercatch can be corrected using statistical methods in combination with streamflow observations³⁶, the spatially uneven distribution of gauges can lead to false parametrizations of precipitation lapse rates in regression-based interpolation methods³⁷. One method to overcome the limitation imposed by uneven station density is to directly downscale the output of reanalysis data calculated at coarser cell size^37,38,39. However, there are still interpolations and parametrizations involved which account for processes not resolved at the original model resolution.

This uneven distribution of gauges can be overcome by the use of satellite data^{2,40,41,42,43}, which offers spatially more complete information of precipitation patterns. Yet, satellites also detect snowfall poorly^44,45, meaning that the satellite-derived amounts of precipitation have to be corrected. This is usually done by bias-correction using station observations^{46,47,48,49,50}. Although satellite precipitation products generally have a higher horizontal resolution than reanalysis data, they are globally still not available at resolutions of 1 km needed for local impact studies. However, available at this very high resolution is cloud cover from satellites, which can potentially lead to improved spatial representation of precipitation. The established relationship between cloud occurrences and precipitation^51,52 scales precipitation with cloud cover frequency such that if no clouds occur there is no precipitation, and increasing cloud frequency translates to increasing precipitation⁵³.

Here we merge data from a downscaled reanalysis (ERA5) using the CHELSA algorithm^37,39 with cloud cover information derived from MODIS from the EarthEnv layer suite (https://www.earthenv.org)⁵⁴ to achieve a better representation of the fine-scale variation of global precipitation patterns. The presented CHELSA-EarthEnv daily precipitation data at ~1 km horizontal resolution offers a more reliable characterization of precipitation in topographically heterogeneous regions and supports a range of applications that require high resolution precipitation data.

Methods

Bias correction of ERA5 precipitation data

ERA5 shows an increased performance over its predecessor ERA-Interim in several attributes⁵⁵ and especially in precipitation⁵⁶. Nevertheless, for application in impact studies, there is often still a significant bias in several parameters that needs to be accounted for⁵⁷. For accumulated parameters such as total daily precipitation, we used the monthly sum of the hourly precipitation from ERA5 p_era to assess this bias. ERA5-generated estimates of the surface precipitation, similar to those of its predecessor ERA-Interim, are extracted from short-range forecasts, which vary considerably with forecast length⁵⁸. This bias in the short-range forecasts can be a problem for monthly and climatological means as it accumulates over time⁵⁸. Several methods exist to account for such biases, but most of them require gapless gridded observational data, which come with an inherent interpolation error themselves⁵⁷. To correct the bias in the ERA5 precipitation estimates and account for the interpolation errors, we therefore performed a bias correction which consists of three steps.

1.
One very common approach to account for reanalysis bias is to calculate the difference between baseline precipitation from the reanalysis and the observed precipitation from station data and apply this ‘change factor’ to the reanalysis data. We apply a monthly bias correction on the accumulated ERA5 precipitation for each month p_sim. We used the monthly accumulated precipitation p_obs of the gridded GPCC 2018 dataset⁵⁹. The bias correction in earlier versions of CHELSA did not adequately interpolate across the dateline, which caused artefacts in this region. To correct for this we reprojected both p_era and p_obs to a North Pole Azimuthal Equidistant projection (EPSG: 102016) to allow interpolation across the dateline. We then calculate the monthly bias R_m caused by the ERA5 parametrization, and the excessive or insufficient precipitation of the forecast algorithm for each month using:
$${R}_{m}^{obs}=\frac{{p}_{obs}+c}{{p}_{sim}+c}$$
with c being a constant of 0.0001 kg*m⁻²*s⁻¹ to avoid division by zero. We only used grid cells with meteorological stations present for the calculation of the observed bias ${R}_{m}^{obs}$. The forecast algorithm used to produce the precipitation amounts for ERA5 exhibits a considerable bias (too much or too little precipitation) that has a coherent spatial structure, with a larger bias over high-elevation terrain or specific landforms such as tropical rainforests. Based on this observation, we assumed that grid cells without stations share a similar bias as their neighbouring stations.
2.
To achieve a gap-free bias correction grid surface, we interpolated the gaps in the ${R}_{m}^{obs}$ grid using a multilevel B-spline interpolation⁶⁰ with 14 error levels optimized using B-spline refinement to a 0.25° resolution. The multilevel B-spline approximation⁶⁰ applies a B-spline approximation to ${R}_{m}^{obs}$ starting with the coarsest lattice ${\phi }_{0}$ from a set of control lattices ${\phi }_{0},{\phi }_{1},\ldots ,{\phi }_{n}$ with n = 14 that have been generated using optimized B-spline refinement⁶¹. The resulting B-spline function ${f}_{0}$ gives the first approximation of ${R}_{m}^{obs}$ and leaves a deviation ${\Delta }^{1}{R}_{m\,c}^{obs}={R}_{m\,c}^{obs}-{f}_{0}({x}_{c},{y}_{c})$ at each location $\left({x}_{c},{y}_{c},{R}_{m\,c}^{obs}\right)$⁶¹. The next control lattice ${\phi }_{1}$ is then used to compute a function ${f}_{1}$ that approximates ${\Delta }^{1}{R}_{m\,c}^{obs}$⁶¹. The sum ${f}_{0}+{f}_{1}$ leaves the smaller deviation ${\Delta }^{2}{R}_{m\,c}^{obs}={R}_{m\,c}^{obs}-{f}_{0}\left({x}_{c},{y}_{c}\right)-{f}_{1}\left({x}_{c},{y}_{c}\right)$ at each point $\left({x}_{c},{y}_{c},{R}_{m\,c}^{obs}\right)$. This approximation is repeated for all n control lattices, and the sum of the resulting functions gives the gap-free interpolated bias surface ${R}_{m}^{int}$⁶¹.
3.
The bias correction surface ${R}_{m}^{int}$ is then multiplied with the ERA5 precipitation p_sim to get the bias corrected monthly precipitation sums ${p}_{m}^{cor}$ at 0.25° resolution:

$${p}_{m}^{cor}={p}_{sim}\ast {R}_{m}^{int}$$
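The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration on a toy 2×2 grid, not the authors' implementation: the function names are hypothetical, and the gap filling uses a simple nearest-neighbour copy as a stand-in for the multilevel B-spline interpolation described in step 2.

```python
import numpy as np

C = 1e-4  # constant c from the text, avoids division by zero


def bias_ratio(p_obs, p_sim, c=C):
    """Step 1: R_m^obs = (p_obs + c) / (p_sim + c).

    Cells without stations are NaN in p_obs and stay NaN in the ratio.
    """
    return (p_obs + c) / (p_sim + c)


def fill_gaps_nearest(ratio):
    """Step 2 stand-in: copy the ratio of the nearest cell that has a station.

    ASSUMPTION: this replaces the multilevel B-spline interpolation used in
    the paper with a much cruder nearest-neighbour fill.
    """
    filled = ratio.copy()
    valid = np.argwhere(~np.isnan(ratio))
    for idx in np.argwhere(np.isnan(ratio)):
        d2 = ((valid - idx) ** 2).sum(axis=1)
        filled[tuple(idx)] = ratio[tuple(valid[d2.argmin()])]
    return filled


# Toy monthly sums (illustrative numbers only)
p_sim = np.array([[100.0, 80.0], [50.0, 0.0]])       # ERA5
p_obs = np.array([[120.0, np.nan], [np.nan, 0.0]])   # gridded gauges, NaN = no station

r_obs = bias_ratio(p_obs, p_sim)
r_int = fill_gaps_nearest(r_obs)
p_cor = p_sim * r_int  # Step 3: p_m^cor = p_sim * R_m^int
```

Note how the constant c makes the ratio equal to 1 where both the observed and simulated sums are zero, so dry cells stay dry after the multiplication.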

Orographic wind effects

Orographic effects are among the most reported drivers of precipitation^{62,63,64,65,66}. Orographic effects have been taken into account using a variant of the CHELSA V1.2 algorithm which uses a parametrization of orographic rainfall based on wind fields^67,68,69,70. We used daily u-wind and v-wind components at the 10-m level of ERA5 as underlying wind components. As the calculation of a windward-leeward index H (hereafter: wind effect) requires a projected coordinate system, both wind components (u-wind, v-wind) were projected to a world Mercator projection and then interpolated to a 3 km grid resolution using a multilevel B-spline interpolation similar to the one used for the bias correction surface. The resolution of 3 km was chosen as resolutions of around 1 km would over-represent orographic terrain effects²⁶. The wind effect H was then calculated by multiplying the windward H_W and leeward H_L components calculated using:

$${H}_{W}=\frac{\sum_{i=1}^{n}\frac{1}{{d}_{WHi}}\tan^{-1}\left(\frac{{d}_{WZi}}{{d}_{WHi}^{0.5}}\right)}{\sum_{i=1}^{n}\frac{1}{{d}_{LHi}}}+\frac{\sum_{i=1}^{n}\frac{1}{{d}_{LHi}}\tan^{-1}\left(\frac{{d}_{LZi}}{{d}_{LHi}^{0.5}}\right)}{\sum_{i=1}^{n}\frac{1}{{d}_{LHi}}}$$

$${H}_{L}=\frac{\sum_{i=1}^{n}\frac{1}{\ln\left({d}_{WHi}\right)}\tan^{-1}\left(\frac{{d}_{LZi}}{{d}_{WHi}^{0.5}}\right)}{\sum_{i=1}^{n}\frac{1}{\ln\left({d}_{LHi}\right)}}$$

where ${d}_{WHi}$ and ${d}_{LHi}$ refer to the horizontal distances between the focal 3 km grid cell in windward and leeward direction and ${d}_{WZi}$ and ${d}_{LZi}$ are the corresponding vertical distances compared with the focal 3 km cell following the wind trajectory. Distances are summed over a search distance of 75 kilometers as orographic airflows are limited to horizontal extents between 50–100 km^71,72. The second summand in the equation for ${H}_{W,L}$ where ${d}_{LHi} < 0$ accounts for the leeward impact of previously traversed mountain chains. The horizontal distances in the equation for ${H}_{W,L}$ where ${d}_{LHi}\ge 0$ lead to a longer-distance impact of leeward rain shadow. The final wind-effect parameter, which is assumed to be related to the interaction of the large-scale wind field and the local-scale precipitation characteristics, is calculated as:

$$H={H}_{W,L}\to {d}_{LHi} < 0\ast {H}_{W,L}\to {d}_{LHi}\ge 0$$

and generally takes values between 0.7 for leeward and 1.3 for windward positions. Both equations were applied to each grid cell at the 3 km resolution in a World Mercator projection.

We used the boundary layer height PBL from ERA5 as an indicator of the pressure level that has the highest contribution to the wind effect. PBL and H have been interpolated to a 30 arc second resolution using a B-spline interpolation. To create a boundary layer height corrected wind effect H_B, the wind effect grid H was then proportionally distributed to all grid cells falling within a respective 0.25° grid cell using:

$${H}_{B}=\frac{H}{1-\left(\frac{| z-PB{L}_{z}| \,-\,{z}_{max}}{h}\right)}$$

with z_max being the maximum distance between the boundary layer height PBL_z and the elevation z of all grid cells at a 30 arc sec resolution falling within a respective 0.25° grid cell, h being a constant of 9000 m, and z being the respective elevation from the Global Multi-resolution Terrain Elevation Data (GMTED2010)⁷² with:

$$PB{L}_{z}=PBL+{z}_{ERA}+f$$

where PBL is the daily mean height of the boundary layer from ERA5, z_ERA is the elevation of the ERA5 grid cell, and f is a tuning constant. The boundary layer height provided by ECMWF is based on the Richardson number⁷³, which is usually at the lower end of the elevational spectrum compared to other methods⁷⁴. We therefore tuned our model by setting f to a constant of 500 m, similar to the approach in the original CHELSA algorithm³⁷.
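The boundary layer correction can be written as a literal transcription of the two equations above. The following sketch assumes that all arrays belong to a single 0.25° cell and that z_max is the largest absolute distance between the cell elevations and PBL_z; the function name and the toy numbers are illustrative only.

```python
import numpy as np


def boundary_layer_wind_effect(H, z, pbl, z_era, f=500.0, h=9000.0):
    """Return H_B for all 30 arc sec cells inside one 0.25 degree cell.

    H, z  : arrays of wind effect and elevation (m) of the fine cells
    pbl   : daily mean boundary layer height from ERA5 (m above ground)
    z_era : elevation of the ERA5 grid cell (m)
    f, h  : constants from the text (500 m and 9000 m)
    """
    pbl_z = pbl + z_era + f                 # PBL_z = PBL + z_ERA + f
    dist = np.abs(z - pbl_z)                # |z - PBL_z|
    z_max = dist.max()                      # ASSUMPTION: max distance in the coarse cell
    return H / (1.0 - (dist - z_max) / h)   # H_B


# Toy example: three fine cells at different elevations
H = np.array([1.0, 1.1, 1.2])
z = np.array([200.0, 1000.0, 2500.0])
H_B = boundary_layer_wind_effect(H, z, pbl=800.0, z_era=700.0)
```

Under this reading the denominator is never smaller than 1, so the correction only damps the wind effect, and the cell farthest from the boundary layer keeps its original value of H.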

Although the wind effect algorithm can distinguish between the windward and leeward sides of an orographic barrier, it cannot distinguish extremely isolated valleys in high mountain areas⁷⁵. Such dry valleys are situated in areas where the wet air masses flow over an orographic barrier and are prevented from flowing into deep valleys⁷⁵. These effects are, however, mainly confined to large mountain ranges, and are not as prominent in intermediate mountain ranges⁷². To account for these effects, we used a variant of the windward-leeward equations with a linear search distance of 300 km in steps of 5° from 0° to 355° circular for each grid cell. The calculated leeward index was then scaled towards higher elevations using:

$$E={\left(\frac{{\sum }_{i=1}^{n}\frac{1}{{\rm{ln}}\left({d}_{WHi}\right)}ta{n}^{-1}\left(\frac{{d}_{LZi}}{{d}_{WHi}^{0.5}}\right)}{{\sum }_{i=1}^{n}\frac{1}{{\rm{ln}}\left({d}_{LHi}\right)}}\right)}^{\frac{z}{h}}$$

which rescales the strength of the exposition index relative to elevation z from GMTED2010, and gives valleys at high elevations larger wind isolations E than valleys located at low elevations. The correction constant h was set to 9000 m to include all possible elevations of the DEM and because values of z > h could otherwise lead to a reverse relationship between z and E.

$${p}_{{I}_{c}}=E\ast {H}_{B}$$

This product gives the first approximation of precipitation intensity ${p}_{{I}_{c}}$ at each grid location (${x}_{c},{y}_{c}$).

Precipitation including orographic effects

To achieve the distribution of daily precipitation p_o given the approximated precipitation intensity ${p}_{Ic}$ at each grid location (${x}_{c},{y}_{c}$), we used a linear relationship between ${p}_{m}^{cor}$ and ${p}_{Ic}$ using:

$${p}_{o}=\frac{{p}_{Ic}}{\frac{1}{n}{\sum }_{i=1}^{n}\,{p}_{Ici}}\ast {p}_{m}^{cor}$$

where n equals the number of 0.0083334° grid cells that fall within a 0.25° grid cell. This equation ensures that the mean precipitation of all 0.0083334° cells that overlap with a 0.25° cell exactly matches the precipitation at 0.25° resolution.
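The rescaling can be illustrated with a block-wise NumPy operation. The sketch below assumes that the fine grid nests exactly into the coarse grid with k × k fine cells per coarse cell (k = 30 for 0.0083334° within 0.25°; k = 2 in the toy example) and that the intensity is strictly positive; the function name is hypothetical.

```python
import numpy as np


def rescale_to_coarse(p_int, p_coarse, k):
    """p_o = p_Ic / mean(p_Ic within coarse cell) * p_cor, applied block-wise.

    p_int    : fine-resolution intensity, shape (ny * k, nx * k), values > 0
    p_coarse : bias-corrected coarse precipitation, shape (ny, nx)
    k        : number of fine cells per coarse cell along each axis
    """
    ny, nx = p_coarse.shape
    blocks = p_int.reshape(ny, k, nx, k)             # [coarse row, fine row, coarse col, fine col]
    block_mean = blocks.mean(axis=(1, 3))            # (1/n) * sum of p_Ic per coarse cell
    scaled = blocks / block_mean[:, None, :, None] * p_coarse[:, None, :, None]
    return scaled.reshape(ny * k, nx * k)


p_int = np.array([[1.0, 3.0, 2.0, 2.0],
                  [2.0, 2.0, 1.0, 5.0],
                  [1.0, 1.0, 4.0, 4.0],
                  [1.0, 1.0, 4.0, 4.0]])
p_cor = np.array([[10.0, 20.0],
                  [0.0, 8.0]])

p_o = rescale_to_coarse(p_int, p_cor, k=2)
# Averaging p_o back to the coarse grid reproduces p_cor
check = p_o.reshape(2, 2, 2, 2).mean(axis=(1, 3))
```

The intensity only redistributes precipitation within each coarse cell; the coarse-cell mean is fixed by the bias-corrected input.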

The GPCC dataset used for the bias correction does not include a correction for gauge undercatch. We therefore additionally correct for gauge undercatch using a downscaled version of the bias correction layers from Beck et al. 2020³⁶. We downscaled the bias correction surfaces to 0.0083334° by using a moving window regression with a search radius of three cells and elevation from GMTED2010 as predictor. We then multiplied this downscaled bias correction layer with p_o.

Monthly cloud frequencies

To derive monthly cloud frequencies, we used the internal cloud mask of the MODIS MOD09 atmospherically corrected surface reflectance product^76,77, which is generated in the PGE11 program and relies on two reflective tests and one thermal test. The first reflective test combines the shortwave and middle infrared data in the “middle infrared anomaly” index (MIRA = ρ20,21 − 0.82ρ7 + 0.32ρ6, where the number following ρ indicates the MODIS band). The second test uses reflectance at 1.38 microns (1.38 mic = ρ26). The MIRA and the 1.38 mic reflectance are designed to be complementary, with MIRA efficiently detecting low or highly reflective clouds⁷⁷, while 1.38 mic effectively detects high (and potentially not very reflective) clouds. Additionally, a thermal test is used to identify pixels with high infrared reflectance anomalies (e.g., fires, sun-glint, and high albedo surfaces) with respect to near-surface (2 m) air temperature computed by the NCEP reanalysis model⁷⁸. The MOD09 cloud algorithm was designed to minimize confusion over snow and ice by taking the surface air temperature into account. Like many cloud masks, the MOD09 detection algorithm has a binary response (cloudy/not cloudy) and does not retain an estimate of confidence in cloud state (i.e., the probability that the pixel was actually cloudy given the tests). We extracted the daily cloud flags from bit 10 of the daily daytime surface reflectance product “state 1 km” Scientific Data Set (SDS) from both the Terra (MOD09GA, collected at approximately 10:30 AM local time) and Aqua (MYD09GA, approximately 1:30 PM) satellites. The time series of monthly cloud frequencies (proportion of days with a positive cloud flag) was calculated separately for the daily MOD09GA and MYD09GA data using the Google Earth Engine application programming interface (http://earthengine.google.org/).
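The extraction of the cloud flag and the monthly frequency can be sketched with plain NumPy bit operations. This is not the Google Earth Engine workflow used by the authors; it assumes the daily "state 1 km" values are already loaded as an integer array with one row per day, and it ignores missing observations. The function names are hypothetical.

```python
import numpy as np

CLOUD_BIT = 10  # bit of the "state 1 km" SDS holding the internal cloud flag


def cloud_flag(state_1km):
    """Return True where bit 10 of the state word is set (cloudy)."""
    state = np.asarray(state_1km).astype(np.uint16)
    return ((state >> CLOUD_BIT) & 1).astype(bool)


def monthly_cloud_frequency(daily_state):
    """Proportion of days with a positive cloud flag, per pixel.

    daily_state : array of shape (n_days, n_pixels)
    """
    return cloud_flag(daily_state).mean(axis=0)


# Toy month with four days and two pixels; 1024 == 1 << 10
daily_state = np.array([[1024, 0],
                        [1024, 1],
                        [0, 1024 + 3],
                        [1024, 0]])
cf_m = monthly_cloud_frequency(daily_state)  # pixel 0: 3 of 4 days, pixel 1: 1 of 4 days
```

Other bits of the state word (for example the value 1 or 3 in the lower bits of the toy data) do not affect the result, because only bit 10 is tested.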

Cloud frequency correction of daily precipitation estimates

We incorporate monthly cloud frequencies $c{f}_{m}$ into the daily precipitation estimates, assuming that the frequency of cloud occurrences is related to precipitation events and that their geographic distribution carries a spatial signal of precipitation^51,52. Strictly, we assume that where no clouds occur, no precipitation occurs, and where clouds occur more frequently, more precipitation occurs⁵³. To achieve the distribution of daily precipitation p given the approximated orographic corrected precipitation ${p}_{o}$ at each grid location (${x}_{c},{y}_{c}$), we first approximate the cloud cover corrected precipitation intensity using:

$${p}_{c{f}_{c}}={p}_{o}\ast c{f}_{m}$$

This however distorts the precipitation amount of each grid cell. We therefore repeat the step used to create orographic precipitation in a similar manner by estimating daily precipitation p at each grid location (${x}_{c},{y}_{c}$) using:

$$p=\frac{{p}_{c{f}_{c}}}{\frac{1}{n}{\sum }_{i=1}^{n}\,{p}_{c{f}_{ci}}}\ast {p}_{m}^{cor}$$

where n equals the number of 30 arc sec grid cells that fall within a 0.25° grid cell.
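A small worked example shows how the two equations interact for the fine cells of a single 0.25° cell. The function name and the numbers are illustrative, and the handling of a completely cloud-free coarse cell (returning zeros) is an assumption, since the text does not describe that case.

```python
import numpy as np


def cloud_corrected_precipitation(p_o, cf_m, p_cor):
    """Apply the cloud frequency weighting and rescale to the coarse value.

    p_o   : orographically corrected precipitation of the fine cells
    cf_m  : monthly cloud frequency of the same cells (0..1)
    p_cor : bias-corrected precipitation of the enclosing 0.25 degree cell
    """
    p_cfc = p_o * cf_m                    # intensity weighted by cloud frequency
    mean_cfc = p_cfc.mean()
    if mean_cfc == 0.0:                   # ASSUMPTION: no clouds anywhere -> no precipitation
        return np.zeros_like(p_cfc)
    return p_cfc / mean_cfc * p_cor       # restore the coarse-cell mean


p_o = np.array([4.0, 6.0, 10.0, 0.0])     # mean is 5.0
cf_m = np.array([0.5, 0.0, 1.0, 0.8])
p = cloud_corrected_precipitation(p_o, cf_m, p_cor=5.0)
# p is approximately [3.33, 0.0, 16.67, 0.0]
```

The cloud-free cell receives no precipitation, the frequently cloudy cell receives more than before, and the mean over the coarse cell is unchanged.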

Data Records

The dataset⁷⁹ is available at EarthEnv (https://doi.org/10079/MOL/6f52b80d-0a41-40f7-84ec-873458ca6ee6). All files are provided as georeferenced TIFF files (GeoTIFF). GeoTIFF is a public domain metadata standard which allows georeferencing information to be embedded within a TIFF file. Additional information included in the files comprises map projection, coordinate systems, ellipsoids, datums, and fill values.

GeoTIFF can be viewed using standard GIS software such as:

SAGA GIS – (free) http://www.saga-gis.org/

ArcGIS - https://www.arcgis.com/

QGIS - (free) www.qgis.org

DIVA – GIS - (free) http://www.diva-gis.org/

GRASS – GIS - (free) https://grass.osgeo.org/

All files contain variables that define the dimensions of longitude and latitude (Table 1). The time variable is usually encoded in the filename.

Table 1 Grid extent and resolution of the GeoTIFF files.


All files are in a geographic coordinate system referenced to the WGS 84 horizontal datum, with the horizontal coordinates expressed in decimal degrees. The extent (minimum and maximum latitude and longitude) is a result of the coordinate system inherited from the 1-arc-second GMTED2010 data, which itself inherited the grid extent from the 1-arc-second SRTM data.

The filename includes the respective model used, the variable short name, the respective time variables, and the version of the data:

[Model]_[short_name]_[day]_[month]_[year]_[Version].tif

There are two different models available: CHELSA, which includes the results from the bias correction and orographic correction, and CHELSA_EarthEnv, which includes the cloud cover correction as well.

The unit of precipitation in CHELSA_EarthEnv is: (kg*m⁻²*day⁻¹)/100.
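The naming scheme and the unit can be handled with a small helper. The sketch below is an assumption-laden illustration: the example filename, the short name `pr` and the version string `V1.0` are hypothetical, the short name is assumed to contain no underscore, and the unit is interpreted so that a stored value divided by 100 gives kg m⁻² day⁻¹.

```python
from datetime import date
from typing import NamedTuple


class ChelsaFile(NamedTuple):
    model: str
    short_name: str
    day: date
    version: str


def parse_filename(filename: str) -> ChelsaFile:
    """Split [Model]_[short_name]_[day]_[month]_[year]_[Version].tif.

    The model name may itself contain an underscore (CHELSA_EarthEnv),
    so the name is split from the right.
    """
    stem = filename[:-4] if filename.lower().endswith(".tif") else filename
    model, short_name, day, month, year, version = stem.rsplit("_", 5)
    return ChelsaFile(model, short_name, date(int(year), int(month), int(day)), version)


def to_kg_m2_day(stored_value: float) -> float:
    """ASSUMPTION: stored values are in units of 0.01 kg m^-2 day^-1."""
    return stored_value / 100.0


# Hypothetical filename following the pattern in the text
info = parse_filename("CHELSA_EarthEnv_pr_15_07_2010_V1.0.tif")
precip = to_kg_m2_day(1250)  # 12.5 kg m^-2 day^-1 under the stated assumption
```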

Technical Validation

To validate the performance of CHELSA_EarthEnv we are focusing on (a) the downscaling performance by calculating different performance metrics between coarse and high resolution and comparing observations from meteorological stations and (b) a comparison with similar high-resolution precipitation datasets (Table 2) within the continental United States where meteorological station density is high and of good quality.

Table 2 Overview of the precipitation datasets used for comparison and their respective properties and methodologies.


Validating the downscaling performance

To validate whether the downscaling to 0.0083334° resolution leads to a better performance over the coarser 0.25° gridded data that was used as forcing, we compare both resolutions with precipitation measured at Global Historical Climate Network – daily weather stations (GHCN-D)⁸⁰. The 0.25° resolution has been chosen as benchmark as it is the resolution of the forcing ERA5 data that is used as an input for CHELSA_EarthEnv. To set the performance changes into context, we additionally considered four comparable products (Table 2), PRISM AN81d, MSWEP 2.1, CHIRPS 2.0, and WorldClim 2.1, and repeated the analysis with these datasets over the continental United States except Alaska.

Assessing the global downscaling performance across several metrics

To validate the performance of CHELSA_EarthEnv globally we compare it to observations at meteorological stations from the GHCN-D⁸⁰ network for the period 2003–2016. We use only stations without any quality flags and compare them to the precipitation data at the coarse 0.25° and the high 0.0083334° spatial resolution.

Downscaling can affect different aspects of model performance such as bias, variability, or correlation coefficients. To test in a first step which metric is affected by the applied downscaling we calculated for each grid cell separately the Kling-Gupta efficiency (KGE) scores from daily time series from 2003 to 2016. KGE is a performance metric combining correlation, bias, and variability^81,82 and is defined as follows:

$$KGE=1-\sqrt{{\left(r-1\right)}^{2}+{\left(\beta -1\right)}^{2}+{\left(\gamma -1\right)}^{2}}$$

where the correlation component r is represented by the Pearson’s correlation coefficient, the bias component $\beta $ by the ratio of estimated and observed means, and the variability component $\gamma $ by the ratio of the estimated and observed coefficients of variation:

$$\beta =\frac{{\mu }_{s}}{{\mu }_{o}}\quad \text{and}\quad \gamma =\frac{{\sigma }_{s}/{\mu }_{s}}{{\sigma }_{o}/{\mu }_{o}}$$

where μ is the mean and σ the standard deviation, and the subscripts s and o indicate simulated and observed, respectively. KGE, r, β, and γ values all have their optimum at 1. KGE values between −0.41 and 1 indicate that the model estimates precipitation better than just taking the mean of the recorded precipitation at the gauges⁸³.
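As an illustration, the KGE and its three components can be computed for a single station time series as follows. This is a minimal sketch with a hypothetical function name and toy data; it assumes both series are free of missing values and have non-zero mean and variance.

```python
import numpy as np


def kling_gupta_efficiency(sim, obs):
    """Return (KGE, r, beta, gamma) for simulated and observed series."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]                               # Pearson correlation
    beta = sim.mean() / obs.mean()                                # bias ratio mu_s / mu_o
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())   # ratio of coefficients of variation
    kge = 1.0 - np.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)
    return kge, r, beta, gamma


obs = np.array([0.0, 2.0, 4.0, 10.0, 1.0])
perfect = kling_gupta_efficiency(obs, obs)         # all components equal 1, KGE = 1
doubled = kling_gupta_efficiency(2.0 * obs, obs)   # r = 1, beta = 2, gamma = 1, KGE = 0
```

Doubling every value leaves the correlation and the coefficient of variation untouched, so the whole KGE penalty in the second case comes from the bias component.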

We also calculated the percent bias (pbias) that reflects the average tendency of the modelled precipitation values ${p}_{sim}$ to be larger or smaller than their observed values ${p}_{obs}$ at the stations. The optimal value of pbias is 0, with low values indicating accurate model simulation. Positive values indicate an overestimation, whereas negative values indicate an underestimation. pbias is defined as follows:

$$pbias=100\ast \left(\frac{\sum_{i=1}^{n}\left({p}_{si{m}_{i}}-{p}_{ob{s}_{i}}\right)}{\sum_{i=1}^{n}{p}_{ob{s}_{i}}}\right)$$

Additionally, we also report the mean absolute error (mae), which is defined as:

$$mae=\frac{1}{n}\left(\sum_{i=1}^{n}\left|{p}_{si{m}_{i}}-{p}_{ob{s}_{i}}\right|\right)$$

and the root mean squared error (rmse), which is defined as:

$$rmse=\sqrt{\frac{1}{n}\left(\sum_{i=1}^{n}{\left({p}_{si{m}_{i}}-{p}_{ob{s}_{i}}\right)}^{2}\right)}$$
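The three error metrics are straightforward to compute; the sketch below uses hypothetical function names and a toy series of four daily values to make the arithmetic easy to follow.

```python
import numpy as np


def pbias(sim, obs):
    """Percent bias: 100 * sum(sim - obs) / sum(obs)."""
    return 100.0 * np.sum(sim - obs) / np.sum(obs)


def mae(sim, obs):
    """Mean absolute error."""
    return np.mean(np.abs(sim - obs))


def rmse(sim, obs):
    """Root mean squared error."""
    return np.sqrt(np.mean((sim - obs) ** 2))


obs = np.array([0.0, 2.0, 4.0, 10.0])
sim = np.array([1.0, 2.0, 3.0, 14.0])

# Differences are [1, 0, -1, 4]
pb = pbias(sim, obs)   # 100 * 4 / 16 = 25.0, an overestimation
m = mae(sim, obs)      # (1 + 0 + 1 + 4) / 4 = 1.5
rm = rmse(sim, obs)    # sqrt((1 + 0 + 1 + 16) / 4) = sqrt(4.5)
```

The example also shows why the metrics are reported together: positive and negative errors partly cancel in pbias, while mae and rmse count every deviation, with rmse weighting the single large error more heavily.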

Assessing the regional performance

To compare the results to similar precipitation datasets, we use GHCN-D and four other gridded datasets (Table 2) that provide data over the same time period: PRISM (AN81d)²⁶, MSWEP 2.1¹², CHIRPS 2.0⁴³, and WorldClim 2.1⁸⁴. PRISM is a high-resolution precipitation dataset for the United States that, similar to CHELSA_EarthEnv, takes orographic effects into account and additionally profits from a dense quality-controlled network of weather stations. While PRISM uses a regression approach to predict long-term precipitation climatologies, daily precipitation is derived from climatologically aided interpolation (CAI)⁸⁵. MSWEP 2.1 is a merged product from various sources (weather stations, reanalysis data, satellite observations) and consistently has high performance scores in comparison to other precipitation products⁶. CHIRPS is a high-resolution precipitation dataset that integrates remotely sensed precipitation with observations from weather stations. Additionally, we also include the WorldClim 2.1 data in our comparison. Although WorldClim 2.1 does not offer daily data, it provides monthly time series that have been created using climatologically aided interpolation of the CRU-TS 4.03 data⁸⁶. All these datasets have been aggregated over the period 2003–2016 to annual means to gain a temporal extent comparable to that of CHELSA_EarthEnv. We then compare these data from the different datasets with observations from GHCN-D⁸⁰ for the continental United States except Alaska. Within this spatial extent all five products overlap and the quality of the stations can be considered as high. All products have additionally been aggregated to a 0.25° grid resolution by taking the mean of all grid cells overlapping with a 0.25° grid cell in WGS84 geographic projection.
We then used all stations with data available between 2003 and 2016 and without any quality flag (58,071 stations) and extracted precipitation from both the highest available spatial resolution of the different datasets (Table 2) and the coarse 0.25° resolution using a nearest neighbour approach. We then calculated the differences in absolute bias between coarse and high resolution, and compared these among products using an ANOVA with post-hoc Tukey HSD test.

Comparison with PRISM

The validation of the temporal accuracy using the GHCN-D station data gives information on how well a product reproduces precipitation directly at the locations of these stations. All products we compare to CHELSA here are, however, at least partly parameterized on a subset of the GHCN-D stations as well. This often leads to a high fit with station data at the station locations in all products that use exactly these climate stations. However, predicted precipitation patterns between stations, where the data are actually interpolated or predicted, cannot be validated in this way. The performance of a model in predicting the spatial patterns of precipitation correctly could, for example, be assessed by a cross-validation approach, but this is not possible without the station data or the source code of the respective model being available. As the exact station data each dataset uses are generally not available, we use the spatially explicit PRISM model as a benchmark for comparison. PRISM has a very high accuracy and captures small-scale precipitation gradients well. It uses the largest number of meteorological stations of all models compared here. It is, however, also a model and therefore has its own inherent biases. To compare models, we aggregated the daily values (monthly for WorldClim) over 2003–2016 to mean annual precipitation, and calculated the bias and correlation between products.

Comparing precipitation lapse rates

In a case study, we compare CHELSA-EarthEnv’s annual precipitation climatology in coastal British Columbia with that of PRISM, simulation data from the Weather Research and Forecasting (WRF) convection-permitting dynamical simulation for North America⁸⁷, and WorldClim 2.1. We calculated horizontal precipitation gradients for each grid cell by multiplying the precipitation lapse rate by the terrain slope. The precipitation lapse rate is calculated from a moving window regression of precipitation against elevation in the 8 cells surrounding the focal cell.
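For a single focal cell, the lapse rate calculation can be sketched as an ordinary least-squares fit over the eight neighbours of a 3×3 window. The function name, the toy window and the way the terrain slope is supplied (as a dimensionless rise over run) are assumptions made for illustration; the fit is undefined when all eight neighbours share the same elevation.

```python
import numpy as np


def precipitation_lapse_rate(precip_win, elev_win):
    """Slope of precipitation against elevation in the 8 cells around the focal cell.

    precip_win, elev_win : 3 x 3 arrays centred on the focal cell
    """
    neighbours = np.ones((3, 3), dtype=bool)
    neighbours[1, 1] = False                      # exclude the focal cell itself
    slope, _intercept = np.polyfit(elev_win[neighbours], precip_win[neighbours], 1)
    return slope


# Toy window: terrain rises 100 m per cell towards the east
elev = np.array([[0.0, 100.0, 200.0],
                 [0.0, 100.0, 200.0],
                 [0.0, 100.0, 200.0]])
precip = 500.0 + 0.8 * elev                       # 0.8 precipitation units per metre of elevation

lapse = precipitation_lapse_rate(precip, elev)    # approximately 0.8
terrain_slope = 100.0 / 1000.0                    # ASSUMPTION: 100 m rise over a 1000 m cell
horizontal_gradient = lapse * terrain_slope       # approximately 0.08 per metre of horizontal distance
```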

Assessing the improvement from the cloud layers

We validate the inclusion of the cloud frequencies from MODIS in two steps. First, we compare the global performance of the precipitation dataset with and without cloud refinement using GHCN-D. The refinement, however, is done at the 0.0083334° resolution, and the mesoscale patterns of the data with or without refinement are nearly identical. To compare the two datasets with and without refinement at the scale where an effect of the cloud layer is actually expected, we use the island of Hawai’i as an example. Here both the station density and the quality of the stations are high, and the island has steep precipitation gradients ranging from nearly 0 to >20 kg m⁻²s⁻¹. We use 105 stations from the GHCN-D dataset that recorded at least 25 days per month between 2003 and 2016 and compare their annual mean precipitation to the one derived at the original 0.25° resolution, the data without cloud refinement, and the data with cloud refinement at 0.0083334° resolution.

Global downscaling performance across several metrics

Kling-Gupta Efficiency (KGE) and Pearson’s r values are highest in Europe, Central Asia, and North America (Supplementary Fig. 1). The lowest values are found within the tropics, but also in areas with very high precipitation, such as Venezuela, Colombia, or the Congo basin, or very low precipitation, such as the Sahara or the Arabian Peninsula. There are several possible explanations for the relatively lower performance in the tropics. We use the GHCN-D dataset for validation, as it is one of the few available datasets for large-scale, global validation of precipitation. Gauge data such as GHCN-D are, however, very heterogeneous in quality³⁰,⁸³,⁸⁵,⁸⁶ and, even after cleaning using the provided quality flags, errors likely remain. The lower validation performance in these regions may therefore be partially an artefact of poor station data quality.

Differences in KGE values between coarse and high resolution are larger in areas with large spatial heterogeneity such as mountains (Fig. 1). This shows that the downscaling has a positive effect on the estimation of precipitation at high spatial resolutions (Table 3). The increase in KGE values is, however, not confined to areas with heterogeneous terrain, but also occurs in the lowlands of the United States and Europe. The high-resolution data show improvements in KGE and all of its components compared to the coarse 0.25° data. Performance gains are also shown for the root mean squared error (rmse), mean absolute error (mae), and percent bias (pbias) (Fig. 1). The global performance gain is ΔKGE = 0.045, but it shows a strong geographical pattern (Fig. 1), with gains especially in mountainous regions such as the Andes or the Rocky Mountains, but also in large parts of Asia. Performance losses are most prominent in western Indonesia, whereas the rest of Indonesia shows a gain in KGE.
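For readers who want to reproduce such comparisons at their own stations, the three error measures named above can be computed as in the following sketch. The function names are hypothetical, pbias is assumed to follow the common convention 100 · Σ(sim − obs) / Σ(obs), and the "gain" is assumed to be the reduction in error magnitude from the coarse to the high-resolution data; the text above does not spell out either definition.

```python
import numpy as np

def error_metrics(sim, obs):
    """rmse, mae and percent bias of simulated against observed values."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = np.isfinite(sim) & np.isfinite(obs)  # drop missing pairs
    err = sim[valid] - obs[valid]
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        # Assumed convention: positive values indicate overestimation
        "pbias": float(100.0 * err.sum() / obs[valid].sum()),
    }

def performance_gain(coarse_sim, fine_sim, obs):
    """Reduction in error magnitude; positive means the fine data are better."""
    coarse = error_metrics(coarse_sim, obs)
    fine = error_metrics(fine_sim, obs)
    return {name: abs(coarse[name]) - abs(fine[name]) for name in coarse}
```

Here `coarse_sim` and `fine_sim` would hold the 0.25° and 0.0083334° values extracted at the same station and days as `obs`.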

Table 3 Global test metrics for a comparison between the downscaled CHELSA_EarthEnv data and the original ERA5 data based on 122,236,056 observations at 58,071 stations between 2003 and 2016.


While globally the increase in the γ component of KGE is larger than the increase in the β or r component, increases in both r and β prevail in most of the regions with the highest gain in KGE. A possible explanation for this is that the inclusion of topography in the downscaling has the largest effect on the bias (Fig. 1).

The more evenly distributed differences in the γ component, which reflects the variability in precipitation, are most likely due to the inclusion of the MODIS cloud cover, which adds information on the spatio-temporal variance in precipitation to the downscaling. Although we only included monthly cloud frequency distributions in the downscaling, this shows the potential that high-resolution cloud cover frequencies have for improving high-resolution precipitation estimates globally.
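To make the roles of r, β and γ concrete, the sketch below computes KGE and its components for a single station series. It assumes the modified formulation of Kling et al. (2012), in which β is the ratio of means and γ the ratio of coefficients of variation; the original Gupta et al. (2009) form uses the ratio of standard deviations instead. The function name `kge_components` is hypothetical.

```python
import numpy as np

def kge_components(sim, obs):
    """Kling-Gupta Efficiency with correlation, bias and variability terms."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]          # linear correlation
    beta = sim.mean() / obs.mean()           # bias ratio
    # Variability ratio, assumed here to be the ratio of coefficients of variation
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())
    kge = 1.0 - np.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)
    return {"kge": float(kge), "r": float(r),
            "beta": float(beta), "gamma": float(gamma)}
```

A series that doubles every observed value keeps r = 1 and γ = 1 but has β = 2, so only the bias term lowers KGE. This is the kind of error that a topography-driven correction of the mean would address, whereas γ responds to changes in relative variability.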

Regional performance

The comparison of all five precipitation products for the continental United States shows a relatively high performance of all datasets (Fig. 2), ranging from a correlation of r ~ 0.85 (PRISM) to r ~ 0.5 (CHIRPS). CHELSA_EarthEnv performs slightly worse than MSWEP in estimating daily precipitation rates, but better than CHIRPS. PRISM performs best, with the highest correlations with GHCN-D. The performance increases for all products when monthly climatological means, instead of daily precipitation values, are used, with CHELSA, CHIRPS, and MSWEP performing almost identically. PRISM still slightly outperforms all other models. WorldClim performs poorly relative to all other products during the period 2003–2016, with low correlations (r ~ 0.5) and a much higher standard deviation than all other products.

All precipitation products use part of the GHCN-D stations to parametrize their algorithms. PRISM uses the daily station data directly: it takes the anomalies from long-term climatologies at the stations and interpolates them to obtain a gap-free anomaly surface for the CAI. The achieved performance might therefore be due to the high station density in PRISM itself. CHELSA_EarthEnv uses GPCC gridded station data at 0.25° for a bias correction, so the algorithm cannot force the interpolation through each station location directly, which might explain the difference between PRISM and CHELSA_EarthEnv. CHIRPS uses a smaller set of stations than PRISM, so the difference in performance might partly be due to the less dense station network. MSWEP uses a wide variety of input sources, from remotely sensed data to reanalysis data to station data. MSWEP therefore averages out most of the errors of a single source, which leads to a relatively high performance in the resulting precipitation estimates¹². Interestingly, WorldClim does not perform well compared with the other products, despite being parameterized on a large number of stations. This might be due to errors in the parametrization of the predictors used for the long-term climatologies, or uncertainties from the CAI applied to the CRU-TS data.

Downscaling performance in relation to comparable products

The bias relative to station observations is heterogeneous across the precipitation datasets. PRISM shows the lowest bias compared to GHCN-D data, while CHELSA_EarthEnv, MSWEP, and CHIRPS show similar biases (Fig. 3). WorldClim has the largest overall bias of all five products.

A similar pattern emerges when the different products are compared at 0.25° and at their highest resolution. Comparing the absolute bias of the coarse-resolution aggregations with that of the highest available resolutions shows that all precipitation datasets have a lower absolute bias at the highest spatial resolution (Fig. 3). The amount of bias reduction, however, varies to a large degree, with PRISM and CHELSA_EarthEnv showing the largest bias reduction, CHIRPS and MSWEP a slightly lower bias reduction, and WorldClim the lowest reduction. The relatively smaller reduction of CHIRPS and MSWEP might be due to their lower native spatial resolution compared to CHELSA_EarthEnv and PRISM (Table 2). However, the monthly WorldClim time series has the same native spatial resolution as PRISM and still shows a very small difference in absolute bias between the high and the coarse resolution, indicating poor downscaling performance.

Downscaling performance also varies geographically (Fig. 4). Generally, the bias reduction is higher in mountainous regions of the western United States, and lower in the more homogeneous terrain in the east. Comparing the stations at which the bias is reduced (Fig. 5) shows that PRISM, CHELSA_EarthEnv, MSWEP, and CHIRPS are able to reduce the absolute bias in mountainous terrain, but also in the convective regimes of the Midwest and Southwest of the United States. WorldClim only reduces the bias in the mountainous regions and does not reduce the precipitation bias in convective regimes.

Comparison with PRISM

PRISM consistently shows the highest performance metrics and is therefore a suitable benchmark for a spatially explicit comparison. Overall, all precipitation datasets show similar mesoscale patterns of precipitation (Fig. 5). Marked differences are mainly apparent in the southwestern United States, where all models are drier than PRISM. Differences are also apparent in the eastern Rocky Mountains, where CHIRPS, MSWEP, and WorldClim have a considerable dry bias, but CHELSA_EarthEnv shows precipitation rates more similar to those of PRISM. Overall, CHELSA_EarthEnv shows the lowest differences from and the highest correlation with PRISM (Fig. 6) (r = 0.97, mae = 0.20), followed by MSWEP (r = 0.97, mae = 0.23) and CHIRPS (r = 0.96, mae = 0.23). WorldClim shows the largest differences from PRISM and the lowest correlation among all products (r = 0.95, mae = 0.28).

Precipitation lapse rates

The general similarity between CHELSA-EarthEnv and PRISM (at 800 m resolution) in precipitation amount and in precipitation gradients (Fig. 7d,f) is notable, given that elevation-precipitation relationships in CHELSA-EarthEnv are produced by the orographic wind effect algorithm, rather than by elevational relationships to station observations as in PRISM. The WRF simulation is independent of station observations and provides further evidence that precipitation increases with elevation in this region (Fig. 7h). Weaker gradients in WRF are due to the coarser (4 km) grid scale, which imposes more subdued gradients of both terrain and precipitation. The strong negative gradients in WorldClim2 (Fig. 7j) are due to derivation of a precipitation-elevation relationship from stations spanning the windward (low elevation stations with high precipitation) and leeward (higher elevation stations with low precipitation) sides of the mountain range. These erroneous negative gradients produce a strong underestimation of regional precipitation (Fig. 7i) as they are used to extrapolate station precipitation into higher elevations (Fig. 7k) that have very low station density. This case study illustrates the utility of CHELSA-EarthEnv for mountainous regions with sparse station observations: the dynamical ERA5 reanalysis provides a physically plausible regional distribution of precipitation while the orographic wind effects algorithm provides credible local elevational gradients, even in the absence of station observations.

Improvement from the cloud layers

The global comparison between the predicted precipitation with and without cloud cover refinement yielded very small differences in all test metrics, indicating no significant differences at the global scale (with cloud refinement: r = 0.609, mae = 2.404; without refinement: r = 0.610, mae = 2.402). The cloud cover refinement, however, acts on a spatial scale that is not necessarily captured well by a global comparison. The local comparison for the island of Hawai’i (Fig. 8) indicates that the cloud cover refinement largely acts on the local scale, where it reduces the wet bias of the interpolation without cloud cover refinement. Without the refinement, the CHELSA algorithm distributes precipitation based on wind fields and boundary layer height alone. It does not distinguish areas that are usually above the clouds very well, leading to an overestimation of precipitation in these areas. Here the cloud cover refinement has an effect: it increases the correlation between predicted and observed precipitation and decreases the error in the estimates (Fig. 8).

Validation results—Conclusions

The comparison of the coarse grid resolution with the high resolution of CHELSA_EarthEnv shows that the applied downscaling is able to increase the accuracy of the precipitation predictions in several aspects and generates realistic precipitation patterns in complex terrain. The downscaling algorithm together with remotely sensed cloud cover performs as well as other high-resolution products in predicting precipitation. The CHELSA_EarthEnv algorithm produces high-resolution precipitation patterns similar to those of datasets that need to be informed by a high-quality, dense weather station network, without directly relying on stations itself. With respect to the realistic simulation of precipitation gradients in complex terrain, it also outperforms comparable high-resolution global products.

Usage Notes

Note that because of the pixel center referencing of the input GMTED2010 data the full extent of each grid as defined by the outside edges of the pixels differs from an integer value of latitude or longitude by 0.000138888888 degree (or 1/2 arc-second). Users of products based on the legacy GTOPO30 product should note that the coordinate referencing of each grid (and GMTED2010) and GTOPO30 are not the same. In GTOPO30, the integer lines of latitude and longitude fall directly on the edges of a 30-arc-second pixel. Thus, when overlaying grids with products based on GTOPO30 a slight shift of 1/2 arc-second will be observed between the edges of corresponding 30-arc-second pixels.
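The size of this offset can be checked with a few lines of arithmetic. The sketch below converts the half arc-second shift to degrees, expresses it as a fraction of a 30-arc-second pixel, and gives an approximate ground distance. The names are hypothetical, and the conversion to metres assumes a spherical Earth with a mean radius of 6,371,008.8 m, which is an approximation introduced here and not part of the product specification.

```python
import math

ARCSEC_PER_DEGREE = 3600.0
HALF_ARCSEC_DEG = 0.5 / ARCSEC_PER_DEGREE   # about 0.000138888888 degree
PIXEL_DEG = 30.0 / ARCSEC_PER_DEGREE        # 30 arc-second pixel size

# Fraction of one pixel by which GTOPO30-based grids appear shifted
SHIFT_FRACTION_OF_PIXEL = HALF_ARCSEC_DEG / PIXEL_DEG  # 1/60 of a pixel

EARTH_RADIUS_M = 6371008.8  # assumed mean radius, spherical approximation
METRES_PER_DEGREE = 2.0 * math.pi * EARTH_RADIUS_M / 360.0

def offset_in_metres(lat_deg):
    """Approximate (north-south, east-west) size of the shift at a latitude."""
    north_south = HALF_ARCSEC_DEG * METRES_PER_DEGREE
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west
```

Under these assumptions the shift is on the order of 15 m at the equator, small relative to a pixel of roughly 1 km, but worth accounting for when grids are aligned exactly.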

CHELSA_EarthEnv differs in several aspects from the already available climatological data (CHELSA V1-V2)³⁷ and the long-term downscaled CMIP5 modelled data (CHELSAcmip5ts)³⁹. The main difference is the increase in temporal resolution to daily, compared to the other two datasets. It is similar to CHELSA V1.x in that both are ‘observational’ datasets, while CHELSAcmip5ts is a downscaled ‘modelled’ dataset. The value of a climate variable for a specific day or month in CHELSA_EarthEnv or CHELSA V1.x can therefore be seen as an event that has actually been recorded, while one in the CHELSAcmip5ts dataset is only modelled and, like the values in the forcing CMIP5 models, does not represent a real observation.

Code availability

The code calculating the bias correction on the CHELSA V2.0 precipitation data is written in Python 2.7 and C++ (via the SAGA-GIS API). The code for the cloud cover refinement is available here: https://gitlabext.wsl.ch/karger/chelsa_earthenv. The code for the validation is available here: https://gitlabext.wsl.ch/karger/chelsa_earthenv_validation.

References

Kucera, P. A. et al. Precipitation from Space: Advancing Earth System Science. Bull. Am. Meteorol. Soc. 94, 365–375 (2012).
Tapiador, F. J. et al. Global precipitation measurement: Methods, datasets and applications. Atmospheric Res. 104–105, 70–97 (2012).
Kirschbaum, D. B. et al. NASA’s Remotely Sensed Precipitation: A Reservoir for Applications Users. Bull. Am. Meteorol. Soc. 98, 1169–1184 (2016).
Sun, Q. et al. A Review of Global Precipitation Data Sets: Data Sources, Estimation, and Intercomparisons. Rev. Geophys. 56, 79–107 (2018).
Beck, H. E. et al. MSWEP V2 Global 3-Hourly 0.1° Precipitation: Methodology and Quantitative Assessment. Bull. Am. Meteorol. Soc. 100, 473–500 (2019).
Beck, H. E. et al. Daily evaluation of 26 precipitation datasets using Stage-IV gauge-radar data for the CONUS. Hydrol. Earth Syst. Sci. 23, 207–224 (2019).
Schär, C. et al. Kilometer-scale climate models: Prospects and challenges. Bull. Am. Meteorol. Soc. 101 (2019).
Copernicus Climate Change Service (C3S). ERA5: Fifth generation of ECMWF atmospheric reanalyses of the global climate. Copernicus Climate Change Service Climate Data Store (CDS) (2017).
Hersbach, H. et al. Operational global reanalysis: progress, future directions and synergies with NWP. (2018).
Gelaro, R. et al. The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). J. Clim. 30, 5419–5454 (2017).
Reichle, R. H. et al. Land surface precipitation in MERRA-2. J. Clim. 30, 1643–1664 (2017).
Beck, H. E. et al. MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data. Hydrol Earth Syst Sci Discuss 2016, 1–38 (2016).
Skamarock, W. C. Evaluating mesoscale NWP models using kinetic energy spectra. Mon. Weather Rev. 132, 3019–3032 (2004).
Ménégoz, M., Gallée, H. & Jacobi, H. W. Precipitation and snow cover in the Himalaya: from reanalysis to regional climate simulations. Hydrol. Earth Syst. Sci. 17 (2013).
Liu, Z. et al. Evaluation of spatial and temporal performances of ERA-Interim precipitation and temperature in mainland China. J. Clim. 31, 4347–4365 (2018).
Skamarock, C. et al. A Description of the Advanced Research WRF Model Version 4. OpenSky https://doi.org/10.5065/1dfh-6p97 (2019).
Dipankar, A. et al. Large eddy simulation using the general circulation model ICON. J. Adv. Model. Earth Syst. 7, 963–986 (2015).
Heinze, R. et al. Large-eddy simulations over Germany using ICON: a comprehensive evaluation. Q. J. R. Meteorol. Soc. 143, 69–100 (2017).
Fuhrer, O. et al. Near-global climate simulation at 1km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0. Geosci. Model Dev. 11, 1665–1681 (2018).
Schulthess, T. C. et al. Reflecting on the goal and baseline for exascale computing: a roadmap based on weather and climate simulations. Comput. Sci. Eng. 21, 30–41 (2018).
Neumann, P. et al. Assessing the scales in numerical weather and climate predictions: will exascale be the rescue? Philos. Trans. R. Soc. Math. Phys. Eng. Sci. 377, 20180148 (2019).
Woodward, F. I., Fogg, G. E., Heber, U., Laws, R. M. & Franks, F. The impact of low temperatures in controlling the geographical distribution of plants. Philos. Trans. R. Soc. Lond. B Biol. Sci. 326, 585–593 (1990).
Guisan, A. & Zimmermann, N. E. Predictive habitat distribution models in ecology. Ecol. Model. 135, 147–186 (2000).
Guisan, A. & Thuiller, W. Predicting species distribution: offering more than simple habitat models. Ecol. Lett. 8, 993–1009 (2005).
Tabios, G. Q. & Salas, J. D. A Comparative Analysis of Techniques for Spatial Interpolation of Precipitation1. JAWRA J. Am. Water Resour. Assoc. 21, 365–380 (1985).
Daly, C., Taylor, G. H. & Gibson, W. P. The PRISM approach to mapping precipitation and temperature. Proc 10th AMS Conf Appl. Climatol. 20–23 (1997).
Thornton, P. E., Running, S. W. & White, M. A. Generating surfaces of daily meteorological variables over large regions of complex terrain. J. Hydrol. 190, 214–251 (1997).
Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G. & Jarvis, A. Very high resolution interpolated climate surfaces for global land areas. Int. J. Climatol. 25, 1965–1978 (2005).
Briggs, P. R. & Cogley, J. G. Topographic bias in mesoscale precipitation networks. J. Clim. 9, 205–218 (1996).
Schneider, U. et al. GPCC’s new land surface precipitation climatology based on quality-controlled in situ data and its role in quantifying the global water cycle. Theor. Appl. Climatol. 115, 15–40 (2013).
Kidd, C. et al. So, how much of the Earth’s surface is covered by rain gauges? Bull. Am. Meteorol. Soc. 98, 69–78 (2017).
Berndt, C. & Haberlandt, U. Spatial interpolation of climate variables in Northern Germany—Influence of temporal resolution and network density. J. Hydrol. Reg. Stud. 15, 184–202 (2018).
Groisman, P. Y. & Legates, D. R. The accuracy of United States precipitation data. Bull. Am. Meteorol. Soc. 75, 215–228 (1994).
Sevruk, B. Regional Dependency of Precipitation-Altitude Relationship in the Swiss Alps. in Climatic Change at High Elevation Sites (eds. Diaz, H. F., Beniston, M. & Bradley, R. S.) 123–137, https://doi.org/10.1007/978-94-015-8905-5_7 (Springer Netherlands, 1997).
Rasmussen, R. et al. How well are we measuring snow: The NOAA/FAA/NCAR winter precipitation test bed. Bull. Am. Meteorol. Soc. 93, 811–829 (2012).
Beck, H. E. et al. Bias Correction of Global High-Resolution Precipitation Climatologies Using Streamflow Observations from 9372 Catchments. J. Clim. 33, 1299–1315 (2020).
Karger, D. N. et al. Climatologies at high resolution for the earth’s land surface areas. Sci. Data 4, 170122 (2017).
Muñoz-Sabater, J. et al. ERA5-Land: an improved version of the ERA5 reanalysis land component. in Joint ISWG and LSA-SAF Workshop IPMA, Lisbon 26–28 (2018).
Karger, D. N., Schmatz, D. R., Dettling, G. & Zimmermann, N. E. High resolution monthly precipitation and temperature timeseries for the period 2006–2100. Sci. Data (2020).
Huffman, G. J. et al. The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-Global, Multiyear, Combined-Sensor Precipitation Estimates at Fine Scales. J. Hydrometeorol. 8, 38–55 (2007).
Biasutti, M., Yuter, S. E., Burleyson, C. D. & Sobel, A. H. Very high resolution rainfall patterns measured by TRMM precipitation radar: seasonal and diurnal cycles. Clim. Dyn. 39, 239–258 (2011).
Goddard Space Flight Center Distributed Active Archive Center (GSFC DAAC). TRMM/TMPA 3B43 TRMM and Other Sources Monthly Rainfall Product V7. (2011).
Funk, C. et al. The climate hazards infrared precipitation with stations—a new environmental record for monitoring extremes. Sci. Data 2, 150066 (2015).
Levizzani, V., Laviola, S. & Cattani, E. Detection and measurement of snowfall from space. Remote Sens. 3, 145–166 (2011).
Skofronick-Jackson, G. et al. Global precipitation measurement cold season precipitation experiment (GCPEX): for measurement’s sake, let it snow. Bull. Am. Meteorol. Soc. 96, 1719–1741 (2015).
Vila, D. A., D Goncalves, L. G. G., Toll, D. L. & Rozante, J. R. Statistical evaluation of combined daily gauge observations and rainfall satellite estimates over continental South America. J. Hydrometeorol. 10, 533–543 (2009).
Xie, P., Yoo, S.-H., Joyce, R. & Yarosh, Y. Bias-corrected CMORPH: A 13-year analysis of high-resolution global precipitation. In Geophysical Research Abstracts 13, EGU2011–1809 (2011).
Xie, P. & Xiong, A.-Y. A conceptual model for constructing high‐resolution gauge‐satellite merged precipitation analyses. J. Geophys. Res. Atmospheres 116 (2011).
Vernimmen, R. R. E., Hooijer, A., Mamenun, N. K., Aldrian, E. & Van Dijk, A. Evaluation and bias correction of satellite rainfall data for drought monitoring in Indonesia. (2012).
Cannon, A. J., Sobie, S. R. & Murdock, T. Q. Bias Correction of GCM Precipitation by Quantile Mapping: How Well Do Methods Preserve Changes in Quantiles and Extremes? J. Clim. 28, 6938–6959 (2015).
Richards, F. & Arkin, P. On the relationship between satellite-observed cloud cover and precipitation. Mon. Weather Rev. 109, 1081–1093 (1981).
Arkin, P. A. & Meisner, B. N. The relationship between large-scale convective rainfall and cold cloud over the western hemisphere during 1982-84. Mon. Weather Rev. 115, 51–74 (1987).
Betts, A. K., Tawfik, A. B. & Desjardins, R. L. Revisiting Hydrometeorology Using Cloud and Climate Observations. J. Hydrometeorol. 18, 939–955 (2017).
Wilson, A. M. & Jetz, W. Remotely Sensed High-Resolution Global Cloud Dynamics for Predicting Ecosystem and Biodiversity Distributions. PLOS Biol 14, e1002415 (2016).
Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049 (2020).
Hersbach, H. et al. Global reanalysis: goodbye ERA-Interim, hello ERA5. 17–24, https://doi.org/10.21957/vf291hehd7 (2019).
Cucchi, M. et al. WFDE5: bias-adjusted ERA5 reanalysis data for impact studies. Earth Syst. Sci. Data 12, 2097–2120 (2020).
Kållberg, P. Forecast drift in ERA-Interim. ERA Rep. Ser. 10, 9 (2011).
Ziese, M. et al. GPCC Full Data Daily Version.2018 at 1.0°: Daily Land-Surface Precipitation from Rain-Gauges built on GTS-based and Historic Data. DWD https://doi.org/10.5676/DWD_GPCC/FD_D_V2018_100.
Lee, S., Wolberg, G. & Shin, S. Y. Scattered data interpolation with multilevel B-splines. IEEE Trans. Vis. Comput. Graph. 3, 228–244 (1997).
Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. Numerical recipes. vol. 3 (Cambridge University Press Cambridge, 1989).
Basist, A., Bell, G. D. & Meentemeyer, V. Statistical Relationships between Topography and Precipitation Patterns. J. Clim. 7, 1305–1315 (1994).
Weisse, A. K. & Bois, P. Topographic Effects on Statistical Characteristics of Heavy Rainfall and Mapping in the French Alps. J. Appl. Meteorol. 40, 720–740 (2001).
Marquı́nez, J., Lastra, J. & Garcı́a, P. Estimation models for precipitation in mountainous regions: the use of GIS and multivariate analysis. J. Hydrol. 270, 1–11 (2003).
Smith, R. B. & Barstad, I. A Linear Theory of Orographic Precipitation. J. Atmospheric Sci. 61, 1377–1391 (2004).
Jiang, Q. Precipitation over multiscale terrain. Tellus Dyn. Meteorol. Oceanogr. 59, 321–335 (2007).
Böhner, J. Advancements and new approaches in climate spatial prediction and environmental modelling. Arbeitsberichte Geogr. Inst. HU Zu Berl. 109, 49–90 (2005).
Böhner, J. General climatic controls and topoclimatic variations in Central and High Asia. Boreas 35, 279–295 (2006).
Böhner, J. & Antonic, O. Land-Surface Parameters Specific to Topo-Climatology. in Geomorphometry: Concepts, Software, Applications (eds. Hengl, T. & Reuter, H. I.) 195–226 (Elsevier Science, 2009).
Gerlitz, L., Conrad, O. & Böhner, J. Large-scale atmospheric forcing and topographic modification of precipitation rates over High Asia – a neural-network-based approach. Earth Syst Dynam 6, 61–81 (2015).
Austin, G. L. & Dirks, K. N. Topographic Effects on Precipitation. in Encyclopedia of Hydrological Sciences https://doi.org/10.1002/0470848944.hsa033 (John Wiley & Sons, 2006).
Liu, M., Bárdossy, A. & Zehe, E. Interaction of valleys and circulation patterns (CPs) on small-scale spatial precipitation distribution in the complex terrain of southern Germany. Hydrol. Earth Syst. Sci. Discuss. 9 (2012).
Vogelezang, D. H. P. & Holtslag, A. A. M. Evaluation and model impacts of alternative boundary-layer height formulations. Bound.-Layer Meteorol. 81, 245–269 (1996).
von Engeln, A. & Teixeira, J. A Planetary Boundary Layer Height Climatology Derived from ECMWF Reanalysis Data. J. Clim. 26, 6575–6590 (2013).
Frei, C. & Schär, C. A precipitation climatology of the Alps from high-resolution rain-gauge observations. Int. J. Climatol. 18, 873–900 (1998).
Roger, J. C. & Vermote, E. F. A method to retrieve the reflectivity signature at 3.75 μm from AVHRR data. Remote Sens. Environ. 64, 103–114 (1998).
Petitcolin, F. & Vermote, E. Land surface reflectance, emissivity and temperature from MODIS middle and thermal infrared data. Remote Sens. Environ. 83, 112–134 (2002).
Kalnay, E. et al. The NCEP/NCAR 40-Year Reanalysis Project. Bull. Am. Meteorol. Soc. 77, 437–471 (1996).
Karger, D. N., Wilson, A. M., Mahony, C., Zimmermann, N. E. & Jetz, W. Global daily 1km land surface precipitation based on cloud cover-informed downscaling. EarthEnv, https://doi.org/10079/MOL/6f52b80d-0a41-40f7-84ec-873458ca6ee6 (2021).
Menne, M. J. et al. Global Historical Climatology Network - Daily (GHCN-Daily), Version 3. NOAA National Climatic Data Center. 10.7289/V5D21VHZ [access 3.11.2018]. (2018).
Gupta, H. V., Kling, H., Yilmaz, K. K. & Martinez, G. F. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol. 377, 80–91 (2009).
Kling, H., Fuchs, M. & Paulin, M. Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. J. Hydrol. 424–425, 264–277 (2012).
Knoben, W. J. M., Freer, J. E. & Woods, R. A. Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores. Hydrol. Earth Syst. Sci. 23, 4323–4331 (2019).
Fick, S. E. & Hijmans, R. J. WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37, 4302–4315 (2017).
Willmott, C. J. & Robeson, S. M. Climatologically aided interpolation (CAI) of terrestrial air temperature. Int. J. Climatol. 15, 221–229 (1995).
Article Google Scholar
Harris, I., Osborn, T. J., Jones, P. & Lister, D. Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Sci. Data 7, 1–18 (2020).
Liu, C. et al. Continental-scale convection-permitting modeling of the current and future climate of North America. Clim. Dyn. 49, 71–95 (2017).
Sorooshian, S., Duan, Q. & Gupta, V. K. Calibration of rainfall-runoff models: Application of global optimization to the Sacramento Soil Moisture Accounting Model. Water Resour. Res. 29, 1185–1194 (1993).


Acknowledgements

D.N.K. & N.E.Z. acknowledge funding from: The WSL internal grant exCHELSA, the 2019–2020 BiodivERsA joint call for research proposals, under the BiodivClim ERA-Net COFUND program, with the funding organisations Swiss National Science Foundation SNF (project: FeedBaCks, 193907), Agence nationale de la recherche (ANR-20-EBI5-0001-05), the Swedish Research Council for Sustainable Development (Formas 2020–02360), the German Research Foundation (DFG BR 1698/21–1, DFG HI 1538/16–1), and the Technology Agency of the Czech Republic (SS70010002), as well as the Swiss Data Science Projects: SPEEDMIND, and COMECO. D.N.K. acknowledges funding to the ERA-Net BiodivERsA - Belmont Forum, with the national funder Swiss National Foundation (20BD21_184131), part of the 2018 Joint call BiodivERsA-Belmont Forum call (project ‘FutureWeb’), the WSL internal grant ClimEx. We thank EarthEnv project collaborators Rob Guralnick and Brian McGill for discussions preceding and intellectually benefitting the research presented here. W.J. acknowledges funding from NASA grants 80NSSC17K0282, 80NSSC20K0202, and 80NSSC18K0435.

Author information

Authors and Affiliations

Swiss Federal Research Institute for Forest, Snow, and Landscape Research (WSL), Zürcherstrasse 111, 8903, Birmensdorf, Switzerland
Dirk Nikolaus Karger & Niklaus E. Zimmermann
Department of Ecology and Evolutionary Biology, Yale University, 165 Prospect Street, New Haven, CT, 06520-8106, USA
Dirk Nikolaus Karger & Walter Jetz
Center for Biodiversity and Global Change, Yale University, 165 Prospect Street, New Haven, CT, 06520-8106, USA
Dirk Nikolaus Karger, Colin Mahony & Walter Jetz
Department of Geography, University at Buffalo, 120 Wilkeson Quad, Buffalo, NY, 14261, USA
Adam M. Wilson

Authors

Dirk Nikolaus Karger
Adam M. Wilson
Colin Mahony
Niklaus E. Zimmermann
Walter Jetz

Contributions

D.N.K., A.W. and W.J. developed the idea. A.W. produced the monthly MODIS cloud frequency layers, D.N.K. and N.E.Z. developed and implemented the precipitation downscaling and bias correction algorithm, C.M. and D.N.K. conducted the validation, D.N.K. wrote the first version of the manuscript and all authors contributed significantly to the revision.

Corresponding authors

Correspondence to Dirk Nikolaus Karger or Walter Jetz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figure 1

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.


About this article

Cite this article

Karger, D.N., Wilson, A.M., Mahony, C. et al. Global daily 1 km land surface precipitation based on cloud cover-informed downscaling. Sci Data 8, 307 (2021). https://doi.org/10.1038/s41597-021-01084-6


Received: 25 February 2021
Accepted: 21 October 2021
Published: 26 November 2021
DOI: https://doi.org/10.1038/s41597-021-01084-6

This article is cited by

Rapid groundwater decline and some cases of recovery in aquifers globally
- Scott Jasechko
- Hansjörg Seybold
- James W. Kirchner
Nature (2024)

Assessing the applicability of binary land-cover variables to species distribution models across multiple grains
- Lukáš Gábor
- Jeremy Cohen
- Walter Jetz
Landscape Ecology (2024)

Understanding fatal landslides at global scales: a summary of topographic, climatic, and anthropogenic perspectives
- Seçkin Fidan
- Hakan Tanyaş
- Tolga Görüm
Natural Hazards (2024)

Animal-borne sensors as a biologically informed lens on a changing climate
- Diego Ellis-Soto
- Martin Wikelski
- Walter Jetz
Nature Climate Change (2023)

Anthropogenic influence on extreme temperature and precipitation in Central Asia
- Bijan Fallah
- Emmanuele Russo
- Fred F. Hattermann
Scientific Reports (2023)
