Data Descriptor | Open

# A new, long-term daily satellite-based rainfall dataset for operational monitoring in Africa

• Scientific Data 4, Article number: 170063 (2017)
• doi:10.1038/sdata.2017.63

## Abstract

Rainfall information is essential for many applications in developing countries, and yet, continually updated information at fine temporal and spatial scales is lacking. In Africa, rainfall monitoring is particularly important given the close relationship between climate and livelihoods. To address this information gap, this paper describes two versions (v2.0 and v3.0) of the TAMSAT daily rainfall dataset based on high-resolution thermal-infrared observations, available from 1983 to the present. The datasets are based on the disaggregation of 10-day (v2.0) and 5-day (v3.0) total TAMSAT rainfall estimates to a daily time-step using daily cold cloud duration. This approach provides temporally consistent historic and near-real time daily rainfall information for all of Africa. The estimates have been evaluated using ground-based observations from five countries with contrasting rainfall climates (Mozambique, Niger, Nigeria, Uganda, and Zambia) and compared to other satellite-based rainfall estimates. The results indicate that both versions of the TAMSAT daily estimates reliably detect rainy days but have less skill in capturing rainfall amount, results that are comparable to the other datasets.

Design Type(s) observation design • data integration objective hydrological precipitation process meteorological observation Mozambique • Niger • Nigeria • Uganda • Zambia

## Background and Summary

High spatial and temporal rainfall variability is a major challenge when it comes to managing agricultural activities across Africa, as above or below average rainfall can lead to crop losses and failure1. A notable recent example was the occurrence of widespread drought conditions across the Horn of Africa during 2010–2011 which affected over 10 million people2,3. To help mitigate these climate-related risks, access to reliable rainfall information, both historic and near-real time, is a necessity. Historic data allows climate risks (e.g., the probability of drought) and long-term changes in the rainfall climate to be assessed, while near-real time data is important to evaluate the present day weather in a historical context. The latter is especially important in monitoring the evolution of hydrological hazards, allowing timely responses from governments and organizations before major crises occur. Although temporally coarse data (for example, dekadal or monthly) can be useful for evaluating climatic trends and monitoring above or below average rainfall4, information at fine time scales (e.g., daily) is valuable in a range of other applications such as crop modelling, water management and weather index-based insurance4,5.

Conventionally, rain gauge records provide the most accurate means to obtain information about the rainfall climate. However, the spatially sparse network and often temporally incomplete records at many stations across Africa leave large parts of the continent unobserved6. This problem is exacerbated by the high spatial variability associated with convective rainfall at the daily time-step, which makes a rain gauge measurement representative only of rainfall over several square kilometres surrounding the gauge7. Except in the vicinity of a continually reporting weather station, gauge observations alone are impractical for the routine assessment of rainfall. Africa-wide, near-real time gauge records are only available via the Global Telecommunications System (GTS) network, usually through automatic weather stations. Although over 700 stations are registered on the GTS network, only a small proportion of these report daily6,8. Moreover, access to country-level records, which often contain more data than is publicly available, is often only possible via direct contact with African meteorological and hydrological agencies.

The limitations associated with gauge measurements have elevated the importance of satellite-based rainfall estimates in many applications across Africa, especially in agriculture and drought monitoring8. Satellite-based algorithms have the advantage of providing full spatial coverage and have been demonstrated to be skilful in many locations over Africa9,10,11,12,13,14,15,16,17. While there is an ever growing collection of satellite-based datasets capable of providing near-real time estimates (a selection of which are listed in Table 1 in Maidment et al.18), only a handful of publicly available high resolution satellite-based datasets that provide historic data (at least 30 years) at the daily time-step, and that are continually updated in real time or near-real time, exist for Africa. These are the National Oceanic and Atmospheric Administration (NOAA) African Rainfall Climatology version 2.0 (ARC19) and the Climate Hazards Group InfraRed Precipitation with Station data version 2.0 (CHIRPS20), both of which are described in the Technical Validation section. Given the dearth of Africa-wide long-term (30 years or more) daily rainfall information and large uncertainties in existing observational records over Africa21,22,23, the addition of daily satellite-based rainfall datasets with contrasting estimation approaches is extremely valuable for rainfall monitoring and climate research. Moreover, Africa’s population is expanding rapidly and it is expected that this trend will continue throughout this century24. The pressures such growth is putting on agricultural and water resources, combined with changes in the rainfall climate21, are encouraging the use of climate-based services such as Enhancing National Climate Services (ENACTS)25 and Rainwatch26 in many African countries. These services provide easily accessible historic and near-real time information on the local climate that is useful to a wide range of stakeholders. Such platforms, however, require skilful, long-term and regularly updated rainfall information.

Here, we describe and evaluate two versions (2.0 and 3.0) of the long-term daily TAMSAT (Tropical Applications of Meteorology using SATellite and ground based observations) rainfall dataset (Data Citation 1: University of Reading http://dx.doi.org/10.17864/1947.108 and Data Citation 2: University of Reading https://doi.pangaea.de/10.1594/PANGAEA.871465; hereinafter TAMSAT-2 and TAMSAT-3 respectively), based on high resolution Meteosat thermal-infrared (TIR) observations for all of Africa, available from 1983 to the present and updated in near-real time. TAMSAT-2 and TAMSAT-3 are based on the disaggregation of the TAMSAT version 2.0 dekadal18 and TAMSAT version 3.0 pentadal rainfall estimates respectively, to a daily time-step using daily calibrated cold cloud duration (CCD) observations (see Methods section for algorithm details).

In January 2017, the TAMSAT Group released TAMSAT version 3.0, which is produced operationally alongside version 2.0 (ref. 18). Given that the daily rainfall estimates derived from TAMSAT v2.0 have been in the public domain for several years and are used by many users, this paper formally evaluates both TAMSAT-2 and TAMSAT-3. The rainfall estimates have been validated using daily rain gauge measurements from five African countries (Mozambique, Niger, Nigeria, Uganda, and Zambia) and compared with estimates from six other satellite-based rainfall datasets, some of which are used widely in rainfall monitoring applications across Africa.

## Methods

### TAMSAT algorithm

The daily estimates (both TAMSAT-2 and TAMSAT-3) are derived from the TAMSAT rainfall estimation algorithm. The TAMSAT Group have, since the 1980s, produced estimates at the 10-day (dekad) scale. The algorithm, described in Milford et al.27, Dugdale et al.28, Grimes et al.10 and Maidment et al.18, works on the premise that the use of TIR imagery to monitor the cold cloud tops of rain-bearing convective cumulonimbus systems acts as a useful indicator for rainfall in the Tropics. Despite the simplicity of the TAMSAT operational approach, the dekadal estimates have been shown to perform well where rainfall is predominantly convective in origin9,12,14,29,​30,​31,​32. The TAMSAT-2 estimates, described in this paper, have also been evaluated over the complex terrain of Ethiopia and demonstrated good skill33. Such skill, both at the daily and dekadal time-step, underlines the effectiveness of using TIR imagery in rainfall estimation where and when rainfall is convective in origin. The TAMSAT approach to rainfall estimation, however, does have limitations. Where rainfall from warm rain processes is dominant, such as along the coastal parts of West Africa34 and over mountainous regions31, the ability to identify rainy cloud is reduced. In addition, since the TAMSAT estimation approach is geared towards drought monitoring where accurately representing low rainfall totals is important, the algorithm in TAMSAT v2.0 (and all previous versions) was calibrated to better capture the more frequent, low rainfall amounts18. In doing so, the total rainfall is underestimated, resulting in an inherent dry bias that is more pronounced when the data (both daily and dekadal estimates) are aggregated (in space and/or time).

The aforementioned dry bias in TAMSAT v2.0 dekadal data, along with unrealistic spatial artefacts that originated from the use of rectangular calibration zones, prompted the TAMSAT Group to modify the calibration design, while ensuring the data is still applicable to drought monitoring. Although the principal features of the TAMSAT rainfall estimation approach have remained the same, the calibration used in version 3.0 differs markedly from that of version 2.0 and is designed to better capture local variations in the rainfall climate while reducing problems associated with version 2.0. Additionally, the time-step for the primary rainfall estimate is now 5-day (pentad), compared to 10-day in version 2.0. Here, we provide an outline of the common features behind the methodology used to create both TAMSAT-2 and TAMSAT-3. Comprehensive details on the version 2.0 pan-African calibration can be found in Maidment et al.18 and Tarnavsky et al.8, while details on version 3.0 can be found on the TAMSAT website (http://www.tamsat.org.uk).

The TAMSAT algorithm is based on two primary data inputs: Meteosat TIR imagery provided by The European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) and rain gauge observations for calibration (see Fig. 1 for the estimation process). The rainfall estimation approach is based on TIR imagery obtained every 15 min from July 2006 and every 30 min prior to this. The TAMSAT algorithm is an example of a cloud-indexing method: the duration of cloud tops exceeding a predetermined temperature threshold, known as cold cloud duration (CCD), acts as a proxy for rainfall.
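As a rough illustration of the cloud-indexing idea, the sketch below counts how long a single pixel stays colder than a threshold and converts that count to hours. It is not the operational TAMSAT code: the function name, the per-pixel list input and the example values are hypothetical, and "exceeding" the threshold is interpreted here as the cloud top being colder than the threshold.

```python
def cold_cloud_duration(temps_kelvin, threshold_kelvin, interval_minutes):
    """Hours for which one pixel's cloud-top temperature is below the threshold.

    temps_kelvin     : brightness temperatures for one pixel, one per TIR image
    threshold_kelvin : rain/no-rain temperature threshold (hypothetical value below)
    interval_minutes : image spacing (15 min from July 2006, 30 min before, per the text)
    """
    # Each image colder than the threshold contributes one image interval of CCD.
    cold_slots = sum(1 for t in temps_kelvin if t < threshold_kelvin)
    return cold_slots * interval_minutes / 60.0


# Hypothetical pixel: 96 images at 15-min spacing, 8 of them colder than -40 degC (233.15 K).
temps = [220.0] * 8 + [280.0] * 88
ccd_hours = cold_cloud_duration(temps, 233.15, 15)
```

For the hypothetical pixel above, eight cold images at 15-minute spacing give a CCD of two hours.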

The calibration process is divided into two stages. The first stage distinguishes rainy regions from non-rainy regions, while the second stage attempts to assign a rainfall amount for the rainy regions. In the first stage, daily CCD totals are derived at a range of thresholds between −30 °C and −60 °C. These are then summed to the dekadal (in v2.0) or pentadal (in v3.0) time-step and a set of contingency tables is prepared for every threshold, comparing greater than zero CCD at the pixel scale with rainfall occurrence from the collocated rain gauge records. The temperature threshold with the greatest skill for determining rainfall events (greater than 0 mm) is selected based primarily on the rainfall event frequency bias (see Maidment et al.18 for details). In version 2.0, these were determined for large climatologically-similar rectangular zones, whereas in version 3.0, these are derived over 1.0° grid boxes (hence capturing local detail more accurately) where sufficient gauges exist and then interpolated Africa-wide. In the second stage, calibration parameters are obtained by linearly regressing CCD totals for the selected temperature threshold with historical rain gauge accumulations. In version 3.0, a spatially and temporally varying bias adjustment is then made to the calibration parameters. Using the calibration coefficients, rainfall is estimated as a function of CCD, according to equation (1): $\mathrm{rain}_{\mathrm{timestep}} = \begin{cases} a_0 + a_1\,\mathrm{CCD}_{\mathrm{timestep}} & \mathrm{CCD} > 0 \\ 0 & \mathrm{CCD} = 0 \end{cases} \quad (1)$ where timestep is either pentad or dekad, depending on the TAMSAT version, and $a_0$ and $a_1$ are the linear calibration coefficients. If CCD is equal to zero, rainfall is also assumed to be zero. The TAMSAT method implements a local calibration, hence the linear calibration coefficients vary spatially and monthly to reflect the geographical and temporal variations in the average rainfall climate across Africa8.
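Equation (1) can be written as a short function. This is a minimal sketch for a single pixel and time-step; the function name and the coefficient values in the example are hypothetical, and in practice the coefficients would be looked up for the pixel's location and calendar month.

```python
def rainfall_from_ccd(ccd_hours, a0, a1):
    """Pentadal or dekadal rainfall (mm) from CCD (hours), following equation (1).

    a0, a1 are the linear calibration coefficients for the pixel and month;
    the values used in the example below are made up for illustration.
    """
    if ccd_hours <= 0:
        # No cold cloud observed, so rainfall is assumed to be zero.
        return 0.0
    return a0 + a1 * ccd_hours


dry_estimate = rainfall_from_ccd(0.0, a0=2.0, a1=1.5)
wet_estimate = rainfall_from_ccd(10.0, a0=2.0, a1=1.5)
```

Note that the intercept is applied only when CCD is greater than zero, so a pixel with no cold cloud never receives the offset $a_0$.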

The TAMSAT-2 data are derived from TAMSAT dekadal v2.0 estimates that constitute the TAMSAT African Rainfall Climatology And Time-series (TARCAT) dataset18 which is still routinely updated to the present day. Since the calibrations used in these datasets do not change from year-to-year, the interannual variations in rainfall are dependent only on the satellite observations. The TAMSAT method thus contrasts with other long-term datasets such as CHIRPS and ARC, which merge gauge data in near-real time19,20. The inclusion of contemporaneous gauge data arguably makes maximal use of all available data sources, increasing skill where high quality gauge data are available. The African gauge network is, however, not consistent in either time or space, and the inclusion of gauge data may thus introduce artefacts, especially when assessing long term change21. The TAMSAT datasets hence can be seen as a complement to the other available products.

### Downscaling to the daily time scale

The currently available TAMSAT dekadal (v2.0) and pentadal (v3.0) rainfall estimates are disaggregated to daily values in proportion to the amount of CCD observed for each day (each daily CCD map is created by considering all TIR images from 06:00 to 06:00 the following day, to coincide with the time at which gauge observations are usually taken). This has the advantage that the estimates are constrained to match the dekadal or pentadal rainfall totals, which are expected to be reliable. The daily rainfall estimates are thus calculated according to equation (2): $\mathrm{rain}_{\mathrm{daily}} = \dfrac{\mathrm{rain}_{\mathrm{timestep}}}{\mathrm{CCD}_{\mathrm{timestep}}} \times \mathrm{CCD}_{\mathrm{daily}} \quad (2)$ where $\mathrm{rain}_{\mathrm{daily}}$ is the daily rainfall estimate, $\mathrm{rain}_{\mathrm{timestep}}$ is the dekadal (v2.0) or pentadal (v3.0) rainfall estimate, $\mathrm{CCD}_{\mathrm{timestep}}$ is the CCD summed over the ten or five days and $\mathrm{CCD}_{\mathrm{daily}}$ is the daily CCD. The complete process used to create the TAMSAT daily rainfall estimates is illustrated in Fig. 1.
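The disaggregation in equation (2) can be sketched for one pixel as follows. The function name and the example numbers are hypothetical; returning zeros when the period CCD is zero is an assumption that follows from equation (1), where zero CCD implies zero rainfall.

```python
def disaggregate_to_daily(rain_timestep, daily_ccd):
    """Split a pentadal or dekadal rainfall total across days in proportion to daily CCD.

    rain_timestep : rainfall total for the period (mm)
    daily_ccd     : list of daily CCD values (hours), one per day in the period
    """
    ccd_timestep = sum(daily_ccd)
    if ccd_timestep == 0:
        # No cold cloud in the whole period: every day is dry (assumed, consistent with eq. 1).
        return [0.0 for _ in daily_ccd]
    return [rain_timestep * ccd / ccd_timestep for ccd in daily_ccd]


# Hypothetical pentad: 20 mm in total, with cold cloud on days 2, 3 and 5.
daily_rain = disaggregate_to_daily(20.0, [0, 2, 6, 0, 2])
```

By construction the daily values sum back to the pentadal or dekadal total, which is the constraint described above.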

## Data Records

### Data archive

A time-series of daily totals has been generated from 1983 to the present for all of Africa. A day is considered missing if there is a gap of more than six continuous hours in the TIR imagery. For version 2.0, a dekad is considered missing if there are more than two missing days (see Maidment et al.18 for details), whereas for version 3.0, the pentad is considered missing if more than one day is missing. Despite many incomplete or missing TIR images during the 1980s and early 1990s, both version 2.0 and 3.0 have near-complete archives. For version 2.0 for example, based on available data from the EUMETSAT archive, the dataset is approximately 97% complete (as of December 31st 2016). Specifically, of the 12,419 days between January 1st 1983 and December 31st 2016, there are 398 missing days. Of these, in 271 cases, the whole dekad was missing, resulting in no data to disaggregate, and in 127 cases, individual days within the dekad were missing. Of the missing days, 271 were between 1983 and 1989, 114 were between 1990 and 1999, and 13 were after 2000. There have been no missing days since 2007. As expected, the proportion of missing days is similar for TAMSAT-3. The daily estimates are available from January 11th 1983 to the present and are available within two days after the end of each dekad (i.e., 11th, 21st, and 1st of the following month) for version 2.0 and each pentad (i.e., 6th, 11th, 16th, 21st, 26th and 1st of the following month) for version 3.0.
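The missing-data rules and the completeness figure quoted above can be checked with a few lines. This is only a sketch: the function names are hypothetical, image times are given as hours on the 06:00 to 06:00 window (06:00 = 6.0, 06:00 next day = 30.0), and treating the window edges as gap boundaries is an assumption not stated in the text.

```python
from datetime import date


def day_is_missing(image_hours, window_start=6.0, window_end=30.0, max_gap=6.0):
    """True if the TIR imagery has a gap of more than six continuous hours."""
    edges = [window_start] + sorted(image_hours) + [window_end]
    return any(later - earlier > max_gap for earlier, later in zip(edges, edges[1:]))


def period_is_missing(day_flags, version):
    """Dekad (v2.0) is missing with more than two missing days, pentad (v3.0) with more than one."""
    max_missing_days = {"v2.0": 2, "v3.0": 1}[version]
    return sum(day_flags) > max_missing_days


# Completeness of the version 2.0 archive, using the counts quoted in the text.
total_days = (date(2016, 12, 31) - date(1983, 1, 1)).days + 1
missing_days = 271 + 127
completeness = 1 - missing_days / total_days
```

With these counts the archive is about 96.8% complete, consistent with the approximate figure of 97% given above.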

### Data access and format

The daily rainfall estimates (in mm per day) are freely available as netCDF files for each day from the TAMSAT website (http://www.tamsat.org.uk) and the University of Reading Research Data Archive (version 2.0, Data Citation 1: University of Reading http://dx.doi.org/10.17864/1947.108; version 3.0: Data Citation 2: University of Reading https://doi.pangaea.de/10.1594/PANGAEA.871465). TAMSAT-2 is also available on the International Research Institute for Climate and Society (IRI) Data Library (https://iridl.ldeo.columbia.edu/SOURCES/.Reading/.Meteorology/.TAMSAT/.TARCAT/.v2p0/.daily/), with TAMSAT-3 expected to be available during 2017. The spatial resolution is 0.0375° latitude by 0.0375° longitude with estimates provided for all land points in Africa, including Madagascar. In addition, the TAMSAT website contains quicklook images for each day and a time series extraction tool can be used to extract area-average data for countries, administrative districts and user defined rectangular regions or user defined pixels in csv format. The IRI Data Library includes additional subsetting and data analysis tools.

## Technical Validation

### Study regions and validation data

The daily satellite rainfall estimates have been evaluated using rain gauge records covering four countries (Mozambique, Nigeria, Uganda and Zambia) and one region over south-west Niger consisting of a dense network (see Fig. 2 and Table 1). These regions of Africa are characterised by contrasting rainfall climates and thus, the validation provides a useful indicator of the expected skill of the TAMSAT daily estimates (and the other satellite estimates used in this study) across Africa. The section below summarises the general climate of each region considered.

The rainfall over Niger is typical of that experienced over most of the Sahel, characterized by a single rainy season occurring during boreal summer. The main features of the rainy season are the West African Monsoon, which advects moisture-laden air onto the continent and African Easterly Waves that are associated with the passage of westward propagating mesoscale convective systems that are responsible for the majority of rainfall over this part of Africa35,36. TIR-based estimation algorithms, including TAMSAT, have demonstrated high skill over the Sahel9,12,32. Much of Nigeria’s rainfall climate is similar in nature, although rainfall in the coastal regions and areas surrounding the Cameroon Highlands to the east are often modulated by oceanic and orographic effects respectively, complicating the relationship between cloud top temperature and rainfall31.

Most of Uganda experiences two rainy seasons associated with the seasonal northward and southward migration of the Inter-Tropical Convergence Zone14. Whilst rainfall is convective in origin, the presence of mountain chains to the east and southwest of the country and large bodies of water, such as Lake Victoria and Lake Albert, influence the local climate considerably. While this presents a challenge for TIR-based algorithms due to the increased occurrence of rainfall from warm clouds, particularly where local changes to the rainfall processes are pronounced, 10-day total satellite-derived estimates have been shown to be skilful over this region14.

Zambia has one rainy season occurring between October and April. As the country is relatively flat and landlocked and rainfall is primarily a result of convective systems, cold cloud tops of these convective systems and rainfall are usually well correlated, as found across Niger.

Finally, the climate of Mozambique contrasts with the other regions considered in this study. The close proximity to the Indian Ocean and the passage of tropical depressions and cyclones create a varied and complex climate. Such variable weather regimes present a challenge for TIR-based algorithms, especially when other data (e.g., gauge data) are not incorporated contemporaneously17.

The daily rain gauge records from Nigeria, Uganda and Zambia were obtained directly from their respective meteorological agencies. Each of these datasets was subjected to rigorous quality control measures. These procedures involved checking for erroneous entries, duplicates, and outliers. If outliers were flagged, temporal and spatial checks were then conducted. The high density Niger dataset was created during the Hydrology-Atmosphere Pilot Experiment in the Sahel (Hapex-Sahel) in the early 1990s (refs 37,38,39,40) and has been used in many subsequent studies10. Finally, the Mozambique data were sourced from The Mozambique National Institute of Meteorology and quality controlled for The World Bank41. Only those records during each region’s rainy season were used (for Uganda, records covering the ‘long rains’ were used). Whilst not all stations used have complete records, each regional dataset had at least 15,000 gauge records available for validation (see Table 1).

The variability in the TAMSAT daily rainfall estimates is derived entirely from the satellite imagery, with the calibration carried out on 10-daily (v2.0) or 5-daily (v3.0) accumulated rainfall/CCD over regions encompassing hundreds of gauge-CCD pairs8,18. The evaluation of the TAMSAT rainfall estimates described here can thus arguably be considered to be against independent data, even though some of the gauge records may have been included in the historical calibration. This is not the case for some of the comparison satellite datasets used in this study, which incorporate contemporaneous gauge records. The Niger gauge dataset, however, is not included either in the TAMSAT version 2.0 dekadal calibration or, to our knowledge, in the comparison satellite datasets.

To ensure a consistent comparison between the satellite estimates and ground-based data, all rain gauge records were interpolated onto a regular 0.25° by 0.25° grid using block kriging. Kriging was chosen as it has been shown to be superior to other forms of spatial interpolation42,43,44. Since the uncertainty in the interpolated rainfall amount increases significantly away from a rain gauge, only those grid squares containing at least one gauge were used. For simplicity, all 0.25° grid squares containing only dry gauges were assigned zero rainfall. Where a grid square contained both dry and wet gauges, the kriged rainfall amount was used.
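The grid-square selection rule described above can be sketched as follows. Block kriging itself is not implemented here: the kriged amounts are passed in as a precomputed dictionary, and the function names, the cell-indexing scheme and the example gauges are all hypothetical.

```python
import math
from collections import defaultdict


def cell_index(lat, lon, size=0.25):
    """Integer (row, column) index of the grid square containing a gauge."""
    return (math.floor(lat / size), math.floor(lon / size))


def gridded_gauge_values(gauges, kriged, size=0.25):
    """Validation value for every grid square that contains at least one gauge.

    gauges : list of (lat, lon, rain_mm) tuples for one day
    kriged : dict mapping cell index -> block-kriged rainfall (mm), computed elsewhere
    """
    cells = defaultdict(list)
    for lat, lon, rain in gauges:
        cells[cell_index(lat, lon, size)].append(rain)

    values = {}
    for cell, rains in cells.items():
        if all(r == 0 for r in rains):
            values[cell] = 0.0           # only dry gauges: set to zero rainfall
        else:
            values[cell] = kriged[cell]  # at least one wet gauge: use the kriged amount
    return values


# Hypothetical day: two dry gauges share one square, a dry and a wet gauge share another.
gauges = [(13.30, 2.10, 0.0), (13.40, 2.20, 0.0), (13.60, 2.60, 0.0), (13.70, 2.70, 5.0)]
kriged = {(53, 8): 0.4, (54, 10): 2.1}
grid_values = gridded_gauge_values(gauges, kriged)
```

Grid squares without any gauge never appear in the result, which mirrors the restriction to squares containing at least one gauge.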

It should be noted that given the high density of the Niger gauge network, the interpolated area-average values will, in general, be much more accurate than the equivalent interpolated grid values over the four other regions, whose gauge networks are considerably less dense. Moreover, since the satellite estimates and gauge data available for each region do not cover the same time periods, it is not possible to directly compare the results from one region to another. This is particularly the case for Niger, whose gauge data is only available for one year. However, the results presented provide a useful indicator of the expected skill of the TAMSAT daily rainfall estimates, in comparison to the other satellite datasets.

The TAMSAT-2 and TAMSAT-3 rainfall estimates were evaluated alongside six other satellite precipitation datasets providing daily estimates. These datasets are CHIRPS, CHIRP (CHIRPS without stations), ARC, NOAA’s African Rainfall Estimates version 2 (hereinafter RFE), the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA)-3B42 and NOAA’s Climate Prediction Center (CPC) morphing technique (CMORPH) (see Table 2). The latter three datasets include passive microwave (PMW) imagery, and hence are expected to be capable of providing more realistic information on rainfall intensity. A brief description of these datasets is as follows:

CHIRPS provides 30+ years of high resolution (0.05° lat-lon grid) quasi-global (50°S-50°N and 180°W–180°E) rainfall estimates at daily, pentadal, and monthly time-steps. CHIRPS depends on several data sources to produce estimates of rainfall. First, TIR imagery are used to produce maps of pentadal CCD. Unlike TAMSAT, which implements a temporally and spatially varying threshold to compute the CCD, a constant rain/no-rain threshold of 235 K is used. Calibration regression coefficients are then derived by comparing TMPA-3B42 rainfall estimates (2000–2013) and CCD. These calibration parameters are then applied to the complete CCD record to produce a time-series of rainfall estimates. Next, these pentadal rainfall estimates are expressed as a fraction of their long-term mean (1981–2013) and then multiplied by the Climate Hazards group Precipitation climatology (CHPclim). This step produces what is known as CHIRP, i.e., the satellite-based estimates with no merging of rain gauge records, and is also evaluated in this study. CHPclim is an attempt to create accurate pentadal and monthly climatologies based on rain gauge records and multiple satellite-based products45. Finally, station rain gauge records are merged with CHIRP using a modified form of the inverse distance weighting algorithm to create the CHIRPS product. A preliminary version, CHIRPS-prelim, is created with a 2-day latency based on GTS data, while the final version (evaluated in this paper) makes use of public monthly gauge summaries and additional data from meteorological agencies. Daily estimates of precipitation are created by disaggregating the pentadal estimates using daily CCD observations (analogous to the method described in this paper).
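The step that produces CHIRP, as summarised above, amounts to scaling a climatology by a fractional anomaly. The sketch below shows only that arithmetic for one pixel and pentad; the function name, the handling of a zero long-term mean and the example values are assumptions, and it does not represent the CHIRPS production code.

```python
def chirp_estimate(pentad_estimate, long_term_mean, chpclim):
    """Scale the CHPclim climatology by the fraction of the long-term mean.

    pentad_estimate : TIR-based pentadal rainfall estimate (mm)
    long_term_mean  : long-term mean of that estimate for the same pentad (mm)
    chpclim         : CHPclim climatological rainfall for the same pentad (mm)
    """
    if long_term_mean == 0:
        # Assumed behaviour: no anomaly can be formed where the long-term mean is zero.
        return 0.0
    fraction_of_normal = pentad_estimate / long_term_mean
    return fraction_of_normal * chpclim


# Hypothetical pixel: estimate is 150% of its long-term mean, climatology is 10 mm.
chirp_mm = chirp_estimate(12.0, 8.0, 10.0)
```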

Both ARC and RFE produce daily rainfall estimates solely for Africa and were created to aid drought monitoring across sub-Saharan Africa. RFE uses satellite imagery from two streams, namely (1) TIR imagery to create rainfall estimates based on the GOES Precipitation Index (GPI) algorithm46 and (2) PMW imagery from the AMSU and SSM/I satellite instruments to create rainfall estimates using the method described by Ferraro and Marks47. The TIR and PMW rainfall estimates are then merged, before being adjusted to available GTS station data. ARC is a long term (30+ years) dataset and employs a similar method to RFE in that satellite estimates are merged with GTS gauge data; however, PMW data are not considered.

The primary objective of the TRMM satellite and the derived products was to improve observations of tropical precipitation48,49. The TRMM satellite, equipped with a precipitation radar, a microwave imager and a visible-infrared scanner, was used to better estimate precipitation features such as intensity, distribution, and type. The 3-hourly TMPA-3B42 estimates (evaluated in this study) are derived from merged-TIR imagery from geostationary and polar-orbiting platforms, adjusted by information derived from the TRMM instruments. The final step used the monthly Global Precipitation Climatology Centre (GPCC) gauge analysis to scale the monthly TMPA estimates to the gauge values. Sub-monthly products, including the 3-hourly TMPA-3B42 estimates, take account of this gauge scaling. TMPA data were issued to provide near-global coverage at a spatial resolution of 0.25°.

CMORPH50 produces global rainfall estimates from various PMW sensors. Motion vectors are calculated using half-hourly geostationary TIR imagery, which are used to propagate the PMW precipitation fields forward and back in time where no direct PMW data are available. A time-weighted interpolation is applied to the available PMW estimates to provide an estimate of the rainfall distribution and intensity for the intervening missing half-hour periods. This process is referred to as ‘morphing’ of the observations. For this study, the 3-hourly estimates at a spatial resolution of 0.25° were used.

In the case of CHIRPS, the operational product CHIRPS-Prelim was not considered because, at the time this study was conducted, the data were not available prior to 2015. All of the other datasets can be considered fully operational, except TMPA-3B42, which was replaced by the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (GPM) (IMERG) estimates in 2014. For consistency, all satellite datasets (except TMPA-3B42 and CMORPH) were bilinearly interpolated to a regular grid spacing of 0.25° by 0.25°, the same as the kriged gauge grid, and grid squares with coincident gauge measurements were then extracted. When summing the CMORPH and TMPA-3B42 3-hourly estimates to daily totals, the 3-hourly slots corresponding to the TAMSAT day (i.e., 06:00–06:00 the following day) were chosen. Evaluations were then carried out for the period of the gauge data, which differs from region to region (see Table 1).
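The alignment of 3-hourly estimates to the TAMSAT day can be illustrated with a small helper. It assumes, hypothetically, that each slot already holds a 3-hour accumulation in mm and that slot 0 starts at 00:00 on day 0; products delivered as rain rates would first need converting to accumulations.

```python
SLOTS_PER_DAY = 8        # eight 3-hour slots in 24 hours
FIRST_SLOT_OF_DAY = 2    # slot starting at 06:00 when slot 0 starts at 00:00


def daily_total_0600(three_hourly_mm, day_index):
    """Sum the eight 3-hour slots from 06:00 on day_index to 06:00 the following day."""
    start = day_index * SLOTS_PER_DAY + FIRST_SLOT_OF_DAY
    window = three_hourly_mm[start:start + SLOTS_PER_DAY]
    if len(window) < SLOTS_PER_DAY:
        raise ValueError("not enough 3-hourly slots to cover the 06:00-06:00 day")
    return sum(window)


# Hypothetical two-day series whose slot values equal their slot numbers.
series = [float(i) for i in range(16)]
day0_total = daily_total_0600(series, 0)
```

In this example the total for day 0 uses slots 2 to 9, so the first two slots of the following calendar day are included.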

### Statistical comparison of TAMSAT daily rainfall estimates with rain gauge data and other satellite-based rainfall datasets

The TAMSAT version 2.0 dekadal and monthly estimates and their representation of the Africa-wide climatology and seasonal cycle have been evaluated elsewhere18 and hence these features are not assessed here. Similar analyses for TAMSAT version 3.0 are documented on the TAMSAT website. Instead, the paper focuses on the ability of TAMSAT to capture daily rainfall characteristics, i.e., occurrence and amount.

#### Rainfall occurrence

Rainfall occurrence was evaluated using a suite of binary skill scores that encapsulate information on rainy/dry days in a contingency table (see Table 3).

A contingency table has been constructed for each region using all available data and is used to compute the following statistics:

• Accuracy; defined as the fraction of rainfall estimates that were estimated correctly: (A+D)/(A+B+C+D)

• Frequency bias (bias); defined as the rainfall estimate frequency of rainy days compared to the gauge observed frequency of rainy days: (A+B)/(A+C)

• Probability of detection (POD); defined as the fraction of rainy days correctly estimated: A/(A+C)

• False alarm ratio (FAR); defined as the proportion of estimated rainy days that did not actually occur: B/(A+B)

• Probability of false detection (POFD); defined as the fraction of gauge observed dry days incorrectly estimated as a rainy day: B/(B+D)

• Equitable threat score (ETS); defined as the fraction of gauge observed rainy days that were correctly estimated allowing for hits due to chance: $(A-A_{\mathrm{random}})/(A+B+C-A_{\mathrm{random}})$, where $A_{\mathrm{random}}=(A+C)(A+B)/(A+B+C+D)$

• Peirce’s Skill Score (PSS, also known as Hanssen and Kuipers discriminant); defined as the ability of the satellite estimate to differentiate between a rainy day and a dry day (as given by the gauge observation): (A/(A+C))-(B/(B+D))
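The scores listed above can be computed directly from the four contingency-table counts. In this sketch the meaning of each cell is inferred from the formulas (A = hits, B = false alarms, C = misses, D = correct dry days), since Table 3 is not reproduced here; the function names, the use of one threshold for both satellite and gauge values, and the example counts are hypothetical.

```python
def contingency_table(estimates, gauges, threshold=0.0):
    """Count hits (A), false alarms (B), misses (C) and correct dry days (D)."""
    a = b = c = d = 0
    for est, obs in zip(estimates, gauges):
        est_wet, obs_wet = est > threshold, obs > threshold
        if est_wet and obs_wet:
            a += 1
        elif est_wet:
            b += 1
        elif obs_wet:
            c += 1
        else:
            d += 1
    return a, b, c, d


def skill_scores(a, b, c, d):
    """Binary skill scores as defined in the list above."""
    n = a + b + c + d
    a_random = (a + c) * (a + b) / n   # hits expected by chance
    return {
        "accuracy": (a + d) / n,
        "bias": (a + b) / (a + c),
        "POD": a / (a + c),
        "FAR": b / (a + b),
        "POFD": b / (b + d),
        "ETS": (a - a_random) / (a + b + c - a_random),
        "PSS": a / (a + c) - b / (b + d),
    }


# Hypothetical counts for illustration only.
scores = skill_scores(a=50, b=25, c=10, d=15)
```

A region with no observed rainy days or no estimated rainy days would make some denominators zero, so a production version would need to guard against that case.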

Figure 3 displays barplots for each binary skill score over each of the study regions and over all regions for TAMSAT-2, TAMSAT-3 and the other six satellite datasets (values are also given in Table 4). In general, the TAMSAT skill scores (both versions) are similar to most of the other satellite products on all skill measures. Across all regions (leftmost column in Fig. 3), the accuracy skill measure indicates around 70% of the estimates were correct (i.e., in estimating dry and rainy days) and that around 70–80% of the observed rainy days were captured (POD). However, around 35–45% of estimated rainy days were falsely estimated (FAR), resulting in all products overestimating the occurrence of rainy days (bias), with the errors most severe in CHIRP, RFE and CMORPH. Similarly, around 20–40% of the gauge observed dry days were estimated as rainy days (POFD). Of the eight datasets, the TIR-based products show more commonality than the PMW-based products. The similarity between the skill scores of the former is likely a result of TIR imagery being used to define which regions are rainy. The exception to these findings is CHIRP, which, across all countries, grossly overestimates the frequency of rainfall events, leading to a high frequency bias (1.98) and POFD (0.67). However, CHIRPS demonstrates marked improvement on all statistical measures compared to CHIRP.

Regionally, however, there are some differences. Across most of the satellite products, scores are generally better for Niger, particularly for TAMSAT-2, which has the best scores for accuracy, FAR and POFD, and TAMSAT-3, which has the best scores for accuracy (same value as TAMSAT-2), POD, ETS and PSS. Scores are generally worst for Mozambique and Uganda. This is consistent with the expectation that satellite rainfall estimation algorithms, even those that incorporate PMW imagery, generally perform worse when the rainfall climate is strongly modulated by large water bodies, and for regions in close proximity to the ocean and complex topography17. Conversely, such algorithms perform well in the Sahel and over Zambia, where rainfall is primarily convective and the rainfall climate is less variable spatially. The high skill across both Niger and Zambia reflects this.

The skill scores were also assessed as a function of rainfall threshold (i.e., changing the satellite rainfall estimate threshold at which the contingency table is constructed). However, for all datasets, the skill scores showed no improvement as the threshold was increased from 0 mm to 40 mm (not shown).
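To make the threshold test concrete, the sketch below builds a contingency table from paired daily values at a chosen satellite threshold. It is an illustration only: the strict "greater than" comparison, the fixed 0 mm gauge threshold, the function name and the sample values are all assumptions rather than details given in the text.

```python
def contingency_table(satellite_mm, gauge_mm, sat_threshold=0.0, gauge_threshold=0.0):
    """Count hits (a), false alarms (b), misses (c) and correct negatives (d).

    A day is treated as rainy when the value exceeds the threshold
    (assumed convention; the comparison could equally be >=).
    """
    a = b = c = d = 0
    for sat, gauge in zip(satellite_mm, gauge_mm):
        sat_rainy = sat > sat_threshold
        gauge_rainy = gauge > gauge_threshold
        if sat_rainy and gauge_rainy:
            a += 1
        elif sat_rainy:
            b += 1
        elif gauge_rainy:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Hypothetical paired daily values (mm) for illustration only
sat = [0.0, 5.0, 12.0, 0.0, 3.0, 25.0]
gauge = [0.0, 2.0, 8.0, 4.0, 0.0, 30.0]

table_0mm = contingency_table(sat, gauge, sat_threshold=0.0)
table_10mm = contingency_table(sat, gauge, sat_threshold=10.0)
```

Raising the satellite threshold here removes the false alarm but turns one hit into a miss, which illustrates why a higher threshold need not improve the scores.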

#### Rainfall amount

Figure 4 shows a density scatterplot of TAMSAT-2, TAMSAT-3 and the other satellite rainfall estimates against kriged rain gauge amounts for all regions included in this study. Quantitative assessment of rainfall amount was based on the calculation of bias, coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). The kriging process also generates an estimate of the uncertainty of the interpolated gauge grid value. Using this, the fraction of satellite estimates within one and two standard errors of the gauge value was also computed. A summary of the aforementioned statistics is given in Fig. 5 for each dataset and for each region (values are also given in Table 5).
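The amount statistics named above can be sketched as follows. This is not the evaluation code used for the paper: bias is assumed here to be the mean of (satellite minus gauge), R² is assumed to be the squared Pearson correlation, and the function name and sample values are hypothetical.

```python
import math


def amount_statistics(satellite, gauge, gauge_se):
    """Bias, R^2, RMSE, MAE and fraction within 1 and 2 kriging standard errors.

    Assumptions: bias = mean(satellite - gauge);
    R^2 = squared Pearson correlation between the two series.
    """
    n = len(gauge)
    diffs = [s - g for s, g in zip(satellite, gauge)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(e * e for e in diffs) / n)
    mae = sum(abs(e) for e in diffs) / n

    mean_s = sum(satellite) / n
    mean_g = sum(gauge) / n
    cov = sum((s - mean_s) * (g - mean_g) for s, g in zip(satellite, gauge))
    var_s = sum((s - mean_s) ** 2 for s in satellite)
    var_g = sum((g - mean_g) ** 2 for g in gauge)
    r2 = cov * cov / (var_s * var_g)

    def fraction_within(k):
        # Share of satellite estimates within k standard errors of the gauge value
        return sum(abs(e) <= k * se for e, se in zip(diffs, gauge_se)) / n

    return {
        "bias": bias,
        "R2": r2,
        "RMSE": rmse,
        "MAE": mae,
        "within_1se": fraction_within(1),
        "within_2se": fraction_within(2),
    }


# Hypothetical values (mm) for illustration only
stats = amount_statistics(
    satellite=[2.0, 4.0, 6.0, 8.0],
    gauge=[1.0, 5.0, 5.0, 9.0],
    gauge_se=[1.0, 0.5, 2.0, 0.4],
)
```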

There is some correlation between rainfall estimates and gauge-measured rainfall amount for all of the satellite rainfall estimation datasets, but there are also significant discrepancies (see Fig. 4). For example, TAMSAT-2 systematically underestimates rainfall amount, and does not distinguish between moderate and high rainfall. Figure 5 confirms that there is a negative TAMSAT-2 bias for all countries, with the largest bias being for Mozambique and Nigeria. The correlations (i.e., R²) between gauge and TAMSAT-2 rainfall amounts range from 0.05 (Mozambique) to 0.61 (Niger). Niger also has the lowest errors (RMSE and MAE) out of the five regions whereas Mozambique has the largest RMSE. TAMSAT-3, however, demonstrates improvement on some of the statistics considered when compared to TAMSAT-2, most notably a reduction in the dry bias. There is also slightly better distinction between moderate and high rainfall (cf. Figs 4 and 6).

When the TAMSAT estimates are contrasted with the other rainfall datasets, it can be seen that over all countries, TAMSAT is, in general, comparable in all skill measures, except for bias. CHIRPS has the smallest bias, which can be attributed to the bias removal procedure implemented in the rainfall estimation approach. TAMSAT-2 and TAMSAT-3 estimates typically have smaller errors, as given by lower RMSE and MAE values. The smallest errors are for Niger. Low R² values (with the exception of TAMSAT over Niger) indicate limited skill in representing variability for all datasets. Given the high density gauge network over Niger and the contiguous 0.25° grid squares used here, measures of variability (i.e., R²) are associated with both spatial and temporal variability. All datasets typically perform worse over Mozambique, as is evident from the large spread of data points in Fig. 4. Despite including PMW data in their estimation approaches, none of RFE, TMPA-3B42 and CMORPH demonstrates substantial improvements in skill over the TIR-based methods, particularly for rainfall amount variability. This indicates that at such fine scales (daily and 0.25°), no dataset considered here can provide robust estimates of daily rainfall amount. This is in agreement with other studies at such scales31,33,51.

For all of the regions other than Niger, it is likely that at least some of the validation gauge records have been ingested into the rainfall estimation process for ARC, RFE and CHIRPS. While the high gauge density may be a factor, it is notable that TAMSAT has significantly more skill than the other datasets for Niger, in particular, the relatively high R² values for both TAMSAT-2 and TAMSAT-3. As TAMSAT is the only dataset considered here that is locally calibrated for both rainfall occurrence and rainfall amount, the skill of the TAMSAT data is noteworthy given it does not include contemporaneous information from gauges or PMW imagery. This illustrates the importance and the utility of a local and historical calibration approach.

Figure 6 gives an example of rainfall estimates for 1 January 2010. It can be seen that while the rainfall fields have similar spatial structures, there are fewer intensely rainy pixels in TAMSAT-2 (compared to the other datasets), although this is ameliorated somewhat in TAMSAT-3. While the rainy areas are similar for all of the datasets, the intensities vary considerably. This is consistent with the quantitative analysis described in this study, which showed that for all of the datasets, occurrence is more reliably estimated than amount across the five countries considered.

## Usage Notes

The TAMSAT system was originally designed for seasonal early warning of drought. Until the initial release of daily TAMSAT-2 in 2012, ARC in 2013, and CHIRPS in 2014, long-term satellite-based rainfall data for drought early warning were typically released at the dekadal time scale. This paper has presented the daily version of the TAMSAT data (versions 2.0 and 3.0). TAMSAT has previously been demonstrated to have good skill for 10-day cumulative rainfall estimates14 and we have shown here that the daily data reliably represent the occurrence of rainfall, capturing, on average, around 70% of observed rainy days (POD) and falsely estimating less than 40% of rainy days (FAR) across the case study countries. Regionally, TAMSAT captured rainy and non-rainy days better across Niger and Zambia, regions whose rainfall climates are not significantly modulated by large water bodies and complex topography. Variability in rainfall amount is, however, not well captured. Whilst the ability to differentiate between low and high rainfall amounts is important, it can be argued that across Africa, long dry spells (which, to be detected, require satellite estimation algorithms to skilfully differentiate a rainy day from a non-rainy day) are more damaging to crops than extremes of rainfall1. Many aspects of the skill of the TAMSAT daily data are, however, similar to or better than (depending on the skill measure) those of other widely used African operational daily datasets. Because CHIRPS, ARC and RFE make use of contemporaneous gauge data that are likely included in the validation datasets, interpretation of the results is complicated.

An obvious application for the daily data is the production of rainfall estimates for periods other than 5-day or 10-day accumulations starting on fixed days of the calendar month. The availability of a daily version of the TAMSAT dataset gives a choice of products based on the optimal length and starting point of cumulative rainfall estimates required. This facilitates comparison with other datasets, which are issued at weekly resolutions for example, and allows for greater flexibility for agricultural and hydrological applications.
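As an illustration of accumulating over a bespoke period, the sketch below sums a daily series for one grid cell over a window of arbitrary start date and length. The function name, the list-based data layout and the sample values are hypothetical; they do not describe how the TAMSAT files are organised.

```python
from datetime import date


def accumulate(daily_mm, first_day, start, n_days):
    """Total rainfall over n_days beginning at `start`.

    daily_mm  : daily rainfall values (mm), one per consecutive day
    first_day : calendar date of daily_mm[0]
    """
    offset = (start - first_day).days
    if offset < 0 or offset + n_days > len(daily_mm):
        raise ValueError("requested window lies outside the daily series")
    return sum(daily_mm[offset:offset + n_days])


# Hypothetical 14-day series for one grid cell, starting 1 January 2010
daily = [0.0, 3.0, 0.0, 12.0, 5.0, 0.0, 0.0, 8.0, 1.0, 0.0, 0.0, 0.0, 4.0, 2.0]

# A 7-day total starting on 4 January, which no fixed pentad or dekad provides
weekly_total = accumulate(daily, date(2010, 1, 1), date(2010, 1, 4), 7)
```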

Many crop and hydrological models require daily input52,53,54,55. In the case of crop modelling, yield generally depends on cumulative rainfall for key parts of the growing cycle. Daily data are therefore useful because rainfall can be cumulated over bespoke periods that match the key development phases of crops. Although TAMSAT data may be too coarse for analysis of small catchments, hydrological models for medium and large catchments may be able to utilise data at 4-km resolution56. The TAMSAT data have most skill when spatially aggregated4,14, and this is especially the case for rainfall that is not aggregated in time. In this context, the suitability of TAMSAT daily rainfall estimates depends on the hydrological features of the catchment and the purpose of the monitoring or modelling. TAMSAT’s poor skill for rainfall amount means that it is most suitable for monitoring large catchments where river discharge is determined by gradual accumulation of rainfall over a period of days. It can be argued that the TAMSAT data are not suitable for providing information on pluvial flood risk.

Unlike the other daily rainfall datasets considered, TAMSAT does not incorporate gauge data in real time. Recent studies have shown that inconsistencies in the gauge record can lead to spurious trends in rainfall, especially in the tropics, where the station network is patchy21,57. The TAMSAT cumulative rainfall datasets and the derived daily estimates can therefore be considered temporally consistent, which is important both in assessing climatic risks and for seasonal rainfall monitoring. As such, TAMSAT daily data are well suited to the study of long-term changes in daily metrics relating primarily to occurrence, such as the length of dry spells and the length of the growing season58. Since they cannot capture the intensity of high rainfall events well, TAMSAT daily data are less suited for studies of long-term changes in rainfall amount.
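One occurrence-based metric mentioned above, dry spell length, can be derived from a daily series as sketched below. The 1 mm rainy-day threshold, the function name and the sample values are assumptions made for illustration; the text does not prescribe a threshold.

```python
def dry_spell_lengths(daily_mm, rainy_threshold=1.0):
    """Lengths of consecutive runs of dry days.

    A day counts as dry when its rainfall is below rainy_threshold
    (1 mm is an assumed, commonly used value).
    """
    spells = []
    run = 0
    for mm in daily_mm:
        if mm < rainy_threshold:
            run += 1
        else:
            if run:
                spells.append(run)
            run = 0
    if run:
        # Close a dry spell that extends to the end of the series
        spells.append(run)
    return spells


# Hypothetical daily series (mm) for illustration only
spells = dry_spell_lengths([0.0, 0.0, 5.0, 0.2, 0.0, 0.0, 3.0, 0.0])
longest = max(spells)
```

Because the metric depends only on whether each day is classed as rainy, it draws on the occurrence skill of the estimates rather than on their skill for amount.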

In conclusion, we present the TAMSAT high-resolution daily rainfall dataset for Africa. The data are back calculated to January 1983 and updated in near-real time (v2.0 is updated every ten days and v3.0 is updated every five days). The recent development of TAMSAT version 3.0 pentadal estimates and derived daily estimates removes spatial artefacts and greatly reduces the dry bias associated with the previous version. A formal statistical assessment indicates that both TAMSAT daily datasets have comparable skill to other remotely sensed rainfall datasets, and can therefore be used for similar applications. Furthermore, TAMSAT’s historical calibration suits it well for risk assessment and the investigation of long-term changes in the rainfall climate.

Competing interests: The authors declare no competing financial interests.

How to cite this article: Maidment, R. I. et al. A new, long-term daily satellite-based rainfall dataset for operational monitoring in Africa. Sci. Data 4:170063 doi: 10.1038/sdata.2017.63 (2017).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. Influence of extreme weather disasters on global crop production. Nature 529, 84–87 (2016).
2. United Nations Office for the Coordination of Humanitarian Affairs. Horn of Africa Drought Crisis Situation Report No. 5 (21 July 2011) (2011).
3. Building resilience to face recurring environmental crisis in African Sahel. Nat. Clim. Chang. 3, 631–637 (2013).
4. The Use of Remotely Sensed Rainfall for Managing Drought Risk: a Case Study of Weather Index Insurance in Zambia. Remote Sens. 8, 342 (2016).
5. Advances in the Stochastic Modeling of Satellite-Derived Rainfall Estimates Using a Sparse Calibration Dataset. J. Hydrometeorol. 15, 1810–1831 (2014).
6. African Climate Change: Taking the Shorter Route. Bull. Am. Meteorol. Soc. 87, 1355–1366 (2006).
7. Relating Point to Area Average Rainfall in Semiarid West Africa and the Implications for Rainfall Estimates Derived from Satellite Data. J. Appl. Meteorol. 28, 252–266 (1989).
8. Extension of the TAMSAT Satellite-Based Rainfall Monitoring over Africa and from 1983 to Present. J. Appl. Meteorol. Climatol. 53, 2805–2822 (2014).
9. Validation of satellite and ground-based estimates of precipitation over the Sahel. Atmos. Res. 47, 651–670 (1998).
10. Optimal areal rainfall estimation using raingauges and satellite data. J. Hydrol. 222, 93–108 (1999).
11. Validation of high‐resolution satellite rainfall products over complex terrain. Int. J. Remote Sens. 29, 4097–4110 (2008).
12. An intercomparison of 10-day satellite precipitation products during West African monsoon. Int. J. Remote Sens. 32, 2353–2376 (2011).
13. Validation of Satellite-Based Precipitation Products over Sparsely Gauged African River Basins. J. Hydrometeorol. 13, 1760–1783 (2012).
14. Evaluation of satellite-based and model re-analysis rainfall estimates for Uganda. Meteorol. Appl. 20, 308–317 (2013).
15. Comparing Satellite and Surface Rainfall Products over West Africa at Meteorologically Relevant Scales during the AMMA Campaign Using Error Estimates. J. Appl. Meteorol. Climatol. 49, 715–731 (2010).
16. Evaluating satellite-based diurnal cycles of precipitation in the African tropics. J. Appl. Meteorol. Climatol. 55, 23–39 (2015).
17. Evaluation of Satellite Rainfall Estimates for Drought and Flood Monitoring in Mozambique. Remote Sens. 7, 1758–1776 (2015).
18. The 30 year TAMSAT African Rainfall Climatology And Time series (TARCAT) data set. J. Geophys. Res. Atmos. 119, 2014JD021927 (2014).
19. African Rainfall Climatology Version 2 for Famine Early Warning Systems. J. Appl. Meteorol. Climatol. 52, 588–606 (2013).
20. The climate hazards infrared precipitation with stations – a new environmental record for monitoring extremes. Sci. Data 2, 150066 (2015).
21. Recent observed and simulated changes in precipitation over Africa. Geophys. Res. Lett. 42, 2015GL065765 (2015).
22. Uncertainties in remotely sensed precipitation data over Africa. Int. J. Climatol. 36, 303–323 (2015).
23. Uncertainties in daily rainfall over Africa: assessment of gridded observation products and evaluation of a regional climate model simulation. Int. J. Climatol. 33, 1805–1817 (2013).
24. World population stabilization unlikely this century. Science 346, 234–237 (2014).
25. The ENACTS Approach: Transforming climate services in Africa one country at a time. J. World's Policy 1–24 (2016).
26. Rainwatch. Bull. Am. Meteorol. Soc. 90, 1607–1614 (2009).
27. in Colloques et Seminaires. Validation problems of rainfall estimation methods by satellite in intertropical Africa. Proceedings of the Niamey Workshop (1994) (ORSTOM, 1996).
28. Rainfall estimates in the Sahel from cold cloud statistics: accuracy and limitations of operational systems (Proceedings of the Niamey Workshop, February 1991). Soil Water Balanc. Sudano-Sahelian Zo. IAHS 199, 65–74 (1991).
29. Comparison of TAMSAT and CPC rainfall estimates with raingauges, for southern Africa. Int. J. Remote Sens. 22, 1951–1974 (2001).
30. A comparison of Meteosat rainfall estimation techniques in Kenya. Meteorol. Appl. 8, 107–117 (2001).
31. Validation of satellite rainfall products over East Africa’s complex topography. Int. J. Remote Sens. 28, 1503–1526 (2007).
32. The TAMORA algorithm: satellite rainfall estimates over West Africa using multi-spectral SEVIRI data. Adv. Geosci. 25, 3–9 (2010).
33. Investigation of discrepancies in satellite rainfall estimates over Ethiopia. J. Hydrometeorol. 15, 2347–2369 (2014).
34. Stratiform precipitation production over sub-Saharan Africa and the tropical East Atlantic as observed by TRMM. Q. J. R. Meteorol. Soc. 132, 2235–2255 (2006).
35. Contribution of Mesoscale Convective Complexes to Rainfall in Sahelian Africa: estimates from Geostationary Infrared and Passive Microwave Data. J. Appl. Meteorol. 38, 957–964 (1999).
36. Seasonal cycle and interannual variability of the Sahelian rainfall at hydrological scales. J. Geophys. Res. 108, 8389 (2003).
37. Rainfall climatology of the HAPEX-Sahel region during the years 1950–1990. J. Hydrol. 189, 43–73 (1997).
38. Rainfall monitoring during HAPEX-Sahel. 2. Point and areal estimation at the event and seasonal scales. J. Hydrol. 189, 97–122 (1997).
39. Rainfall monitoring during HAPEX-Sahel. 1. General rainfall conditions and climatology. J. Hydrol. 189, 74–96 (1997).
40. Rainfall estimation in the Sahel: the EPSAT-NIGER experiment. Hydrol. Sci. J. 37, 201–215 (1992).
41. Gridded Analysis of Meteorological Variables in Mozambique (The World Bank, Washington, D.C., USA: 2011).
42. Objective analyses and mapping techniques for rainfall fields: An objective comparison. Water Resour. Res. 18, 413–431 (1982).
43. A comparative analysis of techniques for spatial interpolation of precipitation. JAWRA J. Am. Water Resour. Assoc. 21, 365–380 (1985).
44. Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. J. Hydrol. 228, 113–129 (2000).
45. A global satellite-assisted precipitation climatology. Earth Syst. Sci. Data 7, 275–287 (2015).
46. The Relationship between Large-Scale Convective Rainfall and Cold Cloud over the Western Hemisphere during 1982–84. Mon. Weather Rev. 115, 51–74 (1987).
47. The Development of SSM/I Rain-Rate Retrieval Algorithms Using Ground-Based Radar Measurements. J. Atmos. Ocean. Technol. 12, 755–770 (1995).
48. The Status of the Tropical Rainfall Measuring Mission (TRMM) after Two Years in Orbit. J. Appl. Meteorol. 39, 1965–1982 (2000).
49. The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-Global, Multiyear, Combined-Sensor Precipitation Estimates at Fine Scales. J. Hydrometeorol. 8, 38–55 (2007).
50. CMORPH: a method that produces global precipitation estimates from passive microwave and infrared data at high spatial and temporal resolution. J. Hydrometeorol. 5, 487–503 (2004).
51. Validation of Satellite Rainfall Products for Western Uganda. J. Hydrometeorol. 15, 2030–2038 (2014).
52. The Joint UK Land Environment Simulator (JULES), model description – Part 1: energy and water fluxes. Geosci. Model Dev. 4, 677–699 (2011).
53. The Joint UK Land Environment Simulator (JULES), model description – Part 2: carbon fluxes and vegetation dynamics. Geosci. Model Dev. 4, 701–722 (2011).
54. JULES-crop: a parametrisation of crops in the Joint UK Land Environment Simulator. Geosci. Model Dev. 8, 1139–1155 (2015).
55. Design and optimisation of a large-area process-based model for annual crops. Agric. For. Meteorol. 124, 99–120 (2004).
56. Satellite-driven downscaling of global reanalysis precipitation products for hydrological applications. Hydrol. Earth Syst. Sci. Discuss. 11, 9067–9112 (2014).
57. Fingerprints of changes in annual and seasonal precipitation from CMIP5 models over land and ocean. Geophys. Res. Lett. 39, L21706 (2012).
58. Trends in the start of the wet season over Africa. Int. J. Climatol. 29, 1216–1225 (2009).

## Data Citations

1. Maidment, R., Black, E., & Tarnavsky, E. University of Reading http://dx.doi.org/10.17864/1947.108 (2017)
2. Maidment, R., Black, E., & Young, M. University of Reading https://doi.pangaea.de/10.1594/PANGAEA.871465 (2017)

## Acknowledgements

This paper is dedicated to our friend and mentor, George Dugdale, who sadly passed away on 1 March 2016. George was a founding member of the TAMSAT Group who pioneered the use of satellite imagery for rainfall estimation over Africa. His outstanding contribution to the field has left a lasting legacy that is strongly reflected in the research and other activities carried out by the TAMSAT Group today. He is sorely missed by all who knew him.

This work was implemented as part of the CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS), which is a strategic partnership of CGIAR and Future Earth. It was carried out with funding by CGIAR Fund Donors, the Danish International Development Agency (DANIDA), Australian Government (ACIAR), Irish Aid, Environment Canada, Ministry of Foreign Affairs for the Netherlands, Swiss Agency for Development and Cooperation (SDC), Instituto de Investigação Científica Tropical (IICT), UK Aid, Government of Russia, the European Union (EU), New Zealand Ministry of Foreign Affairs and Trade, with technical support from the International Fund for Agricultural Development (IFAD). The views expressed in this document cannot be taken to reflect the official opinions of CGIAR or Future Earth. The authors would like to thank Keith Shine for his support and input into the initial CCAFS study that led to this paper. The development of the TAMSAT version 3.0 dataset was supported by the European Commission MARSOP4 programme. R.I.M. is supported by the HyCRISTAL project (NE/M020371/1). E.B. is supported by the NCAS-Climate core programme, and the following NERC grants: ERADACS NE/P015352/1, BRAVE2 NE/M008983/1 and HyCRISTAL NE/M020371/1. M.Y. is supported by the NCAS-Climate core programme and the European Commission MARSOP4 programme.

## Author notes

• David Grimes

Deceased.

## Affiliations

1. ### Department of Meteorology, University of Reading, Reading RG6 6BB, UK

• Ross I Maidment
• David Grimes
• Emily Black
• Elena Tarnavsky
• Matthew Young
• Richard P Allan
• Thorwald Stein
2. ### International Research Institute for Climate and Society (IRI), Columbia University, New York, NY 10964-1000, USA

• Helen Greatrex
3. ### National Centre for Earth Observation (NCEO), Reading RG6 6BB, UK

• Richard P Allan
4. ### Zambian Meteorological Department, P.O. Box 30200, Lusaka, Zambia

• Edson Nkonde
5. ### Uganda National Meteorological Authority, P.O. Box 7025, Kampala, Uganda

• Samuel Senkunda

• Edgar Misael Uribe Alcántara

## Authors

### Contributions

R.I.M. did the bulk of the work: creating the TAMSAT datasets, carrying out the evaluations and writing the paper. D.G. had the original insight that skilful daily estimates could be provided by weighting dekadal rainfall estimates by daily CCD. E.B. worked with R.I.M. to design the evaluations and create TAMSAT version 3.0, and drafted some of the discussion sections of the paper. E.T. worked with R.I.M. to create TAMSAT version 2.0, and provided intellectual input to the evaluation of the daily data, as well as operational leadership of the TAMSAT pan-African calibration and validation system, on which the disaggregated daily products are based. M.Y. worked with R.I.M. to create TAMSAT version 3.0. H.G., R.P.A. and T.S. had intellectual input into the evaluations and analysis. E.N., S.S. and E.M.U.A. processed and provided rain gauge records from Zambia, Uganda and Mozambique, respectively.

## Corresponding author

Correspondence to Ross I Maidment.