The evaluation of IMERG and ERA5-Land daily precipitation over China with considering the influence of gauge data bias

Evaluating the accuracy of the satellite and reanalysis precipitation products is very important for understanding their uncertainties and potential applications. However, because of underestimation existing in commonly used evaluation benchmark, gauge precipitation data, it is necessary to investigate the influence of systematic errors in gauge data on the performance evaluation of satellite and reanalysis precipitation datasets. Daily satellite-based IMERG and model-based ERA5-Land, together with gauge precipitation data, were collected with the period from 2005 to 2016 over China in this study. Daily corrections for precipitation biases from wind-induced undercatch, wetting loss, and trace error were made for gauge measurements. A set of metrics, including relative bias, Kling-Gupta efficiency, frequency bias, and critical success index, were used to evaluate and intercompare the performances of IMERG and ERA5-Land against original and bias-corrected gauge data in different locations, years, seasons, climatic zones, classes of precipitation events, and precipitation phases. The results have shown that: After removing the bias in gauge data, the relative biases of IMERG and ERA5-Land both significantly decline. The noticeable changes of their accuracy occur and vary with different locations, years, seasons, climatic zones, and precipitation phases. Furthermore, the frequency biases of IMERG and ERA5-Land rise in no precipitation events and decline in light, moderate, heavy, and extreme precipitation events. The detection capability of IMERG and ERA5-Land in no and light precipitation events is also obviously affected. Therefore, this study has demonstrated the significant influence of systematic gauge precipitation errors on the assessment of IMERG and ERA5-Land and reinforces the necessity to remove negative bias in gauge data before using it as the benchmark.

. (a) Topography and the location of the selected weather stations over China. (b) Climatic zones using multi-year average precipitation from bias-corrected gauge data (see Table 1). (c) Enlarged detail plot of two nearest weather stations along with ERA5_Land cell and IMERG cell for the purple area in (a). All of the maps in this figure and Figs. 2, 3, 4 and 5 were created by ArcGIS 10.2 software (https:// resou rces. arcgis. com/ en/ help/ main/ 10.2/).

IMERG.
As one of the most popular algorithms of GPM, IMERG intercalibrates, merges, and interpolates microwave precipitation estimates from GPM constellation, microwave-calibrated infrared (IR) satellite estimates, Global Precipitation Climatology Centre (GPCC) monthly gauge analyses, and data from other sensors for the TRMM and GPM eras 39 . The latest Version 06B of IMERG extends the time range of data back to June 2000 by integrating the precipitation estimates collected during the TRMM era. The system runs three times for each observation time, providing Early Run (near real-time), Late Run (near real-time), and Final Run (researchlevel) products. The IMERG Final Run product includes two precipitation variables: the microwave-infrared estimates without gauge calibration (IMERG_uncal) and the calibrated product based on GPCC monthly gauge analysis (IMERG_cal). Because the gauge observations from weather stations used in this study are incorporated in GPCC monthly gauge analysis but not assimilated into ERA5-Land, the IMERG_uncal (hereafter IMERG) rather than IMERG_cal is selected for an independent and fair evaluation. The half-hourly 0.1° IMERG Final Run precipitation estimates were gathered from Goddard Earth Sciences Data and Information Services Center (GES DISC) (https:// disc. gsfc. nasa. gov/). ERA5-Land. ERA5-Land, produced by replaying the land component of the ECMWF ERA5 climate reanalysis, offers finer-scale resolution (9 km) and more accurate land parameters than 31 km ERA5-HRES. Another critical difference between the two products is that ERA5-Land is only forced by the ERA5 low atmosphere meteorological field in conjunction with an additional lapse-rate correction and is not coupled to the atmospheric model, which is designed to make running ERA5-Land computationally affordable 47 . The precipitation data used in this paper is the sum of large-scale and convective precipitation from the beginning to the end of the forecast step. The regular hourly ERA5-Land precipitation estimates with 0.1° × 0.1° latitude/longitude grid were downloaded from the Climate Data Store (CDS) (https:// cds. clima te. coper nicus. eu/).

Methods.
Pre-processing of the datasets. It should be noted that CMA records daily precipitation from 8:00 p.m. Beijing time of the previous day to 8:00 p.m. Beijing time of the current day, which is equivalent to 12:00 Coordinated Universal Time (UTC) to 12:00 UTC. To get the consistent definition of daily precipitation with CMA, we aggregated all half-hourly IMERG and hourly ERA5-Land estimates falling between 12:00 and 12:00 UTC to get daily IMERG and ERA5-Land precipitation.
Due to point measurement characteristic, gauge data do not match the areal estimates of IMERG and ERA5-Land 48 . To deal with this issue, multiple rain gauge observations within a pixel can be averaged to approximate areal precipitation in the presence of dense gauge networks. However, the gauge network used in this study is sparse. The fact is that the nearest great circle distance between stations is 22.25 km, which is larger than one grid distance of IMERG and ERA5-Land (Fig. 1c). As alternative solutions, different interpolation methods can be used to downscale areal estimates of IMERG and ERA5-Land to point scale or upscale point measurement of gauge to areal precipitation. However, these methods vary with different hydroclimate and terrain and the uncertainty introduced by interpolation itself is hard to quantify. Therefore, to preserve the accuracy of the rain gauge data, IMERG and ERA5-Land, we adopted a pixel-topoint strategy adopted by many other authors in data-scarce regions 21,[49][50][51][52][53] to evaluate IMERG and ERA5-Land. This strategy neglects the scale gap between daily precipitation estimates from gridded IMERG and ERA5-Land and their counterpart derived from rain gauge point measurement and assumes the grid-scale precipitation is equal to point-scale precipitation regardless of the gauge position in the pixel. Precipitation from the specific pixels of IMERG and ERA5-Land were extracted using the geographic location of weather stations and were compared with the corresponding gauge observations by the following metrics.
Evaluation metrics. To assess the accuracy of IMERG and ERA5-Land, two continuous metrics, relative bias (RB) and Kling-Gupta efficiency (KGEʹ), were used. RB describes the deviation of IMERG and ERA5-Land from gauge data. RB range is from − ∞ to ∞ and perfect value is 0. The sign of value (positive or negative) of RB represents the direction of deviation (overestimation or underestimation) of IMERG and ERA5-Land compared with gauge precipitation. The magnitude of the absolute value of RB means magnitude of deviation (magnitude of overestimation or underestimation). KGEʹ 54,55 describes overall accuracy, which is a multi-component metric integrating the correlation (CC), bias ratio ( β ), and variability ratio ( γ ), proposed by Gupta et al. (2009) and Table 1. The division of climatic zones using annual mean precipitation based on bias-corrected gauge data. where N represents the number of samples; X i and Y i represent the ith gauge record and IMERG/ERA5-Land estimate, respectively; µ X and µ Y are the corresponding mean values; σ X and σ Y are the corresponding Referencing China's national standard on the grade of precipitation (GB/T 28592-2012 which can be consulted at http:// opens td. samr. gov. cn/), daily precipitation can be divided into 5 classes (Table 2): no precipitation, light precipitation, moderate precipitation, heavy precipitation, and extreme precipitation. Two categorical metrics (frequency bias (FB) and critical success index (CSI)) are used to further quantify the performance detection ability of IMERG and ERA5-Land for the occurrence of precipitation events falling in different precipitation classes. FB shows the extent to which IMERG and ERA5-Land overestimate or underestimate the amount of precipitation events. The range and perfect value of FB are [− 1, ∞) and 0, respectively. Similar to RB, the sign of value and the magnitude of the absolute value of FB provide information on the direction of deviation and magnitude of deviation of IMERG and ERA5-Land in the amount of precipitation events. CSI combines the features of the probability of detection (POD) and the false alarm ratio (FAR) to demonstrate the overall detection ability of target datasets. The range and perfect value of CSI are [0,1] and 1, respectively. The closer of CSI to 1, the better detectability.
where HIT is the frequency of events that are detected and observed; MISS is the frequency of events that are observed but not detected; FALSE is the frequency of events that are detected but not observed; POD = HIT/(HIT + MISS) ; FAR = FALSE/(HIT + FALSE).
Precipitation phase classification method. The bias of gauge measurements is related to the precipitation phase. In general, the bias for rainfall is relatively small and snowfall is greatly affected by wind speed 56 . Therefore, before correcting the bias of gauge measurements, the precipitation phase should be determined first. There are two types of precipitation phase classification schemes 57 . The first type only uses air temperature as the basis to classify precipitation phases 32,36,37,58 . This type of schemes requires relatively few parameters and is suitable for data-scarce areas. However, the temperature thresholds used to classify precipitation phases can vary significantly in different areas. Unlike the first type, the second type also considers other parameters such as wet-bulb temperature and relative humidity in addition to air temperature [59][60][61] . Sims and Liu 61 found that the wet-bulb temperature, rather than ambient air temperature, should be used to separate solid and liquid precipitation. This is because the wet-bulb temperature is closer to the actual temperature of precipitation particles, and it has a smaller range for which there is uncertainty as to whether the precipitation is solid or liquid. Based on the dependence of the precipitation phase on different geophysical parameters, they developed a parameterization scheme over a global scale that accepts 2-m temperature, relative humidity and surface pressure (to calculate wet-bulb temperature), low-level vertical temperature lapse rate, surface skin temperature, and surface type as inputs and returns the conditional probability of solid precipitation. However, the required parameters for this scheme are too many and cannot be obtained from weather stations alone. For example, the calculation of vertical temperature lapse rate needs observations from radiosondes and balloons. Ding et al. 60 investigated the relationship of precipitation phases with surface elevation and meteorological variables. Based on the results, a parameterization scheme was developed to discriminate the precipitation phase, with input of daily mean wetbulb temperature, relative humidity, and surface elevation. They also compared the performance of this scheme with those of 11 other schemes for China territory and the evaluations showed that the new scheme gives better accuracy with respect to other schemes. Moreover, the number of required parameters for this scheme is moderate and all parameters can be provided by weather stations. Therefore, the precipitation phase classification in this paper was based on this scheme. This scheme uses the wet-bulb temperature instead of air temperature to classify the precipitation phase. Two threshold wet-bulb temperatures are calculated with inputs of relative humidity and surface elevation. Under different values of relative humidity and surface elevation, the thresholds also vary. In other words, this parameterization scheme is a dynamic threshold method. The details are as follows.
Two threshold wet-bulb temperatures ( T min and T max ) are defined such that the precipitation phase is decided by: where T d is daily mean air temperature (℃); RH is relative humidity and it ranges from 0 to 1;p s is air pressure (hPa); e sat (T d ) is the saturated vapor pressure (hPa) at T d and is given by Tetens's empirical formula 62 as: The calculation of T min and T max is shown in the following formulas: where T 0 , T , and S are given by: where Z denotes station elevation (km).
Bias-correction method. Due to the negative bias inherent in gauge data, the reliability of the performance evaluation of IMERG and ERA5-Land directly using gauge data as the benchmark will also be weakened. Fortunately, there have been a lot of research works for the bias-correction methods of gauge data 32,34,[36][37][38] . According to Sevruk and Hamon 63 , bias correction for gauge precipitation mainly includes trace precipitation, wetting losses, evaporation losses, and wind undercatch errors. The general formula for precipitation bias correction is written as follows 28,36,63 : where P c is the corrected gauge precipitation; P g is the measured gauge precipitation; P w , P e , and P t are wetting losses, evaporation losses, and trace precipitation, respectively; and K is the correction coefficient for wind-induced errors, which is defined as 1/CR, and CR is the catch ratio due to wind-induced undercatch.
Evaporation losses are strongly dependent on weather conditions and methods of observations 64 . Because daily variation and seasonal change in evaporation losses are great and can be very site-dependent, it is difficult to give an incredible estimate of the daily evaporation losses at regional station networks by using the averaging evaporation amount obtained from other experiment sites. In addition, the annual evaporation loss in China is expected to be small due to the use of a funnel and a container in the high-evaporation period 36 . Therefore, we neglected this error in this study.
A precipitation event with < 0.10 mm is beyond the resolution of CSPG and is recorded as trace amount of precipitation. Trace events are counted as precipitation days but treated as zero amounts. Sometimes, two trace precipitation events are reported in a single precipitation day. Considering the wetting loss, the gauge cannot detect the precipitation events that are less than the wetting loss. To be conservative, we didn't correct the wetting loss and wind undercatch for trace precipitation days. Following Ye et al. 36 and Zhang et al. 37 , a value of 0.10 mm was assigned for any given trace day, regardless of the number of trace observations reported. In the gauge records, trace precipitation is flagged as 32,700.
Wetting losses depend on gauge types, precipitation phases, the number of observations, and the gaugeemptied times. The wetting loss experiments in the Urumqi river basin show the average wetting loss of CSPG per observation was 0.23 mm, 0.29 mm, 0.30 mm for rainfall, mixed precipitation, and snowfall, respectively 38,65 .
To be conservative, we corrected wetting losses once a precipitation day regardless of the number of observations in a day. These correction amounts for wetting losses were also applied in the works of Ye et al. 36 and Zhang et al. 37 . The correction for wetting losses was trigged for measurable precipitation days (gauge-measured precipitation ≥ 0.10 mm).
P c = K P g + P w + P e + P t , www.nature.com/scientificreports/ The wind-induced errors are caused by the wind field deformation over the gauge orifice 63 . It has been well documented that the wind undercatch for snowfall is much higher than for rainfall under the same wind speed 36,66 . It is essential to classify the phase of precipitation in order to apply an appropriate wind-loss correction for different precipitation phases. In this work, information on the precipitation phase was not available in the station metadata. Alternatively, we employed the parameterization scheme developed by Ding et al. 60 to discriminate the precipitation phase. The details of this scheme have been presented in "Precipitation phase classification method" section. Catch ratio was defined as the ratio of the amount of precipitation caught by a gauge (including the record amount and wetting loss) to the true precipitation. The catch ratio equation we applied is based on Yang et al. 38 who conducted gauge intercomparison experiments in the Urumqi river basin. The octagonal vertical double fence (DFIR) was used as reference measurements 38,57 . According to the intercomparison study, the catch ratios (percent) for different precipitation phases as the function of daily mean wind speed at 10 m height were established.
where CR snow and CR rain are catch ratios for snowfall, rainfall, and mixed precipitation, respectively; U 10 is the daily mean wind speed at 10 m height.
For mixed precipitation, the catch ratio was estimated by a linear relationship of the catch ratios for snowfall and rainfall determined by daily mean temperature ( T d ) 36 : The correction for wind-induced errors was also trigged for measurable precipitation days. Finally, the corrected precipitation can be obtained through the following formula: The procedure of bias-correction method can be summarized as follows:

Data collection Twelve years (2005-2016) of the daily gauge precipitation data from CSPG installed in 814
weather stations operated by CMA were obtained. In addition, the daily mean air temperature, relative humidity, air pressure, station elevation, and daily mean wind speed at 10 m height from these CMA stations were also gathered.

Precipitation phase classification
The daily mean air temperature, relative humidity, air pressure, and station elevation were used to classify precipitation phase according to Eqs. (5)-(12) 3. Calculating each bias correction component Evaporation loss was neglected. For each measurable precipitation day, wetting losses and wind-induced errors were corrected. The correction amounts of wetting losses were 0.23 mm, 0.29 mm, and 0.30 mm for rainfall, mixed precipitation, and snowfall, respectively. The catch ratios for snowfall, rainfall, and mixed precipitation were calculated using daily mean speed at 10 m height and daily mean temperature according to Eqs. (14)-(16), respectively. For trace precipitation days, only correction for trace precipitation was made and a value of 0.1 mm was assigned. 4. Removing bias in gauge data After calculating each bias correction component, bias-corrected gauge data can be produced according to Eq. (17).
Consent to publish. All the authors have approved the submission and consented for the publication.

Results
Spatial distribution of annual mean precipitation. The spatial distribution of annual mean precipitation from IMERG, ERA5-Land, Gauge_O, and Gauge_C over China is shown in Fig. 2. All four precipitation datasets (IMERG, ERA5-Land, Gauge_O, and Gauge_C) display a southeast-to-northwest decreasing trend over China. This result demonstrates that both IMERG and ERA5-Land can capture spatial trend of precipitation over China and negative bias in gauge data doesn't influence spatial trend of gauge precipitation. However, comparing Gauge_O with Gauge_C, we found that the precipitation belts with lower annual mean precipitation slightly shift in the direction of the precipitation belts with higher annual mean precipitation after bias correction of gauge data. For example, the precipitation belt with precipitation ranging from 250 to 500 mm/year is replaced by precipitation belt with precipitation ranging from 500 to 1000 mm/year in the border of these two precipitation belts. This slightly shifting of precipitation belts results from the increasing precipitation from gauge data after removing negative bias. After bias correction, annual mean precipitation from gauge increases everywhere. Under the same standard for precipitation belt classification, the precipitation belt of one place might be replaced by the precipitation belt with higher annual mean precipitation.
The correction for bias in gauge measurements. We calculated the correction amount (CA) and correction factor (CF) in different locations, climatic zones, seasons, and precipitation phases. The results are displayed in Fig. 3. The correction amount was calculated as Gauge_C − Gauge_O . The correction factor was calculated as (Gauge_C − Gauge_O)/Gauge_O. www.nature.com/scientificreports/ The spatial trend of correction amount (Fig. 3a) matches that of Gauge_O (Fig. 2c), with large correction amount in southeastern China where annual precipitation is high and small correction amount in northwestern China where annual precipitation is low. The correction amount and Gauge_O both gradually decrease from severe-humid to severe-arid climatic zone (Fig. 3c). On seasonal scale, the correction amount and largest Gauge_O both have largest values in summer and smallest values in winter (Fig. 3e). For different precipitation phases, the values of the correction amount and Gauge_O in descending order both occur in rainfall, mixed precipitation, and snowfall (Fig. 3g). These results demonstrate that there is some kind of positive correlation between the correction amount and gauge-observed precipitation. The reason is probably that the corrected gauge precipitation is proportional to the measured gauge precipitation (Eq. (13)).
As for the correction factor, the situation is different. The spatial trend of the correction factor ( Fig. 3b) is opposite to that of the correction amount (Fig. 3a), with large correction factor in northwestern China and small correction factor in southeastern China. From severe-humid to severe-arid climatic zone, correction factor increases from 11.3 to 25.6% (Fig. 3d). On seasonal scale, the correction factor has largest value in winter and smallest value in summer (Fig. 3f). Among three precipitation phases, the value of the correction factor from largest to smallest occurs in snowfall, mixed precipitation, and rainfall. The snowfall proportion in precipitation events and the lower catch efficiency of snowfall than rainfall make some contribution to these results. The snowfall proportion decreases from northwestern China (arid climates) to southeastern China (humid climates). Rainfall is dominant in summer and most snowfall occurs in winter. The catch ratio of snowfall is lower than rainfall under the same wind speed. Moreover, the correction factor is 12.7% on average in China, which is no more than those in the works of Li et al. 32  Spatial variation of continuous evaluation metrics. We calculated continuous metrics before bias correction and after bias correction by using Gauge_O and Gauge_C as the benchmark, respectively. RBs before bias correction and after bias correction are denoted by RB_O and RB_C. KGEʹ_O or KGEʹ_C is for KGEʹ likewise. We also calculated the difference between continuous metrics after bias correction and those before bias correction. For RB, the difference was labeled as ∆RB (RB_C minus RB_O). For KGEʹ, the difference was labeled as ∆KGEʹ (KGEʹ_C minus KGEʹ_O).
RB_O, RB_C, and ∆RB for IMERG and ERA5-Land were calculated at each weather station and their distributions were presented in Fig. 4. The spatial distributions of RB_O and RB_C for IMERG (Fig. 4a,b) both indicate that IMERG underestimates precipitation in southern China and Qinghai-Tibet Plateau while overestimates precipitation in northern China. Moreover, the spatial distributions of RB_O and RB_C for ERA5-Land (Fig. 4d,e) both show that the precipitation of ERA5-Land is overestimated in Tibet Plateau and Yunnan-Guizhou Plateau. www.nature.com/scientificreports/ However, comparing the spatial distribution of ∆RB for IMERG (Fig. 4c) and ERA5-Land (Fig. 4f), it is shown the noticeable decline of RB for IMERG and ERA5-Land occurs everywhere and the magnitude of decline varies www.nature.com/scientificreports/ spatially after bias correction. The magnitude of decline of RB for IMERG is greater in northern China than in southern China while that for ERA5-Land is greater in western China than in eastern China. The reason for the obvious decline of RB is that the benchmark dataset, the precipitation from gauge data, increases after bias correction. This decline of RB has led to the sign of RB reverses and the absolute value of RB changes, that is, the direction of deviation and magnitude of deviation change. For example, in comparison with Fig. 4a,b, many medium red (20% < RB < 50%) and heavy red points (RB > 50%) are replaced by light red (0 < RB < 20%) and medium red points, respectively, in northeastern China after bias correction. This indicates the magnitude of overestimation of IMERG precipitation reduces in northeastern China. In the comparison with Fig. 4d,e, many light red points are replaced by light blue points (− 20% < RB < 0) in eastern China after bias correction. The result demonstrates that the direction of deviation of ERA5-Land precipitation changes from overestimation to underestimation in eastern China. Figure 5 shows the spatial distributions of KGEʹ_O, KGEʹ_C, and ∆KGEʹ for IMERG and ERA5-Land, respectively. As can be seen from Fig. 5c-f, it can be found: After bias correction, that the accuracy of IMERG deteriorates in southern China and Tibetan Plateau with ∆KGEʹ < 0 but improves in the remaining regions of China with ∆KGEʹ > 0. ERA5-Land performs better in most of China except the eastern coast of China and isolated sites inland. However, the spatial trends of KGEʹ for IMERG and ERA5-Land are almost unaffected by systematic bias in gauge data. Comparing the spatial distribution of KGEʹ_O and KGEʹ_C for IMERG (Fig. 5a,b) and for ERA5-Land (Fig. 5d,e), respectively, we found the accuracy of IMERG shows the southeast-to-northwestdecreasing trend and that of ERA5-Land shows the northeast-to-southwest-decreasing trend whether before or after bias correction. There are two reasons for the above two distinct spatial trends of accuracy. First, satellites (reanalysis) exhibit superior performance at low (high) latitudes dominated by intense, localized convective (persistent, large-scale stratiform) precipitation system [67][68][69][70][71] . This explains the south-to-north-decreasing www.nature.com/scientificreports/ (north-to-south-decreasing) trend of accuracy for IMERG (ERA5-Land). Another reason is the increasingly sophisticated underlying surfaces from the eastern coast to the western inland which explains the east-to-westdecreasing trend of accuracy for two precipitation estimates.   www.nature.com/scientificreports/ and ∆KGEʹ) after bias correction and those before bias correction for IMERG and ERA5-Land, respectively. We defined March-May, June-August, September-November, and December-February as spring, summer, autumn, and winter. As can be seen from Figs. 6a and 7a, the RBs for IMERG and ERA5-Land both decline noticeably in all years and all seasons after bias correction with the upper staple of each box in Figs. 6b and 7b below ∆RB = 0. The medians of ∆RB for IMERG and ERA5-Land range from − 13.7 to − 11.6% and from − 15.  www.nature.com/scientificreports/ From Fig. 6c,d, the overall accuracies for IMERG and ERA5-Land both slightly increase in all years and the magnitude of improvement for ERA5-Land is a little larger than IMERG. The medians of ∆KGEʹ for IMERG and ERA5-Land range from 0 to 0.05 and from 0.03 to 0.06, respectively, and the median line of ∆KGEʹ for ERA5-Land is higher than that for IMERG for each year between 2005 and 2016. From Fig. 7c,d, IMERG performs worse in spring and winter with the median ∆KGEʹ = − 0.04 and − 0.08, respectively, but performs better in summer and autumn with the median ∆KGEʹ = 0.02 and 0.12, respectively. For ERA5-Land, the overall accuracy increases in all four seasons with ∆KGEʹ ranging from 0.01 to 0.28. In addition, comparing KGEʹ_O with KGEʹ_C for IMERG and ERA5-Land in Fig. 7c, it can be concluded that ERA5-Land has better overall accuracy than IMERG in autumn and winter while the contrary is the case in summer whether before bias correction or after bias correction. In spring, the performance superiority between IMERG and ERA5-Land reverses after bias correction. Before bias correction, the median KGEʹ of IMERG (0.58) is greater than that of ERA5-Land (0.53). After bias correction, the median KGEʹ of IMERG (0.56) becomes less than that of ERA5-Land (0.61).

Accuracy of IMERG and ERA5-Land over different climatic zones.
To explore the regional influence of negative bias in gauge data on the accuracy evaluation of IMERG and ERA5-Land, we computed overall RB_O, RB_C, ∆RB, KGEʹ_O, KGEʹ_C, and ∆KGEʹ for IMERG and ERA5-Land over the whole China (Table 3) and different climatic zones (Table 4).  www.nature.com/scientificreports/ As can be seen from Table 3, over the whole of China, the overall RBs of IMERG and ERA5-Land both decline after bias correction. The direction of deviation of IMERG changes from overestimation (RB_O = 6.8%) to underestimation (RB_C = − 5.2%), and the magnitude of deviation of ERA5-Land remarkably reduces, changing from significant overestimation to slight overestimation. However, the accuracies of IMERG and ERA5-Land generally do not change much.
As shown in Table 4, after bias correction, the overall RBs of IMERG and ERA5-Land again decline over different climatic zones and the magnitude of RB decline gradually rise from severe-humid climatic zone to severearid climatic zone along with ∆RB for IMERG varying from − 9.7 to − 38.5% and ∆RB for ERA5-Land varying from − 10.2 to − 30.7%. In other words, the RB declines more in arid climates than in humid climates. This result probably attributes to more trace precipitation events, lower catch ratio for wind-induced undercatch, and more correction percentage for wetting loss which are contributed by scarce precipitation, higher wind speed, and a greater proportion of snowfall events, respectively, in arid climates with respect to humid climates over China 37 . Moreover, the changes of direction of deviation and magnitude of deviation for IMERG and ERA5-Land also occur. For example, the magnitude of deviation of IMERG changes from slight underestimation (RB_O = − 5.1%) to significantly underestimation (RB_C = − 14.8%) over severe-humid climate zone and the direction of deviation of IMERG changes from overestimation to underestimation over the humid climatic zone.
Over the severe-humid zone, the overall accuracies of IMERG and ERA5-Land both slightly deteriorate after bias correction with ∆KGEʹ = − 0.04 and ∆KGEʹ = − 0.02, respectively. Over the remaining five climatic zones, the overall accuracies of IMERG and ERA5-Land all improve, especially in the severe-arid climatic zone for IMERG with ∆KGEʹ = 0.32 and in arid and severe-arid climatic zone for ERA5-Land with ∆KGEʹ = 0.16 and 0.22. It should be noted the overall accuracies of IMERG in severe-arid climatic zone is poor, probably due to sub-cloud evaporation 72 . Over arid regions, the atmosphere beneath the clouds is mostly dry. As a result, precipitation detected by satellite will evaporate before reaching the surface resulting in huge overestimation of surface precipitation. This phenomenon is severe in the severe-arid climatic zone with RB_O = 88.9% and RB_C = 50.4%.

Detection ability of IMERG and ERA5-Land in different classes of precipitation events.
To investigate the influence of systematic error in gauge data on the frequency distribution of gauge-observed precipitation events and categorical metrics for IMERG and ERA5-Land, we listed the frequency distributions of precipitation events in different precipitation classes from Gauge_O and Gauge_C in Table 5 and calculated FB_O, FB_C, ∆FB, CSI_O, CSI_C, and ∆CSI for IMERG and ERA5-Land in different precipitation classes which are shown in Fig. 8. Similar to RB_O, RB_C, ∆RB, KGEʹ_O, KGEʹ_C, and ∆KGEʹ, FB_O and CSI_O denote categorical metrics calculated before bias correction, FB_C and CSI_C denote categorical metrics calculated after bias correction, and ∆FB and ∆CSI denote the change of categorical metrics after bias correction. Analogously, IMERG_O, ERA5-Land_O, IMERG_C, ERA5-Land_C, ∆IMERG, and ∆ERA5-Land in Fig. 8 have similar meanings to those in Figs. 6 and 7 except that they stand for categorical metrics.  Table 5. Frequency distributions of precipitation events in different precipitation classes from Gauge_O and Gauge_C. ∆Gauge denotes the frequency change for gauge-observed precipitation events after bias correction, which is calculated by frequency distribution from Gauge_C minus that from Gauge_O. www.nature.com/scientificreports/ As can be seen from Table 5, after bias correction, the apparent change of the frequency distribution of gaugeobserved precipitation events in different precipitation classes occurs. In no precipitation class, the amount of precipitation events remarkably decreases. In the remaining five precipitation classes, the amounts of precipitation events both increase, especially in the light precipitation class. The obvious changes of precipitation events in no and tiny precipitation classes might be mainly caused by the change of the precipitation class of trace precipitation events. Before bias correction, trace precipitation events considered as zero precipitation are classified into no precipitation class. After bias correction, the precipitation intensity of trace precipitation events is corrected to 0.1 mm and thus trace precipitation events are classified into light precipitation events.
The change in the frequency distribution of gauge-observed precipitation events inevitably affects categorical metrics for IMERG and ERA5-Land. From Fig. 8a,b, the FBs for IMERG and ERA5-Land increase in no precipitation class but decrease in the other four precipitation classes after bias correction. This change is in accordance with the change in the frequency distribution of gauge-observed precipitation events. As a result, the magnitude of deviation and direction of deviation in the amount of the different classes of precipitation events also alter. For example, the magnitudes of overestimation both reduce for IMERG and ERA5-Land in the amount of light precipitation events, with FB_O = 0.36 and FB_C = 0.02 for IMERG and FB_O = 0.90 and FB_C = 0.43 for ERA5-Land. In the amount of extreme precipitation events, the direction of deviation for IMERG changes from overestimation to underestimation, with FB_O = 0.13 and FB_C = − 0.06.
From Fig. 8c,d, the CSIs for IMERG and ERA5-Land nearly have no change in moderate, heavy, and extreme precipitation classes with |∆CSI|≤ 0.008 which indicates negative bias in gauge data almost has little influence on the evaluation of detection ability for IMERG and ERA5-Land in these precipitation classes. In other words, IMERG has better capability in detecting moderate precipitation events, ERA5-Land has better performance in detecting extreme precipitation events, and both two products have the comparable ability in detecting heavy precipitation events. It should be noted that the performance of IMERG in detecting extreme events is better than ERA5-Land and their performances in detecting heavy events are comparable. This demonstrates that satellite precipitation products maybe more suitable than reanalysis precipitation products for flood forecast and warning due to the detection of heavy and extreme precipitation vital for flood-related applications 73 . www.nature.com/scientificreports/ Whereas, the noticeable changes of CSI in no and light precipitation events have taken place after bias correction. The CSI of IMERG decreases but that of ERA5-Land increases in no precipitation events which weakens the superiority of detection ability of IMERG over that of ERA5-Land in the class of no precipitation events. Before bias correction, IMERG has an advantage over ERA5-Land in detecting no precipitation events, with CSI_O = 0.70 for IMERG and CSI_O = 0.65 for ERA5-Land. After bias correction, the detection ability of IMERG in no precipitation events is comparable to that of ERA5-Land, with CSI_O = 0.68 both for IMERG ERA5-Land. In light precipitation events, the CSIs of two products both increase and the increasing magnitude of ERA5-Land is more significant than that of IMERG with ∆CSI = 0.04 for IMERG and ∆CSI = 0.10 for ERA5-Land.

Accuracy of IMERG and ERA5-Land for different phases of precipitation.
Last, we also assess the performance accuracy of IMERG and ERA5-Land for different precipitation phases by calculating the corresponding continuous metrics. The results are listed in Table 6. After bias correction, the RBs for rainfall, mixed precipitation, and snowfall all decrease. The magnitude of RB decline for snowfall is much greater than that for rainfall. This might result from the different wind-induced undercatch for different precipitation phases. First, under the same wind speed, gauge wind-induced undercatch of snowfall is much higher than rainfall 35,36,74 , which is reflected by the catch ratio calculating equations (Eqs. (14)- (16)). Second, in China, the high wind speed area 75 coincides with major snowfall area 20 (Xinjiang, Tibetan Plateau, and northeastern China). Thus, snowfall is usually accomplished by higher wind speed and higher wind-induced undercatch compared with rainfall. Again, the RB decline has resulted in the change of the direction and magnitude of deviation of IMERG and ERA5-Land for different precipitation phases. For instance, the direction of deviation of IMERG changes from overestimation to underestimation and the magnitude of deviation of ERA5-Land changes from apparent overestimation to slight underestimation for rainfall.
Moreover, little has changed on the accuracies of IMERG and ERA5-Land after bias correction for rainfall. However, for mixed precipitation and snowfall, the noticeable changes happen on the accuracy of two products. The KGEʹs of IMERG decrease for both two precipitation phases while those of ERA5-Land increase. It's worth noting that IMERG has better performance accuracy than ERA5-Land before bias correction but the contrary is the case after bias correction for snowfall. Besides, it can be seen from Table 6 that IMERG has poor performance in detecting snowfall and mixed precipitation compared with rainfall which might be ascribed to several factors 76 . First, snow accumulation on the ground often presents a similar passive microwave signature as the falling snow, making it difficult to separate them. Second, the non-spherical shape of ice particles and snowflakes results in much more complex radiative properties than the approximately spherically shaped rain drops. Third, the supercooled liquid water can increase the brightness temperature at high frequencies and obscure the brightness temperature depression signature caused by the scattering effect. Fourth, snowfall has a weaker scattering signature relative to rainfall which is more easily obscured by other signals (e.g., surface contamination and super cooled liquid water).

Discussion
The relationship between the change of the absolute value of RB and the change of KGEʹ after bias correction. After removing negative bias in gauge data, the RBs of IMERG and ERA5-Land both decline and the magnitude of RB decline is seemingly positive related to the magnitude of correction factor. Meanwhile, the KGEʹ of the two products also have changed. It seems like that the change of KGEʹ is related to the change of the absolute value of RB. We defined the absolute value of RB as |RB| and the change of |RB| as ∆|RB|. ∆|RB| is calculated by |RB| after bias correction (|RB_C|) minus |RB| before bias correction (|RB_O|). To test the preceding guess, we calculated ∆|RB| and ∆KGEʹ for IMERG and ERA5-Land at each weather station, year, season, climatic zone, and precipitation phase. The results are shown in Fig. 9.
It is evident from Fig. 9 that most of the points fall into the second quadrant or the fourth quadrant. They are along a diagonal line with the slope less than 0 in Fig. 9a-e. The sign of ∆|RB| is opposite to that of ∆KGEʹ. The magnitude of ∆|RB| is positively correlated with that of ∆KGEʹ for both IMERG and ERA5-Land. In other words, the improvement and deterioration of accuracy (KGEʹ) is consistent with the decreasing and increasing of magnitude of deviation (|RB|). The magnitude of change of accuracy is positively correlated with the magnitude of change of magnitude of deviation. This may be caused by the magnitude of deviation, which also can be represented by the term (β − 1) 2 in Eq. (2), has been integrated into the calculating formula of KGEʹ. formance of IMERG and ERA5-Land. The strengths and weaknesses of IMERG and ERA5-Land whether before bias correction or after bias correction are revealed, which can be summarized as follows. ERA5-Land performs better than IMERG in northeastern China and vice versa in southeastern China. On the seasonal scale, the accuracy of IMERG is better than ERA5 in summer, while the contrary is the case in autumn and winter. In different climatic zones, the performance of IMERG is superior to ERA5-Land from severe-humid to arid climatic zone but turns into inferior to ERA5-Land in the severe-arid climatic zone. In different classes of precipitation events, the detection ability of IMERG is lower than ERA5-Land for light and moderate precipitation events but higher than ERA5-Land for extreme precipitation events. In different precipitation phases, IMERG has an advantage over ERA5-Land in rain estimation. In mixed precipitation estimation, the accuracy of IMERG is worse than that of ERA5-Land. However, in snow estimation, if directly using un-corrected gauge data, the accuracy of IMERG is significantly overvalued and that of ERA5-Land is markedly undervalued, which leads to the better performance of IMERG than ERA5-Land before bias correction but the contrary case after bias correction. The results above maybe give us insight into the strengths and weaknesses of other satellite and reanalysis precipitation products and assist us in selecting appropriate satellite and reanalysis precipitation products for different applications. In addition, the complementary strengths and weaknesses of IMERG and ERA5-Land enlighten us that satellite precipitation, reanalysis precipitation, and gauge precipitation may be able to be merged to generate a more accurate precipitation dataset for relevant hydrological applications.
Limitations and uncertainties. The uneven distribution of weather stations may cause statistical errors when using gauge data in this study. In western China, rain gauges are scarce and unevenly distributed, especially in the western part of Tibet Plateau where rain gauges are few and even missing. Moreover, in the mountains of western China, precipitation is mainly affected by orographic uplift. Due to the lack of adequate rain gauges there, precipitation may be underestimated to some extent. In this study, we adopted a pixel-to-point strategy with the assumption that the point-scale precipitation observation is equal to the grid-scale precipitation estimates. This assumption might be justified in flat areas with a relative uniform precipitation pattern, while might not hold the regions with complex topography where precipitation has highly spatial heterogeneity. The bias correction method may have other uncertainties. The correction for wind-induced undercatch is based on the relationship between the catch ratio of CSPG and daily mean wind speed at 10 m height. However, wind speed can vary during the 24-h period of a day, and the daily mean wind speed applied to correct windinduced undercatch in some cases may not represent the simultaneous wind speed at the time of the occurrence of precipitation. Hourly data are preferred, but high-quality hourly precipitation and wind speed are unavailable at present. The gauge intercomparison experiments using hourly data should be carried out and new correction methods are expected to be developed on the premise of available hourly data in the future. It has been suggested that the wind speed at gauge level should be used to establish equations for catch ratios. Wind speed at gauge level can be estimated from the wind speed at standard height 77 . Reducing wind at standard height to the gauge level www.nature.com/scientificreports/ should consider gauge exposure 64 . Gauge exposure depends on the average vertical angle of obstacles around the gauge. The station metadata about gauge exposure were not available, so gauge exposure was not accounted for in estimating wind speed at gauge height. This may introduce some uncertainties in the estimation of gauge catch ratios. Unlike the works of Li et al. 32 , Ye et al. 36 , and Zhang et al. 58 which used two air temperature thresholds (− 2 °C and + 2 °C) to discriminate precipitation phase, this paper adopted a more accurate parameterization scheme developed by Ding et al. 60 that accepts wet-bulb temperature, relative humidity, and elevation instead of air temperature as inputs. However, the employed scheme doesn't consider the temperature profile and other geophysical variables. Sims and Liu 61 found surface skin temperature and vertical temperature lapse rate also affect the precipitation phase. The calculation of vertical temperature lapse rate needs temperature profile information, which can be obtained by observations from radiosondes and balloons. However, these data are unavailable in the sites of weather stations used in this study. On the premise of relevant data available in the future, it is expected to develop a parameterization scheme with considering the temperature profile and other geophysical variables over China in addition to the variables used by the adopted parameterization scheme in this study.

Conclusion
This study has investigated the influence of systematic errors in gauge data on the evaluation of IMERG and ERA5-Land over China. With the removed negative bias in gauge data, the updated dataset of the gauge is used as a reference for the performance evaluation of IMERG and ERA5-Land. Comparing the evaluation results with those using original gauge data as a reference, some significant changes are found as follows: (1) The relative biases of IMERG and ERA5-Land decrease and vary in different spatial locations, years, seasons, climatic zones, and precipitation phases, which results in the change of magnitude of deviation and the reverse of the direction of deviation, i.e., the magnitude of overestimation decreases, the magnitude of underestimation increases, and the direction of deviation changes from overestimation to underestimation. (2) The frequency biases of IMERG and ERA5-Land rise in no precipitation events and decline in light, moderate, heavy, and extreme precipitation events. This is caused by the amount of gauge-observed no precipitation events significantly decreasing and the amounts of gauge-observed other four classes of precipitation events significantly increasing. (3) The change of the accuracy of two products occurs and also varies with different locations, years, seasons, climatic zones, and precipitation phases. The improvement (deterioration) of accuracy is accompanied by the decreasing (increasing) magnitude of deviation. Moreover, the magnitude of change of accuracy is positively correlated with the magnitude of change of magnitude of deviation. (4) The detection ability of two products is noticeably affected in no and light precipitation events. In no precipitation events, IMERG has better detection ability than ERA5-Land before bias correction but the detection abilities of two products become comparable after bias correction. In detecting light precipitation events, ERA5-Land has an advantage over IMERG before bias correction and the detection ability gap between them becomes bigger after bias correction.

Data availability
IMERG data can be downloaded from GES DISC (https:// disc. sci. gsfc. nasa. gov/). ERA5-Land data are available from the ECMWF, please visit: https:// cds. clima te. coper nicus. eu. Due to the strict security requirements from CMA, gauge data and other ground observation data used in this study are proprietary or confidential in nature. If someone wants to request these data, he or she should contact National Meteorological Science Data Center via email datacenter@cma.gov.cn.