Hourly 5-km surface total and diffuse solar radiation in China, 2007–2018

Surface solar radiation is an indispensable parameter for numerical models, and the diffuse component contributes to the carbon uptake in ecosystems. We generated a 12-year (2007–2018) hourly dataset from Multi-functional Transport Satellite (MTSAT) satellite observations, including surface total solar radiation (Rs) and diffuse radiation (Rdif), with 5-km spatial resolution through deep learning techniques. The used deep network tacks the integration of spatial pattern and the simulation of complex radiation transfer by combining convolutional neural network and multi-layer perceptron. Validation against ground measurements shows the correlation coefficient, mean bias error and root mean square error are 0.94, 2.48 W/m2 and 89.75 W/m2 for hourly Rs and 0.85, 8.63 W/m2 and 66.14 W/m2 for hourly Rdif, respectively. The correlation coefficient of Rs and Rdif increases to 0.94 (0.96) and 0.89 (0.92) at daily (monthly) scales, respectively. The spatially continuous hourly maps accurately reflect regional differences and restore the diurnal cycles of solar radiation at fine resolution. This dataset can be valuable for studies on regional climate changes, terrestrial ecosystem simulations and photovoltaic applications.

include two categories: constructing empirical relationships between top of atmosphere and surface radiative fluxes 19,20 , and driving radiative transfer models by utilizing satellite-derived atmospheric parameters 1,21 .
Several global R s datasets have been generated through satellite retrievals. For instance, the Global LAnd Surface Satellite (GLASS) 22 provides global 5-km resolution, 3-h interval R s ; Tang et al. 23 produced a 16-year dataset (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) of high-resolution (3 h, 10 km) global R s . Nevertheless, none of them provide estimate of R dif . In addition, large uncertainties frequently occur under broken clouds due to the neglect of adjacency effect in their pixel-based retrieval schemes [24][25][26] that depend on an assumption of plane-parallel homogeneous clouds. However, this assumption does not always hold. For example, in the presence of broken clouds, multiple reflections and scattering events off the sides of clouds lead to significant photon transport [27][28][29] , which makes great difference at fine scales where R s of an individual footprint under inhomogeneous clouds is relevant to multiple adjacent satellite pixels 24 . Therefore, area-to-point retrievals seem the optimal solutions, i.e., adjacent signals within a certain extent are involved for radiation estimation.
The notable progress of deep learning in modelling spatial context opens new perspectives 30 . Convolutional neural networks (CNN) have been widely utilized to extract spatial features from satellite images for definition and classification of extreme situations, for instance, storms, spiral hurricanes, and atmospheric rivers 31 . Thus, it is feasible to capture the spatial distribution of clouds/aerosols through CNNs for handling spatial adjacent effects caused by photon transport. In our previous work, a deep network consisting of CNN module and multi-layer perceptron (MLP) has been developed for R s estimation for the first time 32 , and achieved breakthrough of data accuracy at hourly scale. In this study, we further extend the previous network to fit the requirements of R dif estimation through transfer learning, and then use the newly trained network and previous one to generate high-resolution (hourly, 5 km) R s and R dif time series data in China. The final published dataset 33 includes R s and R dif at hourly, daily and monthly scales from 2007 to 2018. This unique data source are useful for analysis of regional characteristics and temporal cycles of solar radiation at fine scales, as well as radiation-related applications or scientific researches particularly climate changes and utilization of renewable solar energy.

Methods
Basic data. To train the proposed deep network, training samples should be prepared at first. The output corresponds to ground measurements of R s or R dif . The inputs include satellite image blocks and associated attributes of time (month, day, and hour) and location (latitude, longitude and altitude). Hourly R s and R dif measurements are available from China Meteorological Administration (CMA) (http://data.cma.cn/ last accessed: 11 Jan. 2020). The used hourly records involve 98 radiation stations and cover a period from 1 Jan. 2007 to 31 Dec. 2008. The data in 2008 were used for training of deep network while that in 2007 were for independent validation. Figure 1 shows the spatial distribution of all related stations, of which 81 sites (circles) only provide R s while the rest 17 sites (triangles) provide both R s and R dif . These stations locate in different climate zones and their background land cover types include forests, grasslands, croplands, bare lands etc., ensuring the representativeness of training samples for deep network. A simple physical threshold test 34 was adopted to exclude the spurious and erroneous measurements. In total, 0.49% of all records not passing the test were deleted and 441547 samples for R s and 55096 samples for R dif were retained for subsequent experiments. Besides, daily and monthly records of 98 radiation stations from 2007 to 2014 were used for validation of time-series products. Their quality was controlled based on the reconstructed daily and monthly integrated R s data 35 .
Finally, altitude of each pixel should be determined thus DEM data are required. DEM data are from Shuttle Radar Topography Mission that generates the most complete high-resolution digital topographic database of the Earth, covering over 80% of the Earth's land surface between 60°N and 56°S. The data can be obtained from the website http://srtm.csi.cgiar.org/srtmdata/ (last accessed: 11 Jan. 2020). The original DEM data with data points posted approximately 30 m were resampled to grids with 0.05° resolution. DEM data provide elevation information for gridded inputs during spatially continuous estimation. estimation of surface solar radiation. The method we used to estimate surface solar radiation is mainly based on the CNN-based deep network developed in our previous work 32 . The network is demonstrated to be effective in handling spatial adjacent effects of surface radiation and simulating complicated radiative transfer processes and to be successful in retrieving accurate estimation of R s from geostationary satellite data. In summary, the deep network consists of CNN module and MLP. CNN module takes image blocks as inputs thereby allowing identical treatment of adjacent satellite pixels, and is further stacked to construct deep residual structure to extract hierarchical features from low-level details (e.g., geometric shapes, sizes, orientations, edges and distribution) to high-level comprehensive abstract representations (e.g., intrinsic physical and optical properties of mixed clouds). Such hierarchical architecture of spatial features is a reflection of the scattering and absorption effects as well as their interactions in the atmosphere; hence, it can be considered as substitutes for input parameters in radiative transfer models to describe atmospheric state. The MLP is utilized to link extracted features of CNN and additional auxiliary information (involving the state in time and space) to target hourly R s through implicit non-linear expressions, whose parameters are learnt from pre-prepared training samples in a supervised manner. Traditional physical algorithms retrieve surface radiation from satellite signals through various radiative transfer models or their simplified versions, where geometric/atmospheric conditions and aerosol types should be strictly defined, complex atmospheric processes need to be precisely simulated, and clear-sky and cloudy retrieval modes are independently developed. In contrast, all-sky situations are under a unified framework in our CNN-based algorithm and tedious intermediate simulations are avoided. Another advantage is that the deep network is capable of handling spatial adjacent effects of surface radiation, in other words, considering the influence of neighbouring pixels on radiation estimation of the central point. More details on the network structure and the spatial adjacent effects can refer to refs. 32,36 .
The graphical structure of the proposed deep network is illustrated in Fig. 2a. There are two input flows: Input1 for satellite image blocks and Input2 for additional attributes corresponding to the central point of Input1. The Output is target R s associated with the central point of Input1. More details can refer to ref. 32 . The input size for CNN is 16 × 16 pixels (~80 × 80 km on the ground) based on the recommendation that time series of satellite pixels are most correlative within an extent of approximately 60 km at hourly scale 25 and our previous experiments on the spatial scale effect of satellite-based R s estimation 36 . This setting also fits in the requirements of classical CNN structure and ensures the extraction of edge features. In addition, only visible band of satellite data is utilized for the convenience of cross-sensor applications because visible channel is available for nearly all satellite sensors. It is reasonable as visible channel provides the most proportion of information on aerosols, clouds and other atmospheric properties 20 .
In our previous experiment, an outstanding deep network for R s estimate has been obtained after continuous trial-and-error process and iterative parameter optimization. Herein, we further fine-tune the previous network for the sake of R dif estimation using new training samples consisting of ground measured R dif and corresponding satellite image block. The transfer learning was adopted to overcome the problem associated with insufficient R dif samples. The parameters for convolutional layers (Conv) were initialized from the trained R s model while that for fully-connected layers (FC) were reset to zero. Therefore, R dif samples were mainly responsible for MLP fitting. Training and tuning processes were the same as R s . In this way, the best model for R dif estimation can be obtained in short time as CNN module has mastered the rules to abstract spatial pattern from satellite image blocks. After model learning and optimization, the trained R dif model in combination with previous R s model was used to generate our radiation datasets.
Workflow of data generation. The schematic flowchart to generate our radiation datasets is illustrated in The training section concentrates on learning the underlying non-linear relationships between satellite images and measured surface radiation, and outputs two deep networks for R s and R dif estimation. The estimation section predicts spatially continuous R s and R dif data using the trained networks by feeding gridding inputs. The main procedures are numbered in Fig. 2b and described as follows: www.nature.com/scientificdata www.nature.com/scientificdata/ 1. Prepare training sets. For each ground station, a 16 × 16 neighbouring block was cut out from GAME image and matched up with quality-controlled R s and R dif record in 2008 according to time attributes. These samples were separated into three groups: R s training set (93 training sites in Fig. 1), R dif training set (12 triangle training sites in Fig. 1) and validation set (5 triangle validation sites with black cross in Fig. 1). 2. Simulate the state at the top of Mt. Everest. To guarantee a reasonable extrapolation of the deep network at high altitudes, constraints from radiative transfer model simulation at the top of Mt. Everest were mixed into the R s and R dif training set. The Santa Barbara DISORT Atmospheric Radiative Transfer (SBDART) model was adopted for the simulation 20 . 3. Initialize the deep network. The network was implemented using keras package 38 . All parameters of the network were initialized through Xavier 39,8 . The learning rate was initially 0.01 but multiplied by 0.5 across a learning plateau. 4. Train deep network for R s estimation. The Adagrad optimizer 40 was used to iteratively find the optimal weights and biases that minimize the mean-squared error between the network's predictions and the training targets. An early-stopping mechanism was utilized to relieve overfitting by relinquishing further optimization when the performance ceased to improve sufficiently. During training process, 20% of the paired samples were randomly selected to serve as a validation set to identify whether the network was overfitting. The model with the best performance was preserved for subsequent estimates. 5. Fine-tune the preserved model in 4) for R dif estimation. Similarly, the model with the best performance was preserved. More parameter configurations of step 2-5 can refer to ref. 32 .
6. Generate spatially continuous hourly estimation. Hourly gridded GAME products from 2007 to 2018 were associated with corresponding time/location attributes, and then the best models in 4) and 5) were used to simultaneously obtain R s and R dif maps by feeding gridded inputs. In addition, surface direct solar radiation (R dir ) was derived by subtracting R dif from R s . 7. Integrate daily and monthly estimates. The missing hourly value was filled by multiplying the corresponding hourly extraterrestrial radiation by the averaged clearness index calculated from available hourly estimates within the day. After that, daily values were sums of all hourly estimates within the day, and monthly values were the sum of all daily values within the corresponding month. 8. Validate radiation datasets. The spatial extensibility of deep network was evaluated using the validation set in 1) that was not involved at training phase. The accuracy of our datasets was further evaluated at hourly scale by comparing to ground measurements in 2007. Moreover, daily and monthly estimates were evaluated using station records from 2007 to 2014. Three indices were used to quantify data quality: correlation coefficient (R), mean bias error (MBE), and root-mean-squared error (RMSE) between estimates and ground measurements:¯ȳ www.nature.com/scientificdata www.nature.com/scientificdata/ where n is the total number of data samples indexed by i, y represents the measured value whose mean value is y , and ′ y is the predicted value with mean ′ y . Relative values of MBE and RMSE (rMBE and rRMSE) were also used.
Sensitivity analysis. The crucial step of this algorithm is to equip the deep network with the ability to extract abstract spatial pattern from satellite images. The representativeness and balance of training samples and the input size of satellite image blocks affect the reliability of gained pattern for R s estimation, thus the accuracy of estimated data. The 98 stations under different climates and with diverse land cover types guarantee the representativeness of R s training samples. To overcome the imbalance of samples, image blocks corresponding to high radiation values whose proportion is usually small were first rotated by 90/180/270 degrees and flipped up and down, left and right, then several copies of these samples were mixed into the full training set. The investigation of spatial scale effects in ref. 36 suggests an optimal input size of 16 × 16 pixels. Configurations of hyper-parameters were referenced to classical classification and object detection networks in computer vision, for example, the rectified linear unit (ReLU) was used as the activation function as it is effective in alleviating vanishing gradient problems and speeding up learning process; the early-stopping was adopted to prevent overfitting thus it was not necessary to control training epochs carefully. Other sensitive hyper-parameters (listed in Table 1) were determined based on a hierarchical search. To reduce the computational cost associated with the learning procedure of deep network, our experiments were conducted using a small training dataset (twelve training sites with blue triangles in Fig. 1). We first investigated different choices of the learning rate with a fixed configuration for other parameters (the first choice in the search space). After the optimal choice of learning rate (Initial value of 0.01 and multiplied by 0.5 after 10 epochs' plateau of validation loss) was determined, we continued searching for the optimizer, then the dropout rate and batch size. For learning rate, optimizer and dropout rate, the choice (the bold one in the search space) with the best validation accuracy at the five independent stations in terms of R and RMSE was finally selected. With respect to the batch size, it seems that the smaller size, the better performance but the longer time. Therefore, we chose the intermediate size of 500 for a balance between the performance and time consumption.

Data Records
All hourly, daily and monthly radiation datasets from 2007 to 2018 are freely available from the Pangaea 33 at https://doi.org/10.1594/PANGAEA.904136, through which users can link to the specific data entities of each year. The dataset for one year includes twelve folders for hourly radiation (twelve months), one folder for daily total radiation, one folder for monthly total radiation as well as other supporting documents: • Hourly radiation: twelve zipped folders named as "China_HourlyRadiation _yyyymm.h5". The hourly files are named as "RAD_yyyymmddhh.h5" and stored as int16 data type in HDF5 format in the unit of 10 −4 MJ m −2 . "yyyy", "mm", "dd", and "hh" denote year, month, day and hour (UTC time). Each file contains two variables representing R s and R dif , namely global radiation and diffuse radiation, respectively. The time coverage of hourly dataset is from 2007-01-01 0:00 to 2018-12-31 23:00 (UTC). www.nature.com/scientificdata www.nature.com/scientificdata/ • Daily and monthly radiation: Daily files are named as "RAD_yyyymmdd.h5" and monthly files are named as "RAD_yyyymm.h5" where "yyyy", "mm", and "dd" denote year, month, and day. Values are stored as floating-point data type in the unit of 10 −2 MJ m −2 . Each file contains two variables representing R s and R dif , namely daily/monthly total global radiation and daily/monthly total diffuse radiation, respectively.
The datasets provide gridded radiation estimates within 71°E-141°E and 15°N-60°N with an increment of 0.05° (about 5 km). The hourly radiation can also be expressed in unit of W/m 2 through the conversion: 0.01 MJ m −2 hour −1 = 1/0.36 W m −2 . More details and examples of data visualization can refer to the published description files in each dataset. It is stressed that all hourly data are provided in UTC time.

Technical Validation
Spatial mapping. Figure 3a shows the instantaneous atmospheric state in visible channel captured by MTSAT at UTC 6:00, 22 Jun. 2008 (BJT 14:00, 22 Jun. 2008). The estimated hourly R s and R dif are displayed in Fig. 3b,c, respectively. The influence of cloud depth, surface topography and elevation are reflected in the spatial distribution of surface radiation. Under the thick clouds (red regions in Fig. 3a), both R s and R dif are lower than surrounding areas. In contrast, with respect to regions below thin clouds (yellow regions in Fig. 3a), R s is relatively higher as more R dif is obtained on the surface. For areas under clear sky conditions (blue regions in Fig. 3a), R s is larger in high altitude areas (e.g., the Tibetan Plateau). Figure 3d-i illustrates the spatial distribution of R s , R dir and R dif at daily and monthly scales. Daily radiation on 22 June 2008 shares similar characteristics with hourly radiation, indicating a stable atmospheric state in the day. At monthly scale, regional differences are revealed thoroughly. The distribution of solar radiation exhibits obvious latitudinal dependency, but also affected by the surface topography, regional climate and distance to coastal line. In June, R s is highest on the Tibetan Plateau and lowest in the Szechwan and south China due to the significant difference of R dir . Conversely, R dif has the minimum value on the Tibetan Plateau while the maximum value locates on the North China Plain. R dir is predominant in regions with high altitudes (the Tibetan Plateau) or drought climate zones (the Mongolia Plateau) while R dif occupies the main proportion for areas with abundant rainfall or frequent cloud coverage (the middle and lower reaches of the Yangtze River, the Szechwan Basin and Guizhou). Although deep networks used for estimation are trained by samples within China, they also provide reasonable estimation in surrounding areas. For example, in June R dif contributes to the majority of surface radiation in India and Southeast Asia due to the coming rainy season.

Temporal variations.
We establish time series products to observe the temporal variations of surface solar radiation. Figure 4 shows the monthly variations of statistically averaged R s , R dir and R dif for different regions in China from 2007 to 2018. R s on the Qinghai-Tibet Plateau is the highest all the year round, benefiting from www.nature.com/scientificdata www.nature.com/scientificdata/ significantly higher altitudes, which in contrast leads to the lowest received R dif as shown in Fig. 4c. The proportion of R dif exhibits the highest in the south of China (relatively lower R s but higher R dif ) compared with other regions due to the frequent cloudy and rainy weather. A slight dimming of R s is observed in 2010, followed by the brightening from 2011 to 2015, and then by a dimming from 2016 to 2017. Howbeit the long-term trends of R dif are inconsistent with the variations of R s . For instance, neither obvious brightening nor dimming is manifested in the northwest while a decreasing tendency continues until 2015 on the Qinghai-Tibet Plateau. The fluctuation of R dir is more obvious than R dif , accounting for the overall variations of R s , because both absorption and scattering of the atmosphere lead to decrease of R dir while changes of R dif radiation result from scattering of the atmosphere alone.
Validation against ground measurements. The validation in our previous work 32 has demonstrated the outstanding performance of the hybrid deep network on estimation of R s . Herein, we evaluate the model performance for R dif estimation to check the viability of transfer learning. The evaluation process includes three stages: performance over training samples (12 triangle training sites in Fig. 1), independent spatial extensibility in 2008 (5 triangle validation sites with black cross in Fig. 1), and temporal extensibility in 2007 at all 17 stations, as shown in Fig. 5a-c. Overall, it provides good estimates for R dif at the site scale with an R of 0.88, MBE of 3.09 W/ m 2 and RMSE of 58.22 W/m 2 over training samples. The results with an R of 0.89, MBE of 9.09 W/m 2 and RMSE of 58.33 W/m 2 at five independent validation sites, and an R of 0.85, MBE of 8.63 W/m 2 and RMSE of 66.14 W/m 2 in 2007, are comparable to the training phase, revealing the powerful spatial and temporal extensibility of deep networks in estimating R dif . The positive MBE values confirm that our datasets overestimate R dif at some degree, which might attribute to relative lower measured values due to instrument drifting sensitivity and urbanization effects 41,42 . In fact, it is a challenging task to estimate R dif due to much higher demands for fully consideration of aerosols, clouds, and their interactions. Yet for all that, our estimates of R dif (Fig. 5c) outperform the widely-used ERA5 reanalysis data released by European Centre for Medium-Range Weather Forecasts (ECMWF) which has an R of 0.85, negative MBE of 43.08 W/m 2 and RMSE of 96.93 W/m 2 when evaluated at the same CMA diffuse radiation stations in 2007 42 .
Furthermore, our datasets are evaluated against ground measurements collected at 98 CMA radiation stations from 2007 to 2014 at daily mean and monthly mean scales as shown in Fig. 5d-i. Our daily results of R s at the spatial resolution of 5 km exhibit an R of 0.94, MBE of 3.61 W/m 2 and 30.65 W/m 2 . The intrinsic difference between point nature of ground measurements and areal average of gridded radiation products usually takes part of the responsibility for above deviations 24 . At a finer spatial resolution of 5 km the RMSE of our daily R s is still superior to widely-used products such as the ISCCP-FD data at 2.5° resolution with an R of 0.89 and RMSE of 68.3 W/m 2 (see Section 3.1 of ref. 36 ), the GEWEX-SRB data at 1° spatial resolution with an R of 0.91 and RMSE of 36.5 W/m 2 (see Section 4 of ref. 18 ), and recent ISCCP-HXG products at 10 km resolution with an R of 0.93 and RMSE of 32.4 W/m 2 (see Table 3 of ref. 23 ) which were also validated against observations at the CMA radiation stations. At monthly scale, the R value increases to 0.96, 0.93 and 0.92 meanwhile RMSE decreases to 17.24, 19.55 and 11.48 W/m 2 for R s , R dir and R dif , respectively, which is also remarkably better than other products (compare to Table 2 of ref. 36 ). It should be pointed out that the excellent performance at monthly scale benefits from the mutual offset of underestimation and overestimation, for instance, daily R dif shows an overestimation in the low-value part and an underestimation in the high-value part (Fig. 5f) while this does not occur for monthly R dif (Fig. 5i). Figure 6a,b shows the errors of hourly estimates grouped by local hours from 8:00 to 17:00.

Uncertainties.
All groups correlate well with the ground measurements with the lowest R being 0.96, 0.93 and 0.87 for R s , R dir and R dif , respectively, proving the good performance of deep network in hourly radiation estimation. Large www.nature.com/scientificdata www.nature.com/scientificdata/ rRMSEs are likely to appear in the morning and at night when the amounts of received surface radiation are very low. The data accuracy is acceptable with the average rRMSE lower than 20% (R s ) or 40% (R dir and R dif ). It points out that temporal deviations might result from the fact that satellite images reflect an instantaneous state of the atmosphere whereas ground measurements represent an average state within per unit time (herein one hour). When clouds move rapidly, ground stations are likely to be covered by cloud shadows during a momentary period (less than one hour) but satellite sensor may scan a clear sky because clouds have drifted across. In this case, ground measurements would be smaller than satellite-based estimates. Therefore, large positive deviations usually occur when coming across changeable clouds. A limitation of our method is that it is unable to simulate dramatic changes in short time because our trained network just takes into consideration the spatial adjacent effects of solar radiation but ignores the lag effect and cumulative effect in time series. The recurrent neural networks 43,44 that are able to model temporal dynamic behaviour are the promising solutions.
The R and rRMSE of hourly R s and R dif at each site are displayed in Fig. 6c-f, where obvious geographical differentiation is observed. Overall, our estimates correlate well with ground measurements at sites with high probability of cloud-free skies, for instance, the north and northwest China. Low R and large rRMSE are likely to occur at sites located in regions with cloudy days, such as the south and southwest China, especially the Szechwan Basin perennially covered by clouds. It is known that both dust aerosol particles in the north and northeast China and dense clouds in the south and southeast China lead to non-clear skies, but model performance is inconsistent in these areas. This phenomenon indicates that our developed deep network does well in simulating radiative effects of aerosols, but slightly bad in handling clouds. Such shortcoming is attributed to the inadequate information on cloud properties 45 . As only visible channel of MTSAT satellite is used, it is difficult for deep network to www.nature.com/scientificdata www.nature.com/scientificdata/ determine optical properties of clouds, thus their interactions with radiation. With the aid of additional channels, better retrieval under cloudy conditions might be possible, for instance, depending on the identification of ice clouds and liquid water clouds whose radiative effects are significantly different. Moreover, the gridded products from satellite data are inherently spatial domain-averaged while ground measurements focus on solar radiation in local areas. This discrepancy always leads to evaluation deviations in space 46 . When ground station is covered by shadows of cirrus clouds, the measured surface solar radiation would be lower than satellite-derived values because the footprint of satellite grid has larger spatial coverage.
With regard to R dif , the correlation between our estimates and ground measurements is worse than that of R s . Different from R s , estimates of R dif behave well in humid areas (southern China) rather than arid areas (northwest China), against our common sense that cloudy weather conditions in the southern China strongly affect the accuracy of radiation estimation. On the premise that deep network for R s estimation has proved its effectiveness in arid areas, the worse performance on R dif estimation under the same framework might be attributed to the poor data quality. Evidence comes from the fact that measurements of R dif in the western China are not in a full-automatic tracking manner but manual operations, of which the nonstandard ones often lead to measurement errors. This contradictory phenomenon also indicates that a small proportion of problematic ground measurements would not affect the performance of deep network owing to its powerful robustness.
Sampling errors. The representativeness of R dif training samples is worthy of special concern as only measurements at twelve stations are involved. To reduce the influence of insufficient samples on estimated data accuracy, we adopted the transferring learning approach to reuse the rules on how CNN extracts spatial pattern from satellite blocks that have mastered during R s estimation based on a larger dataset. We designed 7 experiments (listed in Table 2) to have an in-depth inspection of potential sampling errors associated with this approach. E1 trains the deep network using the fully R s training dataset. E2 trains the network using R s measurements at the twelve R dif training sites. E3 trains the network using R s measurements at randomly selected twelve training sites. E4 trains the network using R dif measurements at the twelve R dif training sites. E5 fine-tunes the trained network of E1 using R dif measurements at the twelve R dif training sites. The performance of the gained network in E1-E5 is validated at the same five independent sites in terms of R and RMSE on R s or R dif . E6 fine-tunes the trained network of E1 through K-fold cross-validation strategy, i.e., the 17 R dif sites were divided into 4 groups (4-4-4-5), and then 3 out of the 4 groups were used to train the network while the rest one was excluded. The training process was repeated four times for all the combinations and the R and RMSE of all predictions of the sites excluded in the four repeats were calculated to measure the performance of E6. E7 is a stress test where we used for validation only the five sites that are more humid or with higher elevation or closer to cities.
The results show that selecting densely and evenly distributed sites is the only way to improve the generalization ability of deep network (cf. E1 and E2), but it is also beneficial to make the limited sites distributed in representative areas with diverse characteristics (cf. E2 and E3, E5 and E6). Although the comparison is conducted on R s , we assume it bears valid information for R dif as well. Regardless of the small number, diffuse radiation stations cover all typical climate zones in China (Fig. 1), maximizing their spatial representation as much as possible; hence, it is rational to believe in the reliability of the trained network for R dif estimation. Compared with training a network for R dif estimation from the beginning (E4), fine-tuning the trained R s network through transferring learning (E5) makes up the limitation caused by insufficient R dif samples to a certain extent. Anyhow, the comparison between E1 and E6 demonstrates the existence of sampling errors and suggests that R dif estimation requires further attempts and efforts. The stress test (E7) gave us an idea of the maximum sampling error. Since R dif is highly influenced by humidity (function of climate and vegetation) and probably pollution and altitude, we pertinently removed sites that are more humid or with higher elevation, or closer to cities from training samples, but used them only for validation. Due to the inevitable reduction of the representativeness of training samples, the validation accuracy was lower than that of E5. These extreme cases show that the expected maximum sampling error of our R dif estimates may not exceed the worst value of E7, i.e., R of 0.584 and RMSE of 0.451 MJ/m 2 . Anyhow, such sampling errors announce the importance to collect more representative R dif measurements for improving the performance of deep network on R dif estimates.  (e,f) Spatial distribution of rRMSE for R s and R dif , respectively.