A multiscale dataset for understanding complex eco-hydrological processes in a heterogeneous oasis system

We introduce a multiscale dataset obtained from the Heihe Watershed Allied Telemetry Experimental Research (HiWATER) in an oasis-desert area in 2012. Upscaling of eco-hydrological processes on a heterogeneous surface is a grand challenge. Progress in this field is hindered by the poor availability of multiscale observations. HiWATER is an experiment designed to address this challenge through instrumentation on hierarchically nested scales to obtain multiscale and multidisciplinary data. The HiWATER observation system consists of a flux observation matrix of eddy covariance towers, large aperture scintillometers, and automatic meteorological stations; an eco-hydrological sensor network of soil moisture and leaf area index; hyper-resolution airborne remote sensing using LiDAR, imaging spectrometer, multi-angle thermal imager, and L-band microwave radiometer; and synchronous ground measurements of vegetation dynamics and photosynthesis processes. All observational data were carefully quality controlled throughout sensor calibration, data collection, data processing, and dataset generation. The data are freely available at figshare and the Cold and Arid Regions Science Data Centre. The data should be useful for elucidating multiscale eco-hydrological processes and developing upscaling methods.


Background & Summary
The modelling and observation of land-surface system processes must address the scaling issue, a complex problem intertwined with nonlinearity and heterogeneity. Scaling is challenging within all branches of land-surface science, including hydrology 1,2, ecology 3, soil science 4, and boundary layer meteorology 5, and becomes increasingly prominent with advancements in these fields. Multiscale observations are urgently needed to further improve our understanding of the scaling issue and to validate scaling transformation methods.
However, multi-scale and multidisciplinary data in land-surface science were scarce until the mid-2000s 6. Since then, data availability has improved and benefited from state-of-the-art in situ and remote sensing observations and data acquisition techniques. Moreover, multi-scale land surface observation experiments have been implemented globally within the last decade [7][8][9]. These experiments have provided a promising method for bridging knowledge gaps among microscopic-, mesoscopic- and macroscopic-scale understanding.
The Heihe Watershed Allied Telemetry Experimental Research (HiWATER) project is an example of such experiments 10 . HiWATER is a simultaneous airborne, satellite-borne and ground-based ecohydrological experiment designed from an interdisciplinary perspective. This project was initialized within the framework of the Integrated Research of the Eco-hydrological Processes of the Heihe River Basin (HRB), which is a major research program supported by the National Natural Science Foundation of China 11,12 . This program addresses scaling issues associated with eco-hydrological processes through process study, modelling, and observation. HiWATER focuses on obtaining multi-scale observation data to support the scaling studies of this major research program. In HiWATER, scaling is considered a grand challenge in two aspects. (1) Upscaling in situ observations to a scale of approximately 1 kilometre, which is consistent with medium spatial resolution remote sensing as well as river basin-scale eco-hydrological models. Then, the upscaled ground truth can be used to validate remote sensing products and model results on heterogeneous land surfaces and quantify the uncertainty associated with heterogeneity 13 .
(2) Using the multiscale data to understand key eco-hydrological processes across multiple scales, including leaf, individual plant, community, landscape, watershed, and river basin. Therefore, preconditions include obtaining multi-scale observations with sufficiently high spatial and temporal resolution and providing data for disciplines such as hydrology, ecology, and boundary layer meteorology.
HiWATER was implemented in the HRB, the second largest inland river basin in China, which has diverse landscapes, environmental extremes (mountain cryosphere and arid environments), and conflicting interests (economic development and ecosystem restoration). Additionally, the HRB is an experimental river basin that has been used for hydrological, ecological, and integrated studies for over 30 years 11. HiWATER ran from 2012 to 2016. Several intensive observation periods (IOPs) and continuous hydrometeorological observations were carried out during HiWATER. Only data collected during the IOP in 2012 are presented in this paper. The 2012 IOP was implemented during the growing season from April to September in an oasis with surrounding deserts located in the midstream area of the HRB 10.
HiWATER 2012 IOP datasets were released after careful quality control throughout sensor calibration, observation, data collection, data processing, and dataset generation. The datasets have been made available to the scientific community through figshare. The datasets can also be downloaded from the Cold and Arid Regions Science Data Centre at Lanzhou (CARD), a member of the World Data System. Metadata are available in both English and Chinese, with the digital object identifier (doi) and data citation attached to each dataset.

Experimental design and data acquisition
The 2012 IOP of HiWATER occurred in an oasis and surrounding deserts located in the midstream area of the HRB. Complex energy and water exchanges exist on the river basin scale between oases and surrounding deserts, which differ sharply in landscape as well as in hydrological and thermal conditions [14][15][16]. The widely distributed farmland shelterbelts and irrigation scheduling within the oasis can result in small-scale kinetic and thermal heterogeneities, respectively. Observing land-surface variables at only a limited number of sites cannot capture the heterogeneities of the abovementioned processes. Hence, multiple spatial scales must be fully covered to understand the complex eco-hydrological processes within the system.
Instruments for the oasis-desert system were arranged in hierarchically nested scales to capture multiscale eco-hydrological processes. We established a sparse network to investigate the oasis-desert interaction. One superstation was constructed within the oasis, and four eddy covariance (EC) towers and four two-layer automatic meteorological stations (AMSs) were installed in different landscapes surrounding the oasis, including sandy desert, desert pavement, desert steppe, and wetland. All components of surface energy and water balances and associated near-surface atmospheric states were measured to capture the heterogeneity of the water and energy cycle in the oasis-desert system (Fig. 1). Additionally, several airborne remote sensing missions covered this area.
Intensive observations were implemented at the irrigation district scale. This foci experimental area (FEA) spans approximately 5.5 × 5.5 km² and is a fragmented landscape occupied primarily by seed corn. Other crops or land use types include vegetable, orchard, and shelterbelt. Precipitation in this area is low, approximately 150 mm yr⁻¹, and irrigation water is withdrawn from streamflow and groundwater. The FEA was equipped with a flux observation matrix of 17 EC towers and AMSs; 4 large aperture scintillometer (LAS) pairs 17,18; and an eco-hydrological sensor network of soil moisture with 180 sensor nodes 19 and leaf area index (LAI) with 42 sensor nodes 20. Other in situ observations include stable isotope measurements of evapotranspiration (ET) 21, Cosmic-Ray probe soil moisture (COSMOS) measurements, sap flow, irrigation water, photosynthesis, soil respiration, stomatal conductance, vegetation dynamics (LAI, fraction of photosynthetically active radiation (fpar), vegetation coverage, vegetation/crop type, vegetation height, and phenology), emissivity, reflectance, atmospheric profiles of humidity and temperature, and aerosol optical depth 22. Additionally, soil samples were collected, and soil properties such as texture and thermal and hydraulic parameters were analysed in a laboratory. A total of 12 airborne remote sensing missions were conducted to cover the FEA using LiDAR, an imaging spectrometer, a multi-angle thermal imager, and an L-band microwave radiometer. Calibration and validation of airborne remote sensing were completed using the abovementioned ground observations and supplemented through intensive tasks designed to measure target variables on the ground. The instrumentation listed above is illustrated in Fig. 1. The sensors used in HiWATER are summarized in Table 1 (available online only), and the airborne remote sensors are detailed in Table 2.
Satellite remote sensing data at different resolutions and from various satellite sensors were acquired through data sharing programmes and commercial purchases. The satellite data were archived with other HiWATER datasets.
Spatial scale is explicitly considered for all observations. The matrix of EC towers, AMSs, and LASs was designed to fully encompass all landscapes in the oasis and to form true multi-scale observations. Observational footprints overlapped with landscape to kilometre scales, and the measurement foci included ET, sensible heat flux, radiation fluxes, and soil heat flux to close the energy balance. The soil moisture and LAI sensor network was designed to form an unbiased estimation from sub-metre to kilometre scales using geostatistical model-based sampling methods 23. The design principles of the sampling included best linear unbiased estimation, multi-scale variation acquisition, cost effectiveness, and implementation feasibility. Ground-based observations of vegetation dynamics, photosynthesis, and soil respiration were completed at individual plant, leaf, and stomatal scales. These observations were designed with considerations of sampling different crops and scaling up to a resolution of approximately 1 kilometre so that the scaled values could be compared with satellite remote sensing products and used in river basin eco-hydrological modelling. Airborne remote sensing was designed to bridge the scale gap between in situ and satellite remote sensing 24. All airborne sensors' resolutions were at least one order of magnitude higher than those of satellite remote sensing. Therefore, high-resolution products, such as digital elevation model, land cover map, albedo, LAI, crop height, land surface temperature (LST), and soil moisture, could be derived to reveal landscape and thermal heterogeneities.
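The flux observation matrix described above measures the terms needed to close the surface energy balance. As a minimal sketch, and not the authors' processing chain, closure can be checked with the commonly used energy balance ratio, sum(H + LE) / sum(Rn − G), where H is sensible heat flux, LE is latent heat flux, Rn is net radiation, and G is soil heat flux. The function name, argument names, and sample values below are hypothetical and serve only as an illustration.

```python
import math


def energy_balance_ratio(h, le, rn, g):
    """Return sum(H + LE) / sum(Rn - G) over complete records.

    All inputs are equal-length sequences of fluxes in W m-2.
    A NaN in any of the four terms marks a gap; that record is skipped.
    """
    turbulent = 0.0  # accumulated H + LE
    available = 0.0  # accumulated Rn - G
    for h_i, le_i, rn_i, g_i in zip(h, le, rn, g):
        if any(math.isnan(v) for v in (h_i, le_i, rn_i, g_i)):
            continue  # skip records with any missing term
        turbulent += h_i + le_i
        available += rn_i - g_i
    if available == 0.0:
        raise ValueError("available energy sums to zero; ratio is undefined")
    return turbulent / available


# Illustrative values only (made up, not HiWATER data).
ratio = energy_balance_ratio(
    h=[100.0, 50.0, float("nan")],
    le=[200.0, 150.0, 120.0],
    rn=[400.0, 300.0, 250.0],
    g=[50.0, 50.0, 40.0],
)
```

A ratio close to 1 indicates that the turbulent fluxes account for nearly all of the available energy; the third record is ignored here because its sensible heat flux is missing.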
The use of multi-scale observations following multiple approaches and using different instruments was a key consideration in the experiment design. On the ground, this strategy focused on ET and soil moisture. Stable isotope and sap flow methods were used to fill the gap in measuring ET processes at stomatal, leaf, individual plant, and metre scales and to separate evaporation and transpiration. The soil moisture sensor network is a nested and cross-scale observational approach because point measurements will be upscaled to gridded data with resolutions from 30 to 1,000 metres. Furthermore, the soil moisture sensor network was supplemented by AMSs and flux towers with soil moisture profile measurements down to depths of approximately 1.0 to 1.6 metres, COSMOS with a 350-metre radius, ground-penetrating radar, and manual observations of soil moisture at a fine scale. Additionally, optical and microwave sensors were used concurrently in some airborne missions, and multi-resolution, multi-angular, and multi-source airborne data were obtained by flying at different heights. These data are particularly useful in developing and validating upscaling methods. The temporal resolution of automatic systems, such as sensor networks and AMSs, was as fine as one minute. Typically, sampling intervals were 10 to 30 min, so the temporal resolution was sufficiently high to capture temporal dynamics and analyse temporal stationarity.
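To make the idea of upscaling point measurements to gridded data concrete, the following sketch averages sensor-node values within square grid cells of a chosen resolution. This is deliberately the simplest possible approach (an arithmetic block mean); it is not the geostatistical upscaling used in HiWATER, and the function name, coordinate convention, and sample values are assumptions made for illustration.

```python
import math


def grid_average(points, cell_size, extent):
    """Average point values inside each square grid cell.

    points    : iterable of (x, y, value), with x and y in metres
    cell_size : grid resolution in metres (for example 30 or 1000)
    extent    : (xmin, ymin, xmax, ymax) in metres
    Returns a list of rows (row 0 starts at ymin); empty cells hold NaN.
    """
    xmin, ymin, xmax, ymax = extent
    ncols = math.ceil((xmax - xmin) / cell_size)
    nrows = math.ceil((ymax - ymin) / cell_size)
    total = [[0.0] * ncols for _ in range(nrows)]
    count = [[0] * ncols for _ in range(nrows)]
    for x, y, value in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # ignore nodes outside the grid extent
        # Points on the upper edge are assigned to the last cell.
        col = min(int((x - xmin) // cell_size), ncols - 1)
        row = min(int((y - ymin) // cell_size), nrows - 1)
        total[row][col] += value
        count[row][col] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else math.nan
         for c in range(ncols)]
        for r in range(nrows)
    ]


# Hypothetical soil moisture nodes: (x in m, y in m, value in m3 m-3).
nodes = [(10.0, 10.0, 0.2), (20.0, 20.0, 0.4), (1500.0, 1500.0, 0.3),
         (5000.0, 5000.0, 0.9)]
grid_1km = grid_average(nodes, 1000.0, (0.0, 0.0, 2000.0, 2000.0))
grid_2km = grid_average(nodes, 2000.0, (0.0, 0.0, 2000.0, 2000.0))
```

Running the same function at several cell sizes gives a nested set of grids from the same nodes. A block mean ignores spatial correlation and uneven node density, which is why model-based methods are preferable for real analyses.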

Data quality control
Data quality control is a continuous process in HiWATER. A series of quality control measures were undertaken before, during, and after field observation, data processing, dataset generation, and data release, through the following procedures (Fig. 2).
(1) Experiment preparation period. Observation instrument operating specifications were formulated, and observers were trained. In addition, instrument selection, alignment and calibration were completed to ensure appropriate implementation of operating specifications as well as the accuracy and consistency of the observation instruments.
(2) Experimental IOPs. The integrity of observation information and data quality were achieved through the implementation of operation specifications, technical inspections, instrument alignment and calibration, and maintenance of detailed experimental procedure records and an experimental logbook, among other measures.
(3) Data collection. Data accuracy and integrity were achieved through data integrity checks, quality self-examination, and standardized data file naming.
(4) Data processing. Standard data processing procedures were performed. A thematic data processing group was established and was dedicated to data processing. Organized discussions and studies were employed to assess difficulties and problems associated with the processing of key datasets.
(5) Writing and reviews of the metadata. Many measures were adopted during this period, including metadata and raw observation data consistency checks, supplementation of standardized descriptive data information, data integrity and accessibility checks, missing data supplementation during collection and digitization, and invalid data investigation. After the collation of the metadata, numerous peer-review cycles were implemented to improve the metadata quality.
(6) Expert review. Peer-review methods were used to ensure HiWATER dataset quality. First, experts of thematic experimental observations, such as flux observation matrix, sensor network observations, and airborne remote sensing, completed internal quality reviews and crosschecks. Second, experts in the relevant fields performed data quality analyses, including data checks, data availability suggestions and overall data quality evaluations.
(7) Data user appraisal stage. After dataset release, data issues were promptly corrected according to the advice given by data users in order to improve data quality and services.

Data Records
HiWATER 2012 IOP data were organized according to the flux observation matrix, the eco-hydrological sensor network, other ground-based observations, and airborne missions and airborne remote sensing products. A total of 102 datasets were generated formally. Dataset quantities, sizes, and formats are summarized in Table 3. Detailed information, including title, observation variable, location, observation time, sensor or instrument used, quality control, spatial and temporal scales, and the doi of each dataset, is provided in Table 1 (available online only). High-resolution satellite remote sensing data from VNIR, thermal infrared (TIR), synthetic aperture radar (SAR), and LiDAR sensors were obtained via data sharing programmes and limited commercial purchases. Additionally, we archived the satellite remote sensing data in the HiWATER data repository (Table 4). However, the copyrights of these satellite remote sensing data belong to the original data providers, so we cannot release them as HiWATER datasets but can offer them to users in an offline mode. The data obtained from the flux observation network, ecohydrological sensor network, and other ground-based observations are publicly and freely downloadable from figshare (Data Citation 1) and the HiWATER data repository in the CARD (Data Citation 2-85), in which more detailed information including data citation, related publications, background introduction, and relationship with other datasets is available. As for the airborne remote sensing data, the L-band microwave radiometer data and the soil moisture data products derived from these data are also available online for users' direct download (Data Citation 103). In total, 85 HiWATER IOP datasets are fully and freely downloadable at figshare as a whole dataset and at the CARD as individual datasets. 
However, in accordance with the laws and regulations in China, the hyper-resolution remote sensing data, including those from LiDAR, imaging spectrometer, and multi-angle thermal imager, cannot be placed online. Therefore, these datasets (Data Citation 86-102) are offered in an offline mode at the CARD. Users can submit a data application form online via the HiWATER data system. Once the application is approved, the data will be sent to the user.
Additionally, special navigation web pages were built on CARD to browse, navigate, search, and download HiWATER data (http://card.westgis.ac.cn/hiwater) (Fig. 3). Users can find the datasets via a keyword search, classified navigation or theme-based exploration (e.g., by timeline, map, author, or thumbnail), which are offered by the metadata database. The ISO 19115 geographic metadata standard was used to describe the HiWATER data. All metadata are available in both English and Chinese. Additionally, doi and data citation information are attached to each data record. The unique doi of a dataset will lead the user to a web page that provides a detailed data description and a data download URL for the individual dataset. The data are redistributed by a File Transfer Protocol (FTP) server with an auto-generated FTP account.

Technical Validation
Sensor calibration, measurement validation and other quality control measures are prerequisites to ensuring data quality in HiWATER. We describe the quality control measures in the Methods section. Sensor and instrument calibration was conducted as follows.
The calibration of EC, AMS, and LAS systems is summarized in data records 1-50 in Table 1 (available online only). The 20 EC system sets, 7 LAS sets, and 18 radiometer sets used in the experiment were compared over a flat Gobi desert surface during May 14-24, 2012, prior to the 2012 HiWATER IOP. The results indicate that all ECs, LASs and radiometers were consistent. Compared to the reference instrument, the average root-mean-square error (RMSE) and mean relative error (MRE) of the EC measurements were 13.00 W m⁻² and −2.02% for sensible heat flux and 4.47 W m⁻² and 0.11% for latent heat flux, respectively. The average RMSE and MRE values for the LAS were 10.26 W m⁻² and 5.48%, respectively. The RMSE and MRE for net radiation were 10.38 W m⁻² and 1.24%, respectively. The EC and LAS measurements were consistent with a regression slope of less than 8%, which indicated reliability during HiWATER. The comparison results were consistent with or better than previous comparison results from international experiments 18. Additionally, the sensors of wind speed, air temperature and humidity profiles at the superstation and the soil temperature and moisture profile sensors at each site were intercompared prior to the 2012 IOP.
The calibrations of soil moisture, LST, and LAI sensors used in the sensor network are described in data records 51-54 in Table 1 (available online only). The sensor network employed a large number of different sensors. Soil moisture sensors included 200 SPADE and 150 Hydra Probe II. For reliability and efficiency, the accuracy and consistency of each sensor were calibrated using the two-point calibration method with desert sand and saturated soil samples as dry and wet points, respectively. Then, the oven-drying method was used to evaluate measurement accuracy. The calibration results indicated that the consistency between the same type of sensors is greater than 95%.
The accuracies of soil moisture for SPADE and Hydra Probe II are 0.032 and 0.011 m³ m⁻³, respectively. The LST sensor, SI-111, was calibrated using the BDB blackbody calibrator at a constant temperature of 23°C and a water-ice mixture at 0°C. The accuracy of the LST measured by the SI-111 sensor was within 0.15°C 19. The LAI instrument used in the WSN, LAINet, was compared with LAI-2000, a commercial instrument used to measure LAI. Consistency was relatively high at LAI < 3.5. However, LAINet could capture the dynamics at LAI > 3.5, whereas the LAI-2000 measurements were saturated, indicating an improved accuracy of the LAINet measurements over those of LAI-2000 20.
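The two-point calibration of the soil moisture sensors mentioned above can be illustrated with a short sketch. The text does not give the calibration equation, so a linear sensor response between the dry point (desert sand) and the wet point (saturated soil) is assumed here; the function names and all numerical values are hypothetical.

```python
def two_point_calibration(raw_dry, raw_wet, ref_dry, ref_wet):
    """Derive a linear correction from two reference points.

    raw_dry, raw_wet : sensor readings in the dry and wet samples
    ref_dry, ref_wet : reference water contents of those samples (m3 m-3)
    Returns (gain, offset) such that corrected = gain * raw + offset.
    Assumes a linear sensor response between the two points.
    """
    if raw_wet == raw_dry:
        raise ValueError("dry and wet readings must differ")
    gain = (ref_wet - ref_dry) / (raw_wet - raw_dry)
    offset = ref_dry - gain * raw_dry
    return gain, offset


def apply_calibration(raw, gain, offset):
    """Convert a raw sensor reading to a calibrated water content."""
    return gain * raw + offset


# Hypothetical readings for one sensor (not HiWATER values).
gain, offset = two_point_calibration(
    raw_dry=0.05, raw_wet=0.50, ref_dry=0.02, ref_wet=0.45
)
corrected = apply_calibration(0.30, gain, offset)
```

By construction, the corrected reading reproduces the reference values at both calibration points. Accuracy between the two points still has to be checked against an independent reference, which is the role the oven-drying method plays in the text.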
Many instruments were used in other ground measurements. Most instruments were calibrated using absolute and cross-calibration strategies (data records . The field spectrometers, including ASD and Spectra Vista Corp, were cross-calibrated with each other. The black board and old white board were calibrated against a new standard white board. The GPS radiosondes, including Changfeng and Vaisala, were also cross-calibrated using a conventional radiosonde. The results indicated close agreement among different radiosonde measurements. Two CE-318 sun photometers used in HiWATER were cross-calibrated on June 15-16, 2012. The Scintec Flat Array Sodar, which was used to measure wind direction, wind speed, and disturbance characteristics in the lower atmosphere, was cross-calibrated with the wind profile data obtained at the Daman Superstation, a 40-metre boundary layer tower. The results indicated close agreement between these two types of wind profile measurements. Self-recording point thermometers and handheld infrared thermometers were used to measure LST in HiWATER. All sensors were absolutely calibrated at constant temperatures from 0 to 60°C with a 5°C interval. Calibration was repeated five times for each temperature. The calibration experiments indicate that the temperature accuracies of a majority of the sensors are better than 1 K.
Airborne remote sensing instrument calibration and data validation are summarized in data records 77-102 in Table 1 (available online only). The radiometric parameter of the VIS/NIR sensor was calibrated in the calibration laboratory of the Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, using an integrating sphere developed by the Labsphere Corporation as the light source. Additionally, the wavelength was calibrated using a monochromator. An EO-1 blackbody was used to calibrate the radiometric uniformity and temperature of the thermal infrared sensor.
The geometric correction of the frame sensor was performed using a specifically designed three-dimensional target. A bundle-adjustment procedure was completed for LiDAR calibration to characterize linear spatial displacements between the IMU and the sensing array. The L-band microwave radiometer, PLMR, was calibrated using the two-end calibration method prior to and following mounting on the aircraft in each flight. The warm-end calibration adopted a closed blackbody box with the environmental temperature measured using 16 thermal sensors, whereas the cold end was calibrated by measuring the sky brightness temperature (Tb). The largest reservoir in the study area was selected as a calibration reference. The temperature of the top water layer was measured every minute in the experimental period. The Tbs over the reservoir were measured during each flight mission. Therefore, the measured Tb over the water body can be compared to the Tb calculated by the radiative transfer model of water. The two-end calibration and water body reference indicated that the accuracy of measurements at a small incidence angle was superior to that at a large incidence angle, and the average accuracy was better than 1.0 K for both the vertical and horizontal polarizations. Caution should be taken when using Tb data because the radio frequency interference contamination was sometimes higher than expected at v-polarization.
Airborne and satellite remote sensing products were quantitatively validated using simultaneous in situ observations, with a particular focus on upscaling point- and footprint-scale observations to the pixel scale 22,25. The overall quality of remote sensing data products was evaluated based on the accuracy and uncertainty, and this information was made available in the metadata of the data products. A high-quality remote sensing data product was released only when its accuracy was higher than the required standard threshold. Otherwise, the algorithm was improved and then re-executed for product generation until a satisfactory accuracy was reached.
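The accuracy statistics used throughout this section, and the release criterion described above, can be sketched as follows. The exact formulas are not given in the text, so two assumptions are made: RMSE is computed against the reference measurements in the usual way, and MRE is taken as the signed mean of (estimate − reference) / reference expressed in percent, which is consistent with the negative MRE values reported for the EC intercomparison. The threshold value and all function names are hypothetical.

```python
import math


def rmse(estimate, reference):
    """Root-mean-square error between paired estimates and references."""
    if len(estimate) != len(reference) or not estimate:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimate, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mean_relative_error(estimate, reference):
    """Signed mean relative error in percent (assumed definition)."""
    if len(estimate) != len(reference) or not estimate:
        raise ValueError("inputs must be non-empty and of equal length")
    relative = [(e - r) / r for e, r in zip(estimate, reference)]
    return 100.0 * sum(relative) / len(relative)


def passes_accuracy_gate(estimate, reference, max_rmse):
    """Return True when the product meets a hypothetical RMSE threshold."""
    return rmse(estimate, reference) <= max_rmse


# Illustrative pixel-scale values (made up, not HiWATER data).
product = [110.0, 190.0]
ground_truth = [100.0, 200.0]
error = rmse(product, ground_truth)
bias_percent = mean_relative_error(product, ground_truth)
release = passes_accuracy_gate(product, ground_truth, max_rmse=12.0)
```

In this arrangement, a product that fails the gate would be sent back for algorithm improvement and revalidated, mirroring the iterative procedure described in the text.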