Assessing the accuracy of OpenET satellite-based evapotranspiration data to support water resource and land management applications

Volk, John M.; Huntington, Justin L.; Melton, Forrest S.; Allen, Richard; Anderson, Martha; Fisher, Joshua B.; Kilic, Ayse; Ruhoff, Anderson; Senay, Gabriel B.; Minor, Blake; Morton, Charles; Ott, Thomas; Johnson, Lee; Comini de Andrade, Bruno; Carrara, Will; Doherty, Conor T.; Dunkerly, Christian; Friedrichs, MacKenzie; Guzman, Alberto; Hain, Christopher; Halverson, Gregory; Kang, Yanghui; Knipper, Kyle; Laipelt, Leonardo; Ortega-Salazar, Samuel; Pearson, Christopher; Parrish, Gabriel E. L.; Purdy, Adam; ReVelle, Peter; Wang, Tianxin; Yang, Yun

doi:10.1038/s44221-023-00181-7

Analysis
Open access
Published: 15 January 2024

John M. Volk ORCID: orcid.org/0000-0001-9994-1545¹,
Justin L. Huntington¹,
Forrest S. Melton^2,3,
Richard Allen⁴,
Martha Anderson⁵,
Joshua B. Fisher ORCID: orcid.org/0000-0003-4734-9085⁶,
Ayse Kilic⁷,
Anderson Ruhoff ORCID: orcid.org/0000-0002-3585-2022⁸,
Gabriel B. Senay⁹,
Blake Minor¹,
Charles Morton¹,
Thomas Ott¹,
Lee Johnson ORCID: orcid.org/0000-0002-3560-634X^2,3,
Bruno Comini de Andrade⁸,
Will Carrara^2,3,
Conor T. Doherty²,
Christian Dunkerly ORCID: orcid.org/0000-0003-3592-4118¹,
MacKenzie Friedrichs ORCID: orcid.org/0000-0002-9602-321X¹⁰,
Alberto Guzman^2,3,
Christopher Hain¹¹,
Gregory Halverson¹²,
Yanghui Kang ORCID: orcid.org/0000-0001-8563-1503¹³,
Kyle Knipper ORCID: orcid.org/0000-0003-0889-8129¹⁴,
Leonardo Laipelt⁸,
Samuel Ortega-Salazar⁷,
Christopher Pearson¹,
Gabriel E. L. Parrish¹⁵,
Adam Purdy^2,3,
Peter ReVelle⁷,
Tianxin Wang ORCID: orcid.org/0000-0002-2709-9122¹³ &
…
Yun Yang¹⁶

Nature Water volume 2, pages 193–205 (2024)

Abstract

Remotely sensed evapotranspiration (ET) data offer strong potential to support data-driven approaches for sustainable water management. However, practitioners require robust and rigorous accuracy assessments of such data. The OpenET system, which includes an ensemble of six remote sensing models, was developed to increase access to field-scale (30 m) ET data for the contiguous United States. Here we compare OpenET outputs against data from 152 in situ stations, primarily eddy covariance flux towers, deployed across the contiguous United States. Mean absolute error at cropland sites for the OpenET ensemble value is 15.8 mm per month (17% of mean observed ET), mean bias error is −5.3 mm per month (6%) and r² is 0.9. Results for shrublands and forested sites show higher inter-model variability and lower accuracy relative to croplands. High accuracy and multi-model convergence across croplands demonstrate the utility of a model ensemble approach, and enhance confidence among ET data practitioners, including the agricultural water resource management community.

Main

Accurate evapotranspiration (ET) data are essential for assessing the surface energy and water balance, the carbon cycle and the management of water resources¹. ET is the sum of the flux of water vapour from soil (evaporation) and through vegetation (transpiration) to the atmosphere. ET constitutes the second largest component of the terrestrial water balance, after precipitation. The usefulness of spatially contiguous mapping of ET, particularly over irrigated agricultural lands, has been amplified by drought, climate change, and high rates of human water withdrawal and agricultural consumption, leaving many aquifers and water reservoirs in the western United States at all-time-low levels^2,3,4. Satellite-based remote sensing of ET (RSET) offers a powerful approach for mapping ET over large geographic regions at semi-continuous timescales^1,5,6. Until recently, the availability of RSET data at spatial scales relevant for water resources management has been limited by cost and computational requirements.

OpenET⁵ employs six state-of-the-art satellite-based RSET models, that is, ALEXI/DisALEXI⁷, eeMETRIC⁸, geeSEBAL⁹, PT-JPL¹⁰, SIMS^11,12 and SSEBop¹³, that have been widely applied and evaluated in the United States for a range of water management and agricultural applications. The models are applied on the Google Earth Engine cloud-based platform¹⁴ to provide historical and near real-time ET data at subfield scales (30-m pixels) over the western United States⁵. Five of the RSET models constrain components of the surface energy balance (SEB) using land surface temperature (LST) primarily derived from Landsat Collection 2, along with gridded weather data and land cover datasets. The sixth model, SIMS, assumes well-watered conditions and computes crop coefficients based on vegetation density, derived from satellite surface reflectance values, along with a gridded soil water balance model. The models composing OpenET have been used by water managers, farmers and governmental organizations for irrigation scheduling, water accounting and allocation, and water rights administration^15,16,17. The OpenET platform provides an unprecedented level of accessibility to RSET data through its public online data explorer interface, including querying satellite ET within individually vectorized field boundaries. All six RSET models in OpenET operate automatically, including any required calibrations, which permits rapid calculations for the more than 100,000 Landsat images processed so far across the 23 western-most states in the contiguous United States. As the number of applications of RSET data for sustainable land and water resources management grows, it is important for practitioners to have information on the accuracy of RSET data across land cover types, climatic zones and agricultural production practices¹⁸.

In this Analysis, we present a large-scale benchmark assessment of the accuracy of OpenET data using a well-curated publicly archived dataset of in situ ET measurements from 152 stations (141 eddy covariance (EC) systems, 7 Bowen ratio systems and 4 lysimeters), over a variety of regions, climates and land cover types^19,20, collectively comprising ~45 years of paired model–measurement ET data (Fig. 1). The EC technique is generally viewed as the best available method for continuous measurement of in situ energy and heat flux at spatial scales that approach satellite-based retrievals^21,22, although we acknowledge the associated data uncertainties and made efforts to reduce them¹⁹. In addition to evaluation of individual model accuracies, we evaluated the OpenET ensemble ET value, computed as the mean of all models after flagging and removal of up to two outliers using the median absolute deviation (MAD) approach^23,24. The generation of an ensemble value is a widely used technique to combine outputs from diverse models, each having their own behaviour⁵ and random error^25,26,27. It also facilitates applications such as irrigation scheduling and water rights administration, where practitioners require a single value for use in management of water resources⁵. The publicly archived in situ flux dataset allows for reproducibility and benchmarking of future OpenET model versions or other RSET data.
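The ensemble calculation described above (a mean taken after MAD-based flagging of up to two outliers) can be sketched in a few lines of Python. This is a minimal illustration, not the OpenET implementation: the function name `ensemble_mean_mad`, the cutoff of 2 scaled MADs and the 1.4826 scaling constant are assumptions introduced here, and the actual rules are those given in Methods and Supplementary Discussion 2.

```python
import numpy as np

def ensemble_mean_mad(values, max_outliers=2, threshold=2.0):
    """Mean of model ET values after MAD-based outlier removal.

    `threshold` (in scaled MADs) and the 1.4826 scaling are illustrative
    assumptions, not values taken from the paper.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = 1.4826 * np.median(np.abs(values - median))
    flagged = np.zeros(values.size, dtype=bool)
    if mad == 0:
        # No spread around the median: nothing is flagged.
        return float(values.mean()), flagged
    score = np.abs(values - median) / mad
    # Visit the most extreme values first and flag at most `max_outliers`.
    for idx in np.argsort(score)[::-1][:max_outliers]:
        if score[idx] > threshold:
            flagged[idx] = True
    return float(values[~flagged].mean()), flagged

# Made-up monthly ET (mm) from six hypothetical models; one is far off.
example_mean, example_flags = ensemble_mean_mad([100, 104, 98, 102, 150, 101])
print(example_mean, example_flags)
```

Capping the number of removed values keeps at least four of six models in the mean, so the ensemble cannot collapse onto one or two members.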

**Fig. 1: Map of in situ ET measurement sites.**

ET data computed from micrometeorological measurements at EC sites were obtained from a variety of sources, primarily AmeriFlux²⁸. Supplementary Table 1 provides a full list of stations used in the study including land cover type, site principal investigators, Digital Object Identifiers (DOIs) and other metadata. Flux data were carefully post-processed, including gap-filling, screening for energy balance closure error and data completeness, and visual data quality assessments. Flux data that passed quality control and showed limited energy balance closure error were included in the study and underwent closure correction following the FLUXNET2015/ONEFlux approach for daily averaged fluxes^19,29. We refer to EC data as ‘ECET’ throughout the article. Closed ECET data were considered to be most representative of actual ET³⁰. To sample RSET pixels for comparison with ECET, flux footprints were developed for each station. Flux footprints are two-dimensional mappings of the areal extent of a station’s source area, that is, the area on the ground that contributes to fluxes measured by the tower instrumentation. Refer to Methods and Volk et al.^19,20 for details on flux data processing and footprint mapping methods used. Additional discussion of uncertainty in EC data and steps taken to limit that uncertainty are provided in Supplementary Discussion 1. An overview of the satellite-driven ET models in the OpenET ensemble is provided in Methods.
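The difference between unclosed and closed ECET can be illustrated with a simplified energy balance ratio correction. The sketch below is not the FLUXNET2015/ONEFlux procedure used in the study; it assumes daily mean fluxes in W m⁻², scales latent heat flux by a single closure ratio while preserving the Bowen ratio, and uses a nominal latent heat of vaporization of 2.45 MJ kg⁻¹. All function and variable names are hypothetical.

```python
def closure_ratio(rn, g, h, le):
    """Energy balance ratio: turbulent fluxes (H + LE) over available energy (Rn - G)."""
    available = rn - g
    if available <= 0:
        raise ValueError("available energy must be positive")
    return (h + le) / available

def close_le(rn, g, h, le):
    """Scale LE so that H + LE = Rn - G when H is scaled by the same factor."""
    return le / closure_ratio(rn, g, h, le)

def le_to_et_mm_per_day(le_w_m2, latent_heat=2.45e6):
    """Convert latent heat flux (W m-2) to ET (mm per day); 1 kg m-2 of water = 1 mm."""
    return le_w_m2 * 86400.0 / latent_heat

# Made-up daily mean fluxes (W m-2) with 80% closure.
rn, g, h, le = 150.0, 10.0, 40.0, 72.0
unclosed_et = le_to_et_mm_per_day(le)
closed_et = le_to_et_mm_per_day(close_le(rn, g, h, le))
print(round(unclosed_et, 2), round(closed_et, 2))
```

With these inputs the closure ratio is 0.8, so the closed ET is 25% higher than the unclosed value; the gap between the two is the kind of range later used as one measure of uncertainty in the in situ data.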

The discussion of statistical results that follows focuses on comparisons between monthly aggregated ECET and RSET. Although accuracy assessments were conducted using daily (date of overpass) data and monthly total ET aggregated to growing season and annual periods, our discussion focuses on monthly results for several reasons: monthly ET has utility for longer-term water accounting and planning; uncertainties in EC data due to closure and other factors are reduced at the monthly (compared with daily) timescale; and OpenET directly provides daily and monthly ET, along with data services that allow users to compute ET at other aggregation periods. Accuracy results are provided for daily, monthly, seasonal and annual timesteps in Supplementary Tables 2–6, and accuracy metrics for daily timesteps should be consulted for applications of ET data at timesteps of 1–15 days. Five well-known statistical metrics were used to evaluate OpenET accuracy (for equations, see Methods): the slope of a linear regression forced through the origin, which measures bias (Slope), mean bias error (MBE), mean absolute error (MAE), root-mean-square error (RMSE) and the coefficient of determination (r²). Regression results with a non-zero intercept for monthly data are provided in Supplementary Table 7.
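For readers who want to reproduce these metrics on their own paired data, the following sketch uses their standard textbook definitions. The exact equations are in Methods, which is not reproduced here, so treat the forms below as assumptions: in particular, r² is taken as the squared Pearson correlation, errors are computed as modelled minus observed, and the function name `accuracy_metrics` is hypothetical.

```python
import numpy as np

def accuracy_metrics(observed, modelled):
    """Standard paired-sample metrics; errors are modelled minus observed."""
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modelled, dtype=float)
    diff = mod - obs
    mae = float(np.abs(diff).mean())
    return {
        # Least-squares slope of modelled on observed with zero intercept.
        "slope": float(np.sum(obs * mod) / np.sum(obs * obs)),
        "mbe": float(diff.mean()),
        "mae": mae,
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        # Squared Pearson correlation (an assumed definition of r2).
        "r2": float(np.corrcoef(obs, mod)[0, 1] ** 2),
        # MAE as a percentage of mean observed ET.
        "nmae_pct": mae / float(obs.mean()) * 100.0,
    }

# Made-up monthly ET (mm): the model is 5 mm low in every month.
stats = accuracy_metrics([50, 100, 150], [45, 95, 145])
print(stats)
```

Because the error here is a constant offset, MBE, MAE and RMSE have the same magnitude and r² equals 1, while the slope through the origin drops below 1; this shows why bias and correlation metrics need to be read together.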

Performance over all agricultural flux sites

Of all the general land cover types sampled, OpenET models showed the strongest agreement with ECET collected in agricultural settings. For 59 agricultural sites combined, eeMETRIC, SIMS and PT-JPL showed the least bias in terms of MBE, all with magnitudes below 4.5 mm per month or 5% of the mean ECET (Table 1). The ensemble value had a slightly higher magnitude bias of −5.3 mm per month, or 5.8% of the mean ECET. The ensemble value outperformed each individual model in terms of MAE (15.8 mm per month, or 17.3% of the mean ECET), RMSE (20.4 mm per month, or 22.4%) and r² (0.90). In comparison, MAE from individual models ranged from 17.9 to 22.7 mm per month, RMSE from 23.1 to 29.1 mm per month and r² from 0.83 to 0.87, with the smallest errors from PT-JPL, SIMS and DisALEXI.

Table 1 Summary statistics between modelled and observed monthly ET for cropland sites

ET data from the individual RSET models were generally linearly related to ECET, with PT-JPL and SIMS exhibiting some curvature due to seasonally varying biases (Fig. 2). Many of the models underestimated ET during the cold season relative to the ECET, leading to the slightly low bias in the ensemble ET value (Table 2). To investigate seasonal variability in model accuracy, we pooled all monthly paired (model–measured) ET to generate monthly climatologies for major land cover classifications (Fig. 3 and Extended Data Figs. 1–5). The range between unclosed and closed ECET provides one measure of the uncertainty in the in situ data³¹.

**Fig. 2: Modelled versus observed monthly ET at cropland sites.**

Table 2 Summary statistics between modelled and observed monthly ET for cropland sites grouped by climate zone

**Fig. 3: Monthly climatology of paired modelled and observed ET for cropland sites.**

For most months, the multi-model ensemble ET value was well bounded between the closed and unclosed mean ECET for cropland sites, while individual ensemble members showed more seasonal bias. In spring, SSEBop and eeMETRIC underestimated unclosed ET, whereas SIMS overestimated closed ET, probably due to the assumption of well-watered conditions. In peak summer months, most models were in good agreement with closed ECET, with geeSEBAL and PT-JPL biased low. In September and October, when actual ET rates decline quickly, several models were biased high, except DisALEXI and geeSEBAL, which tracked closer to the unclosed values. The higher agreement of RSET with ECET during the peak summer period is encouraging, as this is the period of intensive irrigation and consumptive use of water through ET. A post hoc test showed that DisALEXI, geeSEBAL and SSEBop had mean monthly ET values that were statistically different (as underestimation) from the mean closed ECET. The mean aggregated growing season ET for each model, however, was not statistically different from the mean closed ECET (Supplementary Tables 8 and 9).

The monthly climatologies derived at flux sites were upscaled using data from all cropland pixels over the full OpenET domain (Extended Data Fig. 6). We found similar seasonal patterns and relative model biases to those identified at the flux sites—giving confidence in the representativeness of the ECET comparisons.

Impact of sampling interval on model performance

Model accuracy often improves as the temporal aggregation interval lengthens, owing to cancellation of errors⁸. In croplands, the accuracy metrics for the OpenET ensemble improved as the aggregation period increased from daily (overpass dates) to monthly to growing season to annual periods (Supplementary Tables 2–6). Daily ensemble results for the combined cropland sites showed an MAE of 23.6% and an RMSE of 31.1% of the mean ECET. At this timescale there is increased uncertainty both in the ECET data, due to variability in micrometeorological conditions and energy balance closure, and in the remotely sensed ET, due to potential cloud contamination and errors in footprint representation. These ensemble uncertainties are reduced when integrating to monthly (MAE of 17.3% and RMSE of 22.4% of ECET), growing season (MAE of 12.9% and RMSE of 15.5% of ECET) and water year (MAE of 11.3% and RMSE of 12.3% of ECET) timescales. Fortunately, during growing season periods we found lower energy balance closure error in EC data¹⁹ and there is less cloud cover in satellite data in the western United States as compared with the non-growing period. During the summer, the daily ensemble normalized MAE (NMAE) on overpass dates was typically between 5% and 25% (Supplementary Fig. 1), and the monthly NMAE was typically between 7% and 20% (Fig. 4). We expect custom aggregation periods between 2 and 15 days to have accuracy similar to or slightly better than that of the daily results, which vary seasonally; subweekly to bi-weekly RSET may be of greatest use for irrigation scheduling³².
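The effect of error cancellation under temporal aggregation can be seen in a toy example. The numbers below are made up for illustration and are not drawn from the OpenET or flux datasets; `nmae_percent` and `aggregate` are hypothetical helper names, and the NMAE is assumed to be MAE expressed as a percentage of mean observed ET.

```python
import numpy as np

def nmae_percent(observed, modelled):
    """MAE expressed as a percentage of the mean observed value."""
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modelled, dtype=float)
    return float(np.abs(mod - obs).mean() / obs.mean() * 100.0)

def aggregate(values, period):
    """Sum consecutive blocks of `period` values (length must divide evenly)."""
    arr = np.asarray(values, dtype=float)
    return arr.reshape(-1, period).sum(axis=1)

# Made-up daily ET (mm) covering two 4-day periods; the modelled values
# mix over- and underestimates of similar size.
observed = [4.0, 5.0, 6.0, 5.0, 3.0, 4.0, 5.0, 4.0]
modelled = [5.0, 4.0, 6.5, 4.0, 2.0, 5.0, 4.5, 5.0]

daily_nmae = nmae_percent(observed, modelled)
period_nmae = nmae_percent(aggregate(observed, 4), aggregate(modelled, 4))
print(round(daily_nmae, 1), round(period_nmae, 1))
```

Errors of opposite sign offset each other within each 4-day sum, so the aggregated NMAE is far smaller than the daily one. A persistent bias would not cancel this way, which is why MBE is reported alongside MAE.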

**Fig. 4: Monthly MAE of the model ensemble for different crop types.**

Performance among annual and perennial crops

Annual crops, including wheat, corn, soy, rice and others, make up the majority (80%) of cropland sites in the OpenET ECET dataset (Supplementary Table 1). Compared with perennial crops, annual crops tend to have shorter canopies and more homogeneous cover at peak growth stage. The annual crop sites in the OpenET flux dataset are predominantly irrigated, and are distributed across a range of climatic zones, with higher density in regions such as Mediterranean and semi-arid Central Valley, California, and humid continental regions in the High Plains and the Mississippi Alluvial Plain (Fig. 1).

For annual crops, each of the RSET models in the OpenET ensemble exhibited small bias and high levels of accuracy and precision (Table 1). Similar to all crop types combined, the ensemble value for annual crops outperformed individual models in terms of MAE (15.3 mm per month or 17.9% of mean ECET), RMSE (19.7 mm per month or 23.2% of mean ECET) and r² (0.9). Of the RSET models, eeMETRIC and PT-JPL exhibited the lowest magnitude of MBE, with PT-JPL and SIMS yielding the highest accuracy in terms of MAE and RMSE.

Dividing annual crops into C3 and C4 subclasses, we find the seasonal patterns and magnitudes of ensemble MAE are similar throughout the year (Fig. 4). NMAE in general reflects the inverse of the characteristic water use curve for each class, with C3 crops exhibiting a broader seasonal curve than C4 and therefore lower NMAE early and late in the season. While the higher NMAE values observed outside the growing season for all crop types (Fig. 4) are more indicative of low ET rates than of meaningful modelling error characteristics, cool-season errors may be generally inflated by higher cloud cover, increasing the time interval between cloud-free satellite retrievals. Improving satellite imaging frequency, as well as ET time integration and gap-filling techniques, should help to increase OpenET accuracy during the non-growing season (Discussion).

Another class of interest is woody perennials, which are high-value crops and pose distinct modelling challenges. High-quality eddy flux ET data were available for three vineyards, three nut tree orchards and one fruit orchard, all located in California^19,33. Vineyards and orchards have taller and more highly structured canopies, often with inter-row cover crops, and vineyards are often deficit irrigated. These qualities lead to shadowing and mixed pixel effects in remote sensing at the 30-m level, and the need for sensitivity to small changes in vine stress to inform deficit irrigation applications is a unique modelling requirement.

RSET model performance in the vineyard sites sampled was strong and consistent across models. The ensemble accuracy exceeded that for annual crops (Table 1 and Fig. 4), with lower bias (slope of 1.02 and MBE of 5.3 mm per month), lower MAE and RMSE (13.7 and 16.2 mm per month, respectively, or 12.2% and 14.5% of the mean monthly ECET) and an r² of 0.90. DisALEXI performed similarly to or better than the ensemble at the vineyard flux sites, perhaps due to its two-source approach towards partitioning temperature fluxes between the substrate (inter-row) and canopy.

Performance was more varied across ensemble members for the orchards than for other broad crop types, and biases were more negative. This could be related to shadowing effects in the taller and more strongly clumped canopies, particularly for models that are strongly dependent on LST inputs. The ensemble value had a negative bias with mean slope of 0.87, MBE −11.9 mm per month, MAE 21.2 mm per month (16.8% of ECET) and RMSE 27.9 mm per month (22.1% of ECET), and an r² of 0.91. SSEBop and SIMS had the least bias in terms of slope and MBE, and SSEBop and DisALEXI had the lowest error in terms of MAE and RMSE (Table 1). While MAE in orchards is high mid-season, the normalized values are similar to those of annual crops (Fig. 4).

Variation of model performance across climate regions

To investigate variations in OpenET performance over different climates, cropland accuracy metrics were grouped by the Köppen–Geiger climate zones of the flux sites³⁴ (Fig. 1). Zones with fewer than five flux stations were omitted as a conservative measure, and some zones were lumped on the basis of secondary climate classifications (for example, hot- and warm-summer Mediterranean zones). Each resulting group had 7–13 flux stations used for calculation of accuracy statistics.

Overall, the OpenET ensemble had better agreement with ECET at crop sites in water-scarce, semi-arid to arid regions (Mediterranean and desert zones in the Southwest) as compared with humid zones (Table 2 and Supplementary Fig. 2). Irrigation is more prevalent in semi-arid to arid regions, and crop ET tends to be closer to potential ET rates and is more accurately modelled in some RSET modelling frameworks. High accuracy of models in semi-arid and arid regions is advantageous, given the high priority of water resource sustainability and management challenges in these regions.

Among the zones considered, the OpenET ensemble value was most accurate for crop sites in Mediterranean zones, with MAE of 13.3 and RMSE of 16.5 mm per month (14.2% and 17.6% of the mean ECET), with the ensemble outperforming individual members. Of the individual models, SIMS showed the best agreement with ECET in these regions, suggesting well-watered conditions for most sites or possible influence of adjacent non-irrigated areas on SEB models. Similarly, in arid sites (hot and cold desert), SIMS had the lowest MAE and RMSE (Table 2). During the growing season periods when the majority of irrigation is applied, the ensemble’s monthly NMAE was consistently below 10% for cropland sites in Mediterranean climates (Supplementary Fig. 2).

Model performance in the subhumid and humid continental regions of the Midwest and Central Plains was similar to that in the Mediterranean climate zone, again with the ensemble outperforming individual models in terms of collective statistics (Table 2 and Supplementary Fig. 2). Errors were higher at the humid subtropical sites, with SIMS tending to overestimate ET with a slope of 1.15 and normalized MBE of 19.9%, indicating ET is less well correlated with vegetation density in this region, and that irrigation practices may result in intermittent vegetation water stress. Hypotheses for increased RSET error in humid regions and paths for improvement are proposed in Discussion.

Performance in natural ecosystems

Most of the flux stations (61%) used in the intercomparison were in non-agricultural sites, including shrublands, grasslands, mixed forests, conifer forests, and wetlands or riparian areas (Fig. 1)¹⁹. The SIMS model is currently neither designed for nor implemented in non-agricultural land-cover types; for these pixels, the ensemble consists of five models with the possibility of removing a single outlier (Methods). Systematic model error and variability were higher for non-agricultural sites than for cropland sites (Fig. 5).

**Fig. 5: Monthly modelled ensemble versus observed ET for sites grouped by land cover type.**

Most models exhibited a high bias in wetland/riparian sites, dominated by overprediction of ET during the spring (Extended Data Fig. 5). SSEBop had higher accuracy in these sites than other models and the ensemble value (Supplementary Tables 2–4). For models that estimate all components of the SEB (DisALEXI, eeMETRIC and geeSEBAL), this bias could result from an underestimation of the substrate (water) heat storage term in the spring before the vegetation canopy develops⁷. These errors can potentially be mitigated in the future through accurate classification of inundated land areas.

Natural ecosystems under high water stress, such as shrublands and grasslands in desert and semi-arid steppe climates in the western United States, showed the highest variability and error with respect to ECET (Fig. 5 and Supplementary Tables 2–4). In these systems, ET can be a small fraction of available energy, and difficult to both measure on the ground and model using RSET approaches. Shrublands also tend to be more heterogeneous than cropland sites, and this can introduce additional uncertainty into model–measurement comparisons⁵. Nevertheless, it is important to provide an evaluation of accuracy, both to benefit ET monitoring and land health assessments within shrub and grassland ecosystems, and to identify key areas for future research in RSET to reduce model error.

The Landsat-scale ET from OpenET also has applications in forested landscapes, as a predictor of forest health and mortality³⁵ and as a metric of water yield response to forest management³⁶. In forested locations, most OpenET models overestimated ET, particularly at the evergreen flux sites sampled, yielding a slope for the ensemble value of 1.24 and MBE of 16.8 mm per month (27.3%). At these sites, eeMETRIC showed the least bias with a slope of 1.17 and an MBE of 10.8 mm per month (17.5%), while for MAE and RMSE, the ensemble value outperformed each individual model. At mixed forest sites, however, eeMETRIC and DisALEXI were in better agreement with ECET than was the ensemble.

Ensemble outlier removal and spatial inter-model variability

See Supplementary Discussion 2 for analysis and discussion of the MAD outlier removal approach that is used for computing the ensemble value, including spatial analysis of the occurrence of outliers and the long-term differences between each model’s seasonal ET and the ensemble value (Extended Data Figs. 7 and 8, Supplementary Figs. 3–9 and Supplementary Tables 9 and 10). Evidence suggests that the MAD approach showed accuracy metrics similar to other simple methods. Over 2016–2022, typically no model was identified as an outlier in cropland pixels; however, SIMS was about 10% more likely to be identified as an ensemble outlier, and it often gave the highest ET value, particularly in the Central Plains.

Discussion

ET is a critical driver and metric of ecosystem function, weather and climate, agricultural practices and water resource management. However, field-scale ET has previously been difficult to estimate at scale; therefore, ready access to high-resolution (spatially and temporally) ET data offers societal benefits to a variety of stakeholders^1,5. Using monthly ET data, water managers can develop more accurate water budgets in support of incentive-driven conservation programmes and innovative management and trading strategies. For policymakers, such data can improve water supply tracking, simplify regulatory compliance and promote the co-development of solutions with local communities. Crop producers may be able to improve the efficiency of irrigation practices in some instances, resulting in enhanced sustainability and reduced costs for water, fertilizer and energy. Supplementary Discussion 3 continues the conversation on incentives towards improving irrigation efficiency and how OpenET data can provide value in an RSET-based irrigation scheduling framework.

In addition to informing water management, OpenET has multiple research and modelling applications. Carbon and climate modelling can benefit from 30-m RSET data as a diagnostic indicator of ecosystem health and function response under a changing climate¹. RSET is being used to reduce summertime warm-dry bias in weather forecasting and climate models by improving the representation of ET from irrigated land³⁷, ET–soil moisture coupling³⁸ and transpiration–evaporation partitioning³⁹. Hydrologic and land surface models at multiple scales can also benefit from high-resolution ET data, for example, as validation or forcing data in basins where streamflow measurements are not available to constrain the water budget^13,40,41.

Realizing the full potential benefits of RSET data for water resource and land management applications requires rigorous and reproducible accuracy assessment to inform practitioners on best use practices¹⁸. The accuracy results we present here provide valuable constraints on model uncertainty based on broad crop type, climate region and timescale.

Average error in the OpenET ensemble value with respect to mean ECET in cropland sites for monthly, growing season and annual aggregated ET, ranged from 10% to 17% for MAE and 11% to 22% for RMSE. These errors are within accuracy levels of 10–20% reported for supervised remote sensing techniques⁴². They are also consistent with accuracy targets set by the OpenET user groups: 10–20% at a monthly timestep, and 15–25% for daily ET data⁵. These errors include uncertainties in ECET data, which are estimated to range from 10% to 30% depending on site characteristics and instrumentation design and maintenance⁴².

These accuracy results may support advancements in water management applications that incorporate OpenET data. For croplands, all models except for SIMS had negative bias errors at the monthly timestep (−2.7% to −13.3%), with an MBE of −5.8% for the ensemble ET value (SIMS MBE is +4.7%). Awareness of these bias errors when using these data for irrigation management applications may prevent unintentional deficit irrigation that can suppress crop yields and farm revenue⁴³. Cross-comparisons between the primarily reflectance-based SIMS and PT-JPL models and the LST-driven models may be useful for identifying periods of intentional or unintentional crop water stress and deficit irrigation. Reducing errors in the OpenET daily data is a high priority for advancing their utility for on-farm water management.

At local to regional scales, the reported uncertainties at monthly to annual timesteps should inform applications related to water balance, water accounting and water rights administration. Comparison of OpenET data aggregated at the scale of irrigation districts or watersheds against carefully constrained water balances offers one path to assessment of biases at larger scales. Particularly in administration of water rights, the current uncertainty in the OpenET data (for example, growing season ensemble NMAE of 12.9% for croplands) must be recognized in evaluating consumptive water use, and OpenET data should only be used for this purpose in combination with other sources of information.

This study provides insights into potential pathways towards improving the accuracy of the individual models within the OpenET ensemble. Across both agricultural and some natural landscapes, most models underestimated ET during the winter and spring, particularly the models that rely upon thermal infrared (TIR) measurements to compute ET. This underestimation may be related to loss of thermal contrast over an image, where differences between the hottest and coolest pixels are reduced relative to midsummer values, adding uncertainty to within-scene scaling approaches. It may also be related to misrepresentation of soil evaporation during extended wet periods, extended periods of cloudiness, and error in shared model inputs. In addition, treatment of effects of senesced standing vegetation and crop residue on SEB can impact model performance outside of the growing season. In terms of observational errors, the energy balance closure error and uncertainty in EC data are also amplified during periods outside of the growing season¹⁹.

We found increased model error in croplands in humid climates as compared with drier regions. Again, lower temperature contrasts across humid landscapes may contribute to errors in TIR-based within-scene scaling models. A primary driver, however, is probably the relative paucity of clear-sky satellite retrievals and potential for error in LST due to undetected clouds. Improving temporal sampling of RSET model inputs will be a major focus of on-going development in OpenET, through future use of imagery from additional Landsat-like optical (Sentinel-2) and thermal (ECOSTRESS, VIIRS) sensors⁴⁴, and integration of future TIR observations from satellite missions currently in development by NASA, USGS and the European Space Agency. Methods for computing ET values between cloud-free satellite observations, currently based on linear interpolation of the ratio of ET to a reference flux, can also be improved. Approaches used in mapping and predicting vegetation phenology⁴⁵ and dynamic time warping⁴⁶ algorithms developed for signal processing applications offer promise for reducing large errors during periods of rapid vegetation change or extended cloud cover, which would contribute to reduced RMSE values across the model ensemble.
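The time integration step mentioned above, linear interpolation of the ratio of ET to a reference flux between cloud-free observations, can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenET code: the function name `interpolate_daily_et` is hypothetical, days are indexed from zero, the reference flux is a daily reference ET series, and the ratio is held constant before the first and after the last overpass (the default behaviour of `numpy.interp`).

```python
import numpy as np

def interpolate_daily_et(overpass_days, overpass_et, reference_et):
    """Fill daily ET between overpasses via a linearly interpolated ET fraction."""
    ref = np.asarray(reference_et, dtype=float)
    days = np.asarray(overpass_days, dtype=int)
    # ET fraction (ET / reference ET) on the cloud-free overpass days.
    fraction = np.asarray(overpass_et, dtype=float) / ref[days]
    # Linear interpolation of the fraction to every day in the series.
    daily_fraction = np.interp(np.arange(ref.size), days, fraction)
    # Daily ET is the interpolated fraction times that day's reference ET.
    return daily_fraction * ref

# Made-up example: overpasses on days 0 and 4 of a 5-day window (values in mm).
reference_et = [6.0, 6.0, 8.0, 8.0, 6.0]
daily_et = interpolate_daily_et([0, 4], [3.0, 6.0], reference_et)
print(daily_et, daily_et.sum())
```

Interpolating the fraction instead of ET itself lets day-to-day weather, carried by the reference ET, shape the filled values. The fraction still changes only linearly between overpasses, so rapid vegetation change during long cloudy gaps is poorly captured, which is the limitation the paragraph above describes.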

Examining results for specific crop classes, we found strong results for DisALEXI and SIMS over vineyards, and DisALEXI, SIMS and SSEBop over fruit and nut orchard sites—key targets for irrigation management in the Central Valley. Increasing the number of validation sites in orchards would help to address remaining modelling issues associated with this challenging canopy architecture. The USDA ARS-led Tree-crop Remote sensing of Evapotranspiration eXperiment (T-REX) is aimed at addressing this observational gap⁴⁷.

All models, to varying degrees, have room for notable improvement in computation of ET in natural ecosystems. For example, most models systematically underestimate ET in drier ecosystems such as grasslands and shrublands and overestimate ET in evergreen forests. Incorporation of high-frequency and high-resolution visible and near-infrared data into the remote sensing models may improve their ability to capture phenological shifts particularly in arid/semi-arid regions, and agricultural systems in general^48,49. Improvement of gridded meteorological model inputs^50,51, land cover classification data and soils data⁵² may also lead to improved model performance in both natural ecosystems and in croplands. In particular, datasets compiled from agricultural weather stations and used to compute bias correction surfaces for reference ET could be re-evaluated to ensure reference surface compliance with the assumptions of the American Society of Civil Engineers Penman–Monteith equation⁵³.

Future OpenET accuracy evaluations will target primary causes of error in ground ET measurements and RSET methods. Specific factors to consider include local advective impacts on modelled and measured ET, EC energy budget closure, local thermal contrast, ET reduction in deficit irrigated or rainfed systems, potential biases in gridded meteorological inputs to RSET models, and accurate capture of ET over sparsely cultivated landscapes. Comparisons with other well-established spatially mapped ET products such as MOD16 or FLUXCOM⁵⁴ may provide further insights for operational global ET mapping at field scales (30–100 m). Comparisons against ET data computed from long-term water balance studies^13,55 would help fill in gaps of spatial coverage in measured in situ ET across the western United States in hydrologically important but sparsely cultivated regions such as the Upper Colorado River Basin.

Conclusions

The OpenET platform provides spatially continuous ET data at 30-m resolution throughout the western United States. An intercomparison and accuracy assessment involved six satellite-based RSET models composing the current OpenET version, ensemble ET computed from the six models, and a well-documented benchmark eddy flux dataset from 152 stations located in the contiguous United States. Based on results from 59 cropland ET stations located in a variety of climatic regions, little systematic model bias was observed in croplands, and error metrics were within or near the targets set forth by OpenET partners including farmers, irrigation managers and water management agencies. The best accuracy metrics were associated with seasonal and annual timescales, and for crops in arid/semi-arid regions. The OpenET ensemble mean, with outlier removal, typically outperformed any individual model in terms of error statistics. Generally, no more than one model was identified as an outlier during growing season months over most agricultural regions in the western United States, and frequently no models were excluded. This finding highlights the substantial progress achieved so far in developing fully automated RSET modelling approaches that can be employed to map ET over large areas at field-scale resolution. The study identified paths for future targeted research and model improvement, and is intended to support the RSET research community in the development of increasingly robust and accurate RSET techniques. We are also hopeful that this assessment will provide added confidence to water resource managers, farmers, ranchers, scientists and other potential users of OpenET due to the high rigour and transparency of methods that were employed.

Methods

Flux data processing and footprint sampling

We used a curated benchmark eddy flux-based ET dataset^19,20 and tools⁵⁶ developed for this and subsequent evaluations of OpenET RSET models⁵. The rationale and decision-making steps for the collection and post-processing of flux data, as well as analyses of footprint sampling techniques and energy balance closure error within the dataset, are described in Volk et al.^19,20. Gap-filling and correction for energy balance closure error were carried out using open-source Python tools⁵⁶ that enhance data provenance and reproducibility. Data were also subject to qualitative, visual-based data screening and filtering^19,20. The final post-processed dataset consists of 161 stations, is public and includes daily and monthly ET and meteorological data, interactive graphics of such data for each station, and site information such as land use and Principal Investigator acknowledgements²⁰. We note that nine stations in the dataset were not included in the statistical results presented here because their data coverage did not overlap with the period for which data could be developed for all six OpenET models. For example, not all models could be implemented from satellite imagery recorded before 2001 (ref. ⁵). Figure 1 shows a map of the 152 stations used in this accuracy assessment as well as their land cover types and Köppen–Geiger climate zones, and Supplementary Table 1 provides additional metadata for each station.

Data for the majority (106) of the flux stations in this study were downloaded from the AmeriFlux website, last accessed on 27 October 2020, and the remaining stations were retrieved from a variety of sources and Principal Investigators from university partners, the US Geological Survey, the US Department of Agriculture and others¹⁹. In addition to EC systems, four precision weighing lysimeters measuring cropland ET in Texas⁵⁷ and seven high-quality Bowen Ratio instrumented sites, which measure ET in predominantly phreatophyte shrublands in Nevada²⁰, were included in the dataset. Gap-filling of initial half-hourly fluxes of the four main energy balance components—latent, sensible and soil heat flux, and net radiation—was conducted using linear interpolation where gaps up to 2 h during the daytime or 4 h during nighttime were interpolated. If a given 24-h period still contained gaps then the daily average was not calculated and the daily flux value was left as a gap. After this initial gap-filling, fluxes were averaged to daily periods and energy balance closure correction was applied following the daily energy balance ratio approach defined by FLUXNET2015/ONEFlux^19,29. The corrected daily latent heat flux, which is the energy consumed through ET, was used to calculate ET with an adjustment to the latent heat of vapourization for air temperature²⁰. This closure-adjusted value is referred to as closed flux ET or measured ET in the main text and all statistical measures reported for OpenET models were against the energy balance corrected ET data. Daily ET gaps were subsequently filled using gridMET fraction of reference ET and gridMET grass reference ET^19,20,58. To exclude flux stations with higher data uncertainty, only stations with mean daily energy balance closure of 0.75 or higher during the growing season and 0.6 or higher during the non-growing season were chosen for this intercomparison. 
Here, growing season periods were spatially mapped on the basis of a cumulative growing-degree-day and killing frost approach derived from long-term gridded climate data and are specific to each flux site^19,58. The final dataset is similar to the recent FLUXNET2015 (ref. ²⁹) release consisting of high-quality eddy flux station data that were subject to similar processing and correction techniques. The largest difference between the two datasets, in terms of daily latent heat flux estimates, results from different gap-filling procedures, where our approach is considered to be simpler and more conservative^19,20,29.
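The short-gap interpolation, daily averaging and flux-to-ET conversion described above can be illustrated with a minimal NumPy sketch. It is not the flux-data-qaqc implementation: the function names are hypothetical, the input is assumed to be a regular half-hourly series starting at midnight (48 values per day), and the temperature adjustment of the latent heat of vaporization is assumed to follow the common linear approximation λ = 2.501 − 0.002361 T (MJ kg⁻¹, with T in °C). Under these assumptions, the 2-h daytime and 4-h nighttime limits correspond to `max_gap` values of 4 and 8 half-hourly steps.

```python
import numpy as np


def fill_short_gaps(values, max_gap):
    """Linearly interpolate NaN runs that are at most max_gap steps long.

    Longer runs, and runs touching either end of the series, stay NaN.
    """
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    n = len(values)
    i = 0
    while i < n:
        if not np.isnan(values[i]):
            i += 1
            continue
        j = i
        while j < n and np.isnan(values[j]):
            j += 1
        # values[i:j] is one continuous gap bounded by i - 1 and j
        if i > 0 and j < n and (j - i) <= max_gap:
            filled[i:j] = np.interp(
                np.arange(i, j), [i - 1, j], [values[i - 1], values[j]]
            )
        i = j
    return filled


def daily_mean_or_gap(half_hourly):
    """Average 48 half-hourly values per day; a day with any NaN stays a gap."""
    days = np.asarray(half_hourly, dtype=float).reshape(-1, 48)
    return days.mean(axis=1)  # NaN propagates through the mean


def latent_heat_to_et(le_w_m2, air_temp_c):
    """Convert mean daily latent heat flux (W m-2) to ET (mm per day).

    Assumed approximation: lambda = 2.501e6 - 2361 * T (J kg-1).
    """
    lam = 2.501e6 - 2361.0 * air_temp_c
    return le_w_m2 * 86400.0 / lam  # kg m-2 per day, equivalent to mm per day
```

For example, a mean daily latent heat flux of 100 W m⁻² at 20 °C converts to roughly 3.5 mm per day under this approximation.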

Two approaches were used to estimate flux tower footprints or source area for tower pixel sampling of RSET imagery: (1) simple square ‘static’ pixel (Landsat 30 m) grids of 3 × 3, 5 × 5 and 7 × 7 drawn around station locations, and (2) two-dimensional, physically based flux source area estimations modelled using hourly meteorological data using the Kljun et al.⁵⁹ approach, with hourly footprints converted to daily/monthly average footprint rasters weighted by reference ET¹⁹. The placement of the static grids was informed by high-resolution imagery to avoid inclusion of pixels of non-representative land cover (structures, roads and canals), and shifted slightly into the predominant wind direction as determined by long-term mean daytime windroses (built from data between 6:00 and 20:00 local time). Although the physically based and temporally dynamic footprints were preferred over the static footprints, only about half of the stations in the dataset had sufficient data for their production. Commonly, one or more input parameters to the Kljun et al.⁵⁹ model, such as the standard deviation of the crosswind component of wind due to turbulence or friction velocity, was not available. A detailed description of parameter estimation, processing steps and the method used for creating weighted mean footprint images (using reference ET from NLDAS2 gridded weather data⁶⁰) can be found in Volk et al.¹⁹. We also conducted a rigorous comparison of the intersection between source areas from the static grids of different sizes and the temporally dynamic footprints. The major finding was that the larger 7 × 7 grids tended to include substantially more of the dynamically defined footprint area than did the smaller grid sizes on average; however, the smaller 3 × 3 grids tended to overlap with pixels that were deemed part of the dynamic footprint on a more consistent basis. 
Therefore, we decided to use the 7 × 7 grids for pixel sampling at most flux sites where a dynamic footprint could not be generated, with exceptions for sites with heterogeneous surroundings or with non-representative land cover near the station. For these sites, we used 5 × 5 or 3 × 3 grids to avoid giving equal weight to pixels of potentially different land cover that lie near the perimeter of the typical actual footprint area¹⁹.
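As an illustration of static-grid sampling, the following sketch averages a square window of pixels around a station location in a single-band raster held as a NumPy array. The function name, the row and column indexing of the station pixel, and the optional shift arguments (standing in for the small displacement into the predominant wind direction) are all hypothetical; masked pixels are assumed to be stored as NaN.

```python
import numpy as np


def static_grid_mean(raster, row, col, size=7, row_shift=0, col_shift=0):
    """Mean of a size x size pixel window centred on (row, col).

    Returns the mean of unmasked pixels and the number of pixels used.
    """
    if size % 2 != 1:
        raise ValueError("size must be odd (for example 3, 5 or 7)")
    half = size // 2
    r0 = row + row_shift - half
    c0 = col + col_shift - half
    if r0 < 0 or c0 < 0 or r0 + size > raster.shape[0] or c0 + size > raster.shape[1]:
        raise ValueError("window extends beyond the raster")
    window = raster[r0:r0 + size, c0:c0 + size]
    n_valid = int(np.count_nonzero(~np.isnan(window)))
    mean = float(np.nanmean(window)) if n_valid else float("nan")
    return mean, n_valid
```

Returning the pixel count alongside the mean keeps the information needed later if two overlapping scenes provide values for the same day.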

Model data

The majority of the models that make up the OpenET ensemble are based on full or simplified implementations of the SEB approach. The SEB approach accounts for the energy used to transform liquid water in plants and soil into vapour that is released to the atmosphere. The SEB approach relies on satellite measurements of surface temperature and surface reflectance combined with other key land surface and weather variables to calculate components of the energy balance—net radiation, sensible heat flux, ground heat flux and latent heat flux. eeMETRIC⁸, geeSEBAL⁹ and DisALEXI⁷ compute each component of the energy balance using optical (that is, short-wave) and thermal (that is, long-wave) data, whereas SSEBop¹³ and PT-JPL¹⁰ are simplified approaches in which certain components of the energy balance are not calculated, or are calculated using a set of simplifying assumptions. SIMS^11,12 relies on surface reflectance data, crop type information and a gridded soil water balance model to compute ET as a function of canopy density using a crop coefficient approach for agricultural lands.

The Google Earth Engine¹⁴ Python application programming interface was used to develop a workflow for sampling OpenET RSET model data at ET flux sites. Sampling of the daily and monthly RSET model data was performed at each site using a set of static (3 × 3, 5 × 5 and/or 7 × 7) and/or dynamic flux source-area footprints. Conditions for each of the extraction methods using static footprints were as follows: (1) daily ET from eeMETRIC, SIMS and SSEBop for sites outside of California was calculated as the product of the mean daily fraction of grass reference ET (EToF) produced by the models and the mean daily bias-corrected gridMET grass reference ET (ETo) (repeated for sites within California using daily CIMIS ETo, where CIMIS is more commonly used and depended upon in California); (2) daily ET from PT-JPL, geeSEBAL, and ALEXI/DisALEXI for all sites was computed as the spatial average of daily ET pixels produced by the models; (3) monthly ET from all RSET models for sites outside of California were calculated as the product of the mean monthly EToF and the mean monthly gridMET ETo (repeated for sites within California using the monthly CIMIS ETo). The process of extrapolating instantaneous data (time of overpass) to daily ET is an internal model calculation and differs for each model, and we refer readers to the individual model documentations for details as well as Melton et al.⁵. Daily Landsat image pixels with cloud contamination are flagged on the basis of the CFMask derived indicators⁶¹ in the pixel quality assurance band (QA_PIXEL) and those pixels are not considered. When computing monthly ET, all missing or masked daily ET pixels are computed by linearly interpolating between the nearest unmasked (cloud free) pixels in time within ±32 days.
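The temporal interpolation step can be sketched as follows. This is an illustration under stated assumptions rather than the OpenET implementation: the function name is hypothetical, the interpolated quantity is taken to be EToF, days are assumed to be sorted integer day indices, and "within ±32 days" is interpreted here as requiring a cloud-free observation no more than 32 days before and no more than 32 days after the masked day.

```python
import numpy as np


def interpolate_etof(days, etof, max_days=32):
    """Fill masked (NaN) EToF values by linear interpolation in time.

    A masked day is filled only when unmasked observations exist within
    max_days on both sides; otherwise it is left as NaN.
    """
    days = np.asarray(days, dtype=float)
    etof = np.asarray(etof, dtype=float)
    out = np.full_like(etof, np.nan)
    valid = ~np.isnan(etof)
    if not valid.any():
        return out
    obs_days, obs_vals = days[valid], etof[valid]
    # Index of the nearest observation at or before / at or after each day
    prev_i = np.searchsorted(obs_days, days, side="right") - 1
    next_i = np.searchsorted(obs_days, days, side="left")
    bracketed = (prev_i >= 0) & (next_i < len(obs_days))
    prev_d = obs_days[np.clip(prev_i, 0, len(obs_days) - 1)]
    next_d = obs_days[np.clip(next_i, 0, len(obs_days) - 1)]
    ok = bracketed & (days - prev_d <= max_days) & (next_d - days <= max_days)
    out[ok] = np.interp(days[ok], obs_days, obs_vals)
    return out


# Daily ET is then the product of interpolated EToF and reference ET (ETo).
days = np.arange(11)
etof = np.full(11, np.nan)
etof[0], etof[10] = 0.4, 0.8           # two cloud-free overpasses
eto = np.full(11, 5.0)                 # illustrative ETo, mm per day
daily_et = interpolate_etof(days, etof) * eto
```

Interpolating the ratio rather than ET itself lets day-to-day weather variability enter through the reference ET series.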

Conditions for each of the extraction methods using dynamic footprints were as follows:

(1) daily ET from eeMETRIC, SIMS and SSEBop for sites outside of California was calculated by first multiplying the sampled daily EToF pixels produced by the models in the footprint by each daily flux footprint weight to obtain daily weighted EToF pixels, summing all daily weighted EToF pixels to obtain mean daily weighted EToF, normalizing the mean daily weighted EToF by the sum of weights to account for times when the sum of weights did not equal 1 (for example, caused by cloud masking of pixels), and then multiplying the mean daily weighted EToF by the mean daily bias-corrected gridMET ETo (replaced for sites within California using the daily CIMIS ETo);
(2) daily ET from PT-JPL, geeSEBAL and ALEXI/DisALEXI for all sites was calculated by multiplying the daily ET pixels by the daily flux footprint weights to obtain daily weighted ET pixels, summing all daily weighted ET pixels to obtain mean daily weighted ET, and then normalizing the mean daily weighted ET by the sum of weights; and
(3) monthly ET from all RSET models for sites outside of California was calculated by first multiplying the monthly EToF pixels by the monthly flux footprint weights to obtain monthly weighted EToF pixels, summing all monthly weighted EToF pixels to obtain mean monthly weighted EToF, normalizing the mean monthly weighted EToF by the sum of weights, and then multiplying the mean monthly weighted EToF by the mean monthly bias-corrected gridMET ETo (replaced for sites within California using the monthly CIMIS ETo).
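The footprint-weighted sampling described in the three conditions above reduces to a normalized weighted mean. The sketch below is a minimal NumPy illustration, with a hypothetical function name and with masked pixels assumed to be stored as NaN; it is not the Earth Engine workflow used in the study.

```python
import numpy as np


def footprint_weighted_mean(values, weights):
    """Footprint-weighted mean of pixel values, normalized by the sum of weights.

    Normalizing by the weights of unmasked pixels accounts for cases where
    the weights no longer sum to 1 (for example after cloud masking).
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    valid = ~np.isnan(values)
    weight_sum = weights[valid].sum()
    if weight_sum == 0:
        return float("nan")
    return float((values[valid] * weights[valid]).sum() / weight_sum)


# EToF-based models: weighted EToF multiplied by reference ET (ETo).
etof_pixels = np.array([[0.5, 1.0], [np.nan, 0.8]])   # one cloud-masked pixel
fp_weights = np.array([[0.4, 0.3], [0.2, 0.1]])       # footprint weights, sum to 1
eto = 6.0                                             # illustrative ETo, mm per day
daily_et = footprint_weighted_mean(etof_pixels, fp_weights) * eto
```

For models that output ET directly, the same function would be applied to the ET pixels without the final multiplication by ETo.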

Additional processing was required after extracting the daily ET when duplicate days of data were extracted at select sites due to overlapping Landsat paths. Occasionally a site would lie within the footprints of two overlapping Landsat scenes, resulting in more than one ET value on a given overpass date. To obtain single daily ET values for the site, the daily weighted mean ET for each day was computed using the pixel count (that is, number of pixels used when deriving the respective spatial mean ET value) as the weight. ET pixel counts were occasionally less than the grid/footprint total because of the removal of poor-quality pixels (for example, cloud masking).
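A minimal sketch of this merging step is given below, assuming the extracted records are available as parallel sequences of dates, spatial mean ET values and pixel counts. The function name and the data layout are hypothetical.

```python
import numpy as np


def merge_duplicate_days(dates, et_values, pixel_counts):
    """Collapse duplicate dates into one ET value per day.

    Each day's value is the mean of its records weighted by pixel count.
    """
    et = np.asarray(et_values, dtype=float)
    counts = np.asarray(pixel_counts, dtype=float)
    merged = {}
    for day in sorted(set(dates)):
        idx = [i for i, d in enumerate(dates) if d == day]
        merged[day] = float(np.average(et[idx], weights=counts[idx]))
    return merged


# Two overlapping scenes on the first date, one scene on the second.
merged = merge_duplicate_days(
    ["2019-07-01", "2019-07-01", "2019-07-09"],
    [4.0, 5.0, 6.0],
    [49, 21, 49],
)
```

Weighting by pixel count gives more influence to the scene that retained more unmasked pixels within the grid or footprint.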

Ensemble computation

The ensemble mean of the six OpenET models was computed after removing up to two outlier models based on the MAD^23,24, a robust measure of spread that is suitable for small samples. The outlier removal occurs at the pixel level for each ET image generated. To identify outliers for a single scene, the median value is computed first, and the MAD from the median is then computed as

$${{\mathrm{MAD}}}=b\times {{\mathrm{median}}}\left(\left|{X}_{i}-{{\mathrm{median}}}\left(X\,\right)\right|\right),$$

where X_i is the ET value for model i and X is the full set of ET estimates from all six models. Here, b is a scalar set to 1.483, derived on the basis of the assumption of normality of the sample population⁶². This approach is sometimes referred to as the MADe rule, where e = 1.483. The MAD value is typically scaled by 2, 2.5 or 3 on the basis of a subjective assessment of the data, and the scaled value is then used to create a band around the median:

$${{\mathrm{median}}}\left(X\,\right)\pm 2{{\mathrm{MAD}}}.$$

Model estimates that fall outside the band are deemed as outliers, and up to two outliers (those furthest from the median) are removed from the set of model estimates before taking the ensemble mean.

Due to the tendency for some OpenET models to predict zero ET or even negative ET rates in some arid regions during dry periods, we modified the above approach for these scenarios. Specifically, when the ensemble median estimate is zero but at least one model predicts a positive ET rate, the ensemble mean is computed from all model values, including the positive ones, without any prior outlier removal. In these cases, outlier removal would discard the positive model estimates, and although actual ET may be negligible, a zero estimate is not considered to be physically realistic. However, because the majority of models may predict zero in these scenarios, the ensemble mean will still be highly skewed towards zero, making this a conservative measure to prevent zero ensemble estimates.
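The ensemble computation for a single pixel can be sketched as follows. This is an illustration of the MAD rule and the zero-median exception as described in the text, not the operational OpenET code; the function name and arguments are hypothetical, and the exception is interpreted as taking the plain mean of all model values.

```python
import numpy as np


def ensemble_mean(model_et, b=1.483, scale=2.0, max_outliers=2):
    """Mean of model ET values after MAD-based outlier removal.

    Values outside median +/- scale * MAD are outliers; at most
    max_outliers of them (those furthest from the median) are removed.
    """
    x = np.asarray(model_et, dtype=float)
    x = x[~np.isnan(x)]
    med = np.median(x)
    # Zero-median exception: keep every model so positive values are retained.
    if med == 0 and np.any(x > 0):
        return float(x.mean())
    dev = np.abs(x - med)
    mad = b * np.median(dev)
    is_outlier = dev > scale * mad
    by_distance = np.argsort(-dev)  # furthest from the median first
    drop = np.array([i for i in by_distance if is_outlier[i]][:max_outliers], dtype=int)
    keep = np.ones(len(x), dtype=bool)
    keep[drop] = False
    return float(x[keep].mean())


# Five models agree near 100 mm per month and one is far higher.
print(ensemble_mean([100.0, 102.0, 98.0, 101.0, 99.0, 150.0]))  # 100.0
```

In this example the median is 100.5 and the scaled MAD is about 2.22, so the band is roughly 96.1 to 104.9 and only the value of 150 is removed.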

Statistical analyses

Key summary statistics including the least squares linear regression slope forced through the origin (slope) as well as linear regression with an intercept (Supplementary Table 7), MBE, MAE, RMSE and the coefficient of determination (r²) were computed using paired observations between OpenET model ET estimates and post-processed and corrected flux ET estimates¹⁹. Daily accuracy statistics were not compared against any gap-filled station ET data, and monthly statistics only used station ET with 5 or fewer gap-filled days per month. Growing season and annual evaluations used paired monthly data and did not include any periods with monthly gaps. Also, the number of paired observations was always the same among models for all statistical analyses.

All statistics were calculated on a site-by-site basis from paired model–measured ET data using the Python NumPy package version 1.17.2 (ref. ⁶³). For linear regression, the NumPy linalg.lstsq function was used, which applies the least squares approach. We used the modelled ET as the dependent variable and the measured ET as the independent variable.
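The two regression forms can be sketched with `numpy.linalg.lstsq` as follows. The function names are hypothetical, and the sketch only illustrates how a slope forced through the origin differs from a fit with an intercept in terms of the design matrix.

```python
import numpy as np


def slope_through_origin(measured, modelled):
    """Least squares slope with the intercept fixed at zero."""
    x = np.asarray(measured, dtype=float)[:, np.newaxis]  # single-column design matrix
    y = np.asarray(modelled, dtype=float)
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return float(coef[0])


def slope_and_intercept(measured, modelled):
    """Ordinary least squares slope and intercept."""
    x = np.asarray(measured, dtype=float)
    design = np.column_stack([x, np.ones_like(x)])  # second column fits the intercept
    coef, *_ = np.linalg.lstsq(design, np.asarray(modelled, dtype=float), rcond=None)
    return float(coef[0]), float(coef[1])
```

With measured ET as the independent variable, a through-origin slope below 1 indicates that the model tends to underestimate relative to the station data.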

The MBE was calculated as

$${{\mathrm{MBE}}}=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}\left({{P}}_{{i}}-{{O}}_{{i}}\right),$$

where O_i is the observed ET, P_i is the model predicted ET and n is the total number of paired model–measured ET data points.

The MAE was calculated as

$${{\mathrm{MAE}}}=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}\left|{{P}}_{{i}}-{{O}}_{{i}}\right|,$$

and the RMSE was calculated as

$${{\mathrm{RMSE}}}=\sqrt{\mathop{\sum }\limits_{i=1}^{n}\frac{{({{P}}_{{i}}-{{O}}_{{i}})}^{2}}{{n}}}.$$

Here, r² values were calculated as the square of the Pearson correlation coefficient, which was calculated from paired model–measurement ET data using the Python statsmodels package, version 0.12.1 (ref. ⁶⁴).
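The error metrics defined above can be computed in a few lines. The sketch below is illustrative only: the function name is hypothetical, and the Pearson correlation is obtained with `numpy.corrcoef` rather than the statsmodels package named in the text.

```python
import numpy as np


def error_metrics(observed, predicted):
    """MBE, MAE, RMSE and r2 for paired observed and predicted ET."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    diff = p - o  # positive values mean the model overestimates
    r = np.corrcoef(o, p)[0, 1]
    return {
        "MBE": float(diff.mean()),
        "MAE": float(np.abs(diff).mean()),
        "RMSE": float(np.sqrt((diff ** 2).mean())),
        "r2": float(r ** 2),
    }
```

Because MBE keeps the sign of each difference, positive and negative errors can cancel, which is why MAE and RMSE are reported alongside it.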

For grouping statistics by land cover or climate zone we used two methods: (1) for the computation of linear regression and r² all data from each ground observation in a group (for example, monthly paired model–station ET estimates for annual crop stations) were pooled together before computing a single statistic per model; and (2) MBE, MAE and RMSE were computed separately for each ground station, and then a weighted mean was taken. Grouped statistics were weighted by the square root of the number of paired observations per station (n); the rationale is to avoid giving too much weight to stations with excessively long data records while also not giving equal weight to stations with short data records⁶⁵. We also imposed data length requirements for in situ ET stations: to be included in daily grouped mean statistics we required stations to have a minimum of six paired station–model data points, and a minimum of three paired observations for inclusion in monthly grouped mean statistics. We note that Melton et al.⁵ presented similar statistical metrics from a subset of cropland sites used in this study, and in that study, the linear regression slope and r² metrics did incorporate weighting, which we deemed inappropriate or unnecessary in this study. For congruency, the statistics computed in the same manner as in Melton et al.⁵ are provided in Supplementary Table 12.
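The weighting used for grouped MBE, MAE and RMSE can be sketched as follows, assuming that per-station statistics and their paired-observation counts are already available. The function name and the `min_pairs` argument are hypothetical; `min_pairs` stands in for the data length requirement (six for daily and three for monthly statistics).

```python
import numpy as np


def weighted_group_mean(station_stats, n_pairs, min_pairs=3):
    """Mean of per-station statistics weighted by sqrt(n).

    Stations with fewer than min_pairs paired observations are excluded.
    """
    stats = np.asarray(station_stats, dtype=float)
    n = np.asarray(n_pairs, dtype=float)
    keep = n >= min_pairs
    if not keep.any():
        return float("nan")
    return float(np.average(stats[keep], weights=np.sqrt(n[keep])))


# Three stations: MAE of 10, 20 and 100 mm per month with 4, 16 and 2 paired months.
group_mae = weighted_group_mean([10.0, 20.0, 100.0], [4, 16, 2], min_pairs=3)
```

Here the third station is excluded for having too few pairs, and the remaining two receive weights of 2 and 4, so the station with four times the record length counts only twice as much.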

A post hoc Tukey test, also known as the honestly significant difference test, was used to compare multiple mean ET estimates from each model, the ensemble mean, and from the mean of the unclosed and closed flux ET data. The test was applied using all paired data from cropland stations, including for crop subgroups: annual crops, orchards and vineyards, at daily, monthly, growing season and annual timescales. The family-wise error rate was set to 0.05 and the test was performed using the Python statsmodels package, version 0.12.1 (ref. ⁶⁴).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The in situ measured ET data analysed during the current study are available in the Zenodo repository, with identifier https://doi.org/10.5281/zenodo.7636781. The OpenET model ET data analysed during the current study are available in the Zenodo repository, with identifier https://doi.org/10.5281/zenodo.10119477.

Code availability

The code used to post-process eddy flux tower data for the current study is publicly available on GitHub (https://github.com/Open-ET/flux-data-qaqc). The code used to generate flux footprints for the current study is publicly available on GitHub (https://github.com/Open-ET/flux-data-footprint).

References

Fisher, J. B. et al. The future of evapotranspiration: global requirements for ecosystem functioning, carbon and climate feedbacks, agricultural management, and water resources. Water Resour. Res. 53, 2618–2626 (2017).
Dieter, C. A. et al. Estimated use of water in the United States in 2015. Circular 1441 https://pubs.usgs.gov/publication/cir1441 (2018).
Cook, B. I., Ault, T. R. & Smerdon, J. E. Unprecedented 21st century drought risk in the American Southwest and Central Plains. Sci. Adv. 1, e1400082 (2015).
Liu, P.-W. et al. Groundwater depletion in California’s Central Valley accelerates during megadrought. Nat. Commun. 13, 7825 (2022).
Melton, F. S. et al. OpenET: filling a critical data gap in water management for the western United States. J. Am. Water Resour. Assoc. 58, 971–994 (2022).
Chen, J. M. & Liu, J. Evolution of evapotranspiration models using thermal and shortwave remote sensing data. Remote Sens. Environ. 237, 111594 (2020).
Anderson, M. et al. Field-scale assessment of land and water use change over the California Delta using remote sensing. Remote Sens. 10, 889 (2018).
Allen, R. G., Tasumi, M. & Trezza, R. Satellite-based energy balance for mapping evapotranspiration with internalized calibration (METRIC)—Model. J. Irrig. Drain. Eng. 133, 380–394 (2007).
Laipelt, L. et al. Long-term monitoring of evapotranspiration using the SEBAL algorithm and Google Earth Engine cloud computing. ISPRS J. Photogramm. Remote Sens. 178, 81–96 (2021).
Fisher, J. B., Tu, K. P. & Baldocchi, D. D. Global estimates of the land–atmosphere water flux based on monthly AVHRR and ISLSCP-II data, validated at 16 FLUXNET sites. Remote Sens. Environ. 112, 901–919 (2008).
Pereira, L. S. et al. Prediction of crop coefficients from fraction of ground cover and height. Background and validation using ground and remote sensing data. Agric. Water Manag. 241, 106197 (2020).
Melton, F. S. et al. Satellite irrigation management support with the terrestrial observation and prediction system: a framework for integration of satellite and surface observations to support improvements in agricultural water resource management. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 5, 1709–1721 (2012).
Senay, G. B. et al. Improving the operational simplified surface energy balance evapotranspiration model using the forcing and normalizing operation. Remote Sens. 15, 260 (2023).
Gorelick, N. et al. Google Earth Engine: planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 202, 18–27 (2017).
Allen, R. G. et al. Satellite-based energy balance for mapping evapotranspiration with internalized calibration (METRIC)—Applications. J. Irrig. Drain. Eng. 133, 395–406 (2007).
Knipper, K. R. et al. Using high-spatiotemporal thermal satellite ET retrievals for operational water use and stress monitoring in a California vineyard. Remote Sens. 11, 2124 (2019).
Senay, G. B., Friedrichs, M., Singh, R. K. & Velpuri, N. M. Evaluating Landsat 8 evapotranspiration for water use mapping in the Colorado River Basin. Remote Sens. Environ. 185, 171–185 (2016).
Foster, T., Mieno, T. & Brozović, N. Satellite-based monitoring of irrigation water use: assessing measurement errors and their implications for agricultural water management policy. Water Resour. Res. 56, e2020WR028378 (2020).
Volk, J. M. et al. Development of a benchmark eddy flux evapotranspiration dataset for evaluation of satellite-driven evapotranspiration models over the CONUS. Agric. For. Meteorol. 331, 109307 (2023).
Volk, J. M. et al. Post-processed data and graphical tools for a CONUS-wide eddy flux evapotranspiration dataset. Data Brief https://doi.org/10.1016/j.dib.2023.109274 (2023).
Baldocchi, D. Measuring fluxes of trace gases and energy between ecosystems and the atmosphere—the state and future of the eddy covariance method. Glob. Change Biol. 20, 3600–3609 (2014).
Baldocchi, D. et al. FLUXNET: a new tool to study the temporal and spatial variability of ecosystem-scale carbon dioxide, water vapor, and energy flux densities. Bull. Am. Meteorol. Soc. 82, 2415–2434 (2001).
Hampel, F. R. The influence curve and its role in robust estimation. J. Am. Stat. Assoc. 69, 383–393 (1974).
Leys, C., Ley, C., Klein, O., Bernard, P. & Licata, L. Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median. J. Exp. Soc. Psychol. 49, 764–766 (2013).
Thompson, P. D. How to improve accuracy by combining independent forecasts. Mon. Weather Rev. 105, 228–229 (1977).
Kirtman, B. P. et al. The North American multimodel ensemble: phase-1 seasonal-to-interannual prediction; phase-2 toward developing intraseasonal prediction. Bull. Am. Meteorol. Soc. 95, 585–601 (2014).
Bai, Y. et al. On the use of machine learning based ensemble approaches to improve evapotranspiration estimates from croplands across a wide environmental gradient. Agric. For. Meteorol. 298, 108308 (2021).
Novick, K. A. et al. The AmeriFlux network: a coalition of the willing. Agric. For. Meteorol. 249, 444–456 (2018).
Pastorello, G. et al. The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Sci. Data 7, 1–27 (2020).
Mauder, M., Foken, T. & Cuxart, J. Surface-energy-balance closure over land: a review. Bound. Layer Meteorol. 177, 395–426 (2020).
Ingwersen, J., Imukova, K., Högy, P. & Streck, T. On the use of the post-closure methods uncertainty band to evaluate the performance of land surface models against eddy covariance flux data. Biogeosciences 12, 2311–2326 (2015).
Knipper, K. R. et al. Evapotranspiration estimates derived using thermal-based satellite remote sensing and data fusion for irrigation management in California vineyards. Irrig. Sci. 37, 431–449 (2019).
Bambach, N. et al. Evapotranspiration uncertainty at micrometeorological scales: the impact of the eddy covariance energy imbalance and correction methods. Irrig. Sci. 40, 445–461 (2022).
Rubel, F., Brugger, K., Haslinger, K. & Auer, I. The climate of the European Alps: shift of very high resolution Köppen–Geiger climate zones 1800–2100. Meteorol. Z. 26, 115–125 (2017).
Yang, Y. et al. Studying drought-induced forest mortality using high spatiotemporal resolution evapotranspiration data from thermal satellite imaging. Remote Sens. Environ. 265, 112640 (2021).
Isaacson, B. N., Yang, Y., Anderson, M. C., Clark, K. L. & Grabosky, J. C. The effects of forest composition and management on evapotranspiration in the New Jersey pinelands. Agric. For. Meteorol. 339, 109588 (2023).
Qian, Y. et al. Neglecting irrigation contributes to the simulated summertime warm-and-dry bias in the central United States. Npj Clim. Atmos. Sci. 3, 31 (2020).
Lei, F., Crow, W. T., Holmes, T. R., Hain, C. & Anderson, M. C. Global investigation of soil moisture and latent heat flux coupling strength. Water Resour. Res. 54, 8196–8215 (2018).
Dong, J., Lei, F. & Crow, W. T. Land transpiration–evaporation partitioning errors responsible for modeled summertime warm bias in the central United States. Nat. Commun. 13, 336 (2022).
Abolafia-Rosenzweig, R., Pan, M., Zeng, J. & Livneh, B. Remotely sensed ensembles of the terrestrial water budget over major global river basins: an assessment of three closure techniques. Remote Sens. Environ. 252, 112191 (2021).
Wang, Q. et al. Land surface models significantly underestimate the impact of land-use changes on global evapotranspiration. Environ. Res. Lett. 16, 124047 (2021).
Allen, R. G., Pereira, L. S., Howell, T. A. & Jensen, M. E. Evapotranspiration information reporting: I. Factors governing measurement accuracy. Agric. Water Manag. 98, 899–920 (2011).
Adu, M. O., Yawson, D. O., Armah, F. A., Asare, P. A. & Frimpong, K. A. Meta-analysis of crop yields of full, deficit, and partial root-zone drying irrigation. Agric. Water Manag. 197, 79–90 (2018).
Xue, J. et al. Improving the spatiotemporal resolution of remotely sensed ET information for water management through Landsat, Sentinel-2, ECOSTRESS and VIIRS data fusion. Irrig. Sci. 40, 609–634 (2022).
Gao, F. & Zhang, X. Mapping crop phenology in near real-time using satellite remote sensing: challenges and opportunities. J. Remote Sens. 2021, 8379391 (2021).
Müller, M. Dynamic time warping. in Information Retrieval for Music and Motion. 69–84 (Springer, 2007).
Bambach, N. et al. The Tree-crop Remote sensing of Evapotranspiration eXperiment (T-REX): a science-based path for sustainable water management and climate mitigation. Bull. Am. Meteorol. Soc. In the press (2023).
Fisher, J. B. Hydrosat: towards daily, field-scale, global evapotranspiration from space. (2022).
Polhamus, A., Fisher, J. B. & Tu, K. P. What controls the error structure in evapotranspiration models? Agric. For. Meteorol. 169, 12–24 (2013).
Blankenau, P. A., Kilic, A. & Allen, R. An evaluation of gridded weather data sets for the purpose of estimating reference evapotranspiration in the United States. Agric. Water Manag. 242, 106376 (2020).
Doherty, C. T. et al. Effects of meteorological and land surface modeling uncertainty on errors in winegrape ET calculated with SIMS. Irrig. Sci. 40, 515–530 (2022).
Purdy, A., Fisher, J., Goulden, M. & Famiglietti, J. Ground heat flux: an analytical review of 6 models evaluated at 88 sites and globally. J. Geophys. Res. Biogeosci. 121, 3045–3059 (2016).
Allen, R. G. et al. A recommendation on standardized surface resistance for hourly calculation of reference ETo by the FAO56 Penman-Monteith method. Agric. Water Manag. 81, 1–22 (2006).
Jung, M. et al. The FLUXCOM ensemble of global land–atmosphere energy fluxes. Sci. Data 6, 74 (2019).
Reitz, M., Senay, G. B. & Sanford, W. E. Combining remote sensing and water-balance evapotranspiration estimates for the conterminous United States. Remote Sens. 9, 1181 (2017).
Volk, J. et al. flux-data-qaqc: a Python package for energy balance closure and post-processing of eddy flux. Data. 6, 1–5 (2021).
Evett, S. R. et al. The Bushland weighing lysimeters: a quarter century of crop ET investigations to advance sustainable irrigation. Trans. ASABE 59, 163–179 (2016).
Abatzoglou, J. T. Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol. 33, 121–131 (2013).
Kljun, N., Calanca, P., Rotach, M. W. & Schmid, H. P. A simple two-dimensional parameterisation for Flux Footprint Prediction (FFP). Geosci. Model Dev. 8, 3695–3713 (2015).
Xia, Y. et al. Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products. J. Geophys. Res. Atmos. 117, D03109 (2012).
Foga, S. et al. Cloud detection algorithm comparison and validation for operational Landsat data products. Remote Sens. Environ. 194, 379–390 (2017).
Rousseeuw, P. J. & Croux, C. Alternatives to the median absolute deviation. J. Am. Stat. Assoc. 88, 1273–1283 (1993).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Seabold, S. & Perktold, J. Statsmodels: econometric and statistical modeling with Python. In Proc. 9th Python in Science Conference vol. 57 10–25080 (SciPy, 2010).
Obrecht, N. A. Sample size weighting follows a curvilinear function. J. Exp. Psychol. Learn. Mem. Cogn. 45, 614 (2019).

Acknowledgements

Author information

Authors and Affiliations

Desert Research Institute, Reno, NV, USA
John M. Volk, Justin L. Huntington, Blake Minor, Charles Morton, Thomas Ott, Christian Dunkerly & Christopher Pearson
NASA Ames Research Center, Moffett Field, CA, USA
Forrest S. Melton, Lee Johnson, Will Carrara, Conor T. Doherty, Alberto Guzman & Adam Purdy
California State University Monterey Bay, Seaside, CA, USA
Forrest S. Melton, Lee Johnson, Will Carrara, Alberto Guzman & Adam Purdy
University of Idaho, Kimberly, ID, USA
Richard Allen
USDA Agricultural Research Service, Beltsville, MD, USA
Martha Anderson
University of California, Los Angeles, Los Angeles, CA, USA
Joshua B. Fisher
University of Nebraska-Lincoln, Lincoln, NE, USA
Ayse Kilic, Samuel Ortega-Salazar & Peter ReVelle
Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
Anderson Ruhoff, Bruno Comini de Andrade & Leonardo Laipelt
US Geological Survey Earth Resources Observation and Science Center, North Central Climate Adaptation Science Center, Fort Collins, CO, USA
Gabriel B. Senay
KBR, Inc. under contract to the US Geological Survey Earth Resources Observation and Science Center, Sioux Falls, SD, USA
MacKenzie Friedrichs
NASA Marshall Space Flight Center, Huntsville, AL, USA
Christopher Hain
NASA Jet Propulsion Lab, Pasadena, CA, USA
Gregory Halverson
University of California, Berkeley, Berkeley, CA, USA
Yanghui Kang & Tianxin Wang
USDA Agricultural Research Service, Davis, CA, USA
Kyle Knipper
Innovate!, Inc. under contract to the US Geological Survey Earth Resources Observation and Science Center, Sioux Falls, SD, USA
Gabriel E. L. Parrish
Mississippi State University, Starkville, MS, USA
Yun Yang

Contributions

F.S.M., J.L.H., J.M.V., R.A., M.A., J.B.F., A.K., A.R., G.B.S. and C.P. designed and guided the study; J.M.V., F.S.M., M.A. and L.J. wrote the main text; J.M.V. performed statistical analyses; C.M., J.M.V., B.M., T.O., C.D. and T.W. prepared measured data or model input data, or ran models; F.S.M., R.A., M.A., J.B.F., A.K., A.R., G.B.S., J.L.H., C.M., W.C., C.T.D., M.F., A.G., C.H., G.H., L.J., Y.K., K.K., S.O.-S., G.E.L.P., A.P., P.R., Y.Y., L.L. and B.C.d.A. developed models and OpenET infrastructure; J.M.V., M.A., F.S.M., L.J., R.A., J.B.F., J.L.H., A.K., G.B.S., T.O., B.M., A.R., M.F. and T.W. reviewed and edited text and figures.

Corresponding author

Correspondence to John M. Volk.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Water thanks Tilden Meyers, Dennis Baldocchi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Monthly climatology of paired modeled and observed ET for evergreen forest sites.

Subplot (a) shows monthly climatology of paired OpenET⁵ and flux tower ET^19,20 from evergreen forested sites. Subplot (b) shows the residual of monthly mean ET (model minus mean closed flux ET). Unclosed and closed labels refer to flux tower ET before and after energy balance closure correction. Dashed lines represent the closed flux ET mean plus two standard errors of the mean and unclosed flux ET mean minus two standard errors of the mean.

Extended Data Fig. 2 Monthly climatology of paired modeled and observed ET for mixed forest sites.

Subplot (a) shows monthly climatology of paired OpenET⁵ and flux tower ET^19,20 from mixed forested sites. Subplot (b) shows the residual of monthly mean ET (model minus mean closed flux ET). Unclosed and closed labels refer to flux tower ET before and after energy balance closure correction. Dashed lines represent the closed flux ET mean plus two standard errors of the mean and unclosed flux ET mean minus two standard errors of the mean.

Extended Data Fig. 3 Monthly climatology of paired modeled and observed ET for grassland sites.

Subplot (a) shows monthly climatology of paired OpenET⁵ and flux tower ET^19,20 from grassland sites. Subplot (b) shows the residual of monthly mean ET (model minus mean closed flux ET). Unclosed and closed labels refer to flux tower ET before and after energy balance closure correction. Dashed lines represent the closed flux ET mean plus two standard errors of the mean and unclosed flux ET mean minus two standard errors of the mean.

Extended Data Fig. 4 Monthly climatology of paired modeled and observed ET for shrubland sites.

Subplot (a) shows monthly climatology of paired OpenET⁵ and flux tower ET^19,20 from shrubland sites. Subplot (b) shows the residual of monthly mean ET (model minus mean closed flux ET). Unclosed and closed labels refer to flux tower ET before and after energy balance closure correction. Dashed lines represent the closed flux ET mean plus two standard errors of the mean and unclosed flux ET mean minus two standard errors of the mean.

Extended Data Fig. 5 Monthly climatology of paired modeled and observed ET for wetland and riparian sites.

Subplot (a) shows monthly climatology of paired OpenET⁵ and flux tower ET^19,20 from wetland and riparian sites. Subplot (b) shows the residual of monthly mean ET (model minus mean closed flux ET). Unclosed and closed labels refer to flux tower ET before and after energy balance closure correction. Dashed lines represent the closed flux ET mean plus two standard errors of the mean and unclosed flux ET mean minus two standard errors of the mean.
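The quantities plotted in Extended Data Figs. 1–5 can be illustrated with a minimal sketch. It assumes paired monthly records of the form (calendar month, model ET, closed flux ET, unclosed flux ET) in mm per month; the function name `monthly_climatology` and the record layout are hypothetical and are not taken from the study's code.

```python
from collections import defaultdict
from math import sqrt
from statistics import fmean, stdev


def monthly_climatology(records):
    """Summarize paired monthly ET by calendar month.

    records: iterable of (month, model_et, closed_et, unclosed_et),
    all ET values in mm per month (hypothetical layout).
    """
    by_month = defaultdict(list)
    for month, model, closed, unclosed in records:
        by_month[month].append((model, closed, unclosed))

    summary = {}
    for month, rows in sorted(by_month.items()):
        model, closed, unclosed = (list(col) for col in zip(*rows))
        n = len(rows)
        # Standard error of the mean; undefined (NaN) for a single pair.
        sem_closed = stdev(closed) / sqrt(n) if n > 1 else float("nan")
        sem_unclosed = stdev(unclosed) / sqrt(n) if n > 1 else float("nan")
        summary[month] = {
            "model_mean": fmean(model),
            # Subplot (b): model minus mean closed flux ET.
            "residual": fmean(model) - fmean(closed),
            # Dashed lines: closed mean + 2 SEM and unclosed mean - 2 SEM.
            "upper": fmean(closed) + 2 * sem_closed,
            "lower": fmean(unclosed) - 2 * sem_unclosed,
        }
    return summary


# Illustrative values only, not data from the study.
july = [(7, 110, 120, 100), (7, 130, 140, 110), (7, 120, 130, 105)]
print(monthly_climatology(july)[7]["residual"])  # -10.0
```

A negative residual indicates that the model mean falls below the energy-balance-closed flux tower mean for that month.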

Extended Data Fig. 6 Monthly climatology of modeled ET using all cropland pixels.

Monthly climatology of OpenET⁵ ensemble members and the ensemble mean using all monthly ET data for all pixels that were classified as croplands for each year from 2016–2022.

Extended Data Fig. 7 Spatial analysis of model ensemble outlier occurrence in cropland pixels.

Subplot (a) shows the spatial differences between the OpenET⁵ ensemble mean growing season (April through October) ET for cropland pixels using the median absolute deviation (MAD) outlier removal approach and the simple arithmetic mean (SAM); monthly ET from 2016–2022 was used to build the map. Subplot (b) shows the average count of models used in the ensemble after outlier removal using all growing season monthly data for cropland pixels. A value of six indicates that no model was identified as an outlier, while four is the lower limit where a maximum of two models were removed as outliers before taking the ensemble mean.
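The outlier removal described in this caption can be sketched for a single pixel and month as follows. The sketch assumes that a model is flagged when it lies more than a chosen number of MADs from the ensemble median; the threshold `n_mads=2.0`, the function name `mad_ensemble_mean` and the tie-handling are assumptions for illustration and may differ from the OpenET implementation. The lower limit of four retained models follows the caption.

```python
from statistics import fmean, median


def mad_ensemble_mean(values, n_mads=2.0, min_models=4):
    """Return (ensemble mean, models used) after MAD-based outlier removal.

    n_mads is an assumed threshold; min_models=4 mirrors the caption's
    lower limit for a six-model ensemble.
    """
    center = median(values)
    abs_dev = [abs(v - center) for v in values]
    mad = median(abs_dev)

    # Flag models farther than n_mads * MAD from the median,
    # ordered from most to least deviant.
    flagged = sorted(
        (i for i, d in enumerate(abs_dev) if d > n_mads * mad),
        key=lambda i: abs_dev[i],
        reverse=True,
    )
    # Never drop so many models that fewer than min_models remain.
    max_drop = max(len(values) - min_models, 0)
    dropped = set(flagged[:max_drop])

    kept = [v for i, v in enumerate(values) if i not in dropped]
    return fmean(kept), len(kept)


# Illustrative monthly ET (mm) from six models, not data from the study.
print(mad_ensemble_mean([100, 102, 98, 101, 99, 150]))  # (100.0, 5)
```

With six models, the count returned here corresponds to the quantity mapped in subplot (b): six when no model is flagged, and no fewer than four otherwise. The simple arithmetic mean (SAM) used for comparison in subplot (a) is just `fmean(values)` with no models removed.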

Extended Data Fig. 8 Spatial difference between mean growing season ET for each model from the ensemble value in cropland pixels.

Difference between mean growing season (April through October) ET from each OpenET⁵ model minus the ensemble mean using all monthly data from all pixels that were classified as croplands for each year from 2016–2022. See Supplementary Discussion 4 for a discussion of the Landsat striping exhibited by geeSEBAL.

Supplementary information

Supplementary Information

Supplementary Tables 1–12, Figs. 1–9 and Discussions 1–4.

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Volk, J.M., Huntington, J.L., Melton, F.S. et al. Assessing the accuracy of OpenET satellite-based evapotranspiration data to support water resource and land management applications. Nat Water 2, 193–205 (2024). https://doi.org/10.1038/s44221-023-00181-7

Received: 21 June 2023
Accepted: 30 November 2023
Published: 15 January 2024
Issue Date: February 2024
DOI: https://doi.org/10.1038/s44221-023-00181-7
