Improved dryland carbon flux predictions with explicit consideration of water-carbon coupling

Dryland ecosystems are dominant influences on both the trend and interannual variability of the terrestrial carbon sink. Despite their importance, dryland carbon dynamics are not well-characterized by current models. Here, we present DryFlux, an upscaled product built on a dense network of eddy covariance sites in the North American Southwest. To estimate dryland gross primary productivity, we fuse in situ fluxes with remote sensing and meteorological observations using machine learning. DryFlux explicitly accounts for intra-annual variation in water availability, and accurately predicts interannual and seasonal variability in carbon uptake. Applying DryFlux globally indicates existing products may underestimate impacts of large-scale climate patterns on the interannual variability of dryland carbon uptake. We anticipate DryFlux will be an improved benchmark for earth system models in drylands, and prompt a more sensitive accounting of water limitation on the carbon cycle. Upscaling in situ carbon flux measurements using remotely sensed and meteorological observations in a machine learning algorithm leads to improved estimates of average uptake, and interannual variability in global drylands.

D ryland (arid and semi-arid) systems have a dominant influence on both the trend and interannual variability in the global terrestrial carbon sink 1,2 , yet land surface and remote sensing models of primary production perform poorly in these regions [3][4][5] . Dryland ecosystems are distributed throughout the world and occupy 40% of global land area 6 . During the early development of Land Surface Models, model tests were more common in temperate and tropical ecosystems than in arid regions (e.g., 7,8 and many early Land Surface Models did not distinguish critical dryland plant functional types [9][10][11][12][13][14][15] . Additionally, dryland ecosystems are often poorly represented in datasets used to drive and calibrate remote sensing-based models of primary production 16 . Accurate representation of dryland carbon dynamics in global-scale process-and remote sensing-based models could improve the accuracy of terrestrial carbon sink estimates and advance our understanding of the global carbon cycle.
Several key features of dryland ecohydrology, phenology, and biogeography make it particularly challenging to predict carbon dynamics in these systems. Drylands are highly sensitive to variations in water availability 17 , and persistent water limitation in these systems has resulted in physiological adaptations that lead to tight coupling of biogeochemical and water cycles 6,[18][19][20] . Interannual variability in dryland precipitation can exceed 50% of mean annual precipitation, compared to 5-10% in more mesic systems, resulting in high interannual variability in dryland carbon uptake 21 . Strong sensitivity to highly variable hydroclimatic conditions can also manifest in 'flashy' ecosystem responses: rapid carbon uptake and growth in response to precipitation pulses 22,23 . Flashy responses to moisture inputs propagate at longer timescales in drylands, relative to more mesic systems, and strongly influence annual carbon uptake [23][24][25] . Drought, defined here as moisture stress that impacts ecosystem functioning 26 , is a particularly impactful and increasingly prevalent water availability condition in drylands 27 . Drylands also have high degrees of spatial heterogeneity, with diverse ecosystem types and moisture regimes contained within a single model pixel, which further complicates modeling efforts 13,14 . Although existing models are designed to represent patterns of the global carbon cycle, they are not attuned to these key features of drylands. Dryland carbon uptake strongly influences the variability of the global terrestrial carbon sink 1 and the magnitude of interannual variability in dryland carbon uptake is likely underestimated by existing models of productivity 19 .
Here we used satellite and gridded climate observations, measurements from eddy covariance towers, and machine learning to predict spatial and temporal patterns in plant carbon uptake via photosynthesis (gross primary productivity; GPP). We upscaled carbon flux observations using remotely sensed and gridded meteorological inputs to produce spatially and temporally continuous high-resolution estimates of GPP [28][29][30] . Existing continental and global upscaled products, including the FLUXCOM 31 and FluxSat 32 products, were designed for continental-scale and global analysis and are generally based on remotely sensed estimates of photosynthetically active vegetation, such as the fraction of absorbed photosynthetically active radiation (fAPAR) and vegetation indices like the normalized difference vegetation index (NDVI). The global Moderate Resolution Imaging Spectroradiometer (MODIS) GPP products are based on satellite data and include biome-specific light use efficiency values and atmospheric demand scalars, but still rely heavily on measures of satellite greenness. These models do not include important ecohydrological controls over carbon dynamics in drylands, such as explicit representation of water limitation or antecedent moisture conditions. To accurately project changes in the global carbon cycle, we need models that sufficiently represent seasonal and interannual variability in dryland carbon dynamics.
The southwest region of North America is an exemplary location to develop a model tuned to drought-carbon dynamics because of its high spatial heterogeneity, including broad biogeographic regions (California Mediterranean, Intermountain, and Mojave, Sonoran, and Chihuahuan deserts); complex topography and associated ecological transitions (grasslands, shrublands, and forests); and a large degree of variation in precipitation and climate regimes (i.e., a gradient from winter-to summerdominated precipitation regimes and mean annual precipitation range from 100 to 700 mm). In this study, we evaluated the heterogeneous, region-wide sensitivity of carbon uptake to climate in Southwestern dryland systems. Since drought is a prevalent mode of water availability in drylands that is projected to increase in frequency and intensity in the future, we explicitly accounted for drought and other moisture anomalies in our approach. We evaluated the heterogeneous, region-wide sensitivity of carbon uptake to climate and used this sensitivity to estimate spatial and temporal responses of carbon uptake in dryland ecosystems to global patterns of moisture anomalies. We used a network of 24 dryland eddy covariance sites in the Southwest (southwest United States and northwest Mexico) representing diverse dryland ecosystems and climate spaces 19 (Supplementary Fig. 1) to develop a machine learning model that predicted GPP.
Our product, DryFlux, is specifically tuned to key ecohydrological characteristics of dryland systems 33 . Our approach focused on accounting for the tight ecohydrological coupling between water and carbon cycles in drylands to represent the seasonal and interannual variability in carbon uptake in these systems. A crucial difference between our upscaled product and other remote sensing-driven upscaled flux products is the explicit consideration of the impact of antecedent moisture conditions (through the inclusion of previous months' precipitation and the Standardized Precipitation Evapotranspiration Index (SPEI) at various temporal windows as predictors) on GPP. Our approach consisted of two main components: first we derived relationships between climate and vegetation predictors with GPP from flux towers using the random forest machine learning algorithm, and second, we applied the trained model to remotely sensed data inputs to generate spatially and temporally continuous carbon uptake estimates from 2000 to 2016 at 0.5°spatial resolution. We compared our DryFlux product to a machine learning upscaled project, FLUXCOM 31 , and the MODIS GPP data product (MOD17A2H v006), which are widely used to evaluate terrestrial primary production in Land Surface Models 34,35 .

Results and discussion
DryFlux more accurately characterized inter-and intra-annual variation in dryland GPP than both FLUXCOM and MODIS GPP (Fig. 1). Representation of interannual variability in DryFlux GPP across Southwest sites exceeded an R 2 of 0.9 in 18 out of 24 sites, compared to 1 out of 24 sites in the FLUXCOM GPP estimates (Fig. 1a, b and Supplementary Table 1). The relationship between modeled (DryFlux) and observed (tower) monthly GPP for 2000-2015 with DryFlux had an R 2 of 0.88 across all sites, compared to R 2 = 0.43 for FLUXCOM and R 2 = 0.41 for MODIS GPP (Fig. 1c). DryFlux predicted large-magnitude GPP months better than MODIS GPP or FLUXCOM, maintaining accuracy at high levels of GPP based on the slope of the linear regression between modeled and observed GPP ( Fig. 1c; m = 0.88). The FLUXCOM and MODIS GPP products consistently underestimated months with high GPP values ( Fig. 1c; . DryFlux, which explicitly incorporates ecohydrological water-carbon coupling, showed improved estimates of interannual variability and seasonal cycling. DryFlux accurately reproduced the dynamic seasonal cycle of GPP in sites representing the dominant vegetation cover types typical of dryland regions including: semi-arid forests, shrublands, and grasslands better than either FLUXCOM or MODIS GPP (Fig. 1d-f). DryFlux better captured seasonal variation in dryland fluxes than FLUXCOM or MODIS at 23 of 24 Southwest sites based on the correlation coefficient between modeled and observed values ( Fig. 1 and Supplementary Fig. 2). DryFlux also better captured within-season variation in dryland fluxes than FLUXCOM or MODIS (at 22 of 24 and 24 of 24 Southwest sites, respectively; Supplementary Fig. 2). Characterization of the seasonal cycle of GPP captured the flashy seasonal dynamics of GPP in grasslands, shrublands, and forests related to intra-annual changes in water supply with both the preceding month's precipitation and the current month's potential evapotranspiration (PET) emerging as important predictors in the random forest models ( Supplementary Fig. 3). Accurate representation of land cover, particularly in highly heterogeneous dryland regions like the Southwest, is important for generating estimates of carbon fluxes. Due to challenges in spatial resolution and classification accuracy, land cover maps can be a major source of uncertainty in regional-scale upscaled flux estimates 36 . In DryFlux, we chose to include mean annual precipitation. (MAP), mean annual temperature (MAT), elevation, and vegetation indices in lieu of a land cover classification. Both MAP and vegetation indices like NDVI were important predictors ( Supplementary Fig. 3), implying these variables captured spatial heterogeneity in vegetation in the Southwest. DryFlux captures the typical bimodality in Southwest ecosystems-the summer peak driven by monsoon rains in grasslands (Fig. 1d), shrublands (Fig. 1e), and forests (Fig. 1f), and the springtime GPP peak driven by snowmelt in high-elevation forests ( Fig. 1f and Supplementary Fig. 4). While snowmelt was not a predictor in our model, the model's strong performance at high-elevation forest sites implies other predictor variables provided information related to the onset of springtime GPP.
Although improved model performance in the North American Southwest is expected (since our model was specifically trained in this region), investigation of model performance metrics revealed the importance of ecohydrological coupling in drylands. DryFlux accurately captured the large degree of interannual GPP variability in Southwestern drylands (Fig. 2). There was a large degree of variation in observed GPP values across all sites with the largest amount of variation observed in forests during the summer months (σ = 1.19) and savanna/shrublands having the least amount of variation (σ = 0.94; Fig. 2a). DryFlux represented more variability than the MODIS or FLUXCOM GPP data products in the summer months for sites included in the present analysis (Fig. 2a). DryFlux also captured interannual variability in GPP more accurately than MODIS or FLUXCOM data products ( Fig. 2b and Supplementary Table 1). Since our model captured interannual variability well, when we evaluated the differences between GPP predictions from the years following strong El Niño (2015) and La Niña (2011) events, total DryFlux GPP predictions encompassed a larger range of values than FLUXCOM predictions (Fig. 2c, d).
To assess the implications of our more realistic climatic control of dryland carbon uptake beyond the Southwest, we applied DryFlux to dryland regions globally at a 0.5°resolution (Fig. 3a,  b). The El Niño Southern Oscillation, triggers global climate teleconnections that result in both positive and negative departures from normal rainfall patterns 37 . To assess the impacts of these variable moisture conditions on carbon uptake variability in dryland systems beyond the Southwest, we selected the strongest El Niño year (2015-2016) and strongest La Niña year (2010-2011) in the MODIS data record based on the Oceanic Niño Index 38 . We compared annual carbon uptake per-pixel between 2011 and 2015 to assess the impacts of drought on carbon uptake (Fig. 3). The 2010-2011 La Niña, which was the strongest in the past eight decades 39 , led to strong carbon uptake in Australian semi-arid systems (Fig. 3a) that explained most of the exceptionally large global carbon sink in 2011 2 . Carbon uptake in the North American Southwest was spatially variableportions of Texas and northeastern Mexico had abnormally low GPP and western regions (i.e., Nevada, Oregon) had anomalously high GPP in 2011 (Fig. 3a). During the strong El Niño year in 2015, these trends were reversed (but weaker) in Southwestern North America (Fig. 3b), as much of the West had low GPP and the portions of Texas and northeastern Mexico had high GPP (based on per-pixel Z-scores). The El Niño event in 2015-2016 led to severe drought in much of Australia 40 . To test the ability of DryFlux to capture strong interannual variability related to global weather-producing phenomena, we calculated the per-pixel difference in GPP between 2011 and 2015 (Fig. 3c, d). Overall, both FLUXCOM and DryFlux showed an increase in carbon uptake over Australia in the La Niña year compared to the El Niño year (Fig. 3c, d), but, importantly, DryFlux showed a larger (38.6%) reduction in carbon uptake in the El Niño year compared to the La Niña year, compared to only a 11.8% reduction estimated by FLUXCOM (Fig. 3c, d). Together, these results suggest that models not tuned to dryland dynamics could underestimate interannual variability in dryland carbon fluxes. Supporting this idea, we find that in the Southwest, FLUXCOM and MODIS GPP uniformly overestimate carbon uptake in semi-arid grasslands and shrublands, and underestimate carbon uptake in semi-arid forests ( Supplementary Fig. 4 and Supplementary Fig. 5). We chose to compare spatial patterns in DryFlux to FLUXCOM instead of MODIS GPP in the Southwest and Australia (Figs. 2c, d and 3) for two reasons: first, we anticipate the two upscaled products will have similar applications and user bases and thus the comparison is more relevant to the research community, and second, FLUXCOM generally outperformed MODIS GPP in drylands (Supplementary Tables 1 and 2), so the comparison is more informative of DryFlux performance.
Our validation of globally upscaled DryFlux using eddy covariance sites with contrasting phenology in Africa, Australia, Europe, and South America (Supplementary Table 3) showed comparable or better performance than MODIS and FLUXCOM at the majority of global dryland sites (Supplementary Fig. 6 and Supplementary Table 2). The recent FluxSat 32 product was not compared with DryFlux, but will be included in future analyses. Like FLUXCOM, FluxSat was designed for global analyses, included few sites from the US Southwest in model building (four), and had poor performance at some dryland sites 32 . While eddy covariance data is not widely available throughout global drylands, our preliminary validation shows that DryFlux is likely broadly applicable beyond the Southwest. Sites where DryFlux performed particularly poorly were Australian sites with high measurement variability (AU-Lox) and savanna sites with phenology novel to the model (AU-Dry, AU-GWW, AU-DaS; Supplementary Fig. 6 and Supplementary Table 5). Future iterations of DryFlux should: incorporate sites with vegetation types poorly represented in the Southwest training dataset (savanna, deciduous broadleaf forest) in model building; pinpoint dryland regions where additional flux data is needed; and comprehensively validate DryFlux performance at global dryland sites. Furthermore, the potential for additional ecohydrological variables to inform GPP estimates, including soil moisture and actual evapotranspiration, should be explored. We anticipate the framework we developed will be readily extensible beyond Southwestern North America as dryland flux measurements become more available across the globe.
To establish that ecohydrological coupling was important for the improvements we saw in DryFlux, we built and validated a version of the DryFlux model without ecohydrological variables. The ecohydrological variables had the largest impact on predictions of interannual variation in fluxes driven by year-to-year variation in weather patterns in both the Southwest and Australia (Supplementary Figs. 7 and 8). Without ecohydrological variables, the model underestimated interannual variability in summer GPP at Southwest sites compared to the full DryFlux model ( Supplementary Fig. 7a, b). The model without ecohydrological variables applied to the full Southwest region had muted responses to differences in carbon uptake between a strong La Nina (2011) and strong El Nino (2015) years ( Supplementary  Fig. 7c, d). This trend was also shown in Australia dryland regions -when ecohydrological variables were excluded from DryFlux, the difference in GPP uptake between 2011 and 2015 decreased ( Supplementary Fig. 8). These results suggest that DryFlux's sensitivity to moisture conditions results in GPP estimates that are more responsive to interannual variability in weather patterns than existing models. Overall, water-carbon coupling appears particularly important for capturing interannual variability in dryland fluxes.
DryFlux more accurately represented other key features of dryland ecosystem dynamics including the characteristic dual peak in SW forests driven by springtime snowmelt and summer rains (Fig. 1f) and rapid carbon uptake in flashy response to moisture inputs in grasslands (as reflected in the root mean squared error (RMSE) and the root mean square of successive differences (RMSSD), a measure of variability in a time series in Fig. 1d-f) than the alternate models analyzed 41,42 . Furthermore, we consistently saw that months with abnormally high precipitation were often followed by months with abnormally high GPP ( Supplementary  Fig. 9). When applied to the global scale, the DryFlux model appeared more sensitive to the impact of moisture conditions on regional carbon cycling. Accounting for these dryland dynamics could improve global predictions of carbon uptake in earth system models. Our work underscores the tight ecohydrological coupling between water and carbon dynamics in water-limited ecosystem, and highlights the need to accurately represent these processes in models of terrestrial carbon uptake.
Drought impacts on the carbon cycle extend beyond dryland ecosystems 29,43 . Future climate change is likely to increase drought frequency and intensity in many regions globally 44 . Temperature rises and associated increases in atmospheric vaporpressure deficit 45,46 are likely to cause decreases in carbon uptake in ecosystems like forests that are not currently water limited 47 , potentially reducing the strength of the terrestrial carbon sink 48 . Drought duration, intensity, and frequency are expected to increase in dryland regions, making these systems especially vulnerable to climate change 49,50 . Already, the early 21st century has brought prolonged drought, warm temperatures, and extreme rainfall events to drylands including the Southwest 51,52 and Australia 53 . Models that account for drought impacts on the carbon cycle are crucial for predicting and understanding the global carbon cycle. Furthermore, accounting for drought impacts on carbon dynamics will become more important as temperatures increase and precipitation patterns change.
Overall, this study highlights the crucial need to better represent coupled water and carbon dynamics in dryland ecosystem models. Based on the comparison of our GPP product with a global model similar to those routinely used to benchmark Earth System models 54 , we suggest that dryland-driven interannual variability in the global carbon cycle may be underestimated by existing models that represent mainly vegetation greenness and therefore do not adequately account for the ecohydrological effects of annual and sub-annual moisture dynamics on vegetation productivity. Our DryFlux model indicated greater interannual variability than FLUXCOM or MODIS GPP, and was more highly correlated to observed interannual variability in eddy covariance data. Our product improves on comparable products tested because it is informed by a dense network of flux sites across varied dryland ecosystems, accounts for flashy ecosystem dynamics, and accounts for tight ecohydrological coupling between carbon and water cycling dynamics in drylands.

Methods
Carbon fluxes were upscaled from 24 eddy covariance sites across the North American Southwest and Northwestern Mexico (Supplementary Table 4) using remote sensing and gridded meteorological inputs using a machine learning (random forest) approach 55 . We used all available sites in each year, which are detailed in Supplementary Table 5. There were two steps to the upscaling process: first, relationships between predictor variables and monthly eddy covariance from flux towers were derived using random forest models. Second, the random forest models were applied to per-pixel gridded inputs to create spatially and temporally continuous GPP estimates from 2000 to 2016. All analyses were conducted in the R language (R version 4.0.2) and environment for statistical computing 56 .
Data acquisition. Relationships between predictor variables and monthly fluxes from eddy covariance towers were quantified using random forest models using the R 'caret' package 57 . Data inputs included 0.05°16-day Enhanced Vegetation Index (EVI) and NDVI data products from MODIS (MOD13C1v006) downloaded from the date closest to the 15th of each month for 2000-2016 58 . Elevation was acquired from the Shuttle Radar Topography Mission using the 'getData' function in the 'raster' package 59 . Precipitation, PET, vapor pressure, daily mean temperature, and monthly average daily maximum and minimum temperature (Tmax, Tmin) at 0.5°s patial resolution were downloaded from the Climatic Research Unit (CRU) 60,61 . For a given month, the previous month's precipitation from CRU was also included as a predictor. Day length was determined for the 15th or 16th of each month (corresponding to CRU dates) using the 'daylength' function in the 'geosphere' package and site latitude coordinates 62 . MAT and MAP were downloaded in 30 arc seconds from WorldClim using the 'getData' function in the 'raster' package 59,63 . All data products were aggregated to 0.5°spatial resolution using bilinear interpolation, and aligned to the same projection and extent using the 'projectRaster' function from the raster package in R.
To evaluate changing water regime effects on dryland system productivity, we used the SPEI. This multiscalar drought index is a good predictor of change in ecological responses to drought in drylands 64,65 . It accounts for the impacts of both supply-and demand-side limitations to carbon uptake (i.e., soil moisture and atmospheric vapor-pressure deficit) and also can be calculated to assess both intraand interannual water deficits 66 . SPEI was calculated from data using the 'SPEI' package 67 and included as a predictor from 1-month to 12-month timescales 68,69 . Time series of SPEI were calculated from time series of monthly precipitation and PET values using the function 'spei'.
Gap-filled eddy covariance data for the North American Southwest sites were acquired from site PIs (see Supplementary Table 4  Site-based random forest analysis, validation, and upscaling. The random forest model was trained using a random subset of 80% of the sites (19 sites; n = 1540 monthly observations), with 20% of the sites (n = 5; 366 monthly observations) held out for model testing 74 . The training method was repeated three times with 5-fold cross-validation. The minimum RMSE was used to select the optimal number of variables selected as candidates at each split (mtry) such that mtry = 10. The default number of trees (Ntree = 500) was used in model training. Variable selection was based on several factors intended to maximize both parsimony and model precision and accuracy. To avoid bias in importance metrics when there are highly correlated predictor variables, we assessed variable importance with conditional permutation importance metrics with the 'varimp (conditional = TRUE)' function in the 'party' package [75][76][77] . Importance metrics are measured as a drop in model accuracy when a specific variable is excluded from the model (the more model accuracy drops by excluding a variable, the more important that variable is in the prediction). If two predictor variables are highly correlated in nonconditional importance metrics, removal of one variable would not result in a large decrease in model accuracy and importance metrics for these variables could be underestimated. In contrast, conditional permutation metrics considers correlation between variables when assessing and provides more accurate importance metrics 76 . Variables and variable importance for the final random forest model are shown in Supplementary Fig. 3. Z-scores were calculated to evaluate anomalous GPP estimates and precipitation values. Z-scores were calculated for each year according to the following equation: such that the mean of all months for year 'i' is subtracted from mean precipitation ( Supplementary Fig. 9) or GPP estimate (Fig. 3) across all months and years (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015), then divided by the standard deviation of all precipitation or GPP estimates included in the analysis. For upscaling, random forest models were applied to global gridded satellite and meteorological inputs for masked dryland areas at 0.5°scale using the 'predict' function in the 'raster' package 59 . The 'predict' function applies a fitted model to each grid cell over a given spatial extent, using a stack of raster layers as inputs. In our case, the fitted model was the DryFlux random forest model trained in the Southwest and was applied over global drylands. The model had a training accuracy of R 2 = 0.815; RMSE = 0.521 with mtry = 10. The testing accuracy of the model was R 2 = 0.610 and RMSE = 0.876.
To evaluate DryFlux model predictions, we extracted the time series of GPP estimates from 2000 to 2016 of the DryFlux model for each of 24 flux sites and compared it to the eddy covariance tower GPP, MODIS GPP, and FLUXCOM GPP estimates. Then, 8-day 500 m MODIS GPP (MOD17A2H v006) values were extracted for all sites using Google Earth Engine and subset to dates included in the NDVI/EVI estimates 78 . Daily 0.5°resolution FLUXCOM GPP estimates using MODIS remote sensing and CRUJRA_v1 meteorological forcing data inputs (for consistency with the CRU datasets) were downloaded from the Data Portal of the Max Planck Institute for Biogeochemistry 31,79,80 . Daily FLUXCOM GPP estimates were used to calculate midmonth average values. Root mean square error (RMSE) values, correlation coefficients (r), coefficient of determination (R 2 ), and RMSSD were used to evaluate model performance. RMSSD, a measure of variability in a time series 41,81 , is obtained by taking the square root of the average squared successive differences. RMSSD values were calculated with the 'rmssd' function in the 'psych' package 82 , and r and R 2 values were calculated with the 'stats' package 56 . RMSE values were calculated with the 'RMSE' function in the 'caret' package 57 ,

Data availability
MODIS Vegetation Indices were downloaded from NASA's Land Processes Distributed Active Archive Center (LP DAAC) located at the USGS Earth Resources Observation and Science (EROS) Center and is available at https://lpdaac.usgs.gov/products/ mod13c1v006/. Climatological data were downloaded from the Climatic Research Unit (University of East Anglia) and Met Office at https://crudata.uea.ac.uk/cru/data/hrg/ cru_ts_4.04/ and were further used to calculate SPEI and vapor-pressure deficit variables. MAT, and MAP data from WorldClim (https://www.worldclim.org/) were downloaded using the 'raster' package in R. Elevation from the Shuttle Radar Topography Mission (https://srtm.csi.cgiar.org/) were downloaded using the 'raster' package in R. The MODIS GPP data product was downloaded from NASA's LP DAAC at the USGS EROS Center using Google Earth Engine and is available at https://lpdaac.usgs.gov/products/ mod17a2hv006/. Daily FLUXCOM GPP estimates were downloaded from the Data Portal of the Max Plank Institute for Biogeochemistry and are available at http:// www.fluxcom.org/. Daily flux tower GPP for global dryland testing sites is available at https://fluxnet.org/data/. Daily flux tower GPP for DryFlux training sites in the North American Southwest was acquired from site PIs and this data is available at https:// github.com/marthageb/DryFlux.